Monday, 26 January 2015

Extremely puzzling behavior when evaluating a tuple object that holds multiple DataFrames

I have some statements where I invoke a function that I defined myself:



sim_extracted_dfs = extract_dataframes(sim_queue_total_df_sim)
print (sim_extracted_dfs is tuple)


where extract_dataframes() is a function that accepts a large DataFrame as an argument and processes it to return a tuple of 4 smaller DataFrames, as is evident from its return statement:



return (pd.concat(objs=df_list_first_param, ignore_index=True),
        pd.concat(objs=df_list_second_param, ignore_index=True),
        pd.concat(objs=df_list_third_param, ignore_index=True),
        pd.concat(objs=df_list_fourth_param, ignore_index=True))


As sim_extracted_dfs is a tuple object, I am going to use it later on in my code in some for loops, where I iterate over each item (a DataFrame in this case) of the tuple. However, I ran into some problems trying to do so, and I just realized that sim_extracted_dfs somehow does not seem to be regarded as a tuple when I execute my code non-interactively. With the following debugging statements:



print (sim_extracted_dfs is tuple)
print type(sim_extracted_dfs)


I get these very puzzling and contradictory corresponding outputs in Terminal upon executing ipython data_analysis.py, where data_analysis is the name of the module:



False
<type 'tuple'>
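
For comparison, here is a minimal sketch of how an identity check differs from a type check on a plain tuple. The name `extracted` and its string contents are hypothetical stand-ins for sim_extracted_dfs and its DataFrames, so this only illustrates the operators and does not reproduce the module itself:

```python
# Hypothetical stand-in for sim_extracted_dfs: any 4-item tuple behaves the same way.
extracted = ("df_1", "df_2", "df_3", "df_4")

# `is` compares object identity: it asks whether `extracted` is the very same
# object as the built-in class `tuple`, which an instance of that class is not.
identity_check = extracted is tuple

# isinstance() asks whether the object is an instance of the class (or a subclass).
instance_check = isinstance(extracted, tuple)

# type(...) is tuple compares the object's exact class with `tuple`.
exact_type_check = type(extracted) is tuple

print(identity_check)    # False
print(instance_check)    # True
print(exact_type_check)  # True
```

On this reading, the False printed by the script is what `is` would be expected to give for any tuple instance, and `isinstance(sim_extracted_dfs, tuple)` would be the usual way to write the intended check.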


I went one step further: I launched IPython and imported my module so that I could debug interactively, and this is what I got:



In [108]: type(data_analysis.sim_extracted_dfs)
Out[108]: tuple

In [109]: data_analysis.sim_extracted_dfs is tuple
Out[109]: True

In [110]: print (data_analysis.sim_extracted_dfs is tuple)
True

In [111]: print data_analysis.sim_extracted_dfs is tuple
True

In [112]:


This is really driving me nuts. Is this a bug or something? Why is it that sim_extracted_dfs is tuple is now True? I've been stuck on just this one problem for almost the entire day and I can't move forward with the rest of my module, because everything else depends on this conditional evaluating my tuple of DataFrames correctly. I would really appreciate any help on this.


Thank you very much.

