if-statement: How to remove rows from pandas df based on values of two different columns

Monday, April 15, 2019

How to remove rows from pandas df based on values of two different columns

I'm reading in a large CSV file of flight records, and I would like to remove every row where neither 'Origin_Airport_Code' nor 'Destination_Airport_Code' is ORD. After that, I would also like to combine the 'Year' and 'Flight_Date' columns into a datetime and, I suppose, index the flights by that datetime.

I'm not sure what to try, since I'm new to Python and pandas.

'''
import pandas as pd

data = pd.read_csv("groundhog_query.csv")
data.columns
'''

'''
Index(['Year', 'Flight_Date', 'Day_Of_Year', 'Unique_Carrier_ID', 'Airline_ID',
       'Tail_Number', 'Flight_Number', 'Origin_Airport_ID', 'Origin_Market_ID',
       'Origin_Airport_Code', 'Origin_State', 'Destination_Airport_ID',
       'Destination_Market_ID', 'Destination_Airport_Code', 'Dest_State',
       'Scheduled_Dep_Time', 'Actual_Dep_Time', 'Dep_Delay', 'Pos_Dep_Delay',
       'Scheduled_Arr_Time', 'Actual_Arr_Time', 'Arr_Delay', 'Pos_Arr_Delay',
       'Combined_Arr_Delay', 'Can_Status', 'Can_Reason', 'Div_Status',
       'Scheduled_Elapsed_Time', 'Actual_Elapsed_Time', 'Carrier_Delay',
       'Weather_Delay', 'Natl_Airspace_System_Delay', 'Security_Delay',
       'Late_Aircraft_Delay', 'Div_Airport_Landings', 'Div_Landing_Status',
       'Div_Elapsed_Time', 'Div_Arrival_Delay', 'Div_Airport_1_ID',
       'Div_1_Tail_Num', 'Div_Airport_2_ID', 'Div_2_Tail_Num',
       'Div_Airport_3_ID', 'Div_3_Tail_Num', 'Div_Airport_4_ID',
       'Div_4_Tail_Num', 'Div_Airport_5_ID', 'Div_5_Tail_Num'],
      dtype='object')
'''

This is how the columns are organized. Would I be able to do this with some if/then statements or a loop? Thanks for the help.
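For what it's worth, this kind of row filtering usually doesn't need if statements or a loop in pandas; a boolean mask over the two columns tends to be enough. Below is a minimal sketch of that idea, not a definitive answer. It uses a small made-up DataFrame in place of the real CSV, and it assumes 'Flight_Date' holds strings like "02/02" (month/day), which the question does not confirm. The names `is_ord`, `ord_flights`, and `Flight_Datetime` are invented for illustration.

```python
import pandas as pd

# Tiny stand-in for the real CSV; these values are made up for illustration.
data = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2019],
    "Flight_Date": ["02/02", "02/02", "02/03", "02/03"],  # ASSUMED "MM/DD" strings
    "Origin_Airport_Code": ["ORD", "JFK", "LAX", "SFO"],
    "Destination_Airport_Code": ["LAX", "ORD", "SFO", "ORD"],
})

# Boolean mask: True for rows where either endpoint is ORD.
is_ord = (data["Origin_Airport_Code"] == "ORD") | (
    data["Destination_Airport_Code"] == "ORD"
)

# Keep only the matching rows; .copy() avoids modifying a view of `data`.
ord_flights = data[is_ord].copy()

# Combine Year and Flight_Date into one datetime column.
# The format string depends on the ASSUMED "MM/DD" layout above.
ord_flights["Flight_Datetime"] = pd.to_datetime(
    ord_flights["Year"].astype(str) + "/" + ord_flights["Flight_Date"],
    format="%Y/%m/%d",
)

# Index the flights by the new datetime and sort chronologically.
ord_flights = ord_flights.set_index("Flight_Datetime").sort_index()

print(ord_flights)
```

If 'Flight_Date' in the real file already contains a full date, or uses a different layout, the `format` argument and the string concatenation would need to change accordingly.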

if-statement
