I'm reading in a large CSV file of flight records, and I would like to keep only the rows where either 'Origin_Airport_Code' or 'Destination_Airport_Code' is ORD, removing everything else. After that I would also like to combine the 'Year' and 'Flight_Date' columns into a datetime and index the flights by that datetime.
I'm not sure what to try since I'm new to Python and pandas.
''' data = pd.read_csv("groundhog_query.csv") '''
''' data.columns '''
''' Index(['Year', 'Flight_Date', 'Day_Of_Year', 'Unique_Carrier_ID', 'Airline_ID', 'Tail_Number', 'Flight_Number', 'Origin_Airport_ID', 'Origin_Market_ID', 'Origin_Airport_Code', 'Origin_State', 'Destination_Airport_ID', 'Destination_Market_ID', 'Destination_Airport_Code', 'Dest_State', 'Scheduled_Dep_Time', 'Actual_Dep_Time', 'Dep_Delay', 'Pos_Dep_Delay', 'Scheduled_Arr_Time', 'Actual_Arr_Time', 'Arr_Delay', 'Pos_Arr_Delay', 'Combined_Arr_Delay', 'Can_Status', 'Can_Reason', 'Div_Status', 'Scheduled_Elapsed_Time', 'Actual_Elapsed_Time', 'Carrier_Delay', 'Weather_Delay', 'Natl_Airspace_System_Delay', 'Security_Delay', 'Late_Aircraft_Delay', 'Div_Airport_Landings', 'Div_Landing_Status', 'Div_Elapsed_Time', 'Div_Arrival_Delay', 'Div_Airport_1_ID', 'Div_1_Tail_Num', 'Div_Airport_2_ID', 'Div_2_Tail_Num', 'Div_Airport_3_ID', 'Div_3_Tail_Num', 'Div_Airport_4_ID', 'Div_4_Tail_Num', 'Div_Airport_5_ID', 'Div_5_Tail_Num'], dtype='object') '''
This is how the columns are organized. Would I be able to do this with some if/then statements or a loop? Thanks for the help.
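For reference, here is a minimal sketch of how the two steps might look with a boolean mask instead of a loop. The tiny DataFrame and its values are made up to stand in for the real CSV, and the "MM/DD" layout of 'Flight_Date' is purely an assumption, since the actual format is not shown above; the `format=` string would need adjusting to match the real data. The name `Flight_Datetime` is also just illustrative.

```python
import pandas as pd

# Made-up stand-in for pd.read_csv("groundhog_query.csv"); values are illustrative only.
data = pd.DataFrame({
    "Year": [2015, 2015, 2016],
    "Flight_Date": ["02/02", "02/02", "02/02"],  # ASSUMPTION: "MM/DD" strings
    "Origin_Airport_Code": ["ORD", "JFK", "LAX"],
    "Destination_Airport_Code": ["LAX", "SFO", "ORD"],
})

# Boolean mask: True where either airport column equals "ORD".
is_ord = (data["Origin_Airport_Code"] == "ORD") | (
    data["Destination_Airport_Code"] == "ORD"
)

# Keep only the ORD rows; .copy() avoids modifying a view of the original frame.
ord_flights = data[is_ord].copy()

# Join year and date into strings like "2015/02/02", then parse them.
ord_flights["Flight_Datetime"] = pd.to_datetime(
    ord_flights["Year"].astype(str) + "/" + ord_flights["Flight_Date"],
    format="%Y/%m/%d",  # ASSUMPTION: must match the real Flight_Date layout
)

# Index the flights by the new datetime column and sort chronologically.
ord_flights = ord_flights.set_index("Flight_Datetime").sort_index()
print(ord_flights)
```

The mask is evaluated column-wise over the whole frame, which is usually much faster on a large file than checking rows one at a time in a loop.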