Monday, 19 August 2019

Selection of rows by comparing two non-identically labelled columns

I am working with two files. The first one is:

Dataset 1

The second one is: Second dataset

I am using the following code:

def Get_PHD(row):
    if (row.Detection_Location == 'CV22'):
        PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\AWS machine learning project\LS1 - Edited file.xlsx', sheet_by_name = "Sheet1", index = False)
        PHD_df.loc(PHD_df['Timestamp'] >= df['Detection Start Time']) & ( Final_PHD['Timestamp'] <= df['Detection end time'])
        return (PHD_df)
    elif (row.Detection_Location == 'CV23'):
        PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\AWS machine learning project\trial phd import\LS2_Edited File.xlsx', sheet_by_name = "Sheet1")
        PHD_df.loc(PHD_df['Timestamp'] >= df['Detection Start Time']) & ( Final_PHD['Timestamp'] <= df['Detection end time'])
        return (PHD_df)


for index, row in df.iterrows():
    Final_PHD = Get_PHD(row)

This code is meant to load the version of the second dataset that corresponds to the value of the 'detection_location' column in the first dataset. From the second dataset, I want to keep only the rows whose 'Timestamp' falls between the 'Detection Start Time' and 'Detection end time' of the first dataset.

I tried:

PHD_df.loc(PHD_df['Timestamp'] >= df['Detection Start Time']) & ( Final_PHD['Timestamp'] <= df['Detection end time'])

but I get the following error:

Can only compare identically-labeled Series objects.

How can I achieve the desired result?

Thanks in advance.
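
A possible direction, sketched under assumptions rather than tested against the real files: the error is typically raised when two Series with different indexes are compared element by element, which is what happens when `PHD_df['Timestamp']` (one entry per row of the second dataset) is compared with `df['Detection Start Time']` (one entry per row of the first dataset). Comparing the whole 'Timestamp' column against the scalar start and end values of the current `row` avoids that. In the sketch below, the small in-memory DataFrames, the 'Value' column, the `phd_by_location` dictionary and the `get_phd` helper are all hypothetical stand-ins; in practice each entry of the dictionary could be loaded once with `pd.read_excel(path, sheet_name="Sheet1")`.

```python
import pandas as pd

# Hypothetical stand-in for the first dataset (one row per detection).
df = pd.DataFrame({
    "Detection_Location": ["CV22", "CV23"],
    "Detection Start Time": pd.to_datetime(["2019-08-19 10:00", "2019-08-19 11:00"]),
    "Detection end time": pd.to_datetime(["2019-08-19 10:10", "2019-08-19 11:05"]),
})

# Hypothetical stand-ins for the two Excel files, keyed by location.
phd_by_location = {
    "CV22": pd.DataFrame({
        "Timestamp": pd.date_range("2019-08-19 09:55", periods=5, freq="5min"),
        "Value": list(range(5)),
    }),
    "CV23": pd.DataFrame({
        "Timestamp": pd.date_range("2019-08-19 10:55", periods=5, freq="5min"),
        "Value": list(range(5)),
    }),
}


def get_phd(row, phd_by_location):
    """Return the rows of the matching dataset that fall inside the row's time window."""
    phd_df = phd_by_location.get(row["Detection_Location"])
    if phd_df is None:
        return None  # no dataset known for this location
    # Compare the whole Timestamp column against two scalars taken from the row,
    # not against the columns of df; both bounds are inclusive.
    mask = phd_df["Timestamp"].between(
        row["Detection Start Time"], row["Detection end time"]
    )
    return phd_df.loc[mask]


# Collect one filtered frame per detection instead of overwriting a single variable.
results = [get_phd(row, phd_by_location) for _, row in df.iterrows()]
matches = [r for r in results if r is not None]
final_phd = pd.concat(matches, ignore_index=True)
print(final_phd)
```

Note that the original loop reassigns `Final_PHD` on every iteration, so only the last detection's result would survive; collecting the pieces in a list and concatenating them is one way to keep them all.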
