Thursday, April 9, 2020

pandas - converting one column to three

I have one pandas series with different survey text on each row. As an example:

import pandas as pd

df = pd.read_csv('survey_data.csv', header=None)

0 a comment
1 another comment
2 this what the person thought
3 what they felt
4 some more

I want to change the series into a dataframe with three columns and save it as a CSV file.

So the new df would be:

a comment       another comment   this what the person thought
what they felt  some more

I actually don't care if the order gets mixed up. I will then output it to a csv file.

I have tried many different approaches; my current one is this:

col_cnt = 1
df.dropna(inplace = True)  # removing null values to avoid errors
new_df = pd.DataFrame()
data = []

for index, row in df.iterrows():
    data.append(row)
    if col_cnt == 3: # we have done the three rows
        new_df.loc[len(new_df)]=list(data[1], data[2], data[3])
        col_cnt = 0
        data = [] # clear the list now that you have written it to the new df
    col_cnt = col_cnt + 1 #increment col counter for next row

    # need to write the remainder somehow

I get the error: IndexError: list index out of range
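
For reference, the IndexError in the loop above comes from data[3]: Python lists are zero-indexed, so a list of three items only has positions 0, 1 and 2. Separately, list() accepts a single iterable rather than three arguments. Below is a minimal sketch of the same loop idea with those points addressed. The sample data, the names rows and new_df, and the column labels col1 to col3 are assumptions for illustration, not part of the original code.

```python
import pandas as pd

# Stand-in for the survey data; in the question it is read from survey_data.csv.
df = pd.DataFrame({0: ["a comment", "another comment",
                       "this what the person thought",
                       "what they felt", "some more"]})

rows = []  # completed groups of three values
data = []  # the group currently being filled

for value in df[0].dropna():
    data.append(value)
    if len(data) == 3:  # three values collected: one output row is complete
        rows.append(data)
        data = []

# Write the remainder, padded with None so every row has three entries.
if data:
    rows.append(data + [None] * (3 - len(data)))

# Hypothetical column labels; any three names would do.
new_df = pd.DataFrame(rows, columns=["col1", "col2", "col3"])
```

Collecting plain lists and building the DataFrame once at the end also sidesteps assigning with .loc into an empty DataFrame that has no columns yet.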

Update

This code that I found and modified works! But I can only get two columns in the right order, not three as I would like. Changing the 2 in the range to 3 returns only one column.

new_df = pd.DataFrame()

index = 1
for i in range(0, len(df), 2):
    new_df['Column' + str(index)] = df[0].iloc[i:i+3].reset_index(drop=True)
    index += 1
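
One thing to note about the snippet above is that the slice width (3) and the range step (2) differ, so consecutive slices overlap, and each slice is stored as a column rather than as a row. A possible alternative, sketched here under the assumption that the values should fill the new frame row by row, is to pad the values to a multiple of three and reshape them with NumPy. The sample data and the names n_cols, padded and csv_text are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for the survey data; in the question it is read from survey_data.csv.
df = pd.DataFrame({0: ["a comment", "another comment",
                       "this what the person thought",
                       "what they felt", "some more"]})

n_cols = 3
values = df[0].dropna().to_numpy(dtype=object)

# Pad with None so the length is a multiple of n_cols, then reshape row by row.
pad = -len(values) % n_cols
padded = np.concatenate([values, np.full(pad, None, dtype=object)])

new_df = pd.DataFrame(padded.reshape(-1, n_cols),
                      columns=[f"Column{i + 1}" for i in range(n_cols)])

# With no path, to_csv returns the CSV text; pass a filename to write a file instead.
csv_text = new_df.to_csv(index=False)
```

The padded cell in the last row is written as an empty field in the CSV output.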
