Below is an example of my dataset:
╔═══╦═══════╦════════════════╗
║   ║ col_1 ║ col_2          ║
╠═══╬═══════╬════════════════╣
║ 1 ║ 106   ║ I am Alex.     ║
║ 2 ║ 106   ║ I'm a student  ║
║ 3 ║ 106   ║ I like apple   ║
║ 4 ║ 1786  ║ Dog is a pet   ║
║ 5 ║ 1786  ║ Jack is my pet ║
╚═══╩═══════╩════════════════╝
I would like to first group by "col_1" and then join the strings in "col_2", choosing the separator by checking whether each string ends with ".".
If a string ends with a full stop, the next string of the same group should be joined to it with a space (" "). Otherwise, the two should be joined with a full stop followed by a space (". ").
The end result should look something like this:
╔═══╦═══════╦════════════════════════════════════════╗
║   ║ col_1 ║ col_2                                  ║
╠═══╬═══════╬════════════════════════════════════════╣
║ 1 ║ 106   ║ I am Alex. I'm a student. I like apple ║
║ 2 ║ 1786  ║ Dog is a pet. Jack is my pet           ║
╚═══╩═══════╩════════════════════════════════════════╝
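To make the intended rule concrete, here is a minimal plain-Python sketch of the joining logic for a single group. The helper name `join_sentences` is hypothetical (it is not a pandas function); it only illustrates the rule described above.

```python
def join_sentences(parts):
    """Join strings, choosing each separator from the text built so far.

    Hypothetical helper for illustration: if the text so far ends with
    a full stop, the next string is appended after a space; otherwise
    it is appended after a full stop and a space.
    """
    parts = list(parts)
    if not parts:
        return ""
    result = parts[0]
    for nxt in parts[1:]:
        separator = " " if result.endswith(".") else ". "
        result += separator + nxt
    return result


group_106 = ["I am Alex.", "I'm a student", "I like apple"]
print(join_sentences(group_106))
```

Because the helper accepts any iterable of strings, it could in principle be passed to `groupby(...).apply(...)`, where each group arrives as a Series of strings.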
My code is as follows:
new_df = df.groupby(['col_1'])['col_2'].apply(lambda x: ' '.join(x) if x[-1:] == '.' else '. '.join(x)).reset_index()
However, I got this error instead:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
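A likely cause of the error: inside `apply`, `x` is the whole group's Series, so `x[-1:]` is a one-row Series (the last row, not the last character of a string), and `x[-1:] == '.'` is a boolean Series that `if` cannot reduce to a single True/False. One possible workaround, sketched below under the assumption that dropping every trailing full stop before joining is acceptable, is to strip the full stops first and then join each group with ". ". The sample frame is rebuilt from the table above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "col_1": [106, 106, 106, 1786, 1786],
    "col_2": [
        "I am Alex.",
        "I'm a student",
        "I like apple",
        "Dog is a pet",
        "Jack is my pet",
    ],
})

# Strip trailing full stops so that every string can be joined with the
# same separator, then join the strings of each group with ". ".
stripped = df.assign(col_2=df["col_2"].str.rstrip("."))
new_df = stripped.groupby("col_1")["col_2"].agg(". ".join).reset_index()

print(new_df)
```

Note that this differs slightly from the rule as stated: `str.rstrip(".")` removes all trailing full stops, so a full stop at the end of the last string in a group is lost, and an ellipsis such as "..." would be removed entirely.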
Your help is much appreciated!