vendredi 16 octobre 2020

add a column to my dataframe based on information in two columns

I want to add a column to my dataframe based on information in two columns of the dataframe.

In my example dataframe, I have two lines of entries for one sample, which are rows 3 & 4. I want to code that writes a new column "main" and fills in "1" each row that has a unique tag number. For rows with duplicate tags numbers, I need the row with the highest weight to be "1" in main, and all other rows to be filled with "0".

df
       sp    weight   tag
1   green        70     1
2  yellow        63     2
3     red        41     3
4     red        25     3
5     red         9     3
df with "main" column added
       sp    weight   tag  main
1   green        70     1     1
2  yellow        63     2     1
3     red        41     3     1
4     red        25     3     0
5     red         9     3     0

Here's what I have so far:

df$is.uniq <- duplicated(df$tag) | duplicated(df$tag), fromLast = TRUE) 
df$main <- ifelse(is.uniq==TRUE, "1", ifelse(is.uniq==FALSE, "0", NA  )) 

I know i need to change the second ifelse statement to reference the weight column and fill 1 for the greatest weight and 0 for all else, but I haven't figured out how to do that yet.

Aucun commentaire:

Enregistrer un commentaire