Tuesday, 22 August 2017

How do I average values from one dataframe using conditions in a row of another dataframe of different size?

I'm struggling a bit with conditional functions.

I have 2 datasets: predator and fish

predator with the following structure:

$ Type             : Factor w/ 4 levels "delipplasma",..: 4 3 2 1 4 3 2 4 3 4 ...
$ Group1           : Factor w/ 6 levels "five","four",..: 3 3 3 3 3 3 3 3 3 3 ...
$ Group2           : Factor w/ 4 levels "","five","four",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Group3           : Factor w/ 2 levels "","five": 1 1 1 1 1 1 1 1 1 1 ...
$ avgNwf           : logi  NA NA NA NA NA NA ...

fish with the following structure:

$ Type2 : Factor w/ 5 levels "flesh","fleshdelip",..: 1 2 1 2 1 2 3 2 1 2 ...
$ Group : Factor w/ 7 levels "five","four",..: 3 3 3 3 3 3 3 3 3 3 ...
$ N     : num  10.9 11.2 11.1 11.4 11 ...

I want to calculate a value for predator$avgNwf according to the conditions of each sample (one sample per row). Each sample has an influencing factor in at least Group1 but may also have additional influencing factors in Group2 and Group3. For each row, I want the average of fish$N over the rows where fish$Type2 == "wholefish" and fish$Group matches any of predator$Group1, predator$Group2 or predator$Group3. Not all of the predator Group columns have entries, so I ran into a problem in Excel where I couldn't make it ignore the #N/A.

For example, for the first row in the image of the predator dataframe (see below), I want to average N (from the fish df) for all wholefish in groups four and five (as assigned in the fish df), because they influence the results of the first entry in the predator df. (Note: the 11.1183 value comes from averaging all of fish$N and needs to be replaced with the correct value based on each row's own conditions.)
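To make the intended calculation concrete, here is a minimal sketch on toy data. The values, the helper name avg_wholefish_n and the choice to return NA when no fish match are all assumptions made for illustration; only the column names come from the structures above.

```r
# Toy data: the values are invented for illustration only.
fish <- data.frame(
  Type2 = c("wholefish", "wholefish", "flesh", "wholefish", "wholefish"),
  Group = c("four", "five", "four", "three", "five"),
  N     = c(10, 12, 99, 20, 14),
  stringsAsFactors = FALSE
)

predator <- data.frame(
  Group1 = c("four", "three", "two"),
  Group2 = c("five", "", ""),
  Group3 = c("", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of N over whole fish whose Group is in `groups`.
avg_wholefish_n <- function(groups, fish) {
  groups <- groups[!is.na(groups) & groups != ""]  # ignore empty group slots
  keep <- fish$Type2 == "wholefish" & fish$Group %in% groups
  if (!any(keep)) return(NA_real_)                 # assumption: NA when nothing matches
  mean(fish$N[keep])
}

group_cols <- c("Group1", "Group2", "Group3")

# One result per predator row, each using that row's own groups.
predator$avgNwf <- vapply(
  seq_len(nrow(predator)),
  function(i) {
    # as.character() per column so factor columns yield labels, not codes
    groups <- vapply(predator[i, group_cols], as.character, character(1))
    avg_wholefish_n(groups, fish)
  },
  numeric(1)
)
```

In this toy example the first predator row lists groups four and five, so its value is the mean of the whole-fish N values in those two groups; the "flesh" row is excluded by the Type2 condition.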

I have tried the following with fewer conditions just to see if I am on track but to no avail:

predator$avgNwf <-mean(ifelse(fish$Group==predator$Group1,fish$N, "nope"))

predator$avgNwf <-mean(fish$N[fish$Group==predator$Group1,])
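As a rough illustration of why the two attempts above cannot work as written, the sketch below reproduces their mechanics on made-up vectors (n and group are hypothetical stand-ins for fish$N and fish$Group). Separately, comparing fish$Group with predator$Group1 directly does not check each predator row against every fish row, which is what the per-row average needs.

```r
# Made-up stand-ins for fish$N and fish$Group.
n     <- c(10, 12, 14)
group <- c("four", "five", "four")

# Mixing numbers with the string "nope" coerces the whole result to
# character, so mean() returns NA (and warns).
res1 <- suppressWarnings(mean(ifelse(group == "four", n, "nope")))

# A plain vector has a single dimension, so the trailing comma in
# n[cond, ] raises an error.
res2 <- tryCatch(n[group == "four", ], error = function(e) "error")

# Logical indexing without the comma works for one fixed group.
res3 <- mean(n[group == "four"])
```

The last line handles a single, fixed group only; it would still need to be repeated for each predator row with that row's groups.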

The following is not what I want to achieve:

N = summaryBy(N ~Group+Type2, data=fish, FUN=c(mean, sd), na.rm=T)

as it brings back a summary version of information instead of an individual result for each entry with its own conditions.

predator$avgNwf <-mean(fish$N)

as it lacks the conditions for each individual sample.

In Excel I would use cell references to work with conditions unique to each row.

[Image: Predator df] [Image: Fish df]

Thanks in advance
