I have a dataframe (abund) similar to the dune and dune.env datasets used in the vegan package, except that my first three columns give the sample ID and the two methods used to collect the data. The remaining columns are the abundances of each species collected.
SampleID MethodA MethodB SpA SpB SpC ...
18001 A1 B1 0 3 4
18001 A1 B2 1 5 0
18001 A2 B1 0 7 0
18001 A2 B2 0 11 0
18002 A1 B1 4 1 0
18002 A1 B2 0 0 3
18002 A2 B1 0 0 0
18002 A2 B2 0 8 2
18003 A1 B1 0 9 0
....
I would like to create a new dataset (whole) based on this data, but with only SampleID and MethodA as row identifiers.
SampleID MethodA MethodB SpA SpB SpC ...
18001 A1 B3 1 8 4
18001 A2 B3 0 18 0
18002 A1 B3 4 1 3
18002 A2 B3 0 8 2
18003 A1 B3 0 9 1
....
An extra twist is that instead of just adding the data from B1 and B2 (which is what the example table above shows), I first want to multiply B1 by 15 (i.e. B3 = 15*B1 + B2).
There are two problems that I have.
- multiplying only certain rows of data.
I tried using an if statement:
wholeCalc <- function(MethodB, multiplier=15){
  whole <- MethodB * multiplier
  if(MethodB= "B2") {
    whole <- whole/multiplier
  }
}
-> there were a bunch of errors which indicate that I am way off the mark!
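For reference, two things stand out in this attempt: `=` inside `if()` is not a comparison (that would be `==`), and `if()` expects a single TRUE/FALSE rather than a whole column. A vectorised alternative is to build one weight per row with `ifelse()` and multiply only the species columns. The sketch below is a minimal illustration under assumptions: `weight_rows` and its `first_species_col` argument are hypothetical names, and the toy data are the first four rows of the example table.

```r
# Toy data copied from the first four rows of the example table.
abund <- data.frame(
  SampleID = c(18001, 18001, 18001, 18001),
  MethodA  = c("A1", "A1", "A2", "A2"),
  MethodB  = c("B1", "B2", "B1", "B2"),
  SpA      = c(0, 1, 0, 0),
  SpB      = c(3, 5, 7, 11),
  SpC      = c(4, 0, 0, 0)
)

# Hypothetical helper: scale the species columns of B1 rows only.
# Assumes the species columns run from `first_species_col` to the last column.
weight_rows <- function(df, multiplier = 15, first_species_col = 4) {
  species <- first_species_col:ncol(df)
  # One weight per row: `multiplier` for B1 rows, 1 for every other row.
  w <- ifelse(df$MethodB == "B1", multiplier, 1)
  # A vector of length nrow(df) is recycled down each column,
  # so every row is scaled by its own weight.
  df[species] <- df[species] * w
  df
}

weighted <- weight_rows(abund)
print(weighted)
```

After this step the B1 rows are already multiplied by 15, so a plain sum per SampleID and MethodA would give 15*B1 + B2.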
- figuring out how to group the data depending on SampleID and MethodA.
I have tried multiple ways to group the data, without much success.
aggregate(abund,
          list(Group = replace(rownames(abund$MethodB),
                               rownames(abund$MethodB) %in% c("B1", "B2"),
                               "B3")),
          sum)
-> I receive an error that arguments must be the same length (i.e. B1 and B2).
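The length error most likely comes from `rownames(abund$MethodB)`: a single column is a plain vector with no row names, so the grouping list ends up with length zero. A minimal base R sketch of grouping by the two identifier columns instead is shown below. It computes the plain sum B1 + B2 (no multiplier), and it assumes the species columns are named `SpA`, `SpB` and `SpC` as in the example table.

```r
# Toy data copied from the first eight rows of the example table.
abund <- data.frame(
  SampleID = c(18001, 18001, 18001, 18001, 18002, 18002, 18002, 18002),
  MethodA  = c("A1", "A1", "A2", "A2", "A1", "A1", "A2", "A2"),
  MethodB  = c("B1", "B2", "B1", "B2", "B1", "B2", "B1", "B2"),
  SpA      = c(0, 1, 0, 0, 4, 0, 0, 0),
  SpB      = c(3, 5, 7, 11, 1, 0, 0, 8),
  SpC      = c(4, 0, 0, 0, 0, 3, 0, 2)
)

# Assumed species column names; replace with the real ones.
species <- c("SpA", "SpB", "SpC")

# Sum only the numeric species columns within each SampleID x MethodA pair.
whole_sum <- aggregate(abund[species],
                       by  = list(SampleID = abund$SampleID,
                                  MethodA  = abund$MethodA),
                       FUN = sum)

# Label the combined rows and sort them like the desired output.
whole_sum$MethodB <- "B3"
whole_sum <- whole_sum[order(whole_sum$SampleID, whole_sum$MethodA), ]
print(whole_sum)
```

Passing only the species columns as the first argument avoids asking `sum` to add up the character columns.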
whole <- abund %>%
group_by(SampleID, MethodA)
whole
-> this gives me the same dataset that I began with.
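This result is expected: `group_by()` only marks the groups, and rows are collapsed once a `summarise()` step follows. A minimal dplyr sketch of the full calculation (B3 = 15*B1 + B2) might look like the following. It assumes dplyr 1.0 or later (for `across()`), and the `species` vector of column names is an assumption taken from the example table.

```r
library(dplyr)

# Toy data copied from the first four rows of the example table.
abund <- data.frame(
  SampleID = c(18001, 18001, 18001, 18001),
  MethodA  = c("A1", "A1", "A2", "A2"),
  MethodB  = c("B1", "B2", "B1", "B2"),
  SpA      = c(0, 1, 0, 0),
  SpB      = c(3, 5, 7, 11),
  SpC      = c(4, 0, 0, 0)
)

# Assumed species column names; replace with the real ones.
species <- c("SpA", "SpB", "SpC")

whole <- abund %>%
  # Weight each row: B1 rows count 15 times, all other rows once.
  mutate(across(all_of(species),
                ~ .x * if_else(MethodB == "B1", 15, 1))) %>%
  # Collapse to one row per SampleID x MethodA by summing the weighted rows.
  group_by(SampleID, MethodA) %>%
  summarise(across(all_of(species), sum), .groups = "drop") %>%
  # Label the combined rows.
  mutate(MethodB = "B3", .after = MethodA)

print(whole)
```

Dropping the first `mutate()` step gives the unweighted sum B1 + B2 instead.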
rbind(abund,
c(MethodB = "B3",
abund[abund$MethodB == "B1", -3] +
abund[abund$MethodB == "B2", -3]))
-> this gives me an error because the numbers of rows for B1 and B2 do not match.
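When some SampleID/MethodA combinations have a B1 row but no B2 row (or the reverse), the two subsets cannot be added position by position. One way around this is to pair them explicitly with `merge()` on the identifier columns. The sketch below is only an illustration: the toy data deliberately lack a B2 row for 18001/A2, the species column names are assumed from the example table, and treating a missing row as zero abundance is an assumption that may not suit every sampling design.

```r
# Toy data: note there is no B2 row for SampleID 18001, MethodA A2.
abund <- data.frame(
  SampleID = c(18001, 18001, 18001),
  MethodA  = c("A1", "A1", "A2"),
  MethodB  = c("B1", "B2", "B1"),
  SpA      = c(0, 1, 0),
  SpB      = c(3, 5, 7),
  SpC      = c(4, 0, 0)
)

species <- c("SpA", "SpB", "SpC")   # assumed species column names
keys    <- c("SampleID", "MethodA")

b1 <- abund[abund$MethodB == "B1", c(keys, species)]
b2 <- abund[abund$MethodB == "B2", c(keys, species)]

# Full outer join on the keys; species columns get .B1 / .B2 suffixes.
paired <- merge(b1, b2, by = keys, all = TRUE, suffixes = c(".B1", ".B2"))

# Assumption: a missing B1 or B2 row means zero abundance.
paired[is.na(paired)] <- 0

# B3 = 15 * B1 + B2, computed on the now-aligned rows.
b3 <- 15 * as.matrix(paired[paste0(species, ".B1")]) +
  as.matrix(paired[paste0(species, ".B2")])
colnames(b3) <- species

whole <- data.frame(paired[keys], MethodB = "B3", b3)
print(whole)
```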
As you may see, I'm completely lost and need some help! I've been in the lab for the past couple of months and let my R skills get rusty.