Monday, 23 November 2015

Add fake data to a data frame based on variable condition

Good afternoon,

I have to add dummy data to a data frame whenever a specific variable has no value in one or more of several given intervals.

require(plyr)
df <- data.frame(length = c(1.5e+07, 2.5e+07), grade = c(1000, 1000), company = "TEST")
for (x in df$length) {
  if (x <= 0 | x > 1e+07) {
    df <- rbind.fill(df, data.frame(length = c(5000000), grade = c(1000)))
...

This works fine, but I am having trouble checking whether x is absent from each “length” interval from 0 to 1e+08, in steps of 1e+07, and adding “1000” to “grade” if that is the case. I tried a lot of things, and in the end my data frame is only one row larger.

After that, I will create subgroups based on these intervals and I need a value for each subgroup.

df$length <- cut(df$length, breaks = seq(0, 1e+08, 1e+07))
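
To make the goal concrete, here is a minimal sketch in base R of one possible way to detect the empty intervals and add one placeholder row for each. It is only an illustration under assumptions: the names `breaks`, `bins`, `missing_bins`, `midpoints` and `filler` are hypothetical, and placing each placeholder at the interval midpoint is an arbitrary choice.

```r
# Sample data from the question (company kept as character).
df <- data.frame(length = c(1.5e+07, 2.5e+07), grade = c(1000, 1000),
                 company = "TEST", stringsAsFactors = FALSE)

breaks <- seq(0, 1e+08, 1e+07)

# Bin every length; cut() keeps all 10 intervals as factor levels,
# including the ones that contain no observation.
bins <- cut(df$length, breaks = breaks)

# Positions of the intervals that contain no row.
missing_bins <- which(table(bins) == 0)

# Assumption: each placeholder sits at the midpoint of its interval.
midpoints <- (head(breaks, -1) + tail(breaks, -1)) / 2

# One placeholder row per empty interval, with grade = 1000 and no company.
filler <- data.frame(length = midpoints[missing_bins],
                     grade = rep(1000, length(missing_bins)),
                     company = rep(NA_character_, length(missing_bins)),
                     stringsAsFactors = FALSE)

df <- rbind(df, filler)
```

Because the check is vectorised over all intervals at once, no loop over `df$length` is needed, and the data frame grows by one row per empty interval rather than by a single row.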

In the end, the objective is to still get an empty slot on the boxplot for each interval where there is no data, since the “1000” I added is well above the limit threshold. The next step will be to do the same for each value of the “company” variable.
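
For the per-company step, a possible sketch (again base R, and again only an assumption about what is wanted) counts the rows for every company/interval combination and adds a placeholder wherever the count is zero. The second company "OTHER", the extra row, and the names `bin`, `counts`, `empty`, `filler` and `df_full` are made up for illustration.

```r
# Hypothetical sample: the question's two rows plus one invented company.
df <- data.frame(length = c(1.5e+07, 2.5e+07, 0.5e+07),
                 grade = c(1000, 1000, 1000),
                 company = c("TEST", "TEST", "OTHER"),
                 stringsAsFactors = FALSE)

breaks <- seq(0, 1e+08, 1e+07)
df$bin <- cut(df$length, breaks = breaks)

# Count rows for every company/interval pair, including empty pairs.
counts <- as.data.frame(table(company = df$company, bin = df$bin),
                        stringsAsFactors = FALSE)
empty <- counts[counts$Freq == 0, ]

# One placeholder row per empty pair; length is left as NA because
# only the interval (bin) matters for grouping the boxplot.
filler <- data.frame(length = rep(NA_real_, nrow(empty)),
                     grade = rep(1000, nrow(empty)),
                     company = empty$company,
                     bin = factor(empty$bin, levels = levels(df$bin)),
                     stringsAsFactors = FALSE)

df_full <- rbind(df, filler)
```

With the result in `df_full`, something like `boxplot(grade ~ bin, data = df_full[df_full$company == "TEST", ])` would then show a box position for every interval.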

I hope I am clear, sorry for my English.

Thanks
