Friday, 15 March 2019

Change multiple columns based on a condition and a column name string

I have a very sparse data set; below is an example of the format. I want to change specific columns based on the logic explained below.

# create dummy data set
pb=c('1','0','0','0','0','1','Not_ans','1','0','Not_ans')
qa=c('1','1','0','0','1','0','Not_ans','1','Not_ans','Not_ans')
#zy=c('1','Not_ans','0','1','Not_ans','0','1','1','1','Not_ans')

#sub questions for pb
pb.abr=c('1','0','0','0','0','1','0','1','0','0')
pb.ras=c('0','0','0','0','1','0','0','1','0','0')
pb.sfg=c('1','0','0','0','0','0','0','1','0','0')

#sub questions for qa
qa.fgs=c('1','0','0','0','0','0','0','1','0','0')
qa.sdf=c('0','1','0','0','0','0','0','0','0','0')
qa.tyu=c('0','0','0','0','1','0','0','1','0','0')

df=data.frame(pb,qa,pb.abr,pb.ras,pb.sfg,qa.fgs,qa.sdf,qa.tyu)
df

        pb      qa pb.abr pb.ras pb.sfg qa.fgs qa.sdf qa.tyu
1        1       1      1      0      1      1      0      0
2        0       1      0      0      0      0      1      0
3        0       0      0      0      0      0      0      0
4        0       0      0      0      0      0      0      0
5        0       1      0      1      0      0      0      1
6        1       0      1      0      0      0      0      0
7  Not_ans Not_ans      0      0      0      0      0      0
8        1       1      1      1      1      1      0      1
9        0 Not_ans      0      0      0      0      0      0
10 Not_ans Not_ans      0      0      0      0      0      0

The two columns pb and qa are the base columns. Each has sub columns that follow the naming convention pb. and qa., so there are three sub columns for pb and three for qa. I want to change these sub columns based on a condition on the base column (pb or qa).

The condition is: if column pb == 'Not_ans', then set all of its sub columns (pb.abr, pb.ras and pb.sfg) to 'Not_applicable'.

How do I write a function that achieves this, where I specify the base column name (i.e. pb) and the naming of the sub columns (for example 'pb.')? Would it be something like the code below? As written, it does not give the result.

data.frame(ifelse(df['base_q']=='Not_ans',
df[ , grepl( paste('base_q','.') , names(df) )]=='Not_applicable',df[,grepl( 
paste('base_q','.') , names(df)) ])
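A likely reason the attempt above does not select the sub columns: paste() separates its arguments with a space by default, and 'base_q' in quotes is a literal string rather than the value of a variable, so the pattern becomes "base_q ." instead of "pb.". In addition, "." in a grepl() pattern is a regex wildcard that matches any character. The sketch below shows one way to build the prefix and select the sub columns; base_q and col_names are hypothetical names introduced only for illustration.

```r
# Hypothetical variable holding the base column name
base_q <- "pb"
col_names <- c("pb", "qa", "pb.abr", "pb.ras", "pb.sfg",
               "qa.fgs", "qa.sdf", "qa.tyu")

# What the original pattern evaluates to: a literal string with a space
bad_pattern <- paste("base_q", ".")   # "base_q ."

# paste0() joins without a separator and uses the variable's value
prefix <- paste0(base_q, ".")         # "pb."

# startsWith() does a literal prefix match, so "." is not a wildcard here
sub_cols <- col_names[startsWith(col_names, prefix)]
sub_cols
```

With the column names from the question, sub_cols should contain pb.abr, pb.ras and pb.sfg.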

How do I write a generic function that takes the base column numbers as inputs (for example 1 and 2 here)? It should apply the logic to the first base column, i.e. wherever pb is 'Not_ans' it changes the sub columns (pb.abr, pb.ras, pb.sfg) to 'Not_applicable', and then move on to column 2 (qa) and apply the same logic.
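One possible shape for such a function is sketched below, not as a definitive answer. It assumes that sub columns are exactly those whose names start with the base column name followed by a dot. The names mark_not_applicable, flag, fill and example_df are hypothetical. Columns are converted to character before assignment, because data.frame() creates factors by default in R versions before 4.0, and assigning a new label to a factor would produce NA.

```r
# Hypothetical helper: for each base column index, set the sub columns
# (named "<base>.<something>") to `fill` wherever the base column equals `flag`.
mark_not_applicable <- function(data, base_idx, flag = "Not_ans",
                                fill = "Not_applicable") {
  for (i in base_idx) {
    base_name <- names(data)[i]
    # Literal prefix match on "<base>."
    sub_cols <- names(data)[startsWith(names(data), paste0(base_name, "."))]
    # Rows where the base column holds the flag value (NA counts as no match)
    rows <- !is.na(data[[base_name]]) & data[[base_name]] == flag
    for (col in sub_cols) {
      # Work on a character copy so the new label is accepted for factors too
      values <- as.character(data[[col]])
      values[rows] <- fill
      data[[col]] <- values
    }
  }
  data
}

# Small example with the same layout as the data in the question
example_df <- data.frame(
  pb     = c("1", "Not_ans", "0"),
  qa     = c("Not_ans", "1", "Not_ans"),
  pb.abr = c("1", "0", "0"),
  qa.fgs = c("0", "1", "0"),
  stringsAsFactors = FALSE
)
result <- mark_not_applicable(example_df, base_idx = c(1, 2))
result
```

On the full data set from the question, the call would be mark_not_applicable(df, c(1, 2)). The function returns a modified copy, so the result has to be assigned back to df if the change should persist.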
