I have a data frame (containing about 200k rows):
DF1>
ID SR1 SR2 DRC1 DX2
1 123 as#12.c ABC-1 SXI
2 124 ae&14.v ABC-1 SXI
3 125 at$19.e AXX-1
4 125 at$19.e AXX-1 SCV
5 785 ab&22.n AWZ-2 DDF
6 849 ab&22.n AWZ-5 DDF
For this, I want to add a new column, Status, to DF1 based on all of the conditions below together:
- For every DX2 value, check whether the rows with that value all have the same DRC1 value (e.g. IDs 1 and 2 have the same DRC1 value, ABC-1).
- Some rows have no DX2 value. For those, use SR1 and SR2 to compare the DRC1 value throughout the data frame; if it is the same, show True in Status, else False.
Note: a match on either SR1 or SR2 with any row in the entire data frame counts (see row 4 in the desired output).
- Where a row has no DX2 value, but comparing through the data frame using SR1 and SR2 finds a row that does have a DX2 value for that SR1 or SR2, give the Status as True-ID or False-ID based on the condition (e.g. True-4 for ID 3 in the desired output).
Desired Output:
ID SR1 SR2 DRC1 DX2 Status
1 123 as#12.c ABC-1 SXI True
2 124 ae&14.v ABC-1 SXI True
3 125 at$19.e AXX-1 True-4
4 125 at$19.d AXX-1 SCV True
5 785 ab&22.n AWZ-2 DDF False
6 849 ab&22.n AWZ-5 DDF False
So far I could only compare one column, using the code below:
New_DF <- transform(DF1, Status = ave(as.character(DF1$DRC1), DF1$DX2, FUN = function(x)
  if (length(unique(x)) == 1) "True" else "False"))
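To make the three conditions concrete, here is a rough sketch of how they might be combined in base R. It reflects only one reading of the rules, and several points are assumptions rather than anything stated above: a missing DX2 is stored as NA or an empty string, a row with no SR1/SR2 match anywhere gets False, and the ID appended to the status is the first matching row that has a DX2 value. The row-by-row loop is kept for readability and is not tuned for 200k rows.

```r
# Sample data from the question (missing DX2 stored as NA: an assumption).
DF1 <- data.frame(
  ID   = 1:6,
  SR1  = c(123, 124, 125, 125, 785, 849),
  SR2  = c("as#12.c", "ae&14.v", "at$19.e", "at$19.e", "ab&22.n", "ab&22.n"),
  DRC1 = c("ABC-1", "ABC-1", "AXX-1", "AXX-1", "AWZ-2", "AWZ-5"),
  DX2  = c("SXI", "SXI", NA, "SCV", "DDF", "DDF"),
  stringsAsFactors = FALSE
)

# TRUE for rows that carry a DX2 value (neither NA nor empty string).
has_dx2 <- !is.na(DF1$DX2) & DF1$DX2 != ""
DF1$Status <- NA_character_

# Condition 1: within each DX2 group, DRC1 must take a single value.
DF1$Status[has_dx2] <- ave(
  DF1$DRC1[has_dx2], DF1$DX2[has_dx2],
  FUN = function(x) if (length(unique(x)) == 1) "True" else "False"
)

# Conditions 2 and 3: rows without DX2 are compared via SR1 or SR2.
for (i in which(!has_dx2)) {
  # Other rows that match this row on either SR1 or SR2.
  others <- setdiff(which(DF1$SR1 == DF1$SR1[i] | DF1$SR2 == DF1$SR2[i]), i)
  if (length(others) == 0) {
    DF1$Status[i] <- "False"  # assumption: no match anywhere means False
    next
  }
  flag <- if (all(DF1$DRC1[others] == DF1$DRC1[i])) "True" else "False"
  # If a matching row has a DX2 value, append that row's ID (first one found).
  with_dx2 <- others[has_dx2[others]]
  DF1$Status[i] <- if (length(with_dx2) > 0) {
    paste0(flag, "-", DF1$ID[with_dx2[1]])
  } else {
    flag
  }
}

print(DF1)
```

On the six sample rows this gives the Status values from the desired output; whether it generalises to the full data depends on the assumptions listed above.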
In addition, I am wondering whether the same can be done in MySQL.