Saturday, 22 February 2020

RDD replace in lambda

I'm struggling with one problem. Each record in my RDD has a filename as its first element, as in the example below. If the filename starts with 'spmsg', it should be replaced by 1, otherwise by 0. I think I need to adjust the first lambda expression in my code, but I cannot figure out what I'm doing wrong.

#RDD file

print(rdd3.take(1))

[('3-1msg1', DenseVector([0.0, 0.1255, 0.5695, 0.377, 0.0, 0.2196, 0.4721, 0.2823, 0.2614, 0.3142]))]

RDD4 = rdd3.map(lambda x: 1 if x[0].startswith('spmsg') else 0)  # how to change this line to return as below
RDD5 = RDD4.map(lambda cls_vec: LabeledPoint(cls_vec[0], cls_vec[1]))
print(RDD5.take(1))

expected output

[LabeledPoint(0.0, [0.0,0.16290896085571283,0.6826175329317583,0.0,0.0,0.0,0.40170165983309447...
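One likely cause, judging only from the code shown: the first lambda returns just the integer 1 or 0 and discards the vector, so the second map has nothing to index with cls_vec[0] and cls_vec[1]. A minimal sketch of one possible fix is to return a (label, vector) pair instead. The helper name label_record is hypothetical, and the Spark lines are left as comments because they assume rdd3 and a working pyspark install.

```python
def label_record(record):
    """Map (filename, vector) to (label, vector).

    The label is 1.0 when the filename starts with 'spmsg', else 0.0.
    label_record is a hypothetical helper name, not part of Spark.
    """
    filename, features = record
    label = 1.0 if filename.startswith("spmsg") else 0.0
    return (label, features)


# Plain-Python check using the record shape shown in the question
# (a list stands in for the DenseVector here).
sample = ("3-1msg1", [0.0, 0.1255, 0.5695])
print(label_record(sample))  # (0.0, [0.0, 0.1255, 0.5695])

# With Spark available, the helper would slot into the original pipeline
# (assumes rdd3 holds (filename, DenseVector) pairs as in the question):
# from pyspark.mllib.regression import LabeledPoint
# RDD4 = rdd3.map(label_record)
# RDD5 = RDD4.map(lambda cls_vec: LabeledPoint(cls_vec[0], cls_vec[1]))
# print(RDD5.take(1))
```

The same thing can be written inline as rdd3.map(lambda x: (1 if x[0].startswith('spmsg') else 0, x[1])). The named function is used here only to make the tuple shape easier to see.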
