Tuesday, June 2, 2020

If/else condition in PySpark

I have the DataFrame below, which I have split into two steps: 1) filter the accounts and 2) perform the operations. The second set of operations should run only for the accounts named in the filter: if a row belongs to one of these accounts, apply the next operations; otherwise leave the row as it is.

How can I combine these two steps into a single if/do condition? As written, the filter deletes the other accounts, which I want to avoid.

# Note: chaining `!=` tests with `|` matches every row (each row differs from
# at least one of the values); the exclusion needs `&`, or more simply `isin`.
# The original also listed "Acc4" twice.
df = df1.where(~f.col('Account').isin("Acc1", "Acc2", "Acc3", "Acc4", "Acc5"))

from pyspark.sql import functions as F
from pyspark.sql.window import Window

w1 = Window().orderBy("mono_id")
df2 = df\
.withColumn("mono_id", F.monotonically_increasing_id())\
.withColumn("Price1", F.col('rate') * F.col('amt1'))\
.withColumn("Price2", F.col('rate') * F.col('amt2'))\
.withColumn("Price3", F.lag(F.col('amt3')).over(w1))
