Let's say I have the following tuple
(colType, colDocV)
Where colType is a boolean and colDocV is a String Depending on those two values, I will apply some chunk of code that applies transformations to a Dataframe.
Now, this code works. However, I am not convinced this is the proper way to write functional programming code. I don't know which of these 3 approaches will improve the quality of the code and remove all if-if else-else : Should I apply some kind of design pattern and which one? Should I use some kind of pattern matching? Should I use some anonymous function?
if (colDocV) {
val newCol = udf(UDFHashCode.udfHashCode).apply(col(columnName))
dataframe.withColumn(columnName, newCol)
} else if (colType.contains("string") || colType.contains("text")) {
val newCol = udf(Entropy.stringEntropyFunc).apply(col(columnName)).cast(DoubleType)
dataframe.withColumn(columnName, newCol)
} else if (colType.contains("date")) {
val newCol = udf(DateUtils.getTimeAsDoubleFunc).apply(col(columnName)).cast(DoubleType)
dataframe.withColumn(columnName, newCol)
} else if (colType.contains("long")) {
dataframe.withColumn(columnName, dataframe(columnName).cast(DoubleType) )
} else {
dataframe.drop(columnName) //Dropping column that cannot be processed
}
Aucun commentaire:
Enregistrer un commentaire