Saturday, December 2, 2017

Combine multiple sequential entries in Spark

I have a comma-separated list of numbers, as shown:

a:{108,109,110,112,114,115,116,118}

I need the output to look something like this, with consecutive numbers collapsed into ranges:

a:{108-110, 112, 114-116, 118}

I am doing this in Spark. I wrote the following code:

import scala.collection.mutable.ArrayBuffer
import org.apache.spark.sql.functions.udf

def Sample(x:String):ArrayBuffer[String]={
  val x1 = x.split(",")
  var a:Int = 0
  var present=""
  var next:Int = 0
  var yrTemp = ""
  var yrAr= ArrayBuffer[String]()
  var che:Int = 0
  var storeV = ""
  var p:Int = 0 
  var q:Int = 0

  var count:Int = 1

  while(a < x1.length)
  {
      yrTemp = x1(a)

      if(x1.length == 1)
      {
          yrAr+=x1(a)
      }
      else
      if(a < x1.length - 1)
       {
           present = x1(a)
          if(che == 0)
          {
                storeV = present
          }

          p = x1(a).toInt
          q = x1(a+1).toInt

          if(p == q)
          {
              yrTemp = yrTemp
              che = 1
          }
          else
          if(p != q)
             {
                 yrTemp = storeV + "-" + present 
                 che = 0
                 yrAr+=yrTemp
             }

       }
       else
            if(a == x1.length-1)
            {
                present = x1(a)
                yrTemp = present 
                che = 0
                yrAr+=yrTemp
            }
      a = a+1
  }
yrAr
}
val SampleUDF = udf(Sample(_:String))

I am getting the output as follows:

a:{108-108, 109-109, 110-110, 112, 114-114, 115-115, 116-116, 118}

I cannot figure out where I am going wrong. Can you please help me correct this? TIA.
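
One thing that stands out in the code above: the test `p == q` checks whether the current and next values are equal, not whether they are consecutive. For sorted, distinct input it is never true, so `che` stays 0 and each element closes its own "range". Changing the test to `q == p + 1` would address that part, though the single-value and last-element branches would likely need adjusting as well. Rather than patching every branch, here is a minimal sketch of a fold-based alternative. It assumes the input is sorted and contains distinct integers; the name `collapseRanges` is hypothetical and not from the original code.

```scala
// Hypothetical helper; assumes the input is sorted, distinct, comma-separated integers.
def collapseRanges(csv: String): Seq[String] = {
  // Parse the values, ignoring surrounding whitespace and empty tokens.
  val nums = csv.split(",").map(_.trim).filter(_.nonEmpty).map(_.toInt)

  // Build (start, end) runs; the most recent run is kept at the head of the list.
  val runs = nums.foldLeft(List.empty[(Int, Int)]) {
    // The next number extends the current run when it is exactly end + 1.
    case ((start, end) :: rest, n) if n == end + 1 => (start, n) :: rest
    // Otherwise it starts a new run of length one.
    case (acc, n) => (n, n) :: acc
  }

  // Restore input order and format: a single value as "x", a longer run as "x-y".
  runs.reverse.map {
    case (start, end) if start == end => start.toString
    case (start, end) => s"$start-$end"
  }
}
```

Calling `collapseRanges("108,109,110,112,114,115,116,118")` gives `108-110`, `112`, `114-116`, `118`. The function has no Spark dependency, so it can be checked in a plain Scala REPL first; it should then be possible to wrap it the same way as in the question, for example `udf(collapseRanges _)`.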
