*Reading each file listed in the lookup file and adding Location, Country and State columns to every record.
Step 1:*
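from pyspark.sql.functions import lit

for line in lines:
    # Each entry in lines is the path of a pipe-delimited file
    SourceDf = sqlContext.read.format("csv").option("delimiter", "|").load(line)
    # withColumn returns a new DataFrame, so the result is assigned back
    SourceDf = SourceDf.withColumn("Location", lit("us"))\
        .withColumn("Country", lit("Richmnd"))\
        .withColumn("State", lit("NY"))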
*Step 2:
Looping over each column of the DataFrame above and splitting it on ":", but I am getting only two columns in KeyValueDF.*
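from pyspark.sql.functions import split

for col_num in SourceDf.columns:
    # Split the "key:value" cell of this column on the colon
    InterDF = split(SourceDf[col_num], ":")
    # KeyValueDF is rebuilt from SourceDf on every iteration
    KeyValueDF = SourceDf.withColumn("Column_Name", InterDF.getItem(0))\
        .withColumn("Column_value", InterDF.getItem(1))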
*In step 1 the data is split on the pipe, which creates 60 columns.
In step 2 I want to split the output of step 1 again, this time on the colon.*
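
To make the two-stage split concrete, here is a minimal sketch in plain Python, independent of Spark. The helper name `split_record` and its parameters are hypothetical and only illustrate the logic: split a raw line on the pipe first, then split each field on the first colon and drop the surrounding quotes.

```python
def split_record(line, field_sep="|", kv_sep=":"):
    """Split one raw line into a list of (key, value) pairs."""
    pairs = []
    for field in line.split(field_sep):
        # Split on the first colon only, so a value containing ":" stays intact
        key, _, value = field.partition(kv_sep)
        # Remove the double quotes that wrap each value in the sample data
        pairs.append((key.strip(), value.strip().strip('"')))
    return pairs


sample = 'ABC:"MobileData"|XYZ:"TableData"|ZXC:"MacData"|MNB:"WindowData"'
pairs = split_record(sample)
for key, value in pairs:
    print(key, value)
```

Splitting on the first colon only is a design choice made here as an assumption; if values never contain a colon, a plain split gives the same pairs.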
*Can anyone please help me get the expected result?*
*File format:
ABC:"MobileData"|XYZ:"TableData"|ZXC:"MacData"|MNB:"WindowData"
Expected result:
ABC        | XYZ       | ZXC     | MNB
MobileData | TableData | MacData | WindowData*
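
The mapping from the file format to the expected result can be sketched in plain Python as follows. This is only an illustration of the reshaping, not a Spark solution; `lines_to_table` is a hypothetical helper, and it assumes that the keys of the first line define the column order and that a key missing from a later line becomes an empty string.

```python
def lines_to_table(lines):
    """Return (header, rows); header is the key order of the first line."""
    header, rows = [], []
    for line in lines:
        record = {}
        for field in line.strip().split("|"):
            # "ABC:\"MobileData\"" -> key "ABC", value "MobileData"
            key, _, value = field.partition(":")
            record[key] = value.strip('"')
        if not header:
            header = list(record)
        # Missing keys become empty strings (an assumption of this sketch)
        rows.append([record.get(k, "") for k in header])
    return header, rows


sample_lines = [
    'ABC:"MobileData"|XYZ:"TableData"|ZXC:"MacData"|MNB:"WindowData"',
]
header, rows = lines_to_table(sample_lines)
print(" | ".join(header))
for row in rows:
    print(" | ".join(row))
```

In PySpark, a function like this could in principle be applied to each raw line before a DataFrame is built; that wiring is not shown here and depends on how the files are loaded.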