12-18-2020
08:02 AM
*Reading the files listed in a lookup file and adding Location, Country, and State columns to each record.*

*Step 1:*

    for line in lines:
        SourceDf = sqlContext.read.format("csv").option("delimiter", "|").load(line)
        SourceDf.withColumn("Location", lit("us"))\
            .withColumn("Country", lit("Richmnd"))\
            .withColumn("State", lit("NY"))

*Step 2: looping over each column of the DataFrame above and doing a split operation, but I am getting only two columns in KeyValueDF.*

    for col_num in SourceDf.columns:
        InterDF = pyspark.sql.functions.split(SourceDf[col_num], ":")
        KeyValueDF = SourceDf.withColumn("Column_Name", InterDF.getItem(0))\
            .withColumn("Column_value", InterDF.getItem(1))

*In step 1 the data is split on the pipe character, which creates 60 columns. In step 2 I want to split the output of step 1 again, this time on the colon.*

*Can anyone please help me get the expected result?*

*File format:*

    ABC:"MobileData"|XYZ:"TableData"|ZXC:"MacData"|MNB:"WindowData"

*Expected result:*

    ABC        | XYZ       | ZXC     | MNB
    MobileData | TableData | MacData | WindowData
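A likely reason only two new columns show up: every pass of the step 2 loop starts again from the source DataFrame and writes to the same two column names, so each iteration overwrites the previous one and only the last column's split survives. Below is a minimal plain-Python sketch of the per-record logic the expected result seems to call for: split on the pipe, split each field on the first colon, and use the key as the column name. The helper names `parse_record` and `records_to_table` are hypothetical, and the sketch assumes every field has the form `KEY:"value"`.

```python
def parse_record(line, field_sep="|", kv_sep=":"):
    """Turn 'ABC:"MobileData"|XYZ:"TableData"' into a dict of key -> value."""
    record = {}
    for field in line.strip().split(field_sep):
        if not field:
            continue  # skip empty fields, e.g. from a trailing pipe
        # Split only on the first separator so values containing ':' stay intact.
        key, _, value = field.partition(kv_sep)
        record[key.strip()] = value.strip().strip('"')
    return record


def records_to_table(lines):
    """Build (columns, rows) where the keys become the column names."""
    parsed = [parse_record(line) for line in lines if line.strip()]
    # Column order follows the first appearance of each key across all records.
    columns = []
    for row in parsed:
        for key in row:
            if key not in columns:
                columns.append(key)
    # A key missing from a record becomes None in that row.
    rows = [[row.get(col) for col in columns] for row in parsed]
    return columns, rows


sample = ['ABC:"MobileData"|XYZ:"TableData"|ZXC:"MacData"|MNB:"WindowData"']
columns, rows = records_to_table(sample)
print(columns)  # ['ABC', 'XYZ', 'ZXC', 'MNB']
print(rows)     # [['MobileData', 'TableData', 'MacData', 'WindowData']]
```

In Spark the same function could probably be applied per line by reading the file as text instead of CSV, for example mapping `parse_record` over `spark.read.text(path).rdd` (each row exposes the line as `value`) and building a DataFrame from the resulting dicts. That part is an untested assumption about the setup, not something taken from the original code.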