Member since: 12-17-2020
Posts: 3
Kudos Received: 0
Solutions: 0
12-18-2020
08:02 AM
*Step 1: read each file path from the lookup file and add Location, Country, and State columns for each record:*

for line in lines:
    SourceDf = sqlContext.read.format("csv").option("delimiter", "|").load(line)
    SourceDf = SourceDf.withColumn("Location", lit("us"))\
        .withColumn("Country", lit("Richmond"))\
        .withColumn("State", lit("NY"))

*Step 2: loop over each column of the DataFrame above and split it, but I am getting only two columns in KeyValueDF:*

for col_num in SourceDf.columns:
    InterDF = pyspark.sql.functions.split(SourceDf[col_num], ":")
    KeyValueDF = SourceDf.withColumn("Column_Name", InterDF.getItem(0))\
        .withColumn("Column_value", InterDF.getItem(1))

*In step 1 the data is split on the pipe and 60 columns are created. In step 2 I want to split the output of step 1 again, this time on the colon (":").*

*Can anyone help me get the expected result?*

*File format:*
ABC:"MobileData"|XYZ:"TabletData"|ZXC:"MacData"|MNB:"WindowData"

*Expected result:*
ABC        | XYZ        | ZXC     | MNB
MobileData | TabletData | MacData | WindowData
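A minimal sketch of one way to reach the expected layout, assuming a hypothetical input path input.txt and that every record carries the same keys in the same order. The key names are taken from the first record, each cell is split on its first colon, and the literal columns from step 1 are chained on at the end:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical input path; each record looks like
#   ABC:"MobileData"|XYZ:"TabletData"|ZXC:"MacData"|MNB:"WindowData"
src_df = (spark.read.format("csv")
          .option("delimiter", "|")
          .load("input.txt"))          # columns _c0, _c1, ... each holding KEY:"VALUE"

# Use one record to discover the key names, then build one expression per
# source column: the text before the first ":" becomes the new column name,
# the text after it (quotes removed) becomes the value.
first_row = src_df.first()
selected = []
for c in src_df.columns:
    key = first_row[c].split(":", 1)[0].strip('"')
    value = F.regexp_replace(F.split(F.col(c), ":").getItem(1), '"', "")
    selected.append(value.alias(key))

result_df = (src_df.select(*selected)
             .withColumn("Location", F.lit("us"))
             .withColumn("Country", F.lit("Richmond"))
             .withColumn("State", F.lit("NY")))
result_df.show(truncate=False)
# +----------+----------+-------+----------+--------+--------+-----+
# |ABC       |XYZ       |ZXC    |MNB       |Location|Country |State|
# +----------+----------+-------+----------+--------+--------+-----+
# |MobileData|TabletData|MacData|WindowData|us      |Richmond|NY   |
# +----------+----------+-------+----------+--------+--------+-----+
```

The reason step 2 only ever yields two columns is that each loop iteration overwrites the same Column_Name/Column_value pair; building one aliased expression per source column and selecting them all at once avoids that.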
12-17-2020
10:19 AM
Can anyone help me with my request, please?

Input file records:
"ABC":"Mobile"|"XYZ":"Tablet"|"LKJ":"MAC"|"TIME":"US"

Needed output:
ABC    | XYZ    | LKJ | TIME
Mobile | Tablet | MAC | US

I am reading the file in Databricks with a pipe delimiter, which gives me a number of columns; from there, how can I move forward? I used the pyspark.sql.functions.split method, but I am not getting the required output format. I have tried every scenario I could think of. Could you please help me?
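One hedged alternative sketch, assuming the file can be read as plain text (the path records.txt is hypothetical): parse each line into a Row whose field names come from the quoted keys, then let Spark infer the schema.

```python
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

def parse_record(line):
    # '"ABC":"Mobile"|"XYZ":"Tablet"|...' -> Row(ABC='Mobile', XYZ='Tablet', ...)
    pairs = (field.split(":", 1) for field in line.split("|"))
    return Row(**{k.strip('"'): v.strip('"') for k, v in pairs})

rows = spark.sparkContext.textFile("records.txt").map(parse_record)

# Older PySpark versions sort Row fields alphabetically, so pin the order here.
df = spark.createDataFrame(rows).select("ABC", "XYZ", "LKJ", "TIME")
df.show()
# +------+------+---+----+
# |   ABC|   XYZ|LKJ|TIME|
# +------+------+---+----+
# |Mobile|Tablet|MAC|  US|
# +------+------+---+----+
```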
Labels:
- Apache Spark