Member since: 12-17-2020
Posts: 3
Kudos Received: 0
Solutions: 0
12-18-2020
08:02 AM
*Step 1: read each file path from the lookup file and add Location, Country, and State columns for each record:*

for line in lines:
    SourceDf = sqlContext.read.format("csv").option("delimiter", "|").load(line)
    SourceDf = SourceDf.withColumn("Location", lit("us"))\
        .withColumn("Country", lit("Richmond"))\
        .withColumn("State", lit("NY"))

*Step 2: loop over each column of the DataFrame above and split it, but I am getting only two columns in KeyValueDF:*

for col_num in SourceDf.columns:
    InterDF = pyspark.sql.functions.split(SourceDf[col_num], ":")
    KeyValueDF = SourceDf.withColumn("Column_Name", InterDF.getItem(0))\
        .withColumn("Column_value", InterDF.getItem(1))

*In step 1 the data is split on the pipe and 60 columns are created. In step 2 I want to split the output of step 1 again, this time on the colon (":").*

*Can anyone help me get the expected result?*

*File format:*
ABC:"MobileData"|XYZ:"TabletData"|ZXC:"MacData"|MNB:"WindowData"

*Expected result:*
ABC        | XYZ        | ZXC     | MNB
MobileData | TabletData | MacData | WindowData
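A minimal sketch of one way to reach the expected layout, assuming a hypothetical input path input.txt and that every record carries the same keys in the same order. The key names are taken from the first record, each cell is split on its first colon, and the literal columns from step 1 are chained on at the end:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical input path; each record looks like
#   ABC:"MobileData"|XYZ:"TabletData"|ZXC:"MacData"|MNB:"WindowData"
src_df = (spark.read.format("csv")
          .option("delimiter", "|")
          .load("input.txt"))          # columns _c0, _c1, ... each holding KEY:"VALUE"

# Use one record to discover the key names, then build one expression per
# source column: the text before the first ":" becomes the new column name,
# the text after it (quotes removed) becomes the value.
first_row = src_df.first()
selected = []
for c in src_df.columns:
    key = first_row[c].split(":", 1)[0].strip('"')
    value = F.regexp_replace(F.split(F.col(c), ":").getItem(1), '"', "")
    selected.append(value.alias(key))

result_df = (src_df.select(*selected)
             .withColumn("Location", F.lit("us"))
             .withColumn("Country", F.lit("Richmond"))
             .withColumn("State", F.lit("NY")))
result_df.show(truncate=False)
# +----------+----------+-------+----------+--------+--------+-----+
# |ABC       |XYZ       |ZXC    |MNB       |Location|Country |State|
# +----------+----------+-------+----------+--------+--------+-----+
# |MobileData|TabletData|MacData|WindowData|us      |Richmond|NY   |
# +----------+----------+-------+----------+--------+--------+-----+
```

The reason step 2 only ever yields two columns is that each loop iteration overwrites the same Column_Name/Column_value pair; building one aliased expression per source column and selecting them all at once avoids that.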
12-17-2020
10:19 AM
Can anyone help me with my request, please?

Input file records:
"ABC":"Mobile"|"XYZ":"Tablet"|"LKJ":"MAC"|"TIME":"US"

Needed output:
ABC    | XYZ    | LKJ | TIME
Mobile | Tablet | MAC | US

I am reading the file in Databricks with a pipe delimiter, which gives me a number of columns; from there, how can I move forward? I used the pyspark.sql.functions.split method, but I am not getting the required output format. I have tried every scenario I could think of. Could you please help me?
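One hedged alternative sketch, assuming the file can be read as plain text (the path records.txt is hypothetical): parse each line into a Row whose field names come from the quoted keys, then let Spark infer the schema.

```python
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

def parse_record(line):
    # '"ABC":"Mobile"|"XYZ":"Tablet"|...' -> Row(ABC='Mobile', XYZ='Tablet', ...)
    pairs = (field.split(":", 1) for field in line.split("|"))
    return Row(**{k.strip('"'): v.strip('"') for k, v in pairs})

rows = spark.sparkContext.textFile("records.txt").map(parse_record)

# Older PySpark versions sort Row fields alphabetically, so pin the order here.
df = spark.createDataFrame(rows).select("ABC", "XYZ", "LKJ", "TIME")
df.show()
# +------+------+---+----+
# |   ABC|   XYZ|LKJ|TIME|
# +------+------+---+----+
# |Mobile|Tablet|MAC|  US|
# +------+------+---+----+
```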
Labels:
- Apache Spark