
How to convert Spark DataFrames into XML files?

Explorer

How can I convert Spark DataFrames into XML files in Scala? I have a very large amount of data in an Oracle database that needs to be converted into XML files. How can we do that?

1 ACCEPTED SOLUTION


@Gundrathi babu

The package at https://spark-packages.org/package/HyukjinKwon/spark-xml has moved to Databricks. For Spark 2.0, use https://github.com/databricks/spark-xml; for older Spark versions, use the branch-0.3 branch: https://github.com/databricks/spark-xml/tree/branch-0.3

Start the shell like this (check for the proper version of the spark-xml package first):

spark-shell --packages com.databricks:spark-xml:0.1.1-s_2.10

+++

If this helped, please vote/accept best answer


5 REPLIES

Guru

@Gundrathi babu

You should use this package:

https://spark-packages.org/package/HyukjinKwon/spark-xml

val selectedData = df.select("author", "_id")
selectedData.write
    .format("com.databricks.spark.xml")
    .option("rootTag", "books")
    .option("rowTag", "book")
    .save("newbooks.xml")
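The snippet above assumes a DataFrame named df already exists. Since the question is about Oracle data, here is a minimal sketch of one way df might be loaded over JDBC before writing it out as XML. The connection URL, table name, and credentials are placeholders, not values from this thread, and the Oracle JDBC driver jar is assumed to be on the classpath along with spark-xml. It uses SparkSession, so it assumes Spark 2.0 or later.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session named `spark` already exists; getOrCreate() reuses it.
val spark = SparkSession.builder().appName("oracle-to-xml").getOrCreate()

// All connection details below are hypothetical placeholders.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // placeholder host and service
  .option("dbtable", "BOOKS")                             // placeholder table name
  .option("user", "my_user")                              // placeholder credentials
  .option("password", "my_password")
  .option("driver", "oracle.jdbc.OracleDriver")
  .load()

// Write every row as a <book> element under a <books> root element.
df.write
  .format("com.databricks.spark.xml")
  .option("rootTag", "books")
  .option("rowTag", "book")
  .save("newbooks.xml")
```

Note that the path passed to save becomes a directory holding one part file per partition, not a single file.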

Explorer

It's working fine, but the output XML was saved as 10 files and I need a single file. How can we do that?


Explorer

Thank you staca... it's working fine, but the output XML was saved as 10 files and I need a single file. How can we do that?

New Contributor

You can use .repartition(1):

DF.repartition(1) .....
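Spelled out a bit more, the suggestion might look like the sketch below. The helper name writeSingleXml is made up for illustration, and the rootTag/rowTag values are borrowed from the earlier reply; spark-xml is assumed to be on the classpath.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper: collapse the DataFrame to one partition so that
// Spark emits a single part file, then write it with spark-xml.
def writeSingleXml(df: DataFrame, path: String): Unit = {
  df.repartition(1)
    .write
    .format("com.databricks.spark.xml")
    .option("rootTag", "books") // tag names taken from the earlier reply
    .option("rowTag", "book")
    .save(path)
}

// Usage, assuming `df` is an existing DataFrame:
// writeSingleXml(df, "newbooks.xml")
```

Two caveats: the output path is still a directory, now containing a single part file, and with one partition all rows pass through a single task, which can be slow or run out of memory on very large data. coalesce(1) is a similar option that avoids a full shuffle.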