Support Questions

Find answers, ask questions, and share your expertise

How to convert Spark DataFrames into XML files?

Explorer

How can I convert Spark DataFrames into XML files in Scala? I have a very large amount of data in an Oracle database that needs to be written out as XML files. How can we do that?

1 ACCEPTED SOLUTION

Super Guru

@Gundrathi babu

The package at https://spark-packages.org/package/HyukjinKwon/spark-xml has moved to Databricks. Use https://github.com/databricks/spark-xml for Spark 2.0, or https://github.com/databricks/spark-xml/tree/branch-0.3 for older versions.

You should start the shell like this (check the proper version of spark-xml package):

spark-shell --packages com.databricks:spark-xml_2.11:0.4.1

+++

If this helped, please vote/accept best answer

5 REPLIES

Guru

@Gundrathi babu

You should use this package:

https://spark-packages.org/package/HyukjinKwon/spark-xml

val selectedData = df.select("author", "_id")
selectedData.write
    .format("com.databricks.spark.xml")
    .option("rootTag", "books")
    .option("rowTag", "book")
    .save("newbooks.xml")
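
For anyone who wants to try the write above end to end, here is a minimal sketch. It assumes Spark 2.x and a spark-shell started with the spark-xml package on the classpath. The sample rows, the app name, and the output path are made up for illustration; in practice df would be the DataFrame loaded from your own source.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.hadoop.fs.{FileSystem, Path}

// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder().appName("xml-write-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for the real data.
val df = Seq(
  ("bk101", "Gambardella, Matthew"),
  ("bk102", "Ralls, Kim")
).toDF("_id", "author")

df.select("author", "_id")
  .write
  .format("com.databricks.spark.xml")
  .option("rootTag", "books")
  .option("rowTag", "book")
  .mode("overwrite") // makes the sketch re-runnable
  .save("newbooks.xml")

// The save path is a directory, not a single file:
// Spark writes one part file per partition of the DataFrame.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val partFiles = fs.listStatus(new Path("newbooks.xml"))
  .map(_.getPath.getName)
  .filter(_.startsWith("part-"))
partFiles.foreach(println)
```

Note that "newbooks.xml" ends up as a directory holding part files, so the number of output files follows the number of partitions.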

Explorer

It's working fine, but the output XML was saved as 10 files and I need a single file. How can we do that?

Explorer

Thank you staca... It's working fine, but the output XML was saved as 10 files and I need a single file. How can we do that?

New Contributor

You can use .repartition(1) before writing:

DF.repartition(1) .....
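
A minimal sketch of the repartition approach, assuming Spark 2.x and a spark-shell started with the spark-xml package. The sample data, app name, and output path are hypothetical. The input is deliberately split into 10 partitions to mimic the 10 output files from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.hadoop.fs.{FileSystem, Path}

// In spark-shell, getOrCreate() returns the session that already exists.
val spark = SparkSession.builder().appName("single-xml-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data, spread over 10 partitions.
val df = (1 to 100)
  .map(i => (s"bk$i", s"author$i"))
  .toDF("_id", "author")
  .repartition(10)

// Collapse to a single partition so that only one part file is written.
df.repartition(1)
  .write
  .format("com.databricks.spark.xml")
  .option("rootTag", "books")
  .option("rowTag", "book")
  .mode("overwrite") // makes the sketch re-runnable
  .save("single_books_xml")

// The output is still a directory, but it now holds one part file.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val partFiles = fs.listStatus(new Path("single_books_xml"))
  .map(_.getPath.getName)
  .filter(_.startsWith("part-"))
println(partFiles.mkString(", "))
```

With a single partition, all rows pass through one task, so for very large data the write can be slow or run short of memory on that executor. coalesce(1) is a similar option that avoids a full shuffle.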