
I need to edit my Parquet files and rename fields, replacing spaces with underscores



Hello,

I am running into the same problem described in the following Stack Overflow topics:

https://stackoverflow.com/questions/45804534/pyspark-org-apache-spark-sql-analysisexception-attribute-name-contains-inv

https://stackoverflow.com/questions/38191157/spark-dataframe-validating-column-names-for-parquet-writes-scala

I have tried all the solutions mentioned there, but I get the same error every time. It seems that Spark cannot read fields with spaces in their names.

So I am looking for another way to simply rename my fields and save the Parquet files back.

After that, I will continue my transformations with Spark.

 

Can anyone help me out? Loads of love and thanks.

PS: I am using PySpark.
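
For reference, one way the rename step could be done outside Spark is sketched below. This is only a sketch under assumptions, not a confirmed fix for the error in the linked topics: it assumes the `pyarrow` package is installed, that each file fits in memory, and that only top-level column names need changing (nested struct fields are not touched). The helper names `sanitize` and `rename_parquet_columns` are made up for illustration. The character set is the one Spark lists in its "Attribute name ... contains invalid character(s)" message.

```python
import re

# Characters listed in Spark's "invalid character(s)" error for Parquet
# column names: space , ; { } ( ) newline tab =
_INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize(name: str) -> str:
    """Replace each invalid character (including spaces) with an underscore."""
    return _INVALID_CHARS.sub("_", name)


def rename_parquet_columns(src_path: str, dst_path: str) -> list:
    """Read one Parquet file, rename its top-level columns, write it back.

    Hypothetical helper; assumes pyarrow is installed and the file fits
    in memory. Nested field names are left unchanged.
    """
    import pyarrow.parquet as pq  # imported lazily so sanitize() works without it

    table = pq.read_table(src_path)
    table = table.rename_columns([sanitize(c) for c in table.column_names])
    pq.write_table(table, dst_path)
    return table.column_names
```

Calling `rename_parquet_columns("in.parquet", "out.parquet")` (both paths are placeholders) would write a copy whose column names contain no spaces, which can then be loaded with `spark.read.parquet(...)` as usual. If two original names collapse to the same sanitized name, the output would have duplicate columns, so it is worth checking the returned list first.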

 

 
