Support Questions
Find answers, ask questions, and share your expertise

Run .hql file in Spark and store result as txt or csv

I am looking to export extracts from Hive databases in CSV or TXT format using HQL files. In Hive, you can achieve this by running the command below:

hive -f test.hql >> test.csv

I tried the command below in Spark, which works for a smaller dataset but fails when I run a complex query on larger data: I get an out-of-memory exception from the driver.

spark-sql --name my-extract --num-executors 20 --master yarn --executor-memory 5G --driver-memory 10G --queue batch -S -f test.hql >> test.csv

I saw some solutions that use Scala code submitted through spark-submit, but our users are more comfortable with SQL and a hive -f style export.
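One SQL-only workaround I have been considering (a sketch only; the output directory, delimiter, and SELECT below are placeholders, not the real contents of test.hql) is to have the query write its result directly from the executors with INSERT OVERWRITE DIRECTORY, so rows never stream through the driver:

```sql
-- Hypothetical test.hql: write the result set from the executors
-- into an HDFS directory as comma-delimited text, instead of
-- collecting rows to the driver's stdout.
-- '/tmp/my_extract' and the SELECT are placeholders for the real query.
INSERT OVERWRITE DIRECTORY '/tmp/my_extract'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM my_db.my_table;
```

The part files under /tmp/my_extract could then be combined into a single file with something like hdfs dfs -getmerge /tmp/my_extract test.csv. I am not sure this is the cleanest approach, so other suggestions are welcome.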

Thanks in advance!!
