Spark-sql command not found

New Contributor

Hello folks!

I'm trying to run Spark SQL with this command:

sudo -i -u hdfs spark-sql --conf spark.executor.memory=1g --conf spark.akka.frameSize=1024 --conf spark.driver.memory=10g --conf spark.dynamicAllocation.initialExecutors=10 --conf spark.dynamicAllocation.maxExecutors=10 --conf spark.executor.extraJavaOptions="-XX:MaxPermSize=1024M" --conf spark.kryoserializer.buffer.max=1024m -f hdfs:////uploadFiles/BI/sinapse.sql

But the terminal returns: "-bash: spark-sql: command not found"

What am I doing wrong?

I'm using CDH 5.10.0.

2 REPLIES

Champion

I have not seen or heard of a spark-sql binary for launching Spark jobs. My best guess is that it is used in conjunction with the Spark Thrift server. That feature of Spark is not included or supported in CDH. That is not to say you can't use it, but the spark-sql binary will not exist by default.

If you have already installed the Spark Thrift server, then you need to add the Spark SQL CLI as well (and add its directory to your $PATH if you want to run it without typing the full path).

https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_spark_ki.html
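
For the $PATH step mentioned above, something like the following might work. It is only a sketch: SPARK_BIN_DIR and its default /opt/spark/bin are hypothetical, so substitute whatever directory actually holds the Spark SQL CLI on your node.

```shell
# Hypothetical location of a separately installed Spark; adjust to your setup.
SPARK_BIN_DIR="${SPARK_BIN_DIR:-/opt/spark/bin}"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$SPARK_BIN_DIR:"*) ;;
  *) PATH="$SPARK_BIN_DIR:$PATH" ;;
esac
export PATH

# Report whether the shell can now resolve spark-sql.
if command -v spark-sql >/dev/null 2>&1; then
  echo "spark-sql found at: $(command -v spark-sql)"
else
  echo "spark-sql still not on PATH"
fi
```

Keep in mind that "sudo -i -u hdfs" starts a login shell for the hdfs user, so a PATH change made only in your own session will not be seen there. Either make the change in that user's profile or call the binary by its full path.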

Champion

@Akira191

1. Go to Cloudera Manager -> Spark -> Instances and identify the node where the Spark server is installed.

2. Log in to that node from the command line and go to "/opt/cloudera/parcels/CDH-<version>/lib/spark/bin". It lists the binaries (spark-shell, pyspark, spark-submit, etc.) used to start Spark sessions and submit jobs. If spark-sql is there, you can run the command you mentioned. In your case the spark-sql binary is most likely missing, which is why you get this error. You will need to talk to your admin.
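
The check in step 2 could be scripted roughly like this. It is a minimal sketch: list_spark_launchers is a made-up helper name, and the launcher names are just the ones mentioned in this thread.

```shell
# Hypothetical helper: report which Spark launchers exist in a given directory.
list_spark_launchers() {
  bin_dir="$1"
  for name in spark-shell pyspark spark-submit spark-sql; do
    if [ -x "$bin_dir/$name" ]; then
      echo "present: $name"
    else
      echo "missing: $name"
    fi
  done
}

# Example call, using the parcel path from step 2. The version is left as a
# placeholder on purpose; fill in the one installed on your node.
# list_spark_launchers "/opt/cloudera/parcels/CDH-<version>/lib/spark/bin"
```

If the output shows "missing: spark-sql", the "command not found" error is expected on that node.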