Created 06-20-2019 04:20 AM
For Hadoop 3.1, I am using the Hive Warehouse Connector (HWC) to connect to Hive and execute queries. Below is the code snippet used for testing.
spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar

import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
val query = sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").foreach(qry => hive.executeQuery(qry))
query.show()
This throws the error given below.
<console>:30: error: value show is not a member of Unit
Is this not the proper way to run an .hql file using Spark?
Please suggest.
Created 06-20-2019 08:37 PM
This is a known issue in Spark, reported in JIRA SPARK-24260 and not yet resolved.
One way of doing this is to execute one query at a time, i.e., after reading the .hql file we can access the array elements by their indexes (0), (1):
val df1 = spark.sql(sc.textFile("/user/temp/hive.hql").collect().mkString.split(";")(0))
val df2 = spark.sql(sc.textFile("/user/temp/hive.hql").collect().mkString.split(";")(1))
(or)
If you just want to execute all the queries and see the results on the console, then try this approach:
sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").map(x => spark.sql(x).show())
This executes every query in the .hql script and displays the results on the console.
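As an aside, splitting on ";" can produce empty fragments (for example, from a trailing semicolon or blank lines), and joining the collected lines with a plain mkString can glue tokens from adjacent lines together. A minimal sketch that guards against both, assuming the usual spark-shell variables sc and spark:

// Read the script, join lines with newlines, split into statements,
// drop blank fragments, and run each statement through spark.sql.
val statements = sc.textFile("/user/temp/hive.hql")
  .collect()
  .mkString("\n")
  .split(";")
  .map(_.trim)
  .filter(_.nonEmpty)

statements.foreach(stmt => spark.sql(stmt).show())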
Created 06-21-2019 04:20 AM
Thanks Shu. The command below worked for me:
sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").map(x => hive.executeQuery(x).show())
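For reference, here is a consolidated sketch of this working approach in spark-shell; the trim/filter step is an added safeguard that is not part of the original command.

import com.hortonworks.hwc.HiveWarehouseSession

// Build the HWC session against the running SparkSession.
val hive = HiveWarehouseSession.session(spark).build()

// Read the .hql script, split it into statements, skip blank fragments,
// and execute each statement through HWC, showing its result.
sc.textFile("/user/temp/hive.hql")
  .collect()
  .mkString("\n")
  .split(";")
  .map(_.trim)
  .filter(_.nonEmpty)
  .foreach(stmt => hive.executeQuery(stmt).show())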
Created 07-02-2019 06:09 PM
The above question was originally posted in the Community Help track. On Tue Jul 2 18:08 UTC 2019, a member of the HCC moderation staff moved it to the Data Science & Advanced Analytics track. The Community Help track is intended for questions about using the HCC site itself, not technical questions about the HWC connector.