HQL to Spark SQL: Hive query execution giving error
Labels: Apache Spark
Created ‎06-20-2019 04:20 AM
For Hadoop 3.1 I am using the HWC connector to connect to Hive and execute queries. Below is the code snippet used for testing.

spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar

import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession._
val hive = HiveWarehouseSession.session(spark).build()
val query = sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").foreach(qry => hive.executeQuery(qry))
query.show()
This is throwing error as given below.
<console>:30: error: value show is not a member of Unit
Is this not the proper way to run an .hql file using Spark? Please suggest.
Created ‎06-20-2019 08:37 PM
This is a known issue in Spark, reported in SPARK-24260 and not yet resolved. (The compile error itself arises because foreach returns Unit, so there is no result on which to call .show().)

One way around it is to execute one query at a time, i.e. after reading the .hql file, access the array elements by their indexes: (0), (1), and so on.

val queries = sc.textFile("/user/temp/hive.hql").collect().mkString.split(";")
val df1 = spark.sql(queries(0))
val df2 = spark.sql(queries(1))

(or)

If you just want to execute the queries and see the results on the console, try this approach:

sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").map(x => spark.sql(x).show())

This executes all the queries in the .hql script and displays their results on the console.
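The map-versus-foreach distinction above is the crux of the original error, and can be seen with plain Scala collections, no Spark needed (the strings here are illustrative, not from the thread):

```scala
// Array.foreach returns Unit, so the result of
// "...foreach(qry => ...)" has no .show() method to call.
// Array.map, by contrast, returns a new collection of the
// per-element results, which is why the map-based variant works.
object ForeachVsMap {
  val parts: Array[String] = "a;b;c".split(";")

  // foreach: runs the side effect, yields Unit
  val fromForeach: Unit = parts.foreach(identity)

  // map: yields a new Array built from each element's result
  val fromMap: Array[String] = parts.map(_.toUpperCase)
}
```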
Created ‎06-21-2019 04:20 AM
Thanks Shu. The command below worked for me:

sc.textFile("/user/temp/hive.hql").collect().mkString.split(";").map(x => hive.executeQuery(x).show())
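One caveat with a bare split(";"): whitespace-only fragments (e.g. from a stray semicolon or blank statement in the script) would be passed to executeQuery and fail. A slightly more defensive sketch, with splitHql as a hypothetical helper and the query text invented for illustration:

```scala
// Hypothetical helper: split an HQL script into statements,
// trimming whitespace and dropping blank fragments so that a
// stray ";" or whitespace-only statement is never submitted.
object HqlSplitter {
  def splitHql(script: String): Array[String] =
    script.split(";").map(_.trim).filter(_.nonEmpty)

  // Example on an in-memory script (contents are an assumption):
  val stmts: Array[String] =
    splitHql("select * from t1;\nselect count(*) from t2;\n")
}
```

The same pipeline then becomes `HqlSplitter.splitHql(script).map(q => hive.executeQuery(q).show())`, skipping empties automatically.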
Created ‎07-02-2019 06:09 PM
The above question was originally posted in the Community Help track. On Tue Jul 2 18:08 UTC 2019, a member of the HCC moderation staff moved it to the Data Science & Advanced Analytics track. The Community Help track is intended for questions about using the HCC site itself, not technical questions about the HWC connector.
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
