
How to connect and run a Hive query from Apache Spark in Java


I am running a Spark application in Spring. Now I want to connect to Hive and run a Hive query from Spring Tool Suite itself.

How can I do this?

I learned that HiveContext could be used, but I am not sure how to use it.

1 ACCEPTED SOLUTION


A simple Spark 1.x Java application that lists the tables in the Hive metastore is as follows:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

public class SparkHiveExample {
  public static void main(String[] args) {
    // Configure and start the Spark application.
    SparkConf conf = new SparkConf().setAppName("SparkHive Example");
    SparkContext sc = new SparkContext(conf);

    // HiveContext picks up the Hive configuration (hive-site.xml on the
    // classpath) and gives access to tables registered in the Hive metastore.
    HiveContext hiveContext = new HiveContext(sc);

    // Run a HiveQL statement; the result comes back as a DataFrame.
    DataFrame df = hiveContext.sql("show tables");
    df.show();

    sc.stop();
  }
}

Note that Spark pulls metadata from the Hive metastore and uses HiveQL for parsing the queries, but the queries themselves are executed by Spark's execution engine, not by Hive.
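If you are on Spark 2.x, HiveContext is deprecated in favour of SparkSession. A minimal sketch of the same query using SparkSession with Hive support enabled (assuming Spark 2.x and a hive-site.xml on the classpath; the class name here is just an illustration) would look like this:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSessionHiveExample {
  public static void main(String[] args) {
    // enableHiveSupport() wires the session to the Hive metastore
    // configured in hive-site.xml.
    SparkSession spark = SparkSession.builder()
        .appName("SparkHive Example")
        .enableHiveSupport()
        .getOrCreate();

    // The same HiveQL statement; the result is a Dataset<Row>.
    Dataset<Row> df = spark.sql("show tables");
    df.show();

    spark.stop();
  }
}

Either version can be packaged as a jar and launched with spark-submit, as long as hive-site.xml is available on the driver's classpath so Spark can locate the metastore.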
