Master Guru

I have written a small Java 8 Spring Boot application to view Twitter tweets that I ingested with NiFi and stored as JSON files in HDFS. I have an external Hive table on top of those files. From the raw tweets, I ran a Spark Scala job that added Stanford CoreNLP sentiment and saved the results to an ORC Hive table. That is the table I am querying in my Spring Boot visualization program. To show something in a simple AngularJS HTML5 page, I also query the microservice, which has a method that calls Spring Social Twitter to get live tweets. For this you will need JDK 1.8 and Maven installed on your machine or VM. I used Eclipse as my IDE, though I usually use IntelliJ; either will work fine.

Java Bean

Only a few of the fields are shown here. This bean holds our Hive data and is serialized as JSON for transport to AngularJS.

import java.io.Serializable;

public class Twitter2 implements Serializable {
 private static final long serialVersionUID = 7409772495079484269L;
 private String geo;
 private String unixtime;
 private String handle;
 private String location;
 private String tag;
 private String tweet_id;
 // .... remaining members omitted
}

Core Spring Boot App

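package com.dataflowdeveloper;

import javax.sql.DataSource;
import org.apache.commons.dbcp.BasicDataSource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.social.twitter.api.Twitter;
import org.springframework.social.twitter.api.impl.TwitterTemplate;

// The annotations, @Value keys, and DataSource setter calls below are reconstructed
// from the imports and fields and are assumptions; only hivepassword is named in
// this article. Match the keys to your own properties file.
@SpringBootApplication
public class HiveApplication {
 public static void main(String[] args) {
  SpringApplication.run(HiveApplication.class, args);
 }

 @Configuration
 static class LocalConfiguration {
  Logger logger = LoggerFactory.getLogger(LocalConfiguration.class);

  @Value("${consumerkey}")
  private String consumerKey;

  @Value("${consumersecret}")
  private String consumerSecret;

  @Value("${accesstoken}")
  private String accessToken;

  @Value("${accesstokensecret}")
  private String accessTokenSecret;

  // Client for live tweets through Spring Social Twitter.
  @Bean
  public Twitter twitter() {
   Twitter twitter = null;

   try {
    twitter = new TwitterTemplate(consumerKey, consumerSecret, accessToken, accessTokenSecret);
   } catch (Exception e) {
    logger.error("Error:", e);
   }
   return twitter;
  }

  @Value("${hiveuri}")
  private String databaseUri;

  @Value("${hivepassword}")
  private String password;

  @Value("${hiveusername}")
  private String username;

  // Connection pool for HiveServer2 over JDBC.
  @Bean
  public DataSource dataSource() {
   BasicDataSource dataSource = new BasicDataSource();
   dataSource.setDriverClassName("org.apache.hive.jdbc.HiveDriver");
   dataSource.setUrl(databaseUri);
   dataSource.setUsername(username);
   dataSource.setPassword(password);
   logger.error("Initialized Hive");
   return dataSource;
  }
 }
}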

Rest Controller

This is a Spring Boot class annotated with @RestController. It exposes a pretty simple query that can be called from curl or any REST client, such as AngularJS via $http({method: 'GET', url: '/query/' + $query}).success(function(data) {$scope.tweetlist = data; // response data});

    public List<Twitter2> query(@PathVariable(value="query") String query) 

Datasource Service

Just regular plain old JDBC.

package com.dataflowdeveloper;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

// The annotations and the querylimit key are reconstructed from the imports
// and are assumptions.
@Component
public class DataSourceService {
 Logger logger = LoggerFactory.getLogger(DataSourceService.class);

 @Autowired
 public DataSource dataSource;

 public Twitter2 defaultValue() {
  return new Twitter2();
 }

 @Value("${querylimit}")
 private String querylimit;

 public List<Twitter2> search(String query) {
  List<Twitter2> tweets = new ArrayList<>();
  // .... JDBC query against Hive omitted in this listing
  return tweets;
 }
}
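
The body of search is not shown in the listing, so here is a minimal sketch of the plain JDBC pattern it describes. The table name tweets, the column msg, and the class and method names are hypothetical; only handle and location come from the bean above. Treat it as one possible shape under those assumptions, not as the original implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Sketch only: table "tweets" and column "msg" are hypothetical names.
class HiveSearchSketch {

    // Builds the parameterized query. The limit is an int, so it is safe to
    // append directly; the search term stays a bind parameter.
    static String buildSearchSql(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return "SELECT handle, location, msg FROM tweets WHERE msg LIKE ? LIMIT " + limit;
    }

    // Wraps the user's term in SQL wildcards for a substring match.
    static String likePattern(String term) {
        return "%" + term + "%";
    }

    // Runs the query and returns one String[] {handle, location, msg} per row.
    static List<String[]> search(DataSource dataSource, String term, int limit) throws SQLException {
        List<String[]> rows = new ArrayList<>();
        // try-with-resources closes the result set, statement, and connection in order.
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(buildSearchSql(limit))) {
            ps.setString(1, likePattern(term));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(new String[] {rs.getString(1), rs.getString(2), rs.getString(3)});
                }
            }
        }
        return rows;
    }
}
```

Keeping the search term as a bind parameter avoids building SQL from raw user input, which matters because the term arrives straight from the URL.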

Under src/main/resources I have a properties file (could be YAML or properties style) with a few name/value pairs like hivepassword=secretstuff.
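
To make the name/value format concrete, here is a small sketch that parses sample properties text with java.util.Properties. Only hivepassword=secretstuff, port 9999, and the JDBC URI (taken from the log output further down) come from this article; the other key names and values are hypothetical, and Spring Boot itself would normally load the file for you.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch only: key names other than hivepassword and server.port are hypothetical.
class PropertiesSketch {

    static final String SAMPLE =
        "server.port=9999\n"
        + "hiveuri=jdbc:hive2://localhost:10000/default\n"
        + "hiveusername=hive\n"
        + "hivepassword=secretstuff\n"
        + "querylimit=250\n";

    // Parses name=value text the same way a .properties file on disk is parsed.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader does not fail on read; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

Each key ends at the first equals sign, so the colons inside the JDBC URI stay part of the value.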

Maven Build Script

I had some issues with Spring Boot, Hadoop and Hive having multiple copies of log4j, so see my POM exclusions to prevent build issues.
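
As an illustration of that kind of exclusion, a dependency entry could look roughly like the fragment below. The hive.version property is hypothetical, and which artifacts actually need excluding depends on your own dependency tree (mvn dependency:tree will show where the duplicate log4j bindings come from).

```xml
<!-- Sketch only: ${hive.version} is a hypothetical property; match it to your cluster. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>${hive.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```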

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="" xmlns:xsi="" xsi:schemaLocation="">
 <description>Apache Hive Spring Boot</description>
  <relativePath /> <!-- lookup parent from repository -->

To Build

mvn package -DskipTests

To Run

I have included Jetty in my POM, so the server runs with Jetty.

java -Xms512m -Xmx512m -Dhdp.version= -jar target/hive-0.0.1-SNAPSHOT.jar

An IPv4 stack is required in some networking environments (the JVM flag for that is -Djava.net.preferIPv4Stack=true), and I set the HDP version to that of the Sandbox I am calling. If you are using the Sandbox, make sure the Thrift port is open and available. You may need more RAM depending on what you are doing; a few gigabytes wouldn't hurt if you have it.

[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.982 s
[INFO] Finished at: 2016-08-26T16:35:56-04:00
[INFO] Final Memory: 28M/447M
[INFO] ------------------------------------------------------------------------


Spring Boot lets you set an ASCII art banner with src/main/resources/banner.txt. I set the port to 9999 so as not to collide with Ambari or other HDP services.

2016-08-26 17:11:28.721  INFO 38783 --- [tp1841396611-12] org.apache.hive.jdbc.Utils               : Supplied authorities: localhost:10000
2016-08-26 17:11:28.722  INFO 38783 --- [tp1841396611-12] org.apache.hive.jdbc.Utils               : Resolved authority: localhost:10000
2016-08-26 17:11:28.722  INFO 38783 --- [tp1841396611-12] org.apache.hive.jdbc.HiveConnection      : Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000/default
2016-08-26 17:12:24.768 ERROR 38783 --- [tp1841396611-12] com.dataflowdeveloper.DataSourceService  : Size=1
2016-08-26 17:12:24.768 ERROR 38783 --- [tp1841396611-12] com.dataflowdeveloper.DataController     : Query:hadoop,IP: Browser:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36

URLs Made Available By Application

http://localhost:9999/timeline/<twitter handle>

http://localhost:9999/profile/<twitter handle>

http://localhost:9999/query/<hive query text>
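
Because the handle or query text is placed in the URL path, a client should encode it first. The sketch below builds the three URLs listed above; the class and method names are hypothetical, and it assumes the service is running locally on port 9999.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: helper names are hypothetical; the paths mirror the URLs listed above.
class EndpointUrls {

    static final String BASE = "http://localhost:9999";

    // URLEncoder targets form encoding, so it turns spaces into '+'.
    // In a path segment a space must be %20, hence the replacement.
    static String encodeSegment(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    static String timeline(String handle) {
        return BASE + "/timeline/" + encodeSegment(handle);
    }

    static String profile(String handle) {
        return BASE + "/profile/" + encodeSegment(handle);
    }

    static String query(String text) {
        return BASE + "/query/" + encodeSegment(text);
    }
}
```

The resulting string can be handed to curl, HttpURLConnection, or any other REST client.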




New Contributor

Is it possible for the input to come from a search bar on a website's HTML page, which then queries HDFS through the above microservice?

What I understood from the above microservice implementation is that it provides a Maven-based CLI to query the HDFS database.

Correct me if I am wrong. I am a beginner in Hadoop.

Thanks & Regards

Vikram Pal

Master Guru

Sure, the input can come from anywhere you want, since it is just REST: GET or POST.

