Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Is my hive table stucture correct?

Highlighted

Is my hive table stucture correct?

New Contributor

I am trying to save streamed twitter data into a table. Here is the json format

 

CREATE EXTERNAL TABLE twitter (
    id BIGINT,
    created_at STRING,
    source STRING,
    favorited BOOLEAN,
    retweet_count INT,
    retweeted_status STRUCT<
    text:STRING,
    `user`:STRUCT<screen_name:STRING,name:STRING>>,
    entities STRUCT<
    urls:ARRAY<STRUCT<expanded_url:STRING>>,  user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
    hashtags:ARRAY<STRUCT<text:STRING>>>,
    text STRING,
    `user` STRUCT<
    screen_name:STRING,
    name:STRING,
    friends_count:INT,
    followers_count:INT,
    statuses_count:INT,
    verified:BOOLEAN,
    utc_offset:INT,
    time_zone:STRING>,
    in_reply_to_screen_name STRING
    )
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/cloudera/flume/tweets';
{
  "type": "record",
  "name": "Doc",
  "doc": "adoc",
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "user_friends_count",
      "type": [
        "int",
        "null"
      ]
    },
    {
      "name": "user_location",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "user_description",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "user_statuses_count",
      "type": [
        "int",
        "null"
      ]
    },
    {
      "name": "user_followers_count",
      "type": [
        "int",
        "null"
      ]
    },
    {
      "name": "user_name",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "user_screen_name",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "created_at",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "text",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "retweet_count",
      "type": [
        "long",
        "null"
      ]
    },
    {
      "name": "retweeted",
      "type": [
        "boolean",
        "null"
      ]
    },
    {
      "name": "in_reply_to_user_id",
      "type": [
        "long",
        "null"
      ]
    },
    {
      "name": "source",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "in_reply_to_status_id",
      "type": [
        "long",
        "null"
      ]
    },
    {
      "name": "media_url_https",
      "type": [
        "string",
        "null"
      ]
    },
    {
      "name": "expanded_url",
      "type": [
        "string",
        "null"
      ]
    }
  ]
}

 

I keep getting this error when trying to query using hue in cloudera

 

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce

and I get this error when trying the same query using terminal

 

Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected character ('O' (code 79)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')

 

 This is what my data file looks like

 

 

Don't have an account?
Coming from Hortonworks? Activate your account here