Support Questions
Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.

tweets sentinment hive 2.0 problem when I try to calculate polarity

about tutorial: https://hortonworks.com/tutorial/analyzing-social-media-and-customer-sentiment-with-apache-nifi-and-...

I have been following above tutorial and everything worked fine but when calculating whether a tweet was positive, neutral, or negative using this next Hive command :

create table IF NOT EXISTS tweets_sentiment stored as orc as select 
tweet_id, case when sum( polarity ) > 0 then 'positive' when sum( 
polarity ) < 0 then 'negative' else 'neutral' end as sentiment from 
l3 group by tweet_id;

I got the following error message:

tanks a million

 java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1501058816878_0021_1_01, diagnostics=[Task failed, taskId=task_1501058816878_0021_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more




Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1501058816878_0021_1_01 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1501058816878_0021_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1501058816878_0021_1_02 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
          (less...)



1 ACCEPTED SOLUTION

Rising Star

If you read carefully the exception message, it says:

Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]

This is likely to be caused by some CR-LF characters inside your "msg" object. Hive interprets each line as a full JSON. If you JSON contains newlines, Hive cannot parse it.

Thus you have to clean/reformat you data before being able to analyze it with Hive.

View solution in original post

2 REPLIES 2

Rising Star

If you read carefully the exception message, it says:

Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]

This is likely to be caused by some CR-LF characters inside your "msg" object. Hive interprets each line as a full JSON. If you JSON contains newlines, Hive cannot parse it.

Thus you have to clean/reformat you data before being able to analyze it with Hive.

Dear @Marco Gaido,

I deleted my Json files in hdfs and then I ingested some other tweets in json format. Now, hive is working well.

thanks a lot