Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

tweets sentinment hive 2.0 problem when I try to calculate polarity

avatar
Contributor

about tutorial: https://hortonworks.com/tutorial/analyzing-social-media-and-customer-sentiment-with-apache-nifi-and-...

I have been following above tutorial and everything worked fine but when calculating whether a tweet was positive, neutral, or negative using this next Hive command :

create table IF NOT EXISTS tweets_sentiment stored as orc as select 
tweet_id, case when sum( polarity ) > 0 then 'positive' when sum( 
polarity ) < 0 then 'negative' else 'neutral' end as sentiment from 
l3 group by tweet_id;

I got the following error message:

tanks a million

 java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1501058816878_0021_1_01, diagnostics=[Task failed, taskId=task_1501058816878_0021_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more




Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1501058816878_0021_1_01 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1501058816878_0021_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1501058816878_0021_1_02 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
          (less...)



1 ACCEPTED SOLUTION

avatar
Expert Contributor

If you read carefully the exception message, it says:

Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]

This is likely to be caused by some CR-LF characters inside your "msg" object. Hive interprets each line as a full JSON. If you JSON contains newlines, Hive cannot parse it.

Thus you have to clean/reformat you data before being able to analyze it with Hive.

View solution in original post

2 REPLIES 2

avatar
Expert Contributor

If you read carefully the exception message, it says:

Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]

This is likely to be caused by some CR-LF characters inside your "msg" object. Hive interprets each line as a full JSON. If you JSON contains newlines, Hive cannot parse it.

Thus you have to clean/reformat you data before being able to analyze it with Hive.

avatar
Contributor

Dear @Marco Gaido,

I deleted my Json files in hdfs and then I ingested some other tweets in json format. Now, hive is working well.

thanks a lot