Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Tweets sentiment: Hive 2.0 problem when I try to calculate polarity

New Member

About the tutorial: https://hortonworks.com/tutorial/analyzing-social-media-and-customer-sentiment-with-apache-nifi-and-...

I have been following the tutorial above and everything worked fine until I tried to calculate whether each tweet was positive, neutral, or negative using the following Hive command:

CREATE TABLE IF NOT EXISTS tweets_sentiment STORED AS ORC AS
SELECT tweet_id,
       CASE WHEN SUM(polarity) > 0 THEN 'positive'
            WHEN SUM(polarity) < 0 THEN 'negative'
            ELSE 'neutral' END AS sentiment
FROM l3
GROUP BY tweet_id;

Thanks a million for any help. I got the following error message:

 java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1501058816878_0021_1_01, diagnostics=[Task failed, taskId=task_1501058816878_0021_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more




Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"tweet_id":890076549655003136,"created_unixtime":1501045760917,"created_time":"Wed Jul 26 05:09:20 +0000 2017","lang":"en","displayname":"2ne1legend21","time_zone":"","msg":"RT hyung_rose_bts BTS_twt The cup couldnt have said it better 
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
	... 17 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]
	at org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:424)
	at org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:183)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:149)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:113)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:554)
	... 18 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1501058816878_0021_1_01 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1501058816878_0021_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1501058816878_0021_1_02 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1



1 ACCEPTED SOLUTION

Expert Contributor

If you read the exception message carefully, it says:

Row is not a valid JSON Object - JSONException: Unterminated string at 237 [character 238 line 1]

This is likely caused by CR/LF characters inside your "msg" field. Hive treats each line of the input file as one complete JSON object, so if a JSON record contains raw newlines, Hive cannot parse it. That would also explain why the row shown in the stack trace is cut off in the middle of the "msg" text.

Thus you have to clean or reformat your data before you can analyze it with Hive.
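
As a rough illustration of what such a cleanup could look like, here is a minimal Python sketch. It is not part of the tutorial; `normalize_json_lines` is a hypothetical helper name. It assumes the only defect is raw line breaks inside string values, and that a file is small enough to read into memory. It re-serializes every JSON object so that each record occupies exactly one line, with embedded newlines escaped as `\n`.

```python
import json


def normalize_json_lines(raw_text):
    """Return a list of strings, one complete JSON object per string.

    Hypothetical helper: assumes raw_text is a sequence of JSON objects
    whose string values may contain raw (unescaped) line breaks.
    """
    # strict=False lets the decoder accept raw control characters
    # (such as CR/LF) inside JSON strings instead of rejecting them.
    decoder = json.JSONDecoder(strict=False)
    pos, length = 0, len(raw_text)
    records = []
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < length and raw_text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(raw_text, pos)
        # json.dumps escapes newlines, so the result fits on a single line.
        records.append(json.dumps(obj, ensure_ascii=False))
    return records


if __name__ == "__main__":
    # The first record contains a raw newline inside "msg".
    sample = '{"tweet_id": 1, "msg": "line one\nline two"}\n{"tweet_id": 2, "msg": "ok"}\n'
    for line in normalize_json_lines(sample):
        print(line)
```

A record that is genuinely truncated (rather than merely split across lines) still raises `json.JSONDecodeError` here, so such input would need to be dropped or repaired separately before loading the cleaned file back into HDFS.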


2 REPLIES

New Member

Dear @Marco Gaido,

I deleted my JSON files in HDFS and then ingested some other tweets in JSON format. Now Hive is working well.

Thanks a lot.