
Sentiment Analysis - empty tweets_text table

New Contributor

Hello everyone,

I'm doing the tutorial and followed every step as described. I've now created all the tables (dictionary, time_zone_map, and tweets_text), but my tweets_text table is empty. The data flow is connected and my data shows up in the Banana dashboard. Which step did I miss, and how do I get my Twitter feeds into the table?

Regards,

Martin

14 REPLIES

Re: Sentiment Analysis - empty tweets_text table

Expert Contributor

Can you confirm you have tweet data on HDFS, in the /tmp/tweets_staging/ directory (I believe that is the one used in the article)? Run

hdfs dfs -ls /tmp/tweets_staging

to check (as the hdfs user).

You also need to run the ADD JAR command when creating the table (note: the file path might differ on your system):

ADD JAR /usr/hdp/2.5.3.0-37/hive2/lib/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
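If you don't have the tutorial's exact DDL to hand, the tweets_text table is typically defined as an external table over that staging directory using the JSON SerDe (the org.openx.data.jsonserde.JsonSerDe class that appears in the stack traces in this thread). A rough sketch only; the column list here is illustrative and should be taken from your version of the tutorial:

```sql
-- Sketch: column names are illustrative, not the tutorial's exact schema.
CREATE EXTERNAL TABLE IF NOT EXISTS tweets_text (
  id BIGINT,
  created_at STRING,
  text STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/tmp/tweets_staging';
```

Because the table is external, Hive reads whatever JSON files land in /tmp/tweets_staging at query time, so an empty directory simply yields an empty table with no error.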

Re: Sentiment Analysis - empty tweets_text table

New Contributor

Hey,

No, after following the normal steps there was no data. I then saved the data from the Solr Banana dashboard (top right) as a JSON file, uploaded that file to the directory, and created the table again, but it was full of garbage. I did that without the ADD JAR command, though (it is not in the tutorial at this point).

So now I tried again with the ADD JAR command. I did not get an error when creating the table, but when I run a SELECT on it, this is the output:

{"trace":"java.lang.Exception: Cannot fetch result for job. Job with id: 428 for instance: AUTO_HIVE_INSTANCE has either not started or has expired.\n\njava.lang.Exception: Cannot fetch result for job. Job with id: 428 for instance: AUTO_HIVE_INSTANCE has either not started or has expired.\n\tat org.apache.ambari.view.hive2.actor.message.job.FetchFailed.\u003cinit\u003e(FetchFailed.java:28)\n\tat org.apache.ambari.view.hive2.actor.OperationController.fetchResultActorRef(OperationController.java:200)\n\tat org.apache.ambari.view.hive2.actor.OperationController.handleMessage(OperationController.java:135)\n\tat org.apache.ambari.view.hive2.actor.HiveActor.onReceive(HiveActor.java:38)\n\tat akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)\n\tat akka.actor.Actor$class.aroundReceive(Actor.scala:467)\n\tat akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:487)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:220)\n\tat akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)\n\tat scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n\tat scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n\tat scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n\tat scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)\n","message":"Cannot fetch result for job. Job with id: 428 for instance: AUTO_HIVE_INSTANCE has either not started or has expired.","status":500}

Re: Sentiment Analysis - empty tweets_text table

New Contributor

If I try to create the table from the JSON file without the ADD JAR command, this is the result of a SELECT on the table:

{"trace":"org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with \u0027}\u0027 at 2 [character 3 line 1]\n\norg.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with \u0027}\u0027 at 2 [character 3 line 1]\n\tat org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:264)\n\tat org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:250)\n\tat org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:373)\n\tat org.apache.ambari.view.hive2.actor.ResultSetIterator.getNext(ResultSetIterator.java:119)\n\tat org.apache.ambari.view.hive2.actor.ResultSetIterator.handleMessage(ResultSetIterator.java:79)\n\tat org.apache.ambari.view.hive2.actor.HiveActor.onReceive(HiveActor.java:38)\n\tat akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)\n\tat akka.actor.Actor$class.aroundReceive(Actor.scala:467)\n\tat akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:487)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:220)\n\tat akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)\n\tat scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n\tat scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n\tat scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n\tat scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)\nCaused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - 
JSONException: A JSONObject text must end with \u0027}\u0027 at 2 [character 3 line 1]\n\tat org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:411)\n\tat org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:233)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:780)\n\tat org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:478)\n\tat org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:692)\n\tat org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1557)\n\tat org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1542)\n\tat org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)\n\tat org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with \u0027}\u0027 at 2 [character 3 line 1]\n\tat org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:520)\n\tat org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:427)\n\tat org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)\n\tat org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1762)\n\tat org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:406)\n\t... 
13 more\nCaused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with \u0027}\u0027 at 2 [character 3 line 1]\n\tat org.openx.data.jsonserde.JsonSerDe.onMalformedJson(JsonSerDe.java:412)\n\tat org.openx.data.jsonserde.JsonSerDe.deserialize(JsonSerDe.java:174)\n\tat org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:501)\n\t... 17 more\n","message":"Failed to fetch next batch for the Resultset","status":500}
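That SerDe error is the giveaway: the JSON SerDe deserializes each file line independently, so the data on HDFS must be one JSON object per line (NDJSON). A dashboard export is one large JSON document spread over many lines, so rows fail with "Row is not a valid JSON Object". A rough local check along these lines (the file name and sample rows are made up for the example):

```shell
# The SerDe needs exactly one JSON object per line; verify a sample file
# locally before uploading it to /tmp/tweets_staging.
file=/tmp/tweets_sample.json
printf '%s\n' '{"id":1,"text":"hello"}' '{"id":2,"text":"world"}' > "$file"

bad=0
while IFS= read -r line; do
  # json.load fails if the line is not a complete, standalone JSON value
  printf '%s' "$line" | python3 -c 'import json,sys; json.load(sys.stdin)' 2>/dev/null \
    || { bad=1; printf 'not a JSON object: %s\n' "$line"; }
done < "$file"

[ "$bad" -eq 0 ] && echo "all rows are valid JSON objects"
```

If a file fails this check, the fix is upstream: let the NiFi flow write the per-tweet JSON into the staging directory rather than exporting from Banana.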

Re: Sentiment Analysis - empty tweets_text table

New Contributor

OK, now I see that I did not actually save the data through Banana; it only saves the layout of the dashboard, so that cannot work.

But how can I save the data then? Which step in the tutorial saves the data? Because when I was at this step:

sudo -u hdfs hadoop fs -chown -R maria_dev /tmp/tweets_staging
sudo -u hdfs hadoop fs -chmod -R 777 /tmp/tweets_staging

I got a message that this directory does not exist, so I created it myself. Is that correct?

Re: Sentiment Analysis - empty tweets_text table

Expert Contributor

First things first - there should be data getting written to /tmp/tweets_staging on HDFS.

If the Banana UI dashboard is working and tweets are showing up in real time, your NiFi flow is working, but not the branch that writes to HDFS. (There is also a branch that writes to the OS filesystem.) What shows up in Banana comes from the branch that writes into Solr, where the data gets indexed.

Check the NiFi flow for errors; Hive will simply create an empty table when you run the CREATE TABLE command (assuming you do not get the errors above).

Moving on to Hive: are you doing this in the Hive view in Ambari? Can you also sanity-check Hive by creating a dummy table:

create table example1 (ID int, col1 string);

insert into example1 values (1, "hello");

select * from example1;

Is Hive working as expected?

Re: Sentiment Analysis - empty tweets_text table

New Contributor

Yes, Hive is working as expected; the example table is fine. I also get the real-time data shown in Banana and in Solr, where I can run queries on it. It just does not arrive in HDFS. What is this branch that should write the data to the HDFS folder /tmp/tweets_staging?

Re: Sentiment Analysis - empty tweets_text table

New Contributor

OK, now I know what you mean. I downloaded the NiFi template from the tutorial. When I check the template now and look at the PutHDFS processor, it says failure, 104 queued.

Re: Sentiment Analysis - empty tweets_text table

Expert Contributor

At a guess, the failure is likely because the processor is not pointing to the right directory, or the permissions are not correct. Check this by running (as the hdfs user):

hdfs dfs -mkdir /tmp/tweets_staging

hdfs dfs -chmod 777 /tmp/tweets_staging

You could even test that you can write to HDFS. For example, create a dummy file in /tmp on Linux and copy it up:

hdfs dfs -put /tmp/dummyfile /tmp/

The PutHDFS processor in NiFi needs to point to the /tmp/tweets_staging/ directory.

Re: Sentiment Analysis - empty tweets_text table

New Contributor

OK, I will check those points now.

Since I installed NiFi, I also have red status symbols and cannot restart the services: the SNameNode from HDFS, Falcon, Storm, Ambari Infra, and Atlas. See the attached image.

Could this also be part of the problem?

(attached screenshot: 13183-2017-03-03-11h20-34.png)
