Created on 11-28-2016 07:26 PM - edited 11-28-2016 07:28 PM
I am trying to query data in Hive that I fetched from Twitter using Flume, but I keep getting this error whenever I run a SELECT statement:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce
I also keep getting this unexpected-character error:
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected character ('O' (code 79)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
If I go to the Hue file browser and try to view the data file, it looks weird. I think my data is corrupt.
Here is my flumetwitter.conf file
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Describing/Configuring the source
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey=uX0TWqkx0okYEjjqLzxIx6mD6
TwitterAgent.sources.Twitter.consumerSecret=rzHIs3TMJnADbZNvdGU7LQUo0kPxPISq3RGSLfqcBip39X5END
TwitterAgent.sources.Twitter.accessToken=559516596-yDA9xqOljo4CV32wSnqsx2BXh4RBIRKFxZGSZrPC
TwitterAgent.sources.Twitter.accessTokenSecret=zDxePILZitS5tIWBhre0GWqps0FIj9OadX8RZb6w8ZCwz
TwitterAgent.sources.Twitter.maxBatchSize = 50000
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000
TwitterAgent.sources.Twitter.keywords=hadoop, bigdata, mapreduce, mahout, hbase, nosql

# Describing/Configuring the sink
TwitterAgent.sinks.HDFS.channel=MemChannel
TwitterAgent.sinks.HDFS.type=hdfs
TwitterAgent.sinks.HDFS.hdfs.path=hdfs://localhost:8020/user/cloudera/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType=DataStream
TwitterAgent.sinks.HDFS.hdfs.writeformat=Text
TwitterAgent.sinks.HDFS.hdfs.batchSize=1000
TwitterAgent.sinks.HDFS.hdfs.rollSize=0
TwitterAgent.sinks.HDFS.hdfs.rollCount=10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval=600

TwitterAgent.channels.MemChannel.type=memory
TwitterAgent.channels.MemChannel.capacity=10000
TwitterAgent.channels.MemChannel.transactionCapacity=1000

TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.HDFS.channel = MemChannel
And here is the command I'm using to run it:
flume-ng agent -n TwitterAgent -f /usr/lib/flume-ng/conf/flumetwitter.conf
Here is my full error log from when I try to query using Hive. It's too big to paste in full:
Error when I try to run it from terminal
pastebin error log when trying to query from table
What could my issue be? Is it my flumetwitter.conf file?
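One thing worth ruling out before calling the data corrupt: the 'O' (code 79) in the Jackson error means the JSON parser hit a capital O where it expected a JSON value. Avro object container files begin with the bytes "Obj" followed by 0x01, and the Flume user guide describes org.apache.flume.source.twitter.TwitterSource as converting tweets to Avro, so the files may be Avro rather than the JSON text the Hive SerDe expects. This is an assumption, not a confirmed diagnosis. Below is a minimal Python sketch that checks the first bytes of a file copied out of HDFS to the local disk; the function names and the local file name are hypothetical.

```python
# Magic bytes that open every Avro object container file: "Obj" + 0x01.
AVRO_MAGIC = b"Obj\x01"


def sniff_format(head: bytes) -> str:
    """Guess the format of a file from its first few bytes."""
    if head.startswith(AVRO_MAGIC):
        return "avro"
    # JSON text would start (after optional whitespace) with { or [.
    if head.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "unknown"


def sniff_file(path: str, nbytes: int = 16) -> str:
    """Read the first nbytes of a local file and guess its format."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(nbytes))


# Example (hypothetical local copy of one Flume output file):
# print(sniff_file("FlumeData.sample"))
```

If this reports "avro", the files are probably intact and only the table definition (a JSON SerDe reading Avro data) is mismatched. If it reports "unknown", the data may really be damaged or in some other format.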
Created 11-29-2016 08:50 AM
Anybody?
Created 12-13-2016 10:57 AM