Read flume twitter files with HIVE

Hello guys, I've used Flume to capture a few tweets. My agent and the whole Flume configuration run well; my problem comes when I need to read the output files with Hive.

I created an Avro schema file and reused it in Hive to create the table that stores the Flume data (the Flume output files are in Avro format). Once the table was ready, I checked it and confirmed that the format looks right, as you can see in the attached file tweettable.jpg.

Then I ran the command to load the Flume data into this table, and according to the result message in Hive the load completed as expected, even though numRows is reported as 0 (attached file load.jpg).

Finally, when I try to read the data, I get an error message saying that the data cannot be read. Unfortunately, I don't understand why. If someone can give me a hand with this, I would really appreciate it (attached file result.jpg).

34615-load.jpg

34616-result.jpg

If you need more details about the scripts and everything else I used in this test, I've posted it all in a GitHub repository: https://github.com/AndresUrregoAngel/Flume_Twitter

Thanks so much.

34614-tweettable.jpg

1 ACCEPTED SOLUTION

Re: Read flume twitter files with HIVE

Hi guys, I want to post the solution. In the end, I added the options below to my Flume configuration file:

TwitterAgent.sources.Twitter.maxBatchSize = 50000

TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000

thanks
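
For readers wondering where these two lines go, here is a minimal sketch of the source section of a Flume agent file. The agent name TwitterAgent and source name Twitter come from the post; the source type, the channel name MemChannel, and the credential placeholders are assumptions for illustration, since the post does not show the rest of the file. The type shown is the Apache Flume Twitter source, which emits Avro and, as far as I know, defaults both batch settings to 1000, so the values from the post raise them considerably.

```properties
# Sketch only: everything except the last two settings is assumed, not taken from the post.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel

# Assumed source type (Apache Flume Twitter source, Avro output).
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel

# Placeholders: replace with your own Twitter API credentials.
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET

# The two settings from the accepted solution.
# maxBatchSize: maximum number of tweets collected into one batch.
TwitterAgent.sources.Twitter.maxBatchSize = 50000
# maxBatchDurationMillis: maximum time in milliseconds to wait before closing a batch.
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000
```

After changing the file, the agent has to be restarted and new data collected; files already written with the old settings are not rewritten.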
