
Read Flume Twitter files with Hive

Expert Contributor

Hello guys, I've used Flume to capture a few tweets. My agent and the whole Flume configuration run well; my problem comes when I need to read the output files with Hive.

I created an Avro schema file and reused it in Hive to create the table that stores the Flume data (the Flume output files are in Avro format). Once the table was ready, I checked it and confirmed that the format looks right, as you can see in the attached file tweettable.jpg.

Then I ran the command to load the Flume data into this table. According to the result message in Hive, the load completed as expected, although numRows is reported as 0 (attached file load.jpg).

Finally, when I try to read the data, I get an error message saying that the data cannot be read. Unfortunately, I don't understand why. If someone can give me a hand with this, I would really appreciate it (attached file result.jpg).

34615-load.jpg

34616-result.jpg

If you need more details about the scripts and everything else I used in this test, I've posted all of it in a GitHub repository: https://github.com/AndresUrregoAngel/Flume_Twitter

thanks so much

34614-tweettable.jpg

1 ACCEPTED SOLUTION

Expert Contributor

Hi guys, I want to post the solution. In the end, I added the options below to my Flume configuration file:

TwitterAgent.sources.Twitter.maxBatchSize = 50000

TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000
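For context, here is a minimal sketch of where these two options would sit in the agent file. It assumes the source is the Apache Flume TwitterSource, which writes Avro and, as far as the Flume User Guide documents it, defaults both settings to 1000. The channel name MemChannel and the <...> credential placeholders are hypothetical and are not taken from the original configuration.

```properties
# Sketch only: agent and source names follow the post above.
# MemChannel and the <...> placeholders are hypothetical.
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel

# Assumed source type: the Apache Flume Twitter source, which emits Avro.
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <consumer key>
TwitterAgent.sources.Twitter.consumerSecret = <consumer secret>
TwitterAgent.sources.Twitter.accessToken = <access token>
TwitterAgent.sources.Twitter.accessTokenSecret = <access token secret>

# Maximum number of tweets to put in a single batch.
TwitterAgent.sources.Twitter.maxBatchSize = 50000
# Maximum time in milliseconds to wait before closing a batch.
TwitterAgent.sources.Twitter.maxBatchDurationMillis = 100000
```

With these values the source closes a batch after 50000 tweets or 100 seconds, whichever comes first. The thread does not say why the larger batches make the files readable in Hive.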

thanks
