How to segregate Twitter JSON files by hashtag and store them in different locations using NiFi?

  • I am able to receive tweets in JSON format from my Twitter account, but
  • I need to filter these files by hashtag (e.g., #fourseasondallas)
  • and store the matching files in a separate directory.

Thanks

Accepted Solution

It looks like the Twitter API puts hashtags in the following part of the tweet JSON:

"entities": { "hashtags":[], "urls":[], "user_mentions":[] }

You could use EvaluateJsonPath to extract the hashtag values into FlowFile attributes, and then use RouteOnAttribute to route the FlowFiles that match your tag to a PutFile processor.
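To make the extract-then-route idea concrete, here is a minimal Python sketch of the same logic outside NiFi. It is only an illustration, not NiFi code: the function names, the TAG_DIRECTORIES mapping, and the directory paths are hypothetical, and it assumes each hashtag entity carries a "text" field, as in the Twitter v1.1 tweet format.

```python
import json
from pathlib import Path
from typing import List, Optional

# Hypothetical mapping from a lowercase hashtag (without '#') to a target directory.
TAG_DIRECTORIES = {
    "fourseasondallas": Path("tweets/fourseasondallas"),
}


def extract_hashtags(tweet_json: str) -> List[str]:
    """Return the lowercase hashtag texts found under entities.hashtags."""
    tweet = json.loads(tweet_json)
    entities = tweet.get("entities") or {}
    hashtags = entities.get("hashtags") or []
    # Assumption: each entry looks like {"text": "SomeTag", "indices": [...]}.
    return [tag["text"].lower() for tag in hashtags if "text" in tag]


def route_tweet(tweet_json: str) -> Optional[Path]:
    """Pick the directory for the first known hashtag, or None if nothing matches."""
    for tag in extract_hashtags(tweet_json):
        if tag in TAG_DIRECTORIES:
            return TAG_DIRECTORIES[tag]
    return None


def store_tweet(tweet_json: str, filename: str) -> Optional[Path]:
    """Write the tweet into its matching directory; skip tweets that do not match."""
    target_dir = route_tweet(tweet_json)
    if target_dir is None:
        return None
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_text(tweet_json, encoding="utf-8")
    return target
```

In the NiFi flow, extract_hashtags roughly corresponds to EvaluateJsonPath (for example with a JsonPath such as $.entities.hashtags[*].text, which is an assumption about your tweet layout and probably needs a JSON return type because it yields an array), route_tweet corresponds to RouteOnAttribute, and store_tweet corresponds to PutFile with one directory per tag.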

This blog shows an example of extracting values from the Twitter JSON and making routing decisions:

https://blogs.apache.org/nifi/entry/indexing_tweets_with_nifi_and
