Created 07-27-2016 11:45 AM
Hi. I am trying to extract all hashtags from a tweet. Right now I only get the first hashtag, using Pull Key Attributes:
$.entities.hashtags[0].text
How can I get all hashtags for a particular tweet?
I tried using $.entities.hashtags[*].text, but it is not working.
Created 07-27-2016 01:33 PM
Hi @Utkarsh Garg, I was able to extract a list of hashtags using the EvaluateJsonPath processor with the JsonPath you posted: $.entities.hashtags[*].text
If you only need the hashtag data downstream, you can set the EvaluateJsonPath processor's Destination to flowfile-content, so the matched hashtags are written to the flow file content. If you want more than just the hashtags, set the Destination to flowfile-attribute and add one property for each value you want to extract from the tweet. You can later pair that with the AttributesToJSON processor, which gathers the attributes you matched in EvaluateJsonPath into a JSON object that can then become the flow file content.
Another approach is the JoltTransformJSON processor, which lets you convert the incoming tweet into another structure. If you simply want to extract the hashtags, you can use that processor's "Shift" transformation with the following specification (which simply defines how you want the output to look):
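To make the difference between the [0] path and the [*] path concrete, here is a minimal Python sketch of what each one selects. It is only an illustration, not how NiFi evaluates JsonPath internally; the sample tweet and the helper name extract_hashtags are made up for this example.

```python
import json

# Hypothetical sample tweet, trimmed to the fields discussed in this thread.
tweet_json = """
{
  "text": "Streaming tweets with #NiFi and #Hadoop",
  "entities": {
    "hashtags": [
      {"text": "NiFi", "indices": [22, 27]},
      {"text": "Hadoop", "indices": [32, 39]}
    ]
  }
}
"""


def extract_hashtags(tweet):
    """Plain-Python equivalent of $.entities.hashtags[*].text (illustrative helper)."""
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return [tag["text"] for tag in hashtags if "text" in tag]


tweet = json.loads(tweet_json)

# $.entities.hashtags[0].text selects a single value: the first hashtag.
first_hashtag = tweet["entities"]["hashtags"][0]["text"]

# $.entities.hashtags[*].text selects every hashtag, so the result is a list.
all_hashtags = extract_hashtags(tweet)

print(first_hashtag)             # NiFi
print(json.dumps(all_hashtags))  # ["NiFi", "Hadoop"]
```

The point to notice is that the wildcard path yields a JSON array rather than a single string, which is why where the result ends up (content or attribute) matters.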
{ "entities": { "hashtags": { "*": { "text": "hashtext.[]" } } } }
The above spec extracts the hashtags and creates a single JSON object with an array of hashtags called hashtext.
I hope this helps! I can also try to post a snapshot of my configuration (the system is preventing me from uploading at the moment).
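As a rough picture of the output shape that spec describes, here is a small Python sketch that mimics the result for a sample tweet. It does not run Jolt; the sample input and the function name shift_hashtags are assumptions for illustration only.

```python
import json


def shift_hashtags(tweet):
    """Mimic the Shift spec above: collect each hashtag's "text" into a "hashtext" array.

    Illustrative only; this is hand-written Python, not the Jolt engine.
    """
    output = {}
    for tag in tweet.get("entities", {}).get("hashtags", []):
        if "text" in tag:
            output.setdefault("hashtext", []).append(tag["text"])
    return output


# Hypothetical input, trimmed to the fields the spec touches.
tweet = {
    "text": "Streaming tweets with #NiFi and #Hadoop",
    "entities": {"hashtags": [{"text": "NiFi"}, {"text": "Hadoop"}]},
}

shifted = shift_hashtags(tweet)
print(json.dumps(shifted))  # {"hashtext": ["NiFi", "Hadoop"]}
```

Everything else in the tweet (the text, user details and so on) is dropped, because the spec only names the hashtag path.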
Created 07-27-2016 01:06 PM
What do you want to do with the hashtags?
If you want to get a new flow file for each hashtag, you can use the SplitJson processor with a JSONPath value of $.entities.hashtags (the array from your question).
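Here is a minimal Python sketch of the splitting idea, assuming the hashtag array sits at entities.hashtags as in the question. It only imitates the "one flow file per array element" behaviour; the sample tweet and the function name split_hashtags are hypothetical.

```python
import json


def split_hashtags(tweet):
    """Return one JSON string per hashtag element, like one flow file per hashtag."""
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return [json.dumps(tag) for tag in hashtags]


# Hypothetical input, trimmed to the fields used here.
tweet = {
    "text": "Streaming tweets with #NiFi and #Hadoop",
    "entities": {"hashtags": [{"text": "NiFi"}, {"text": "Hadoop"}]},
}

fragments = split_hashtags(tweet)
for fragment in fragments:
    print(fragment)
# {"text": "NiFi"}
# {"text": "Hadoop"}
```

Each fragment is a whole hashtag object rather than just the text, so a follow-up step would still be needed to pull out the text field.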