Getting all hashtags in a single array

New Contributor

Hi. I am trying to extract all hashtags from a tweet. At the moment I only get the first hashtag, using this JsonPath in Pull Key Attributes:

$.entities.hashtags[0].text

How can I get all hashtags for a particular tweet?

I tried using $.entities.hashtags[*].text, but it is not working.

1 ACCEPTED SOLUTION

Expert Contributor

Hi @Utkarsh Garg, I was able to extract a list of hashtags using the EvaluateJsonPath processor with the JsonPath you posted: $.entities.hashtags[*].text
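To make the difference between the two paths concrete, here is a minimal plain-Python sketch of what `$.entities.hashtags[0].text` and `$.entities.hashtags[*].text` select. The sample tweet and the function names are made up for illustration, and this is not how NiFi evaluates JsonPath internally.

```python
# Hypothetical sample tweet; only the fields relevant to the path are shown.
tweet = {
    "text": "Trying out #NiFi with #Jolt",
    "entities": {
        "hashtags": [
            {"text": "NiFi", "indices": [11, 16]},
            {"text": "Jolt", "indices": [22, 27]},
        ]
    },
}


def all_hashtag_texts(doc):
    """Plain-Python equivalent of $.entities.hashtags[*].text."""
    tags = doc.get("entities", {}).get("hashtags", [])
    return [tag["text"] for tag in tags if "text" in tag]


def first_hashtag_text(doc):
    """Plain-Python equivalent of $.entities.hashtags[0].text."""
    texts = all_hashtag_texts(doc)
    return texts[0] if texts else None


hashtags = all_hashtag_texts(tweet)
print(first_hashtag_text(tweet))  # NiFi
print(hashtags)                   # ['NiFi', 'Jolt']
```

The wildcard form returns every element of the array, which is why the result is a list rather than a single string.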

If your goal is only to process the hashtag data further, you can configure the EvaluateJsonPath processor to write the matched hashtags into the flow file content (Destination set to flowfile-content). If you want more than just the hashtags, configure the processor to write the results to flow file attributes (Destination set to flowfile-attribute) and create an attribute for each piece of data you want to extract from the tweet. You can later pair that with the AttributesToJSON processor, which collects the attributes you matched in EvaluateJsonPath into a JSON object that can then become the flow file content.
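The attribute route can be pictured with the rough sketch below. The attribute names (`tweet.id`, `tweet.hashtags`) and the sample tweet are hypothetical, and the code only imitates the idea of "extract into attributes, then serialize the attributes"; how a list-valued result appears in the final JSON depends on the processor version and settings, so treat the string form shown here as an assumption.

```python
import json

# Hypothetical sample tweet.
tweet = {
    "id_str": "1234567890",
    "entities": {"hashtags": [{"text": "NiFi"}, {"text": "Jolt"}]},
}

# Step 1 (what EvaluateJsonPath does with attributes as the destination):
# one attribute per extracted value. Flow file attributes are strings,
# so the list of hashtags is kept as its JSON text.
attributes = {
    "tweet.id": tweet["id_str"],
    "tweet.hashtags": json.dumps(
        [tag["text"] for tag in tweet["entities"]["hashtags"]]
    ),
}

# Step 2 (what AttributesToJSON does conceptually): collect the attributes
# into one JSON object that becomes the new content.
flowfile_content = json.dumps(attributes)
print(flowfile_content)
```

If only the hashtags are needed downstream, writing the match straight to the content avoids this extra step.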

Another approach is the JoltTransformJSON processor, which lets you convert the incoming tweet into another structure. If you simply want to extract the hashtags, you can use that processor's "Shift" transformation operation with the following specification (which defines how you want the output to look):

    {
        "entities": {
            "hashtags": {
                "*": {
                    "text": "hashtext.[]"
                }
            }
        }
    }

The above spec extracts the hashtags and creates a single JSON object containing an array of hashtags called hashtext.
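As a rough illustration of the result described above, the following Python sketch mimics the effect of that shift spec on a hypothetical sample tweet. It is a hand-written imitation, not Jolt itself, and the function name `shift_hashtags` is made up for this example.

```python
import json

# Hypothetical sample tweet with extra fields that the spec ignores.
tweet = {
    "id_str": "1234567890",
    "text": "Trying out #NiFi with #Jolt",
    "entities": {
        "hashtags": [
            {"text": "NiFi", "indices": [11, 16]},
            {"text": "Jolt", "indices": [22, 27]},
        ]
    },
}


def shift_hashtags(doc):
    """Imitate the spec: every entities.hashtags[*].text value is
    appended to an output array named "hashtext"; all else is dropped."""
    output = {}
    for tag in doc.get("entities", {}).get("hashtags", []):
        if "text" in tag:
            output.setdefault("hashtext", []).append(tag["text"])
    return output


result = shift_hashtags(tweet)
print(json.dumps(result))  # {"hashtext": ["NiFi", "Jolt"]}
```

Fields that the spec does not mention, such as `id_str` here, do not appear in the output. What the real processor emits for a tweet without hashtags is not covered by this sketch.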

I hope this helps! I can also try to post a snapshot of my configuration (the system is preventing me from uploading at the moment).


2 REPLIES

Master Guru

What do you want to do with the hashtags?

If you want a new flow file for each hashtag, you can use the SplitJson processor with a JsonPath expression that selects the hashtag array, for example $.entities.hashtags for the tweet structure shown in the question.
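The splitting idea can be sketched in Python as follows, assuming the hashtag array sits at `entities.hashtags` as in the question. The sample tweet and the function name `split_hashtags` are hypothetical; each returned string stands in for the content of one resulting flow file.

```python
import json

# Hypothetical sample tweet.
tweet = {
    "entities": {
        "hashtags": [
            {"text": "NiFi", "indices": [11, 16]},
            {"text": "Jolt", "indices": [22, 27]},
        ]
    },
}


def split_hashtags(doc):
    """Return one JSON document per element of entities.hashtags,
    imitating a split of the array into separate flow files."""
    tags = doc.get("entities", {}).get("hashtags", [])
    return [json.dumps(tag) for tag in tags]


splits = split_hashtags(tweet)
for content in splits:
    print(content)
```

Each split still holds the whole hashtag object, so a further step would be needed to reduce it to the `text` value alone.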
