Nifi twitter term
Labels: Apache NiFi
Created on 01-21-2020 11:28 PM, last edited on 01-22-2020 12:00 AM by VidyaSargur
Hello,
I want to ask what the proper way is to define terms in the GetTwitter processor's Terms to Filter On property.
I'm trying to retrieve tweets that contain the word bt by entering just bt in that property; however, I'm also getting tweets where the letters bt appear inside a longer word, such as bt***, ***bt, or ***bt***.
Here is a screenshot of the processor
and this is a sample tweet retrieved by the processor
I only need tweets that contain the standalone two-letter word bt. I tried single quotes ('bt') and double quotes ("bt"), but then I don't get any data.
Thank you
Created ‎01-26-2020 04:06 AM
Though many fields of this kind allow regular expressions, the documentation for this one does not mention it.
I would try a regex, but it probably will not work because the field takes comma-separated input.
From here you would need to get creative.
First of all, you could definitely add a regex-based filter afterwards (in RouteText).
If this does not perform sufficiently, you could perhaps try space-BT-space as the term, but this is obviously a shortcut that will miss some results. (Also, I did not test it.)
In this particular case, BT preceded or followed by a space may capture most cases.
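The two-step idea above (fetch broadly, then filter with a regex) can be tried outside NiFi first. The following is a minimal Python sketch, not NiFi configuration: the function name `keep_tweet` and the sample strings are made up for illustration, and it assumes that a case-insensitive word-boundary pattern is what "the word bt" should mean. A similar pattern could probably be used in a regex-based routing step, but that has not been verified here.

```python
import re

# Whole-word, case-insensitive match for the two-letter term "bt".
# \b is a word boundary, so "bt" inside "subtle" or "btw" does not match.
WHOLE_WORD_BT = re.compile(r"\bbt\b", re.IGNORECASE)


def keep_tweet(text: str) -> bool:
    """Return True if the text contains "bt" as a standalone word (hypothetical helper)."""
    return WHOLE_WORD_BT.search(text) is not None


if __name__ == "__main__":
    # Invented sample texts, only to show which ones pass the filter.
    samples = [
        "BT is down again",
        "calling bt.",
        "btw, see you later",
        "a subtle change",
    ]
    for sample in samples:
        print(sample, "->", keep_tweet(sample))
```

Note that a word boundary also matches next to punctuation, so texts like "#bt" or "bt." are kept by this pattern.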
---
So much for the regular recommendations. If you really want to go all the way, you could probably create your own GetTwitter processor, but I would start with the two-step filter and see if it works for you.
- Dennis Jaheruddin
If this answer helped, please mark it as 'solved' and/or if it is valuable for future readers please apply 'kudos'.
Created ‎01-27-2020 09:29 PM
I tried using a regex, but it didn't work.
Yes, space-bt-space really did work. I'm thinking of covering other cases by adding space-bt and bt-space, in case the word is at the beginning or end of a sentence.
Thank you for your help, really appreciate it!
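One caveat on adding space-bt and bt-space can be checked with a small Python sketch. It assumes the terms behave like plain case-insensitive substring matches, which is an assumption and may not be how the Twitter endpoint treats spaces; the helper names and sample strings are hypothetical. Under that assumption the one-sided variants let prefix and suffix matches back in, whereas a regex applied in a later filtering step can anchor on the start or end of the text instead.

```python
import re


def contains_any(text: str, terms: list[str]) -> bool:
    """Plain case-insensitive substring check (assumed matching behaviour)."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


# "bt" delimited by whitespace or by the start/end of the text.
SPACE_DELIMITED_BT = re.compile(r"(?:^|\s)bt(?:\s|$)", re.IGNORECASE)


def is_space_delimited(text: str) -> bool:
    """Return True if "bt" appears as its own whitespace-delimited token."""
    return SPACE_DELIMITED_BT.search(text) is not None


if __name__ == "__main__":
    padded_both = [" bt "]
    padded_extended = [" bt ", " bt", "bt "]
    # Invented sample texts for comparison.
    for sample in ["about bt", "see you btw", "abt now", "bt is down"]:
        print(
            sample,
            contains_any(sample, padded_both),
            contains_any(sample, padded_extended),
            is_space_delimited(sample),
        )
```

In this sketch " bt" also matches "see you btw" and "bt " also matches "abt now", so the extended term list brings back the unwanted words, while the anchored regex keeps only the standalone token. The anchored regex does not treat punctuation as a delimiter, so "bt." would need a word-boundary pattern instead.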
