Problems configuring FetchElasticSearch processor
Labels: Apache NiFi
Created ‎05-31-2017 04:45 PM
Hello,
I am trying to build a flow that gets a file from Elasticsearch. Since this is my first time using the FetchElasticsearch processor, I am unsure how to configure it. I don't understand what the Document Identifier property means or what value I am supposed to give it. If someone could help with this, I would greatly appreciate it.
Created ‎06-01-2017 03:27 PM
FetchElasticsearch is used to get a single document from an ES cluster. Each document in ES has a document identifier (or "_id") associated with it, and that identifier is what should be supplied to the Document Identifier property.
If you don't know the document identifier for the document(s) you're looking for, then QueryElasticsearchHttp is your best bet. It allows you to use the Query String "mini-language" to search for fields with desired values (see here for more information). You can then parse the results using any number of processors, such as EvaluateJsonPath to get individual fields from the results, SplitJson if there are multiple results, etc.
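To make the difference concrete, here is a rough Python sketch of the two kinds of lookup at the Elasticsearch REST level: one fetches a single document by its "_id", the other searches with a Query String. It only builds the URLs and is not how the NiFi processors are implemented. The host, the index name "tweets", the type "tweet", and the id "AVxyz123" are made-up values for illustration.

```python
from urllib.parse import quote, urlencode


def document_url(base_url, index, doc_type, doc_id):
    """Build the URL that fetches one document by its _id.

    The last path segment is the value you would supply as the
    Document Identifier.
    """
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return base_url.rstrip("/") + "/" + "/".join(parts)


def search_url(base_url, index, query):
    """Build a URI search URL that uses the Query String mini-language."""
    return "{}/{}/_search?{}".format(
        base_url.rstrip("/"), quote(index, safe=""), urlencode({"q": query})
    )


if __name__ == "__main__":
    base = "http://localhost:9200"  # hypothetical cluster address
    # Lookup by identifier: you must already know the _id.
    print(document_url(base, "tweets", "tweet", "AVxyz123"))
    # Lookup by field values: no _id needed.
    print(search_url(base, "tweets", "user:nifi AND lang:en"))
```

If you already have the "_id" (for example from an earlier query), the first form is enough; otherwise you need the second.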
Created ‎06-01-2017 03:43 PM
Thanks Matt, the QueryElasticsearchHttp processor was very useful and solved my problem.
Created ‎08-22-2017 08:50 AM
If I want to get all the documents in the index, what query should I give in the QueryElasticsearchHttp processor?
Created ‎08-22-2017 01:23 PM
Try * as the value for the Query property.
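With * as the query, every document in the index should match, so you get back a list of hits rather than a single document. As a rough illustration outside NiFi, this Python sketch walks a raw Elasticsearch search response and pulls out each hit's "_id" and "_source", similar in spirit to SplitJson followed by EvaluateJsonPath. The sample response is made up for the example, and the function name split_hits is my own.

```python
import json

# Made-up search response in the shape Elasticsearch returns for a query.
SAMPLE_RESPONSE = """
{
  "took": 3,
  "hits": {
    "total": 2,
    "hits": [
      {"_index": "tweets", "_type": "tweet", "_id": "1",
       "_source": {"user": "alice", "text": "hello"}},
      {"_index": "tweets", "_type": "tweet", "_id": "2",
       "_source": {"user": "bob", "text": "hi"}}
    ]
  }
}
"""


def split_hits(response_text):
    """Return one (_id, _source) pair per matching document."""
    body = json.loads(response_text)
    return [(hit["_id"], hit["_source"]) for hit in body["hits"]["hits"]]


if __name__ == "__main__":
    for doc_id, source in split_hits(SAMPLE_RESPONSE):
        # Each pair corresponds to one document in the index.
        print(doc_id, source["user"])
```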
Created ‎08-23-2017 08:21 AM
That worked, Matt. Thanks a lot.
I am facing one more problem here.
My flow is like this:
Twitter --> ElasticSearch --> Kafka
Processor Flows:
1. GetTwitter --> PutElasticsearch5
2. ScrollElasticsearchHttp --> PublishKafka_0_10
ScrollElasticsearchHttp does a fine job of fetching all the records from the Elasticsearch index, but when the index content changes (new tweets are added to the index by GetTwitter --> PutElasticsearch5), ScrollElasticsearchHttp does not send the new content to Kafka.
If I clear the state of the ScrollElasticsearchHttp processor, all the documents from Elasticsearch are sent to Kafka again (the same documents are sent a second time).
Matt, could you please help me sort out this issue?
