Created 05-31-2017 04:45 PM
Hello,
I am trying to build a flow that gets a file from Elasticsearch. Since this is my first time using the FetchElasticsearch processor, I am unsure how to configure it. I don't understand what the Document Identifier property means or what value I am supposed to give it. If someone could help with this, I would greatly appreciate it.
Created 06-01-2017 03:27 PM
FetchElasticsearch is used to get a single document from an ES cluster. Each document in ES has a document identifier (or "_id") associated with it, and that identifier is what should be supplied to the Document Identifier property.
If you don't know the document identifier for the document(s) you're looking for, then QueryElasticsearchHttp is your best bet. It allows you to use the Query String "mini-language" to search for fields with desired values (see here for more information). You can then parse the results using any number of processors, such as EvaluateJsonPath to get individual fields from the results, SplitJson if there are multiple results, etc.
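To make the difference between the two approaches concrete, here is a minimal Python sketch of the Elasticsearch REST requests they roughly correspond to, plus the kind of result splitting that SplitJson and EvaluateJsonPath would do inside NiFi. It is only an illustration, not how the processors are implemented. The helper names, the host http://localhost:9200, the index "tweets" and the type "tweet" are all made up for the example.

```python
from urllib.parse import quote, urlencode


def document_url(base_url, index, doc_type, doc_id):
    """URL for fetching one document by its _id (the Document Identifier)."""
    # Hypothetical helper: percent-encode each path segment separately.
    parts = [quote(part, safe="") for part in (index, doc_type, doc_id)]
    return "/".join([base_url.rstrip("/")] + parts)


def search_url(base_url, index, query):
    """URL for a Query String search, for when the _id is not known."""
    # Hypothetical helper: the query goes in the "q" URL parameter.
    return "{}/{}/_search?{}".format(
        base_url.rstrip("/"), quote(index, safe=""), urlencode({"q": query})
    )


def split_hits(response):
    """Return (_id, _source) pairs from a parsed search response."""
    # Search responses list the matching documents under hits.hits.
    return [(hit["_id"], hit["_source"]) for hit in response["hits"]["hits"]]


if __name__ == "__main__":
    # Assumed values for illustration only.
    print(document_url("http://localhost:9200", "tweets", "tweet", "42"))
    print(search_url("http://localhost:9200", "tweets", "user:alice"))
```

In NiFi terms, the last path segment of the first URL is what goes into the Document Identifier property of FetchElasticsearch, and the value of the q parameter in the second URL is what goes into the Query property of QueryElasticsearchHttp.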
Created 06-01-2017 03:43 PM
Thanks Matt, the QueryElasticsearchHttp processor was very useful and solved my problem.
Created 08-22-2017 08:50 AM
If I want to get all the documents in the index, what query parameter should I give the QueryElasticsearchHttp processor?
Created 08-22-2017 01:23 PM
Try * as the value for the Query property.
Created 08-23-2017 08:21 AM
That worked, Matt. Thanks a lot.
I am facing one more problem here.
My flow is like this:
Twitter --> ElasticSearch --> Kafka
Processor Flows:
1. GetTwitter --> PutElasticsearch5
2. ScrollElasticsearchHttp --> PublishKafka_0_10
ScrollElasticsearchHttp does a fine job of fetching all the records from the Elasticsearch index, but when the index content changes (new tweets are added to the index by GetTwitter --> PutElasticsearch5), ScrollElasticsearchHttp does not send the new content to Kafka.
If I clear the state of the ScrollElasticsearchHttp processor, all the documents from Elasticsearch are sent to Kafka again (the same documents are sent a second time).
Matt, could you please help me sort out this issue?