Problems configuring FetchElasticsearch processor

Expert Contributor

Hello,

I am trying to build a flow that fetches a document from Elasticsearch. Since this is my first time using the FetchElasticsearch processor, I am unsure how to configure it. I don't understand what the Document Identifier property means or what value I am supposed to give it. If someone could help with this, I would greatly appreciate it.

1 ACCEPTED SOLUTION

Master Guru

FetchElasticsearch is used to get a single document from an ES cluster. Each document in ES has a document identifier (or "_id") associated with it, and that identifier is what should be supplied to the Document Identifier property.
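
To make the Document Identifier concrete: in the Elasticsearch REST API, the "_id" is the last path segment of a get-by-ID request. Below is a minimal Python sketch, assuming a cluster at http://localhost:9200 and made-up index, type, and ID values ("tweets", "_doc", "42"). The helper name document_url is purely illustrative and is not part of NiFi or Elasticsearch.

```python
from urllib.parse import quote

def document_url(base_url, index, doc_type, doc_id):
    """Build the REST URL that retrieves one document by its _id."""
    # Percent-encode every path segment so IDs containing '/' or spaces stay intact.
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return base_url.rstrip("/") + "/" + "/".join(parts)

# Hypothetical values; replace with your own cluster, index, type, and _id.
url = document_url("http://localhost:9200", "tweets", "_doc", "42")
print(url)
```

The three arguments after the base URL line up with the Index, Type, and Document Identifier properties of the processor, so the value to supply is whatever would appear in the last position.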

If you don't know the document identifier for the document(s) you're looking for, then QueryElasticsearchHttp is your best bet. It allows you to use the Query String "mini-language" to search for fields with desired values (see here for more information). You can then parse the results using any number of processors, such as EvaluateJsonPath to get individual fields from the results, SplitJson if there are multiple results, etc.
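
As a rough illustration of the query-then-parse idea, here is a Python sketch that builds a query-string search URL and pulls the individual hits out of a response. The index name "tweets", the field "lang", and the helper names are assumptions made up for this example, and the sample response is hand-written to mirror the general shape of an Elasticsearch search response rather than captured from a real cluster.

```python
import json
from urllib.parse import urlencode

def search_url(base_url, index, query):
    """Build a URI search request that uses the query string mini-language."""
    return f"{base_url.rstrip('/')}/{index}/_search?{urlencode({'q': query})}"

def split_hits(response_text):
    """Return one (_id, _source) pair per hit, similar to splitting on $.hits.hits."""
    body = json.loads(response_text)
    return [(hit["_id"], hit["_source"]) for hit in body["hits"]["hits"]]

# Hypothetical index and field names.
print(search_url("http://localhost:9200", "tweets", "lang:en"))

# Hand-written sample shaped like a search response (not real data).
sample = json.dumps({
    "hits": {"hits": [
        {"_id": "1", "_source": {"lang": "en", "text": "hello"}},
        {"_id": "2", "_source": {"lang": "en", "text": "world"}},
    ]}
})
for doc_id, source in split_hits(sample):
    print(doc_id, source["text"])
```

In a NiFi flow the same two steps would be handled by the processors named above; the sketch only shows what the query and the per-document fields look like.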

5 REPLIES

Expert Contributor

Thanks Matt, the QueryElasticsearchHttp processor was very useful and solved my problem.

@Matt Burgess

If I want to get all the documents in the index, what value should I give for the query in the QueryElasticsearchHttp processor?

Master Guru

Try * as the value for the Query property.
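
For reference, this is roughly what a wildcard query looks like as an Elasticsearch URI search. It is only a sketch: the cluster address, the index name "tweets", the page size, and the helper name match_all_url are assumptions for illustration.

```python
from urllib.parse import urlencode

def match_all_url(base_url, index, size=10):
    """Build a URI search whose query string is the bare wildcard '*'."""
    # urlencode percent-encodes '*' as %2A, which Elasticsearch decodes back to '*'.
    params = urlencode({"q": "*", "size": size})
    return f"{base_url.rstrip('/')}/{index}/_search?{params}"

# Hypothetical cluster address and index name.
all_docs = match_all_url("http://localhost:9200", "tweets")
print(all_docs)
```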

@Matt Burgess

That worked, Matt. Thanks a lot.

I am facing one more problem here.

My flow is like this:

Twitter --> ElasticSearch --> Kafka

Processor Flows:

1. GetTwitter --> PutElasticsearch5

2. ScrollElasticsearchHttp --> PublishKafka_0_10

ScrollElasticsearchHttp does a fine job of fetching all the records from the Elasticsearch index, but when the index content changes (new tweets are added to the index by GetTwitter --> PutElasticsearch5), ScrollElasticsearchHttp does not send the updated content to Kafka.

If I clear the state of the ScrollElasticsearchHttp processor, all the documents from Elasticsearch are sent to Kafka again (the same documents are resent).

Matt, could you please help me sort out this issue?