Support Questions

Adda_Fuentes2 · ‎05-31-2017

Hello,

I am trying to build a flow that gets a file from Elasticsearch. Since this is my first time using the FetchElasticsearch processor, I am unsure how to configure it. I don't understand what the Document Identifier property means or what value I am supposed to give it. If someone could help with this, I would greatly appreciate it.

mburgess · ‎06-01-2017

FetchElasticsearch is used to get a single document from an ES cluster. Each document in ES has a document identifier (or "_id") associated with it, and that identifier is what should be supplied to the Document Identifier property.
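To make the identifier concrete, here is a minimal Python sketch of how a single document is addressed at the Elasticsearch REST level and where "_id" appears in the response. The index name "tweets", the type "tweet", the id "AVxyz123", and the sample response body are made-up values for illustration, and document_path is a hypothetical helper, not part of NiFi or Elasticsearch.

```python
import json
from urllib.parse import quote


def document_path(index, doc_type, doc_id):
    """Build the REST path that addresses one document: /index/type/id."""
    # Percent-encode each part so ids containing '/' or spaces stay intact.
    return "/" + "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))


# Shape of a typical single-document GET response (all values are made up).
sample_response = json.loads("""
{
  "_index": "tweets",
  "_type": "tweet",
  "_id": "AVxyz123",
  "found": true,
  "_source": {"user": "example", "text": "hello"}
}
""")

# The "_id" field is the value the Document Identifier property expects.
path = document_path(
    sample_response["_index"],
    sample_response["_type"],
    sample_response["_id"],
)
print(path)
```

In a flow, the Document Identifier would usually be that "_id" value, for example taken from a flow file attribute.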

If you don't know the document identifier for the document(s) you're looking for, then QueryElasticsearchHttp is your best bet. It allows you to use the Query String "mini-language" to search for fields with desired values (see the Elasticsearch query string syntax documentation for more information). You can then parse the results using any number of processors, such as EvaluateJsonPath to get individual fields from the results, SplitJson if there are multiple results, etc.
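As a rough illustration of the query-string approach, the Python sketch below builds a URI search path and then pulls fields out of a raw Elasticsearch search response, mirroring what SplitJson (on $.hits.hits) and EvaluateJsonPath (on $._source.user) would do. The index name, the field names, and the sample response are assumptions for illustration; search_path and split_hits are hypothetical helpers, and the sketch does not show how the processor itself works internally.

```python
import json
from urllib.parse import urlencode


def search_path(index, query):
    """Build a URI search path; `query` uses the Query String mini-language."""
    return f"/{index}/_search?" + urlencode({"q": query})


def split_hits(response):
    """Return the individual hits, similar to SplitJson on $.hits.hits."""
    return response.get("hits", {}).get("hits", [])


# A made-up search response containing two matching documents.
sample = json.loads("""
{
  "hits": {
    "total": 2,
    "hits": [
      {"_id": "1", "_source": {"user": "alice", "text": "first"}},
      {"_id": "2", "_source": {"user": "bob", "text": "second"}}
    ]
  }
}
""")

# Search for documents whose (hypothetical) "user" field equals "alice".
url = search_path("tweets", "user:alice")

# Extract one field per hit, similar to EvaluateJsonPath on $._source.user.
users = [hit["_source"]["user"] for hit in split_hits(sample)]
ids = [hit["_id"] for hit in split_hits(sample)]
print(url, users, ids)
```

Each hit carries its own "_id", so a value found this way could also be fed to FetchElasticsearch as the Document Identifier.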

Adda_Fuentes2 · ‎06-01-2017

Thanks Matt, the QueryElasticsearchHttp processor was very useful and helped me with the problem.

naveen_mittemar · ‎08-22-2017

@Matt Burgess

If I want to get all the documents in the index, what query parameter should I give in the QueryElasticsearchHttp processor?

mburgess · ‎08-22-2017

Try * as the value for the Query property.

naveen_mittemar · ‎08-23-2017

@Matt Burgess

That worked, Matt. Thanks a lot.

I am facing one more problem here.

My flow is like this:

Twitter --> ElasticSearch --> Kafka

Processor Flows:

1. GetTwitter --> PutElasticsearch5

2. ScrollElasticsearchHttp --> PublishKafka_0_10

ScrollElasticsearchHttp does a fine job of fetching all the records from the Elasticsearch index, but when the index content changes (new tweets added to the index by GetTwitter --> PutElasticsearch5), ScrollElasticsearchHttp does not send the updated content to Kafka.

If I clear the state of the ScrollElasticsearchHttp processor, all the documents from Elasticsearch are sent to Kafka again (the same documents are sent again).

Matt, could you please help me sort out this issue?

Problems configuring FetchElasticSearch processor