How to find and remove null values before loading an Elasticsearch index using NiFi

Contributor

Hi

I have an Elasticsearch index in the format below. Fields are populated with null values when no data is supplied for them.

Is there any way, using NiFi, to remove the fields that contain null values before the data is loaded into the Elasticsearch index?

Details

Sample document when all fields are populated:

{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": "A distibuted real-time search and analytics engine", "publish_date": "2015-02-07", "num_reviews": 20, "publisher": "manning" }

Sample document when not all fields are populated:

{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": null, "publish_date": null, "num_reviews": 20, "publisher":null }

Avro schema registry

{ "namespace": "ingestion", "type": "record", "name": "quest", "fields": [ { "name": "_type", "type": "string" }, { "name": "_id", "type": "string" }, { "name": "_score", "type": ["string", "null"] }, { "name": "title", "type": ["string", "null"] }, { "name": "summary", "type": ["string", "null"] }, { "name": "publish_date", "type": ["null", "string"] }, { "name": "num_reviews", "type": ["string", "null"] }, { "name": "publisher", "type": ["string", "null"] } ] }

I am using the PutElasticsearchHttpRecord processor to load data into Elasticsearch. In NiFi 1.6 a "Suppress Null Values" feature is available, but I am using NiFi 1.5. Is there any way to resolve the issue in this version?

I tried changing the schema in the Avro schema registry, but the fields are sometimes populated with null values and sometimes not.

Please help me resolve this issue.

4 Replies

Contributor

You should keep them as "null" inside the Elasticsearch index. However, if you really need to remove them so that the key does not exist at all, then you need to create a custom processor.

I created a NAR called "deleteEmptyAttributes" that strips them out before the document is put into the Elasticsearch index. But usually you want to have the same fields consistently across all documents in an index.
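For illustration only, the null-stripping step that such a processor would perform can be sketched in plain Python. This is not the "deleteEmptyAttributes" NAR mentioned above, and it does not use any NiFi API; the function name `strip_nulls` is hypothetical and the input is the sample document from the question.

```python
import json

def strip_nulls(value):
    """Recursively drop dict keys whose value is None.

    Lists are walked so that nested dicts are cleaned too,
    but null items inside a list are left in place.
    """
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]
    return value

# Sample document from the question, with three null fields.
doc = json.loads(
    '{"title": "Elasticsearch: The Definitive Guide", "summary": null, '
    '"publish_date": null, "num_reviews": 20, "publisher": null}'
)

cleaned = strip_nulls(doc)
print(json.dumps(cleaned))
# prints {"title": "Elasticsearch: The Definitive Guide", "num_reviews": 20}
```

The cleaned document no longer has the "summary", "publish_date" and "publisher" keys at all, which is the behaviour asked for in the question.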

Contributor

@BX Thanks for your response. In my case, field consistency is not required in the Elasticsearch index. Could you please share your "deleteEmptyAttributes" NAR?

Master Guru

Your schema says that null values are allowed. If you don't want to allow nulls for particular fields, try a ValidateRecord processor using a schema that does not allow null values for the desired fields. I can't remember whether the "non-null" schema would be set on the Reader or Writer for ValidateRecord, but I believe it is the Reader. In that case, use the current schema (that allows nulls) for the Writer so the valid and invalid records can be output from the processor. Then you can send the "valid" relationship to the Elasticsearch processor, and handle the flowfiles/records on the "invalid" relationship however you choose.
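As a rough sketch of that idea (not NiFi's actual validation logic), the Python below derives a "non-null" variant of the schema for chosen fields and then splits records into valid and invalid groups based on null checks only. The helper names `make_strict_schema` and `split_records` are hypothetical, and the schema is trimmed to three fields from the question for brevity.

```python
import copy

# Trimmed copy of the schema from the question (assumption: three fields only).
nullable_schema = {
    "namespace": "ingestion",
    "type": "record",
    "name": "quest",
    "fields": [
        {"name": "title", "type": ["string", "null"]},
        {"name": "summary", "type": ["string", "null"]},
        {"name": "publish_date", "type": ["null", "string"]},
    ],
}

def make_strict_schema(schema, required_fields):
    """Return a copy of the schema where the named fields no longer accept null."""
    strict = copy.deepcopy(schema)
    for field in strict["fields"]:
        if field["name"] in required_fields and isinstance(field["type"], list):
            non_null = [t for t in field["type"] if t != "null"]
            # A union with one remaining branch collapses to that plain type.
            field["type"] = non_null[0] if len(non_null) == 1 else non_null
    return strict

def allows_null(field):
    """True if the field's type is "null" or a union containing "null"."""
    t = field["type"]
    return t == "null" or (isinstance(t, list) and "null" in t)

def split_records(records, schema):
    """Partition records into (valid, invalid) using the schema's null rules only."""
    required = [f["name"] for f in schema["fields"] if not allows_null(f)]
    valid, invalid = [], []
    for record in records:
        ok = all(record.get(name) is not None for name in required)
        (valid if ok else invalid).append(record)
    return valid, invalid

strict_schema = make_strict_schema(nullable_schema, {"summary", "publish_date"})
records = [
    {"title": "Elasticsearch: The Definitive Guide",
     "summary": "A distributed real-time search and analytics engine",
     "publish_date": "2015-02-07"},
    {"title": "Elasticsearch: The Definitive Guide",
     "summary": None, "publish_date": None},
]
valid, invalid = split_records(records, strict_schema)
print(len(valid), len(invalid))  # prints 1 1
```

Note that this approach routes whole records that contain nulls to "invalid"; it does not remove the null fields from them, so the invalid records still need separate handling.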

Contributor

@matt Thanks for your response. I have already tried the ValidateRecord processor and set the "suppress null records" option to "Always Suppress" in the record writer, but it did not work as expected. Any other thoughts would be appreciated.