Created 04-26-2018 04:51 PM
Hi
I have an Elasticsearch index in the format below. Fields are populated with null values when no data is supplied for them.
Is there any way to use NiFi to remove the fields that contain null values before loading the data into the Elasticsearch index?
Details
Sample index
if all data is populated
{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": "A distibuted real-time search and analytics engine", "publish_date": "2015-02-07", "num_reviews": 20, "publisher": "manning" }
Sample index if not all data is populated
{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": null, "publish_date": null, "num_reviews": 20, "publisher":null }
Avro schema (from the schema registry)
{ "namespace": "ingestion", "type": "record", "name": "quest", "fields": [ { "name": "_type", "type": "string" }, { "name": "_id", "type": "string" }, { "name": "_score", "type": ["string", "null"] }, { "name": "title", "type": ["string", "null"] }, { "name": "summary", "type": ["string", "null"] }, { "name": "publish_date", "type": ["null", "string"] }, { "name": "num_reviews", "type": ["string", "null"] }, { "name": "publisher", "type": ["string", "null"] } ] }
I am using the PutElasticsearchHttpRecord processor to load data into Elasticsearch. NiFi 1.6 has a "Suppress Null Values" feature, but I am using NiFi 1.5. Is there any way to resolve this issue on 1.5?
I have tried changing the Avro schema in the schema registry, but the fields are sometimes populated with null values and sometimes not.
Please help me resolve the issue.
Created 04-26-2018 07:08 PM
You should keep them as "null" inside the Elasticsearch index. However, if you believe you really need to remove them so that no such key exists, then you need to create a custom processor.
I created a NAR called "deleteEmptyAttributes"; it basically removes the null fields before the data is put into the Elasticsearch index. But usually you want every document in an index to have the same fields consistently.
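To give a rough idea of what removing the null fields amounts to, here is a minimal Python sketch of the logic. This is not the code of the NAR mentioned above and not a NiFi API; the function name drop_null_fields is hypothetical, and the sample document is the one from the question with the index metadata left out. Something along these lines could presumably be adapted to a scripting step, but that part is an assumption and not shown here.

```python
import json


def drop_null_fields(record):
    """Return a copy of the record without keys whose value is None (JSON null).

    Hypothetical helper for illustration only; it handles top-level fields,
    which is all the sample documents in this thread contain.
    """
    return {key: value for key, value in record.items() if value is not None}


# Sample document from the question, with some fields not populated.
raw = (
    '{"title": "Elasticsearch: The Definitive Guide", "summary": null, '
    '"publish_date": null, "num_reviews": 20, "publisher": null}'
)

cleaned = drop_null_fields(json.loads(raw))
print(json.dumps(cleaned))
# {"title": "Elasticsearch: The Definitive Guide", "num_reviews": 20}
```

Note that after this step different documents can end up with different sets of keys, which is the consistency trade-off mentioned above.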
Created 04-26-2018 07:47 PM
@BX thanks for your response. However, consistency is not a requirement for my Elasticsearch index. Could you please provide your "deleteEmptyAttributes" NAR?
Created 04-26-2018 08:02 PM
Your schema says that null values are allowed. If you don't want to allow nulls for particular fields, try a ValidateRecord processor using a schema that does not allow null values for the desired fields. I can't remember whether the "non-null" schema would be set on the Reader or Writer for ValidateRecord, but I believe it is the Reader. In that case, use the current schema (that allows nulls) for the Writer so the valid and invalid records can be output from the processor. Then you can send the "valid" relationship to the Elasticsearch processor, and handle the flowfiles/records on the "invalid" relationship however you choose.
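To picture the valid/invalid routing described above, here is a small Python sketch. It is not NiFi code and does not reflect how ValidateRecord is implemented; the function name partition_by_required_fields and the second record's "_id" of "2" are made up for illustration. It only shows the idea that a record with a null in a required field goes to the "invalid" side as a whole, so the null fields are not stripped from it.

```python
def partition_by_required_fields(records, required_fields):
    """Split records into (valid, invalid) lists.

    A record counts as valid only if every required field is present
    and not None. Hypothetical helper, for illustration only.
    """
    valid, invalid = [], []
    for record in records:
        if all(record.get(field) is not None for field in required_fields):
            valid.append(record)
        else:
            invalid.append(record)
    return valid, invalid


records = [
    {"_id": "1", "title": "Elasticsearch: The Definitive Guide",
     "summary": "A distibuted real-time search and analytics engine",
     "publisher": "manning"},
    {"_id": "2", "title": "Elasticsearch: The Definitive Guide",
     "summary": None, "publisher": None},
]

valid, invalid = partition_by_required_fields(records, ["summary", "publisher"])
print([r["_id"] for r in valid])    # ['1']
print([r["_id"] for r in invalid])  # ['2']
```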
Created 04-26-2018 08:13 PM
@matt Thanks for your response. I have already tried the ValidateRecord processor and set the "suppress null records" option to "Always Suppress" in the record writer, but it did not work as expected. Any other thoughts would be appreciated.