How to search and remove null values in elastic search index using nifi
Labels: Apache NiFi, Apache Phoenix
Created ‎04-26-2018 04:51 PM
Hi,
I have an Elasticsearch index in the format below. Fields are populated with null values when no data is supplied for them.
Is there a way, using NiFi, to remove the fields that contain null values before loading the data into the Elasticsearch index?
Details
Sample index document if all data is populated:
{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": "A distibuted real-time search and analytics engine", "publish_date": "2015-02-07", "num_reviews": 20, "publisher": "manning" }
Sample index document if not all data is populated:
{ "_index": "bookdb_index", "_type": "book", "_id": "1", "_score": 0.28168046, "title": "Elasticsearch: The Definitive Guide", "summary": null, "publish_date": null, "num_reviews": 20, "publisher":null }
Avro schema (from the schema registry):
{ "namespace": "ingestion", "type": "record", "name": "quest", "fields": [ { "name": "_type", "type": "string" }, { "name": "_id", "type": "string" }, { "name": "_score", "type": ["string", "null"] }, { "name": "title", "type": ["string", "null"] }, { "name": "summary", "type": ["string", "null"] }, { "name": "publish_date", "type": ["null", "string"] }, { "name": "num_reviews", "type": ["string", "null"] }, { "name": "publisher", "type": ["string", "null"] } ] }
I am using the PutElasticsearchHttpRecord processor to load data into Elasticsearch. NiFi 1.6 has a "Suppress Null Values" feature, but I am using NiFi 1.5. Is there any way to resolve this issue in 1.5?
I tried changing the Avro schema in the schema registry, but the fields are sometimes populated with null values and sometimes not.
Please help me resolve this issue.
Created ‎04-26-2018 07:08 PM
You should keep them as "null" inside the Elasticsearch index. However, if you really need to remove them so that no such key exists, you will need to create a custom processor.
I created a NAR called "deleteEmptyAttributes" that removes them before the data is put into the Elasticsearch index. But usually you want all documents in an index to have the same fields consistently.
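To illustrate what "removing the key" means for the sample document above, here is a minimal Python sketch. It is not the "deleteEmptyAttributes" NAR mentioned in this reply (that code is not shown in the thread), and the function name drop_null_fields is hypothetical. It only handles flat records like the sample, not nested objects.

```python
import json

def drop_null_fields(record):
    """Return a copy of a flat JSON record without keys whose value is null (None)."""
    # Only None is dropped; empty strings and zeros are kept as real values.
    return {key: value for key, value in record.items() if value is not None}

# Sample document from the question, reduced to its data fields.
doc = json.loads(
    '{"title": "Elasticsearch: The Definitive Guide", "summary": null, '
    '"publish_date": null, "num_reviews": 20, "publisher": null}'
)

cleaned = drop_null_fields(doc)
print(json.dumps(cleaned))
# prints {"title": "Elasticsearch: The Definitive Guide", "num_reviews": 20}
```

Something along these lines could in principle run in a scripting processor ahead of the Elasticsearch processor, but how to wire it into a NiFi 1.5 flow is an assumption and not something confirmed in this thread.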
Created ‎04-26-2018 07:47 PM
@BX Thanks for your response. However, consistency is not a requirement for my Elasticsearch index. Could you please provide your "deleteEmptyAttributes" NAR?
Created ‎04-26-2018 08:02 PM
Your schema says that null values are allowed. If you don't want to allow nulls for particular fields, try a ValidateRecord processor using a schema that does not allow null values for the desired fields. I can't remember whether the "non-null" schema would be set on the Reader or Writer for ValidateRecord, but I believe it is the Reader. In that case, use the current schema (that allows nulls) for the Writer so the valid and invalid records can be output from the processor. Then you can send the "valid" relationship to the Elasticsearch processor, and handle the flowfiles/records on the "invalid" relationship however you choose.
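The routing idea described above can be sketched in plain Python. This is only a simulation of the valid/invalid split, not NiFi code: the function name split_by_required_fields, the choice of required fields, and the second record (with "_id": "2") are hypothetical. Note that this approach routes whole records; it does not strip null fields from a record.

```python
def split_by_required_fields(records, required_fields):
    """Partition records into (valid, invalid) based on non-null required fields."""
    valid, invalid = [], []
    for record in records:
        # A record is valid only if every required field is present and not null.
        if all(record.get(name) is not None for name in required_fields):
            valid.append(record)
        else:
            invalid.append(record)
    return valid, invalid

records = [
    {"_id": "1", "title": "Elasticsearch: The Definitive Guide", "publisher": "manning"},
    {"_id": "2", "title": "Elasticsearch: The Definitive Guide", "publisher": None},
]

valid, invalid = split_by_required_fields(records, ["title", "publisher"])
print([r["_id"] for r in valid])    # ids routed to "valid"
print([r["_id"] for r in invalid])  # ids routed to "invalid"
```

In the flow, the "valid" list corresponds to what would go on to the Elasticsearch processor, and the "invalid" list to whatever handling is chosen for rejected records.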
Created ‎04-26-2018 08:13 PM
@matt Thanks for your response. I have already tried the ValidateRecord processor and set the option "suppress null records" to Always Suppress in the record writer, but it did not work as expected. Any other thoughts would be appreciated.
