How to avoid CharConversionException in HttpSolrServer breaking an hbase-indexer?

New Contributor

Hi,

We use an hbase-indexer for near-real-time (NRT) indexing of an HBase table. From time to time, cell values cause a CharConversionException when the document is sent to Solr.

As we cannot guarantee 100% error-free data, I would like to catch this exception in the mapper for further investigation and drop the value. I think a suitable place would be com.ngdata.hbaseindexer.indexer.Indexer.indexRowData().

Is there a configuration option to either make a character conversion issue non-fatal or replace the indexer with a custom class? I tried replacing the mapper, but to no avail: as soon as the configuration is active, nothing is indexed anymore, and no error messages are logged.

How would you fix such an issue? I thought about patching hbase-indexer-engine and suggesting a code improvement, but maybe there is an easier way?

We use CDH 5.0.1. This is the stack trace:

14/05/26 17:13:11 ERROR impl.SepEventExecutor: Error while processing event
java.lang.RuntimeException: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: [was class java.io.CharConversionException] Invalid UTF-8 character 0xfffe at char #2290, byte #2047)
at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:87)
at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: [was class java.io.CharConversionException] Invalid UTF-8 character 0xfffe at char #2290, byte #2047)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:519)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:207)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:202)
at org.apache.solr.client.solrj.impl.LBHttpSolrServer.doRequest(LBHttpSolrServer.java:312)
at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:273)
at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:310)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.retryAddsIndividually(DirectSolrInputDocumentWriter.java:123)
at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.add(DirectSolrInputDocumentWriter.java:108)
at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:140)
at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:84)
... 6 more
14/05/26 17:13:11 WARN impl.SepConsumer: Error processing a batch of SEP events, the error will be forwarded to HBase for retry

Thanks in advance,

Rolf

1 ACCEPTED SOLUTION

Super Collaborator
To keep Solr and its XML parser happy, consider removing invalid characters from the input strings. You could plug some sanitizing logic into a custom morphline command, along similar lines to these:

https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-solr-cell/src/main/java...

Wolfgang.


3 REPLIES

New Contributor

I put that into a new command as suggested and it works like a charm.

Thank you.

New Contributor

How did you do it? Could you describe it in detail? Thanks a lot. ^_^
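
The command that was eventually written is not shown in this thread. As a rough sketch of the filtering step such a custom morphline command might perform, the Java helper below drops every code point that falls outside the XML 1.0 Char range, which covers the 0xfffe reported in the stack trace above. The class name XmlCharSanitizer and its methods are hypothetical and not part of hbase-indexer, Kite, or Solr; wiring the helper into a morphline command is deliberately left out.

```java
// Hypothetical helper: names are illustrative and not part of any library.
final class XmlCharSanitizer {

    private XmlCharSanitizer() {
        // Static utility class, not meant to be instantiated.
    }

    /**
     * Returns true if the code point is allowed by the XML 1.0 "Char"
     * production: tab, line feed, carriage return, 0x20-0xD7FF,
     * 0xE000-0xFFFD and 0x10000-0x10FFFF.
     */
    static boolean isValidXmlChar(int codePoint) {
        return codePoint == 0x9
                || codePoint == 0xA
                || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    /**
     * Returns a copy of the input without code points that XML 1.0 forbids.
     * Unpaired surrogates, U+FFFE and U+FFFF are dropped as well, because
     * they fall outside the ranges checked above. A null input yields null.
     */
    static String strip(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Read whole code points so valid surrogate pairs stay intact.
            int codePoint = input.codePointAt(i);
            if (isValidXmlChar(codePoint)) {
                out.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

Calling XmlCharSanitizer.strip(value) on each string field before the document is handed to Solr would be one way to apply it. Whether to drop the offending characters silently or also log the affected row for later investigation is a design choice this sketch leaves open.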