<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Xquery morphline command throws an exception for malformed XML in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22061#M3845</link>
    <description>&lt;P&gt;tryRules will do too, and it is more elegant.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Thu, 27 Nov 2014 12:08:16 GMT</pubDate>
    <dc:creator>akhettar</dc:creator>
    <dc:date>2014-11-27T12:08:16Z</dc:date>
    <item>
      <title>Xquery morphline command throws an exception for non-well-formed XML</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/21910#M3842</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have written a morphline xquery command (see the snippet below) that expects XML as input and indexes a list of fields extracted from it, plus two other fields, &lt;SPAN&gt;&amp;nbsp;"&lt;/SPAN&gt;a:name" and "a:version". Note that the data comes from HBase. &amp;nbsp;When the input is not well-formed XML, the indexer throws an exception, which is the correct behaviour. The problem is that the other fields, "&lt;SPAN&gt;a:name" and "a:version", do not get indexed because the field "p:in" contains XML that is not well formed. Also, subsequent HBase inputs that are all valid XML do not get indexed either; the indexer gets stuck on that exception and never recovers until the bad XML is removed from HBase.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I tried a try/catch expression in the XQuery, only to find that the error happens well before the XQuery is evaluated, &amp;nbsp;while the XML input is being parsed (see the stack trace below).&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Looks like the SaxonCommand class isn't handling the exception well. Any ideas on how to solve this are very welcome. You may be wondering why we are processing XML that is not well formed in the first place. 
The business use case mandates that we audit these bad XMLs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Ayache&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;MORPHLINE CONF&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;morphlines : [&lt;BR /&gt;{&lt;BR /&gt;id : morphline1&lt;BR /&gt;importCommands : ["org.kitesdk.**", "com.ngdata.**"]&lt;/P&gt;&lt;P&gt;commands : [&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;extractHBaseCells {&lt;BR /&gt;mappings : [&lt;BR /&gt;{&lt;BR /&gt;inputColumn : "a:name"&lt;BR /&gt;outputField : "serviceName"&lt;BR /&gt;type : string&lt;BR /&gt;source : value&lt;BR /&gt;}&lt;BR /&gt;{&lt;BR /&gt;inputColumn : "a:version"&lt;BR /&gt;outputField : "serviceVersion"&lt;BR /&gt;type : string&lt;BR /&gt;source : value&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;{&lt;BR /&gt;inputColumn : "p:in"&lt;BR /&gt;outputField : "_attachment_body"&lt;BR /&gt;type : "byte[]"&lt;BR /&gt;source : value&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;{&lt;BR /&gt;if {&lt;BR /&gt;conditions : [&lt;BR /&gt;{ equals { _attachment_body : [] } }&lt;BR /&gt;]&lt;BR /&gt;then : [&lt;/P&gt;&lt;P&gt;{logInfo { format : "no payload..." 
} }&lt;/P&gt;&lt;P&gt;]&lt;BR /&gt;else : [&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;xquery {&lt;BR /&gt;languageVersion : "3.0"&lt;BR /&gt;fragments : [&lt;BR /&gt;{&lt;BR /&gt;fragmentPath : "/"&lt;BR /&gt;queryString : """&lt;/P&gt;&lt;P&gt;(: All namespace declarations go here :)&lt;/P&gt;&lt;P&gt;(: Extracting all the fields that need indexing :)&lt;BR /&gt;try {&lt;BR /&gt;let $source := /message/messageData/LegacyMessage/LegacyHeader/SenderDetails/SendingApplication/text() | /message/messageHeader/sendingApplication/text()&lt;BR /&gt;let $orgId := /message/messageData/LegacyMessage/LegacyHeader/SenderDetails/ODSID/text() | /message/securityData/organisationId/text()&lt;BR /&gt;let $prescriptionId := /message/messageData/*/prescriptionId/text() | /message/messageData/*/UPN/text()&lt;BR /&gt;let $transaId := /message/messageHeader/transactionId/text()&lt;/P&gt;&lt;P&gt;(: Returning the list of the fields that need to be indexed. These fields are defined in the Solr schema.xml file. :)&lt;/P&gt;&lt;P&gt;return&lt;BR /&gt;&amp;lt;fieldsToIndex&amp;gt;&lt;BR /&gt;&amp;lt;source&amp;gt;{$source}&amp;lt;/source&amp;gt;&lt;BR /&gt;&amp;lt;organisationNationalId&amp;gt;{$orgId}&amp;lt;/organisationNationalId&amp;gt;&lt;BR /&gt;&amp;lt;prescriptionId&amp;gt;{$prescriptionId}&amp;lt;/prescriptionId&amp;gt;&lt;BR /&gt;&amp;lt;transactionId&amp;gt;{$transaId}&amp;lt;/transactionId&amp;gt;&lt;BR /&gt;&amp;lt;/fieldsToIndex&amp;gt;&lt;BR /&gt;}&lt;BR /&gt;catch * {&lt;BR /&gt;$err:code, $err:value, " module: ",&lt;BR /&gt;$err:module, "(", $err:line-number, ",", $err:column-number, ")"&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;"""&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;{ logInfo { format : "Finished processing..." 
}&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;]&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;STACK TRACE&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;4/11/21 15:37:16 WARN impl.SepConsumer: Error processing a batch of SEP events, the error will be forwarded to HBase for retry&lt;BR /&gt;java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at java.util.concurrent.FutureTask.report(FutureTask.java:122)&lt;BR /&gt;at java.util.concurrent.FutureTask.get(FutureTask.java:188)&lt;BR /&gt;at com.ngdata.sep.impl.SepConsumer.waitOnSepEventCompletion(SepConsumer.java:294)&lt;BR /&gt;at com.ngdata.sep.impl.SepConsumer.replicateWALEntry(SepConsumer.java:275)&lt;BR /&gt;at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20176)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)&lt;BR /&gt;at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)&lt;BR /&gt;at java.util.concurrent.FutureTask.run(FutureTask.java:262)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:745)&lt;BR /&gt;Caused by: java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: 
org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:102)&lt;BR /&gt;at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)&lt;BR /&gt;... 5 more&lt;BR /&gt;Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at org.kitesdk.morphline.base.FaultTolerance.handleException(FaultTolerance.java:73)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:245)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper.map(MorphlineResultToSolrMapper.java:145)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.Indexer$RowBasedIndexer.calculateIndexUpdates(Indexer.java:289)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:144)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:99)&lt;BR /&gt;... 
6 more&lt;BR /&gt;Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.doProcess(SaxonCommand.java:78)&lt;BR /&gt;at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:96)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.IfThenElseBuilder$IfThenElse.doProcess(IfThenElseBuilder.java:110)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.ConvertTimestampBuilder$ConvertTimestamp.doProcess(ConvertTimestampBuilder.java:161)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.ConvertTimestampBuilder$ConvertTimestamp.doProcess(ConvertTimestampBuilder.java:161)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells.doProcess(ExtractHBaseCellsBuilder.java:86)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at 
org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:239)&lt;BR /&gt;... 10 more&lt;BR /&gt;Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)&lt;BR /&gt;at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3258)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3200)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2832)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)&lt;BR /&gt;at org.kitesdk.morphline.saxon.XMLStreamCopier.copy(XMLStreamCopier.java:169)&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.parseXmlDocument(SaxonCommand.java:99)&lt;BR /&gt;at org.kitesdk.morphline.saxon.XQueryBuilder$XQuery.doProcess2(XQueryBuilder.java:158)&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.doProcess(SaxonCommand.java:74)&lt;BR /&gt;... 
29 more&lt;BR /&gt;14/11/21 15:37:16 ERROR impl.SepConsumer: Encountered exceptions on 2 batches (out of 2 total batches)&lt;BR /&gt;14/11/21 15:37:16 ERROR ipc.RpcServer: Unexpected throwable object&lt;BR /&gt;java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at com.ngdata.sep.impl.SepConsumer.waitOnSepEventCompletion(SepConsumer.java:309)&lt;BR /&gt;at com.ngdata.sep.impl.SepConsumer.replicateWALEntry(SepConsumer.java:275)&lt;BR /&gt;at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:20176)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)&lt;BR /&gt;at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)&lt;BR /&gt;at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)&lt;BR /&gt;at java.util.concurrent.FutureTask.run(FutureTask.java:262)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:745)&lt;BR /&gt;Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at 
java.util.concurrent.FutureTask.report(FutureTask.java:122)&lt;BR /&gt;at java.util.concurrent.FutureTask.get(FutureTask.java:188)&lt;BR /&gt;at com.ngdata.sep.impl.SepConsumer.waitOnSepEventCompletion(SepConsumer.java:294)&lt;BR /&gt;... 10 more&lt;BR /&gt;Caused by: java.lang.RuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:102)&lt;BR /&gt;at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)&lt;BR /&gt;... 5 more&lt;BR /&gt;Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at org.kitesdk.morphline.base.FaultTolerance.handleException(FaultTolerance.java:73)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:245)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper.map(MorphlineResultToSolrMapper.java:145)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.Indexer$RowBasedIndexer.calculateIndexUpdates(Indexer.java:289)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:144)&lt;BR /&gt;at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:99)&lt;BR /&gt;... 
6 more&lt;BR /&gt;Caused by: org.kitesdk.morphline.api.MorphlineRuntimeException: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.doProcess(SaxonCommand.java:78)&lt;BR /&gt;at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:96)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.IfThenElseBuilder$IfThenElse.doProcess(IfThenElseBuilder.java:110)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.ConvertTimestampBuilder$ConvertTimestamp.doProcess(ConvertTimestampBuilder.java:161)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.stdlib.ConvertTimestampBuilder$ConvertTimestamp.doProcess(ConvertTimestampBuilder.java:161)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at org.kitesdk.morphline.base.Connector.process(Connector.java:64)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.ExtractHBaseCellsBuilder$ExtractHBaseCells.doProcess(ExtractHBaseCellsBuilder.java:86)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at 
org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)&lt;BR /&gt;at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)&lt;BR /&gt;at com.ngdata.hbaseindexer.morphline.LocalMorphlineResultToSolrMapper.map(LocalMorphlineResultToSolrMapper.java:239)&lt;BR /&gt;... 10 more&lt;BR /&gt;Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag &amp;lt;/dispenserDetails&amp;gt;; expected &amp;lt;/organisation&amp;gt;.&lt;BR /&gt;at [row,col {unknown-source}]: [87,27]&lt;BR /&gt;at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)&lt;BR /&gt;at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3258)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3200)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2832)&lt;BR /&gt;at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)&lt;BR /&gt;at org.kitesdk.morphline.saxon.XMLStreamCopier.copy(XMLStreamCopier.java:169)&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.parseXmlDocument(SaxonCommand.java:99)&lt;BR /&gt;at org.kitesdk.morphline.saxon.XQueryBuilder$XQuery.doProcess2(XQueryBuilder.java:158)&lt;BR /&gt;at org.kitesdk.morphline.saxon.SaxonCommand.doProcess(SaxonCommand.java:74)&lt;BR /&gt;... 29 more&lt;BR /&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:13:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/21910#M3842</guid>
      <dc:creator>akhettar</dc:creator>
      <dc:date>2022-09-16T09:13:59Z</dc:date>
    </item>
    <item>
      <title>Re: Xquery morphline command throws an exception for malformed XML</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22056#M3843</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After investigating this further, I found that the exception is thrown in the Saxon command code when it attempts to parse the malformed XML. I don't&amp;nbsp;think it is right to throw an exception that results in aborting the indexer service.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A workaround is presented below. It uses the java command to parse the XML and catch the SAXParseException, log the error, and set the '_attachment_body' field to indicate that malformed XML was detected, so that the other fields present in the same record are still indexed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if {&lt;BR /&gt;conditions: [&lt;BR /&gt;{&lt;BR /&gt;java&lt;BR /&gt;{&lt;BR /&gt;imports: "import java.util.*; import org.xml.sax.*; import javax.xml.parsers.SAXParser; import javax.xml.parsers.SAXParserFactory; import java.io.StringReader; import org.xml.sax.helpers.DefaultHandler; import org.xml.sax.InputSource;"&lt;BR /&gt;code: """&lt;/P&gt;&lt;P&gt;List payload = record.get("_attachment_body");&lt;BR /&gt;try {&lt;BR /&gt;// parse the content of record field&lt;BR /&gt;SAXParserFactory factory = SAXParserFactory.newInstance();&lt;BR /&gt;SAXParser saxParser = factory.newSAXParser();&lt;BR /&gt;InputSource is = new InputSource(new StringReader(new String((byte[]) payload.get(0))));&lt;BR /&gt;saxParser.parse(is, new DefaultHandler());&lt;BR /&gt;} catch (Exception e)&lt;BR /&gt;{&lt;BR /&gt;logger.error("Malformed XML detected");&lt;BR /&gt;logger.error("payload: {} for record: {}", payload, record);&lt;BR /&gt;return true;&lt;BR /&gt;}&lt;BR /&gt;return false;&lt;BR /&gt;"""&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;then: [&lt;BR /&gt;{&lt;BR /&gt;translate {&lt;BR /&gt;field : _attachment_body&lt;BR /&gt;dictionary : {&lt;BR /&gt;0 : dummy&lt;BR /&gt;}&lt;BR /&gt;fallback : "Malformed XML detected" # if no fallback is defined and no match is found then the command 
fails&lt;BR /&gt;}&lt;BR /&gt;}&lt;BR /&gt;{logInfo {format: "Ignoring none well formed XML..."}}&lt;BR /&gt;]&lt;BR /&gt;else: [&lt;BR /&gt;{&lt;BR /&gt;xquery {&lt;BR /&gt;fragments: [&lt;BR /&gt;{&lt;BR /&gt;fragmentPath: "/"&lt;/P&gt;&lt;P&gt;queryString: """&lt;/P&gt;&lt;P&gt;(: All namespace declarations go here :)&lt;/P&gt;&lt;P&gt;(: Extracting all the fields that need indexing :)&lt;/P&gt;&lt;P&gt;let $source := /message/messageData/LegacyMessage/LegacyHeader/SenderDetails/SendingApplication/text() | /message/messageHeader/sendingApplication/text()&lt;BR /&gt;let $orgId := /message/messageData/LegacyMessage/LegacyHeader/SenderDetails/ODSID/text() | /message/securityData/organisationId/text()&lt;BR /&gt;let $prescriptionId := /message/messageData/*/prescriptionId/text() | /message/messageData/*/UPN/text()&lt;BR /&gt;let $transaId := /message/messageHeader/transactionId/text()&lt;/P&gt;&lt;P&gt;(: Returning the list of the fields that need to be indexed. These fields are defined in the Solr schema.xml file. :)&lt;/P&gt;&lt;P&gt;return&lt;BR /&gt;&amp;lt;fieldsToIndex&amp;gt;&lt;BR /&gt;&amp;lt;source&amp;gt;{$source}&amp;lt;/source&amp;gt;&lt;BR /&gt;&amp;lt;organisationNationalId&amp;gt;{$orgId}&amp;lt;/organisationNationalId&amp;gt;&lt;BR /&gt;&amp;lt;prescriptionId&amp;gt;{$prescriptionId}&amp;lt;/prescriptionId&amp;gt;&lt;BR /&gt;&amp;lt;transactionId&amp;gt;{$transaId}&amp;lt;/transactionId&amp;gt;&lt;BR /&gt;&amp;lt;/fieldsToIndex&amp;gt;&lt;BR /&gt;"""&lt;BR /&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;}&lt;BR /&gt;]&lt;BR /&gt;}&lt;BR /&gt;}&lt;/P&gt;</description>
      <pubDate>Thu, 27 Nov 2014 09:34:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22056#M3843</guid>
      <dc:creator>akhettar</dc:creator>
      <dc:date>2014-11-27T09:34:04Z</dc:date>
    </item>
    <item>
      <title>Re: Xquery morphline command throws an exception for malformed XML</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22057#M3844</link>
      <description>&lt;P&gt;FYI, the tryRules command with the&amp;nbsp;catchExceptions : true parameter handles this kind of scenario more easily.&amp;nbsp;&lt;A target="_blank" href="http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/tryRules"&gt;http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/tryRules&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Nov 2014 10:06:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22057#M3844</guid>
      <dc:creator>whosch</dc:creator>
      <dc:date>2014-11-27T10:06:11Z</dc:date>
    </item>
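    <!--
    A minimal sketch of how the tryRules suggestion above might be applied to the
    original morphline. It is not the poster's actual configuration: the fallback
    rule, its log message, and the use of setValues to store a marker string in
    _attachment_body are assumptions made for illustration, and the query is
    shortened to a single field. The parameter names (catchExceptions,
    throwExceptionOnFailure, rules, commands) are taken from the Kite morphlines
    reference guide linked above.

```hocon
# Sketch only: wrap the xquery command in tryRules so that a parse failure
# falls through to a second rule instead of aborting the indexer.
{
  tryRules {
    catchExceptions : true            # treat an exception thrown by a rule as a failure of that rule
    throwExceptionOnFailure : false   # do not throw if every rule fails
    rules : [
      # Rule 1: normal path, the payload parses and the fields are extracted.
      {
        commands : [
          {
            xquery {
              fragments : [
                {
                  fragmentPath : "/"
                  queryString : """
                    let $transaId := /message/messageHeader/transactionId/text()
                    return
                      <fieldsToIndex>
                        <transactionId>{$transaId}</transactionId>
                      </fieldsToIndex>
                  """
                }
              ]
            }
          }
        ]
      }
      # Rule 2 (hypothetical fallback): reached when rule 1 fails or throws.
      {
        commands : [
          { logWarn { format : "Malformed XML detected; indexing the remaining fields only" } }
          # Assumption: replace the raw payload with a marker string for auditing.
          { setValues { _attachment_body : "Malformed XML detected" } }
        ]
      }
    ]
  }
}
```

    With catchExceptions set to true, an exception raised while the payload is
    parsed should count as a failure of the first rule, so the second rule runs
    and the record carries on with serviceName and serviceVersion intact. The
    block is meant to take the place of the xquery command in the else branch of
    the original configuration.
    -->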
    <item>
      <title>Re: Xquery morphline command throws an exception for malformed XML</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22061#M3845</link>
      <description>&lt;P&gt;tryRules will do too, and it is more elegant.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 27 Nov 2014 12:08:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/22061#M3845</guid>
      <dc:creator>akhettar</dc:creator>
      <dc:date>2014-11-27T12:08:16Z</dc:date>
    </item>
    <item>
      <title>Re: Xquery morphline command throws an exception for malformed XML</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/89816#M3846</link>
      <description>&lt;P&gt;Can you please share your morphlines.conf? I am stuck in a similar situation.&lt;/P&gt;</description>
      <pubDate>Wed, 01 May 2019 19:27:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Xquery-morphline-command-throws-an-exception-for-none-well/m-p/89816#M3846</guid>
      <dc:creator>abhi9</dc:creator>
      <dc:date>2019-05-01T19:27:50Z</dc:date>
    </item>
  </channel>
</rss>

