Created 12-20-2018 01:30 PM
I need some guidance on parsing weather data in XML. I have tried the Databricks XML package to no avail. Sample file: example.xml. I am using Spark 2.3.
Created 12-20-2018 03:10 PM
XML parsing is easy in NiFi. Your flow should be Weather Data XML -> NiFi -> push to Spark, or just push straight to the final destination. There is no reason to use Spark for something this simple.
https://community.hortonworks.com/articles/101904/part-2-iot-augmenting-gps-data-with-weather.html
https://community.hortonworks.com/articles/25720/parsing-xml-logs-with-nifi-part-1-of-3.html
Created 12-20-2018 05:22 PM
Maybe show this, since that link doesn't render well in a browser:
<?xml version="1.0" encoding="UTF-8"?>
<wmo-bulletin category-subcode="37" header-time="261500" category-code="SA" region="IR" originator="OIII" wmo-header="SAIR37 OIII 261500" leads-receipt-time="2018-11-26T15:04:24Z"><![CDATA[SAIR37 OIII 261500
METAR OIBL 261500Z 00000KT 9999 SCT030 22/20 Q1016=
METAR OIBQ 261500Z 29016KT 9999 SCT020 20/16 Q1019=
METAR OIIK 261500Z 03002KT 9999 FEW040 BKN090 07/01 Q1017=
METAR OIMC 261500Z AUTO 08004KT //// // ////// 06/05 Q1018=
METAR OIMD 261500Z NIL=
METAR OIMQ 261500Z AUTO 24004KT //// // ////// 09/09 Q1015=
METAR OINE 261500Z 00000KT 4000 -RA BR BKN015 OVC080 08/08 Q1019=
METAR OITK 261500Z NIL=
METAR OITM 261500Z 30002KT 9999 FEW037 05/02 Q1019=
]]></wmo-bulletin>
<?xml version="1.0" encoding="UTF-8"?>
<wmo-bulletin afos-header="LSRLOT" afos-category="LSR" category-subcode="53" header-time="261503" category-code="NW" region="US" originator="KLOT" wmo-header="NWUS53 KLOT 261503" afos-designator="LOT" leads-receipt-time="2018-11-26T15:04:19Z"><![CDATA[NWUS53 KLOT 261503
LSRLOT
PRELIMINARY LOCAL STORM REPORT
NATIONAL WEATHER SERVICE CHICAGO IL
903 AM CST MON NOV 26 2018
..TIME... ...EVENT... ...CITY LOCATION... ...LAT.LON...
..DATE... ....MAG.... ..COUNTY LOCATION..ST.. ...SOURCE....
..REMARKS..
0700 AM HEAVY SNOW 1 WSW HARVARD 42.42N 88.63W
11/26/2018 M9.0 INCH MCHENRY IL CO-OP OBSERVER
&&
]]></wmo-bulletin>
Created 12-20-2018 05:22 PM
Two big CDATA blocks, not just one.
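Since the sample repeats the XML declaration before each bulletin, the file as a whole is not one well-formed document, and a strict parser rejects it when it is read in one piece. Here is a minimal sketch in plain Python (standard library only, no Spark or NiFi) that splits the text at each declaration and pulls out the attributes and the CDATA lines. The function names `split_bulletins` and `parse_bulletin` are made up for illustration, the sample is shortened from the one above, and it assumes the string `<?xml` never appears inside a CDATA block.

```python
import re
import xml.etree.ElementTree as ET

def split_bulletins(raw):
    """Return one string per XML document found in the concatenated text."""
    # Each document runs from its own "<?xml" declaration up to the next
    # declaration or the end of the input (assumption: no "<?xml" in CDATA).
    pattern = r"<\?xml\b.*?(?=<\?xml\b|\Z)"
    return [m.strip() for m in re.findall(pattern, raw, flags=re.DOTALL)]

def parse_bulletin(doc):
    """Parse one wmo-bulletin document into its attributes and text lines."""
    root = ET.fromstring(doc.encode("utf-8"))
    # The CDATA section is exposed as the element's ordinary text.
    text = root.text or ""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return {"attrs": dict(root.attrib), "lines": lines}

# Shortened sample based on the two bulletins posted above.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<wmo-bulletin category-code="SA" region="IR" originator="OIII" wmo-header="SAIR37 OIII 261500"><![CDATA[SAIR37 OIII 261500
METAR OIBL 261500Z 00000KT 9999 SCT030 22/20 Q1016=
METAR OIMD 261500Z NIL=
]]></wmo-bulletin>
<?xml version="1.0" encoding="UTF-8"?>
<wmo-bulletin category-code="NW" region="US" originator="KLOT" wmo-header="NWUS53 KLOT 261503"><![CDATA[NWUS53 KLOT 261503
LSRLOT
PRELIMINARY LOCAL STORM REPORT
]]></wmo-bulletin>
"""

bulletins = [parse_bulletin(doc) for doc in split_bulletins(sample)]
for b in bulletins:
    print(b["attrs"]["wmo-header"], len(b["lines"]), "lines")
```

Each bulletin ends up as a dictionary with an `attrs` mapping and a `lines` list, where the first line repeats the WMO header. Decoding the individual METAR fields is a separate step that this sketch does not attempt.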
Created 02-23-2019 10:55 AM
You can try this NiFi Groovy processor, which converts XML files to CSV or Avro:
https://github.com/maxbback/nifi-xml