Member since: 01-17-2016
Posts: 42
Kudos Received: 50
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3627 | 04-21-2016 10:41 PM
 | 1010 | 04-15-2016 03:22 AM
 | 1418 | 04-13-2016 04:03 PM
 | 4503 | 04-12-2016 01:59 PM
04-19-2016
07:47 PM
The flow leading into that, and the AttributesToJSON configuration (screenshots).
04-19-2016
07:33 PM
So the ExecuteSQL processor successfully connects with the same settings and returns Objavro.schemaú{"type":"record","name":"Home_Events","namespace":"any.data","fields":[{"name":"hub_insteon_id","type":["null","string"]},{"name":"device_insteon_id","type":["null","string"]},{"name":"device_group","type":["null","string"]},{"name":"status","type":["null","string"]},{"name":"recieved_at","type":["null","string"]}]} and the ConvertJSONToSQL processor is configured as shown (screenshot).
04-19-2016
07:05 PM
3 Kudos
I am having trouble with my ConvertJSONToSQL processor in NiFi. I am trying to post to this table, Home_Events:

    CREATE TABLE `Home_Events` (
      `hub_insteon_id` varchar(255) DEFAULT NULL,
      `device_insteon_id` varchar(45) DEFAULT NULL,
      `device_group` varchar(45) DEFAULT NULL,
      `status` varchar(45) DEFAULT NULL,
      `recieved_at` varchar(45) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

with this JSON:

    {"hub_insteon_id":"","device_group":"1","device_insteon_id":"3F68A2","recieved_at":"","status":"on"}

and I am getting this error:

    ConvertJSONToSQL[id=d0dd4cc5-f2ab-43ab-8921-b2aafea03cb5] Failed to convert StandardFlowFileRecord[uuid=611848ee-f0e8-40a7-8119-0539d4b531dd,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1461081489294-74, container=default, section=74], offset=8383, length=127],offset=0,name=180088917788802,size=127] to a SQL INSERT statement due to org.apache.nifi.processor.exception.ProcessException: None of the fields in the JSON map to the columns defined by the Home_Automation.Home_Events table; routing to failure

Any ideas how to resolve this?
Labels:
- Apache NiFi
04-15-2016
03:22 AM
2 Kudos
Two answers: for a rolling window, look into "DistributedSetCache", which allows the most recent X events to be looked up. For time chunking, this JIRA question (also asked by you) resolves it: https://issues.apache.org/jira/browse/NIFI-1775
04-14-2016
11:14 PM
https://github.com/spark-jobserver/spark-jobserver#ad-hoc-mode---single-unrelated-jobs-transient-context details jobs being started from the Spark Job Server, if one is present. I don't believe the Hortonworks stack has it by default, but it could still be a good option if this is a requirement.
04-14-2016
10:58 PM
1 Kudo
Can we use Spark's REST API to invoke the job when the flow file hits the InvokeHTTP processor? http://arturmkrtchyan.com/apache-spark-hidden-rest-api
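A rough sketch of what that submission request could look like, per the blog post above, assuming a standalone Spark master with the REST submission server on its default port 6066; the host, jar path, class name, and Spark version here are placeholders:

    curl -X POST http://spark-master:6066/v1/submissions/create \
      --header "Content-Type: application/json" \
      --data '{
        "action": "CreateSubmissionRequest",
        "appResource": "hdfs://sandbox:8020/jars/my-job.jar",
        "mainClass": "com.example.MyJob",
        "appArgs": ["arg1"],
        "clientSparkVersion": "1.6.0",
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "sparkProperties": {
          "spark.app.name": "MyJob",
          "spark.master": "spark://spark-master:7077",
          "spark.jars": "hdfs://sandbox:8020/jars/my-job.jar"
        }
      }'

In NiFi, that same JSON body would be the flow file content, with InvokeHTTP set to POST and pointed at that URL.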
04-13-2016
04:03 PM
We recently redid our site. Check out this link; there is a download for a .gz on this page: http://hortonworks.com/downloads/#dataflow
04-12-2016
01:59 PM
4 Kudos
Hey Sunile, I believe you are looking for the UnpackContent processor, found here: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.UnpackContent/index.html Allowable packaging formats:
- use mime.type attribute
- tar
- zip
- flowfile-stream-v3
- flowfile-stream-v2
- flowfile-tar-v1
04-06-2016
12:16 PM
5 Kudos
Recently I had a client ask how we would go about connecting a Windows share through NiFi to HDFS, or if it was even possible. This is how you build a working proof of concept to demo the capabilities!

You will need two servers or virtual machines: one for Windows, one for Hadoop + NiFi. I personally elected to use these two:
- The Sandbox: http://hortonworks.com/products/hortonworks-sandbox/
- A Windows VM running Win 7: https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/linux/

You then need to install NiFi on the sandbox; I find this repo to be the easiest to follow: https://github.com/abajwa-hw/ambari-nifi-service Be sure the servers can talk to each other directly; I personally used a bridged network connection in VirtualBox and looked up the IPs on my router's control panel.

Next you need to set up a Windows share of some format. This can be combined with Active Directory, but I personally just enabled guest accounts and made an account called Nifi_Test. These instructions were the basis of creating a Windows share: http://emby.media/community/index.php?/topic/703-how-to-make-unc-folder-shares/ Keep in mind network user permissions may get funky, and the example above will enforce read-only permission unless you do additional work.

Now you mount the share into the Hadoop machine using CIFS + Samba. The instructions I followed are here: http://blog.zwiegnet.com/linux-server/mounting-windows-share-on-centos/
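For reference, a minimal sketch of that mount step on CentOS, assuming the Windows VM is at 192.168.1.50, the share is called Nifi_Share, and it gets mounted at /mnt/nifi_share (all placeholders):

    # install the CIFS client tools, create a mount point, and mount the share
    yum install -y cifs-utils
    mkdir -p /mnt/nifi_share
    mount -t cifs //192.168.1.50/Nifi_Share /mnt/nifi_share -o username=Nifi_Test,password=yourpassword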
Finally, we are able to set up NiFi to read the mounted drive and post it to HDFS. The GetFile processor retrieves the files while PutHDFS stores them. To configure HDFS for the incoming data, I ran the following commands on the sandbox: su hdfs ; hadoop fs -mkdir /user/nifi ; hadoop fs -chmod 777 /user/nifi. I elected to keep the source file for troubleshooting purposes, so that every time the processor ran it would just stream the data in.
The GetFile configuration (screenshot)
The PutHDFS configuration for the sandbox (screenshot)
And finally, run it and confirm it lands in HDFS!
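Since the configuration screenshots don't reproduce here, a rough sketch of the key properties involved, assuming the share is mounted at /mnt/nifi_share (the paths are examples, not the exact values from the screenshots):

    GetFile
      Input Directory:  /mnt/nifi_share
      Keep Source File: true

    PutHDFS
      Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
      Directory: /user/nifi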
04-03-2016
04:20 PM
6 Kudos
I have a plan to write a 3-part "intro" series on how to handle your XML files. The subjects will be:
- Basic XML and feature extraction via text management, splitting, and XPath
- Interactive text handling with XQuery and regex in relation to XML
- XML schema validation and transformations

XML data is read into the flowfile contents when the file lands in NiFi. As long as it is a valid XML format, the five dedicated XML processors can be applied to it for management and feature extraction. Commonly a user will want to get this XML data into a database, which requires us to do a feature extraction and convert to a new format such as JSON or Avro.

The simplest of the XML processors is the SplitXml processor. This simply takes the current selection of data and breaks the children off into their own files. The depth of the split in relation to the root is configurable, as shown below. An example of when this may be helpful is when you have a list of events, each of which should be treated separately.
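As a hypothetical illustration, with a split depth of 1 an input like this:

    <events>
      <event type="light"><device>3F68A2</device><status>on</status></event>
      <event type="light"><device>3F68A2</device><status>off</status></event>
    </events>

would be split into two flow files, each containing a single <event> element.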
XPath is a syntax for extracting information from an XML document. It allows you to search for nodes based on hierarchy, name, or even attribute. It has limited regex integration and a framework for moderately complex queries. More complete documentation can be found here: http://www.w3schools.com/xsl/xpath_syntax.asp The processor below shows the EvaluateXPath processor being combined with the XPath language to extract node data and an attribute. It should not be confused with XQuery, which I will cover in my next article.
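To make that concrete, a hypothetical EvaluateXPath configuration against one of the split <event> files above, with Destination set to flowfile-attribute (each user-defined property name becomes a new flow file attribute; the value is plain XPath):

    device: /event/device/text()
    kind:   string(/event/@type)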
When the XPath module executes, something very important happens: the XML attributes are now NiFi attributes. This allows us to apply routing and the other intelligence that is NiFi's signature. One of the transformations I have previously worked on is how to get the XML data into an Avro format for easy ingestion. At this time all of the Avro processors in NiFi play nicely with JSON, so the AttributesToJSON processor can be used as an out-of-the-box intermediary to get the format you need. Note that I have set the destination of the processor to flowfile-content, which will override the existing XML contents with the JSON.
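As a hypothetical example, if the EvaluateXPath step above produced attributes named device and kind, then AttributesToJSON with an Attributes List of device,kind and Destination flowfile-content would leave the flow file containing something like:

    {"device":"3F68A2","kind":"light"}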
With a JSON plus attributes, this is a very easy flow file to work with; it can be easily merged into existing workflows or written out to a file for the Hive SerDe.