Posts: 1971
Kudos Received: 1224
Solutions: 124

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 79 | 04-03-2024 06:39 AM |
|  | 450 | 01-12-2024 08:19 AM |
|  | 285 | 12-07-2023 01:49 PM |
|  | 634 | 08-02-2023 07:30 AM |
|  | 1079 | 03-29-2023 01:22 PM |
04-12-2017 03:27 PM
2017-04-12 15:25:12,604 INFO [NiFi logging handler] org.apache.nifi.StdOut Error occurred during initialization of VM
2017-04-12 15:25:12,604 INFO [NiFi logging handler] org.apache.nifi.StdOut Initial heap size set to a larger value than the maximum heap size
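Those two StdOut lines mean the JVM refused to start because the initial heap (-Xms) was set higher than the maximum heap (-Xmx). A minimal sketch of consistent settings in conf/bootstrap.conf (the property keys match NiFi's stock bootstrap.conf; the 512m/1g sizes are placeholders, not values from this post):

# conf/bootstrap.conf -- initial heap (Xms) must not exceed max heap (Xmx)
# sizes below are placeholders; tune for your hardware
java.arg.2=-Xms512m
java.arg.3=-Xmx1g

After setting Xms at or below Xmx, restart NiFi.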
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
04-11-2017 06:26 PM
You should stick with the JDK 8 version; new features require JDK 8, and JDK 7 is very old at this point.
04-10-2017 09:06 PM
Metastore on princeton10.field.hortonworks.com failed (Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_metastore.py", line 200, in execute
timeout_kill_strategy=TerminateStrategy.KILL_PROCESS_TREE,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
File "/usr/lib/p ython2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
raise ExecutionFailed(err_msg, code, out, err)
ExecutionFailed: Execution of 'export HIVE_CONF_DIR='/usr/hdp/current/hive-metastore/conf' ; hive --hiveconf hive.metastore.uris=thrift://princeton10.field.hortonworks.com:9083 --hiveconf hive.metastore.client.connect.retry.delay=1 --hiveconf hive.metastore.failure.retries=1 --hiveconf hive.metastore.connect.retries=1 --hiveconf hive.metastore.client.socket.timeout=14 --hiveconf hive.execution.engine=mr -e 'show databases;'' returned 12. log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.0.3-8/0/hive-log4j.properties
hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
FAILED: Hive Internal Error: java.lang.ClassNotFoundException(org.apache.atlas.hive.hook.HiveHook)
java.lang.ClassNotFoundException: org.apache.atlas.hive.hook.HiveHook
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.hive.ql.hooks.HookUtils.getHooks(HookUtils.java:60)
at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1386)
at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1370)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1598)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1291)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:315)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
)
This host-level alert is triggered if the Hive Metastore process cannot be determined to be up and listening on the network.
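The root cause in the output above is that hive.exec.post.hooks references org.apache.atlas.hive.hook.HiveHook while the Atlas Hive hook jars are not on Hive's classpath. A hedged sketch of the workaround, assuming Atlas lineage is not needed on this cluster (otherwise, install the Atlas Hive hook instead so the class resolves):

<property>
  <name>hive.exec.post.hooks</name>
  <!-- assumption: previously org.apache.atlas.hive.hook.HiveHook; cleared because the hook jar is missing -->
  <value></value>
</property>

On an Ambari-managed cluster, make this change through the Hive configuration screen and restart the affected Hive services.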
Labels:
- Apache Hive
04-07-2017 06:53 PM
Apache has good documentation for this; you need to make sure you have the correct Kerberos versions.
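A quick, hedged way to check which Kerberos client version is installed (commands are standard MIT Kerberos and RHEL tooling; adjust for your distro):

# print the MIT Kerberos client version
klist -V
# on RHEL/CentOS, check the installed client package
rpm -q krb5-workstation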
04-04-2017 08:33 PM
Please post the full exception logs, the SQL string, the flow file, and its attributes. Is your DB connection working? For further debugging, see: https://dzone.com/articles/finding-nifi-errors
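To capture the full stack trace, tail NiFi's main log while reproducing the error (the path is relative to the NiFi install directory and assumes default logback settings):

tail -f logs/nifi-app.log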
04-02-2017 09:16 PM
2 Kudos
The QueryDatabaseTable processor can easily ingest data from a table based on an incrementing key. A sequence ID or primary key that is autogenerated, as PostgreSQL and MariaDB do, is ideal. You can also use an incrementing date or an Oracle sequence ID; as long as the value increases whenever a new row arrives, you can set it as the maximum-value column. If your tables don't have this, you could write a trigger or procedure in your database that copies rows into a transaction table with such an autogenerated ID, and NiFi will grab that. Clearly, real CDC involves reading write-ahead logs or transaction logs at a deep level and grabbing all changes; that is coming, and can already be done with tools like Attunity + NiFi. For the use cases I have, I just need to grab new rows when they are added to a table, and I control the ID.

I convert from Avro to JSON so I can extract attributes, since I want to do some routing based on column values: based on one field in the table, I determine where I land the data. It can be sent to HBase (and Phoenix), HDFS, or Hive. I split my records for easy processing. One thing I highly recommend for SQL safety and to prevent errors: set your SQL attributes, as shown below.

Example SQL for CDC (Phoenix upsert):

upsert into trials (trialid, trialdescription, fileName) values (1, 'FENTANYL', '5ab2d068-dd53-4674-bcf8-17f7d80d0553');

Hive table over ORC:

CREATE EXTERNAL TABLE IF NOT EXISTS trials2 (trialid INT, trialdescription STRING, trialtype STRING)
STORED AS ORC
LOCATION '/hiveorc';

Phoenix table:

CREATE TABLE trials (trialid INTEGER NOT NULL PRIMARY KEY, trialdescription VARCHAR, filename VARCHAR);

Set your SQL attributes for SQL safety. The types are the numeric values for JDBC types: 12 is VARCHAR (String) and -5 is BIGINT. Then your SQL is standard JDBC syntax with ?'s as placeholders. I also used the Google Location API, called via a NiFi REST call, to enhance some data and get latitude and longitude from a vague location; this kind of thing happens with Twitter data all the time.
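A minimal sketch of those SQL attributes as consumed by NiFi's PutSQL processor (the sql.args.N.type / sql.args.N.value attribute names are PutSQL's documented convention; the values below are illustrative, not from the original flow):

# flow file attributes feeding a parameterized statement
sql.args.1.type  = 4          # JDBC INTEGER, for trialid
sql.args.1.value = 1
sql.args.2.type  = 12         # JDBC VARCHAR, for trialdescription
sql.args.2.value = FENTANYL

# flow file content (the statement PutSQL executes)
upsert into trials (trialid, trialdescription) values (?, ?)

Because the values travel as typed parameters instead of being concatenated into the SQL string, quoting problems and injection-style errors are avoided.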
Reference:
https://www.mockaroo.com/
https://community.hortonworks.com/articles/51902/incremental-fetch-in-nifi-with-querydatabasetable.html
04-02-2017 08:58 PM
6 Kudos
Monitoring Apache NiFi

It's really important to pick some Reporting Tasks to let you know what's happening in your Apache NiFi servers. The Ambari reporting task will send metrics to your HDF Ambari, which will show the results in nice Grafana graphs, charts, and tables. You can also monitor disk usage and memory, and send metrics to DataDog, Ganglia, and other servers. It's also easy to write your own Reporting Task if you need a different one.

One of the ways to monitor your Apache NiFi data flows is to use the MonitorActivity processor, which will create messages that can be sent to your operations dashboard, console, or elsewhere. For people doing ChatOps, you can easily push these messages to Slack with the PutSlack processor. You could also send a REST call to HipChat or other chat tools; it's pretty easy to wrap that up in a custom processor as well.

Other things to monitor:

REST endpoints: server:port/nifi-api/system-diagnostics (a curl sketch follows below). See: https://nifi.apache.org/docs/nifi-docs/rest-api/

Logs: .../nifi/logs/nifi-app.log and .../nifi/logs/nifi-user.log. These can be ingested with Apache NiFi itself for detailed log processing; you can filter and send some messages to SumoLogic or elsewhere via Apache NiFi. See: https://community.hortonworks.com/content/kbentry/67309/routing-logs-through-apache-nifi-to-phoenix-hdfs-a.html
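The system-diagnostics endpoint is easy to poll from the command line; a sketch for an unsecured node (host and port are placeholders):

# heap, thread, and repository-usage stats as JSON
curl -s http://nifi-host:8080/nifi-api/system-diagnostics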
03-31-2017 09:39 PM
6 Kudos
FlowFile Continuation
Sometimes you need to back up your currently running flow, let that flow run at a later date, or make a backup of what is in process now. You want this in permanent storage and want to reconstitute it later, like orange juice, and add it back into the flow or restart it. This could be due to failures, for integration testing, for testing new versions of components, as a checkpoint, or for many other purposes. You don't always want to reprocess the original source or files (they may be gone).

Option 1: Save the raw data that came in originally to local files or HDFS, then read it out of there later.

Option 2 (preferred): MergeContent to FlowFileV3, then reload with Get* to IdentifyMimeType to UnpackContent. Use MergeContent with the FlowFileV3 option; after that step you can PutFile, PutS3Object, PutHDFS, or use other file-saving options, or perhaps send it to an FTP or SFTP server for storage elsewhere. Now you have a pkg file:

cat /opt/demo/flow/904381478117605.pkg
NiFiFF3+tempf73.02sql.args.2.value29.7sql.args.11.type3roll353.9306742667328
mqtt.brokertcp://m13.cloudmqtt.com:14162sql.args.4.type3uuid$9f2f8b6f-2870-40a3-a460-49427cddf9a8
mqtt.topicsensorsql.args.7.type3sql.args.7.value353.9306742667328path./sql.args.4.value33.9sql.args.9.value-0.0sql.args.1.type1humidity29.7pitch14.015266431562901
nf.file.path.mqtt.qos0sql.args.8.type3temp33.9sql.args.1.value34sql.args.2.type3sql.args.10.type3sql.args.8.value128.4983979122009sql.args.5.type3sql.args.6.value14.015266431562901sql.args.3.value1011.1sql.args.10.value-0.0mqtt.isDuplicatefalspressure1011.1mqtt.isRetainedfalseyaw128.4983979122009cputemp3filename904381478117605sql.args.11.value1.0sql.args.9.type3x-0.0y-0.0z1.0sql.args.6.type3
nf.file.name904381478117605sql.args.5.value73.02sql.args.3.type3�[{"tempf": 73.02, "pressure": 1011.1, "pitch": 14.015266431562901, "temp": 33.9, "yaw": 128.4983979122009, "humidity": 29.7, "cputemp": "34", "y": -0.0, "x": -0.0, "z": 1.0, "roll": 353.9306742667328}]%
You can now reload that FlowFileV3 at any time: send it to IdentifyMimeType (so NiFi knows it's a FlowFileV3) and then use UnpackContent to reconstitute the original flow file. You can use it as if it never stopped and was written to disk, which gives you an unlimited queue for storing pre- or partially-processed files, saving time. You could run really expensive processes once, save the preprocessed items, files, or models, and reuse them everywhere. In MergeContent, choose: FlowFile Stream, v3. Thanks to Joe Witt for the explanation of the process. A compact sketch of the full flow follows the references.

Reference:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.UnpackContent/
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/
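The two halves of the flow described above, sketched compactly (the processor names are real NiFi processors; the directory is the one from the cat example):

Save:   ... -> MergeContent (Merge Format: FlowFile Stream, v3) -> PutFile (/opt/demo/flow)
Reload: GetFile (/opt/demo/flow) -> IdentifyMimeType -> UnpackContent -> ...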
03-27-2017 03:02 PM
You can assemble it in NiFi and then store it to ORC. I recommend breaking your JSON down into simpler structures, since you will have to query it and use it with other data. Can you make it a wide table? Duplicate data is not a big deal for Hadoop.
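As a hedged sketch of the wide-table idea (table and column names here are invented for illustration, not taken from the thread):

CREATE EXTERNAL TABLE IF NOT EXISTS readings_wide (
  event_id BIGINT,
  device_name STRING,      -- flattened from a nested JSON device object
  device_city STRING,      -- duplicated per row rather than joined at query time
  reading_value DOUBLE
)
STORED AS ORC
LOCATION '/data/readings_wide';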
03-24-2017 05:57 PM
Have you connected to the data from SQL?