1973 Posts · 1225 Kudos Received · 124 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1851 | 04-03-2024 06:39 AM |
|  | 2886 | 01-12-2024 08:19 AM |
|  | 1594 | 12-07-2023 01:49 PM |
|  | 2357 | 08-02-2023 07:30 AM |
|  | 3248 | 03-29-2023 01:22 PM |
06-14-2017 03:36 PM
Fixed by removing almost all spaces. You can't have empty lines or line breaks, and there must be no spaces between array elements.
{"type":"record","namespace":"hortonworks.hdp.refapp.sensehat","name":"sensehat","fields":[{"name": "tempf", "type": "float"},{ "name": "cputemp", "type": "float"},{ "name": "pressure","type": "float"},{ "name": "host","type": "string"},{ "name": "pitch","type": "float"},{"name": "ipaddress","type": "string"},{"name": "temp","type": "float"},{ "name": "diskfree","type": "string"},{ "name": "yaw","type": "float" },{"name": "humidity","type": "float"},{"name": "memory","type": "float"},{"name": "y", "type": "float"},{"name": "x", "type": "float" },{"name": "z","type": "float"},{"name": "roll", "type": "float"}]}
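A quick way to get that compact, single-line form without hand-editing is to round-trip the schema through a JSON library. A minimal sketch, assuming the schema above is saved in a file (the file name here is hypothetical):

```python
import json

# Load the Avro schema and re-serialize it with no whitespace:
# no empty lines, no line breaks, no spaces between array elements.
with open("sensehat.avsc") as f:   # hypothetical file name
    schema = json.load(f)

print(json.dumps(schema, separators=(",", ":")))  # one compact line
```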
02-13-2019 09:54 PM
This is superb. I plan to do the same. Quick question: which version of MiNiFi is this, in the sense of Java or C++? @Timothy Spann
05-29-2017 03:15 PM
Thanks Timothy. I have found a way to read the template's XML file in the onTrigger method of my custom processor, but I am running into some issues. I am using the getTemplates() method of the StandardNiFiServiceFacade class, which implements the NiFiServiceFacade interface, and I create an object of that interface in my custom processor's code like this: private NiFiServiceFacade serviceFacade = new StandardNiFiServiceFacade(); But when I start my NiFi server, I get an error on this line. Below is the error I am getting:
java.util.ServiceConfigurationError: org.apache.nifi.processor.Processor: Provider org.apache.nifi.processors.hadoop.SparkConnector could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:232) ~[na:1.8.0_111]
at java.util.ServiceLoader.access$100(ServiceLoader.java:185) ~[na:1.8.0_111]
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384) ~[na:1.8.0_111]
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) ~[na:1.8.0_111]
at java.util.ServiceLoader$1.next(ServiceLoader.java:480) ~[na:1.8.0_111]
at org.apache.nifi.nar.ExtensionManager.loadExtensions(ExtensionManager.java:116) ~[nifi-nar-utils-1.1.2.jar:1.1.2]
at org.apache.nifi.nar.ExtensionManager.discoverExtensions(ExtensionManager.java:97) ~[nifi-nar-utils-1.1.2.jar:1.1.2]
at org.apache.nifi.NiFi.<init>(NiFi.java:139) ~[nifi-runtime-1.1.2.jar:1.1.2]
at org.apache.nifi.NiFi.main(NiFi.java:262) ~[nifi-runtime-1.1.2.jar:1.1.2]
Caused by: java.lang.NoClassDefFoundError: org/apache/nifi/web/revision/RevisionClaim
at org.apache.nifi.processors.hadoop.SparkConnector.<init>(SparkConnector.java:53) ~[nifi-hdfs-processors-1.1.2.jar:1.1.2]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_111]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_111]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_111]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_111]
at java.lang.Class.newInstance(Class.java:442) ~[na:1.8.0_111]
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380) ~[na:1.8.0_111]
... 6 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.apache.nifi.web.revision.RevisionClaim
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_111]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_111]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_111]
... 13 common frames omitted
I can't understand what exactly the problem is, as I have already added all the required JAR files to the NAR, yet I still get this error. Please suggest what I am doing wrong here, or, if you know one, an alternate way to get all the template objects as a TemplateEntity in the onTrigger method of my custom processor. Thanks in advance.
05-22-2017 11:23 PM
2 Kudos
Ingesting JMS Data into Hive
A company has a lot of data moving around the enterprise asynchronously with Apache ActiveMQ. They want to tap into it, convert the JSON messages coming from web servers, and store them in Hadoop. I am storing the data in Apache Phoenix / HBase via SQL, and since it's so easy, I am also storing the data as ORC files in HDFS for Apache Hive access.
Apache NiFi 1.2 generates the DDL for a Hive table for us in the hive.ddl attribute:
CREATE EXTERNAL TABLE IF NOT EXISTS meetup
(id INT, first_name STRING, last_name STRING, email STRING, ip_address STRING, company STRING, macaddress STRING, cell_phone STRING) STORED AS ORC
LOCATION '/meetup'
insert into meetup
(id , first_name , last_name , email , ip_address , company , macaddress , cell_phone)
values
(?,?,?,?,?,?,?,?)
HDF Flow
ConsumeJMS
Path 1: Store in Hadoop as ORC with a Hive Table
InferAvroSchema: get a schema from the JSON data
ConvertJSONtoAVRO: build an AVRO file from JSON data
MergeContent: build a larger chunk of AVRO data
ConvertAvroToORC: build ORC files
PutHDFS: land in your Hadoop data lake
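For context (not part of the original article), a record on this path might look like the following, with field names matching the meetup table above and made-up values:

```python
import json

# Hypothetical sample record; the field names mirror the Hive columns above.
record = {
    "id": 1001,
    "first_name": "Jane",
    "last_name": "Doe",
    "email": "jane.doe@example.com",
    "ip_address": "10.0.0.12",
    "company": "Example Corp",
    "macaddress": "00:1b:44:11:3a:b7",
    "cell_phone": "555-0100",
}

# This JSON text is what InferAvroSchema and ConvertJSONtoAVRO operate on.
print(json.dumps(record))
```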
Path 2: Upsert into Phoenix (or any SQL database)
EvaluateJSONPath: extract fields from the JSON file
UpdateAttribute: Set SQL fields
ReplaceText: Create SQL statement with ?.
PutSQL: Send to Phoenix through Connection Pool (JDBC)
Path 3: Store Raw JSON in Hadoop
PutHDFS: Store JSON data on ingest
Path 4: Call Original REST API to Obtain Data and Send to JMS
GetHTTP: call a REST API to retrieve JSON arrays
SplitJSON: split the JSON file into individual records
PutJMS <or> PublishJMS: two ways to push messages to JMS. One uses a JMS controller and another uses a JMS client without a controller. I should benchmark this.
Error Message from Failed Load
If I get errors on JMS send, I send the UUID of the file to Slack for ChatOps.
Zeppelin Display of SQL Data
To check the tables I use Apache Zeppelin to query Phoenix and Hive tables.
Formatting in UpdateAttribute for SQL Arguments
To set the ? parameters for the JDBC prepared statement, the arguments are numbered starting from 1. The type is the JDBC type (12 is String), and the value is the value of the FlowFile field. Yet another message queue ingested with no fuss.
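As a rough illustration (not taken from the original flow), the FlowFile attributes that PutSQL reads for the insert above might look like this; they are shown as a Python dict purely for readability, and the values are made up:

```python
# Hypothetical attributes for the first two ? placeholders of the insert above.
# PutSQL reads sql.args.N.type / sql.args.N.value pairs, numbered left to right.
flowfile_attributes = {
    "sql.args.1.type": "4",      # JDBC type 4 = INTEGER, for the id column
    "sql.args.1.value": "1001",
    "sql.args.2.type": "12",     # JDBC type 12 = VARCHAR (String), for first_name
    "sql.args.2.value": "Jane",
    # ...and so on, one type/value pair per ? placeholder
}
print(flowfile_attributes)
```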
05-22-2017 08:16 PM
4 Kudos
Backup Files from Hadoop
ListHDFS: set parameters, pick a high-level directory and work down. /etc/hadoop/conf/core-site.xml
FetchHDFS: ${path}/${filename}
PutFile: store locally
Backup Hive Tables
SelectHiveQL: output format AVRO, with SQL: select * from beaconstatus
ConvertAVROtoORC: generic for all the tables
UpdateAttribute: tablename = ${hive.ddl:substringAfter('CREATE EXTERNAL TABLE IF NOT EXISTS '):substringBefore(' (')} (see the small Python sketch after the directory listing below)
PutFile: use replace directories and create missing directories, with directory /Volumes/Transcend/HadoopBackups/hive/${tablename}
For Phoenix tables, I use the same ConvertAvroToORC, UpdateAttribute and PutFile boxes and just add ExecuteSQL to ingest the Phoenix data. For every new table, I add one box and link it to ConvertAvroToORC. Done! This is enough of a backup that if I need to rebuild and refill my development cluster, I can do so easily. I also have these scheduled to run once a day and rewrite everything.
This is Not For Production or Extremely Large Data! It works great for a development cluster or a personal dev cluster. You can easily back up files by ingesting them with GetFile, and other things can be backed up by calling ExecuteStreamCommand.
Local File Storage of Backed up Data
drwxr-xr-x 3 tspann staff 102 May 20 23:00 any_data_trials2
drwxr-xr-x 3 tspann staff 102 May 20 22:59 any_data_meetup
drwxr-xr-x 3 tspann staff 102 May 20 22:59 any_data_ibeacon
drwxr-xr-x 3 tspann staff 102 May 20 22:57 any_data_gpsweather
drwxr-xr-x 3 tspann staff 102 May 20 10:53 any_data_beaconstatus
drwxr-xr-x 3 tspann staff 102 May 20 10:52 any_data_beacongateway
drwxr-xr-x 3 tspann staff 102 May 19 17:36 any_data_atweetshive2
drwxr-xr-x 3 tspann staff 102 May 19 17:31 any_data_atweetshive
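For clarity (not part of the original flow), here is a rough Python equivalent of the NiFi Expression Language used above to pull the table name out of the hive.ddl attribute:

```python
def table_name_from_ddl(hive_ddl: str) -> str:
    """Mimic ${hive.ddl:substringAfter('CREATE EXTERNAL TABLE IF NOT EXISTS '):substringBefore(' (')}."""
    after = hive_ddl.split("CREATE EXTERNAL TABLE IF NOT EXISTS ", 1)[-1]
    return after.split(" (", 1)[0]

ddl = "CREATE EXTERNAL TABLE IF NOT EXISTS meetup (id INT, first_name STRING) STORED AS ORC"
print(table_name_from_ddl(ddl))  # -> meetup, which becomes ${tablename} in the PutFile directory
```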
Other Tools to Extract Data
Run show tables to get your list, and then you can grab all the DDL for the Hive tables. ddl.sql:
show create table atweetshive;
show create table atweetshive2;
show create table beacongateway;
show create table beaconstatus;
show create table dronedata;
show create table gps;
show create table gpsweather;
show create table ibeacon;
show create table meetup;
show create table trials2;
Hive Script to Export Table DDL
beeline -u jdbc:hive2://myhiveserverthrift:10000/default --color=false --showHeader=false --verbose=false --silent=true --outputformat=csv -f ddl.sql
Backup Zeppelin Notebooks in Bulk
tar -cvf notebooks.tar /usr/hdp/current/zeppelin-server/notebook/
gzip -9 notebooks.tar
scp userid@pservername:/opt/demo/notebooks.tar.gz .
03-05-2018 03:51 PM
We are running Inception; see here: /tensorflow/models/tutorials/image/imagenet/classify_image.py
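For anyone following along, a minimal sketch of how that script can be invoked (the image path is hypothetical; classify_image.py downloads the Inception model on first run):

```python
import subprocess

# Run the TensorFlow ImageNet/Inception tutorial script against one image.
subprocess.run(
    [
        "python",
        "/tensorflow/models/tutorials/image/imagenet/classify_image.py",
        "--image_file", "/tmp/example.jpg",   # hypothetical image path
    ],
    check=True,
)
```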
09-12-2017 02:03 AM
Make sure Phoenix is enabled. PQS can be running even when HBase is not enabled for Phoenix.
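One quick sanity check, assuming the Python phoenixdb client and PQS on its default port (both assumptions, not from the original post): if HBase isn't actually enabled for Phoenix, even a trivial query through PQS will fail.

```python
import phoenixdb  # Apache Phoenix Query Server (thin) client

# Connect through PQS and run a harmless query against the Phoenix system catalog.
conn = phoenixdb.connect("http://localhost:8765/", autocommit=True)
cursor = conn.cursor()
cursor.execute("SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 5")
print(cursor.fetchall())   # errors out here if Phoenix isn't really enabled
conn.close()
```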
08-16-2018 12:30 PM
I too wanted it.
05-10-2017 06:29 AM
Hey @Timothy Spann, this is a really cool demo involving all my favorites: Hadoop, RPi, NiFi, and IoT. Great job, keep it up!
05-08-2017 05:39 PM
Thanks for the write-up. It is probably the permissions required for writing to temp space and the Hive data warehouse HDFS structure. It is always good to get permissions right. If you have Kerberos or required logins, or run as a different user, you may face issues as well.