1973
Posts
1225
Kudos Received
124
Solutions
My Accepted Solutions
Views | Posted |
---|---|
839 | 04-03-2024 06:39 AM |
1615 | 01-12-2024 08:19 AM |
800 | 12-07-2023 01:49 PM |
1386 | 08-02-2023 07:30 AM |
2001 | 03-29-2023 01:22 PM |
08-15-2016
10:14 PM
2 Kudos
Hi @Timothy Spann, it really depends on your particular use case and requirements. First, I'm assuming you have a custom-built application that will be querying this data store. If so, how complex do the queries need to be? Do you need a relational (SQL) store or a key-value store? Also, how much latency can you afford?
I would first explore whether HBase (or HBase + Phoenix) would be sufficient. This will reduce the number of moving parts you have. If you're set on in-memory data grids/stores, some options are Redis, Hazelcast, Terracotta BigMemory and GridGain (Apache Ignite). I believe the last two have connectors to Hadoop that allow writing the results of MR jobs directly to the data grid (you'll need to confirm that functionality, though). As I said, I recommend you exhaust the HBase option before moving out-of-stack.
06-14-2017
02:23 PM
I confirmed this to be a bug in ConvertJSONToSQL and have written up NIFI-4071. Please see the Jira for details.
07-19-2016
02:34 PM
1 Kudo
When developing the MQTT processors, I used Mosquitto to test. I found it to be an easy-to-use, simple-to-configure broker that handled decently high throughput even on my laptop. That said, the NiFi MQTT processors should be able to communicate with any broker that handles the vanilla MQTT API.
01-24-2018
07:56 PM
If you do a PutHDFS, it generates an attribute, hive.ddl, that can be used to create a Hive table. You can also build the full statement in hive.ddl with UpdateAttribute, using an expression such as: ${hive.ddl} LOCATION '${absolute.hdfs.path}'
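To make the expression concrete, here is a minimal shell sketch of the same string concatenation. The table definition and the HDFS path are hypothetical placeholders, not values from the original post; in a real flow both would come from the flow file's attributes.

```shell
# Hypothetical attribute values; in NiFi these would be read from the flow file.
hive_ddl="CREATE EXTERNAL TABLE IF NOT EXISTS demo_table (id INT, name STRING) STORED AS ORC"
absolute_hdfs_path="/tmp/demo_table"

# Same concatenation as the expression ${hive.ddl} LOCATION '${absolute.hdfs.path}'
full_ddl="${hive_ddl} LOCATION '${absolute_hdfs_path}'"
echo "$full_ddl"
```

The resulting string is a complete DDL statement that could then be passed to Hive, for example through a downstream processor that executes HiveQL.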
11-30-2016
06:38 PM
To get started with the HDCloud for AWS general availability version, visit http://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.8.0/bk_hdcloud-aws/content/index.html
07-25-2016
08:03 PM
1 Kudo
@Timothy Spann There is no officially supported processor to schedule VORA jobs using NiFi. However, a VORA agent communicates directly with the Spark client when running in YARN mode. You can write your program in Python or Scala, have it invoke the VORA classes, and then call those scripts through spark-submit in NiFi using the ExecuteProcess or ExecuteStreamCommand processor.
07-15-2016
11:35 AM
1 Kudo
su hdfs
hadoop fs -mkdir /udf
hadoop fs -put urldetector-1.0-jar-with-dependencies.jar /udf/
hadoop fs -put libs/url-detector-0.1.15.jar /udf/
hadoop fs -chown -R hdfs /udf
hadoop fs -chgrp -R hdfs /udf
hadoop fs -chmod -R 775 /udf
Create Hadoop Directories and upload the two necessary libraries.

CREATE FUNCTION urldetector as 'com.dataflowdeveloper.detection.URLDetector' USING JAR 'hdfs:///udf/urldetector-1.0-jar-with-dependencies.jar', JAR 'hdfs:///udf/url-detector-0.1.15.jar';
Create Hive Function with those HDFS referenced JARs

select http_user_agent, urldetector(remote_host) as urls, remote_host from AccessLogs limit 100;
Test the UDF via Hive QL

@Description(name="urldetector", value="_FUNC_(string) - detects urls")
public final class URLDetector extends UDF{}
Java Header for the UDF

set hive.cli.print.header=true;
add jar urldetector-1.0-jar-with-dependencies.jar;
CREATE TEMPORARY FUNCTION urldetector as 'com.dataflowdeveloper.detection.URLDetector';
select urldetector(description) from sample_07 limit 100;
You can test with a temporary function through Hive CLI before making the function permanent.

mvn compile assembly:single
Build the Jar File for Deployment

The library from LinkedIn (https://github.com/linkedin/URL-Detector) must be compiled and the JAR used in your code and deployed to Hive.

References
See: https://github.com/tspannhw/URLDetector for full source code.
07-08-2016
09:50 PM
So the issue was that the library I was using was compiled with JDK 8 while everything else is on JDK 7. No issue was reported beforehand: JUnit and local Java applications ran fine. When I manually uploaded the JAR, it gave me the dreaded "Unsupported major.minor version 52.0" error. With a properly compiled library, we will be fine. So make sure you compile with JDK 7 if your Hadoop / Hive platform runs on JDK 7.
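One way to catch this before deploying is to read the class file version directly. The sketch below assumes a POSIX shell with od, printf and mktemp available; the function name class_major_version is made up for illustration. It relies on the class file layout: bytes 0-3 are the magic number, bytes 4-5 the minor version, and bytes 6-7 the major version (big-endian). Major version 52 corresponds to Java 8 and 51 to Java 7.

```shell
# Print the major version of a compiled .class file.
# Bytes 6-7 of the file hold the major version as a big-endian 16-bit number.
class_major_version() {
  # od prints the two bytes as unsigned decimals, e.g. "0 52"
  set -- $(od -An -j6 -N2 -tu1 "$1")
  echo $(( $1 * 256 + $2 ))
}

# Demo on a fabricated 8-byte header (magic CAFEBABE, minor 0, major 52),
# standing in for a real class extracted from a JAR.
demo_class=$(mktemp)
printf '\312\376\272\276\000\000\000\064' > "$demo_class"
major=$(class_major_version "$demo_class")
echo "major version: $major"
rm -f "$demo_class"
```

For a real library you would first extract a class from the JAR (for example with unzip) and pass that file to the function. If it reports 52 and the cluster runs JDK 7, recompile with source and target set to 1.7.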
07-07-2016
11:23 PM
2 Kudos
Adding HDF (with Apache NiFi) to your HDP 2.5 Sandbox is very quick, painless and easy. Get the most recent Hortonworks DataFlow (download):
wget http://d3d0kdwqv675cq.cloudfront.net/HDF/centos6/1.x/updates/1.2.0.1/HDF-1.2.0.1-1.tar.gz
tar -xvf HDF-1.2.0.1-1.tar.gz
cd HDF-1.2.0.1-1/nifi/
Then change the port used by NiFi in the conf/nifi.properties file to:
nifi.web.http.port=8090
Install NiFi as a Linux Service
bin/nifi.sh install
sudo service nifi start
NiFi home: /opt/HDF-1.2.0.1-1/nifi
Bootstrap Config File: /opt/HDF-1.2.0.1-1/nifi/conf/bootstrap.conf
2016-07-04 02:18:00,005 INFO [main] org.apache.nifi.bootstrap.Command Starting Apache NiFi...
2016-07-04 02:18:00,006 INFO [main] org.apache.nifi.bootstrap.Command Working Directory: /opt/HDF-1.2.0.1-1/nifi
You can check the status of a single NiFi server via the status command:
[root@sandbox nifi]# sudo service nifi status
nifi.sh: JAVA_HOME not set; results may vary
Java home:
NiFi home: /opt/HDF-1.2.0.1-1/nifi
Bootstrap Config File: /opt/HDF-1.2.0.1-1/nifi/conf/bootstrap.conf
2016-07-04 02:18:42,527 INFO [main] org.apache.nifi.bootstrap.Command Apache NiFi is currently running, listening to Bootstrap on port 43184, PID=4391
Make sure you add port 8090 to the sandbox networking. You are now ready to go. Now start flowing.
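If you would rather script the port change than edit conf/nifi.properties by hand, a small sketch like the following could work. The function name set_nifi_port is hypothetical, and the demo runs against a throwaway file rather than a real NiFi install; it assumes the property already exists on its own line in the file.

```shell
# Rewrite the nifi.web.http.port line in a properties file.
# Writes to a temp file and moves it into place, avoiding non-portable sed -i.
set_nifi_port() {
  props_file=$1
  port=$2
  tmp=$(mktemp)
  sed "s/^nifi\.web\.http\.port=.*/nifi.web.http.port=${port}/" "$props_file" > "$tmp" \
    && mv "$tmp" "$props_file"
}

# Demo on a temporary stand-in for conf/nifi.properties.
demo_props=$(mktemp)
printf 'nifi.web.http.host=\nnifi.web.http.port=8080\n' > "$demo_props"
set_nifi_port "$demo_props" 8090
result=$(grep '^nifi\.web\.http\.port=' "$demo_props")
echo "$result"
rm -f "$demo_props"
```

Against a real install you would call it as set_nifi_port conf/nifi.properties 8090 from the NiFi home directory before starting the service.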