Posts: 1973 · Kudos Received: 1225 · Solutions: 124
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1841 | 04-03-2024 06:39 AM |
| | 2859 | 01-12-2024 08:19 AM |
| | 1581 | 12-07-2023 01:49 PM |
| | 2344 | 08-02-2023 07:30 AM |
| | 3231 | 03-29-2023 01:22 PM |
07-22-2016
09:17 PM
Thanks for the analysis. Does anyone have similar sizings for Google and Azure?
07-22-2016
02:32 PM
Are you on a sandbox? Do you have access to/from the box? Did you get an error? It should show up in the UI. Is it always a timeout? Can NiFi access anything?
07-22-2016
01:54 PM
1 Kudo
What are the best instance types for HDP nodes (master, data, edge)? I found a number of instance types that may work. Looking at:
- TCO: https://awstcocalculator.com/
- Rough pricing: https://calculator.s3.amazonaws.com/index.html
- Amazon instance types: http://www.ec2instances.info/
- Amazon EC2 instance types: https://aws.amazon.com/ec2/instance-types/
Labels:
- Hortonworks Data Platform (HDP)
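As a rough sanity check alongside those calculators, on-demand monthly cost is approximately hourly rate × 730 hours. A minimal sketch, using hypothetical node counts and per-hour rates (not current AWS pricing; check the links above for real numbers):

```python
# Rough monthly cost estimate for an HDP cluster layout.
# Node counts and rates are hypothetical placeholders.
HOURS_PER_MONTH = 730

nodes = {
    # role: (count, hourly_rate_usd)
    "master": (3, 0.50),
    "data":   (8, 1.00),
    "edge":   (1, 0.25),
}

def monthly_cost(nodes):
    return sum(count * rate * HOURS_PER_MONTH
               for count, rate in nodes.values())

print(f"${monthly_cost(nodes):,.2f}/month")  # $7,117.50/month for these rates
```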
07-21-2016
10:13 PM
2 Kudos
Using the GetHTTP processor, we grab random images from DigitalOcean's Unsplash.it free image site. I give each image a random file name so we can save it uniquely in HDFS. The entire data flow runs from GetHTTP to final HDFS storage of the image and its metadata as JSON, via the ExtractMediaMetadata processor. The final results: hdfs dfs -cat /mediametadata/random1469112881039.json
{
  "Number of Components": "3",
  "Resolution Units": "none",
  "Image Height": "200 pixels",
  "File Name": "apache-tika-3181704319795384377.tmp",
  "Data Precision": "8 bits",
  "File Modified Date": "Thu Jul 21 14:54:43 UTC 2016",
  "tiff:BitsPerSample": "8",
  "Compression Type": "Progressive, Huffman",
  "X-Parsed-By": "org.apache.tika.parser.DefaultParser, org.apache.tika.parser.jpeg.JpegParser",
  "Component 1": "Y component: Quantization table 0, Sampling factors 2 horiz/2 vert",
  "Component 2": "Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert",
  "tiff:ImageLength": "200",
  "mime.type": "image/jpeg",
  "gethttp.remote.source": "unsplash.it",
  "Component 3": "Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert",
  "X Resolution": "1 dot",
  "File Size": "4701 bytes",
  "tiff:ImageWidth": "200",
  "path": "./",
  "filename": "random1469112881039.jpg",
  "Image Width": "200 pixels",
  "uuid": "8b7c4f9f-9436-4ccb-b06e-9a720c91f6e0",
  "Content-Type": "image/jpeg",
  "Y Resolution": "1 dot"
}
We can grab as many images as we want. Using the Unsplash.it parameters, I fixed the image width at 200; you can customize that. Below is the image downloaded with the above metadata.
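The random-file-name trick can be sketched outside NiFi too. A minimal sketch, assuming the epoch-milliseconds suffix pattern seen in random1469112881039.jpg above:

```python
import time

def random_image_name(prefix="random", ext="jpg"):
    # Unique name from epoch milliseconds, matching the pattern
    # seen above (e.g. random1469112881039.jpg).
    millis = int(time.time() * 1000)
    return f"{prefix}{millis}.{ext}"

print(random_image_name())  # e.g. random1469112881039.jpg (millis will differ)
```

NiFi's UpdateAttribute with the expression language can produce the same pattern inside the flow itself.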
07-21-2016
02:18 AM
8 Kudos
In Apache NiFi 1.2, there are processors to get data from and put data to an MQTT broker, a protocol popular in IoT because of its small footprint and speed. MQTT is supported by Eclipse and IBM. I created an example on HDP 2.6, where I downloaded and installed the latest Apache NiFi 1.2 as well as an example MQTT broker, Mosquitto (http://mosquitto.org/).
To install Mosquitto on HDP 2.6 (CentOS 7.x):
sudo wget http://download.opensuse.org/repositories/home:/oojah:/mqtt/CentOS_CentOS-6/home:oojah:mqtt.repo
sudo cp *.repo /etc/yum.repos.d/
sudo yum -y update
sudo yum -y install mosquitto
To verify the settings and prepare logs:
[root@sandbox opt]# cat /etc/mosquitto/mosquitto.conf
# Place your local configuration in /etc/mosquitto/conf.d/
pid_file /var/run/mosquitto.pid
persistence true
persistence_location /var/lib/mosquitto/
#log_dest file /var/log/mosquitto/mosquitto.log
include_dir /etc/mosquitto/conf.d
[root@sandbox opt]# vi /etc/mosquitto/mosquitto.conf
[root@sandbox opt]# mkdir -p /var/log/mosquitto
[root@sandbox opt]# chmod 777 /var/log/mosquitto/
[root@sandbox opt]# touch /var/log/mosquitto/mosquitto.log
Run the MQTT broker server:
mosquitto -d
The default port for MQTT and Mosquitto is 1883. Make sure that port is not blocked by firewalls or antivirus software, and if on the sandbox, that it is exposed. (Screenshots: Running Mosquitto on Sandbox; NiFi PublishMQTT; NiFi ConsumeMQTT)
After running:
[root@sandbox demo]# hdfs dfs -ls /mqtt
root hdfs 2783 2016-07-20 14:56 /mqtt/37115929161818
root hdfs 2805 2016-07-20 14:56 /mqtt/37115930927495
(Screenshots: ConsumeMQTT; PublishMQTT)
Resources:
- http://mosquitto.org/man/mosquitto-8.html
- http://ceit.uq.edu.au/content/mqtt-and-growl
- http://growl.info/
- http://www.eclipse.org/paho/
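ConsumeMQTT subscribes with a topic filter, and MQTT filters support the + (single-level) and # (multi-level) wildcards. A minimal sketch of that matching logic, just to illustrate the semantics (not how Mosquitto implements it internally):

```python
def topic_matches(filter_, topic):
    """MQTT-style topic filter matching with + and # wildcards."""
    f_parts = filter_.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":                 # matches this level and everything below
            return True
        if i >= len(t_parts):        # filter is longer than the topic
            return False
        if f != "+" and f != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)

print(topic_matches("sensors/+/temp", "sensors/kitchen/temp"))  # True
print(topic_matches("sensors/#", "sensors/kitchen/temp"))       # True
print(topic_matches("sensors/+/temp", "sensors/kitchen/hum"))   # False
```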
07-20-2016
06:13 PM
2 Kudos
I am looking for the best option for in-memory computing and fast data. The most recent data we have (current, 5 minutes, 1 hour, < 1 day) needs to be accessible as fast as possible. It's probably 500 GB or less. Something like Pivotal's Butterfly Architecture. What will work best for keeping some of this fast data? I have been looking at Apache Geode, Apache Ignite, Alluxio, SnappyData, Redis, HDFS RAM disk data nodes, HBase in-memory column families, Kafka, and Spark Streaming. Are there any baked solutions out there that work with HDP?
Labels:
- Apache Hadoop
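Whatever store ends up fitting, the access pattern above is essentially a time-windowed cache. A minimal sketch of TTL-based expiry, purely to illustrate the tiering idea (not a substitute for Geode, Ignite, or Redis):

```python
import time

class TTLCache:
    """Keep only 'fast data' younger than ttl_seconds."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (timestamp, value)

    def put(self, key, value):
        self.store[key] = (time.time(), value)

    def get(self, key):
        item = self.store.get(key)
        if item is None:
            return None
        ts, value = item
        if time.time() - ts > self.ttl:
            del self.store[key]   # expired: evict lazily on read
            return None
        return value

cache = TTLCache(ttl_seconds=300)  # the "5 minutes" tier
cache.put("sensor:42", {"temp": 21.5})
print(cache.get("sensor:42"))  # {'temp': 21.5}
```

A real deployment would layer several of these tiers (5 min, 1 hour, 1 day) over colder HDFS storage.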
07-20-2016
03:44 PM
I had the catalog and schema names set, and then left them off; I tried a few options. twitter is a table in the default Hive database. SelectHiveQL is working fine.
07-20-2016
03:29 PM
I set unmatched columns to ignore. I tried both true and false for field names.
07-20-2016
03:11 PM
1 Kudo
Is there anything special to get this to work? Hive table:
create table
twitter(
id int,
handle string,
hashtags string,
msg string,
time string,
user_name string,
tweet_id string,
unixtime string,
uuid string
) stored as orc
tblproperties ("orc.compress"="ZLIB");
The data is a pared-down tweet: {
"user_name" : "Tweet Person",
"time" : "Wed Jul 20 15:09:42 +0000 2016",
"unixtime" : "1469027382664",
"handle" : "SomeTweeter",
"tweet_id" : "755781737674932224",
"hashtags" : "",
"msg" : "RT some stuff"
}
Labels:
- Apache Hive
- Apache NiFi
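In NiFi, ConvertJSONToSQL (feeding PutSQL/PutHiveQL) performs roughly this JSON-key-to-column mapping. A rough sketch of the translation, assuming string-typed columns as in the DDL above; note the JSON is missing the id and uuid columns, which is why the unmatched-column behavior matters:

```python
import json

# Columns from the twitter DDL above.
columns = ["id", "handle", "hashtags", "msg", "time",
           "user_name", "tweet_id", "unixtime", "uuid"]

tweet = json.loads("""{
  "user_name": "Tweet Person",
  "time": "Wed Jul 20 15:09:42 +0000 2016",
  "unixtime": "1469027382664",
  "handle": "SomeTweeter",
  "tweet_id": "755781737674932224",
  "hashtags": "",
  "msg": "RT some stuff"
}""")

def to_insert(table, row, columns):
    # Only columns present in the JSON are emitted; missing ones
    # (id, uuid here) are dropped, mimicking "ignore unmatched".
    present = [c for c in columns if c in row]
    vals = ", ".join("'%s'" % str(row[c]).replace("'", "''") for c in present)
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, ", ".join(present), vals)

print(to_insert("twitter", tweet, columns))
```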
07-20-2016
03:45 AM
"true" is spelled wrong.