Posts: 1973
Kudos Received: 1225
Solutions: 124

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1770 | 04-03-2024 06:39 AM
 | 2756 | 01-12-2024 08:19 AM
 | 1529 | 12-07-2023 01:49 PM
 | 2283 | 08-02-2023 07:30 AM
 | 3121 | 03-29-2023 01:22 PM
10-01-2016
11:13 PM
2 Kudos
I ran the same flow myself and examined the Avro file in HDFS using the Avro command-line tools. Even though I didn't specify Snappy compression, it was there in the file:

java -jar avro-tools-1.8.0.jar getmeta 23568764174290.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
avro.codec    snappy
avro.schema   {"type":"record","name":"people","doc":"Schema generated by Kite","fields":[{"name":"id","type":"long","doc":"Type inferred from '2'"},{"name":"first_name","type":"string","doc":"Type inferred from 'Gregory'"},{"name":"last_name","type":"string","doc":"Type inferred from 'Vasquez'"},{"name":"email","type":"string","doc":"Type inferred from 'gvasquez1@pcworld.com'"},{"name":"gender","type":"string","doc":"Type inferred from 'Male'"},{"name":"ip_address","type":"string","doc":"Type inferred from '32.8.254.252'"},{"name":"company_name","type":"string","doc":"Type inferred from 'Janyx'"},{"name":"domain_name","type":"string","doc":"Type inferred from 'free.fr'"},{"name":"file_name","type":"string","doc":"Type inferred from 'NonMauris.xls'"},{"name":"mac_address","type":"string","doc":"Type inferred from '03-FB-66-0F-20-A3'"},{"name":"user_agent","type":"string","doc":"Type inferred from '\"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7;'"},{"name":"lat","type":"string","doc":"Type inferred from ' like Gecko) Version/5.0.4 Safari/533.20.27\"'"},{"name":"long","type":"double","doc":"Type inferred from '26.98829'"}]}

The Snappy codec is hard-coded in NiFi: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/ConvertCSVToAvro.java

The processor always adds Snappy compression to every Avro file it writes; there is no option to turn it off:

writer.setCodec(CodecFactory.snappyCodec());

Make sure you have a schema set on the processor:

Record schema: ${inferred.avro.schema}

If you can make everything strings and convert to other types later, you will be happier.

References:
https://www.linkedin.com/pulse/converting-csv-avro-apache-nifi-jeremy-dyer
https://community.hortonworks.com/questions/44063/nifi-avro-to-csv-or-json-to-csvnifi-convert-avro-t.html
https://community.hortonworks.com/articles/28341/converting-csv-to-avro-with-apache-nifi.html
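For context, here is a minimal, self-contained sketch (not the NiFi processor source itself) of how a plain Avro DataFileWriter produces the avro.codec=snappy metadata shown above when setCodec is called the same way; the schema, record values, and output file name are illustrative assumptions:

```java
// Minimal sketch: writing an Avro data file with an explicit Snappy codec,
// the same call ConvertCSVToAvro makes internally. Schema, values, and the
// output file name here are illustrative assumptions, not the processor's.
import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

import java.io.File;
import java.io.IOException;

public class SnappyAvroExample {
    public static void main(String[] args) throws IOException {
        // Tiny schema for illustration; ConvertCSVToAvro infers its schema from the CSV.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"people\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"first_name\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 2L);
        record.put("first_name", "Gregory");

        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            // This is the line that puts avro.codec=snappy into the file metadata.
            writer.setCodec(CodecFactory.snappyCodec());
            writer.create(schema, new File("people.avro"));
            writer.append(record);
        }
    }
}
```

Running getmeta against the resulting people.avro should show avro.codec snappy, just like the file written by the processor.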
09-30-2016
10:40 PM
Here's the simple Zeppelin notebook file: twitter-from-strata-hadoop-processing.txt
Rename it as .json. For security reasons, uploading or downloading .js or .json files doesn't work here, hence the .txt extension.
03-02-2017
03:38 PM
Hi, I have 5 separate queues for 5 different processors. Right now I go to each processor and clear each queue individually, which takes a lot of time. Is there any way to clear all the queues at the same time? Please help me with this. Thanks, Ravi
09-28-2016
02:20 AM
@Timothy Spann Yes, that code was written for Storm 0.10. Now I am trying to test it against 1.0.1. I updated the POM with the necessary Storm and Kafka versions and added the Guava dependency as suggested in the link above, but I am still getting build errors.
08-02-2019
02:25 PM
@Riccardo Iacomini Thank you for the great post! This is very helpful. I am wondering how you batch things together, i.e. end up with many CSV rows per FlowFile instead of one. If we want to batch CSV rows together we would normally use the MergeContent processor, but you also mention that MergeContent is costly. So how is batch processing supposed to work in NiFi?
09-26-2016
06:05 PM
6 Kudos
@Arkaprova Saha It depends on how you see yourself and your future. If you consider yourself a software engineer with a solid Java background who wants to deliver highly optimized, scalable software products based on Spark, then you may want to focus more on Scala. If you are more focused on data wrangling, discovery and analysis, short-term focused studies, or resolving business problems as quickly as possible, then Python is awesome. Python has such a large community, with code snippets, applications, etc. Don't get me wrong, Python can also be used to deliver enterprise-level applications, but Java and Scala are used more often when things need to be highly optimized. Python has some drawbacks, which we will not debate here. Anyhow, I would say that Python is kind of a MUST HAVE and Scala is NICE TO HAVE. Obviously, this is my 2c, and I would be amazed if any single response in this thread turned out to be THE answer.
09-26-2016
06:24 PM
Thanks Andy. I clearly understand the concern around security confidence levels, and I'm not putting this out as a solution, rather as a workaround to let the devs move forward. This isn't an official solution by any means, and everyone reading the thread should understand that.
10-11-2016
08:41 PM
TensorFlow 0.11 is out:

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0rc0-cp27-none-linux_x86_64.whl
09-22-2016
09:45 AM
Hi @Timothy Spann and @Jasper, I found the cause of the issue. I was not putting a colon (:) between the port (2181) and the HBase znode (/hbase-unsecure) in the JDBC URL while loading the table from spark-shell.

Earlier I was loading the table as below, which gave me the "no table found" error:

val jdbcDF = sqlContext.read.format("jdbc").options(Map(
  "driver" -> "org.apache.phoenix.jdbc.PhoenixDriver",
  "url" -> "jdbc:phoenix:<host>:2181/hbase-unsecure",
  "dbtable" -> "TEST_TABLE2")).load()

After adding the colon (:) between the port (2181) and the HBase znode (/hbase-unsecure), I am able to load the table:

val jdbcDF = sqlContext.read.format("jdbc").options(Map(
  "driver" -> "org.apache.phoenix.jdbc.PhoenixDriver",
  "url" -> "jdbc:phoenix:<host>:2181:/hbase-unsecure",
  "dbtable" -> "TEST_TABLE2")).load()
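For anyone hitting the same error outside spark-shell, here is a minimal plain-JDBC sketch of the same Phoenix URL format; the ZooKeeper host, table, and query below are illustrative assumptions, and the key detail is the colon before the znode path:

```java
// Minimal plain-JDBC sketch of the Phoenix URL format
// jdbc:phoenix:<zookeeper-host>:<port>:<znode>.
// Host and table names here are illustrative assumptions.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixUrlExample {
    public static void main(String[] args) throws Exception {
        // Note the colon before the znode (/hbase-unsecure); omitting it
        // leads to the "no table found" error described above.
        String url = "jdbc:phoenix:zk-host.example.com:2181:/hbase-unsecure";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM TEST_TABLE2 LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```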
02-04-2017
08:06 PM
This solved my issue. In my case, the Ambari database is a PostgreSQL database.