1973 Posts
1225 Kudos Received
124 Solutions
My Accepted Solutions
Views | Posted |
---|---|
772 | 04-03-2024 06:39 AM |
1424 | 01-12-2024 08:19 AM |
771 | 12-07-2023 01:49 PM |
1326 | 08-02-2023 07:30 AM |
1920 | 03-29-2023 01:22 PM |
05-11-2016
07:39 PM
Make sure the Ambari Agent is installed and started. Make sure the firewall is off, passwordless SSH is set up and working, SELinux is disabled, and the required network ports are open. Then restart the Ambari Server.
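The "network ports are open" check can be scripted from the agent host. The following is a minimal sketch, not an official Ambari tool: it assumes the stock Ambari Server ports (8080 for the web UI, 8440 and 8441 for agent traffic), which may have been changed in your configuration, and the function names are made up for illustration.

```python
import socket

# Assumed Ambari Server defaults; adjust if your install overrides them.
AMBARI_SERVER_PORTS = (8080, 8440, 8441)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_ports(host, ports=AMBARI_SERVER_PORTS):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_ports("your-ambari-server-host")` from an agent node (the hostname is a placeholder) returns a dict such as `{8080: True, 8440: False, 8441: False}`, which would point at a firewall or a server that is not running.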
05-11-2016
01:40 PM
1 Kudo
See http://spark.apache.org/docs/latest/submitting-applications.html and use --deploy-mode cluster (or client for client mode):
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000
05-10-2016
03:45 PM
I am wondering about a fully open source solution for Master Data Management.
05-10-2016
02:39 PM
So we have 100 different spreadsheets in CSV format with 20 fields. The fields are fairly standard, but the names vary: some people use First Name, some use Name or firstname, and some use a single name field. Some use M and F for gender; some use 0 and 1. We want to convert all of these CSV variants into one gold standard with standard field names, types, and ranges.
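One way to approach this kind of cleanup is an alias table for headers plus a small value normalizer. The sketch below is plain Python rather than Spark and is only an illustration: the alias table, the target field names (`first_name`, `last_name`, `gender`), the treatment of a bare `Name` column as a full name, and the meaning of 0 and 1 are all assumptions that would have to be confirmed per source file.

```python
import csv
import io
import re

# Hypothetical alias table: normalized header -> gold-standard field name.
HEADER_ALIASES = {
    "firstname": "first_name",
    "lastname": "last_name",
    "name": "full_name",  # assumption: a bare "Name" column holds a full name
    "sex": "gender",
}

GENDER_WORDS = {"m": "M", "male": "M", "f": "F", "female": "F"}


def canonical_header(header):
    """Lowercase, drop punctuation and spaces, then apply the alias table."""
    key = re.sub(r"[^a-z0-9]", "", header.lower())
    return HEADER_ALIASES.get(key, key)


def normalize_gender(value, numeric_codes=None):
    """Return 'M', 'F', or '' when the value cannot be mapped."""
    key = value.strip().lower()
    if key in GENDER_WORDS:
        return GENDER_WORDS[key]
    if numeric_codes and key in numeric_codes:
        return numeric_codes[key]
    return ""  # unknown values are left blank for manual review


def standardize(csv_text, numeric_codes=None):
    """Parse CSV text and return rows keyed by gold-standard field names."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {canonical_header(k): (v or "").strip()
               for k, v in raw.items() if k is not None}
        if "full_name" in row:
            # Split a single name field on the first space.
            first, _, last = row.pop("full_name").partition(" ")
            row.setdefault("first_name", first)
            row.setdefault("last_name", last)
        if "gender" in row:
            row["gender"] = normalize_gender(row["gender"], numeric_codes)
        rows.append(row)
    return rows
```

Because 0 and 1 carry no meaning on their own, the numeric mapping is passed per source, for example `standardize(text, numeric_codes={"0": "M", "1": "F"})` for a file known to use that coding.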
Labels: Apache Spark
05-04-2016
05:06 PM
Excellent, let me know when that drops or is in alpha. I will test it.
05-04-2016
02:49 PM
1 Kudo
I ran the same flow myself and examined the Avro file in HDFS using the Avro CLI (avro-tools).
Even though I didn't specify Snappy compression, it was there in the file.
[root@sandbox opt]# java -jar avro-tools-1.8.0.jar getmeta 23568764174290.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
avro.codec snappy
avro.schema {"type":"record","name":"people","doc":"Schema generated by Kite","fields":[{"name":"id","type":"long","doc":"Type inferred from '2'"},{"name":"first_name","type":"string","doc":"Type inferred from 'Gregory'"},{"name":"last_name","type":"string","doc":"Type inferred from 'Vasquez'"},{"name":"email","type":"string","doc":"Type inferred from 'gvasquez1@pcworld.com'"},{"name":"gender","type":"string","doc":"Type inferred from 'Male'"},{"name":"ip_address","type":"string","doc":"Type inferred from '32.8.254.252'"},{"name":"company_name","type":"string","doc":"Type inferred from 'Janyx'"},{"name":"domain_name","type":"string","doc":"Type inferred from 'free.fr'"},{"name":"file_name","type":"string","doc":"Type inferred from 'NonMauris.xls'"},{"name":"mac_address","type":"string","doc":"Type inferred from '03-FB-66-0F-20-A3'"},{"name":"user_agent","type":"string","doc":"Type inferred from '\"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7;'"},{"name":"lat","type":"string","doc":"Type inferred from ' like Gecko) Version/5.0.4 Safari/533.20.27\"'"},{"name":"long","type":"double","doc":"Type inferred from '26.98829'"}]}
It's hard-coded in NiFi:
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/ConvertCSVToAvro.java
ConvertCSVToAvro always sets Snappy compression on every Avro file. There are no options to change it.
Line 224: writer.setCodec(CodecFactory.snappyCodec());
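If avro-tools is not at hand, the same codec check can be done by reading the file header directly. This is a minimal sketch based on the Avro object container file layout (a 4-byte magic, then a metadata map of string keys to byte values); the function names are illustrative and it assumes the file has been copied out of HDFS to local disk.

```python
AVRO_MAGIC = b"Obj\x01"


def _read_long(stream):
    """Decode one Avro long (zigzag-encoded variable-length integer)."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        byte = raw[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_avro_metadata(stream):
    """Return the header metadata map of an Avro container file as a dict."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = _read_long(stream)
        if count == 0:  # a zero count ends the map
            break
        if count < 0:  # negative count is followed by the block's byte size
            _read_long(stream)
            count = -count
        for _ in range(count):
            key = stream.read(_read_long(stream)).decode("utf-8")
            meta[key] = stream.read(_read_long(stream))
    return meta


def avro_codec(path):
    """Return the codec name recorded in the file header."""
    with open(path, "rb") as handle:
        meta = read_avro_metadata(handle)
    # A missing avro.codec entry means the "null" (uncompressed) codec.
    return meta.get("avro.codec", b"null").decode("utf-8")
```

For a file like the one above, `avro_codec("23568764174290.avro")` would be expected to report the same value that getmeta printed for avro.codec.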
05-03-2016
03:07 PM
Is the data encrypted when it leaves the edge device? Use SSL for transport and land the data encrypted in HDP.
05-03-2016
02:07 PM
2 Kudos
Here is a good setup guide for Scala + SBT + Spark in IntelliJ: https://hadoopist.wordpress.com/2016/02/03/how-to-setup-your-first-spark-project-in-intellij-ide/
The Spark team also has a good IDE setup guide: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IDESetup
05-02-2016
08:08 PM
1 Kudo
After you download the sandbox and HDF, take a look at these resources:
https://dzone.com/articles/getting-started-with-apache-nifi-and-hdf
https://dzone.com/articles/anatomy-of-a-scala-spark-program
https://dzone.com/articles/hortonworks-top-15-links-of-april-2016
05-02-2016
06:30 PM
Awesome. I knew something awesome would come out of YASK. The next NiFi demo has to create this article in HCC and promote it on social media.