Member since: 06-20-2016
Posts: 488
Kudos Received: 433
Solutions: 118

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3106 | 08-25-2017 03:09 PM
 | 1965 | 08-22-2017 06:52 PM
 | 3393 | 08-09-2017 01:10 PM
 | 8065 | 08-04-2017 02:34 PM
 | 8115 | 08-01-2017 11:35 AM
09-13-2016
04:04 PM
@Timothy Spann Very effective answer, but it covers similar ground to @Randy Gelhausen's, and he was first in.
09-13-2016
10:53 AM
1 Kudo
This article shows how to use a list of URLs in an external file to drive InvokeHTTP iteratively: https://community.hortonworks.com/content/kbentry/48816/nifi-to-ingest-and-transform-rss-feeds-to-hdfs-usi.html
You can schedule GetFile to run once per day, week, etc. (see the sketch below). If errors occur at the end of the flow when inserting into the database, you can configure the flow to ignore those failures.
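For illustration, a minimal sketch of the two moving parts (file name, URLs, and schedule are hypothetical): the external file is just one URL per line, and GetFile's Scheduling Strategy can be set to CRON driven for a daily run.

```
# urls.txt -- one URL per line for InvokeHTTP to iterate over (contents hypothetical)
http://example.com/feeds/news.rss
http://example.com/feeds/sports.rss

# GetFile scheduling: Scheduling Strategy = CRON driven
# e.g. every day at 6:00 AM (NiFi uses Quartz cron syntax):
#   0 0 6 * * ?
```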
09-12-2016
10:01 PM
1 Kudo
I am trying to create a Phoenix interpreter using %jdbc in Zeppelin on HDP 2.5 and am not succeeding. Steps are:
1. Log into Zeppelin (sandbox 2.5).
2. Create a new interpreter (settings along the lines sketched below).
3. Restart it (just to be paranoid).
4. Go to my notebook and bind the interpreter.
When I run with %jdbc(phoenix) I get "Prefix not found." When I run it with %jdbc.phoenix I get "jdbc.phoenix interpreter not found." What am I missing?
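For reference, a sketch of the kind of properties Zeppelin's %jdbc interpreter expects for a "phoenix" prefix (the driver class, URL, and user below are assumptions based on sandbox defaults, not confirmed settings):

```
# jdbc interpreter properties for a "phoenix" prefix (values are assumptions)
phoenix.driver     org.apache.phoenix.jdbc.PhoenixDriver
phoenix.url        jdbc:phoenix:localhost:2181:/hbase-unsecure
phoenix.user       phoenixuser
phoenix.password
```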
Labels:
- Apache Phoenix
- Apache Zeppelin
09-12-2016
06:46 PM
1 Kudo
Agree with @mqureshi and @Constantin Stanca. I would like to add the theme that compression is a strategy, and usually not a universal yes or no, or this codec versus that one. Important questions to ask about your data are:
- Will it be processed frequently, rarely, or never (cold storage)?
- How critical is performance when it is processed?
Which leads to: which file format and compression codec, if any, for each dataset? The following are good references for compression and file format strategies (it takes some thinking and evaluating):
- http://www.slideshare.net/Hadoop_Summit/kamat-singh-june27425pmroom210cv2
- http://comphadoop.weebly.com/
- http://www.dummies.com/programming/big-data/hadoop/hadoop-for-dummies/
After formulating a strategy, think about dividing your HDFS filepaths into zones in accordance with that strategy, as sketched below.
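One hypothetical zoning layout (paths and codec choices here are illustrative assumptions, not a recommendation for any specific workload):

```
/data/raw/...       # landing zone: as-delivered format; splittable codec (e.g. bzip2) or none
/data/staging/...   # working zone: fast codec (e.g. Snappy) for frequent intermediate processing
/data/curated/...   # query zone: columnar format (ORC/Parquet) with its built-in compression
/data/archive/...   # cold storage: high-ratio codec (e.g. gzip), rarely or never read
```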
09-12-2016
05:48 PM
@Saumitra Buragohain could you help out here? Could use your expertise 🙂
09-12-2016
03:37 PM
2 Kudos
I have heard that full-dev-platform is being deprecated and that the multinode vagrant deployment should be used instead: https://github.com/apache/incubator-metron/tree/master/metron-deployment/vagrant/multinode-vagrant However, that option is very resource intensive, so for a development environment the best option is quick-dev-platform: https://github.com/apache/incubator-metron/tree/master/metron-deployment/vagrant/quick-dev-platform Also, if you installed the latest version of Ansible, it should be downgraded as described here: https://cwiki.apache.org/confluence/display/METRON/Downgrade+Ansible
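The downgrade itself is a pip operation; a minimal sketch (the pinned version below is an assumption; use whatever the wiki page above specifies):

```
# remove the current Ansible and pin an older release (version number is an assumption)
sudo pip uninstall -y ansible
sudo pip install ansible==2.0.0.2
```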
09-12-2016
12:43 PM
2 Kudos
Please be sure to follow these instructions: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4-Win/bk_HDP_Install_Win/content/LZOCompression.html You can do step 3 from the Ambari web UI (sketched below). Also, note that steps 1 and 2 need to be done on each node in the cluster.
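If step 3 is the codec registration in core-site.xml (as is typical for LZO setup; confirm against the doc above), it amounts to editing properties that Ambari exposes under HDFS > Configs. A sketch using the standard hadoop-lzo class names:

```
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```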
09-11-2016
02:28 PM
2 Kudos
@Mohan V This is an issue of jar version incompatibility. You need to use the following newer versions of elephant-bird (not the older versions):
REGISTER elephant-bird-core-4.1.jar
REGISTER elephant-bird-pig-4.1.jar
REGISTER elephant-bird-hadoop-compat-4.1.jar
I tested it with your code and sample, and it works. You can get the jars at:
http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-core-4.1.jar.zip
http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-pig-4.1.jar.zip
http://www.java2s.com/Code/JarDownload/elephant/elephant-bird-hadoop-compat-4.1.jar.zip
Regarding DESCRIBE working but DUMP causing the issue: DUMP actually launches the MapReduce job (which is where the incompatible jar surfaces), while DESCRIBE does not.
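For context, a minimal sketch of how the registered jars get exercised (the input path and loader option are hypothetical; JsonLoader is elephant-bird's Pig loader):

```
REGISTER elephant-bird-core-4.1.jar;
REGISTER elephant-bird-pig-4.1.jar;
REGISTER elephant-bird-hadoop-compat-4.1.jar;

-- load JSON records as maps (input path is hypothetical)
raw = LOAD '/tmp/sample.json'
      USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');

DESCRIBE raw;  -- schema only: no MapReduce job runs, so a bad jar stays hidden
DUMP raw;      -- launches the MapReduce job, which is where the version clash appears
```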
09-11-2016
12:48 PM
@Mohan V Very glad to see you solved it yourself by debugging -- it is the best way to learn and improve your skills 🙂
09-10-2016
12:31 PM
1 Kudo
There is a lot going on here. When writing a complex script like this, the following approach is useful for building and debugging (a sketch follows below):
- Run locally against a small subset of records (pig -x local -f <scriptOnLocalFileSystem>.pig). This makes each run of the script much faster.
- Build the script statement by statement until you reach the failing one (run the first statement, add the second and run, and so on until it fails). When it fails, you can focus on the last statement and fix it. These steps are good for finding grammar issues (which it looks like you have, based on the error message).
- If you also want to make sure your data is being processed correctly, put a DUMP statement after each line during each iteration. That way you can inspect the results of each statement.
- If using inline statements like your grouped = statement, separate them out at first until they work. This makes the issue easier to isolate.
Let me know how that goes.
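To make the iteration concrete, a sketch of what the passes might look like (script name, paths, schema, and relation names are all hypothetical):

```
-- run with: pig -x local -f debug.pig   (against a small local sample)
records = LOAD 'sample.txt' USING PigStorage('\t') AS (user:chararray, score:int);
DUMP records;                                    -- pass 1: verify the load

by_user = GROUP records BY user;                 -- pass 2: add the next statement
DUMP by_user;

-- pass 3: the inline expression, added only after the above works
totals = FOREACH by_user GENERATE group, SUM(records.score);
DUMP totals;
```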