Member since: 05-30-2018
Posts: 1322
Kudos Received: 715
Solutions: 148
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4067 | 08-20-2018 08:26 PM |
| | 1962 | 08-15-2018 01:59 PM |
| | 2390 | 08-13-2018 02:20 PM |
| | 4138 | 07-23-2018 04:37 PM |
| | 5045 | 07-19-2018 12:52 PM |
03-03-2017
04:54 PM
1 Kudo
I have flow files running through a NiFi cluster. A remote process group is used to load balance the flow files. If I add a new node to the cluster (while flow files are running, with no stoppage), will NiFi automatically distribute the load to the new node? Does the cluster coordinator perform this function?
Labels:
- Apache NiFi
03-02-2017
04:24 AM
This is using Ambari 2.4.2.
03-02-2017
04:04 AM
I have tried to install HDP 2.5.3 on Red Hat 7 (AWS) and it failed with the error below:

2017-03-01 22:41:38,450 - Execution of '/usr/bin/yum -d 0 -e 0 -y install snappy-devel' returned 1.
Error: Package: snappy-devel-1.0.5-1.el6.x86_64 (HDP-UTILS-1.1.0.21)
       Requires: snappy(x86-64) = 1.0.5-1.el6
       Installed: snappy-1.1.0-3.el7.x86_64 (@anaconda/7.3)
           snappy(x86-64) = 1.1.0-3.el7
       Available: snappy-1.0.5-1.el6.x86_64 (HDP-UTILS-1.1.0.21)
           snappy(x86-64) = 1.0.5-1.el6
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

The workaround is to uninstall snappy-1.1.0-3.el7.x86_64:

sudo yum remove snappy-1.1.0-3.el7.x86_64

and then install the correct package:

sudo yum install snappy-devel-1.0.5-1.el6.x86_64

then retry the service install (i.e., DataNode). Is this a known issue?
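For reference, a minimal shell sketch of the full workaround on one node (package versions are taken from the error above; the rpm -q checks are only there to confirm what is installed before and after):

rpm -q snappy snappy-devel                            # confirm the el7 build is what is installed
sudo yum remove -y snappy-1.1.0-3.el7.x86_64          # drop the conflicting el7 package
sudo yum install -y snappy-devel-1.0.5-1.el6.x86_64   # pull the el6 build HDP-UTILS expects
rpm -q snappy snappy-devel                            # verify the el6 versions are now in place

Then retry the failed component install from Ambari.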
Labels:
- Hortonworks Data Platform (HDP)
03-01-2017
03:56 AM
You can use the built-in DR capabilities of HBase. HBase supports active/active and active/passive replication. As metadata changes/adds are pushed to Atlas (HBase), those can be pushed to your Atlas DR site via HBase replication. For Solr you can use CDCR. More info here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462
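As a rough sketch of the HBase side, run from the hbase shell on the primary cluster; the peer id, DR ZooKeeper quorum, table name, and column-family name below are all placeholders, so check which HBase tables and families your Atlas version actually uses first:

# register the DR cluster as a replication peer
add_peer '1', 'dr-zk1,dr-zk2,dr-zk3:2181:/hbase-unsecure'
# mark the Atlas table's column family for replication (names are assumptions)
disable 'atlas_titan'
alter 'atlas_titan', {NAME => 'e', REPLICATION_SCOPE => 1}
enable 'atlas_titan'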
03-01-2017
03:37 AM
Oh yes, you need to use the IP address of the sandbox. The hosts file does not need to be updated unless you plan to use a DNS name (i.e., sandbox.hortonworks.com) for it. For your test, use the sandbox IP. I assume all of this exists on the same box? If not, your Windows machine needs to be able to communicate with the sandbox; open up the firewalls for IP communication.
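If you later do want the friendly name, a minimal sketch of the entry to add to C:\Windows\System32\drivers\etc\hosts on the Windows machine (the IP shown is a placeholder for your sandbox's actual address):

192.168.56.101   sandbox.hortonworks.com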
03-01-2017
03:16 AM
I am not familiar with SAS, but most BI tools require the hostname of your HiveServer2 and the port, which is generally 10010, or 10500 for LLAP.
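As a quick sanity check outside of SAS, you could test the same host/port with beeline (the hostname and user below are placeholders):

beeline -u "jdbc:hive2://hs2-host.example.com:10010/default" -n hive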
02-28-2017
09:22 PM
1 Kudo
All suggestions above are good. Adding an article on tools to use to benchmark the hardware: https://community.hortonworks.com/content/kbentry/56158/benchmark-your-hardware-for-hadoop-spark.html
02-28-2017
02:42 PM
4 Kudos
I would go with QueryDatabaseTable. This will provide you state, and also another important feature: breaking the returned records up into flow files. For example, if 1000 records are expected as the output of a query, you can set Max Rows Per Flow File to x and process the data in smaller chunks. If you use SelectHiveQL, then build your query using UpdateAttribute and maintain state via a DistributedMapCache (DMC): keep the last state in the DMC and use that state in your UpdateAttribute to build the query SelectHiveQL runs. Flow: DMC fetch (state field) --> UpdateAttribute (build query) --> SelectHiveQL. You will have to seed the DMC with an initial state value, or set it in logic via UpdateAttribute. A sketch of the query construction follows below.
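A rough sketch of the UpdateAttribute step, assuming FetchDistributedMapCache put the cached value into an attribute named stored.state, and the table/column names are hypothetical; the property value uses NiFi Expression Language:

hive.query = SELECT * FROM orders WHERE last_modified > '${stored.state}'

SelectHiveQL's query property can then reference ${hive.query}.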
02-27-2017
05:49 PM
You can shut down all services via Ambari by going to the right-side toolbar and selecting Stop All. This is a clean shutdown. Info here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_Ambari_Users_Guide/content/_starting_and_stopping_all_services.html Via the API, info here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=41812517
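For the API route, a minimal curl sketch (the Ambari host, credentials, and cluster name are placeholders); setting the desired state to INSTALLED stops all services:

curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Stop All Services"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
  http://ambari-host:8080/api/v1/clusters/MyCluster/services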
02-26-2017
04:08 PM
Yes, you can do this, but first you will need to install the Oozie client on the NiFi nodes. This will become easier once there is one Ambari managing both HDF and HDP. However, I would recommend using NiFi to ingest and stream data into Hive tables (using the Hive streaming processor), or just use PutHiveQL. Why? The operational capabilities in NiFi (back pressure, data lineage, event replay, stats on performance) you simply don't get with Oozie. Lastly, you can reuse this common processing logic, or isolate it in other terms, by using a NiFi process group.