Member since: 07-07-2016
Posts: 79
Kudos Received: 17
Solutions: 13
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 645 | 08-01-2017 12:00 PM |
| | 1396 | 08-01-2017 08:28 AM |
| | 985 | 07-28-2017 01:43 PM |
| | 1029 | 06-15-2017 11:56 AM |
| | 978 | 06-01-2017 09:28 AM |
09-25-2017
01:57 PM
@rmr1989 In-place migrations from RHEL 6 to RHEL 7 are challenging. I would advise upgrading HDP 2.4 to 2.6.2 as-is, then using swing kit to install RHEL 7 on new nodes, which you add to the cluster. You then migrate the services off the RHEL 6 nodes onto the new RHEL 7 nodes as soon as possible. It is advised to run a mixed cluster (RHEL 6 and RHEL 7) for as short a time as possible. As usual, read the docs and test as much as possible.
09-19-2017
09:24 AM
3 Kudos
@ketan kunde Some comments:
- HDCloud is an AWS-specific provisioning capability. It is a layer built on Cloudbreak, which is open source.
- SmartSense is more of a service. Cluster logs, metrics, configs, volumetrics, etc. are captured in a bundle and analyzed to produce a set of recommendations/reports. It is more of a process or a service than an open-source project.
- HCP, aka CyberSecurity, is based on Apache Metron. There are other Apache projects in HCP: NiFi, Storm, Kafka, Ambari, Ranger (i.e. HDP and HDF).
08-01-2017
12:00 PM
1 Kudo
@Zubair Jaleel There are many Kappa and other case studies presented at the DataWorks Summit (e.g. Ford, Yahoo, etc.). Videos and slides are available for most sessions: https://dataworkssummit.com/san-jose-2017/agenda/
08-01-2017
08:28 AM
1 Kudo
@Smart Data Ranger can be used to sync users with LDAP/AD. Credentials are stored in LDAP/AD, and Ranger is configured to access it. Knox is used as a proxy, but more for REST API service calls and some UIs. It is not meant to proxy high-volume traffic like Kafka messages.
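To illustrate the sync side, here is a minimal sketch of the Ranger usersync properties involved (the host names, DNs, and search base below are placeholders you would replace with your own; check the property names against your Ranger version's docs):

```properties
# Point Ranger usersync at the LDAP/AD source (placeholder values)
ranger.usersync.source.impl.class=org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder
ranger.usersync.ldap.url=ldap://ad.example.com:389
ranger.usersync.ldap.binddn=cn=ranger,ou=service,dc=example,dc=com
ranger.usersync.ldap.user.searchbase=ou=users,dc=example,dc=com
```

Note that usersync only pulls the user/group identities into Ranger for policy authoring; authentication itself still happens against LDAP/AD (or Kerberos).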
08-01-2017
07:57 AM
@Smart Data Atlas is more governance-related, and security-related to a lesser extent. You secure Kafka via Kerberos for authentication and Ranger for authorization: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/index.html#bk_security
07-28-2017
04:51 PM
@Gagan Singh The ports are different from the HDP Sandbox. Ambari is on port 9080; please see the tutorial below: https://hortonworks.com/tutorial/real-time-event-processing-in-nifi-sam-schema-registry-and-superset/
07-28-2017
01:43 PM
1 Kudo
@Kiran Kumar These should all be answered here: https://hortonworks.com/agreements/support-services-policy/
07-25-2017
09:18 PM
One comment: RHEL 7.3 is not supported with Ambari 2.2.2/HDP 2.4.3.
07-25-2017
09:07 PM
@somesh sriramoju In the list of connectors, there is a drop-down option for HDP: https://datascience.ibm.com/docs/content/local/createconnectionslocal.html?linkInPage=true Regards,
07-25-2017
08:32 AM
@Irfan syafii
- Take a look at Ambari Metrics to understand some of the core cluster metrics. Current versions of HDP have Grafana dashboards that give more detail.
- There are a number of benchmark options: https://gist.github.com/ace-subido/0a9b219b2348921f6a87 and https://github.com/hortonworks/hive-testbench TestDFSIO is a good option for getting an understanding of read/write performance (there are lots of articles on it). The second link relates to Hive and TPC-type benchmarks.
- For more advanced users, there is also SmartSense with HDP, which can make recommendations on performance issues.
Regards,
07-18-2017
05:08 PM
@Mihai Lucaciu The ID and password are provided below: admin/hadoophadoop https://hortonworks.com/tutorial/real-time-event-processing-in-nifi-sam-schema-registry-and-superset/#step7
07-18-2017
03:33 AM
@Maurice Hickey With HDP 3.x this will be much easier, but AFAIK this isn't as elegant prior to that. Keep in mind that there is one RDBMS maintaining the Hive metadata, and therefore you can have only one database/table entry. In theory, you could have 2 x HS2/Metastores (QA/DEV), each pointing to a different Metastore RDBMS, but you lose the ability to query between the two (as DEV and QA are separate) unless you federate at a level above the cluster. A common approach is to split at the database level, so you can keep the same table names. You need to change the database name for PROD anyway, so I would advise making that change part of the promotion process; the same should be done when promoting from DEV to QA.
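The database-level split can be sketched in HiveQL (the database and table names here are hypothetical, just to show the naming convention):

```sql
-- One database per environment, same table name in each
CREATE DATABASE IF NOT EXISTS sales_dev;
CREATE DATABASE IF NOT EXISTS sales_qa;

-- Queries differ only in the database prefix, so promotion
-- from DEV to QA (and later to PROD) is just a prefix change:
SELECT COUNT(*) FROM sales_dev.orders;
SELECT COUNT(*) FROM sales_qa.orders;
```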
06-21-2017
12:54 PM
@Roberto Sancho This error can occur if you do not have adequate permissions on the file. I would suggest you run the 'fsck' command to determine whether there are any corrupt blocks on HDFS. Is it only certain queries that fail, or all queries? I would also locate the underlying files on HDFS and check permissions, etc.
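As a concrete starting point, this is the kind of check I mean (a sketch only - it needs a running HDFS cluster, and the table path below is a placeholder for the table's actual warehouse location):

```shell
# Placeholder path - substitute the actual HDFS location of the table
TABLE_PATH=/apps/hive/warehouse/mydb.db/mytable

# Report corrupt/missing blocks under the table's files
hdfs fsck "$TABLE_PATH" -files -blocks

# Check ownership and permissions on the underlying files
hdfs dfs -ls -R "$TABLE_PATH"
```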
06-21-2017
12:45 PM
@priyal patel Have you considered using NiFi instead? There are lots of articles on NiFi capturing API data, including from Facebook: https://community.hortonworks.com/questions/65310/apache-nifi-facebook-integration.html
06-20-2017
04:35 PM
@Saby SS We always advise that Knox is not designed to be a proxy/sentry for bulk data transfer. It is fine as an ODBC interface, passing summary data back to the client, but it is not designed for large volumes. For large volumes, you might be better served by an edge node or jump box inside the secure zone, close to the cluster.
06-19-2017
10:52 AM
@Farrukh Munir The HA Guide covers this, and yes Ambari can be used to install, configure, and manage the services. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/index.html#bk_hadoop-high-availability
06-15-2017
12:20 PM
@Qinglin Xia Check that you created the Tag Service in Ranger. You need to define the Tag Service to enable tags to flow from Atlas to Ranger. It is also always worth checking that all the components are online: Ranger Tagsync, Kafka, etc.
06-15-2017
11:56 AM
@ahmed bejaoui I would recommend: https://community.hortonworks.com/questions/103514/nifi-and-storm-withour-kafka.html and also looking at the Partner Demo Kit: https://community.hortonworks.com/articles/75341/partner-demo-kit-1.html which has a packaged demo of the IoT Trucking App, including NiFi, Kafka, and Storm, all ready to go in a few easy steps.
06-11-2017
06:29 PM
@Hemil Shah
1. Correct, each DC would have its own ZooKeeper ensemble. The offsets are stored there.
2. Intra-data-centre data redundancy can be provided by Kafka replication. Cross-data-centre redundancy requires MirrorMaker. For reference, MirrorMaker best practices: https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
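For reference, MirrorMaker is driven by two small property files - a consumer config for the source DC and a producer config for the target DC (the broker host names below are placeholders):

```properties
# consumer.properties - reads from the source DC (placeholder hosts)
bootstrap.servers=kafka-src1.example.com:6667
group.id=mirror-maker-group

# producer.properties - writes to the target DC (separate file)
# bootstrap.servers=kafka-dst1.example.com:6667
```

It is then launched with something like `kafka-mirror-maker.sh --consumer.config consumer.properties --producer.config producer.properties --whitelist 'topic.*'` (flag names as in the stock Kafka distribution of that era).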
06-04-2017
08:05 PM
@nbalaji-elangovan If you search for TensorFlow in the HCC search, there are a number of excellent articles, such as: https://community.hortonworks.com/articles/54954/setting-up-gpu-enabled-tensorflow-to-work-with-zep.html plus articles on DZone: https://dzone.com/articles/tensorflow-on-the-edge-part-2-of-5
06-01-2017
09:28 AM
@Vishal Gupta I would always advise reviewing the documentation on the Ambari and HDP upgrade. Generally, no, you don't need to re-generate all of the keytabs or SSL certificates in the cluster as part of an upgrade - though Ambari will generate keytabs as required post-upgrade. https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-upgrade/content/upgrading_HDP_post_upgrade_tasks_ranger_kerberos.html There are certain cases which need some attention (e.g. Kafka at 2.2, Ranger HA, etc.). https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/index.html You can raise a ticket with Hortonworks Support prior to the upgrade, as they can inform you of any known issues (if you have a subscription). Also, I would recommend moving to 2.5.5 if moving to 2.5. And testing is imperative: I would advise testing all of the components under representative use cases.
05-31-2017
03:52 PM
@Sharon Kirkham Nothing obvious comes to mind... so:
- Recycle the Cloud Controller.
- Kill your browser and retry.
- Is it the same issue for different images (e.g. the Data Science image)?
- Is it the same issue for different regions (or can you only provision in one region)?
- Does reducing the size of the cluster help?
05-31-2017
01:02 PM
@Sharon Kirkham Exact same config here, and it works fine. Is the SSH key you are using working OK - can you SSH onto an AWS host with it?
05-31-2017
10:18 AM
@Sharon Kirkham Looking at another JSON (same config, worked 2 days ago), all is the same except for the spot price (maybe check whether it works after removing it), and ClusterAndAmbariPassword is empty (though maybe you removed it manually before publishing?). To confirm: you also have access/authority to provision resources on AWS?
05-30-2017
01:27 PM
@katelyn thomas If you use the 'asparagus' chart below (even if you are not using HDP), you can select the combination of components that have been tested with each other - not just Storm and ZooKeeper, but also Kafka, HDFS, etc., depending on what your sources/targets are: https://hortonworks.com/products/data-center/hdp/ The Hortonworks Sandbox contains all of the HDP components installed, configured, and ready to use. The Partner Demo Kit also has a Storm use case pre-built: https://community.hortonworks.com/articles/75341/partner-demo-kit-1.html
05-30-2017
09:03 AM
@Sanaz Janbakhsh Assigning Solr standalone (you can still install it via an Ambari management pack) to the slaves at least separates the SolrCloud infra from the Solr standalone. The other consideration is what storage you are using for the indexes on Solr standalone. There are a number of threads on HCC where this is discussed. In summary, storing the indexes on HDFS makes them easier to manage. If performance and response time are important, then you should consider storing the indexes on fast disk, like SSD.
05-29-2017
10:30 PM
@Saurabh FYI https://community.hortonworks.com/questions/82135/how-to-limit-access-to-zeppelin-webui-based-for-sp.html
05-29-2017
09:10 PM
@Karl Can you check:
- you are able to run a SparkR shell, e.g. /usr/hdp/current/spark2-client/bin/sparkR
- whether there are any error messages in the /var/log/spark2, /var/log/spark, or /var/log/zeppelin directories
- all the related services and components are running (Spark, Spark2, Zeppelin, HDFS, etc.)
- which R packages are installed
05-28-2017
01:27 PM
@Allen Niu For reference, see this thread, which has a number of comments around this topic: https://community.hortonworks.com/questions/91789/traditional-etl-vs-open-source.html#answer-91800
05-25-2017
04:00 PM
@Christophe Vico I recommend you download the Sandbox: https://hortonworks.com/products/sandbox/ From Zeppelin, in the one notebook, you can run different versions of Spark (1.6.3 or 2.1) depending on your choice of interpreter: %spark.spark or %spark2.spark You can review the settings used in the Interpreter screen. Regards,
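To illustrate, the same notebook can hold one paragraph per interpreter (a Zeppelin notebook sketch; the body of each paragraph here is just a trivial version check):

```
%spark
sc.version

%spark2
sc.version
```

Each paragraph runs against its own Spark installation, so you can compare the two versions side by side in one notebook.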