Member since 09-10-2015
Posts: 93
Kudos Received: 33
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2400 | 10-07-2016 03:37 PM |
| | 2362 | 10-04-2016 04:14 PM |
| | 2562 | 09-29-2016 03:17 PM |
| | 1264 | 09-28-2016 03:14 PM |
| | 2197 | 09-09-2016 09:41 PM |
06-26-2020
08:27 AM
@Kapardjh, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
05-05-2017
05:13 PM
1 Kudo
The Spark Component Guide and Command Line Installation Guide were updated to reflect new Spark features. Here are links to several of the latest features.

Support for Spark 2, documented in several topics including:
- Installing Spark Using Ambari
- Installing and Configuring Apache Spark 2 (manual installation)
- Running Spark
- Configuring Spark2 for Wire Encryption
- Automating Spark Jobs with Oozie Spark Action
- Using Livy with Spark Versions 1 and 2

Also new:
- Livy API information, in Submitting Spark Applications Through Livy
- Enabling Spark SQL user impersonation for the Spark Thrift Server (doAs support), in Configuring the Spark Thrift Server

The Zeppelin Component Guide was updated with additional details and examples for configuring Zeppelin with LDAP/AD and Kerberos security; see Configuring Zeppelin Security. In addition, the documentation for interpreters and user impersonation was extended. Portions of this information that apply to HDP 2.5 were also added to the Security chapter in the HDP 2.5 Zeppelin Component Guide.

In the messaging area, the Kafka Component Guide has additional information in Configuring Kafka for a Production Environment.
12-01-2016
03:21 AM
1 Kudo
There are many ways to run Hadoop on virtual machines. Earlier this year I tried several approaches and ended up using a helpful Quick Start Guide written by Yusaku Sako. The Quick Start uses VirtualBox, Vagrant, and predefined scripts to set up a multi-node HDP cluster. You can choose which version of Ambari to install, and then choose and install an associated version of the HDP stack.

For anyone new to virtual machines, there is now a Quick Start for New VM Users. The extended version adds background information and additional details for installing Ambari and the HDP stack. Topics include:
- Terminology
- Prerequisites
- Installing VirtualBox and Vagrant
- Starting Linux virtual machines
- Accessing virtual machines
- Installing Ambari
- Installing the HDP stack
- Troubleshooting
- Reference information for basic Vagrant commands
07-11-2018
05:34 AM
@Greg, I am trying to execute MySQL queries but get the error "prefix not found". When I checked the note, I saw that the interpreter prefix appears as mysql instead of %mysql, but I am unable to add the % before mysql because the field is not editable. Can you please help me with how I can edit it?
09-27-2016
01:32 AM
1 Kudo
Our stream analytics documentation received substantial updates for HDP 2.5.0. The Apache Storm Component Guide was reorganized to focus on administration and development workflow: installing and configuring Storm, developing Storm applications, moving data into and out of Storm, and managing topologies. New content includes the following topics:
- Installing Storm using Ambari
- Configuring Storm for a production environment
- Implementing windowing computations on data streams
- Implementing state management in core Storm
- Configuring and using the HDFS spout to ingest data from HDFS
- Monitoring and debugging Apache Storm topologies through the use of dynamic log levels, topology event logging, distributed log search, and dynamic worker profiling

The Storm application development chapter differentiates more clearly between core Storm and Trident, and the Kafka spout subsection describes performance settings and tradeoffs. The Storm Ambari view is documented in the Apache Ambari Views Guide, and the Apache Ambari Apache Storm Kerberos Configuration Guide (formerly Configuring Storm for Kerberos Over Ambari) has moved to the Security Guide.

The Apache Kafka Component Guide was enhanced with the following content:
- Installing Kafka using Ambari
- Configuring Kafka for a production environment
- Sample code for a basic Kafka producer and consumer, with and without SSL enabled on the cluster

The Apache Ambari Apache Kafka Kerberos Configuration Guide (formerly Configuring Kafka for Kerberos Over Ambari) has moved to the Security Guide. If you have comments, suggestions, corrections, or updates regarding our documentation, let us know on HCC. Help us continue to improve our documentation!
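For readers new to windowing computations on data streams, the idea can be sketched in plain Python (this is a conceptual illustration, not Storm's windowing API; the function name and sample stream are hypothetical):

```python
def tumbling_window_sums(stream, window_size):
    """Group a stream of numeric events into fixed-size (tumbling,
    non-overlapping) windows and emit one aggregate per window --
    here a sum, but it could be a count, average, etc."""
    window = []
    for event in stream:
        window.append(event)
        if len(window) == window_size:
            yield sum(window)  # aggregate over the complete window
            window = []
    if window:  # flush the final partial window, if any
        yield sum(window)

# A 5-event stream with windows of size 2 produces three aggregates:
results = list(tumbling_window_sums([1, 2, 3, 4, 5], window_size=2))
# -> [3, 7, 5]
```

Storm's windowed bolts generalize this with sliding as well as tumbling windows, and with time-based as well as count-based window boundaries.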
09-27-2016
01:31 AM
The new Apache Zeppelin Component Guide has instructions for installing, configuring, and using the web-based Zeppelin notebook for interactive data exploration and visualization. The Apache Spark Component Guide was reorganized to focus on administration and development workflows: installing and configuring Spark, developing Spark applications, using SparkR, and tuning Spark. For HDP 2.5, the book includes the following new topics:
- Installing the Spark 2.0 technical preview alongside Spark 1.6.2, using Ambari
- Configuring Spark for wire encryption

The Apache Solr Search Installation Guide now contains instructions for installing HDP Search using Ambari.
02-22-2018
08:28 PM
@lgeorge @justin kuspa @Rick Moritz Any further updates on why the R interpreter was removed in 2016? Will functionality differ from RStudio in terms of running R code through the Livy interpreter in Zeppelin?
09-07-2016
07:35 PM
@Artem Ervits: Release Eng has copied the 2.3.6 Companion files to the correct location.
07-28-2017
08:19 AM
We ran into the same scenario: Zeppelin always launched 3 containers in YARN even after the dynamic allocation parameters were enabled in Spark, because Zeppelin did not pick up those parameters. To get Zeppelin to launch more than the default 3 containers, we had to set the following in the Zeppelin Spark interpreter:

spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.initialExecutors=0
spark.dynamicAllocation.minExecutors=2
spark.dynamicAllocation.maxExecutors=10

Start spark.dynamicAllocation.minExecutors at a low value. If it is set high, YARN launches the specified minimum number of containers but the job only uses the containers (memory and vcores) it actually needs; the rest of the memory and vcores are marked as reserved, which can cause memory issues.

It is also generally better to start with less executor memory (e.g. 10-15 GB) and more executors (e.g. 20-30). In our scenario, with large executor memory (50-100 GB) and few executors (5-10), a query took 3 min 48 s (228 s), which is expected because parallelism was very low. After reducing the executor memory (10-15 GB) and increasing the number of executors (25-30), the same query took only 54 s. Note that the number of executors and the executor memory are use-case dependent; we ran a few trials before arriving at the optimal configuration for our scenario.
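As a rough back-of-the-envelope illustration of the tradeoff described above (the concrete numbers and the assumed 4 cores per executor are illustrative, not taken from any measurement):

```python
def cluster_footprint(num_executors, executor_memory_gb, cores_per_executor=4):
    """Return (total memory requested from YARN in GB, number of
    concurrent tasks) for a Spark executor configuration. The
    cores-per-executor default of 4 is an assumed value."""
    total_memory_gb = num_executors * executor_memory_gb
    parallelism = num_executors * cores_per_executor
    return total_memory_gb, parallelism

# Few large executors: 5 x 50 GB
few_large = cluster_footprint(num_executors=5, executor_memory_gb=50)
# -> (250, 20): 250 GB requested, only 20 concurrent tasks

# Many small executors: 25 x 10 GB
many_small = cluster_footprint(num_executors=25, executor_memory_gb=10)
# -> (250, 100): same 250 GB budget, 5x the parallelism
```

The same memory budget buys five times the parallelism in the second configuration, which is consistent with the speedup observed above for CPU-bound queries (provided each task still fits comfortably in its executor's memory).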