Member since 09-10-2015
Posts: 93
Kudos Received: 33
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2400 | 10-07-2016 03:37 PM |
| | 2362 | 10-04-2016 04:14 PM |
| | 2562 | 09-29-2016 03:17 PM |
| | 1264 | 09-28-2016 03:14 PM |
| | 2197 | 09-09-2016 09:41 PM |
06-26-2020
08:27 AM
@Kapardjh, As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
05-05-2017
05:13 PM
1 Kudo
The Spark Component Guide and Command Line Installation Guide were updated to reflect new Spark features. Here are links to several of the latest features.

Support for Spark 2, documented in several topics including:
- Installing Spark Using Ambari
- Installing and Configuring Apache Spark 2 (manual installation)
- Running Spark
- Configuring Spark2 for Wire Encryption
- Automating Spark Jobs with Oozie Spark Action
- Using Livy with Spark Versions 1 and 2

Also new:
- Livy API information, in Submitting Spark Applications Through Livy
- Enabling Spark SQL user impersonation for the Spark Thrift Server (doAs support), in Configuring the Spark Thrift Server

The Zeppelin Component Guide was updated with additional details and examples for configuring Zeppelin with LDAP/AD and Kerberos security; see Configuring Zeppelin Security. In addition, the documentation for interpreters and user impersonation was extended. Portions of this information that apply to HDP 2.5 were also added to the Security chapter in the HDP 2.5 Zeppelin Component Guide.

In the messaging area, the Kafka Component Guide has additional information in Configuring Kafka for a Production Environment.
12-01-2016
03:21 AM
1 Kudo
There are many ways to run Hadoop on virtual machines. Earlier this year I tried several approaches and ended up using a helpful Quick Start Guide written by Yusaku Sako. The Quick Start uses VirtualBox, Vagrant, and predefined scripts to set up a multi-node HDP cluster. You can choose which version of Ambari to install, and then choose and install an associated version of the HDP stack.

For anyone new to virtual machines, there is now a Quick Start for New VM Users. The extended version adds background information and additional details for installing Ambari and the HDP stack. Topics include:
- Terminology
- Prerequisites
- Installing VirtualBox and Vagrant
- Starting Linux virtual machines
- Accessing virtual machines
- Installing Ambari
- Installing the HDP stack
- Troubleshooting
- Reference information for basic Vagrant commands
07-11-2018
05:34 AM
@Greg, I am trying to execute MySQL queries but get the error "prefix not found". When I checked the note, I saw that the interpreter prefix appears as mysql instead of %mysql, but I am unable to add the % before mysql because the field is not editable. Can you please help me with how I can edit it?
09-27-2016
01:32 AM
1 Kudo
Our stream analytics documentation received substantial updates for HDP 2.5.0. The Apache Storm Component Guide was reorganized to focus on administration and development workflow: installing and configuring Storm, developing Storm applications, moving data into and out of Storm, and managing topologies. New content includes the following topics:
- Installing Storm using Ambari
- Configuring Storm for a production environment
- Implementing windowing computations on data streams
- Implementing state management in core Storm
- Configuring and using the HDFS spout to ingest data from HDFS
- Monitoring and debugging Apache Storm topologies through the use of dynamic log levels, topology event logging, distributed log search, and dynamic worker profiling

The Storm application development chapter differentiates more clearly between core Storm and Trident, and the Kafka spout subsection describes performance settings and tradeoffs. The Storm Ambari view is documented in the Apache Ambari Views Guide, and the Apache Ambari Apache Storm Kerberos Configuration Guide (formerly Configuring Storm for Kerberos Over Ambari) has moved to the Security Guide.

The Apache Kafka Component Guide was enhanced with the following content:
- Installing Kafka using Ambari
- Configuring Kafka for a production environment
- Sample code for a basic Kafka producer and consumer, with and without SSL enabled on the cluster

The Apache Ambari Apache Kafka Kerberos Configuration Guide (formerly Configuring Kafka for Kerberos Over Ambari) has moved to the Security Guide. If you have comments, suggestions, corrections, or updates regarding our documentation, let us know on HCC. Help us continue to improve our documentation!
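For readers new to windowing computations on data streams, the idea can be sketched in plain Python (this is a conceptual illustration, not Storm's windowing API; the function name and sample stream are hypothetical):

```python
def tumbling_window_sums(stream, window_size):
    """Group a stream of numeric events into fixed-size (tumbling,
    non-overlapping) windows and emit one aggregate per window --
    here a sum, but it could be a count, average, etc."""
    window = []
    for event in stream:
        window.append(event)
        if len(window) == window_size:
            yield sum(window)  # aggregate over the complete window
            window = []
    if window:  # flush the final partial window, if any
        yield sum(window)

# A 5-event stream with windows of size 2 produces three aggregates:
results = list(tumbling_window_sums([1, 2, 3, 4, 5], window_size=2))
# -> [3, 7, 5]
```

Storm's windowed bolts generalize this with sliding as well as tumbling windows, and with time-based as well as count-based window boundaries.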
09-27-2016
01:31 AM
The new Apache Zeppelin Component Guide has instructions for installing, configuring, and using the web-based Zeppelin notebook for interactive data exploration and visualization. The Apache Spark Component Guide was reorganized to focus on administration and development workflows: installing and configuring Spark, developing Spark applications, using SparkR, and tuning Spark. For HDP 2.5, the book includes the following new topics:
- Installing the Spark 2.0 technical preview alongside Spark 1.6.2, using Ambari
- Configuring Spark for wire encryption

The Apache Solr Search Installation Guide now contains instructions for installing HDP Search using Ambari.
02-22-2018
08:28 PM
@lgeorge @justin kuspa @Rick Moritz Any further updates on why the R interpreter was removed in 2016? Will functionality differ from RStudio in terms of running R code through the Livy interpreter in Zeppelin?
09-07-2016
07:35 PM
@Artem Ervits: Release Eng has copied the 2.3.6 Companion files to the correct location.
07-28-2017
08:19 AM
We ran into the same scenario: Zeppelin always launched 3 containers in YARN even after the dynamic allocation parameters were enabled in Spark, because Zeppelin did not pick up those parameters. To get Zeppelin to launch more than the default 3 containers, we had to set the following in the Zeppelin Spark interpreter:

spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.initialExecutors=0
spark.dynamicAllocation.minExecutors=2
spark.dynamicAllocation.maxExecutors=10

Start spark.dynamicAllocation.minExecutors at a low value. If it is set high, YARN launches the specified minimum number of containers but the job only uses the containers (memory and vcores) it actually needs; the rest of the memory and vcores are marked as reserved, which can cause memory issues.

It is also generally better to start with less executor memory (e.g. 10-15 GB) and more executors (e.g. 20-30). In our scenario, with large executor memory (50-100 GB) and few executors (5-10), a query took 3 min 48 s (228 s), which is expected because parallelism was very low. After reducing the executor memory (10-15 GB) and increasing the number of executors (25-30), the same query took only 54 s. Note that the number of executors and the executor memory are use-case dependent; we ran a few trials before arriving at the optimal configuration for our scenario.
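As a rough back-of-the-envelope illustration of the tradeoff described above (the concrete numbers and the assumed 4 cores per executor are illustrative, not taken from any measurement):

```python
def cluster_footprint(num_executors, executor_memory_gb, cores_per_executor=4):
    """Return (total memory requested from YARN in GB, number of
    concurrent tasks) for a Spark executor configuration. The
    cores-per-executor default of 4 is an assumed value."""
    total_memory_gb = num_executors * executor_memory_gb
    parallelism = num_executors * cores_per_executor
    return total_memory_gb, parallelism

# Few large executors: 5 x 50 GB
few_large = cluster_footprint(num_executors=5, executor_memory_gb=50)
# -> (250, 20): 250 GB requested, only 20 concurrent tasks

# Many small executors: 25 x 10 GB
many_small = cluster_footprint(num_executors=25, executor_memory_gb=10)
# -> (250, 100): same 250 GB budget, 5x the parallelism
```

The same memory budget buys five times the parallelism in the second configuration, which is consistent with the speedup observed above for CPU-bound queries (provided each task still fits comfortably in its executor's memory).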