Member since 03-25-2016
142 Posts
48 Kudos Received
7 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 7277 | 06-13-2017 05:15 AM |
|  | 2458 | 05-16-2017 05:20 AM |
|  | 1631 | 03-06-2017 11:20 AM |
|  | 11275 | 02-23-2017 06:59 AM |
|  | 2499 | 02-20-2017 02:19 PM |

02-06-2018 06:35 AM

@Lekya Goriparti Have a look at this: https://community.hortonworks.com/questions/26622/the-node-hbase-is-not-in-zookeeper-it-should-have.html
						
					
11-09-2017 03:15 PM
1 Kudo

1. Introduction

This article is an extension of the one created by @Dan Zaratsian - H2O on Livy.

2. Environment Details

Here are the environment details I tested with:

HDP: 2.6.1
Ambari: 2.5.0.3
OS: 7.3.1611
Python: 2.7.5

IMPORTANT NOTE: H2O requires Python ver. 2.7+.

3. Installing H2O

Go to the Zeppelin node and do the following:

$ mkdir /tmp/H2O
$ cd /tmp/H2O
$ wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/16/sparkling-water-2.1.16.zip
$ unzip sparkling-water-2.1.16.zip

4. Testing H2O from CLI

Go to the Zeppelin node where you downloaded and installed H2O:

$ export SPARK_HOME='/usr/hdp/current/spark2-client'
$ export HADOOP_CONF_DIR=/etc/hadoop/conf
$ export MASTER="yarn-client"
$ export SPARK_MAJOR_VERSION=2
$ cd /tmp/H2O/sparkling-water-2.1.16/bin
$ ./pysparkling
>>> from pysparkling import *
>>> hc = H2OContext.getOrCreate(spark)
My test:

[root@dkozlowski-dkhdp262 bin]# ./pysparkling
Python 2.7.5 (default, Jun 17 2014, 18:11:42)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.1.0-129/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.1.0-129/spark2/jars/spark-llap_2.11-1.1.3-2.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/11/08 14:33:06 WARN HiveConf: HiveConf of name hive.llap.daemon.service.hosts does not exist
17/11/08 14:33:06 WARN HiveConf: HiveConf of name hive.llap.daemon.service.hosts does not exist
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.1.2.6.1.0-129
      /_/
Using Python version 2.7.5 (default, Jun 17 2014 18:11:42)
SparkSession available as 'spark'.
>>> from pysparkling import *
>>> hc = H2OContext.getOrCreate(spark)
17/11/08 14:33:57 WARN H2OContext: Method H2OContext.getOrCreate with an argument of type SparkContext is deprecated and parameter of type SparkSession is preferred.
17/11/08 14:33:57 WARN InternalH2OBackend: Increasing 'spark.locality.wait' to value 30000
17/11/08 14:33:57 WARN InternalH2OBackend: Due to non-deterministic behavior of Spark broadcast-based joins
We recommend to disable them by
configuring `spark.sql.autoBroadcastJoinThreshold` variable to value `-1`:
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=-1")
17/11/08 14:33:57 WARN InternalH2OBackend: The property 'spark.scheduler.minRegisteredResourcesRatio' is not specified!
We recommend to pass `--conf spark.scheduler.minRegisteredResourcesRatio=1`
Connecting to H2O server at http://172.26.110.84:54323. successful.
--------------------------  ---------------------------------------------------
H2O cluster uptime:         25 secs
H2O cluster version:        3.14.0.7
H2O cluster version age:    18 days
H2O cluster name:           sparkling-water-root_application_1507531306616_0032
H2O cluster total nodes:    2
H2O cluster free memory:    1.693 Gb
H2O cluster total cores:    8
H2O cluster allowed cores:  8
H2O cluster status:         accepting new members, healthy
H2O connection url:         http://172.26.110.84:54323
H2O connection proxy:
H2O internal security:      False
H2O API Extensions:         XGBoost, Algos, AutoML, Core V3, Core V4
Python version:             2.7.5 final
--------------------------  ---------------------------------------------------
Sparkling Water Context:
 * H2O name: sparkling-water-root_application_1507531306616_0032
 * cluster size: 2
 * list of used nodes:
  (executorId, host, port)
  ------------------------
  (2,dkhdp262.openstacklocal,54321)
  (1,dkhdp263.openstacklocal,54321)
  ------------------------
  Open H2O Flow in browser: http://172.26.110.84:54323 (CMD + click in Mac OSX)
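
As a quick sanity check that the H2OContext is wired into Spark, the following can be run in the same pysparkling shell. This is a minimal sketch, not part of the original article; the DataFrame contents and column names are arbitrary, and it only exercises the standard Sparkling Water conversions between Spark and H2O frames.

>>> df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])  # tiny test DataFrame
>>> h2o_frame = hc.as_h2o_frame(df)          # push the Spark DataFrame into the H2O cluster
>>> h2o_frame.summary()                      # basic per-column statistics computed by H2O
>>> spark_df = hc.as_spark_frame(h2o_frame)  # convert back to a Spark DataFrame
>>> spark_df.show()

If the conversions round-trip without errors, the H2O cluster is reachable from Spark.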
5. Zeppelin site

Before following the steps below, ensure that point 4 (Testing H2O from CLI) runs successfully.

a) Ambari UI

Ambari -> Zeppelin -> Configs -> Advanced zeppelin-env -> zeppelin_env_template

export SPARK_SUBMIT_OPTIONS="--files /tmp/H2O/sparkling-water-2.1.16/py/build/dist/h2o_pysparkling_2.1-2.1.16.zip"
export PYTHONPATH="/tmp/H2O/sparkling-water-2.1.16/py/build/dist/h2o_pysparkling_2.1-2.1.16.zip:${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"

b) Zeppelin UI

Zeppelin UI -> Interpreter -> Spark2 (H2O does not work with dynamicAllocation enabled)

spark.dynamicAllocation.enabled=<blank>
spark.shuffle.service.enabled=<blank>
c) Sample code

%pyspark
from pysparkling import *
hc = H2OContext.getOrCreate(spark)

%pyspark
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.grid.grid_search import H2OGridSearch
import sys
sys.stdout.isatty = lambda : False
sys.stdout.encoding = None
training_data = h2o.import_file("hdfs://dkhdp261:8020/tmp/test1.csv")
training_data.show()
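
To extend the sample code, one could train a model on the imported frame with the H2OGradientBoostingEstimator that is already imported above. The sketch below is illustrative only and assumes hypothetical column names ("x1" and "x2" as predictors, "y" as the response); adjust them to whatever columns /tmp/test1.csv actually contains.

%pyspark
from h2o.estimators.gbm import H2OGradientBoostingEstimator

# Split the imported frame into training and validation sets (80/20 split chosen for illustration).
train, valid = training_data.split_frame(ratios=[0.8], seed=42)

# Train a small GBM; "x1", "x2" and "y" are placeholder column names.
gbm = H2OGradientBoostingEstimator(ntrees=50, max_depth=5, seed=42)
gbm.train(x=["x1", "x2"], y="y", training_frame=train, validation_frame=valid)

# Report performance on the held-out validation frame.
print(gbm.model_performance(valid))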
						
					
09-08-2017 03:57 PM

@Vijay Kiran Please do not raise issues within the article. Instead, create a separate HCC question providing details of your JDBC interpreter as well as your Credentials configuration.

NOTE: The above was tested on HDP 2.6.1 only, and you are on 2.6.0.3 if I am right.
						
					
07-31-2017 06:00 AM

Problem:

I have been trying to configure Zeppelin's JDBC interpreter to work with our Phoenix servers, but am getting an error when running queries. The JDBC interpreter works fine for Hive and MySQL.

Running this:

%jdbc(phoenix)
select * from <table>

I am getting:

org.apache.zeppelin.interpreter.InterpreterException: null
org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=1, exceptions:
Thu Jul 20 08:27:49 BST 2017, RpcRetryingCaller{globalStartTime=1500535669736, pause=100, retries=1}, org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.io.IOException: Broken pipe
at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:416)
at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:564)
at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:692)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Root cause:

This problem is very likely caused by the upgrade from HDP 2.5 to 2.6. An environment variable is missing in zeppelin-env: ZEPPELIN_INTP_CLASSPATH_OVERRIDES.

Solution:

a)
- mv /etc/zeppelin/conf/interpreter.json /etc/zeppelin/conf/interpreter_bckup.json
- restart Zeppelin

b)
- go to Ambari UI -> Zeppelin -> Configs -> Advanced zeppelin-env -> zeppelin-env_content
- add
export ZEPPELIN_INTP_CLASSPATH_OVERRIDES="/etc/zeppelin/conf/external-dependency-conf"
just above
#### Spark interpreter configuration ####
- save the change and restart all required services
						
					
07-26-2017 07:25 AM

@Karan Alang Re-implement the SSL setup by following exactly the steps described here: http://docs.confluent.io/2.0.0/kafka/ssl.html
						
					
07-26-2017 06:21 AM

@Karan Alang
1) After enabling debug logging, what can you see in the controller log file?
2) What steps did you follow to enable SSL for Kafka?
						
					
07-26-2017 05:30 AM

@Karan Alang For debugging, change the log4j.rootLogger parameter in /etc/kafka/conf/tools-log4j.properties to:

log4j.rootLogger=DEBUG, stderr

Also check whether the producer works fine for PLAINTEXT, for example:

/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list <broker-node>:6667 --topic <topic> --security-protocol PLAINTEXT

For testing purposes, use only one broker node.
						
					
07-26-2017 04:10 AM

@Karan Alang

Remove:
- ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
- ssl.endpoint.identification.algorithm=HTTPS
- ssl.secure.random.implementation=SHA1PRNG

Add:
advertised.listeners=SSL://nwk2-bdp-kafka-04.gdcs-qa.apple.com:6668,PLAINTEXT://nwk2-bdp-kafka-04.gdcs-qa.apple.com:6667

client-ssl.properties:
security.protocol=SASL_SSL
ssl.truststore.location=/tmp/ssl-kafka/server.truststore.jks
ssl.truststore.password=changeit

Run (if your cluster is non-Kerberized):
./kafka-console-producer.sh --broker-list nwk2-bdp-kafka-04.gdcs-qa.apple.com:6668 --topic <topic> --producer.config client-ssl.properties --security-protocol SSL
						
					
07-25-2017 09:36 AM

@Karan Alang Can you share your server.properties for review?
						
					
07-19-2017 08:26 AM
1 Kudo

PROBLEM

I have a non-Kerberized cluster. I applied https://community.hortonworks.com/articles/81910/how-to-enable-user-impersonation-for-jdbc-interpre.html, however the JDBC interpreter is still not impersonated.

NOTE: From HDP 2.6.2 (Zeppelin 0.7.2), the JDBC interpreter on a non-Kerberized cluster is impersonated by adding the following property in Zeppelin UI -> Interpreter -> JDBC config:

hive.proxy.user.property=hive.server2.proxy.user

SOLUTION

1. Go to Zeppelin UI -> Interpreter

The configuration for the JDBC Hive interpreter should look as shown in the screenshot.

2. Edit the JDBC interpreter

Remove the properties hive.user and hive.password, and save the changes. The configuration now looks as shown in the screenshot.

3. Go to Zeppelin UI -> Credential

Add the credentials for the user, for example:

Entity: jdbc.jdbc
Username: <username>
Password: <password>

4. Run the query

Go to your notebook and run the JDBC query for Hive. In the RM UI this query is now running as YOU.
						
					