Member since: 11-14-2017
Posts: 13
Kudos Received: 0
Solutions: 0
12-08-2017
12:28 AM
@Venkata Sudheer Kumar M and @Jay Kumar SenSharma - is this a bug, then?
12-08-2017
12:27 AM
@Jay Kumar SenSharma - Yes, it is version 2.5.0.3.
12-07-2017
05:58 AM
Hi @Venkata Sudheer Kumar M, firstly thanks for being across all my issues. I may not have made myself clear: the size of the output differs every time. Every time we extract data as CSV, the file size is different. It should be consistent, and all rows must be extracted, just like the INSERT OVERWRITE functionality in Hive.
12-07-2017
01:00 AM
Ambari Hive View 1.5 is running from a standalone Ambari server. The Hive View is set up to access the cluster, using the ZooKeeper ports from the configuration to reach the data. The business users download the output of a SQL query from the Hive View using the "Save as" button, which offers two options:
- Save to HDFS
- Download as CSV
Every time a user downloads a result set, the row count of the output extract is different, with either option. The ambari.properties file content is embedded below; as far as I am aware, it only sets timeouts, and there is no configuration there that limits the result set (a verification sketch follows after the properties file).
#
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#Mon Dec 04 11:28:46 AEDT 2017
agent.package.install.task.timeout=1800
agent.stack.retry.on_repo_unavailability=false
agent.stack.retry.tries=5
agent.task.timeout=900
agent.threadpool.size.max=25
ambari-server.user=root
ambari.python.wrap=ambari-python-wrap
api.ssl=true
bootstrap.dir=/var/run/ambari-server/bootstrap
bootstrap.script=/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py
bootstrap.setup_agent.script=/usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
check_database_skipped=false
client.api.ssl.cert_name=https.crt
client.api.ssl.key_name=https.key
client.api.ssl.port=8080
client.threadpool.size.max=25
common.services.path=/var/lib/ambari-server/resources/common-services
custom.action.definitions=/var/lib/ambari-server/resources/custom_action_definitions
custom.postgres.jdbc.name=postgresql-42.1.1.jar
extensions.path=/var/lib/ambari-server/resources/extensions
http.cache-control=no-store
http.pragma=no-cache
http.strict-transport-security=max-age=31536000
http.x-content-type-options=nosniff
http.x-frame-options=DENY
http.x-xss-protection=1; mode=block
java.home=/usr/lib/java/jdk1.8.0_121
java.releases=jdk1.8,jdk1.7
java.releases.ppc64le=
jce.download.supported=true
jdk.download.supported=true
jdk1.7.desc=Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
jdk1.7.dest-file=jdk-7u67-linux-x64.tar.gz
jdk1.7.home=/usr/jdk64/
jdk1.7.jcpol-file=UnlimitedJCEPolicyJDK7.zip
jdk1.7.jcpol-url=http://public-repo-1.hortonworks.com/ARTIFACTS/UnlimitedJCEPolicyJDK7.zip
jdk1.7.re=(jdk.*)/jre
jdk1.7.url=http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-7u67-linux-x64.tar.gz
jdk1.8.desc=Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
jdk1.8.dest-file=jdk-8u112-linux-x64.tar.gz
jdk1.8.home=/usr/jdk64/
jdk1.8.jcpol-file=jce_policy-8.zip
jdk1.8.jcpol-url=http://public-repo-1.hortonworks.com/ARTIFACTS/jce_policy-8.zip
jdk1.8.re=(jdk.*)/jre
jdk1.8.url=http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-8u112-linux-x64.tar.gz
kerberos.keytab.cache.dir=/var/lib/ambari-server/data/cache
metadata.path=/var/lib/ambari-server/resources/stacks
mpacks.staging.path=/var/lib/ambari-server/resources/mpacks
pid.dir=/var/run/ambari-server
recommendations.artifacts.lifetime=1w
recommendations.dir=/var/run/ambari-server/stack-recommendations
resources.dir=/var/lib/ambari-server/resources
rolling.upgrade.skip.packages.prefixes=
security.server.disabled.ciphers=TLS_RSA_WITH_AES_256_GCM_SHA384|TLS_RSA_WITH_CAMELLIA_256_CBC_SHA|TLS_RSA_WITH_CAMELLIA_128_CBC_SHA|TLS_RSA_WITH_3DES_EDE_CBC_SHA|TLS_DHE_RSA_WITH_AES_128_GCM_SHA256|TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384|TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384|TLS_RSA_WITH_AES_256_CBC_SHA256|TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384|TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384|TLS_DHE_RSA_WITH_AES_256_CBC_SHA256|TLS_DHE_DSS_WITH_AES_256_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA|TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA|TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA|TLS_ECDH_RSA_WITH_AES_256_CBC_SHA|TLS_DHE_RSA_WITH_AES_256_CBC_SHA|TLS_DHE_DSS_WITH_AES_256_CBC_SHA|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256|TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256|TLS_DHE_RSA_WITH_AES_128_CBC_SHA256|TLS_DHE_DSS_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA|TLS_RSA_WITH_AES_128_CBC_SHA|TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA|TLS_ECDH_RSA_WITH_AES_128_CBC_SHA|TLS_DHE_RSA_WITH_AES_128_CBC_SHA|TLS_DHE_DSS_WITH_AES_128_CBC_SHA|TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA|TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA|SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA|SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA|TLS_EMPTY_RENEGOTIATION_INFO_SCSV|TLS_DH_anon_WITH_AES_256_CBC_SHA256|TLS_ECDH_anon_WITH_AES_256_CBC_SHA|TLS_DH_anon_WITH_AES_256_CBC_SHA|TLS_DH_anon_WITH_AES_128_CBC_SHA256|TLS_ECDH_anon_WITH_AES_128_CBC_SHA|TLS_DH_anon_WITH_AES_128_CBC_SHA|TLS_ECDH_anon_WITH_3DES_EDE_CBC_SHA|SSL_DH_anon_WITH_3DES_EDE_CBC_SHA|SSL_RSA_WITH_DES_CBC_SHA|SSL_DHE_RSA_WITH_DES_CBC_SHA|SSL_DHE_DSS_WITH_DES_CBC_SHA|SSL_DH_anon_WITH_DES_CBC_SHA|SSL_RSA_EXPORT_WITH_DES40_CBC_SHA|SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA|SSL_DHE_DSS_EXPORT_WITH_DES40_CBC_SHA|SSL_DH_anon_EXPORT_WITH_DES40_CBC_SHA|TLS_RSA_WITH_NULL_SHA256|TLS_ECDHE_ECDSA_WITH_NULL_SHA|TLS_ECDHE_RSA_WITH_NULL_SHA|SSL_RSA_WITH_NULL_SHA|TLS_ECDH_ECDSA_WITH_NULL_SHA|TLS_ECDH_RSA_WITH_NULL_SHA|TLS_ECDH_anon_WITH_NULL_SHA|SSL_RSA_WITH_NULL_MD5|TLS_KRB5_WITH_3DES_EDE_CBC_SHA|TLS_KRB5_WITH_3DES_EDE_CBC_MD5|TLS_KRB5_WITH_DES_CBC_SHA|TLS_KRB5_WITH_DES_CBC_MD5|TLS_KRB5_EXPORT_WITH_DES_CBC_40_SHA|TLS_KRB5_EXPORT_WITH_DES_CBC_40_MD5|TLS_RSA_WITH_AES_256_CBC_SHA
security.server.keys_dir=/var/lib/ambari-server/keys
server.connection.max.idle.millis=900000
server.execution.scheduler.isClustered=false
server.execution.scheduler.maxDbConnections=5
server.execution.scheduler.maxThreads=5
server.execution.scheduler.misfire.toleration.minutes=480
server.fqdn.service.url=http://169.254.169.254/latest/meta-data/public-hostname
server.http.session.inactive_timeout=1800
server.jdbc.connection-pool=internal
server.jdbc.database=postgres
server.jdbc.database_name=amabariview
server.jdbc.driver=org.postgresql.Driver
server.jdbc.hostname=lxdb4282-pgvip.dc.corp.telstra.com
server.jdbc.port=5432
server.jdbc.postgres.schema=amabariview
server.jdbc.rca.driver=org.postgresql.Driver
server.jdbc.rca.url=jdbc:postgresql://lxdb4282-pgvip.dc.corp.telstra.com:5432/amabariview
server.jdbc.rca.user.name=amabariview
server.jdbc.rca.user.passwd=/etc/ambari-server/conf/password.dat
server.jdbc.url=jdbc:postgresql://lxdb4282-pgvip.dc.corp.telstra.com:5432/amabariview?socketTimeout=6000000&tcpKeepAlive=true
server.jdbc.user.name=amabariview
server.jdbc.user.passwd=/etc/ambari-server/conf/password.dat
server.os_family=redhat6
server.os_type=redhat6
server.persistence.type=remote
server.python.log.level=INFO
server.python.log.name=ambari-server-command.log
server.stages.parallel=true
server.task.timeout=1200
server.tmp.dir=/var/lib/ambari-server/data/tmp
server.version.file=/var/lib/ambari-server/resources/version
shared.resources.dir=/usr/lib/ambari-server/lib/ambari_commons/resources
skip.service.checks=false
ssl.trustStore.password=Offshore01
ssl.trustStore.path=/webserver/certs/ambari-server-truststore
ssl.trustStore.type=jks
stackadvisor.script=/var/lib/ambari-server/resources/scripts/stack_advisor.py
ulimit.open.files=65536
user.inactivity.timeout.default=0
user.inactivity.timeout.role.readonly.default=0
views.ambari.hive.AUTO_HIVE_INSTANCE.result.fetch.timeout=500000
views.ambari.request.connect.timeout.millis=50000
views.ambari.request.read.timeout.millis=5000
views.http.cache-control=no-store
views.http.pragma=no-cache
views.http.strict-transport-security=max-age=31536000
views.http.x-content-type-options=nosniff
views.http.x-frame-options=SAMEORIGIN
views.http.x-xss-protection=1; mode=block
views.request.connect.timeout.millis=600000
views.request.read.timeout.millis=600000
views.skip.home-directory-check.file-system.list=wasb,adls,adl
webapp.dir=/usr/lib/ambari-server/web
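Since the properties above only set view and request timeouts (for example views.ambari.hive.AUTO_HIVE_INSTANCE.result.fetch.timeout and views.request.read.timeout.millis) and nothing that looks like a result-set limit, one way to narrow the problem down is to compare the authoritative row count from HiveServer2 with the number of rows in a downloaded CSV. The following is only a rough sketch under that assumption; the JDBC URL, credentials, table name, and file path are placeholders, not values taken from this cluster.

#!/usr/bin/env bash
# Hypothetical check: compare the row count reported by HiveServer2
# with the number of data rows in a CSV downloaded from the Hive View.
# All connection details and paths are placeholders.
JDBC_URL="jdbc:hive2://<zk-quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
CSV_FILE="/tmp/hive_view_download.csv"

# Authoritative count straight from HiveServer2.
expected=$(beeline -u "$JDBC_URL" -n "<user>" -p "<password>" \
  --silent=true --outputformat=tsv2 --showHeader=false \
  -e "SELECT COUNT(*) FROM <db>.<table>;")

# Rows in the downloaded CSV, excluding the header line.
actual=$(( $(wc -l < "$CSV_FILE") - 1 ))

echo "expected=${expected} actual=${actual}"
[ "${expected}" -eq "${actual}" ] || echo "Row counts differ - the download is incomplete."

If COUNT(*) stays stable while the CSV line count varies between downloads, the loss is on the view/download side rather than in the query itself.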
Labels:
- Apache Ambari
- Apache Hive
12-06-2017
06:28 AM
@Kit Menke - Hey Kit, I have a requirement where the users need to execute a query using beeline with the script stored in HDFS. I tried your approach, and I have tried several variations of it, but the outcome unfortunately contradicts your posts. Can beeline access an HDFS URI? It would be a great help if you could share your thoughts on this.
Option 1:
beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx -p ******* -f "City.sql" --verbose true --hivevar HDFSDIR="hdfs://namenode1.dc.corp.astro.com:8020/user/xxx"
############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################
Option 2:
beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx -p ******* -f "hdfs://namenode1.dc.corp.astro.com:8020/user/xxx/City.sql" --verbose true
############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################
Option 3:
beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx -p ******* -f "hdfs://user/xxx/City.sql" --verbose true
############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################
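Since all three options fail with "No such file or directory", it looks as if beeline's -f option only resolves local filesystem paths. A commonly used workaround, sketched below under that assumption, is to copy the script out of HDFS first (or stream it in on stdin); the connection string and paths are the ones from the options above.

#!/usr/bin/env bash
# Workaround sketch: treat -f as local-only, so pull the script out of HDFS first.
JDBC_URL="jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true"

# Copy the script from HDFS to a local temp file.
tmpfile=$(mktemp /tmp/City.sql.XXXXXX)
hdfs dfs -cat hdfs://namenode1.dc.corp.astro.com:8020/user/xxx/City.sql > "$tmpfile"

# Run it with -f against the local copy.
beeline -u "$JDBC_URL" -n xxx -p '*******' -f "$tmpfile" --verbose true

# Alternative without a local copy: stream the script on stdin.
# hdfs dfs -cat hdfs://namenode1.dc.corp.astro.com:8020/user/xxx/City.sql | beeline -u "$JDBC_URL" -n xxx -p '*******'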
12-06-2017
06:19 AM
@Venkata Sudheer Kumar M - this looks like a bug to me.
12-04-2017
10:30 PM
@Aditya Sirna - it works fine without hive impersonation.
12-04-2017
10:30 PM
@Venkata Sudheer Kumar M - here you go, I have attached the .out and the .log file in debug mode: zeppelin-zeppelin-1.zip (log file, split into two parts) and zeppelin-zeppelin-sss.txt (.out file).
12-04-2017
06:21 AM
@Venkata Sudheer Kumar M - I have performed all the steps mentioned here. I am aware of where the log files are; the logs don't show any other warning or any useful information. This is the only error message.
12-04-2017
06:01 AM
@Venkata Sudheer Kumar M - unfortunately this error occurs even with the admin account. Any other suggestions? java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:90)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:211)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:377)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
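The stack trace shows the Zeppelin server being refused a Thrift connection to its remote interpreter process, which usually means the interpreter JVM never started or exited immediately. A quick check, sketched below, is to confirm whether an interpreter process is actually running and then read its dedicated log; the log directory used here is the ZEPPELIN_LOG_DIR from the zeppelin-env pasted in the original question further down this page, so adjust it if yours differs.

#!/usr/bin/env bash
# Diagnostic sketch: is the remote interpreter JVM actually up?

# Any RemoteInterpreterServer processes currently running?
ps -ef | grep '[R]emoteInterpreterServer'

# Which ports does the zeppelin user have listening?
sudo netstat -tlnp | grep -i zeppelin

# Each interpreter writes its own log file under ZEPPELIN_LOG_DIR.
ls -lt /webserver/logs/var/log/zeppelin/ | head
tail -n 100 /webserver/logs/var/log/zeppelin/zeppelin-interpreter-*.log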
12-04-2017
01:32 AM
I am currently working on a client project where we have installed HDP 2.6, and there is an issue with Zeppelin user impersonation for the Hive interpreter. If the interpreter name is specified explicitly, the query executes correctly; however, when it runs as the default interpreter, impersonation does not seem to take effect. Hive itself also has impersonation enabled, and the user registry is managed by Ranger. I have attached the interpreter settings as a screenshot. Zeppelin properties (zeppelin-env):
export JAVA_HOME=
export JAVA_HOME=/usr/lib/java/jdk1.8.0_121
# export MASTER= # Spark master url. eg. spark://master_addr:7077. Leave empty if you want to use local mode.
export MASTER=yarn-client
export SPARK_YARN_JAR=/apps/zeppelin/zeppelin-spark-0.5.5-SNAPSHOT.jar
# export ZEPPELIN_JAVA_OPTS # Additional jvm options. for example, export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=8g -Dspark.cores.max=16"
export ZEPPELIN_JAVA_OPTS="-Dorg.xerial.snappy.tempdir=/webserver/tmp/zeppelin_tmp -Dhdp.version=2.6.0.3-8 -Dspark.driver.memory=512m -Dspark.executor.memory=1024m -Dspark.executor.instances=2 -Dspark.cores.max=8 -Dspark.dynamicAllocation.enabled=true -Dspark.dynamicAllocation.initialExecutors=1 -Dspark.dynamicAllocation.minExecutors=2 -Dspark.dynamicAllocation.maxExecutors=5"
# export ZEPPELIN_MEM # Zeppelin jvm mem options Default -Xms1024m -Xmx1024m -XX:MaxPermSize=512m
export ZEPPELIN_MEM="-Xms512m -Xmx2G -XX:MaxPermSize=512m -XX:MaxMetaspaceSize=512m"
# export ZEPPELIN_INTP_MEM # zeppelin interpreter process jvm mem options. Default -Xms1024m -Xmx1024m -XX:MaxPermSize=512m
export ZEPPELIN_INTP_MEM="-Xms512m -Xmx2G -XX:MaxPermSize=512m -XX:MaxMetaspaceSize=512m"
# export ZEPPELIN_INTP_JAVA_OPTS # zeppelin interpreter process jvm options.
export ZEPPELIN_INTP_JAVA_OPTS="-Dorg.xerial.snappy.tempdir=/webserver/tmp/zeppelin_tmp -Xms512m -Xmx2G -XX:MaxPermSize=512m -XX:MaxMetaspaceSize=512m"
# export ZEPPELIN_SSL_PORT # ssl port (used when ssl environment variable is set to true)
# export ZEPPELIN_LOG_DIR # Where log files are stored. PWD by default.
export ZEPPELIN_LOG_DIR=/webserver/logs/var/log/zeppelin
# export ZEPPELIN_PID_DIR # The pid files are stored. ${ZEPPELIN_HOME}/run by default.
export ZEPPELIN_PID_DIR=/var/run/zeppelin
# export ZEPPELIN_WAR_TEMPDIR # The location of jetty temporary directory.
# export ZEPPELIN_NOTEBOOK_DIR # Where notebook saved
# export ZEPPELIN_NOTEBOOK_HOMESCREEN # Id of notebook to be displayed in homescreen. ex) 2A94M5J1Z
# export ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE # hide homescreen notebook from list when this value set to "true". default "false"
# export ZEPPELIN_NOTEBOOK_S3_BUCKET # Bucket where notebook saved
# export ZEPPELIN_NOTEBOOK_S3_ENDPOINT # Endpoint of the bucket
# export ZEPPELIN_NOTEBOOK_S3_USER # User in bucket where notebook saved. For example bucket/user/notebook/2A94M5J1Z/note.json
# export ZEPPELIN_IDENT_STRING # A string representing this instance of zeppelin. $USER by default.
# export ZEPPELIN_NICENESS # The scheduling priority for daemons. Defaults to 0.
# export ZEPPELIN_INTERPRETER_LOCALREPO # Local repository for interpreter's additional dependency loading
# export ZEPPELIN_NOTEBOOK_STORAGE # Refers to pluggable notebook storage class, can have two classes simultaneously with a sync between them (e.g. local and remote).
# export ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC # If there are multiple notebook storages, should we treat the first one as the only source of truth?
# export ZEPPELIN_NOTEBOOK_PUBLIC # Make notebook public by default when created, private otherwise
export ZEPPELIN_INTP_CLASSPATH_OVERRIDES="/etc/zeppelin/conf/external-dependency-conf"
#### Spark interpreter configuration ####
## Use provided spark installation ##
## defining SPARK_HOME makes Zeppelin run spark interpreter process using spark-submit
##
# export SPARK_HOME # (required) When it is defined, load it instead of Zeppelin embedded Spark libraries
export SPARK_HOME=/usr/hdp/current/spark-client
# export SPARK_SUBMIT_OPTIONS # (optional) extra options to pass to spark submit. eg) "--driver-memory 512M --executor-memory 1G".
# export SPARK_APP_NAME # (optional) The name of spark application.
## Use embedded spark binaries ##
## without SPARK_HOME defined, Zeppelin still able to run spark interpreter process using embedded spark binaries.
## however, it is not encouraged when you can define SPARK_HOME
##
# Options read in YARN client mode
# export HADOOP_CONF_DIR # yarn-site.xml is located in configuration directory in HADOOP_CONF_DIR.
export HADOOP_CONF_DIR=/etc/hadoop/conf
# Pyspark (supported with Spark 1.2.1 and above)
# To configure pyspark, you need to set spark distribution's path to 'spark.home' property in Interpreter setting screen in Zeppelin GUI
# export PYSPARK_PYTHON # path to the python command. must be the same path on the driver(Zeppelin) and all workers.
# export PYTHONPATH
export PYSPARK_PYTHON=/usr/bin/python2.6
#export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"
export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.9-src.zip"
export SPARK_YARN_USER_ENV="PYTHONPATH=${PYTHONPATH}"
## Spark interpreter options ##
##
# export ZEPPELIN_SPARK_USEHIVECONTEXT # Use HiveContext instead of SQLContext if set true. true by default.
# export ZEPPELIN_SPARK_CONCURRENTSQL # Execute multiple SQL concurrently if set true. false by default.
# export ZEPPELIN_SPARK_IMPORTIMPLICIT # Import implicits, UDF collection, and sql if set true. true by default.
# export ZEPPELIN_SPARK_MAXRESULT # Max number of Spark SQL result to display. 1000 by default.
# export ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE # Size in characters of the maximum text message to be received by websocket. Defaults to 1024000
#### HBase interpreter configuration ####
## To connect to HBase running on a cluster, either HBASE_HOME or HBASE_CONF_DIR must be set
# export HBASE_HOME= # (require) Under which HBase scripts and configuration should be
# export HBASE_CONF_DIR= # (optional) Alternatively, configuration directory can be set to point to the directory that has hbase-site.xml
#export ZEPPELIN_IMPERSONATE_CMD ='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '
#export SPARK_HOME=/usr/hdp/current/spark-client
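One knob that keeps coming up for Zeppelin impersonation is the ZEPPELIN_IMPERSONATE_CMD line that is still commented out a few lines above (note the stray space before the = sign, which would break the assignment if the line were uncommented as-is). The corrected form is shown below purely as a sketch; whether it is enough for the Hive (JDBC) interpreter also depends on the interpreter's "User Impersonate" and per-user/isolated settings, so treat this as an assumption to verify rather than a confirmed fix.

# zeppelin-env.sh - corrected form of the commented-out impersonation command.
# No space is allowed before '='; the single quotes are intentional so that
# ${ZEPPELIN_IMPERSONATE_USER} is substituted when the interpreter is launched, not by this shell.
export ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '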
Tags:
- Data Science & Advanced Analytics
- hive-interpreter
- hive-jdbc
- impersonation
- interpreter
- zeppelin-notebook
Labels:
- Apache Hive
- Apache Zeppelin
11-24-2017
12:28 AM
Thanks a lot
11-14-2017
07:26 PM
Hi guys, I have a unique issue. The GROUPING__ID function does not seem to work as expected, or as shown in the Hive manual. I am executing the same example as shown in the Hive guide: https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup I could not find any open bugs, but I have tested it on three different versions of HDP.
HDP 2.4: the statements executed are below, and the results are as expected - everything works fine.
-- create table
create table grp_tst( col1 int,col2 int);
-- insert query
insert into table grp_tst values (1, NULL);
insert into table grp_tst values (1, 1);
insert into table grp_tst values (2, 2);
insert into table grp_tst values (3, 3);
insert into table grp_tst values (3, NULL);
insert into table grp_tst values (4, 5);
-- select query
SELECT col1, col2, GROUPING__ID, count(*)
FROM grp_tst
GROUP BY col1, col2 WITH ROLLUP;
Results (HDP 2.4):
col1   col2   grouping_id   count
NULL   NULL   0             6
1      NULL   1             2
1      NULL   3             1
1      1      3             1
2      NULL   1             1
2      2      3             1
3      NULL   1             2
3      NULL   3             1
3      3      3             1
4      NULL   1             1
4      5      3             1
HDP 2.5 and HDP 2.6.0: on both, the results seem to be wrong, but consistently wrong, so I am wondering whether a bug was introduced in Hive between 2.4 and 2.5.
col1   col2   grouping_id   count
NULL   NULL   3             6
1      NULL   0             1
1      NULL   1             2
1      1      0             1
2      NULL   1             1
2      2      0             1
3      NULL   0             1
3      NULL   1             2
3      3      0             1
4      NULL   1             1
4      5      0             1
Hive settings: I am happy to share any property files if required. It would be great if you could help me out here.
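Since the raw GROUPING__ID values clearly follow a different bit convention on the two clusters, it may help to decode the individual bits per column and compare the output side by side on HDP 2.4 and HDP 2.5/2.6. The sketch below is only a diagnostic against the grp_tst table from the example; the JDBC URL is a placeholder, and which bit corresponds to col1 versus col2 is exactly what the query is meant to reveal, not something asserted here.

#!/usr/bin/env bash
# Diagnostic sketch: expose the individual GROUPING__ID bits so the encoding
# can be compared between HDP versions. <jdbc-url> is a placeholder.
beeline -u "<jdbc-url>" --outputformat=table -e "
SELECT col1,
       col2,
       GROUPING__ID,
       CAST(GROUPING__ID AS INT) & 1 AS bit0,
       CAST(GROUPING__ID AS INT) & 2 AS bit1,
       count(*) AS cnt
FROM   grp_tst
GROUP  BY col1, col2 WITH ROLLUP;"

A nonzero bit0/bit1 on the subtotal rows shows which column each bit tracks, which makes it easy to see whether the two HDP versions simply invert or reorder the bits.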
Labels:
- Apache Hive