Member since: 01-26-2017
Posts: 5
Kudos Received: 0
Solutions: 0
11-07-2019 06:33 AM
Hi @Scharan
Spark conf:
tee -a ~/spark/conf/spark-defaults.conf > /dev/null <<EOF
spark.sql.catalogImplementation hive
spark.master yarn
spark.driver.memory 4g
spark.shuffle.service.enabled true
spark.yarn.jars hdfs:///user/zeppelin/lib/spark/jars/*
EOF
Livy conf:
tee -a ~/livy/conf/livy-env.sh > /dev/null <<EOF
JAVA_HOME=/usr/lib/jvm/java-8-oracle
HADOOP_HOME=/usr/lib/hadoop
HADOOP_CONF_DIR=/etc/hadoop/conf
SPARK_HOME=~/spark
LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
EOF
tee -a ~/livy/conf/livy.conf > /dev/null <<EOF
livy.repl.enable-hive-context = true
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
livy.impersonation.enabled = true
EOF
Zeppelin conf:
tee ~/zeppelin/conf/shiro.ini > /dev/null <<EOF
[main]
### FreeIPA over LDAP
ldapRealm = org.apache.zeppelin.realm.LdapRealm
ldapRealm.contextFactory.environment[ldap.searchBase] = dc=my,dc=corp,dc=de
ldapRealm.userDnTemplate = uid={0},cn=users,cn=accounts,dc=my,dc=corp,dc=de
ldapRealm.userSearchScope = subtree
ldapRealm.groupSearchScope = subtree
ldapRealm.searchBase = cn=accounts,dc=my,dc=corp,dc=de
ldapRealm.userSearchBase = cn=users,cn=accounts,dc=my,dc=corp,dc=de
ldapRealm.groupSearchBase = cn=groups,cn=accounts,dc=my,dc=corp,dc=de
ldapRealm.userObjectClass = person
ldapRealm.groupObjectClass = groupofnames
ldapRealm.groupSearchEnableMatchingRuleInChain = true
ldapRealm.userSearchAttributeName = uid
ldapRealm.userSearchFilter=(&(objectclass=person)(uid={0}))
ldapRealm.memberAttribute = member
ldapRealm.memberAttributeValueTemplate = uid={0},cn=users,cn=accounts,dc=my,dc=corp,dc=de
ldapRealm.contextFactory.authenticationMechanism = simple
ldapRealm.contextFactory.systemUsername = zeppelin
ldapRealm.contextFactory.systemPassword = password
ldapRealm.contextFactory.url = ldap://freeipa.my.corp.de:389
securityManager.realms = \$ldapRealm
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
### Enables 'HttpOnly' flag in Zeppelin cookies
cookie = org.apache.shiro.web.servlet.SimpleCookie
cookie.name = JSESSIONID
cookie.httpOnly = true
### Uncomment the below line only when Zeppelin is running over HTTPS
#cookie.secure = true
sessionManager.sessionIdCookie = \$cookie
securityManager.sessionManager = \$sessionManager
# 86,400,000 milliseconds = 24 hours
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login
[roles]
pharos = *
admin = *
[urls]
/api/version = anon
# Allow all authenticated users to restart interpreters on a notebook page.
# Comment out the following line if you would like to authorize only admin users to restart interpreters.
/api/interpreter/setting/restart/** = authc
/api/interpreter/** = authc
/api/configurations/** = authc
/api/credential/** = authc
/** = authc
EOF
Interpreter:
"livy": {
"id": "livy",
"name": "livy",
"group": "livy",
"properties": {
"livy.spark.executor.instances": {
"name": "livy.spark.executor.instances",
"value": "",
"type": "number"
},
"livy.spark.dynamicAllocation.cachedExecutorIdleTimeout": {
"name": "livy.spark.dynamicAllocation.cachedExecutorIdleTimeout",
"value": "",
"type": "string"
},
"zeppelin.livy.concurrentSQL": {
"name": "zeppelin.livy.concurrentSQL",
"value": false,
"type": "checkbox"
},
"zeppelin.livy.url": {
"name": "zeppelin.livy.url",
"value": "http://localhost:8998",
"type": "url"
},
"zeppelin.livy.pull_status.interval.millis": {
"name": "zeppelin.livy.pull_status.interval.millis",
"value": "1000",
"type": "number"
},
"livy.spark.executor.memory": {
"name": "livy.spark.executor.memory",
"value": "",
"type": "string"
},
"zeppelin.livy.restart_dead_session": {
"name": "zeppelin.livy.restart_dead_session",
"value": false,
"type": "checkbox"
},
"livy.spark.dynamicAllocation.enabled": {
"name": "livy.spark.dynamicAllocation.enabled",
"value": false,
"type": "checkbox"
},
"zeppelin.livy.maxLogLines": {
"name": "zeppelin.livy.maxLogLines",
"value": "1000",
"type": "number"
},
"livy.spark.dynamicAllocation.minExecutors": {
"name": "livy.spark.dynamicAllocation.minExecutors",
"value": "",
"type": "number"
},
"livy.spark.executor.cores": {
"name": "livy.spark.executor.cores",
"value": "",
"type": "number"
},
"zeppelin.livy.session.create_timeout": {
"name": "zeppelin.livy.session.create_timeout",
"value": "120",
"type": "number"
},
"zeppelin.livy.spark.sql.maxResult": {
"name": "zeppelin.livy.spark.sql.maxResult",
"value": "1000",
"type": "number"
},
"livy.spark.driver.cores": {
"name": "livy.spark.driver.cores",
"value": "4",
"type": "number"
},
"livy.spark.jars.packages": {
"name": "livy.spark.jars.packages",
"value": "",
"type": "textarea"
},
"zeppelin.livy.spark.sql.field.truncate": {
"name": "zeppelin.livy.spark.sql.field.truncate",
"value": true,
"type": "checkbox"
},
"livy.spark.driver.memory": {
"name": "livy.spark.driver.memory",
"value": "8G",
"type": "string"
},
"zeppelin.livy.displayAppInfo": {
"name": "zeppelin.livy.displayAppInfo",
"value": true,
"type": "checkbox"
},
"zeppelin.livy.principal": {
"name": "zeppelin.livy.principal",
"value": "",
"type": "string"
},
"zeppelin.livy.keytab": {
"name": "zeppelin.livy.keytab",
"value": "",
"type": "textarea"
},
"livy.spark.dynamicAllocation.maxExecutors": {
"name": "livy.spark.dynamicAllocation.maxExecutors",
"value": "",
"type": "number"
},
"livy.spark.dynamicAllocation.initialExecutors": {
"name": "livy.spark.dynamicAllocation.initialExecutors",
"value": "",
"type": "number"
}
},
"status": "READY",
"interpreterGroup": [
{
"name": "spark",
"class": "org.apache.zeppelin.livy.LivySparkInterpreter",
"defaultInterpreter": true,
"editor": {
"language": "scala",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
}
},
{
"name": "sql",
"class": "org.apache.zeppelin.livy.LivySparkSQLInterpreter",
"defaultInterpreter": false,
"editor": {
"language": "sql",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
}
},
{
"name": "pyspark",
"class": "org.apache.zeppelin.livy.LivyPySparkInterpreter",
"defaultInterpreter": false,
"editor": {
"language": "python",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
}
},
{
"name": "pyspark3",
"class": "org.apache.zeppelin.livy.LivyPySpark3Interpreter",
"defaultInterpreter": false,
"editor": {
"language": "python",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
}
},
{
"name": "sparkr",
"class": "org.apache.zeppelin.livy.LivySparkRInterpreter",
"defaultInterpreter": false,
"editor": {
"language": "r",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
}
},
{
"name": "shared",
"class": "org.apache.zeppelin.livy.LivySharedInterpreter",
"defaultInterpreter": false
}
],
"dependencies": [],
"option": {
"remote": true,
"port": -1,
"perNote": "shared",
"perUser": "scoped",
"isExistingProcess": false,
"setPermission": false,
"owners": [],
"isUserImpersonate": false
}
},
Thanks!
Nicola
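PS: a quick way to check that the Livy endpoint configured above (zeppelin.livy.url = http://localhost:8998) is reachable and can start sessions is its REST API. This is only a sketch; the session kind is just an example:
# list existing Livy sessions
curl http://localhost:8998/sessions
# create a test Spark session
curl -X POST -H "Content-Type: application/json" -d '{"kind": "spark"}' http://localhost:8998/sessions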
11-07-2019 05:39 AM
Hi @cjervis, I would also like to change my nickname and company. How can I do that? Best regards, Nicola
11-07-2019 05:26 AM
I have Livy and Zeppelin running under the user "zeppelin" on an edge node.
The Cloudera cluster is not kerberized. Livy is set to YARN client mode.
When impersonation is not enabled, the Livy interpreter works and a YARN job is started on the cluster as the user "zeppelin".
With livy.impersonation.enabled = true, Livy cannot connect to the YARN ResourceManager:
19/10/24 14:40:58 INFO utils.LineBufferedStream: 19/10/24 14:40:58 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm271
19/10/24 14:40:58 INFO utils.LineBufferedStream: 19/10/24 14:40:58 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm271 after 1 fail over attempts. Trying to fail over immediately.
19/10/24 14:40:58 INFO utils.LineBufferedStream: 19/10/24 14:40:58 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm270
19/10/24 14:40:58 INFO utils.LineBufferedStream: 19/10/24 14:40:58 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm270 after 2 fail over attempts. Trying to fail over after sleeping for 374ms.
19/10/24 14:40:58 INFO utils.LineBufferedStream: java.net.ConnectException: Call From <localhost>/123.123.123.123 to yarn.rm.host.name:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
This error happens after the Spark driver has started successfully.
Spark simply doesn't get any resources from YARN.
Zeppelin is configured to use FreeIPA over LDAP for login. The user I use to log in to the Zeppelin UI has even more rights on the cluster, and if I start a spark-shell on the same cluster with that user, I get a YARN job without any issue.
When impersonation is disabled I have no issues.
Environment:
Centos7
Cloudera 5.7
Hadoop 2.6
Spark 2.4.x
Livy 0.6
Zeppelin 0.8.2
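For completeness: as far as I understand, Livy impersonation also relies on the Hadoop proxyuser mechanism, i.e. the user Livy runs as (here "zeppelin") must be allowed to impersonate others in core-site.xml. A sketch of the properties I would expect to be needed (the wildcard values are only an example, not what I have deployed):
hadoop.proxyuser.zeppelin.hosts = *
hadoop.proxyuser.zeppelin.groups = *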
Labels:
- Apache Spark
- Apache YARN
- Apache Zeppelin
07-19-2017 07:08 AM
After increasing the Java heap size to 4 GB, I can start the NameNode without errors from the service menu. However, the issue persists during the upgrade process, so it seems to be specific to the upgrade wizard.
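For reference, the setting I changed is the NameNode Java heap size; roughly speaking it ends up as something like the following in hadoop-env.sh (only a sketch, the Ambari-managed template looks different):
export HADOOP_NAMENODE_OPTS="-Xmx4096m ${HADOOP_NAMENODE_OPTS}"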
07-11-2017 12:31 AM
I have an HDP 2.5.3 kerberized cluster. An express upgrade from 2.5.0 to 2.5.3 was already completed successfully in the past using the Ambari wizard.
The upgrade to 2.6.1 fails when trying to restart the HDFS NameNode. It fails with the "connection refused" error message (see below). I tried to choose "Ignore and Proceed" after every service restart failure, but in the end the wizard gives only two options: retry (the last step) or downgrade. The same error appears when performing the downgrade back to 2.5.3: it cannot restart the NameNode, and I always have to choose "Ignore and Proceed". However, after completing the downgrade, I can start all services in Ambari.
I noticed the same behavior in the logs when starting the cluster every morning. There are exactly the same "connection refused" error lines before the lines indicating the attempts to leave safe mode. However, the start of the HDFS service always completes successfully and the NameNode works in the end (that's why I noticed the error only now). It seems that only the upgrade process cannot properly restart the NameNode.
Now my questions: Why does the upgrade/downgrade wizard fail to restart the NameNode while the usual Ambari service section has no issues with it? Is it possible to choose "Ignore and Proceed" during the upgrade process without being obliged to downgrade and restart the services afterwards?
Regards, Nicola
java.net.ConnectException: Call From <hostname>/<IP> to hostname:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1556)
at org.apache.hadoop.ipc.Client.call(Client.java:1496)
at org.apache.hadoop.ipc.Client.call(Client.java:1396)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy10.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:711)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy11.setSafeMode(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2657)
at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1340)
at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1324)
at org.apache.hadoop.hdfs.tools.DFSAdmin.setSafeMode(DFSAdmin.java:611)
at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1916)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
at org.apache.hadoop.ipc.Client.call(Client.java:1449)
... 20 more
safemode: Call From hostname/IP to hostname:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2017-07-10 10:55:33,095 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://hostname:8020 -safemode get | grep 'Safe mode is OFF'' returned 1. 17/07/10 10:55:32 WARN ipc.Client: Failed to connect to server: hostname/IP:8020: try once and fail.
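For reference, the check that fails during the wizard can be reproduced by hand on the NameNode host (sketch only, <namenode-host> is a placeholder):
# is the NameNode RPC port actually listening?
ss -tlnp | grep 8020
# the same safemode query the wizard runs
hdfs dfsadmin -fs hdfs://<namenode-host>:8020 -safemode get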
03-10-2017 01:11 PM
I managed to install and start Ambari 2.4.2.0 and HDP 2.5.3 in full (all components except SmartSense) on Ubuntu 16.04.2, without Kerberos, SSL, or LDAP configured yet. It took some tweaking, but in the end I have no alerts at all, all UIs are working, and all service checks pass.
The following has to be done before starting the installation of HDP. In the files
/usr/lib/ambari-agent/lib/ambari_commons/resources/os_family.json
/usr/lib/ambari-server/lib/ambari_commons/resources/os_family.json
add the alias ubuntu16:
"aliases": {
"amazon2015": "amazon6",
"amazon2016": "amazon6",
"suse11sp3": "suse11",
"opensuse 42": "sles12",
"ubuntu16": "ubuntu14"
}
I also added "opensuse 42" because I tried an install on openSUSE Leap 42 (the equivalent of SLES 12), but that's another story... I also removed 16 from the ubuntu versions:
"ubuntu": {
"distro": [
"ubuntu"
],
"versions": [
12,
14
]
},
In this way ubuntu16 is treated as an alias of ubuntu14, and during the configuration and installation of HDP 2.5.3 Ambari will no longer complain about the OS version.
In one single service something had to be modified because of a new version of a specific deb package. Go to the file
/var/lib/ambari-server/resources/common-services/AMBARI_METRICS/0.1.0/metainfo.xml
and replace "libsnappy1*" with "libsnappy1v5*", for example as shown below.
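A one-liner that should do this edit (assuming the package name appears literally as libsnappy1* in the file; -i.bak keeps a backup copy):
sed -i.bak 's/libsnappy1\*/libsnappy1v5*/' /var/lib/ambari-server/resources/common-services/AMBARI_METRICS/0.1.0/metainfo.xml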
Afterwards everything was OK, except for OS-independent known issues.