Member since 03-25-2016 · 142 Posts · 48 Kudos Received · 7 Solutions
04-24-2017 01:24 PM
Problem
This problem occurred on HDP 2.5.3 when running Spark on HBase. Here is the error seen in the application log:
...
17/04/11 10:12:04 WARN RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
17/04/11 10:12:05 INFO ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
17/04/11 10:12:05 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
...
Solution
To fix this problem, ensure that the hbase-site.xml file exists in /etc/spark/conf on every NodeManager node.
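A minimal sketch of one way to distribute the file, assuming passwordless SSH as root and hypothetical NodeManager hostnames (nm1, nm2, nm3):
# Copy the cluster's hbase-site.xml into the Spark conf dir on each
# NodeManager so Spark executors can locate the HBase ZooKeeper quorum.
for host in nm1 nm2 nm3; do
  scp /etc/hbase/conf/hbase-site.xml "root@${host}:/etc/spark/conf/hbase-site.xml"
done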
07-16-2018 06:22 AM
@Daniel Kozlowski: The kill solution will work in "client" mode. In cluster mode, the driver could be on any node of the cluster. Assuming we don't have SSH access to that node, how can one kill the driver?
02-06-2018 06:35 AM
@Lekya Goriparti Have a look at this: https://community.hortonworks.com/questions/26622/the-node-hbase-is-not-in-zookeeper-it-should-have.html
05-10-2017 04:58 AM
@azelmad zakaria As this is an article, please raise a separate question in HCC, refer to this one, and provide the full stack trace from your console.
03-14-2017 01:54 PM
Environment
- HDP 2.5.3
- Kerberos disabled
Problem
I have a problem using HiveContext with Zeppelin. For example, this code does not work:
%pyspark
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
sample07 = sqlContext.table("default.sample_07")
sample07.show()
Here is the error displayed:
You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:225)
at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:215)
at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:480)
at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:479)
at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:515)
... 21 more
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2001)
at org.apache.hadoop.hive.ql.session.SessionState.createTempFile(SessionState.java:818)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:513)
... 21 more
(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o125), <traceback object at 0x17682d8>)
Solution
Even though you are logged in to the Zeppelin UI as a user from AD/LDAP/local, the query gets executed as the zeppelin user. Hence, the zeppelin user needs write permission on the directory pointed to by the hive.exec.local.scratchdir parameter. By default, it is set to /tmp/<userName>. So, the following needs to exist on the Zeppelin node:
[root@dan2 ~]# ls -lrt /tmp
drwxr-xr-x. 20 zeppelin zeppelin 4096 Mar 10 16:46 zeppelin
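If the directory is missing or owned by the wrong user, a minimal sketch to set it up (assuming the default hive.exec.local.scratchdir of /tmp/<userName> and a service user named zeppelin), run as root on the Zeppelin node:
# Create the local Hive scratch directory and hand it to the zeppelin user
mkdir -p /tmp/zeppelin
chown zeppelin:zeppelin /tmp/zeppelin
chmod 755 /tmp/zeppelin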
03-14-2017 09:49 AM
Problem
When trying to import larger notebooks into Zeppelin, the import does nothing. Checking the logs under /var/log/zeppelin/, it is apparent that this is a Jetty WebSocket text message size issue.
WARN [2017-03-13 14:22:30,427] ({qtp110945054-15} Parser.java[notifyWebSocketException]:235) -
org.eclipse.jetty.websocket.api.MessageTooLargeException: Text message size [2408484] exceeds maximum size [1024000]
at org.eclipse.jetty.websocket.api.WebSocketPolicy.assertValidTextMessageSize(WebSocketPolicy.java:140)
at org.eclipse.jetty.websocket.common.Parser.assertSanePayloadLength(Parser.java:127)
at org.eclipse.jetty.websocket.common.Parser.parseFrame(Parser.java:482)
at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:254)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
INFO [2017-03-13 14:22:30,428] ({qtp110945054-15} NotebookServer.java[onClose]:217) - Closed connection to 84.20.132.89 : 24324. (1009) Text message size [2408484] exceeds maximum size [1024000]
Solution
Steps to resolve the issue:
- go to Ambari UI -> Zeppelin -> Configs -> Advanced zeppelin-config
- increase the value of zeppelin.websocket.max.text.message.size, e.g. to 3072000
- save the changes
- restart the required services, then verify the change as sketched below
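To verify that the new limit has been pushed out, check the rendered config on the Zeppelin node (a quick sketch; the conf path assumes a standard HDP layout):
# Confirm the new websocket limit is present in the rendered zeppelin-site.xml
grep -A1 'zeppelin.websocket.max.text.message.size' /etc/zeppelin/conf/zeppelin-site.xml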
03-13-2017 01:30 PM
Problem
When I run the code below in Pig View (HDP 2.5.3 and Ambari 2.4.2) on a Kerberos-enabled cluster:
REGISTER '/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-pig-adapter.jar';
a = LOAD 'test123.foo' USING org.apache.hive.hcatalog.pig.HCatLoader();
dump a;
I get the following errors.
Pig log:
...
07 Mar 2017 12:22:32,971 ERROR [ambari-client-thread-28] [PIG 1.0.0 Pig] JobService:232 - Exception occurred :
java.io.FileNotFoundException: File /user/<userID>/pig/jobs/pighivetest_07-03-2017-12-22-15/stderr not found.
...
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File /user/<userID>/pig/jobs/pighivetest_07-03-2017-12-22-15/stderr not found.
...
Application log:
...
2017-03-07 12:22:45,223 [main] ERROR org.apache.thrift.transport.TSaslTransport - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
...
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
...
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
...
2017-03-07 12:24:45,466 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.hive.hcatalog.pig.HCatLoader
Details at logfile: /hadoop/1/hadoop/yarn/local/usercache/<userID>/appcache/application_1488824278698_2495/container_e187_1488824278698_2495_01_000002/pig_1488889361907.log
...
Interestingly, the same job runs perfectly fine when run from the Pig command line.
Solution
In Pig View, under the script section, there is an Arguments field for adding Pig properties. To fix the problem, add "-useHCatalog" as an argument.
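For comparison, this is the equivalent invocation from a shell, which is why the same script succeeds on the command line (a sketch; the script filename is hypothetical):
# On a Kerberized cluster, obtain a ticket first: kinit <userID>
# -useHCatalog puts the HCatalog jars on Pig's classpath
pig -useHCatalog -f pighivetest.pig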
03-14-2017 06:47 AM
@yvora This problem happened at a customer's site on HDP 2.5.0. The customer already had a Spark job running in the RM UI. They then created a folder, set permissions, etc. using the sh interpreter or the command line. Running livy.sparkr with the source function displayed "No such file or directory" even though the file was there. Restarting the Livy interpreter fixed the issue. BTW: I have tested it on my local HDP 2.5.3 and all works fine - no Livy interpreter restart is needed.
03-01-2017 03:15 PM
Problem
I have my shiro configured as:
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
#admin = password1
admin = change
#user1 = password2, role1, role2
#user2 = password3, role3
#user3 = password4, role2
# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
ldapRealm = org.apache.zeppelin.server.LdapGroupRealm
ldapRealm.contextFactory.environment[ldap.searchBase] = ou=my,ou=company,o=com
ldapRealm.userDnTemplate = uid={0},ou=my,ou=company,o=com
ldapRealm.contextFactory.url = ldap://<ldap_host>:389
ldapRealm.contextFactory.authenticationMechanism = SIMPLE
# doc horton
ldapRealm.contextFactory.systemUsername = <system_username_bind>
ldapRealm.contextFactory.systemPassword = <system_username_password>
#sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
#securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hour
#securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login
[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
#/** = anon
/** = authc
After logging into the Zeppelin UI and running scripts from a notebook, I can see the following error in the Zeppelin log:
WARN [2017-02-28 09:20:09,062] ({qtp1918627686-14} ServletHandler.java[doHandle]:620) -
javax.servlet.ServletException: Filtered request failed.
at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:384)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.apache.zeppelin.server.CorsFilter.doFilter(CorsFilter.java:72)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoSuchMethodError: javax.ws.rs.ClientErrorException.validate(Ljavax/ws/rs/core/Response;Ljavax/ws/rs/core/Response$Status$Family;)Ljavax/ws/rs/core/Response;
at javax.ws.rs.ClientErrorException.<init>(ClientErrorException.java:88)
at org.apache.cxf.jaxrs.utils.JAXRSUtils.findTargetMethod(JAXRSUtils.java:503)
at org.apache.cxf.jaxrs.interceptor.JAXRSInInterceptor.processRequest(JAXRSInInterceptor.java:207)
at org.apache.cxf.jaxrs.interceptor.JAXRSInInterceptor.handleMessage(JAXRSInInterceptor.java:103)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:272)
at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:239)
at org.apache.cxf.transport.servlet.ServletController.invokeDestination(ServletController.java:248)
at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:222)
at org.apache.cxf.transport.servlet.ServletController.invoke(ServletController.java:153)
at org.apache.cxf.transport.servlet.CXFNonSpringServlet.invoke(CXFNonSpringServlet.java:167)
at org.apache.cxf.transport.servlet.AbstractHTTPServlet.handleRequest(AbstractHTTPServlet.java:286)
at org.apache.cxf.transport.servlet.AbstractHTTPServlet.doGet(AbstractHTTPServlet.java:211)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at org.apache.cxf.transport.servlet.AbstractHTTPServlet.service(AbstractHTTPServlet.java:262)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:383)
at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
... 22 more
Solution
To solve the problem, enable the sessionManager, cacheManager, and securityManager attributes in the shiro config so that it looks like this:
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
#admin = password1
admin = change
#user1 = password2, role1, role2
#user2 = password3, role3
#user3 = password4, role2
# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
ldapRealm = org.apache.zeppelin.server.LdapGroupRealm
ldapRealm.contextFactory.environment[ldap.searchBase] = ou=my,ou=company,o=com
ldapRealm.userDnTemplate = uid={0},ou=my,ou=company,o=com
ldapRealm.contextFactory.url = ldap://<ldap_host>:389
ldapRealm.contextFactory.authenticationMechanism = SIMPLE
# doc horton
ldapRealm.contextFactory.systemUsername = <system_username_bind>
ldapRealm.contextFactory.systemPassword = <system_username_password>
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
securityManager.realm = $ldapRealm
securityManager.cacheManager = $cacheManager
securityManager.sessionManager = $sessionManager
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login
[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
#/** = anon
/** = authc
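Once Zeppelin has been restarted with the updated shiro.ini, the login endpoint can be exercised directly to confirm that authentication works again (a sketch; localhost and port 9995 are assumptions for a default Zeppelin install):
# POST credentials to Zeppelin's login API; an HTTP 200 with a JSON body
# naming the principal indicates the Shiro session/security setup is healthy.
curl -i -X POST http://localhost:9995/api/login --data 'userName=admin&password=change'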
02-28-2017 06:21 AM · 1 Kudo
Problem
When running $ ambari-server restart I can see tons of these messages in ambari-server.log:
27 Feb 2017 11:38:00,942 WARN [main] AbstractLifeCycle:204 - FAILED SelectChannelConnector@0.0.0.0:8080: java.net.BindException: Address already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:463)
at sun.nio.ch.Net.bind(Net.java:455)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.server.Server.doStart(Server.java:293)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.apache.ambari.server.controller.AmbariServer.run(AmbariServer.java:616)
at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:925)
27 Feb 2017 11:38:00,942 WARN [main] AbstractLifeCycle:204 - FAILED SelectChannelConnector@0.0.0.0:8080: java.net.BindException: Address already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:463)
at sun.nio.ch.Net.bind(Net.java:455)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.server.Server.doStart(Server.java:293)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.apache.ambari.server.controller.AmbariServer.run(AmbariServer.java:616)
at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:925)
27 Feb 2017 11:38:00,944 WARN [main] AbstractLifeCycle:204 - FAILED org.eclipse.jetty.server.Server@5c0ecfc4: java.net.BindException: Address already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:463)
at sun.nio.ch.Net.bind(Net.java:455)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
Solution
On AWS, Cloudbreak starts the instances with Amazon Linux. On Amazon Linux, Upstart is used for system init. To restart the Ambari server under Upstart, you have to use the "initctl" command, like this:
$ initctl restart ambari-server
If you already have an Ambari server running, do the following after logging in to the Ambari server node as root:
a) $ ambari-server stop
b) $ ps -ef | grep ambari-server
c) kill all outstanding ambari-server processes
d) backup and remove all logs from /var/log/ambari-server
e) $ initctl start ambari-server
f) tail -f /var/log/ambari-server/ambari-server.log
Now there are no more "Address already in use" errors in ambari-server.log.
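If the bind error persists, it helps to confirm which process is still holding the port before starting the server again (a quick sketch; 8080 is Ambari's default port):
# Show the PID of whatever is listening on Ambari's port
netstat -tnlp | grep ':8080'
# Alternatively:
lsof -i :8080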