SmartSense Zeppelin Notebooks Can't Connect

Contributor

I just noticed that the SmartSense Activity Explorer Zeppelin notebooks are failing to run on a production HDP 2.6.1 cluster. I'm not sure how long this has been happening, since the dashboards haven't been used much until now. Whenever we run a paragraph, it immediately fails with an error saying it is unable to establish a connection, and no other information is given. We can connect to Phoenix through psql.py, so we know Phoenix itself is working; only the dashboards fail. Restarting the Activity Explorer didn't fix the issue. Has anyone seen this before? Any ideas? The logs we are seeing are below.

 

==> activity-explorer.log <==

2019-11-05 10:34:42,555  INFO [qtp1209702763-1653] NotebookServer:711 - New operation from 10.142.131.4 : 62057 : admin : GET_NOTE : 2BPD7951H

2019-11-05 10:34:42,558  WARN [qtp1209702763-1653] VFSNotebookRepo:292 - Get Note revisions feature isn't supported in class org.apache.zeppelin.notebook.repo.VFSNotebookRepo

2019-11-05 10:34:45,886  INFO [pool-2-thread-31] SchedulerFactory:131 - Job paragraph_1490380022011_880344082 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session255064451

2019-11-05 10:34:45,887  INFO [pool-2-thread-31] Paragraph:366 - run paragraph 20160728-152731_1797959357 using null org.apache.zeppelin.interpreter.LazyOpenInterpreter@4a66e7be

 

==> zeppelin-interpreter-phoenix-phoenix--<HOSTNAME> <==

2019-11-05 10:34:45,889  INFO [pool-2-thread-4] SchedulerFactory:131 - Job remoteInterpretJob_1572971685889 started by scheduler org.apache.zeppelin.phoenix.PhoenixInterpreter717591913

2019-11-05 10:34:45,889  INFO [pool-2-thread-4] PhoenixInterpreter:192 - Run SQL command 'SELECT file_size_category  as "Size category",
       total_files         as "Total files",
       avg_file_size       as "Avg file size"
FROM (
     SELECT  CASE WHEN file_size_range_end <= 10000  THEN 'Tiny (0-10K)'
               WHEN file_size_range_end <= 1000000  THEN 'Mini (10K-1M)'
               WHEN file_size_range_end <= 30000000  THEN 'Small (1M-30M)'
               WHEN file_size_range_end <= 128000000  THEN 'Medium (30M-128M)'
               ELSE 'Large (128M+)'
          END as file_size_category,
          sum(file_count) as total_files,
          (sum(total_size) / sum(file_count)) as avg_file_size
     FROM ACTIVITY.HDFS_USER_FILE_SUMMARY
     WHERE analysis_date in ( SELECT MAX(analysis_date)
                              FROM ACTIVITY.HDFS_USER_FILE_SUMMARY)
     GROUP BY file_size_category
)'

2019-11-05 10:34:45,890  INFO [pool-2-thread-4] SchedulerFactory:137 - Job remoteInterpretJob_1572971685889 finished by scheduler org.apache.zeppelin.phoenix.PhoenixInterpreter717591913


==> activity-explorer.log <==

2019-11-05 10:34:45,891  WARN [pool-2-thread-31] NotebookServer:2067 - Job 20160728-152731_1797959357 is finished, status: ERROR, exception: null, result: %text ERROR 103 (08004): Unable to establish connection.

2019-11-05 10:34:45,909  INFO [pool-2-thread-31] SchedulerFactory:137 - Job paragraph_1490380022011_880344082 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session255064451

 

1 ACCEPTED SOLUTION

Contributor

After looking into this some more, we found the error trace below. It is logged only the first time a paragraph is run after the interpreter is restarted, which is why it didn't show up originally: the log above came from running a paragraph at some later point, not from the first run after a restart. The last exception in the chain is a NoClassDefFoundError for a WANdisco class. Once we made sure that class was accessible on the interpreter's classpath, everything started to work properly.

 

2019-11-06 10:24:48,850 ERROR [pool-2-thread-2] PhoenixInterpreter:108 - Cannot open connection
java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
        at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:386)
        at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:288)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:171)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1881)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1860)
        at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1860)
        at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
        at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:131)
        at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:247)
        at org.apache.zeppelin.phoenix.PhoenixInterpreter.open(PhoenixInterpreter.java:99)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
        at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:410)
        at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:319)
        at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
        at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:286)
        ... 22 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
        ... 27 more
Caused by: java.lang.NoClassDefFoundError: com/wandisco/shadow/com/google/protobuf/InvalidProtocolBufferException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2573)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2586)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
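
For anyone hitting a similar NoClassDefFoundError, one way to see which jars actually provide the missing class is to scan them directly. The following is only a minimal Python sketch, not the exact procedure we followed: the function name `find_class_in_jars` is hypothetical, and the directories to scan depend on where the jars live on your cluster.

```python
import os
import zipfile


def find_class_in_jars(class_path, roots):
    """Return sorted paths of jars under `roots` that contain `class_path`.

    `class_path` is the slash-separated name exactly as it appears in the
    NoClassDefFoundError message, without the ".class" suffix.
    """
    entry = class_path + ".class"
    matches = []
    for root in roots:
        # os.walk yields nothing for a missing directory, so bad roots are skipped.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if not name.endswith(".jar"):
                    continue
                jar_path = os.path.join(dirpath, name)
                try:
                    with zipfile.ZipFile(jar_path) as jar:
                        if entry in jar.namelist():
                            matches.append(jar_path)
                except (zipfile.BadZipFile, OSError):
                    # Skip unreadable or corrupt jars instead of aborting the scan.
                    continue
    return sorted(matches)
```

Calling it with `"com/wandisco/shadow/com/google/protobuf/InvalidProtocolBufferException"` and the directories you want to check returns the jars that contain the class. If a jar is found but is not on the interpreter's classpath, that matches the situation described above.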

 


3 REPLIES

Super Collaborator

The dashboards used by Activity Explorer use a Phoenix interpreter that is configured to connect to ams-hbase. Verify that AMS HBase is accessible using sqlline.py:

 

#export HBASE_CONF_PATH=/etc/ams-hbase/conf

#<PathTo>/sqlline.py <ams-fqdn>:61181:/ams-hbase-secure (if AMS is running in embedded mode)
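
The argument passed to sqlline.py above has the form `<quorum>:<port>:<znode parent>`, and the Phoenix JDBC driver takes the same triple prefixed with `jdbc:phoenix:`. As a small illustration of how the two strings relate, here is a Python sketch; the helper names `phoenix_target` and `phoenix_jdbc_url` are hypothetical, and the host name in the usage note is a placeholder.

```python
def phoenix_target(quorum, port, znode_parent):
    """Build the '<quorum>:<port>:<znode parent>' string that sqlline.py takes."""
    if not znode_parent.startswith("/"):
        raise ValueError("znode parent must start with '/'")
    return f"{quorum}:{port}:{znode_parent}"


def phoenix_jdbc_url(quorum, port, znode_parent):
    """Build the matching Phoenix JDBC URL for the same cluster."""
    return "jdbc:phoenix:" + phoenix_target(quorum, port, znode_parent)
```

For example, `phoenix_jdbc_url("ams.example.com", 61181, "/ams-hbase-secure")` gives the URL to compare against whatever the interpreter is configured with. If sqlline.py connects with one triple and the interpreter uses a different one, the two are not testing the same endpoint.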

Contributor

I've verified that I can access Phoenix through sqlline.py and psql.py using the configuration in /etc/ams-hbase/conf. Running as the activity-explorer user, I can execute the same queries that I'm trying to run through Zeppelin.

 

One thing of note with all this: we've changed the ZNode parent from ams-hbase-secure1 to ams-hbase-secure2. I've verified that the value in /etc/ams-hbase/conf/hbase-site.xml holds the new value, but the value in /etc/ams-metrics-collector/conf/hbase-site.xml is the old value and hasn't been updated recently. activity-env.sh points to /etc/ams-hbase/conf, so I believe this shouldn't be an issue, but it was a bit confusing when I first came across it.
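
For anyone wanting to compare the two files quickly, here is a minimal Python sketch that reads the `zookeeper.znode.parent` property from each hbase-site.xml. The function names are hypothetical, and it assumes the files use the usual Hadoop `<configuration><property><name/><value/></property></configuration>` layout.

```python
import xml.etree.ElementTree as ET


def read_znode_parent(path, prop_name="zookeeper.znode.parent"):
    """Return the value of `prop_name` in a Hadoop-style site file, or None."""
    root = ET.parse(path).getroot()
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == prop_name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def znode_parents_by_file(paths):
    """Map each hbase-site.xml path to the znode parent it declares."""
    return {path: read_znode_parent(path) for path in paths}
```

Passing `/etc/ams-hbase/conf/hbase-site.xml` and `/etc/ams-metrics-collector/conf/hbase-site.xml` to `znode_parents_by_file` shows both values side by side, which makes a stale copy easy to spot.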
