Member since: 06-02-2017
Posts: 39
Kudos Received: 4
Solutions: 3

My Accepted Solutions
Title | Views | Posted
---|---|---
| 774 | 09-04-2018 09:07 PM
| 8375 | 08-30-2018 04:57 PM
| 2290 | 08-14-2018 03:49 PM
04-13-2019
05:13 PM
I read https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/running-spark-applications/content/automating_spark_jobs_with_oozie_spark_action.html. However, I don't know where I should set "oozie.action.sharelib.for.spark.exclude=oozie/jackson.*". Should I set it in job.properties, or in oozie-site? If oozie-site, I would need to restart Oozie (this is not mentioned in the document) before running sharelibupdate. Thanks for any hints on this.

Also, I deleted all files under /user/oozie/share/lib/lib_{TS}/oozie/ (except oozie-hadoop-utils-hadoop-2-4.3.1.3.0.0.0-1634.jar and oozie-sharelib-oozie-4.3.1.3.0.0.0-1634.jar), which should have the same effect as oozie.action.sharelib.for.spark.exclude, but I still got the exception below:

Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.0
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:751)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
Looks like the conflict is not coming from the Oozie share lib. Where else could this conflict come from? Thanks again!
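For what it's worth, if the exclude belongs in job.properties, a minimal sketch could look like the following (the nameNode, jobTracker and workflow path values are placeholders I made up, not taken from this thread):

nameNode=hdfs://mynamenode.example.com:8020
jobTracker=myresourcemanager.example.com:8050
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/myuser/spark-workflow
# exclude the Jackson jars bundled in the oozie sharelib directory for this job's Spark action
oozie.action.sharelib.for.spark.exclude=oozie/jackson.*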
04-11-2019
07:38 AM
Hi, I am using Hortonworks HDP 3.0, which has Oozie 4.3.1 and Spark 2.3.1. My Spark job throws an intermittent dependency error:

2019-04-11 06:29:07,178 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.NoClassDefFoundError: Could not initialize class com.fasterxml.jackson.databind.SerializationConfig
java.lang.NoClassDefFoundError: Could not initialize class com.fasterxml.jackson.databind.SerializationConfig
at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:565)
at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:480)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.0
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:751)
at org.apache.spark.util.JsonProtocol$.<init>(JsonProtocol.scala:59)
at org.apache.spark.util.JsonProtocol$.<clinit>(JsonProtocol.scala)

The same Spark job running via spark-submit always passes. I have tried the settings below but had no luck:

<configuration>
<property>
<name>oozie.launcher.mapreduce.user.classpath.first</name>
<value>true</value>
</property>
<property>
<name>spark.yarn.user.classpath.first</name>
<value>true</value>
</property>
<property>
<name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
<value>true</value>
</property>
</configuration>

Why does the Spark action have this issue? Thanks for any clue.
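Not the author's fix, just a sketch: in Spark 2 the usual equivalents of the classpath-first knobs above are spark.driver.userClassPathFirst and spark.executor.userClassPathFirst, which could be passed through the workflow's spark action roughly like this (the action name, job name, class and jar path are placeholders):

<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>MySparkJob</name>
        <class>com.example.MyApp</class>
        <jar>${nameNode}/user/myuser/lib/my-app.jar</jar>
        <spark-opts>--conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>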
Labels: Apache Oozie
03-22-2019
04:24 PM
Hi,
I am using Hortonworks HDP 3.0, which has Zeppelin 0.8.0. I followed https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization_basic.html to install Helium visualization packages. Since my environment does not have internet access, I have to install the packages into a local registry. Here is what I did:
Step 1: added zeppelin.helium.localregistry.default in the custom zeppelin-site settings, pointing to a local helium folder on the Zeppelin master host, then restarted Zeppelin in Ambari.
Step 2: downloaded https://s3.amazonaws.com/helium-package/helium.json to the helium folder.
Step 3: created an npm package tarball, uploaded it to the Zeppelin master host, and unzipped it into the helium folder. I also updated the artifact to point to the local npm folder (e.g. from sogou-map-vis@1.0.0 to /u01/helium/sogou-map-vis).
However, Zeppelin's Helium page does not list any visualization packages.
The post below indicates it is possible to use Helium packages offline, but it does not work for me:
https://community.hortonworks.com/questions/223589/installing-helium-modules-in-zeppelin.html
I am not sure if using Helium offline is officially supported by Zeppelin.
Any clue is highly appreciated! Thanks.
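For comparison, the per-package descriptor that a local Helium registry reads is a small JSON file; a sketch based on the package named above (the description and icon values are made up, and the exact field set is my assumption from the Zeppelin 0.8 docs):

{
  "type": "VISUALIZATION",
  "name": "sogou-map-vis",
  "description": "Custom map visualization (placeholder description)",
  "artifact": "/u01/helium/sogou-map-vis",
  "license": "Apache-2.0",
  "icon": "<i class='fa fa-map'></i>"
}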
Tags: zeppelin
02-02-2019
07:37 PM
Thanks @Geoffrey Shelton Okot. We are using HDP 3 and Ambari 2.7. The Ambari automation process is a little different from the one in your reference. For example, your reference does not use a VDF, which is required by Hortonworks HDP 3. With a VDF, the process is:
1. I use the Ambari API to register the VDF JSON, which tells Ambari the location of an XML file containing the stack and repo info. In my case, it is a file using http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.0.1.0/HDP-3.0.1.0-187.xml as the template.
2. Ambari extracts the repo info (e.g. base URL, repo id) and generates /etc/yum.repos.d/ambari-hdp-1.repo, which Ambari uses to download the Hadoop yum packages.
As you can see, this process does not need to explicitly register the local repo using the HTTP PUT mentioned in your reference. Unfortunately, neither HDP-3.0.1.0-187.xml nor the Ambari source code related to VDF seems to support any option for skipping base URL verification. Also, verifyRepository and verifyRepositories in AmbariManagementControllerImpl.java do not skip SSL cert validation. I am not sure how verify_base_url=false can help Ambari access the HTTPS yum repo. Any clarification is appreciated.
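To be concrete about step 1, the registration call I have in mind looks roughly like this (host, credentials and the repo URL are placeholders; the endpoint is Ambari's standard version_definitions API):

curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d '{"VersionDefinition": {"version_url": "http://repo.example.com/HDP/centos7/3.x/updates/3.0.1.0/HDP-3.0.1.0-187.xml"}}' \
  http://ambari-server.example.com:8080/api/v1/version_definitions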
02-02-2019
04:18 PM
Thanks @Geoffrey Shelton Okot. "Skip
Repository Base URL Validation" seems to only exist in manual install
via ambari UI. I am using ambari blueprint to automate the installation.
Is this option available for automation? Thanks.
02-01-2019
06:39 PM
When using Ambari to install Hadoop, Ambari generates a yum config under /etc/yum.repos.d. We cannot use this yum config because:
1. Since our environment cannot access the internet, we put the Hadoop yum packages on our yum repo server, which is served over HTTPS. We cannot make the yum repo server use HTTP since that is against company policy. However, the yum config generated by Ambari does not seem to support HTTPS, so Ambari cannot access this HTTPS yum server.
2. When Ambari's yum config is generated under /etc/yum.repos.d, this faulty yum config masks the other valid yum configs, and Ambari cannot download the Hadoop packages using those valid configs.
Is there a way to stop Ambari from generating a yum config under /etc/yum.repos.d, so that Ambari can use the valid yum configs to download packages from our yum repo server? Thanks for any clue.
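For reference, the kind of manually managed repo file we would like Ambari to leave alone looks like this (repo id, URLs and key path are placeholders, not our real values):

[HDP-3.0-internal]
name=HDP 3.0 (internal mirror)
baseurl=https://yumrepo.example.com/HDP/centos7/3.x/updates/3.0.1.0
enabled=1
gpgcheck=1
gpgkey=https://yumrepo.example.com/keys/RPM-GPG-KEY-HDP
sslverify=1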
Labels: Apache Ambari
11-02-2018
07:03 PM
The command below failed on HDP 3. Any idea? Thanks a lot.

sudo -u hadoop hadoop distcp hdfs://namenode.mycluster.com:8020/tmp/txt hdfs://namenode.mycluster.com:8020/tmp/txt2
java.lang.NullPointerException
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.stop(TimelineV2ClientImpl.java:563)
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.serviceStop(TimelineV2ClientImpl.java:141)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:495)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1868)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1309)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:202)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691)
2018-11-02 18:57:15,753 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:178)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:177)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159)
... 14 more
2018-11-02 18:57:15,754 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
09-25-2018
08:45 PM
1 Kudo
@Hayati İbiş Please copy all jar files instead of only spark*.jar. Hope this helps. Thanks.
09-04-2018
09:07 PM
It is a Knox bug: https://issues.apache.org/jira/browse/KNOX-1424. After patching in the fix, the SQL interpreters show results.
08-31-2018
04:40 PM
I am using HDP 3.0 (which has Zeppelin 0.8.0). The queries below do not show any result, even though downloading the CSV shows the data correctly (e.g. if there are no tables, it shows the header).

%livy2.sql show tables
%spark2.sql show tables

Infrequently I saw the table show up for a short time and then disappear, so I guess it is an HTML rendering issue. This issue does not happen when I ssh to the Zeppelin host and use Zeppelin via port forwarding; it happens when I use Zeppelin via Knox. Any idea? Is any setting related to this? The SQL interpreters worked fine on HDP 2.6 with Zeppelin 0.7.3.

JavaScript error:

Error: [ngRepeat:iidexp] '_item_' in '_item_ in _collection_'
should be an identifier or '(_key_, _value_)' expression, but got
'/gateway/ui/zeppelin/app'.
http://errors.angularjs.org/1.5.7/ngRepeat/iidexp?p0=%2Fgateway%2Fui%2Fzeppelin%2Fapp
b/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:376
Kg</<.compile@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:40:30580
Z@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:4755
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30480
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30611
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30611
N@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:29413
X/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:440
d@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30833
m@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:909
mg</<.link/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:40:17582
xc/this.$get</o.prototype.$digest@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:38:10966
a/n.prototype.safeDigest@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:1460
b@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:3748
a/n.prototype._onMessageHandler@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:3960
R/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:5633
" <!-- ngRepeat: /gateway/ui/zeppelin/app in apps --> Appreciate any clue.
08-30-2018
04:57 PM
2 Kudos
Per https://spark.apache.org/docs/latest/building-spark.html, Spark 2.3.1 is built with Hadoop 2.6.x by default. This is why my fat jar includes Hadoop 2.6.5 jars (instead of 3.1.0). HftpFileSystem has been removed in Hadoop 3. I need Spark 2.3.1 jars that are built with Hadoop 3.1. On https://spark.apache.org/downloads.html, I only see Spark 2.3.1 built with Hadoop 2.7. Where can I get Spark 2.3.1 built with Hadoop 3? Does Spark 2.3.1 support Hadoop 3?
Appreciate your help. [UPDATE] I solved this issue by using spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster. Thanks.
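One related precaution, assuming an sbt build (my assumption; the thread does not say which build tool is used): marking Spark as provided keeps Spark's transitive Hadoop 2.6.x jars out of the fat jar, so the cluster's jars under /usr/hdp/current are the ones picked up at runtime.

// build.sbt (sketch): Spark is supplied by the cluster, so it is not bundled into the fat jar
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.3.1" % "provided"
)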
08-30-2018
03:42 PM
Hi, my Spark structured streaming jobs that worked on HDP 2.6 failed on HDP 3.0:

java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3268)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3313)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:46)
at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.readFile(JsonDataSource.scala:125)
at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:130)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:216)
at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:108)
at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:101)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I did not find useful info online. Any clue is appreciated.
08-28-2018
04:21 PM
Thanks guys. Rolling back to Python 2.7 made the Livy server start successfully.
08-24-2018
08:58 PM
1 Kudo
The domain name used by the Hadoop hosts and the one used by the load balancer are different. The DEFAULT setting uses the load balancer's domain to construct the whitelist filter. I needed to update the whitelist filter to use the Hadoop hosts' domain name instead. Hope this helps.
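To illustrate, a custom whitelist along these lines can replace DEFAULT (the regexp and domain are placeholders, not the values I actually used; the property names are the ones quoted in the WebHDFS question further down this page):

gateway.dispatch.whitelist=^https?:\/\/(.+\.)?hadoop\.example\.com(:[0-9]+)?\/.*$
gateway.dispatch.whitelist.services=DATANODE,HBASEUI,HDFSUI,JOBHISTORYUI,NODEUI,RESOURCEMANAGER,WEBHBASE,WEBHDFS,YARNUI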
08-24-2018
08:54 PM
Hi, I am using HDP 3.0 and cannot start the Livy server:

18/08/24 20:24:39 WARN LivyConf: The configuration key livy.repl.enableHiveContext has been deprecated as of Livy 0.4 and may be removed in the future. Please use the new key livy.repl.enable-hive-context instead.
18/08/24 20:24:39 WARN LivyConf: The configuration key livy.server.csrf_protection.enabled has been deprecated as of Livy 0.4 and may be removed in the future. Please use the new key livy.server.csrf-protection.enabled instead.
18/08/24 20:24:39 INFO AccessManager: AccessControlManager acls disabled;users with view permission: ;users with modify permission: ;users with super permission: ;other allowed users: *
18/08/24 20:24:39 INFO LineBufferedStream: stdout: File "/usr/bin/hdp-select", line 251
18/08/24 20:24:39 INFO LineBufferedStream: stdout: print "ERROR: Invalid package - " + name
18/08/24 20:24:39 INFO LineBufferedStream: stdout: ^
18/08/24 20:24:39 INFO LineBufferedStream: stdout: SyntaxError: Missing parentheses in call to 'print'. Did you mean print("ERROR: Invalid package - " + name)?
18/08/24 20:24:39 INFO LineBufferedStream: stdout: ls: cannot access /usr/hdp//hadoop/lib: No such file or directory
18/08/24 20:24:39 INFO LineBufferedStream: stdout: Exception in thread "main" java.lang.IllegalStateException: hdp.version is not set while running Spark under HDP, please set through HDP_VERSION in spark-env.sh or add a java-opts file in conf with -Dhdp.version=xxx
18/08/24 20:24:39 INFO LineBufferedStream: stdout: at org.apache.spark.launcher.Main.main(Main.java:118)
Exception in thread "main" java.lang.IllegalArgumentException: Fail to parse Spark version from is not set while running Spark under HDP, please set through HDP_VERSION in spark-env.sh or add a java-opts file in conf with -Dhdp.version=xxx
at org.apache.livy.utils.LivySparkUtils$.formatSparkVersion(LivySparkUtils.scala:155)
at org.apache.livy.utils.LivySparkUtils$.testSparkVersion(LivySparkUtils.scala:82)
at org.apache.livy.server.LivyServer.start(LivyServer.scala:74)
at org.apache.livy.server.LivyServer$.main(LivyServer.scala:339)
at org.apache.livy.server.LivyServer.main(LivyServer.scala)

The error looks like Livy tried to use Python 3 to parse Python 2 syntax. Any idea why this happens? Thanks.
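A quick sanity check of that theory could look like this (a sketch; /usr/bin/hdp-select is the script named in the log, the rest is generic shell):

head -1 /usr/bin/hdp-select    # shows the shebang the script expects (Python 2 style code)
python --version               # shows which interpreter "python" resolves to on this host
command -v python2 python3     # shows which interpreters are installed and where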
Labels: Hortonworks Data Platform (HDP)
08-23-2018
11:51 PM
Thanks guys. I got the whitelist filter as mentioned by @Phil Zampino and updated it to my needs. Then Knox allowed my requests.
08-21-2018
07:36 PM
Hi, I am using HDP 3.0 and an Ambari 2.7 blueprint. WebHDFS via Knox failed due to:

2018-08-21 19:26:33,035 ERROR knox.gateway (GatewayDispatchFilter.java:isDispatchAllowed(155)) - The dispatch to http://myhost.com:50070/webhdfs/v1/user was disallowed because it fails the dispatch whitelist validation. See documentation for dispatch whitelisting.

I have verified that WebHDFS without Knox works:

curl -vvv http://myhost.com:50070/webhdfs/v1/user/?op=LISTSTATUS

Also, the Ambari, Zeppelin and Ranger UIs work fine via Knox. The Knox settings are:

gateway.dispatch.whitelist: DEFAULT
gateway.dispatch.whitelist.services: DATANODE,HBASEUI,HDFSUI,JOBHISTORYUI,NODEUI,RESOURCEMANAGER,WEBHBASE,WEBHDFS,YARNUI

WebHDFS via Knox worked for me on HDP 2.6. Any idea? Appreciate any help.
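For completeness, the failing call through Knox looks roughly like this (topology name, credentials and gateway host are placeholders):

curl -ku myuser:mypassword "https://knoxhost.example.com:8443/gateway/default/webhdfs/v1/user/?op=LISTSTATUS"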
08-16-2018
03:49 AM
It turned out hbase_java_io_tmpdir is enough. The HBase region server's failure to start was because our security setup disabled the yarn-ats user, whose uid was outside the allowed range. After creating this user with a uid in the desired range, the TS reader v2 worked.
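In case it helps someone else, creating the user with an explicit uid is a one-liner (the uid and group below are placeholders; pick values that satisfy your own policy):

sudo useradd -u 30000 -g hadoop yarn-ats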
08-15-2018
05:10 PM
Thanks @Jay Kumar SenSharma. This setting worked:

"yarn-hbase-env": {
  "properties": {
    "hbase_java_io_tmpdir": "/u01/tmp"
  }
}

This setting made the TS reader v2 start. Progress! However, after some time it stopped again due to the issue described in this post. The reason is that the YARN HBase region server cannot start (though the YARN HBase master starts). I guess it is still related to the fact that the region server is using /tmp. In the Ambari UI, I cannot find a place to set yarn_hbase_java_io_tmpdir. Any idea?
08-14-2018
11:47 PM
Looks like this is also caused by a noexec /tmp, like this post:
Suppressed: java.lang.UnsatisfiedLinkError: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644869269367588546881.so: failed to map segment from shared object: Operation not permitted
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:263)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:255)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
... 27 more
Not sure what setting can be used to make YARN not use netty.
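For anyone hitting the same thing, confirming the noexec mount is straightforward (generic shell, nothing HDP-specific):

findmnt -no OPTIONS /tmp     # look for "noexec" in the mount options
# or: mount | grep ' /tmp '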
08-14-2018
07:06 PM
Hi, I am using HDP 3.0 and an Ambari 2.7 blueprint to install my cluster. YARN's timeline service v2 reader cannot create the ZooKeeper node:

at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:168)
at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:323)
at java.lang.Thread.run(Thread.java:745)
2018-08-14 17:59:55,827 INFO [main] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4142 ms ago, cancelled=false, msg=org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /atsv2-hbase-unsecure/meta-region-server, details=row 'prod.timelineservice.entity' on table 'hbase:meta' at null

I observed that YARN is using timeline service 1.5 and the timeline service reader v2. I am not sure if this is expected. My blueprint is using:

{
  "name": "APP_TIMELINE_SERVER"
},

Adding the HBase client per this post did not help. Any idea? Thanks.
Labels: Hortonworks Data Platform (HDP)
08-14-2018
03:49 PM
This problem is resolved by adding hbase.netty.nativetransport = false. Do you foresee any issue using this fix? @Akhil S Naik Thanks.

[UPDATE 8/15/2018] This setting works and may be better than the one above:

{
  "hbase-env": {
    "properties": {
      "hbase_java_io_tmpdir": "/u01/tmp"
    }
  }
},
08-14-2018
04:08 AM
Thanks Akhil S Naik. OS: Oracle Linux 7; memory: 314 GB; deployment tool: Ambari 2.7 blueprint. Is HDP 3.0 using a netty build that has bug 6678 fixed? 6678 seems to be just a warning, but it failed HBase in my case. Any solution? Thanks.
08-13-2018
11:33 PM
Hi, I installed a Hadoop cluster using HDP 3.0, but HBase does not work due to this error:

Caused by: java.lang.UnsatisfiedLinkError: failed to load the required native library
at org.apache.hbase.thirdparty.io.netty.channel.epoll.Epoll.ensureAvailability(Epoll.java:81)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.<clinit>(EpollEventLoop.java:55)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:134)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:35)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:58)
at org.apache.hbase.thirdparty.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:47)
at org.apache.hbase.thirdparty.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:104)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:91)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:68)
at org.apache.hadoop.hbase.util.NettyEventLoopGroupConfig.<init>(NettyEventLoopGroupConfig.java:61)
at org.apache.hadoop.hbase.regionserver.HRegionServer.setupNetty(HRegionServer.java:673)
at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:532)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:472)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2923)
... 5 more
Caused by: java.lang.UnsatisfiedLinkError: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644257058602762792223.so: /tmp/liborg_apache_hbase_thirdparty_netty_transport_native_epoll_x86_644257058602762792223.so: failed to map segment from shared object: Operation not permitted
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:243)
at org.apache.hbase.thirdparty.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:187)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:207)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.Native.<clinit>(Native.java:65)
at org.apache.hbase.thirdparty.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:33)
... 24 more
Looks like it is caused by the noexec permission on /tmp. To make HBase not use /tmp, I added:

{
  "hbase-site": {
    "properties": {
      "hbase.tmp.dir": "/u01/tmp"
    }
  }
}

But this does not work and HBase still uses /tmp. Any idea? Appreciate any help.
08-08-2018
08:42 PM
By reading the Ambari source code, the problem is solved. I need to specify the following in the blueprint:
"cluster-env": {
"properties": {
"dfs_ha_initial_namenode_active": "%HOSTGROUP::master_host_group%",
"dfs_ha_initial_namenode_standby": "%HOSTGROUP::master2_host_group%"
}
}
08-01-2018
03:51 PM
Geoffrey, a blueprint for HDP 3.0. Thanks.
07-31-2018
11:16 PM
"see here" link is unavailable now. Could you please have an updated link? @Jonas Straub I am installing HDP3.0 HA using ambari 2.7 blueprint. The namenodes failed to start due to error "NameNode is not formatted." This did not happen to HDP 2.6 using almost the same blueprint. Look like ambari rollout failed to distribute the metadata. Is there way to get the root cause? The log did not help much. Thanks.
07-05-2018
06:28 AM
Hi, Happy July 4th! I am using HDP 2.6 with Knox + Ranger. I have tried multiple solutions for setting up Knox authentication but have had no luck. This post is to get some clues that will get me unblocked.

Solution 1: LDAP. This does not work because our IT disables LDAP on production hosts by following the CIS controls (https://www.cisecurity.org/controls/). AD is not applicable since I am not using Microsoft technology.

Solution 2: OS auth. Ranger can sync the local users from a specified host. However, Knox causes trouble. For Knox, I tried:
a) Use pam_unix.so to authenticate the input username/password. Unfortunately, pam_unix.so uses unix_chkpwd to read the password from /etc/shadow. Our production hosts use 000 permissions on /etc/shadow, which cannot be changed. This means unix_chkpwd needs to run as root to be able to access /etc/shadow. However, gateway.sh cannot be run as root, and setting the setuid and setgid bits on gateway.sh did not help either.
b) Use SSSD. Among the supported id providers (ldap, ipa, ad, proxy, local), only local is close to the OS auth used by Ranger. Still, it is not appropriate because:
reason 1: it is not designed for production. See https://serverfault.com/questions/826848/sss-useradd-vs-useradd-with-sssd.
reason 2: the local provider does not use /etc/passwd and /etc/shadow. To sync the local users on a host into SSSD, I would need to use the SSSD tools to explicitly copy the local users into SSSD's database, which is not ideal, and those synced users in the SSSD database may not work for Ranger.
I did not go deep into the proxy provider since the documentation says it is for legacy NSS providers, which is a red flag.

Any clue for a working Knox authentication setup on our production hosts? Appreciate it very much!
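For reference, Knox's PAM support is normally wired through the ShiroProvider in the topology rather than by invoking pam_unix from gateway.sh directly; a sketch of such a provider block follows (the realm class name and the PAM service name "login" are my assumptions for the Knox version shipped with HDP 2.6, and this path still runs into the /etc/shadow permission issue described above):

<provider>
    <role>authentication</role>
    <name>ShiroProvider</name>
    <enabled>true</enabled>
    <param>
        <name>sessionTimeout</name>
        <value>30</value>
    </param>
    <param>
        <name>main.pamRealm</name>
        <value>org.apache.hadoop.gateway.shirorealm.KnoxPamRealm</value>
    </param>
    <param>
        <name>main.pamRealm.service</name>
        <value>login</value>
    </param>
</provider>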
Tags: knox-gateway, Security
Labels: Apache Knox
05-04-2018
11:09 PM
Got a reply in another thread: "To allow end users setting 'udp_preference_limit' to 1 we implemented the following change: https://github.com/apache/ambari/pull/129. This will be available in 2.7."
05-04-2018
10:26 PM
Hi, according to https://community.hortonworks.com/content/supportkb/149955/errorcould-not-configure-server-because-sasl-confi.html, I need "udp_preference_limit = 1" for ZooKeeper to work with Kerberos. How can I set this property using the krb5-conf section in an Ambari blueprint? Thanks for any hint.
Labels: Apache Ambari