Member since
06-02-2017
39
Posts
4
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
917 | 09-04-2018 09:07 PM | |
9371 | 08-30-2018 04:57 PM | |
2905 | 08-14-2018 03:49 PM |
04-13-2019
05:13 PM
I read https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/running-spark-applications/content/automating_spark_jobs_with_oozie_spark_action.html However, I don't know where should I set "oozie.action.sharelib.for.spark.exclude=oozie/jackson.*=". Should I set in job.properties? or oozie-site? If oozie-site, I need restart oozie (it is not mentioned in the document) before running sharelibupdate. Thanks for any hints on this. Also, I deleted all files under /user/oozie/share/lib/lib_{TS}/oozie/ (except oozie-hadoop-utils-hadoop-2-4.3.1.3.0.0.0-1634.jar and oozie-sharelib-oozie-4.3.1.3.0.0.0-1634.jar) which should have the same effect of oozie.action.sharelib.for.spark.exclude, but I still got below exception: Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.0
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:751)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
Looks like the conflict is not coming from oozie share lib. Where this conflict could come from? Thanks again!
... View more
04-11-2019
07:38 AM
Hi, I am using hortonwork HDP3.0 which has oozie 4.3.1 and spark 2.3.1. My spark job throws intermittent dependency error: 2019-04-11 06:29:07,178 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.NoClassDefFoundError: Could not initialize class com.fasterxml.jackson.databind.SerializationConfig java.lang.NoClassDefFoundError: Could not initialize class com.fasterxml.jackson.databind.SerializationConfig at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:565) at com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:480) Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.0 at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64) at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19) at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:751) at org.apache.spark.util.JsonProtocol$.<init>(JsonProtocol.scala:59) at org.apache.spark.util.JsonProtocol$.<clinit>(JsonProtocol.scala) The same spark job running via spark-submit always pass. I have tried below settings but no luck: <configuration>
<property>
<name>oozie.launcher.mapreduce.user.classpath.first</name>
<value>true</value>
</property>
<property>
<name>spark.yarn.user.classpath.first</name>
<value>true</value>
</property>
<property>
<name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
<value>true</value>
</property>
</configuration> Why spark action has such issue? Thanks for any clue.
... View more
Labels:
- Labels:
-
Apache Oozie
03-22-2019
04:24 PM
Hi,
I am using Horton work HDP3.0 which has zeppelin 0.8.0. I followed https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization_basic.html to install helium viz packages. Since my environment does not have internet access, I have to install packages into local registry. Here is what I did:
step 1. added zeppelin.helium.localregistry.default in custom zeppelin-site settings, pointing to a local helium folder on the zeppelin master host. Restart zeppelin in ambari.
step 2. download https://s3.amazonaws.com/helium-package/helium.json to the helium folder.
step 3. create a npm package tarball, upload the zeppelin master host, unzip into the helium folder. I also updated the artifact to point to local npm folder (e.g. from sogou-map-vis@1.0.0 to /u01/helium/sogou-map-vis)
However, zeppelin's helium page does not list any viz packages.
Below post all indicate it is possible to use helium packages offline but it does not work for me:
https://community.hortonworks.com/questions/223589/installing-helium-modules-in-zeppelin.html
I am not sure if using helium offline is officially supported by zeppelin.
Any clue is highly appreciated! Thanks.
... View more
Labels:
11-02-2018
07:03 PM
Below command failed on HDP3. Any idea? Thanks a lot. sudo -u hadoop hadoop distcp hdfs://namenode.mycluster.com:8020/tmp/txt hdfs://namenode.mycluster.com:8020/tmp/txt2
java.lang.NullPointerException
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.stop(TimelineV2ClientImpl.java:563)
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.serviceStop(TimelineV2ClientImpl.java:141)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:495)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:158)
at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1868)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1309)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:202)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691)
2018-11-02 18:57:15,753 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:178)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:177)
at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159)
... 14 more
2018-11-02 18:57:15,754 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
... View more
Labels:
09-25-2018
08:45 PM
1 Kudo
@Hayati İbiş Please copy all jar files instead of only spark*.jar. Hope this helps. Thanks.
... View more
09-04-2018
09:07 PM
It is a knox bug https://issues.apache.org/jira/browse/KNOX-1424. After patching the fix, SQL interpreter show results.
... View more
08-31-2018
04:40 PM
I am using HDP3.0 (having zeppelin 0.8.0). Below
queries do not show any result, even though downloading the csv show the
data correctly (e.g. if there is no tables, show the header). %livy2.sqlshow tables %spark2.sql show tables Infrequently I saw the table show a short time and then disappear. I guess it is html rendering issue. This issue does not happen when I ssh to the zeppelin host and use
zeppelin via port forwarding. It happens when I use zeppelin via knox. Any idea? Any setting is related to this? Sql interpreters worked fine in HDP2.6 using zeppelin 0.7.3. javascript error: Error: [ngRepeat:iidexp] '_item_' in '_item_ in _collection_'
should be an identifier or '(_key_, _value_)' expression, but got
'/gateway/ui/zeppelin/app'.
http://errors.angularjs.org/1.5.7/ngRepeat/iidexp?p0=%2Fgateway%2Fui%2Fzeppelin%2Fapp
b/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:376
Kg</<.compile@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:40:30580
Z@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:4755
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30480
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30611
S@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30611
N@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:29413
X/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:440
d@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:30833
m@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:37:909
mg</<.link/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:40:17582
xc/this.$get</o.prototype.$digest@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:38:10966
a/n.prototype.safeDigest@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:1460
b@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:3748
a/n.prototype._onMessageHandler@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:76:3960
R/<@https://mydomain.com/gateway/ui/zeppelin/scripts/vendor.49d751b0c72342f6.js:36:5633
" <!-- ngRepeat: /gateway/ui/zeppelin/app in apps --> Appreciate any clue.
... View more
Labels:
08-30-2018
04:57 PM
2 Kudos
Per https://spark.apache.org/docs/latest/building-spark.html, spark 2.3.1 is built with hadoop 2.6.X by default. This is why I see my fat jar includes hadoop 2.6.5 (instead of 3.1.0) jars. HftpFileSystem has been removed in hadoop 3. I need spark 2.3.1 jars that built with hadoop 3.1. On https://spark.apache.org/downloads.html, I only see spark 2.3.1 built with hadoop 2.7. Where can I get spark 2.3.1 built with hadoop 3? Does spark 2.3.1 support hadoop 3?
Appreciate your help. [UPDATE] I solved this issue by using spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster. Thanks.
... View more
08-30-2018
03:42 PM
Hi, My spark structured streaming jobs working in HDP2.6 failed in HDP3.0: java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3268)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3313)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:46)
at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.readFile(JsonDataSource.scala:125)
at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:130)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:216)
at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:108)
at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:101)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745) I did not find useful info online. Any clue is appreciated.
... View more
Labels:
08-28-2018
04:21 PM
Thanks guys. Rolling back to python 2.7 made livy server succeed to start.
... View more