Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Sandbox HDP 2.5.0 - Spark 1.6.2 - Issues: GPLNativeCodeLoader: Could not load native gpl library - LzoCodec: Cannot load native-lzo without native-hadoop

Rising Star

On Sandbox HDP-2.5.0 TP with Spark 1.6.2, I am encountering the errors "ERROR GPLNativeCodeLoader: Could not load native gpl library" and "ERROR LzoCodec: Cannot load native-lzo without native-hadoop" while running a simple word count in spark-shell.

[root@sandbox ~]# cd $SPARK_HOME

[root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

The following code is submitted in the Spark shell:

  val file = sc.textFile("/tmp/data")
  val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
  counts.saveAsTextFile("/tmp/wordcount")

This yields the following error:

ERROR GPLNativeCodeLoader: Could not load native gpl library

ERROR LzoCodec: Cannot load native-lzo without native-hadoop

The same errors appear with or without the --jars parameter shown below:

--jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

Full Log:

[root@sandbox ~]# cd $SPARK_HOME
[root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
16/08/27 16:28:23 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:23 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:23 INFO HttpServer: Starting HTTP Server
16/08/27 16:28:23 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:43011
16/08/27 16:28:23 INFO Utils: Successfully started service 'HTTP class server' on port 43011.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.2
      /_/
Using Scala version 2.10.5 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
16/08/27 16:28:26 INFO SparkContext: Running Spark version 1.6.2
16/08/27 16:28:26 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:26 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:26 INFO Utils: Successfully started service 'sparkDriver' on port 45506.
16/08/27 16:28:27 INFO Slf4jLogger: Slf4jLogger started
16/08/27 16:28:27 INFO Remoting: Starting remoting
16/08/27 16:28:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.2.15:44829]
16/08/27 16:28:27 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 44829.
16/08/27 16:28:27 INFO SparkEnv: Registering MapOutputTracker
16/08/27 16:28:27 INFO SparkEnv: Registering BlockManagerMaster
16/08/27 16:28:27 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-0776b175-5dd7-49b9-adf7-f2cbd85a1e1b
16/08/27 16:28:27 INFO MemoryStore: MemoryStore started with capacity 143.6 MB
16/08/27 16:28:27 INFO SparkEnv: Registering OutputCommitCoordinator
16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:27 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/08/27 16:28:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/08/27 16:28:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040
16/08/27 16:28:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/httpd-857ce699-7db0-428c-9af5-1dca4ec5330d
16/08/27 16:28:27 INFO HttpServer: Starting HTTP Server
16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/27 16:28:27 INFO AbstractConnector: Started SocketConnector@0.0.0.0:37515
16/08/27 16:28:27 INFO Utils: Successfully started service 'HTTP file server' on port 37515.
16/08/27 16:28:27 INFO SparkContext: Added JAR file:/usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar at http://10.0.2.15:37515/jars/hadoop-lzo-0.6.0.2.5.0.0-817.jar with timestamp 1472315307772
spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
16/08/27 16:28:28 INFO TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/08/27 16:28:28 INFO RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/27 16:28:28 INFO Client: Requesting a new application from cluster with 1 NodeManagers
16/08/27 16:28:28 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2250 MB per container)
16/08/27 16:28:28 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/08/27 16:28:28 INFO Client: Setting up container launch context for our AM
16/08/27 16:28:28 INFO Client: Setting up the launch environment for our AM container
16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:28 INFO Client: Preparing resources for our AM container
16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:28 INFO Client: Source and destination file systems are the same. Not copying hdfs://sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
16/08/27 16:28:29 INFO Client: Uploading resource file:/tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/__spark_conf__5084804354575467223.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472312154461_0006/__spark_conf__5084804354575467223.zip
16/08/27 16:28:29 INFO SecurityManager: Changing view acls to: root
16/08/27 16:28:29 INFO SecurityManager: Changing modify acls to: root
16/08/27 16:28:29 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/27 16:28:29 INFO Client: Submitting application 6 to ResourceManager
16/08/27 16:28:29 INFO YarnClientImpl: Submitted application application_1472312154461_0006
16/08/27 16:28:29 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1472312154461_0006 and attemptId None
16/08/27 16:28:30 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
16/08/27 16:28:30 INFO Client:
    client token: N/A
    diagnostics: AM container is launched, waiting for AM container to Register with RM
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1472315309252
    final status: UNDEFINED
    tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
    user: root
16/08/27 16:28:31 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
16/08/27 16:28:32 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sandbox.hortonworks.com, PROXY_URI_BASES -> http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006), /proxy/application_1472312154461_0006
16/08/27 16:28:32 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/27 16:28:32 INFO Client: Application report for application_1472312154461_0006 (state: RUNNING)
16/08/27 16:28:32 INFO Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 10.0.2.15
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1472315309252
    final status: UNDEFINED
    tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
    user: root
16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Application application_1472312154461_0006 has started running.
16/08/27 16:28:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34124.
16/08/27 16:28:32 INFO NettyBlockTransferService: Server created on 34124
16/08/27 16:28:32 INFO BlockManagerMaster: Trying to register BlockManager
16/08/27 16:28:32 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:34124 with 143.6 MB RAM, BlockManagerId(driver, 10.0.2.15, 34124)
16/08/27 16:28:32 INFO BlockManagerMaster: Registered BlockManager
16/08/27 16:28:32 INFO EventLoggingListener: Logging events to hdfs:///spark-history/application_1472312154461_0006
16/08/27 16:28:36 INFO YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (sandbox.hortonworks.com:39728) with ID 1
16/08/27 16:28:36 INFO BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:38362 with 143.6 MB RAM, BlockManagerId(1, sandbox.hortonworks.com, 38362)
16/08/27 16:28:57 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/08/27 16:28:57 INFO SparkILoop: Created spark context..
Spark context available as sc.
16/08/27 16:28:58 INFO HiveContext: Initializing execution hive, version 1.2.1
16/08/27 16:28:58 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
16/08/27 16:28:58 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-817
16/08/27 16:28:58 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/08/27 16:28:58 INFO ObjectStore: ObjectStore, initialize called
16/08/27 16:28:58 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/08/27 16:28:58 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/08/27 16:29:00 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:02 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/08/27 16:29:02 INFO ObjectStore: Initialized ObjectStore
16/08/27 16:29:02 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/08/27 16:29:02 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/08/27 16:29:03 INFO HiveMetaStore: Added admin role in metastore
16/08/27 16:29:03 INFO HiveMetaStore: Added public role in metastore
16/08/27 16:29:03 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/08/27 16:29:03 INFO HiveMetaStore: 0: get_all_databases
16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/08/27 16:29:03 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/08/27 16:29:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec_resources
16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec/_tmp_space.db
16/08/27 16:29:03 INFO HiveContext: default warehouse location is /user/hive/warehouse
16/08/27 16:29:03 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/08/27 16:29:03 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
16/08/27 16:29:03 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-817
16/08/27 16:29:04 INFO metastore: Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
16/08/27 16:29:04 INFO metastore: Connected to metastore.
16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/83a1e2d3-8c24-4f12-9841-fab259a77514_resources
16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514
16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/root/83a1e2d3-8c24-4f12-9841-fab259a77514
16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514/_tmp_space.db
16/08/27 16:29:04 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
scala> val file = sc.textFile("/tmp/data")
16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 234.8 KB, free 234.8 KB)
16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.1 KB, free 262.9 KB)
16/08/27 16:29:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:34124 (size: 28.1 KB, free: 143.6 MB)
16/08/27 16:29:20 INFO SparkContext: Created broadcast 0 from textFile at <console>:27
file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:27
scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
16/08/27 16:29:35 ERROR GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1889)
    at java.lang.Runtime.loadLibrary0(Runtime.java:849)
    at java.lang.System.loadLibrary(System.java:1088)
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:278)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2112)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179)
    at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
    at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:189)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
    at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:323)
    at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:330)
    at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:29)
    at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:34)
    at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:36)
    at $line19.$read$iwC$iwC$iwC$iwC$iwC.<init>(<console>:38)
    at $line19.$read$iwC$iwC$iwC$iwC.<init>(<console>:40)
    at $line19.$read$iwC$iwC$iwC.<init>(<console>:42)
    at $line19.$read$iwC$iwC.<init>(<console>:44)
    at $line19.$read$iwC.<init>(<console>:46)
    at $line19.$read.<init>(<console>:48)
    at $line19.$read$.<init>(<console>:52)
    at $line19.$read$.<clinit>(<console>)
    at $line19.$eval$.<init>(<console>:7)
    at $line19.$eval$.<clinit>(<console>)
    at $line19.$eval.$print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$loop(SparkILoop.scala:670)
    at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply$mcZ$sp(SparkILoop.scala:997)
    at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:945)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$process(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/27 16:29:35 ERROR LzoCodec: Cannot load native-lzo without native-hadoop
16/08/27 16:29:35 INFO FileInputFormat: Total input paths to process : 1
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:29
scala>

Please help to fix this issue.

1 ACCEPTED SOLUTION

Rising Star

Resolved, using Spark 2.0.0.

Resolution for the spark-submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/ with the following content:

  [root@sandbox conf]# cat java-opts
  -Dhdp.version=2.5.0.0-817
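
The java-opts step could be scripted along the following lines. This is only a sketch: the function name write_java_opts is made up for illustration, and its defaults simply repeat the conf directory and HDP version quoted in this post, so substitute the values for your own install.

```shell
# Hypothetical helper: write a java-opts file holding the HDP version flag.
# $1 = Spark conf directory (default: the path used in this post)
# $2 = HDP version string   (default: the version used in this post)
write_java_opts() {
  conf_dir="${1:-/usr/hdp/current/spark2-client/conf}"
  hdp_version="${2:-2.5.0.0-817}"
  # Overwrites any existing java-opts file in that directory.
  printf '%s\n' "-Dhdp.version=$hdp_version" > "$conf_dir/java-opts"
}
```

Calling write_java_opts with no arguments targets the sandbox paths above. Note that the sketch overwrites any java-opts file already present.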

Spark Submit working example:

[root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
16/08/29 17:44:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/29 17:44:58 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/29 17:44:58 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/29 17:44:58 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/08/29 17:44:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
16/08/29 17:44:58 INFO yarn.Client: Will allocate AM container, with 2248 MB memory including 200 MB overhead
16/08/29 17:44:58 INFO yarn.Client: Setting up container launch context for our AM
16/08/29 17:44:58 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/29 17:44:58 INFO yarn.Client: Preparing resources for our AM container
16/08/29 17:44:58 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/08/29 17:45:00 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
16/08/29 17:45:01 INFO yarn.Client: Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
16/08/29 17:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls to: root
16/08/29 17:45:01 INFO spark.SecurityManager: Changing view acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: Changing modify acls groups to:
16/08/29 17:45:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
16/08/29 17:45:01 INFO yarn.Client: Submitting application application_1472397144295_0006 to ResourceManager
16/08/29 17:45:01 INFO impl.YarnClientImpl: Submitted application application_1472397144295_0006
16/08/29 17:45:02 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:02 INFO yarn.Client:
    client token: N/A
    diagnostics: AM container is launched, waiting for AM container to Register with RM
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1472492701409
    final status: UNDEFINED
    tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
    user: root
16/08/29 17:45:03 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:04 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:05 INFO yarn.Client: Application report for application_1472397144295_0006 (state: ACCEPTED)
16/08/29 17:45:06 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:06 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 10.0.2.15
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1472492701409
    final status: UNDEFINED
    tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
    user: root
16/08/29 17:45:07 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:08 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:09 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:10 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:11 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:12 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:13 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:14 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:15 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:16 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:17 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:18 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:19 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:20 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:21 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:22 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:23 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:24 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:25 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:26 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:27 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:28 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:29 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:30 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:31 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:32 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:33 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:34 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:35 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:36 INFO yarn.Client: Application report for application_1472397144295_0006 (state: RUNNING)
16/08/29 17:45:37 INFO yarn.Client: Application report for application_1472397144295_0006 (state: FINISHED)
16/08/29 17:45:37 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 10.0.2.15
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1472492701409
    final status: SUCCEEDED
    tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
    user: root
16/08/29 17:45:37 INFO util.ShutdownHookManager: Shutdown hook called
16/08/29 17:45:37 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
[root@sandbox spark2-client]#

Resolution for the Spark Shell issue (lzo-codec): add the following two lines to spark-defaults.conf:

  spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
  spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
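
For anyone who prefers to script this change rather than edit the file by hand, here is a minimal sketch. It assumes the file uses the whitespace-separated "key value" form shown above (lines written as key=value are not detected), and set_spark_prop is a hypothetical helper name, not something shipped with Spark or HDP. The conf path in the usage comments is also an assumption based on the spark2-client directory used in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or HDP): set "key value" in a
# spark-defaults.conf-style file, replacing any existing line for that key.
set_spark_prop() {
  local conf="$1" key="$2" value="$3"
  local tmp="${conf}.tmp.$$"
  touch "$conf"
  # Keep every line whose first whitespace-separated field is not the key.
  awk -v k="$key" '$1 != k' "$conf" > "$tmp"
  # Append the key with its new value.
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Example usage (assumed conf location; adjust to your client directory):
#   CONF=/usr/hdp/current/spark2-client/conf/spark-defaults.conf
#   set_spark_prop "$CONF" spark.driver.extraClassPath \
#     /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
#   set_spark_prop "$CONF" spark.driver.extraLibraryPath \
#     /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
```

Running it twice leaves a single line per key, so it should be safe to re-run. On an Ambari-managed cluster a direct file edit may be overwritten when the service configuration is pushed again.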

Spark Shell working example:

  [root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1
  Setting default log level to "WARN".
  To adjust logging level use sc.setLogLevel(newLevel).
  16/08/29 17:47:09 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
  16/08/29 17:47:21 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
  Spark context Web UI available at http://10.0.2.15:4041
  Spark context available as 'sc' (master = yarn, app id = application_1472397144295_0007).
  Spark session available as 'spark'.
  Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version 2.0.0
        /_/
  Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
  Type in expressions to have them evaluated.
  Type :help for more information.
  scala> sc.getConf.getAll.foreach(println)
  (spark.eventLog.enabled,true)
  (spark.yarn.scheduler.heartbeat.interval-ms,5000)
  (hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse)
  (spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173)
  (spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
  (spark.yarn.containerLauncherMaxThreads,25)
  (spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
  (spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)
  (spark.driver.appUIAddress,http://10.0.2.15:4041)
  (spark.driver.host,10.0.2.15)
  (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007)
  (spark.yarn.preserve.staging.files,false)
  (spark.home,/usr/hdp/current/spark2-client)
  (spark.app.name,Spark shell)
  (spark.repl.class.uri,spark://10.0.2.15:37426/classes)
  (spark.ui.port,4041)
  (spark.yarn.max.executor.failures,3)
  (spark.submit.deployMode,client)
  (spark.yarn.executor.memoryOverhead,200)
  (spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter)
  (spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar)
  (spark.executor.memory,2g)
  (spark.yarn.driver.memoryOverhead,200)
  (spark.hadoop.yarn.timeline-service.enabled,false)
  (spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)
  (spark.app.id,application_1472397144295_0007)
  (spark.executor.id,driver)
  (spark.yarn.queue,default)
  (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)
  (spark.eventLog.dir,hdfs:///spark-history)
  (spark.master,yarn)
  (spark.driver.port,37426)
  (spark.yarn.submit.file.replication,3)
  (spark.sql.catalogImplementation,hive)
  (spark.driver.memory,2g)
  (spark.jars,)
  (spark.executor.cores,1)
  scala> val file = sc.textFile("/tmp/data")
  file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24
  scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
  counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
  scala> counts.take(10)
  res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout,1))
  scala>
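
If the GPLNativeCodeLoader error still shows up after the change, one thing that may be worth checking is whether the native library actually exists in one of the directories listed in spark.driver.extraLibraryPath. The sketch below is only an illustration: find_native_lib is a hypothetical helper, and the libgplcompression file name prefix is an assumption based on the library name that hadoop-lzo loads, so verify it against your own install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first file whose name starts with the given
# prefix in a colon-separated list of directories. Returns 1 if none is found.
find_native_lib() {
  local path_list="$1" lib_prefix="$2" dir f
  local IFS=':'
  # Unquoted on purpose so the list is split on ':'.
  for dir in $path_list; do
    for f in "$dir"/"$lib_prefix"*; do
      # An unmatched glob stays literal, so test that the file really exists.
      if [ -e "$f" ]; then
        printf '%s\n' "$f"
        return 0
      fi
    done
  done
  return 1
}

# Example usage with the library path from this thread:
#   find_native_lib \
#     "/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64" \
#     libgplcompression || echo "native lzo library not found on this path"
```

If nothing is printed, the driver has no native library to load from those directories, and the path in spark-defaults.conf probably needs adjusting.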


2 REPLIES 2

Not applicable

@anandi Thanks, the word count fix works great! However, I applied it by adding the properties in Ambari instead of editing the file, so they won't get overwritten the next time I change the configuration there. In my opinion, that is preferable to editing the config file directly.