Sandbox HDP 2.5.0 - Spark 1.6.2 - Issues: GPLNativeCodeLoader: Could not load native gpl library - LzoCodec: Cannot load native-lzo without native-hadoop

Rising Star

Sandbox HDP 2.5.0 TP, Spark 1.6.2 - while running a simple word count in spark-shell, I am encountering the following errors: "ERROR GPLNativeCodeLoader: Could not load native gpl library" and "ERROR LzoCodec: Cannot load native-lzo without native-hadoop".

[root@sandbox ~]# cd $SPARK_HOME
[root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

The following code is submitted in the Spark shell:

  val file = sc.textFile("/tmp/data")
  val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
  counts.saveAsTextFile("/tmp/wordcount")

This yields the following errors:

ERROR GPLNativeCodeLoader: Could not load native gpl library

ERROR LzoCodec: Cannot load native-lzo without native-hadoop

The same errors appear with or without the --jars parameter shown below:

--jars /usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar

Full Log:

  1. [root@sandbox ~]# cd $SPARK_HOME
  2. [root@sandbox spark-client]# ./bin/spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m --jars /us
  3. r/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
  4. 16/08/27 16:28:23 INFO SecurityManager: Changing view acls to: root
  5. 16/08/27 16:28:23 INFO SecurityManager: Changing modify acls to: root
  6. 16/08/27 16:28:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permis
  7. sions: Set(root); users with modify permissions: Set(root)
  8. 16/08/27 16:28:23 INFO HttpServer: Starting HTTP Server
  9. 16/08/27 16:28:23 INFO Server: jetty-8.y.z-SNAPSHOT
  10. 16/08/27 16:28:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:43011
  11. 16/08/27 16:28:23 INFO Utils: Successfully started service 'HTTP class server' on port 43011.
  12. Welcome to
  13. ____ __
  14. / __/__ ___ _____/ /__
  15. _\ \/ _ \/ _ `/ __/'_/
  16. /___/ .__/\_,_/_/ /_/\_\ version 1.6.2
  17. /_/
  18. Using Scala version 2.10.5 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
  19. Type in expressions to have them evaluated.
  20. Type :help for more information.
  21. 16/08/27 16:28:26 INFO SparkContext: Running Spark version 1.6.2
  22. 16/08/27 16:28:26 INFO SecurityManager: Changing view acls to: root
  23. 16/08/27 16:28:26 INFO SecurityManager: Changing modify acls to: root
  24. 16/08/27 16:28:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permis
  25. sions: Set(root); users with modify permissions: Set(root)
  26. 16/08/27 16:28:26 INFO Utils: Successfully started service 'sparkDriver' on port 45506.
  27. 16/08/27 16:28:27 INFO Slf4jLogger: Slf4jLogger started
  28. 16/08/27 16:28:27 INFO Remoting: Starting remoting
  29. 16/08/27 16:28:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.2.15:44
  30. 829]
  31. 16/08/27 16:28:27 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 44829.
  32. 16/08/27 16:28:27 INFO SparkEnv: Registering MapOutputTracker
  33. 16/08/27 16:28:27 INFO SparkEnv: Registering BlockManagerMaster
  34. 16/08/27 16:28:27 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-0776b175-5dd7-49b9-adf7-f2cbd85a1e1b
  35. 16/08/27 16:28:27 INFO MemoryStore: MemoryStore started with capacity 143.6 MB
  36. 16/08/27 16:28:27 INFO SparkEnv: Registering OutputCommitCoordinator
  37. 16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
  38. 16/08/27 16:28:27 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
  39. 16/08/27 16:28:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
  40. 16/08/27 16:28:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040
  41. 16/08/27 16:28:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/httpd
  42. -857ce699-7db0-428c-9af5-1dca4ec5330d
  43. 16/08/27 16:28:27 INFO HttpServer: Starting HTTP Server
  44. 16/08/27 16:28:27 INFO Server: jetty-8.y.z-SNAPSHOT
  45. 16/08/27 16:28:27 INFO AbstractConnector: Started SocketConnector@0.0.0.0:37515
  46. 16/08/27 16:28:27 INFO Utils: Successfully started service 'HTTP file server' on port 37515.
  47. 16/08/27 16:28:27 INFO SparkContext: Added JAR file:/usr/hdp/2.5.0.0-817/hadoop/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar at ht
  48. tp://10.0.2.15:37515/jars/hadoop-lzo-0.6.0.2.5.0.0-817.jar with timestamp 1472315307772
  49. spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
  50. 16/08/27 16:28:28 INFO TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
  51. 16/08/27 16:28:28 INFO RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
  52. 16/08/27 16:28:28 INFO Client: Requesting a new application from cluster with 1 NodeManagers
  53. 16/08/27 16:28:28 INFO Client: Verifying our application has not requested more than the maximum memory capability of the
  54. cluster (2250 MB per container)
  55. 16/08/27 16:28:28 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
  56. 16/08/27 16:28:28 INFO Client: Setting up container launch context for our AM
  57. 16/08/27 16:28:28 INFO Client: Setting up the launch environment for our AM container
  58. 16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs:/
  59. /sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
  60. 16/08/27 16:28:28 INFO Client: Preparing resources for our AM container
  61. 16/08/27 16:28:28 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs:/
  62. /sandbox.hortonworks.com:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
  63. 16/08/27 16:28:28 INFO Client: Source and destination file systems are the same. Not copying hdfs://sandbox.hortonworks.co
  64. m:8020/hdp/apps/2.5.0.0-817/spark/spark-hdp-assembly.jar
  65. 16/08/27 16:28:29 INFO Client: Uploading resource file:/tmp/spark-61ecb98e-989c-4396-9b30-032c4d5a2b90/__spark_conf__50848
  66. 04354575467223.zip -> hdfs://sandbox.hortonworks.com:8020/user/root/.sparkStaging/application_1472312154461_0006/__spark_c
  67. onf__5084804354575467223.zip
  68. 16/08/27 16:28:29 INFO SecurityManager: Changing view acls to: root
  69. 16/08/27 16:28:29 INFO SecurityManager: Changing modify acls to: root
  70. 16/08/27 16:28:29 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permis
  71. sions: Set(root); users with modify permissions: Set(root)
  72. 16/08/27 16:28:29 INFO Client: Submitting application 6 to ResourceManager
  73. 16/08/27 16:28:29 INFO YarnClientImpl: Submitted application application_1472312154461_0006
  74. 16/08/27 16:28:29 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1472312154461_000
  75. 6 and attemptId None
  76. 16/08/27 16:28:30 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
  77. 16/08/27 16:28:30 INFO Client:
  78. client token: N/A
  79. diagnostics: AM container is launched, waiting for AM container to Register with RM
  80. ApplicationMaster host: N/A
  81. ApplicationMaster RPC port: -1
  82. queue: default
  83. start time: 1472315309252
  84. final status: UNDEFINED
  85. tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
  86. user: root
  87. 16/08/27 16:28:31 INFO Client: Application report for application_1472312154461_0006 (state: ACCEPTED)
  88. 16/08/27 16:28:32 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(nul
  89. l)
  90. 16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpF
  91. ilter, Map(PROXY_HOSTS -> sandbox.hortonworks.com, PROXY_URI_BASES -> http://sandbox.hortonworks.com:8088/proxy/applicatio
  92. n_1472312154461_0006), /proxy/application_1472312154461_0006
  93. 16/08/27 16:28:32 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
  94. 16/08/27 16:28:32 INFO Client: Application report for application_1472312154461_0006 (state: RUNNING)
  95. 16/08/27 16:28:32 INFO Client:
  96. client token: N/A
  97. diagnostics: N/A
  98. ApplicationMaster host: 10.0.2.15
  99. ApplicationMaster RPC port: 0
  100. queue: default
  101. start time: 1472315309252
  102. final status: UNDEFINED
  103. tracking URL: http://sandbox.hortonworks.com:8088/proxy/application_1472312154461_0006/
  104. user: root
  105. 16/08/27 16:28:32 INFO YarnClientSchedulerBackend: Application application_1472312154461_0006 has started running.
  106. 16/08/27 16:28:32 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on p
  107. ort 34124.
  108. 16/08/27 16:28:32 INFO NettyBlockTransferService: Server created on 34124
  109. 16/08/27 16:28:32 INFO BlockManagerMaster: Trying to register BlockManager
  110. 16/08/27 16:28:32 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:34124 with 143.6 MB RAM, BlockManag
  111. erId(driver, 10.0.2.15, 34124)
  112. 16/08/27 16:28:32 INFO BlockManagerMaster: Registered BlockManager
  113. 16/08/27 16:28:32 INFO EventLoggingListener: Logging events to hdfs:///spark-history/application_1472312154461_0006
  114. 16/08/27 16:28:36 INFO YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (sandbox.hortonworks.com:
  115. 39728) with ID 1
  116. 16/08/27 16:28:36 INFO BlockManagerMasterEndpoint: Registering block manager sandbox.hortonworks.com:38362 with 143.6 MB R
  117. AM, BlockManagerId(1, sandbox.hortonworks.com, 38362)
  118. 16/08/27 16:28:57 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxReg
  119. isteredResourcesWaitingTime: 30000(ms)
  120. 16/08/27 16:28:57 INFO SparkILoop: Created spark context..
  121. Spark context available as sc.
  122. 16/08/27 16:28:58 INFO HiveContext: Initializing execution hive, version 1.2.1
  123. 16/08/27 16:28:58 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
  124. 16/08/27 16:28:58 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-8
  125. 17
  126. 16/08/27 16:28:58 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.Objec
  127. tStore
  128. 16/08/27 16:28:58 INFO ObjectStore: ObjectStore, initialize called
  129. 16/08/27 16:28:58 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
  130. 16/08/27 16:28:58 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
  131. 16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
  132. 16/08/27 16:28:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
  133. 16/08/27 16:29:00 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Stor
  134. ageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
  135. 16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-o
  136. nly" so does not have its own datastore table.
  137. 16/08/27 16:29:01 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" s
  138. o does not have its own datastore table.
  139. 16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-o
  140. nly" so does not have its own datastore table.
  141. 16/08/27 16:29:02 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" s
  142. o does not have its own datastore table.
  143. 16/08/27 16:29:02 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
  144. 16/08/27 16:29:02 INFO ObjectStore: Initialized ObjectStore
  145. 16/08/27 16:29:02 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not
  146. enabled so recording the schema version 1.2.0
  147. 16/08/27 16:29:02 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
  148. 16/08/27 16:29:03 INFO HiveMetaStore: Added admin role in metastore
  149. 16/08/27 16:29:03 INFO HiveMetaStore: Added public role in metastore
  150. 16/08/27 16:29:03 INFO HiveMetaStore: No user is added in admin role, since config is empty
  151. 16/08/27 16:29:03 INFO HiveMetaStore: 0: get_all_databases
  152. 16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
  153. 16/08/27 16:29:03 INFO HiveMetaStore: 0: get_functions: db=default pat=*
  154. 16/08/27 16:29:03 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
  155. 16/08/27 16:29:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-o
  156. nly" so does not have its own datastore table.
  157. 16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec_resources
  158. 16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
  159. 16/08/27 16:29:03 INFO SessionState: Created local directory: /tmp/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec
  160. 16/08/27 16:29:03 INFO SessionState: Created HDFS directory: /tmp/hive/root/6ebb0a60-b229-4dad-94a3-e2386ba7b4ec/_tmp_spac
  161. e.db
  162. 16/08/27 16:29:03 INFO HiveContext: default warehouse location is /user/hive/warehouse
  163. 16/08/27 16:29:03 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
  164. 16/08/27 16:29:03 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.5.0.0-817
  165. 16/08/27 16:29:03 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.5.0.0-8
  166. 17
  167. 16/08/27 16:29:04 INFO metastore: Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
  168. 16/08/27 16:29:04 INFO metastore: Connected to metastore.
  169. 16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/83a1e2d3-8c24-4f12-9841-fab259a77514_resources
  170. 16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514
  171. 16/08/27 16:29:04 INFO SessionState: Created local directory: /tmp/root/83a1e2d3-8c24-4f12-9841-fab259a77514
  172. 16/08/27 16:29:04 INFO SessionState: Created HDFS directory: /tmp/hive/root/83a1e2d3-8c24-4f12-9841-fab259a77514/_tmp_spac
  173. e.db
  174. 16/08/27 16:29:04 INFO SparkILoop: Created sql context (with Hive support)..
  175. SQL context available as sqlContext.
  176. scala> val file = sc.textFile("/tmp/data")
  177. 16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 234.8 KB, free 234.8 KB)
  178. 16/08/27 16:29:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.1 KB, free 262.9
  179. KB)
  180. 16/08/27 16:29:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:34124 (size: 28.1 KB, free: 143.6
  181. MB)
  182. 16/08/27 16:29:20 INFO SparkContext: Created broadcast 0 from textFile at <console>:27
  183. file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:27
  184. scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
  185. 16/08/27 16:29:35 ERROR GPLNativeCodeLoader: Could not load native gpl library
  186. java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
  187. at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1889)
  188. at java.lang.Runtime.loadLibrary0(Runtime.java:849)
  189. at java.lang.System.loadLibrary(System.java:1088)
  190. at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
  191. at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
  192. at java.lang.Class.forName0(Native Method)
  193. at java.lang.Class.forName(Class.java:278)
  194. at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
  195. at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2112)
  196. at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
  197. at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179)
  198. at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
  199. at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  200. at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  201. at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  202. at java.lang.reflect.Method.invoke(Method.java:606)
  203. at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
  204. at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
  205. at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
  206. at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:189)
  207. at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
  208. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
  209. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
  210. at scala.Option.getOrElse(Option.scala:120)
  211. at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
  212. at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  213. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
  214. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
  215. at scala.Option.getOrElse(Option.scala:120)
  216. at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
  217. at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  218. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
  219. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
  220. at scala.Option.getOrElse(Option.scala:120)
  221. at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
  222. at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  223. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:242)
  224. at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:240)
  225. at scala.Option.getOrElse(Option.scala:120)
  226. at org.apache.spark.rdd.RDD.partitions(RDD.scala:240)
  227. at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
  228. at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
  229. at org.apache.spark.rdd.PairRDDFunctions$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:331)
  230. at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
  231. at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
  232. at org.apache.spark.rdd.RDD.withScope(RDD.scala:323)
  233. at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:330)
  234. at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:29)
  235. at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:34)
  236. at $line19.$read$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:36)
  237. at $line19.$read$iwC$iwC$iwC$iwC$iwC.<init>(<console>:38)
  238. at $line19.$read$iwC$iwC$iwC$iwC.<init>(<console>:40)
  239. at $line19.$read$iwC$iwC$iwC.<init>(<console>:42)
  240. at $line19.$read$iwC$iwC.<init>(<console>:44)
  241. at $line19.$read$iwC.<init>(<console>:46)
  242. at $line19.$read.<init>(<console>:48)
  243. at $line19.$read$.<init>(<console>:52)
  244. at $line19.$read$.<clinit>(<console>)
  245. at $line19.$eval$.<init>(<console>:7)
  246. at $line19.$eval$.<clinit>(<console>)
  247. at $line19.$eval.$print(<console>)
  248. at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  249. at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  250. at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  251. at java.lang.reflect.Method.invoke(Method.java:606)
  252. at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
  253. at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
  254. at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
  255. at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
  256. at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
  257. at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
  258. at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
  259. at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
  260. at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
  261. at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
  262. at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$loop(SparkILoop.scala:670)
  263. at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply$mcZ$sp(SparkILoop.s
  264. cala:997)
  265. at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:94
  266. 5)
  267. at org.apache.spark.repl.SparkILoop$anonfun$org$apache$spark$repl$SparkILoop$process$1.apply(SparkILoop.scala:94
  268. 5)
  269. at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
  270. at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$process(SparkILoop.scala:945)
  271. at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
  272. at org.apache.spark.repl.Main$.main(Main.scala:31)
  273. at org.apache.spark.repl.Main.main(Main.scala)
  274. at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  275. at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  276. at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  277. at java.lang.reflect.Method.invoke(Method.java:606)
  278. at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
  279. at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  280. at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  281. at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  282. at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
  283. 16/08/27 16:29:35 ERROR LzoCodec: Cannot load native-lzo without native-hadoop
  284. 16/08/27 16:29:35 INFO FileInputFormat: Total input paths to process : 1
  285. counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:29
  286. scala>

Please help me fix this issue.

1 ACCEPTED SOLUTION

Rising Star

Resolution done with Spark 2.0.0:

Resolution for the spark-submit issue: add a java-opts file in /usr/hdp/current/spark2-client/conf/:

  [root@sandbox conf]# cat java-opts
  -Dhdp.version=2.5.0.0-817
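
For reference, the file can be created in one step. This is just a sketch; the build number 2.5.0.0-817 is the one on this sandbox, so check your own /usr/hdp directory first:

  # Sketch: point Spark 2 at the installed HDP build (verify the build id with `ls /usr/hdp`)
  echo "-Dhdp.version=2.5.0.0-817" > /usr/hdp/current/spark2-client/conf/java-opts
  cat /usr/hdp/current/spark2-client/conf/java-opts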

Spark Submit working example:

  1. [root@sandbox spark2-client]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 2g --executor-memory 2g --ex
  2. ecutor-cores 1 examples/jars/spark-examples*.jar 10
  3. 16/08/2917:44:57 WARN util.NativeCodeLoader:Unable to load native-hadoop library for your platform...using builtin-java classes where applicable
  4. 16/08/2917:44:58 WARN shortcircuit.DomainSocketFactory:Theshort-circuit local reads feature cannot be used because libhadoop cannot be loaded.
  5. 16/08/2917:44:58 INFO client.RMProxy:Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
  6. 16/08/2917:44:58 INFO yarn.Client:Requesting a new application from cluster with1NodeManagers
  7. 16/08/2917:44:58 INFO yarn.Client:Verifyingour application has not requested more than the maximum memory capability of the cluster (7680 MB per container)
  8. 16/08/2917:44:58 INFO yarn.Client:Will allocate AM container,with2248 MB memory including 200 MB overhead
  9. 16/08/2917:44:58 INFO yarn.Client:Setting up container launch context forour AM
  10. 16/08/2917:44:58 INFO yarn.Client:Setting up the launch environment forour AM container
  11. 16/08/2917:44:58 INFO yarn.Client:Preparing resources forour AM container
  12. 16/08/2917:44:58 WARN yarn.Client:Neither spark.yarn.jars nor spark.yarn.archive isset, falling back to uploading libraries under SPARK_HOME.
  13. 16/08/2917:45:00 INFO yarn.Client:Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_libs__3503948162159958877.zip -> hdfs://sandbox.hortonw
  14. orks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_libs__3503948162159958877.zip
  15. 16/08/2917:45:01 INFO yarn.Client:Uploading resource file:/usr/hdp/2.5.0.0-817/spark2/examples/jars/spark-examples_2.11-2.0.0.jar-> hdfs://sandbox.hortonworks.com:8020/
  16. user/root/.sparkStaging/application_1472397144295_0006/spark-examples_2.11-2.0.0.jar
  17. 16/08/2917:45:01 INFO yarn.Client:Uploading resource file:/tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b/__spark_conf__4613069544481307021.zip -> hdfs://sandbox.hortonw
  18. orks.com:8020/user/root/.sparkStaging/application_1472397144295_0006/__spark_conf__.zip
  19. 16/08/2917:45:01 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
  20. 16/08/2917:45:01 INFO spark.SecurityManager:Changing view acls to: root
  21. 16/08/2917:45:01 INFO spark.SecurityManager:Changing modify acls to: root
  22. 16/08/2917:45:01 INFO spark.SecurityManager:Changing view acls groups to:
  23. 16/08/2917:45:01 INFO spark.SecurityManager:Changing modify acls groups to:
  24. 16/08/2917:45:01 INFO spark.SecurityManager:SecurityManager: authentication disabled; ui acls disabled; users with view permissions:Set(root); groups with view permiss
  25. ions:Set(); users with modify permissions:Set(root); groups with modify permissions:Set()
  26. 16/08/2917:45:01 INFO yarn.Client:Submitting application application_1472397144295_0006 to ResourceManager
  27. 16/08/2917:45:01 INFO impl.YarnClientImpl:Submitted application application_1472397144295_0006
  28. 16/08/2917:45:02 INFO yarn.Client:Application report for application_1472397144295_0006 (state: ACCEPTED)
  29. 16/08/2917:45:02 INFO yarn.Client:
  30. client token: N/A
  31. diagnostics: AM container is launched, waiting for AM container to Registerwith RM
  32. ApplicationMaster host: N/A
  33. ApplicationMaster RPC port:-1
  34. queue:default
  35. start time:1472492701409
  36. final status: UNDEFINED
  37. tracking URL:http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
  38. user: root
  39. 16/08/2917:45:03 INFO yarn.Client:Application report for application_1472397144295_0006 (state: ACCEPTED)
  40. 16/08/2917:45:04 INFO yarn.Client:Application report for application_1472397144295_0006 (state: ACCEPTED)
  41. 16/08/2917:45:05 INFO yarn.Client:Application report for application_1472397144295_0006 (state: ACCEPTED)
  42. 16/08/2917:45:06 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  43. 16/08/2917:45:06 INFO yarn.Client:
  44. client token: N/A
  45. diagnostics: N/A
  46. ApplicationMaster host:10.0.2.15
  47. ApplicationMaster RPC port:0
  48. queue:default
  49. start time:1472492701409
  50. final status: UNDEFINED
  51. tracking URL:http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
  52. user: root
  53. 16/08/2917:45:07 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  54. 16/08/2917:45:08 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  55. 16/08/2917:45:09 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  56. 16/08/2917:45:10 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  57. 16/08/2917:45:11 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  58. 16/08/2917:45:12 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  59. 16/08/2917:45:13 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  60. 16/08/2917:45:14 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  61. 16/08/2917:45:15 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  62. 16/08/2917:45:16 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  63. 16/08/2917:45:17 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  64. 16/08/2917:45:18 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  65. 16/08/2917:45:19 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  66. 16/08/2917:45:20 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  67. 16/08/2917:45:21 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  68. 16/08/2917:45:22 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  69. 16/08/2917:45:23 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  70. 16/08/2917:45:24 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  71. 16/08/2917:45:25 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  72. 16/08/2917:45:26 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  73. 16/08/2917:45:27 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  74. 16/08/2917:45:28 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  75. 16/08/2917:45:29 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  76. 16/08/2917:45:30 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  77. 16/08/2917:45:31 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  78. 16/08/2917:45:32 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  79. 16/08/2917:45:33 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  80. 16/08/2917:45:34 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  81. 16/08/2917:45:35 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  82. 16/08/2917:45:36 INFO yarn.Client:Application report for application_1472397144295_0006 (state: RUNNING)
  83. 16/08/2917:45:37 INFO yarn.Client:Application report for application_1472397144295_0006 (state: FINISHED)
  84. 16/08/2917:45:37 INFO yarn.Client:
  85. client token: N/A
  86. diagnostics: N/A
  87. ApplicationMaster host:10.0.2.15
  88. ApplicationMaster RPC port:0
  89. queue:default
  90. start time:1472492701409
  91. final status: SUCCEEDED
  92. tracking URL:http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0006/
  93. user: root
  94. 16/08/2917:45:37 INFO util.ShutdownHookManager:Shutdown hook called
  95. 16/08/2917:45:37 INFO util.ShutdownHookManager:Deleting directory /tmp/spark-38890bfc-d672-4c7d-bef9-d646c420836b
  96. [root@sandbox spark2-client]#

Resolution for the spark-shell issue (LZO codec): add the following two lines to your spark-defaults.conf:

  spark.driver.extraClassPath /usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar
  spark.driver.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
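
If you prefer not to touch spark-defaults.conf, the same two settings should also work for a single session when passed with --conf on the command line. This is a sketch built from the paths above, not something taken from the original post:

  # Sketch: per-session equivalent of the two spark-defaults.conf entries
  ./bin/spark-shell --master yarn --deploy-mode client \
    --conf spark.driver.extraClassPath=/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar \
    --conf spark.driver.extraLibraryPath=/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64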

Spark Shell working example:

  1. [root@sandbox spark2-client]# ./bin/spark-shell --master yarn --deploy-mode client --driver-memory 2g --executor-memory 2g --executor-cores 1
  2. Settingdefault log level to "WARN".
  3. To adjust logging level use sc.setLogLevel(newLevel).
  4. 16/08/2917:47:09 WARN yarn.Client:Neither spark.yarn.jars nor spark.yarn.archive isset, falling back to uploading libraries under SPARK_HOME.
  5. 16/08/2917:47:21 WARN spark.SparkContext:Use an existing SparkContext, some configuration may not take effect.
  6. Spark context Web UI available at http://10.0.2.15:4041
  7. Spark context available as'sc'(master = yarn, app id = application_1472397144295_0007).
  8. Spark session available as'spark'.
  9. Welcome to
  10. ____ __
  11. / __/__ ___ _____/ /__
  12. _\ \/ _ \/ _ `/ __/'_/
  13. /___/ .__/\_,_/_/ /_/\_\ version 2.0.0
  14. /_/
  15. Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.7.0_101)
  16. Type in expressions to have them evaluated.
  17. Type :help for more information.
  18. scala> sc.getConf.getAll.foreach(println)
  19. (spark.eventLog.enabled,true)
  20. (spark.yarn.scheduler.heartbeat.interval-ms,5000)
  21. (hive.metastore.warehouse.dir,file:/usr/hdp/2.5.0.0-817/spark2/spark-warehouse)
  22. (spark.repl.class.outputDir,/tmp/spark-fa16d4d3-8ec8-4b0e-a1da-5a2dffe39d08/repl-5dd28f29-ae03-4965-a535-18a95173b173)
  23. (spark.yarn.am.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
  24. (spark.yarn.containerLauncherMaxThreads,25)
  25. (spark.driver.extraJavaOptions,-Dhdp.version=2.5.0.0-817)
  26. (spark.driver.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64)
  27. (spark.driver.appUIAddress,http://10.0.2.15:4041)
  28. (spark.driver.host,10.0.2.15)
  29. (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES,http://sandbox.hortonworks.com:8088/proxy/application_1472397144295_0007)
  30. (spark.yarn.preserve.staging.files,false)
  31. (spark.home,/usr/hdp/current/spark2-client)
  32. (spark.app.name,Spark shell)
  33. (spark.repl.class.uri,spark://10.0.2.15:37426/classes)
  34. (spark.ui.port,4041)
  35. (spark.yarn.max.executor.failures,3)
  36. (spark.submit.deployMode,client)
  37. (spark.yarn.executor.memoryOverhead,200)
  38. (spark.ui.filters,org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter)
  39. (spark.driver.extraClassPath,/usr/hdp/current/hadoop-client/lib/hadoop-lzo-0.6.0.2.5.0.0-817.jar)
  40. (spark.executor.memory,2g)
  41. (spark.yarn.driver.memoryOverhead,200)
  42. (spark.hadoop.yarn.timeline-service.enabled,false)
  43. (spark.executor.extraLibraryPath,/usr/hdp/current/hadoop-client/lib/native)
  44. (spark.app.id,application_1472397144295_0007)
  45. (spark.executor.id,driver)
  46. (spark.yarn.queue,default)
  47. (spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS,sandbox.hortonworks.com)
  48. (spark.eventLog.dir,hdfs:///spark-history)
  49. (spark.master,yarn)
  50. (spark.driver.port,37426)
  51. (spark.yarn.submit.file.replication,3)
  52. (spark.sql.catalogImplementation,hive)
  53. (spark.driver.memory,2g)
  54. (spark.jars,)
  55. (spark.executor.cores,1)
  56. scala> val file = sc.textFile("/tmp/data")
  57. file: org.apache.spark.rdd.RDD[String] = /tmp/data MapPartitionsRDD[1] at textFile at <console>:24
  58. scala> val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
  59. counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:26
  60. scala> counts.take(10)
  61. res1: Array[(String, Int)] = Array((hadoop.tasklog.noKeepSplits=4,1), (log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.se
  62. rver.resourcemanager.appsummary.logger},1), (Unless,1), (this,4), (hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log,1), (under,4), (log4j.appender.RFA.
  63. layout.ConversionPattern=%d{ISO8601},2), (log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout,1), (AppSummaryLogging,1), (log4j.appender.RMAUDIT.layout=org.apac
  64. he.log4j.PatternLayout,1))
  65. scala>


2 REPLIES


@anandi Thanks, the word count fix works great! However, I applied the fix by adding the properties in Ambari instead, so they won't be overwritten if I make another change at that level. In my opinion that is preferable to editing the config file directly.