Created 05-05-2016 02:37 PM
Hi:
I am trying SparkR, but it doesn't work well.
The code is:
Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- SparkR::sparkR.init(master = "yarn-client")
sqlContext <- sparkRSQL.init(sc)
path <- file.path("/RSI/staging/input/log_json/f6327t.json")
info <- read.json(sqlContext, path)
printSchema(info)
and the log is:
> sc <- SparkR::sparkR.init(master = "yarn-client")
Launching java with spark-submit command /usr/hdp/current/spark-client//bin/spark-submit sparkr-shell /tmp/RtmpxnCWXx/backend_port502d157a15ac
16/05/05 16:33:22 INFO SparkContext: Running Spark version 1.6.0
16/05/05 16:33:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/05 16:33:23 INFO SecurityManager: Changing view acls to: bigotes
16/05/05 16:33:23 INFO SecurityManager: Changing modify acls to: bigotes
16/05/05 16:33:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(bigotes); users with modify permissions: Set(bigotes)
16/05/05 16:33:23 INFO Utils: Successfully started service 'sparkDriver' on port 39914.
16/05/05 16:33:23 INFO Slf4jLogger: Slf4jLogger started
16/05/05 16:33:23 INFO Remoting: Starting remoting
16/05/05 16:33:24 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.1.246.19:55278]
16/05/05 16:33:24 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55278.
16/05/05 16:33:24 INFO SparkEnv: Registering MapOutputTracker
16/05/05 16:33:24 INFO SparkEnv: Registering BlockManagerMaster
16/05/05 16:33:24 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-fc4a72de-f470-4c3c-9692-bcf941a4b674
16/05/05 16:33:24 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/05/05 16:33:24 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/05 16:33:24 INFO Server: jetty-8.y.z-SNAPSHOT
16/05/05 16:33:24 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/05/05 16:33:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/05 16:33:24 INFO SparkUI: Started SparkUI at http://10.1.246.19:4040
spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
16/05/05 16:33:24 INFO TimelineClientImpl: Timeline service address: http://lnxbig06.cajarural.gcr:8188/ws/v1/timeline/
16/05/05 16:33:25 INFO RMProxy: Connecting to ResourceManager at lnxbig05.cajarural.gcr/10.1.246.19:8050
16/05/05 16:33:25 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/05/05 16:33:25 INFO Client: Requesting a new application from cluster with 5 NodeManagers
16/05/05 16:33:25 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (40192 MB per container)
16/05/05 16:33:25 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/05/05 16:33:25 INFO Client: Setting up container launch context for our AM
16/05/05 16:33:25 INFO Client: Setting up the launch environment for our AM container
16/05/05 16:33:25 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://lnxbig05.cajarural.gcr:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar
16/05/05 16:33:25 INFO Client: Preparing resources for our AM container
16/05/05 16:33:25 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://lnxbig05.cajarural.gcr:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar
16/05/05 16:33:25 INFO Client: Source and destination file systems are the same. Not copying hdfs://lnxbig05.cajarural.gcr:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar
16/05/05 16:33:25 INFO Client: Uploading resource file:/tmp/spark-7c7224cd-1fa8-43d6-b049-a85ce21f18e7/__spark_conf__5347166147727015442.zip -> hdfs://lnxbig05.cajarural.gcr:8020/user/bigotes/.sparkStaging/application_1461739406783_0151/__spark_conf__5347166147727015442.zip
16/05/05 16:33:26 INFO SecurityManager: Changing view acls to: bigotes
16/05/05 16:33:26 INFO SecurityManager: Changing modify acls to: bigotes
16/05/05 16:33:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(bigotes); users with modify permissions: Set(bigotes)
16/05/05 16:33:26 INFO Client: Submitting application 151 to ResourceManager
16/05/05 16:33:26 INFO YarnClientImpl: Submitted application application_1461739406783_0151
16/05/05 16:33:26 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1461739406783_0151 and attemptId None
16/05/05 16:33:27 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:27 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1462458806216
     final status: UNDEFINED
     tracking URL: http://lnxbig05.cajarural.gcr:8088/proxy/application_1461739406783_0151/
     user: bigotes
16/05/05 16:33:28 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:29 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:30 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:31 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:32 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:33 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:34 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:35 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:36 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:37 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:38 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:39 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:40 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
16/05/05 16:33:41 INFO Client: Application report for application_1461739406783_0151 (state: ACCEPTED)
Is my code correct?
Thanks
Created 05-05-2016 06:58 PM
Looks like you are running the code as 'bigotes' user. Can you check if that is correct and you have sufficient write privileges in the user directory?
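For example, a quick way to verify this from the node where SparkR runs is to call the hdfs CLI from R. This is a minimal sketch, assuming the hdfs command is on the PATH; /user/bigotes is taken from the staging path in the log above, and the sparkr_write_test file name is just an arbitrary example:

# List the user's HDFS home directory to confirm it exists and to see its
# owner and permissions.
system2("hdfs", c("dfs", "-ls", "-d", "/user/bigotes"))

# Try creating a zero-length test file there; a permission error here would
# point to missing write privileges for the 'bigotes' user.
system2("hdfs", c("dfs", "-touchz", "/user/bigotes/sparkr_write_test"))
system2("hdfs", c("dfs", "-rm", "/user/bigotes/sparkr_write_test"))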
Created 05-05-2016 07:08 PM
Hi:
It's finally working with this code:
Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sparkR.stop()  # stop any previously initialized SparkR session before re-initializing
sc <- SparkR::sparkR.init(master = "yarn-client", sparkEnvir = list(spark.driver.memory = "4g"))
hiveContext <- sparkRHive.init(sc)
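With the context up, the JSON file from the first post can then be read as before. A minimal sketch that simply reuses the path and calls from the original code (the path is just the example from this thread):

# Re-create a SQL context on the new SparkContext, read the JSON file as in
# the first post, and print the inferred schema.
sqlContext <- sparkRSQL.init(sc)
path <- file.path("/RSI/staging/input/log_json/f6327t.json")
info <- read.json(sqlContext, path)
printSchema(info)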