Member since: 03-20-2016
Posts: 56
Kudos Received: 17
Solutions: 0
06-17-2016
09:33 PM
Thanks for your help. I understood the shuffle part well, but I didn't understand this part: "So for multi-node clusters, you can basically add up the various input sizes and that should be relatively equivalent to the full data size." So the 2.8 GB that appears in the second image is only for one specific node of the 3 nodes, and it doesn't show the size for the other two nodes?
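To make the quoted sentence concrete, here is a minimal sketch of the arithmetic it describes. It assumes the 2.8 GB is one executor's share; the other two per-executor figures are invented for illustration (only the 2.8 GB and the 7.3 GB single-node total appear in this thread), and `input_mb_by_executor` and `total_input_mb` are hypothetical names, not Spark APIs.

```python
# Hypothetical per-executor input sizes in MB, as they might be read off the
# Spark UI. Only the 2800 figure mirrors the thread; the others are made up.
input_mb_by_executor = {
    "node1": 2800,
    "node2": 2300,
    "node3": 2200,
}


def total_input_mb(per_executor):
    """Add up the input read by every executor."""
    return sum(per_executor.values())


total = total_input_mb(input_mb_by_executor)
print(f"total input across executors: {total / 1000:.1f} GB")
```

If the per-node figures really do add up like this, the multi-node run read about the same amount of data as the single-node run, just split across machines.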
06-17-2016
07:14 PM
I ran some queries on Spark with just one node and then with 3 nodes, and in the Spark UI (port 4040) I see something that I don't understand. For example, after executing a query with 3 nodes and checking the results in the Spark UI, the "Input" column shows 2.8 GB, so Spark read 2.8 GB from Hadoop.
The same query on Hadoop with just one node in local mode shows 7.3 GB, so Spark read 7.3 GB from Hadoop. But shouldn't these values be equal? For example, the shuffle value stays roughly equal with one node vs. 3. Why doesn't the input value stay equal? The same amount of data must be read from HDFS, so I don't understand. Do you know? Single node: Below is the same query on multiple nodes; as you can see, the input is smaller but the shuffle stays roughly equal. Do you know why?
06-04-2016
02:23 AM
Hi, I'm studying the Catalyst optimizer but I have some doubts about some of its phases. In the first phase, as I understand it, a first logical plan is created, but it is not definitive because there are unresolved attributes. To resolve these attributes, Spark SQL uses a catalog that connects to the tables to check the name and data type of each attribute. My doubt in this first phase is about the meaning of unresolved attributes and how this catalog works. For example, if the tables are stored in Hive, does the catalog connect to the Hive metastore to check the name and data type of each attribute? If so, isn't it a bit expensive to do this for every query we execute? In the second phase, rules are applied to the logical plan created before. Some of those rules are predicate pushdown and projection pruning. What I understand about predicate pushdown is that when we submit a query in Spark over Hive tables that are on a different machine, a lot of data can travel across the network, which is not good. But I don't understand how Catalyst fixes this issue. Do you know?
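The two ideas in this question can be illustrated with a toy model. This is only a sketch under stated assumptions, not how Catalyst is implemented: `CATALOG`, `Attribute`, `resolve`, `scan` and `keep` are hypothetical names, the catalog is a plain dict standing in for the Hive metastore, and the rows are made up. The first part shows an attribute going from "name only" to "name plus type" through a catalog lookup; the second shows why applying the filter inside the scan moves fewer rows.

```python
from dataclasses import dataclass
from typing import Optional

# Toy catalog: table name -> {column name: data type}. A stand-in for the
# metadata a metastore would provide; here it is just a dict.
CATALOG = {
    "lineitem": {"l_quantity": "double", "l_shipdate": "string"},
}


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: Optional[str] = None  # None means "unresolved"


def resolve(table, names):
    """Attach a type to each column name, or fail if the name is unknown."""
    schema = CATALOG[table]
    resolved = []
    for name in names:
        if name not in schema:
            raise KeyError(f"cannot resolve '{name}' in table '{table}'")
        resolved.append(Attribute(name, schema[name]))
    return resolved


# Made-up rows standing in for the remote table.
ROWS = [
    {"l_quantity": 5.0, "l_shipdate": "1998-09-01"},
    {"l_quantity": 7.0, "l_shipdate": "1998-12-01"},
]


def scan(rows, predicate=None):
    """Data source scan. A pushed-down predicate is applied while reading."""
    return [r for r in rows if predicate is None or predicate(r)]


def keep(row):
    return row["l_shipdate"] <= "1998-09-16"


# Without pushdown: the scan returns every row and the filter runs afterwards.
rows_moved_without = scan(ROWS)
result_without = [r for r in rows_moved_without if keep(r)]

# With pushdown: the scan itself filters, so fewer rows leave the source.
rows_moved_with = scan(ROWS, predicate=keep)

print(resolve("lineitem", ["l_quantity", "l_shipdate"]))
print(len(rows_moved_without), len(rows_moved_with))
```

Both variants produce the same result; the difference is only in how many rows have to leave the source before the filter is applied.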
05-30-2016
12:11 PM
Thanks for your help, really! Now I get it!
05-30-2016
01:49 AM
Hi, thanks for your answer, it really helped, but I still have a few small doubts, and it would be great if you could help. From what I understand of your answer, the steps are:
1- HiveTableScan reads the data stored in HDFS and creates an RDD based on this data (does it create 1 or more RDDs?)
2- Filter based on the shipdate
3- Project
4- The first TungstenAggregate aggregates the data in each RDD based on the aggregation keys (returnflag, linestatus) (in each RDD, so we have more than 1?)
5- The TungstenExchange distributes the data based on the aggregation keys to a new set of RDDs, so that all values with the same aggregation key end up in the same RDD? (so we really do have more than 1 RDD?)
6- And the last TungstenAggregate, which aggregates the pre-aggregation, I didn't understand very well. Can you explain it better?
So my doubts about your answer are just about the last TungstenAggregate and about the number of RDDs created by Spark. Also, you say "filter locally" and "read partitions locally"; what do you mean by locally? Thanks again, really!
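Steps 4 to 6 above can be sketched in plain Python. This is a toy model under stated assumptions, not Spark's implementation: the rows are made up, lists stand in for partitions, and `partial_aggregate`, `exchange` and `final_aggregate` are hypothetical names. The point it illustrates is that the partial step keeps a (sum, count) state per key in each partition, the exchange sends equal keys to the same bucket, and the final step merges those states and only then computes the average.

```python
from collections import defaultdict

# Two input partitions of (returnflag, linestatus, quantity) rows. Made-up data.
partitions = [
    [("A", "F", 10.0), ("N", "O", 4.0), ("A", "F", 2.0)],
    [("A", "F", 6.0), ("N", "O", 8.0)],
]


def partial_aggregate(rows):
    """Per-partition pre-aggregation: key -> [sum, count]."""
    acc = defaultdict(lambda: [0.0, 0])
    for flag, status, qty in rows:
        acc[(flag, status)][0] += qty
        acc[(flag, status)][1] += 1
    return dict(acc)


def exchange(partials, num_partitions):
    """Hash-partition partial results so equal keys land in the same bucket."""
    buckets = [[] for _ in range(num_partitions)]
    for partial in partials:
        for key, state in partial.items():
            buckets[hash(key) % num_partitions].append((key, state))
    return buckets


def final_aggregate(bucket):
    """Merge the partial states for each key, then finish sum and avg."""
    merged = defaultdict(lambda: [0.0, 0])
    for key, (s, c) in bucket:
        merged[key][0] += s
        merged[key][1] += c
    return {k: {"sum_qty": s, "avg_qty": s / c} for k, (s, c) in merged.items()}


partials = [partial_aggregate(p) for p in partitions]
buckets = exchange(partials, num_partitions=2)

result = {}
for bucket in buckets:
    result.update(final_aggregate(bucket))

print(result)
```

Because each partition ships only one small state per key instead of every row, the exchange moves far less data than shuffling the raw rows would.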
05-29-2016
01:35 AM
1 Kudo
Hi, I'm trying to understand physical plans in Spark, but I don't understand some parts because they seem different from a traditional RDBMS. For example, the plan below is for a query over a Hive table. The query is this:
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= '1998-09-16'
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
The physical plan:
== Physical Plan ==
Sort [l_returnflag#35 ASC,l_linestatus#36 ASC], true, 0
+- ConvertToUnsafe
   +- Exchange rangepartitioning(l_returnflag#35 ASC,l_linestatus#36 ASC,200), None
      +- ConvertToSafe
         +- TungstenAggregate(key=[l_returnflag#35,l_linestatus#36], functions=[(sum(l_quantity#31),mode=Final,isDistinct=false),(sum(l_extendedprice#32),mode=Final,isDistinct=false),(sum((l_extendedprice#32 * (1.0 - l_discount#33))),mode=Final,isDistinct=false),(sum(((l_extendedprice#32 * (1.0 - l_discount#33)) * (1.0 + l_tax#34))),mode=Final,isDistinct=false),(avg(l_quantity#31),mode=Final,isDistinct=false),(avg(l_extendedprice#32),mode=Final,isDistinct=false),(avg(l_discount#33),mode=Final,isDistinct=false),(count(1),mode=Final,isDistinct=false)], output=[l_returnflag#35,l_linestatus#36,sum_qty#0,sum_base_price#1,sum_disc_price#2,sum_charge#3,avg_qty#4,avg_price#5,avg_disc#6,count_order#7L])
            +- TungstenExchange hashpartitioning(l_returnflag#35,l_linestatus#36,200), None
               +- TungstenAggregate(key=[l_returnflag#35,l_linestatus#36], functions=[(sum(l_quantity#31),mode=Partial,isDistinct=false),(sum(l_extendedprice#32),mode=Partial,isDistinct=false),(sum((l_extendedprice#32 * (1.0 - l_discount#33))),mode=Partial,isDistinct=false),(sum(((l_extendedprice#32 * (1.0 - l_discount#33)) * (1.0 + l_tax#34))),mode=Partial,isDistinct=false),(avg(l_quantity#31),mode=Partial,isDistinct=false),(avg(l_extendedprice#32),mode=Partial,isDistinct=false),(avg(l_discount#33),mode=Partial,isDistinct=false),(count(1),mode=Partial,isDistinct=false)], output=[l_returnflag#35,l_linestatus#36,sum#64,sum#65,sum#66,sum#67,sum#68,count#69L,sum#70,count#71L,sum#72,count#73L,count#74L])
                  +- Project [l_discount#33,l_linestatus#36,l_tax#34,l_quantity#31,l_extendedprice#32,l_returnflag#35]
                     +- Filter (l_shipdate#37 <= 1998-09-16)
                        +- HiveTableScan [l_discount#33,l_linestatus#36,l_tax#34,l_quantity#31,l_extendedprice#32,l_shipdate#37,l_returnflag#35], MetastoreRelation default, lineitem, None
What I understand from the plan is:
1- First it starts with a Hive table scan
2- Then it filters using the where condition
3- Then it projects to get the columns we want
4- Then TungstenAggregate, what is this?
5- Then TungstenExchange, what is this?
6- Then TungstenAggregate again, what is this?
7- Then ConvertToSafe, what is this?
8- Then it sorts the final result
But I don't understand steps 4, 5, 6 and 7. Do you know what they are? I'm looking for information about this so I can understand the plan, but I'm not finding anything concrete.
05-26-2016
01:11 AM
Hi, thanks for your answer, but I don't understand. I think the answer that I accepted fixed the issue, because after starting the shell with spark-shell --master spark://masterhost:7077, on port 8080 I get:
Cores in use: 4 Total, 4 Used
Memory in use: 4.0 GB Total, 2.0 GB Used
Applications: 1 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
So it seems that it is already working when spark-shell is started that way, right? But are you suggesting that it should be spark-shell --master "local" spark:///mastehost:7077?
05-25-2016
11:23 AM
I only just saw your comment, but I think it's working fine now. It seems that I was requesting more memory than was available.
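The cause described here (asking for more memory than the workers have) can be sketched as a simple check. This is an illustration under stated assumptions, not Spark's scheduler code: `workers_that_fit` is a hypothetical helper, and the even 3072 MB split between two workers is assumed from the "6.0 GB Total" and "Alive Workers: 2" figures reported elsewhere in this thread.

```python
def workers_that_fit(free_memory_mb_by_worker, executor_memory_mb):
    """Return the workers with enough free memory to host one executor."""
    return [
        worker
        for worker, free_mb in free_memory_mb_by_worker.items()
        if free_mb >= executor_memory_mb
    ]


# Assumed even split of the 6.0 GB total across the two workers.
workers = {"worker1": 3072, "worker2": 3072}

# A request larger than any single worker can offer: no worker qualifies.
print(workers_that_fit(workers, 4096))

# A smaller request fits on both workers.
print(workers_that_fit(workers, 1024))
```

When no worker qualifies, the job has nowhere to run, which is consistent with the "Initial job has not accepted any resources" warning reported in this thread.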
05-25-2016
11:22 AM
I decreased the memory in spark-env.sh and now it seems to be working, thanks!
05-25-2016
11:18 AM
This warning appears when the query starts to execute in stage 0, and then the error appears.
05-25-2016
11:14 AM
Thanks, but now I'm getting this error when I try to execute a query: "16/05/25 12:15:15 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@5547fcb1)". And this warning: 16/05/25 12:15:05 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources. Do you know why?
05-25-2016
10:34 AM
Hi, I'm executing the job in the shell. To start the shell I use the command "spark-shell". So do I need to use spark-shell --master?
05-25-2016
09:54 AM
I'm executing a query on Spark and it is working; I'm getting the result. I did not configure any cluster, so Spark should be using its own cluster manager. But on the Spark page master:8080 I get this:
Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
But while I'm executing the query and refreshing the page, I get the same result:
Alive Workers: 2
Cores in use: 4 Total, 0 Used
Memory in use: 6.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
And after the query finishes, it is the same again. Do you know why? It's very strange: it seems that Spark is executing the query without using any hardware, which is not possible. So why is this info not updating, do you know?
05-22-2016
06:22 PM
Thanks for your answer. I updated the question with that.
05-22-2016
06:21 PM
When I execute the jps command, DataNode and NodeManager appear, but it seems that they are not starting correctly, because when I check the logs it looks like they aren't running correctly. Namenode log: <code>2016-05-22 11:40:37,725 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2016-05-22 11:40:37,731 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2016-05-22 11:40:38,109 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-05-22 11:40:38,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-05-22 11:40:38,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2016-05-22 11:40:38,220 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: fs.defaultFS is hdfs://masternode:9000
2016-05-22 11:40:38,221 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Clients are to use masternode:9000 to access this namenode/service.
2016-05-22 11:40:38,412 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://0.0.0.0:50070
2016-05-22 11:40:38,486 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-05-22 11:40:38,496 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-05-22 11:40:38,503 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.namenode is not defined
2016-05-22 11:40:38,509 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-05-22 11:40:38,513 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2016-05-22 11:40:38,514 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-05-22 11:40:38,514 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-05-22 11:40:38,543 INFO org.apache.hadoop.http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2016-05-22 11:40:38,544 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2016-05-22 11:40:38,564 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50070
2016-05-22 11:40:38,564 INFO org.mortbay.log: jetty-6.1.26
2016-05-22 11:40:38,751 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50070
2016-05-22 11:40:38,785 WARN org.apache.hadoop.hdfs.server.common.Util: Path /home/hadoopadmin/namenodeeeeeeeee should be specified as a URI in configuration files. Please update hdfs configuration.
2016-05-22 11:40:38,785 WARN org.apache.hadoop.hdfs.server.common.Util: Path /home/hadoopadmin/namenodeeeeeeeee should be specified as a URI in configuration files. Please update hdfs configuration.
2016-05-22 11:40:38,785 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2016-05-22 11:40:38,785 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
2016-05-22 11:40:38,791 WARN org.apache.hadoop.hdfs.server.common.Util: Path /home/hadoopadmin/namenodeeeeeeeee should be specified as a URI in configuration files. Please update hdfs configuration.
2016-05-22 11:40:38,791 WARN org.apache.hadoop.hdfs.server.common.Util: Path /home/hadoopadmin/namenodeeeeeeeee should be specified as a URI in configuration files. Please update hdfs configuration.
2016-05-22 11:40:38,823 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No KeyProvider found.
2016-05-22 11:40:38,824 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsLock is fair:true
2016-05-22 11:40:38,866 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
2016-05-22 11:40:38,867 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2016-05-22 11:40:38,868 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2016-05-22 11:40:38,870 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: The block deletion will start around 2016 May 22 11:40:38
2016-05-22 11:40:38,872 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap
2016-05-22 11:40:38,872 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-05-22 11:40:38,873 INFO org.apache.hadoop.util.GSet: 2.0% max memory 889 MB = 17.8 MB
2016-05-22 11:40:38,873 INFO org.apache.hadoop.util.GSet: capacity = 2^21 = 2097152 entries
2016-05-22 11:40:38,880 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 1
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 2
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false
2016-05-22 11:40:38,881 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2016-05-22 11:40:38,888 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = hadoopadmin (auth:SIMPLE)
2016-05-22 11:40:38,889 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup
2016-05-22 11:40:38,889 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2016-05-22 11:40:38,889 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2016-05-22 11:40:38,890 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2016-05-22 11:40:39,102 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2016-05-22 11:40:39,102 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-05-22 11:40:39,102 INFO org.apache.hadoop.util.GSet: 1.0% max memory 889 MB = 8.9 MB
2016-05-22 11:40:39,102 INFO org.apache.hadoop.util.GSet: capacity = 2^20 = 1048576 entries
2016-05-22 11:40:39,104 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: ACLs enabled? false
2016-05-22 11:40:39,104 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: XAttrs enabled? true
2016-05-22 11:40:39,104 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Maximum size of an xattr: 16384
2016-05-22 11:40:39,104 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2016-05-22 11:40:39,113 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2016-05-22 11:40:39,113 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-05-22 11:40:39,113 INFO org.apache.hadoop.util.GSet: 0.25% max memory 889 MB = 2.2 MB
2016-05-22 11:40:39,113 INFO org.apache.hadoop.util.GSet: capacity = 2^18 = 262144 entries
2016-05-22 11:40:39,115 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2016-05-22 11:40:39,115 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2016-05-22 11:40:39,115 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2016-05-22 11:40:39,118 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2016-05-22 11:40:39,118 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2016-05-22 11:40:39,118 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2016-05-22 11:40:39,119 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2016-05-22 11:40:39,119 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2016-05-22 11:40:39,122 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2016-05-22 11:40:39,122 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-05-22 11:40:39,122 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
2016-05-22 11:40:39,122 INFO org.apache.hadoop.util.GSet: capacity = 2^15 = 32768 entries
2016-05-22 11:40:39,135 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoopadmin/namenodeeeeeeeee/in_use.lock acquired by nodename 30068@masternode
2016-05-22 11:40:39,190 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering unfinalized segments in /home/hadoopadmin/namenodeeeeeeeee/current
2016-05-22 11:40:39,191 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: No edit log streams selected.
2016-05-22 11:40:39,232 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 1 INodes.
2016-05-22 11:40:39,265 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf: Loaded FSImage in 0 seconds.
2016-05-22 11:40:39,265 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 0 from /home/hadoopadmin/namenodeeeeeeeee/current/fsimage_0000000000000000000
2016-05-22 11:40:39,275 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=false, haEnabled=false, isRollingUpgrade=false)
2016-05-22 11:40:39,275 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 1
2016-05-22 11:40:39,423 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2016-05-22 11:40:39,423 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 297 msecs
2016-05-22 11:40:39,686 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to masternode:9000
2016-05-22 11:40:39,693 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-22 11:40:39,707 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 9000
2016-05-22 11:40:39,735 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2016-05-22 11:40:39,736 WARN org.apache.hadoop.hdfs.server.common.Util: Path /home/hadoopadmin/namenodeeeeeeeee should be specified as a URI in configuration files. Please update hdfs configuration.
2016-05-22 11:40:39,748 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: Number of blocks under construction: 0
2016-05-22 11:40:39,748 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: Number of blocks under construction: 0
2016-05-22 11:40:39,748 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: initializing replication queues
2016-05-22 11:40:39,749 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs
2016-05-22 11:40:39,749 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2016-05-22 11:40:39,749 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2016-05-22 11:40:39,760 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Total number of blocks = 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of invalid blocks = 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of under-replicated blocks = 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of over-replicated blocks = 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Number of blocks being written = 0
2016-05-22 11:40:39,786 INFO org.apache.hadoop.hdfs.StateChange: STATE* Replication Queue initialization scan for invalid, over- and under-replicated blocks completed in 36 msec
2016-05-22 11:40:39,800 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting
2016-05-22 11:40:39,801 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-22 11:40:39,803 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode RPC up at: masternode/10.18.0.50:9000
2016-05-22 11:40:39,803 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state
2016-05-22 11:40:39,807 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting CacheReplicationMonitor with interval 30000 milliseconds
2016-05-22 11:41:49,208 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:41:49,208 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:41:49,208 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 1
2016-05-22 11:41:49,208 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:41:49,209 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 5
2016-05-22 11:41:49,212 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000001 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000001-0000000000000000002
2016-05-22 11:41:49,214 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 3
2016-05-22 11:42:49,280 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:42:49,280 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:42:49,280 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 3
2016-05-22 11:42:49,281 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:42:49,281 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 4
2016-05-22 11:42:49,282 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000003 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000003-0000000000000000004
2016-05-22 11:42:49,282 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 5
2016-05-22 11:43:49,303 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:43:49,303 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:43:49,303 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 5
2016-05-22 11:43:49,304 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:43:49,305 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 5
2016-05-22 11:43:49,306 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000005 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000005-0000000000000000006
2016-05-22 11:43:49,306 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 7
2016-05-22 11:44:49,320 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:44:49,320 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:44:49,320 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 7
2016-05-22 11:44:49,320 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:44:49,322 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 5
2016-05-22 11:44:49,326 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000007 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000007-0000000000000000008
2016-05-22 11:44:49,326 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 9
2016-05-22 11:45:49,340 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:45:49,340 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:45:49,340 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 9
2016-05-22 11:45:49,341 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:45:49,342 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 5
2016-05-22 11:45:49,343 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000009 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000009-0000000000000000010
2016-05-22 11:45:49,343 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 11
2016-05-22 11:46:49,380 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:46:49,380 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:46:49,381 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 11
2016-05-22 11:46:49,381 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 12
2016-05-22 11:46:49,382 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 13
2016-05-22 11:46:49,383 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000011 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000011-0000000000000000012
2016-05-22 11:46:49,383 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 13
2016-05-22 11:47:49,399 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:47:49,399 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:47:49,399 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 13
2016-05-22 11:47:49,399 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:47:49,400 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 4
2016-05-22 11:47:49,401 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000013 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000013-0000000000000000014
2016-05-22 11:47:49,401 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 15
2016-05-22 11:48:49,419 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:48:49,419 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:48:49,419 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 15
2016-05-22 11:48:49,421 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:48:49,424 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 7
2016-05-22 11:48:49,425 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000015 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000015-0000000000000000016
2016-05-22 11:48:49,425 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 17
2016-05-22 11:49:49,446 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:49:49,446 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:49:49,446 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 17
2016-05-22 11:49:49,447 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:49:49,449 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 7
2016-05-22 11:49:49,450 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000017 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000017-0000000000000000018
2016-05-22 11:49:49,450 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 19
2016-05-22 11:50:49,475 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:50:49,475 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:50:49,475 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 19
2016-05-22 11:50:49,476 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 13
2016-05-22 11:50:49,477 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 13
2016-05-22 11:50:49,478 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000019 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000019-0000000000000000020
2016-05-22 11:50:49,478 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 21
2016-05-22 11:51:49,497 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:51:49,497 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:51:49,497 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 21
2016-05-22 11:51:49,499 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:51:49,500 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 5
2016-05-22 11:51:49,501 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000021 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000021-0000000000000000022
2016-05-22 11:51:49,501 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 23
2016-05-22 11:52:49,516 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:52:49,516 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:52:49,516 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 23
2016-05-22 11:52:49,517 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:52:49,518 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 1 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 6
2016-05-22 11:52:49,519 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000023 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000023-0000000000000000024
2016-05-22 11:52:49,519 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 25
2016-05-22 11:53:49,538 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:53:49,538 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:53:49,538 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 25
2016-05-22 11:53:49,538 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:53:49,539 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 6
2016-05-22 11:53:49,540 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000025 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000025-0000000000000000026
2016-05-22 11:53:49,540 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 27
2016-05-22 11:54:49,570 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:54:49,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:54:49,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 27
2016-05-22 11:54:49,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 15
2016-05-22 11:54:49,572 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 16
2016-05-22 11:54:49,573 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000027 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000027-0000000000000000028
2016-05-22 11:54:49,573 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 29
2016-05-22 11:55:49,586 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:55:49,586 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:55:49,586 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 29
2016-05-22 11:55:49,586 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 4
2016-05-22 11:55:49,587 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 4
2016-05-22 11:55:49,588 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000029 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000029-0000000000000000030
2016-05-22 11:55:49,588 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 31
2016-05-22 11:56:49,647 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.18.0.50
2016-05-22 11:56:49,647 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2016-05-22 11:56:49,647 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 31
2016-05-22 11:56:49,647 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 5
2016-05-22 11:56:49,649 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 7
2016-05-22 11:56:49,649 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /home/hadoopadmin/namenodeeeeeeeee/current/edits_inprogress_0000000000000000031 -> /home/hadoopadmin/namenodeeeeeeeee/current/edits_0000000000000000031-0000000000000000032
2016-05-22 11:56:49,650 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 33
NodeManager log:
<code>STARTUP_MSG: java = 1.8.0_91
************************************************************/
2016-05-22 11:41:11,219 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: registered UNIX signal handlers for [TERM, HUP, INT]
2016-05-22 11:41:12,264 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher
2016-05-22 11:41:12,265 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher
2016-05-22 11:41:12,266 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizationEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService
2016-05-22 11:41:12,266 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices
2016-05-22 11:41:12,266 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
2016-05-22 11:41:12,267 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher
2016-05-22 11:41:12,286 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.ContainerManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl
2016-05-22 11:41:12,286 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.NodeManagerEventType for class org.apache.hadoop.yarn.server.nodemanager.NodeManager
2016-05-22 11:41:12,326 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-05-22 11:41:12,397 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-05-22 11:41:12,398 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system started
2016-05-22 11:41:12,420 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler
2016-05-22 11:41:12,421 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadService
2016-05-22 11:41:12,421 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: per directory file limit = 8192
2016-05-22 11:41:12,478 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: usercache path : file:/tmp/hadoop-hadoopadmin/nm-local-dir/usercache_DEL_1463913672424
2016-05-22 11:41:12,529 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.LocalizerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker
2016-05-22 11:41:12,548 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.LinuxResourceCalculatorPlugin@2dfaea86
2016-05-22 11:41:12,548 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Using ResourceCalculatorProcessTree : null
2016-05-22 11:41:12,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Physical memory check enabled: true
2016-05-22 11:41:12,549 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Virtual memory check enabled: true
2016-05-22 11:41:12,552 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: NodeManager configured with 8 G physical memory allocated to containers, which is more than 80% of the total physical memory available (3.9 G). Thrashing might happen.
2016-05-22 11:41:12,557 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Initialized nodemanager for null: physical-memory=8192 virtual-memory=17204 virtual-cores=8
2016-05-22 11:41:12,596 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-22 11:41:12,619 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 40484
2016-05-22 11:41:12,651 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.api.ContainerManagementProtocolPB to the server
2016-05-22 11:41:12,651 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Blocking new container-requests as container manager rpc server is still starting.
2016-05-22 11:41:12,651 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-22 11:41:12,652 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 40484: starting
2016-05-22 11:41:12,661 INFO org.apache.hadoop.yarn.server.nodemanager.security.NMContainerTokenSecretManager: Updating node address : ubuntuslave:40484
2016-05-22 11:41:12,668 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-22 11:41:12,669 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8040
2016-05-22 11:41:12,671 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB to the server
2016-05-22 11:41:12,672 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8040: starting
2016-05-22 11:41:12,672 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-22 11:41:12,673 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer started on port 8040
2016-05-22 11:41:12,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager started at ubuntuslave/10.17.0.89:40484
2016-05-22 11:41:12,675 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: ContainerManager bound to 0.0.0.0/0.0.0.0:0
2016-05-22 11:41:12,676 INFO org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer: Instantiating NMWebApp at 0.0.0.0:8042
2016-05-22 11:41:12,749 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-05-22 11:41:12,758 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-05-22 11:41:12,763 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.nodemanager is not defined
2016-05-22 11:41:12,771 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-05-22 11:41:12,773 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context node
2016-05-22 11:41:12,773 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-05-22 11:41:12,773 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-05-22 11:41:12,776 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /node/*
2016-05-22 11:41:12,777 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2016-05-22 11:41:12,786 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 8042
2016-05-22 11:41:12,786 INFO org.mortbay.log: jetty-6.1.26
2016-05-22 11:41:12,813 INFO org.mortbay.log: Extract jar:file:/usr/local/hadoop-2.7.1/share/hadoop/yarn/hadoop-yarn-common-2.7.1.jar!/webapps/node to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
2016-05-22 11:41:13,010 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2016-05-22 11:41:13,010 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /node started at 8042
2016-05-22 11:41:13,316 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2016-05-22 11:41:13,324 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at masternode/10.18.0.50:8031
2016-05-22 11:41:13,417 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2016-05-22 11:41:13,426 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2016-05-22 11:41:33,471 INFO org.apache.hadoop.ipc.Client: Retrying connect to server
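Regarding the WARN line in the NodeManager log above: the node reports 3.9 G of physical memory, while 8 G is allocated to containers (8192 MB is the default of yarn.nodemanager.resource.memory-mb). A rough way to pick a smaller value is sketched below in Python. The 0.8 factor is only borrowed from the 80% threshold in the warning, and the function name is made up for illustration.

```python
# Sketch: derive a candidate container-memory setting from physical RAM.
# The default fraction mirrors the 80% threshold named in the WARN line;
# treat the result as a starting point, not a tuned value.

def suggest_nm_memory_mb(physical_gb: float, fraction: float = 0.8) -> int:
    """Return a candidate yarn.nodemanager.resource.memory-mb value in MB."""
    if physical_gb <= 0 or not 0 < fraction <= 1:
        raise ValueError("physical_gb must be positive and fraction in (0, 1]")
    # Convert GB to MB, apply the fraction, and round down to a whole MB.
    return int(physical_gb * 1024 * fraction)
```

For the 3.9 G reported here, `suggest_nm_memory_mb(3.9)` gives 3194, which would go into yarn.nodemanager.resource.memory-mb in yarn-site.xml on ubuntuslave. This is probably unrelated to the connection retries, but it addresses the thrashing the warning mentions.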
DataNode log:
<code>STARTUP_MSG: java = 1.8.0_91
************************************************************/
2016-05-22 11:40:40,852 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2016-05-22 11:40:41,523 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2016-05-22 11:40:41,607 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-05-22 11:40:41,607 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2016-05-22 11:40:41,612 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2016-05-22 11:40:41,614 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is ubuntuslave
2016-05-22 11:40:41,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting DataNode with maxLockedMemory = 0
2016-05-22 11:40:41,644 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:50010
2016-05-22 11:40:41,646 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2016-05-22 11:40:41,646 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Number threads for balancing is 5
2016-05-22 11:40:41,739 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-05-22 11:40:41,750 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-05-22 11:40:41,768 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.datanode is not defined
2016-05-22 11:40:41,776 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-05-22 11:40:41,779 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2016-05-22 11:40:41,780 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2016-05-22 11:40:41,780 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2016-05-22 11:40:41,796 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 52013
2016-05-22 11:40:41,796 INFO org.mortbay.log: jetty-6.1.26
2016-05-22 11:40:41,990 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:52013
2016-05-22 11:40:42,109 INFO org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer: Listening HTTP traffic on /0.0.0.0:50075
2016-05-22 11:40:42,298 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnUserName = hadoopadmin
2016-05-22 11:40:42,298 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: supergroup = supergroup
2016-05-22 11:40:42,343 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-22 11:40:42,361 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 50020
2016-05-22 11:40:42,388 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened IPC server at /0.0.0.0:50020
2016-05-22 11:40:42,400 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Refresh request received for nameservices: null
2016-05-22 11:40:42,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting BPOfferServices for nameservices: <default>
2016-05-22 11:40:42,436 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool <registering> (Datanode Uuid unassigned) service to masternode/10.18.0.50:9000 starting to offer service
2016-05-22 11:40:42,444 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2016-05-22 11:40:42,445 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-22 11:41:02,555 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/10.18.0.50:9000. Already tried 0 time(s); maxRetries=45
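Both the NodeManager and the DataNode logs end with "Retrying connect to server", and the DataNode names masternode/10.18.0.50:9000 as the target. Before changing any configuration, it can help to check from ubuntuslave whether those ports are reachable at all. A minimal Python sketch, assuming Python 3 is available on the slave; `can_connect` is a hypothetical helper, not part of Hadoop:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `can_connect("masternode", 9000)` and `can_connect("masternode", 8031)` on ubuntuslave would show whether the NameNode and ResourceManager ports answer. If they do not, typical things to look at are a firewall, the /etc/hosts entries on both machines, or a daemon listening only on a loopback address. The logs here do not say which of these, if any, applies.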
yarn-site.xml:
<code><configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>masternode:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>masternode:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>masternode:8030</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>masternode:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>masternode:8088</value>
</property>
</configuration>
core-site.xml:
<code><configuration>
<property>
<name>fs.defaultFS</name>
<value>masternode:9000</value>
</property>
</configuration>
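One thing that stands out in the core-site.xml as pasted: fs.defaultFS is normally written as a URI such as hdfs://masternode:9000, while the value here has no scheme. That may just be an artifact of pasting into the forum. A small Python sketch to check a config file for this; the helper names are hypothetical:

```python
import xml.etree.ElementTree as ET


def read_hadoop_conf(xml_text: str) -> dict:
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def default_fs_has_scheme(conf: dict) -> bool:
    """True when fs.defaultFS looks like scheme://authority, e.g. hdfs://host:port."""
    return "://" in conf.get("fs.defaultFS", "")


# The core-site.xml exactly as it appears in the post.
CORE_SITE_AS_POSTED = """<configuration>
<property>
<name>fs.defaultFS</name>
<value>masternode:9000</value>
</property>
</configuration>"""
```

`default_fs_has_scheme(read_hadoop_conf(CORE_SITE_AS_POSTED))` is False for the pasted value and True once the value reads hdfs://masternode:9000. The same file has to be present on every node, since the DataNode reads fs.defaultFS to find the NameNode.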
hdfs-site.xml:
<configuration>
<property>
<name>dfs.name.dir</name>
<value>file:///home/hadoopadmin/hadooptmp</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///home/hadoopadmin/hadooptmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
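Two observations on the hdfs-site.xml as pasted, offered as things to check rather than a diagnosis. First, dfs.name.dir and dfs.data.dir are the deprecated names of dfs.namenode.name.dir and dfs.datanode.data.dir in Hadoop 2.x. Second, both point at the same directory, which only matters if one host runs both roles. In addition, the NameNode log above writes its edits under /home/hadoopadmin/namenodeeeeeeeee, not under the directory in this file, so the running NameNode may be using a different configuration than the one shown. A Python sketch that reports the first two points; the helper names are made up:

```python
import xml.etree.ElementTree as ET

# Deprecated key -> Hadoop 2.x key for the two storage settings in the post.
RENAMED_KEYS = {
    "dfs.name.dir": "dfs.namenode.name.dir",
    "dfs.data.dir": "dfs.datanode.data.dir",
}


def storage_dir_findings(xml_text: str) -> list:
    """Return human-readable findings about the HDFS storage directory settings."""
    findings = []
    dirs = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in RENAMED_KEYS:
            findings.append(f"{name} is deprecated; the current name is {RENAMED_KEYS[name]}")
            name = RENAMED_KEYS[name]
        if name in RENAMED_KEYS.values():
            dirs[name] = value
    name_dir = dirs.get("dfs.namenode.name.dir")
    data_dir = dirs.get("dfs.datanode.data.dir")
    # Flag a shared directory only when both settings are present and identical.
    if name_dir and name_dir == data_dir:
        findings.append(f"NameNode and DataNode storage both point at {name_dir}")
    return findings
```

Run against the file above, this returns three findings: one per deprecated key and one for the shared directory. Comparing the directories in the NameNode log with the ones in the config file is left as a manual step.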
masters file: masternode
slaves file: ubuntuslave
Do you understand why it's not working?
05-19-2016
08:02 PM
Hi, I'm testing on Hadoop 2.7.1, Java 1.8 and spark-1.6.1-bin-hadoop2.6.
05-19-2016
08:02 PM
Thanks for your help. I tried starting Spark with the command you suggested, but I get exactly the same error.
05-19-2016
08:02 PM
When I start Spark on YARN with the command "spark-shell --master yarn-client", I get this error:
<code>ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
The full error from starting spark-shell on YARN is further below. The YARN container logs are here:
<code>Container: container_1463670715317_0002_01_000001 on masternode_52694
============================================================================
LogType:stderr
Log Upload Time:Thu May 19 16:19:54 +0100 2016
LogLength:5748
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tmp/hadoop-hadoopadmin/nm-local-dir/usercache/hadoopadmin/filecache/13/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/05/19 16:19:44 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
16/05/19 16:19:45 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1463670715317_0002_000001
16/05/19 16:19:46 INFO spark.SecurityManager: Changing view acls to: hadoopadmin
16/05/19 16:19:46 INFO spark.SecurityManager: Changing modify acls to: hadoopadmin
16/05/19 16:19:46 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopadmin); users with modify permissions: Set(hadoopadmin)
16/05/19 16:19:46 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
16/05/19 16:19:46 INFO yarn.ApplicationMaster: Driver now available: 10.17.0.50:43771
16/05/19 16:19:47 INFO yarn.ApplicationMaster$AMEndpoint: Add WebUI Filter. AddWebUIFilter(org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,Map(PROXY_HOSTS -> masternode, PROXY_URI_BASES -> http://masternode:8088/proxy/application_1463670715317_0002),/proxy/application_1463670715317_0002)
16/05/19 16:19:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
16/05/19 16:19:47 INFO yarn.YarnRMClient: Registering the ApplicationMaster
16/05/19 16:19:47 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
16/05/19 16:19:47 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
16/05/19 16:19:47 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
16/05/19 16:19:47 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
16/05/19 16:19:47 INFO impl.AMRMClientImpl: Received new token for : masternode:52694
16/05/19 16:19:47 INFO yarn.YarnAllocator: Launching container container_1463670715317_0002_01_000002 for on host masternode
16/05/19 16:19:47 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@10.17.0.50:43771, executorHostname: masternode
16/05/19 16:19:47 INFO yarn.ExecutorRunnable: Starting Executor Container
16/05/19 16:19:47 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
16/05/19 16:19:47 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
16/05/19 16:19:47 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext
16/05/19 16:19:47 INFO yarn.ExecutorRunnable: Preparing Local resources
16/05/19 16:19:47 INFO yarn.ExecutorRunnable: Prepared Local resources Map(_spark_.jar -> resource { scheme: "hdfs" host: "localhost" port: 9000 file: "/user/hadoopadmin/.sparkStaging/application_1463670715317_0002/spark-assembly-1.6.1-hadoop2.6.0.jar" } size: 187698038 timestamp: 1463671182405 type: FILE visibility: PRIVATE)
16/05/19 16:19:48 INFO yarn.ExecutorRunnable:
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> PWD<CPS>PWD/_spark_.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/
SPARK_LOG_URL_STDERR -> http://masternode:8042/node/containerlogs/container_1463670715317_0002_01_000002/hadoopadmin/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1463670715317_0002
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 187698038
SPARK_USER -> hadoopadmin
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1463671182405
SPARK_LOG_URL_STDOUT -> http://masternode:8042/node/containerlogs/container_1463670715317_0002_01_000002/hadoopadmin/stdout?start=-4096
SPARK_YARN_CACHE_FILES -> hdfs://localhost:9000/user/hadoopadmin/.sparkStaging/application_1463670715317_0002/spark-assembly-1.6.1-hadoop2.6.0.jar#_spark_.jar
command:
JAVA_HOME/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms1024m -Xmx1024m -Djava.io.tmpdir=PWD/tmp '-Dspark.driver.port=43771' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.17.0.50:43771 --executor-id 1 --hostname masternode --cores 1 --app-id application_1463670715317_0002 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
16/05/19 16:19:48 INFO impl.ContainerManagementProtocolProxy: Opening proxy : masternode:52694
16/05/19 16:19:48 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/05/19 16:19:48 INFO yarn.ApplicationMaster: Final app status: UNDEFINED, exitCode: 0, (reason: Shutdown hook called before final status was reported.)
16/05/19 16:19:48 INFO util.ShutdownHookManager: Shutdown hook called
End of LogType:stderr
LogType:stdout
Log Upload Time:Thu May 19 16:19:54 +0100 2016
LogLength:0
Log Contents:
End of LogType:stdout
Container: container_1463670715317_0002_02_000002 on masternode_52694
============================================================================
LogType:stderr
Log Upload Time:Thu May 19 16:19:54 +0100 2016
LogLength:737
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tmp/hadoop-hadoopadmin/nm-local-dir/usercache/hadoopadmin/filecache/13/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/05/19 16:19:54 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
16/05/19 16:19:54 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
End of LogType:stderr
LogType:stdout
Log Upload Time:Thu May 19 16:19:54 +0100 2016
LogLength:0
Log Contents:
End of LogType:stdout
hadoopadmin@master:~$
The full error shown when I try to start Spark with "spark-shell --master yarn-client": <code>hadoopadmin@master:~$ spark-shell --master yarn-client
16/05/19 16:19:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/19 16:19:33 INFO spark.SecurityManager: Changing view acls to: hadoopadmin
16/05/19 16:19:33 INFO spark.SecurityManager: Changing modify acls to: hadoopadmin
16/05/19 16:19:33 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopadmin); users with modify permissions: Set(hadoopadmin)
16/05/19 16:19:33 INFO spark.HttpServer: Starting HTTP Server
16/05/19 16:19:33 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/05/19 16:19:33 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:37052
16/05/19 16:19:33 INFO util.Utils: Successfully started service 'HTTP class server' on port 37052.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77)
Type in expressions to have them evaluated.
Type :help for more information.
16/05/19 16:19:37 INFO spark.SparkContext: Running Spark version 1.6.1
16/05/19 16:19:37 INFO spark.SecurityManager: Changing view acls to: hadoopadmin
16/05/19 16:19:37 INFO spark.SecurityManager: Changing modify acls to: hadoopadmin
16/05/19 16:19:37 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopadmin); users with modify permissions: Set(hadoopadmin)
16/05/19 16:19:38 INFO util.Utils: Successfully started service 'sparkDriver' on port 43771.
16/05/19 16:19:38 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/05/19 16:19:38 INFO Remoting: Starting remoting
16/05/19 16:19:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.17.0.50:57722]
16/05/19 16:19:38 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 57722.
16/05/19 16:19:38 INFO spark.SparkEnv: Registering MapOutputTracker
16/05/19 16:19:38 INFO spark.SparkEnv: Registering BlockManagerMaster
16/05/19 16:19:38 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-e8de3854-2526-4725-8c73-edb3fce2df33
16/05/19 16:19:38 INFO storage.MemoryStore: MemoryStore started with capacity 511.1 MB
16/05/19 16:19:38 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/05/19 16:19:39 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/05/19 16:19:39 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/05/19 16:19:39 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/05/19 16:19:39 INFO ui.SparkUI: Started SparkUI at http://10.17.0.50:4040
16/05/19 16:19:39 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/05/19 16:19:39 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
16/05/19 16:19:39 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/05/19 16:19:39 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/05/19 16:19:39 INFO yarn.Client: Setting up container launch context for our AM
16/05/19 16:19:39 INFO yarn.Client: Setting up the launch environment for our AM container
16/05/19 16:19:39 INFO yarn.Client: Preparing resources for our AM container
16/05/19 16:19:40 INFO yarn.Client: Uploading resource file:/usr/local/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar -> hdfs://localhost:9000/user/hadoopadmin/.sparkStaging/application_1463670715317_0002/spark-assembly-1.6.1-hadoop2.6.0.jar
16/05/19 16:19:42 INFO yarn.Client: Uploading resource file:/tmp/spark-942afe6a-95ca-4b8b-b06f-e9e3ac6aa751/__spark_conf__5009784131719458516.zip -> hdfs://localhost:9000/user/hadoopadmin/.sparkStaging/application_1463670715317_0002/__spark_conf__5009784131719458516.zip
16/05/19 16:19:42 INFO spark.SecurityManager: Changing view acls to: hadoopadmin
16/05/19 16:19:42 INFO spark.SecurityManager: Changing modify acls to: hadoopadmin
16/05/19 16:19:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoopadmin); users with modify permissions: Set(hadoopadmin)
16/05/19 16:19:42 INFO yarn.Client: Submitting application 2 to ResourceManager
16/05/19 16:19:42 INFO impl.YarnClientImpl: Submitted application application_1463670715317_0002
16/05/19 16:19:43 INFO yarn.Client: Application report for application_1463670715317_0002 (state: ACCEPTED)
16/05/19 16:19:43 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1463671182634
final status: UNDEFINED
tracking URL: http://masternode:8088/proxy/application_1463670715317_0002/
user: hadoopadmin
16/05/19 16:19:44 INFO yarn.Client: Application report for application_1463670715317_0002 (state: ACCEPTED)
16/05/19 16:19:45 INFO yarn.Client: Application report for application_1463670715317_0002 (state: ACCEPTED)
16/05/19 16:19:46 INFO yarn.Client: Application report for application_1463670715317_0002 (state: ACCEPTED)
16/05/19 16:19:47 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/05/19 16:19:47 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> masternode, PROXY_URI_BASES -> http://masternode:8088/proxy/application_1463670715317_0002), /proxy/application_1463670715317_0002
16/05/19 16:19:47 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/05/19 16:19:47 INFO yarn.Client: Application report for application_1463670715317_0002 (state: RUNNING)
16/05/19 16:19:47 INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.17.0.50
ApplicationMaster RPC port: 0
queue: default
start time: 1463671182634
final status: UNDEFINED
tracking URL: http://masternode:8088/proxy/application_1463670715317_0002/
user: hadoopadmin
16/05/19 16:19:47 INFO cluster.YarnClientSchedulerBackend: Application application_1463670715317_0002 has started running.
16/05/19 16:19:47 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 49183.
16/05/19 16:19:47 INFO netty.NettyBlockTransferService: Server created on 49183
16/05/19 16:19:47 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/05/19 16:19:47 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.17.0.50:49183 with 511.1 MB RAM, BlockManagerId(driver, 10.17.0.50, 49183)
16/05/19 16:19:47 INFO storage.BlockManagerMaster: Registered BlockManager
16/05/19 16:19:51 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/05/19 16:19:51 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> masternode, PROXY_URI_BASES -> http://masternode:8088/proxy/application_1463670715317_0002), /proxy/application_1463670715317_0002
16/05/19 16:19:51 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/05/19 16:19:54 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/05/19 16:19:54 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/05/19 16:19:54 INFO ui.SparkUI: Stopped Spark web UI at http://10.17.0.50:4040
16/05/19 16:19:54 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/05/19 16:19:54 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/05/19 16:19:54 INFO cluster.YarnClientSchedulerBackend: Stopped
16/05/19 16:19:54 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/19 16:19:54 INFO storage.MemoryStore: MemoryStore cleared
16/05/19 16:19:54 INFO storage.BlockManager: BlockManager stopped
16/05/19 16:19:54 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/05/19 16:19:54 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/19 16:19:54 INFO spark.SparkContext: Successfully stopped SparkContext
16/05/19 16:19:54 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/19 16:19:54 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/19 16:19:54 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/05/19 16:20:09 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/05/19 16:20:09 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $line3.$read$$iwC$$iwC.<init>(<console>:15)
at $line3.$read$$iwC.<init>(<console>:24)
at $line3.$read.<init>(<console>:26)
at $line3.$read$.<init>(<console>:30)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.<init>(<console>:7)
at $line3.$eval$.<clinit>(<console>)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/05/19 16:20:09 INFO spark.SparkContext: SparkContext already stopped.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at ... org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
<console>:16: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:16: error: not found: value sqlContext
import sqlContext.sql
^
05-14-2016
05:18 PM
Thanks for your help. I'm studying Spark configuration and functionality. I'm trying to run Spark with yarn-client, but I'm getting the error above. It only works if I start Spark with "spark-shell" (without YARN), and I don't understand why, because all the configuration seems OK.
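In case it helps anyone reading a log like the one above, here is a minimal sketch (not part of Spark or Hadoop) that pulls the first ERROR entry out of pasted console output. It assumes every log line follows the "yy/MM/dd HH:mm:ss LEVEL logger: message" layout seen in this thread; the names LogEntry, parse_entries and first_error are hypothetical helpers made up for illustration, and lines that do not match the layout (stack traces, banners) are simply skipped.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    logger: str
    message: str


# Assumed layout: "yy/MM/dd HH:mm:ss LEVEL logger: message".
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ([A-Z]+) ([\w.$]+): ?(.*)$"
)

# A few lines copied from the output in this thread.
SAMPLE = [
    "16/05/19 16:19:47 INFO storage.BlockManagerMaster: Registered BlockManager",
    "16/05/19 16:19:54 ERROR cluster.YarnClientSchedulerBackend: "
    "Yarn application has already exited with state FINISHED!",
    "16/05/19 16:20:09 ERROR spark.SparkContext: Error initializing SparkContext.",
    "java.lang.NullPointerException",
]


def parse_entries(lines: Iterable[str]) -> List[LogEntry]:
    """Return the lines that match the assumed layout, in order."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(LogEntry(*match.groups()))
    return entries


def first_error(lines: Iterable[str]) -> Optional[LogEntry]:
    """Return the earliest ERROR entry, or None if there is none."""
    for entry in parse_entries(lines):
        if entry.level == "ERROR":
            return entry
    return None


if __name__ == "__main__":
    print(first_error(SAMPLE))
```

The earliest ERROR is usually a better starting point than the final stack trace, since later exceptions can be side effects of the first failure.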
05-14-2016
12:11 AM
I have a single-node Hadoop cluster running. With the jps command I can see that all the processes are running. When I start Spark with "spark-shell", it starts correctly. Now I'm trying to start it with "spark-shell --master yarn-client", but this way I get the error below. For Spark, I just downloaded and extracted it, and configured the spark-env.sh file like this:
SPARK_JAVA_OPTS=-Dspark.driver.port=53411
HADOOP_CONF_DIR=/usr/local/hadoop-2.7.1/conf
SPARK_MASTER_IP=master
Then I start all the Spark processes with ./start-all.sh
Hadoop "yarn-site.xml":
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
Error:
16/05/14 18:06:31 INFO Client: Application report for application_1463245231113_0003 (state: RUNNING)
16/05/14 18:06:31 DEBUG Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.10.0.11
ApplicationMaster RPC port: 0
queue: default
start time: 1463245585250
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1463245231113_0003/
user: hadoopadmin
16/05/14 18:06:31 INFO YarnClientSchedulerBackend: Application application_1463245231113_0003 has started running.
16/05/14 18:06:31 DEBUG TransportServer: Shuffle server started on port :40948
16/05/14 18:06:31 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 40948.
16/05/14 18:06:31 INFO NettyBlockTransferService: Server created on 40948
16/05/14 18:06:31 INFO BlockManagerMaster: Trying to register BlockManager
16/05/14 18:06:31 INFO BlockManagerMasterEndpoint: Registering block manager 10.10.0.11:40948 with 511.1 MB RAM, BlockManagerId(driver, 10.10.0.11, 40948)
16/05/14 18:06:31 INFO BlockManagerMaster: Registered BlockManager
16/05/14 18:06:32 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #33
16/05/14 18:06:32 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #33
16/05/14 18:06:32 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 3ms
16/05/14 18:06:33 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #34
16/05/14 18:06:33 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #34
16/05/14 18:06:33 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 2ms
16/05/14 18:06:34 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #35
16/05/14 18:06:34 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #35
16/05/14 18:06:34 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 2ms
16/05/14 18:06:34 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/05/14 18:06:34 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1463245231113_0003), /proxy/application_1463245231113_0003
16/05/14 18:06:34 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/05/14 18:06:35 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin: closed
16/05/14 18:06:35 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin: stopped, remaining connections 1
16/05/14 18:06:35 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #36
16/05/14 18:06:35 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #36
16/05/14 18:06:35 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 2ms
16/05/14 18:06:36 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #37
16/05/14 18:06:36 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #37
16/05/14 18:06:36 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 2ms
16/05/14 18:06:37 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #38
16/05/14 18:06:37 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #38
16/05/14 18:06:37 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 3ms
16/05/14 18:06:38 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin sending #39
16/05/14 18:06:38 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin got value #39
16/05/14 18:06:38 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 3ms
16/05/14 18:06:38 DEBUG Client: The ping interval is 60000 ms.
16/05/14 18:06:38 DEBUG Client: Connecting to master/10.10.0.11:9000
16/05/14 18:06:38 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin: starting, having connections 2
16/05/14 18:06:38 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin sending #40
16/05/14 18:06:38 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin got value #40
16/05/14 18:06:38 DEBUG ProtobufRpcEngine: Call: getFileInfo took 5ms
16/05/14 18:06:38 ERROR YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/05/14 18:06:38 INFO SparkUI: Stopped Spark web UI at http://10.10.0.11:4040
16/05/14 18:06:38 INFO YarnClientSchedulerBackend: Shutting down all executors
16/05/14 18:06:38 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/05/14 18:06:38 DEBUG AbstractService: Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state STOPPED
16/05/14 18:06:38 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client@27dfd12b
16/05/14 18:06:38 INFO YarnClientSchedulerBackend: Stopped
16/05/14 18:06:38 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/14 18:06:38 INFO MemoryStore: MemoryStore cleared
16/05/14 18:06:38 INFO BlockManager: BlockManager stopped
16/05/14 18:06:38 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/14 18:06:38 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/14 18:06:38 INFO SparkContext: Successfully stopped SparkContext
16/05/14 18:06:38 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/14 18:06:38 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/14 18:06:38 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/05/14 18:06:40 DEBUG PoolThreadCache: Freed 1 thread-local buffer(s) from thread: shuffle-server-1
16/05/14 18:06:40 DEBUG PoolThreadCache: Freed 9 thread-local buffer(s) from thread: shuffle-server-0
16/05/14 18:06:48 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin: closed
16/05/14 18:06:48 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:8032 from hadoopadmin: stopped, remaining connections 1
16/05/14 18:06:48 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin: closed
16/05/14 18:06:48 DEBUG Client: IPC Client (1374243709) connection to master/10.10.0.11:9000 from hadoopadmin: stopped, remaining connections 0
16/05/14 18:06:49 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
16/05/14 18:06:49 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $line3.$read$$iwC$$iwC.<init>(<console>:15)
at $line3.$read$$iwC.<init>(<console>:24)
at $line3.$read.<init>(<console>:26)
at $line3.$read$.<init>(<console>:30)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.<init>(<console>:7)
at $line3.$eval$.<clinit>(<console>)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/05/14 18:06:49 INFO SparkContext: SparkContext already stopped.
java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:584)
at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
java.lang.NullPointerException
at org.apache.spark.sql.SQLContext$.createListenerAndUI(SQLContext.scala:1367)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
at $iwC$$iwC.<init>(<console>:15)
at $iwC.<init>(<console>:24)
at <init>(<console>:26)
at .<init>(<console>:30)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:132)
at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/05/14 18:06:51 DEBUG LeaseRenewer: Lease renewer daemon for [] with renew id 1 executed
<console>:16: error: not found: value sqlContext
import sqlContext.implicits._
^
<console>:16: error: not found: value sqlContext
import sqlContext.sql
^
scala> exit
16/05/14 18:07:21 DEBUG LeaseRenewer: Lease renewer daemon for [] with renew id 1 executed
16/05/14 18:07:25 DEBUG LeaseRenewer: Lease renewer daemon for [] with renew id 1 expired
05-10-2016
11:21 AM
When I access the Spark UI at the master:4040 URL while Spark is running on YARN, I get this message: WARN amfilter.AmIpFilter: Could not find proxy-user cookie, so user will not be set. Do you know how to disable it so that I can access master:4040 normally?
- Tags:
- Hadoop Core
- Spark
- YARN
05-09-2016
12:11 PM
Thanks for your answer, it really helped me understand the logic better. I just have one more doubt about your second answer. So the actions in this case are collect or show. But what about transformations: is select a transformation? And if the query is not just a "select * from customers" but includes operations such as group by, filter, or join, are those operations transformations that Spark applies to the DataFrame during query execution?
05-09-2016
12:02 AM
Hi, I'm studying how Spark interacts with Hive, in order to run queries over Hive tables with Spark SQL using HiveContext, but I have some doubts about the logic. From the Spark documentation, the basic code for this is:
// sc is an existing SparkContext.
var hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
var query = hiveContext.sql("select * from customers");
query.collect()
I have three main doubts. I read that Spark works with RDDs, and that Spark can then apply actions or transformations to those RDDs.
1) It seems that we can create an RDD by loading an external dataset, so where is the RDD created in the code above? Is it here: "query = hiveContext.sql("select * from customers");"? Is var query the RDD?
2) After the RDD is created we can apply transformations and actions, but when running queries over Hive tables we only use actions, right? There is no need for transformations? And the action here is collect()?
3) I also read that Spark computes RDDs lazily to save storage space. In this use case, with the code above, where and how does this lazy evaluation happen so that Spark can save storage space?
Can you help me understand this better?
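To make the lazy-evaluation idea in point 3 concrete, here is a rough analogy in plain Python. It is not Spark code and does not show how Spark is implemented; the generators stand in for transformations, list() stands in for an action, and every name (read_rows, adults, names, log) is hypothetical.

```python
# Plain-Python analogy for lazy transformations vs. eager actions.
# This is NOT Spark code; all names here are hypothetical.

log = []  # records when rows are actually touched


def read_rows():
    """Stand-in for a table scan; yields rows one at a time."""
    for row in [("alice", 30), ("bob", 17), ("carol", 45)]:
        log.append("read " + row[0])
        yield row


# "Transformations": building the pipeline does no work yet.
adults = (row for row in read_rows() if row[1] >= 18)
names = (name for name, _age in adults)
reads_before_action = len(log)  # nothing has been read so far

# "Action": materializing the result forces the whole pipeline to run.
result = list(names)
reads_after_action = len(log)  # every source row has now been read once

print(reads_before_action, result, reads_after_action)
```

The point of the sketch is only that defining the filter and projection costs nothing until a result is requested, and that rows then flow through one at a time instead of being stored at each intermediate step.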
05-08-2016
06:47 PM
Thanks, really. So that is what I'm trying to do, I guess. The code in your first answer is working now. But when I run a query in Hive just to test whether the data is in the table, like "select * from partsupp", it doesn't return any results and shows this error: "FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask". Do you have any idea what causes this?
05-08-2016
05:48 PM
Thanks for your help again. I'm doing this, and then I will compare with Hive on Tez to check the difference. But I didn't quite understand what you said. I'm still a beginner in big data, and I read that storing the tables in Hive is better because the queries are faster, since ORC is a compressed format and the data size is smaller. But you are saying that this is not the case, and that we should use ORC in Hadoop? I have a .tbl file, so should I convert that file to ORC before storing it in Hadoop?
05-08-2016
03:32 PM
Thanks for your answer. I'm using Hive 1.2.1. I read that the Parquet and ORC formats are faster because they are columnar, and I want to query this data from Spark later. So is it better to store the data in the table as text?
05-08-2016
01:28 PM
I'm trying to create a table in Hive with the ORC format and load it with data that I have in a ".tbl" file. In the ".tbl" files each row has this format:
1|Customer#000000001|IVhzIApeRb ot,c,E|15|711.56|BUILDING|to the even, regular platelets. regular, ironic epitaphs nag e|
I created a Hive table with the ORC format like this:
create table if not exists partsupp (PS_PARTKEY BIGINT, PS_SUPPKEY BIGINT, PS_AVAILQTY INT, PS_SUPPLYCOST DOUBLE, PS_COMMENT STRING) STORED AS ORC TBLPROPERTIES ("orc.compress"="SNAPPY")
Now I'm trying to load data into the table like this:
LOAD DATA LOCAL INPATH '/tables/partsupp/partsupp.tbl' [OVERWRITE] INTO TABLE partsupp;
My questions are: do you know if this is a correct way to do this? And if it is, do you know why this error happens when I run the LOAD DATA LOCAL INPATH command? Failed: Parse exception mismatched input '[' expecting into near '/tables/partsupp/partsupp.tbl' in load statement
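For reference, the pipe-delimited row format shown above can be split into fields with a few lines of Python. This is only a sketch of the file layout, not part of Hive or Spark; parse_tbl_line is a hypothetical helper, and it assumes the trailing "|" terminates the row instead of marking an empty last column.

```python
from typing import List


def parse_tbl_line(line: str) -> List[str]:
    """Split one pipe-delimited .tbl row into its fields.

    Assumption: the final "|" is a row terminator, so it is dropped
    before splitting and does not produce an empty trailing field.
    """
    line = line.rstrip("\n")
    if line.endswith("|"):
        line = line[:-1]
    return line.split("|")


# The sample row quoted in the post above.
sample = (
    "1|Customer#000000001|IVhzIApeRb ot,c,E|15|711.56|BUILDING|"
    "to the even, regular platelets. regular, ironic epitaphs nag e|"
)
fields = parse_tbl_line(sample)
print(len(fields), fields)
```

Seen this way, the file is plain delimited text with "|" as the field separator, which is the layout a text-format table definition would have to describe.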
05-06-2016
07:06 PM
Thanks for your help. Do you know what the DAG visualization, the diagram of the jobs shown after we execute a query, represents? Does that visualization show the physical or the logical plan?
05-04-2016
10:28 PM
Thanks for your answer, now I can see the plans. And what is the DAG Visualization, the diagram that appears in the Spark UI for each job? Is it the logical or the physical plan, or is it something else? And which diagram were you referring to in your first sentence?