Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3458 | 10-13-2017 09:42 PM
 | 6227 | 09-14-2017 11:15 AM
 | 3183 | 09-13-2017 10:35 PM
 | 5114 | 09-13-2017 10:25 PM
 | 5760 | 09-13-2017 10:05 PM
02-03-2017
12:30 PM
saranvisa is correct: you should set a minimum, and the maximum should not exceed a single node's memory limits, since a single container cannot run across nodes. There is still the mismatch between what is in the configs and what YARN is using and reporting. On the RM machine, get the process ID for the RM (sudo su yarn -c "jps"), then get the process info for that ID (ps -ef | grep <id>). Does that show it is using the configs from the path that you changed? It should be listed in -classpath.
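If it helps, here is a minimal sketch of that check; the yarn service user and the /etc/hadoop/conf example path are assumptions, so adjust for your layout:

```bash
# Find the ResourceManager PID as the yarn user (assumes the service runs as 'yarn').
RM_PID=$(sudo su yarn -c "jps" | awk '/ResourceManager/ {print $1}')

# Dump the full command line and eyeball the -classpath entry; it should point at
# the config directory you actually edited (for example /etc/hadoop/conf).
ps -ef | grep "$RM_PID" | grep -v grep
```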
01-30-2017
12:07 AM
I'm not terribly familiar with Oozie, but I believe the launcher is separate from the actual job. Also, from the log "-Xmx4096m -Xmx4608m", it is launching with a 4 GB container size and the heap is set to 3 GB. Is it set in the Oozie job settings?
01-29-2017
09:31 PM
It will work. It will diminish the network throughput and could impact cluster performance if the typical workload is network-I/O bound. In my experience, with predominantly 10 GbE networks, I have not been bound by the network running at the default MTU of 1500.
01-29-2017
09:16 PM
Track down container container_e29_1484466365663_87038_02_000001. It is most likely a reducer. I say that because you said both the Map and AM container sizes were set to 2 GB, so the Reduce container size must be 3 GB. Well, in theory the user launching it could have overridden any of them. What is the value of mapreduce.reduce.memory.mb? Let's try another route as well: in the RM UI, for the job in question, does it have any failed maps or reducers? If yes, drill down to the failed one and view the logs. If not, then the AM container OOM'd. From my recollection, though, that is the line the AM logs concerning one of the containers it is responsible for. Anyway, the short of it is: either the Reduce container size is 3 GB, or the user set their own value to 3 GB, as the values in the cluster configs are only the defaults.
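A hedged sketch of where to look; the config path below is a typical gateway location and is an assumption, and the job itself can override any of these values at submit time:

```bash
# Cluster-wide default for reduce containers (the job's own config wins if it sets a value).
grep -A1 'mapreduce.reduce.memory.mb' /etc/hadoop/conf/mapred-site.xml

# Pull the logs for the failed container to confirm which task type it actually was.
# On older Hadoop releases you may also need -nodeAddress <nm_host:port>.
yarn logs -applicationId application_1484466365663_87038 \
  -containerId container_e29_1484466365663_87038_02_000001
```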
01-29-2017
09:05 PM
Does this MR job access HBase at all? This error indicates that the region for trade_all was not accessible. Any errors on the HBase RegionServers? Check the HBase Master UI to see which RegionServer is serving this region and whether it is in the middle of a split.
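If you have shell access to an HBase gateway, something like the sketch below can answer the same questions from the command line; the table name is taken from your error, everything else is illustrative:

```bash
# Regions in transition and which RegionServer hosts each region.
echo "status 'detailed'" | hbase shell

# Locate the region(s) for the table in hbase:meta (row keys start with the table name).
echo "scan 'hbase:meta', {ROWPREFIXFILTER => 'trade_all,'}" | hbase shell
```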
01-28-2017
12:21 PM
I don't know of any way. Hadoop in general doesn't care how long a job takes; it is more concerned with auto-recovery of the platform so that jobs can finish no matter what. You can limit the number of queries or jobs by user or group, and you can limit the resources available to users or groups. I just don't think there is a way to automatically kill jobs or queries running longer than X. I know other products, like Pepperdata, can track this and alert you, but it still requires manual intervention. Can we step back so you can explain what your issue is with long-running jobs? Maybe the root cause can be addressed there, so jobs do not run for so long or hold back others.
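There is no built-in timeout, but as a rough external workaround you could cron a script along these lines; the 12-hour cutoff is only an example, and it stays in dry-run mode until you swap the echo for the kill:

```bash
#!/bin/bash
# Report (or kill) YARN applications running longer than MAX_AGE_MS.
MAX_AGE_MS=$((12 * 60 * 60 * 1000))
NOW_MS=$(($(date +%s) * 1000))

yarn application -list -appStates RUNNING 2>/dev/null |
  awk '$1 ~ /^application_/ {print $1}' |
  while read -r APP; do
    START_MS=$(yarn application -status "$APP" 2>/dev/null |
               awk -F' : ' '/Start-Time/ {print $2}')
    if [ -n "$START_MS" ] && [ $((NOW_MS - START_MS)) -gt "$MAX_AGE_MS" ]; then
      echo "Would kill $APP (running longer than 12h)"
      # yarn application -kill "$APP"
    fi
  done
```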
01-28-2017
12:16 PM
Can you post the container logs for one of the containers that was killed? In the RM UI, drill down through the job until you get the list of Mappers/Reducers that succeeded or failed. Click through to a failed task and then open the logs. You should find an exception in them giving the reason. The code mentioned usually does indicate a heap issue, but I have seen it reported for other reasons a container was killed, such as when preemption strikes.
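As an alternative to clicking through the UI, a sketch like the one below pulls the aggregated logs and greps for the usual kill reasons; the application ID is a placeholder and the patterns are not exhaustive:

```bash
# Requires log aggregation to be enabled; substitute your real application ID.
yarn logs -applicationId <application_id> 2>/dev/null |
  grep -iE 'Killing container|beyond physical memory|beyond virtual memory|preempt|Exit code'
```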
01-26-2017
09:29 PM
1 Kudo
1. Yes, it could. I personally don't like the threshold; it is not a great indicator of there being a small-file issue.
2. The number reported by the DN is for all the replicas. It could mean a lot of small files or just a lot of data. At the defaults it could mean that the DN heap could use a boost, although I always end up bumping it sooner.
3. Yes.
4. Yes. Each file takes up one or more blocks, and the NN has to track each block and its replicas in memory, so a lot of small files can chew through the NN heap quickly. The DN heap is less concerned with the metadata associated with a block; it is more tied to the blocks being read, written, or replicated.
5. I'd worry less about the block count and more about the heap (see the sketch below).
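A quick way to see the file and block counts the NN heap actually has to carry; the unsecured web UI on port 50070 is an assumption (9870 on Hadoop 3, and Kerberos/SSL clusters need a different curl invocation):

```bash
# Total files and blocks tracked by the NameNode, straight from its JMX endpoint.
curl -s 'http://<namenode_host>:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState' |
  grep -E '"(FilesTotal|BlocksTotal)"'

# Per-directory dir/file/byte counts to hunt for small-file hot spots.
hdfs dfs -count /user/*
```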
01-26-2017
09:16 PM
The command will use the instance profile of the host it is launched from. So if you want access for some users but not all, you need to specify the keys in the S3 URI.
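A hedged sketch with the s3a connector: pass the keys per command instead of relying on the instance profile. The property names assume s3a; embedding the keys directly in the URI also works on older releases but leaks them into logs and job configs, so the -D form is usually safer:

```bash
# Placeholders throughout; the -D options must come before the subcommand.
hadoop fs -D fs.s3a.access.key=<ACCESS_KEY> \
          -D fs.s3a.secret.key=<SECRET_KEY> \
          -ls s3a://<bucket>/<path>
```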
01-19-2017
04:55 PM
Yes, check there. I don't know the Hive source code, but I do know that HDFS still does a username/group lookup against the OS.
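A quick way to compare the mapping HDFS actually resolves against what the OS reports; the username is a placeholder:

```bash
# Group mapping as resolved by the NameNode for this user.
hdfs groups <username>

# Group membership as the local OS sees it.
id <username>
```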