Member since: 01-23-2017
Posts: 114
Kudos Received: 19
Solutions: 4

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2216 | 03-26-2018 04:53 AM |
| | 27641 | 12-01-2017 07:15 AM |
| | 913 | 11-28-2016 11:30 AM |
| | 1580 | 10-25-2016 11:26 AM |
07-31-2017
01:52 PM
@Rakesh Enjala We were hitting a similar issue, where all of our blocks in HDFS were showing up as Under Replicated Blocks (see attached hdfs-under-replicated-blocks.png). The default value for ipc.maximum.data.length is 67108864 bytes (64 MB), per https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml. In our case the requested data length was about 100 MB, so to avoid the issue we increased the value to 128 MB and got the cluster back to normal. Before that, though, we ran some experiments 🙂 that caused unexpected behavior in our cluster, including data loss. This happened because:
1) We thought that running hdfs fsck / -delete would delete only the under-replicated blocks. It did, but in our case we lost some data as well: due to the ipc.maximum.data.length issue the NameNode didn't have the actual metadata, so we lost the blocks (data) while the files still existed with 0 bytes.
2) One design issue in our cluster was having only a single mount point (72 TB) for the DataNodes. That was a big mistake; it should have been split into at least 6 mounts of 12 TB each.
3) Never run hdfs fsck / -delete when you see "Requested data length 97568122 is longer than maximum configured RPC length 67108864" in the NameNode logs.
Hope this helps someone.
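For anyone who wants the exact change, this is roughly what the property looks like in core-site.xml; 134217728 bytes is the 128 MB value we ended up using, so size it to comfortably exceed the requested data length reported in your NameNode logs:

```xml
<!-- core-site.xml: maximum RPC payload the NameNode will accept.
     Default is 67108864 (64 MB); 134217728 = 128 MB.
     Restart the NameNode after changing it. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
```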
07-28-2017
08:19 AM
We were in the same scenario, where Zeppelin always launched 3 containers in YARN even with the dynamic allocation parameters enabled in Spark, because Zeppelin was not picking those parameters up.
To get Zeppelin to launch more than the 3 containers it launches by default, configure the following in the Zeppelin Spark interpreter:
spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.initialExecutors=0
spark.dynamicAllocation.minExecutors=2 --> start with a low value; otherwise it launches the specified minimum number of containers but only uses the ones it needs, and the remaining memory and vcores are marked as reserved, which causes memory issues
spark.dynamicAllocation.maxExecutors=10
It is always good to start with less executor memory (e.g. 10-15 GB) and more executors (20-30). In our scenario we observed that with high executor memory (50-100 GB) and few executors (5-10), the query took 3 min 48 s (228 s), which is expected since the parallelism is very low; reducing the executor memory (10-15 GB) and increasing the executors (25-30), the same query took only 54 s. Please note that the number of executors and the executor memory are use-case dependent; we ran a few trials before reaching the optimal performance for our scenario.
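After restarting the interpreter, one way to confirm that executors really scale up and down (assuming shell access to a YARN client node) is to watch the running application:

```sh
# List running YARN applications; the Zeppelin Spark app's allocated
# containers and memory should grow and shrink with the dynamic
# allocation settings above.
yarn application -list -appStates RUNNING
```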
07-20-2017
01:18 PM
@John Cod As mentioned above, the Hive metastore holds the metadata (columns, data types, compression, input and output formats, and much more, including the HDFS locations of the table and the database). With this information, any tool or service that connects to Hive will invoke a NameNode call to get the filesystem metadata (the files, directories, and the corresponding blocks, etc.), which is needed for the jobs that Hive launches.
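A quick way to see what the metastore actually stores for a table is DESCRIBE FORMATTED; the database and table names below are hypothetical:

```sh
# Prints the metastore details mentioned above: columns, data types,
# HDFS Location, InputFormat/OutputFormat, and so on.
hive -e 'DESCRIBE FORMATTED my_db.my_table;'
```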
01-12-2017
11:15 AM
@ilhami Kalkan How much data do you have under these policies? Most of the time this occurs due to data load.
01-09-2017
11:05 AM
1 Kudo
@Sridevi Kaup It is 2.4.2.
11-28-2016
11:30 AM
1 Kudo
@Nikolay Kanov You can restart the Ambari Agents, and the Sqoop jobs won't see any interference, as they have nothing to do with Ambari Agent restarts.
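A minimal sketch, assuming you restart the agents host by host:

```sh
# Run on each cluster host. Running Sqoop jobs execute as YARN/MapReduce
# applications, so an agent restart does not touch them.
ambari-agent restart
```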
11-11-2016
10:12 AM
@Imtiaz Yousaf Please set the properties given in http://stackoverflow.com/questions/24390227/hadoop-jobs-fail-when-submitted-by-users-other-than-yarn-mrv2-or-mapred-mrv1. Hope this helps. Please do let me know if you still have the issue after making the changes. Thanks, Venkat
11-01-2016
03:17 AM
Does the issue still exist? If yes, can you please share the Ambari logs?
10-25-2016
11:26 AM
Can you please stop and start ambari-server and the ambari-agent on the host where Kafka is running? Also, from the Ambari DB, check the hostcomponentstate table and look for the status of the KAFKA service. If it is still found in this table, start the ambari-agent and ambari-server and run the delete command once again; that should resolve the issue. If not, please let me know the error you are getting.
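A minimal sketch of the DB check, assuming the default PostgreSQL-backed Ambari database (user and database both named ambari; adjust for MySQL, and note that column names can vary between Ambari versions):

```sh
# Check whether any KAFKA components are still registered in Ambari's DB.
psql -U ambari -d ambari -c \
  "SELECT component_name, current_state
     FROM hostcomponentstate
    WHERE service_name = 'KAFKA';"
```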
10-25-2016
09:29 AM
@Laurent lau Did you also update your Ambari Agents to the same version as the Ambari Server? If not, please do that. If yes, stop the Ambari Server and Ambari Agents and then start them again, and also make sure that the hostnames and the corresponding IP addresses didn't change. Thanks, Venkat
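A minimal sketch of that restart sequence (server host first, then every agent host), plus a quick hostname check:

```sh
# On the Ambari server host:
ambari-server restart

# On every cluster host: restart the agent, then confirm that the FQDN
# Ambari registered still resolves as expected.
ambari-agent restart
hostname -f
```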
... View more