Community Articles

Expert Contributor

On HDP 2.4, some services may have corrupted jar and tar.gz files on HDFS. The specific files I have seen broken are as follows:

  • hive.tar.gz
  • mapreduce.tar.gz
  • hadoop-streaming.jar
  • pig.tar.gz
  • spark-hdp-assembly.jar
  • sqoop.tar.gz
  • tez.tar.gz

All of these are found in the /hdp/apps/<hdp-version> directory. On my install, they all had zero size (reported as 0.1 kB on HDFS File View). This led to errors in a variety of services, including the following:

  • gzip: /foo/bar/yarn/local/filecache/11_tmp/tmp_mapreduce.tar.gz: unexpected end of file
  • tar: This does not look like a tar archive
  • Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
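To see which files are affected before changing anything, the recursive listing can be filtered for zero-length entries. The following is only a sketch: the function name `list_empty_files` is made up for illustration, and it assumes the corrupt files show a size of 0 in the usual eight-column `hdfs dfs -ls -R` output (permissions, replication, owner, group, size, date, time, path).

```shell
# Hypothetical helper: reads `hdfs dfs -ls -R` output on stdin and prints
# the path of every regular file whose size column is 0.
# Assumes the eight-column listing format and paths without spaces.
list_empty_files() {
  # Skip directories (permissions start with "d") and summary lines
  # such as "Found N items" (fewer than 8 fields).
  awk '$1 !~ /^d/ && NF >= 8 && $5 == 0 { print $8 }'
}

# Example invocation (replace <hdp-version> with your own version string):
#   hdfs dfs -ls -R "/hdp/apps/<hdp-version>" | list_empty_files
```

Any path it prints is a candidate for the fix described next; an empty result suggests the problem lies elsewhere.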

There may be other errors, but those are the ones I personally experienced. This is fairly easy to fix. Each corrupt file has a healthy version on the local file system. The healthy version must be copied from the local system to HDFS, replacing the corrupt version. For example, to update Tez, perform the following:

$ hdfs dfs -rm /hdp/apps/<hdp-version>/tez/*

$ hdfs dfs -put /usr/hdp/current/tez-client/lib/tez.tar.gz /hdp/apps/<hdp-version>/tez/

$ hdfs dfs -chmod 444 /hdp/apps/<hdp-version>/tez/tez.tar.gz

Problems caused by a corrupt Tez tarball should now be fixed.
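Since the same three steps apply to each broken file, they can be wrapped in a small function. This is a sketch rather than a tested procedure: `replace_hdfs_artifact` is a hypothetical name, the caller supplies both the local path and the HDFS directory, and unlike the commands above it removes only the one target file (with `-rm -f`) instead of everything in the directory. It also refuses to upload a local file that is itself missing or empty.

```shell
# Hypothetical helper: replace one HDFS file with its healthy local copy.
#   $1 = local file (for Tez: /usr/hdp/current/tez-client/lib/tez.tar.gz)
#   $2 = HDFS target directory (for Tez: /hdp/apps/<hdp-version>/tez)
replace_hdfs_artifact() {
  local src="$1" dest_dir="$2"
  local name
  name="$(basename "$src")"

  # Do not replace a broken file with another broken file.
  if [ ! -s "$src" ]; then
    echo "local file missing or empty: $src" >&2
    return 1
  fi

  # Same sequence as the manual fix: remove, upload, make read-only.
  hdfs dfs -rm -f "$dest_dir/$name" || return 1
  hdfs dfs -put "$src" "$dest_dir/" || return 1
  hdfs dfs -chmod 444 "$dest_dir/$name"
}

# Example invocation (replace <hdp-version> with your own version string):
#   replace_hdfs_artifact /usr/hdp/current/tez-client/lib/tez.tar.gz "/hdp/apps/<hdp-version>/tez"
```

Only the Tez paths come from the steps above; the local locations of the other archives are not listed here and need to be looked up on your own install.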

Comments
Contributor

I am having the same issue with broken links. Any idea why the links keep breaking? It's happening every week.