Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1969 | 07-09-2019 12:53 AM |
| | 11878 | 06-23-2019 08:37 PM |
| | 9141 | 06-18-2019 11:28 PM |
| | 10127 | 05-23-2019 08:46 PM |
| | 4577 | 05-20-2019 01:14 AM |
07-06-2015
10:15 PM
2 Kudos
Thank you for the additional details!

> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

This indicates a problem at the driver end, or as you say, 'during the execution of the job controller'. The issue is that even if you add the jar to the MR distributed cache classpath, your driver/main class also references the same class, and adding a jar to the distributed tasks' classpath does not also add it to the local one. Here's how you can ensure both, if you use 'hadoop jar' to execute your job:

~> export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec.jar
~> hadoop jar your-app.jar your.main.Class [arguments]

This adds the jar to your local JVM classpath as well, while your code continues to add it onto the remote execution classpaths.

> Optimally, I shouldn't have to stuff this one in the distributed cache since it sits in /opt/cloudera/parcels/CDH-5.3.5-1.cdh5.3.5.p0.4/jars/hive-exec-0.13.1-cdh5.3.5.jar on all of my slave nodes, but I also can't figure out how to tell MapReduce to look there.

The MR remote execution classpath is governed by the classpath entries defined in mapred-site.xml and yarn-site.xml, plus the additional elements you add to the DistributedCache. It does not include the entire /opt/cloudera/parcels/CDH/jars/* path - this is deliberate, for isolation and flexibility, as that area may carry multiple versions of the same dependencies. Does this help?
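As a side note, if your main class parses generic options (for example via ToolRunner), the -libjars option is another way to ship the same jar onto the remote task classpaths at submit time. A rough sketch, assuming the parcel path from above and that your driver supports generic options:

~> export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec.jar
~> hadoop jar your-app.jar your.main.Class -libjars /opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec.jar [arguments]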
07-01-2015
12:28 AM
What form of HDFS path are you configuring in your Flume agent configs? For HA, you must use the HA service name, such as hdfs://nameservice1/user/foo instead of hdfs://namenode-host:8020/user/foo. This will protect your agents from failures during HA failovers.
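For illustration, a minimal HDFS sink definition using the HA service name would look something like the below (the agent/sink/channel names and the path are placeholders - substitute your own). This also assumes the HDFS client configuration that describes nameservice1 is available on the Flume host, e.g. via an HDFS Gateway role:

agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://nameservice1/user/foo/events
agent1.sinks.sink1.channel = channel1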
06-25-2015
01:31 PM
You can use the CM -> HBase -> Configuration -> RegionServer Safety Valve (for hbase-site.xml) to make the HFile v3 property change, since there's no direct UI field for it. CM keeps client configs separate from server configs, so that server-specific items can be isolated and configured independently. This is explained in more detail in the architecture docs at http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_intro_primer.html
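For reference, the snippet to paste into that safety valve would be along these lines (hfile.format.version is the upstream HBase property that controls the HFile write version - double-check the exact name and value against the HBase docs for your CDH release before applying):

<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>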
06-25-2015
01:24 PM
2 Kudos
Yes, that'd be a good idea. Glad to hear it worked! Feel free to also mark the discussion as solved so others looking at similar issues may find this thread faster.
06-25-2015
07:14 AM
Did you follow the guide at http://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#_visibility_labels? What error specifically do you get when trying to use the feature? Also, if you changed the HFile version, make sure to run a major compaction on all tables so the existing data migrates to the new format.
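For example, a major compaction can be triggered per table from the HBase shell ('your_table' below is a placeholder):

~> echo "major_compact 'your_table'" | hbase shell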
06-24-2015
03:46 PM
Thanks for closing the loop! We do not activate v3 HFiles by default in CDH5.4, to avoid breaking compatibility or adding extra work for users upgrading from an earlier CDH5 release: https://github.com/cloudera/hbase/commit/c9eb03bbf2c54b8e502feef89a59484bad987ff8
06-24-2015
02:39 PM
What is the full stack trace? That'd be necessary to tell where the failure point lies. If it fails at the driver/client end, you will likely also need to add the jar to the HADOOP_CLASSPATH env-var before the command invocation. If it fails at the MR task end, you'll need to make sure your distributed-cache configuration works (by checking the submitted job's configuration XML to see whether your jar appears in it).
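As a quick illustration of both checks (all jar names and paths below are placeholders, and the .staging location may differ on your cluster - you can equally view the job's configuration page in the JobHistory UI):

~> export HADOOP_CLASSPATH=/path/to/your-dependency.jar
~> hadoop jar your-app.jar your.main.Class [arguments]
~> hdfs dfs -cat /user/$USER/.staging/<job_id>/job.xml | grep -i your-dependency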
06-24-2015
10:27 AM
Glad to hear - thanks for closing the loop!
06-24-2015
10:27 AM
CM currently lacks support for defining storage types. If you'd like to use this feature at the moment, place your XML override in the "DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml" instead, which accepts <property/> tags.
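For example, an override tagging each DataNode directory with its storage type could look like the below (the mount points are placeholders - use your actual data directories):

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]/data/1/dfs/dn,[SSD]/data/ssd1/dfs/dn</value>
</property>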
06-24-2015
10:25 AM
3 Kudos
You need to raise the client heap size. For a one-off change, you can do the below:

~> export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx5g"
~> hadoop fs -copyToLocal /user/docsearch/data/DiscardedAttachments /opt/

For a more permanent change, locate the Gateway Client Java Heap configs in the relevant service (HDFS, YARN or Hive) in CM, raise the value, and redeploy cluster-wide configs [1].

[1] - https://www.youtube.com/watch?v=4S9H3wftM_0