Member since: 06-26-2013
Posts: 416
Kudos Received: 104
Solutions: 49

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 7742 | 03-23-2016 08:06 AM |
|  | 13827 | 10-12-2015 01:56 PM |
|  | 4925 | 03-05-2015 11:11 AM |
|  | 6148 | 02-19-2015 02:41 PM |
|  | 13472 | 01-26-2015 09:55 AM |
05-05-2014 11:37 AM
Thanks for reporting the solution back to this thread, Murthy. Glad it's resolved!
04-16-2014 11:51 AM
Finally resolved the issue. Due to a space crunch on the root mount, we had created a soft link so that /opt/cloudera/ points to /data/opt/cloudera/. As a result, the local repo path changed as well, so while installing packages CM kept trying to download them again and again but could not find them on disk, even though it had distributed them correctly. Once I changed the local repo directory in CM Administration (parcel-repo path), everything started working as required. Thanks everyone for your support and suggestions. Priyabrata Patnaik
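For anyone hitting the same thing, a minimal shell sketch of the workaround described above; the paths come from this post, and the exact Cloudera Manager setting name may vary by version:

```bash
# Assumption: /opt is short on space, so the Cloudera directory lives on /data instead.
mkdir -p /data/opt
mv /opt/cloudera /data/opt/cloudera        # relocate the existing contents first
ln -s /data/opt/cloudera /opt/cloudera     # /opt/cloudera now points at the new location

# Then update the local parcel repository path in Cloudera Manager's Administration
# settings so CM looks for parcels under the relocated directory instead of
# re-downloading them on every install attempt.
```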
04-07-2014 08:26 AM
So one more question I had: is it purely a non-functional performance consideration based on workloads? Is it ever a concern that any of the software components in the Cloudera stack would actually cause job failures (or, even worse, successful completions that produce a corrupt dataset) through mixing, say, bonded 1GE and 10GE racks of servers? We're running HBase, MapReduce and very light Impala on our cluster of over 60 nodes, and we're thinking of moving to 10GE for nodes 60-100. But we're not sure if we should also upgrade the existing 60 nodes. We'll do some investigation now to determine whether our jobs are network bound, but there doesn't seem to be an easy way of measuring this other than through the Chart views, looking at total bytes received on all interfaces over time on each node. Any other suggestions? Would anyone recommend that, in order to move to 10GE networking, all potential components of the solution MUST be upgraded? Or is it purely a call to be made based on the performance attributes of the jobs running?
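As a rough alternative to the Chart views, a minimal sketch of sampling per-node interface throughput from the shell; it assumes the sysstat package is installed and that the bonded interface is named bond0 (both assumptions):

```bash
# Sample receive/transmit throughput on bond0 every 5 seconds for one minute;
# sustained rates near the link's capacity suggest the jobs are network bound.
sar -n DEV 5 12 | awk '$2 == "bond0" { print $1, "rx:", $5, "kB/s", "tx:", $6, "kB/s" }'
```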
04-02-2014 10:05 AM
Cloudera Manager is proprietary software; its source code is not available. Sorry!
04-01-2014 02:59 AM
Hi, just to follow up on this, I have now solved the problem. There were two things that I needed to do:

1. In addition to adding oozie.libpath to my job.properties, I also needed to include oozie.use.system.libpath=true.
2. Before, I was using the following to add files to the DistributedCache:

```java
// List everything under /application/lib and add each file to the job classpath.
FileStatus[] status = fs.listStatus(new Path("/application/lib"));
if (status != null) {
    for (int i = 0; i < status.length; ++i) {
        if (!status[i].isDir()) {
            DistributedCache.addFileToClassPath(status[i].getPath(), job.getConfiguration(), fs);
        }
    }
}
```

This appeared to be causing a classpath issue because it was adding hdfs://hostname before the HDFS path. Now I am using the following to strip that prefix and add only the absolute HDFS path:

```java
FileStatus[] status = fs.listStatus(new Path("/application/lib"));
if (status != null) {
    for (int i = 0; i < status.length; ++i) {
        if (!status[i].isDir()) {
            // Drop the hdfs://hostname scheme and authority, keeping only the path component.
            Path distCachePath = new Path(status[i].getPath().toUri().getPath());
            DistributedCache.addFileToClassPath(distCachePath, job.getConfiguration(), fs);
        }
    }
}
```

Thank you to those who replied to my original query for pointing me in the right direction. Andrew
03-31-2014 01:14 AM
I have set "yarn.nodemanager.delete.debug-delay-sec" to 6000, and the container log dirs are configured as:

```xml
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/hadoop/hadoop-2.0.0-cdh4.5.0/yarn/containers</value>
</property>
<property>
  <description>Where to aggregate logs</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/log/hadoop-yarn/app</value>
</property>
```

The directory /hadoop/hadoop-2.0.0-cdh4.5.0/yarn/containers has nothing in it after running the task; I found the configuration in yarn-site.xml never takes effect.
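One way to narrow this down is to check whether the running NodeManager actually loaded the yarn-site.xml that was edited; a quick shell sketch, where the /etc/hadoop/conf path is only an assumption (a CM-managed NodeManager reads its config from a per-process directory instead):

```bash
# Find the NodeManager process and inspect the config/classpath entries on its command line.
NM_PID=$(pgrep -f 'org.apache.hadoop.yarn.server.nodemanager.NodeManager' | head -n 1)
tr '\0' '\n' < /proc/"$NM_PID"/cmdline | grep -i -E 'conf|yarn'

# Then confirm the delay value in the config directory the process is really using
# (replace the path below with whatever the command line above points at).
grep -A 1 'yarn.nodemanager.delete.debug-delay-sec' /etc/hadoop/conf/yarn-site.xml
```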
03-30-2014 05:36 PM
For the benefit of others that may encounter this, the root cause of this problem was eventually identified. The problem was caused by the SSH-client-launched remote command running under a much older version of "bash", a version that had the (temporary) problem of not exporting the SSH_CLIENT variable to the environment.

How can this happen and be obscure? It turns out that when CM executes "ssh 'bash -c ...'", the remote SSH server relies on a static search PATH to locate "bash", which may be different from the PATH you pick up in interactive shells. To check if you have this (unlikely) problem, run this from a machine remote from the target machine:

```
$ ssh you@yourmachine.com 'which bash'
/usr/local/bin/bash
$ ssh you@yourmachine.com 'bash --version'
GNU bash, version 2.05.8(1)-release (i386-redhat-linux-gnu)
$ ssh you@yourmachine.com 'env | grep SSH_CLIENT'
SSH_CLIENT=10.1.2.3 56617 22
$ ssh you@yourmachine.com 'bash -c "env | grep SSH_CLIENT"'
(nothing)
```

Note the really old version of bash reported here for me, and the non-standard path. When "bash" is then explicitly invoked and SSH_CLIENT is checked, it is missing. You can compare this to the results from an interactive shell session. The version of bash above, and some other versions from around the same time, do not correctly export SSH_CLIENT. The fix is to eliminate the bad version of bash from the target machine. Brett
03-27-2014 07:27 AM
No problem at all, welcome to our community!
03-26-2014 03:52 PM
Hey Clint, thanks man. I did create the solr user and the installation worked. I am trying to figure out why it did not create the user.
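For reference, a minimal sketch of creating the service account by hand when the package does not; the home directory and shell below are illustrative assumptions, not necessarily what the Cloudera packages use:

```bash
# Create a system group and a non-login system user for Solr (illustrative values).
sudo groupadd -r solr
sudo useradd -r -g solr -d /var/lib/solr -s /sbin/nologin -c "Solr service account" solr
```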
03-19-2014 08:54 AM
1 Kudo
It was a problem with specifying the TNS name. Got that resolved. Thank you all.