Configure hadoop-client tools to access HDFS from an external computer
Labels: HDFS
Created 06-16-2017 12:33 PM
I would like to be able to perform HDFS commands from a computer that is NOT actually part of the Cloudera cluster. For example, performing simple put/get operations or:
hdfs dfs -ls /my/dir
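e.g. something like this (the file names and paths are just examples):
hdfs dfs -put mydata.csv /my/dir/
hdfs dfs -get /my/dir/results.csv .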
I have installed the correct binaries, I think. I found from CM that I was using CDH 4.7.1, so I installed the binaries (sudo apt-get install hadoop-client) from here.
If I run:
hdfs dfs -ls /
I get:
Error: JAVA_HOME is not set and could not be found.
I suspect this is just the beginning of a long tinkering-and-configuring process, and I unfortunately know nothing about Java. I do, however, know the IPs of the NameNodes on my cluster and have full admin rights from end to end.
Can someone help me get things configured?
P.S. In case it wasn't clear: I can perform all of the desired functionality on nodes that are part of the cluster. I just want to do something similar from my development environment.
Created 06-16-2017 02:42 PM
I should refine my question. Part of what prompted this is that I noticed that $JAVA_HOME is not defined on the cluster nodes either.
That makes me think there might be certain configuration files on the cluster nodes that I could simply copy over to my development environment.
Created 06-16-2017 04:24 PM
It is part of the magic that happens when you install a gateway role on a node using CM. It installs the binaries, downloads the client configs, and sets env vars in such a way that no user on that node needs to do it; it just works. I have not dug into it to determine how, but I expect it is wrapped up in scripts somewhere that get called when using the clients.
Without it, you will need to do it all yourself. And JAVA_HOME isn't the only one: there are also HADOOP_HOME, HADOOP_CONF_DIR, etc.
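As a rough sketch, this is the kind of thing a gateway sets up for you (all paths here are guesses; point them at wherever your JDK and client configs actually live):
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64   # your actual JDK path
export HADOOP_HOME=/usr/lib/hadoop                   # default location for CDH packages
export HADOOP_CONF_DIR=/etc/hadoop/conf              # directory holding the client *-site.xml files
export PATH="$HADOOP_HOME/bin:$PATH"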
The Hadoop docs should have some information on it.
Or... you can add the machine in question to CM, add the gateway role, and then remove the machine from the cluster. You shouldn't need to uninstall the gateway role when removing it from the cluster. No license is required for gateway roles either.
Created 06-16-2017 04:28 PM
Thanks a bunch! That's kind of what I feared. Glad I didn't go down too much of a rabbit hole yet.
For what it's worth, the process of getting `impala-shell` working was really easy. I just added the Cloudera repository, installed it with apt-get, and then put the IP address of an Impala node into the commands.
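Something like this, for the record (the hostname is made up; 21000 is the default impalad port):
sudo apt-get install impala-shell
impala-shell -i impala-node.example.com:21000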
I was hoping to get something similar for hdfs. Oh well. I'll look into what you said.
Created 06-26-2017 02:22 AM
1. set up the cdh 5 repo
2. installed hadoop-client with my package manager
3. updated the configs manually (scp or cm api)
4. ???
5. profit
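On an Ubuntu box, steps 1-3 look roughly like this (the repo URL, CM host, credentials, and cluster/service names are all guesses; adapt them to your distro, CDH version, and CM setup):
# 1-2. repo + client binaries
sudo wget http://archive.cloudera.com/cdh5/ubuntu/trusty/amd64/cdh/cloudera.list -O /etc/apt/sources.list.d/cloudera.list
sudo apt-get update
sudo apt-get install hadoop-client
# 3. one way to fetch the configs: the CM API serves them as a zip
curl -u admin:admin "http://cm-host:7180/api/v6/clusters/Cluster1/services/hdfs1/clientConfig" -o clientconfig.zip
unzip clientconfig.zip   # then copy the *-site.xml files into /etc/hadoop/conf/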
Created 06-26-2017 07:14 AM
@mbigelow this is really exciting. Thanks for following up on this thread.
I am way back on CM 4.8.5 and CDH 4. Nevertheless, I downloaded and installed the repos similar to what you mention in step two.
Step three is a little foggy for me. Can you elaborate on "updated the configs manually"? Specifically, which configs should I copy over?
Thanks again. This would be a huge win for me if this works.
Created 06-26-2017 06:28 PM
Created 06-27-2017 07:26 AM
Sorry. I'm really, really new to Hadoop and Cloudera, and I actually have no idea where any of the config files are on the cluster nodes.
Can you give me the paths to the config files, or tell me where I can find them? Or maybe you could tell me the filenames and I'll search with `find`. Whatever is easiest...
Created 06-27-2017 07:43 AM
Created 06-27-2017 09:55 PM
Is your cluster managed by Cloudera Manager? If not, it is pretty straightforward: you can grab the configs from
/etc/hadoop/conf/
Grab core-site.xml, hdfs-site.xml, and hive-site.xml.
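For example (the hostname is made up; any cluster node that has the client configs will do):
scp user@cluster-node:/etc/hadoop/conf/core-site.xml /etc/hadoop/conf/
scp user@cluster-node:/etc/hadoop/conf/hdfs-site.xml /etc/hadoop/conf/
After that, hdfs dfs -ls / should talk to the cluster's NameNode.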
