Member since: 04-14-2015
Posts: 20
Kudos Received: 2
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 8708 | 11-20-2015 05:56 AM |
01-25-2017 08:39 AM
I'm trying to access an S3 bucket using the HDFS utilities, like below:

hdfs dfs -ls s3a://[BUCKET_NAME]/

but I'm getting this error:

-ls: Fatal internal error
com.cloudera.com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

On the gateway node where I'm running the command, I don't have an AWS instance profile attached, but I do have one attached on all datanodes and namenodes. Running this command from one of the datanodes or namenodes works successfully. Is there a way I can run this command using only instance profiles (no stored access keys or credentials), when those profiles are attached only to the datanodes and namenodes? The reason I'm asking is that I don't want to allow direct S3 access from the gateway node.
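For reference, this is roughly what does work today - invoking the command on a node that has the instance profile attached (the ssh wrapper and the hostname datanode1 are just illustrative placeholders):

# works, because the S3A client then runs on a node whose instance
# profile supplies the credentials; no keys are stored anywhere
ssh datanode1 "hdfs dfs -ls s3a://[BUCKET_NAME]/"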
Labels:
- HDFS
11-22-2016 08:14 AM
Sameerbabu - I'm having a similar issue. Did you ever figure it out?
05-06-2016 05:07 AM
Thanks - that worked!
05-02-2016 10:01 AM
I'm using the Python CM API to try to enable HDFS HA on my cluster:

hdfs_service.enable_nn_ha(active_name=hdfs_nn_host,
                          nameservice="nameservice1",
                          standby_host_id=api.get_host(hdfs_snn_host).hostId,
                          jns=journal_nodes,
                          zk_service_name=ZOOKEEPER_SERVICE_NAME,
                          force_init_znode=True,
                          clear_existing_standby_name_dirs=True,
                          clear_existing_jn_edits_dir=True).wait()

This command leads to the error:

cm_api.api_client.ApiException: Could not find NameNode with name 'host1'

where host1 is the name of the host running the NameNode service, as shown by Cloudera Manager. My question is about the active_name parameter to this function: what value is the CM API actually looking for? I've tried supplying the hostId value for this node as well, with no luck. My HDFS service is up and healthy, since I'm able to run all hdfs dfs commands.
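In case it helps, here is how I've been listing the NameNode role names through the same API, on the guess (and it is only a guess on my part) that active_name expects the NameNode role name as Cloudera Manager knows it, rather than the hostname. "cm-host", "cluster" and "hdfs" are placeholders for my CM host, cluster name and HDFS service name:

# list the NameNode roles CM knows about, plus the host each one runs on
from cm_api.api_client import ApiResource

api = ApiResource("cm-host", username="admin", password="admin")
hdfs_service = api.get_cluster("cluster").get_service("hdfs")

for role in hdfs_service.get_roles_by_type("NAMENODE"):
    print role.name, role.hostRef.hostId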
Labels:
- Cloudera Manager
- HDFS
01-05-2016 05:19 AM
1 Kudo
bulmanp - The private_key parameter should be the contents of the private key file (in your case, the 2nd option should have worked). Here is the working code I use:

f = open("/root/.ssh/id_rsa", "r")
id_rsa = f.read()
#print id_rsa
f.close()

# passwordless certificate login
apicommand = cm.host_install(user_name="root",
                             private_key=id_rsa,
                             host_names=hostIds,
                             cm_repo_url=cm_repo_url,
                             java_install_strategy="NONE",
                             unlimited_jce=True).wait()
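A minor stylistic variant, in case it's useful: the key can also be read with a with-block so the file handle is closed automatically; the host_install call itself is unchanged.

# equivalent read of the key contents; id_rsa still ends up as a plain string
with open("/root/.ssh/id_rsa", "r") as f:
    id_rsa = f.read()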
12-04-2015 12:04 PM
MJ - How do you create the replication schedule in the first place using the Java API?
11-20-2015 05:56 AM
I found out what I was doing wrong - I had a node on the source cluster with the Hive Gateway role installed, but it wasn't configured 100% correctly. For some reason, when the BDR jobs were launched they kept running on this node and failing immediately, so I wasn't getting any errors. The export metastore step of the Hive Replication job will run on a source cluster node that has either the HiveServer or Hive Gateway role installed.
11-12-2015 10:23 AM
1 Kudo
I'm having issues with running Hive replication jobs (these worked previously), but due to some unknown system/configuration changes these jobs are now aborting almost immediately in the "Export Remote Hive Metastore" phase. I've been hunting around on both the source and target clusters and I'm unable to find any trace of log files for this job. Does anyone know where I should be looking for this information?
So far I've looked in the following places (searching them roughly as sketched below):
- /var/log/hive on the local filesystems where the Hive Metastore and Hive Server are running
- /user/hdfs/.cm/hive on the target cluster
- /var/run/cloudera-scm-agent/process/* on all nodes in both clusters
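The search over the local paths is nothing fancy - just a recursive grep on each node; "replication" is only my guess at a useful search string:

# look for any local file mentioning replication in the locations above
grep -ril "replication" /var/log/hive /var/run/cloudera-scm-agent/process 2>/dev/null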
Labels:
- Apache Hive
- Cloudera Manager
11-04-2015 02:06 PM
It looks like I solved the issue - it seems the Python CM API has changed for the host_install command. It used to take a file name for the private key, and it now expects the key contents as a string variable.