Member since: 04-14-2015
Posts: 20
Kudos Received: 2
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4836 | 11-20-2015 05:56 AM |
01-25-2017
08:39 AM
I'm trying to access an S3 bucket using the HDFS utilities, like below:

hdfs dfs -ls s3a://[BUCKET_NAME]/

but I'm getting this error:

-ls: Fatal internal error
com.cloudera.com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

On the gateway node where I'm running the command, I don't have an AWS instance profile attached, but I do have one attached on all datanodes and namenodes. Running this command from one of the datanodes or namenodes works successfully. Is there a way I can run this command using only the instance profiles (no stored access keys or credentials) that exist on the datanodes and namenodes? The reason I'm doing this is that I don't want to allow direct S3 access from the gateway node.
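For reference, a sketch of the kind of experiment I've been running from the nodes that do have an instance profile attached. The fs.s3a.aws.credentials.provider property and the provider class name are my assumptions (CDH bundles a shaded AWS SDK, so the class may live under the shaded com.cloudera.com.amazonaws package, and older CDH releases may not support the property at all):

import subprocess

# Sketch of an experiment, not a confirmed fix: pin s3a to the EC2
# instance-profile credential provider so no stored keys are needed.
# Both the property and the class name are assumptions; verify them
# against the hadoop-aws version shipped with your CDH release.
subprocess.check_call([
    "hdfs", "dfs",
    "-D", "fs.s3a.aws.credentials.provider="
          "com.amazonaws.auth.InstanceProfileCredentialsProvider",
    "-ls", "s3a://BUCKET_NAME/",
])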
11-22-2016
08:14 AM
Sameerbabu - I'm having a similar issue; did you ever figure it out?
05-06-2016
05:07 AM
Thanks - that worked!
05-02-2016
10:01 AM
I'm using the python CM API to try to enable HDFS HA on my cluster:

hdfs_service.enable_nn_ha(
    active_name=hdfs_nn_host,
    nameservice="nameservice1",
    standby_host_id=api.get_host(hdfs_snn_host).hostId,
    jns=journal_nodes,
    zk_service_name=ZOOKEEPER_SERVICE_NAME,
    force_init_znode=True,
    clear_existing_standby_name_dirs=True,
    clear_existing_jn_edits_dir=True,
).wait()

This command leads to the error:

cm_api.api_client.ApiException: Could not find NameNode with name 'host1'

where host1 is the name of the host running the NameNode service, as shown by Cloudera Manager. My question is about the active_name parameter to this function: what actual value is the CM API looking for? I've tried supplying the hostId value for this node as well, with no luck. My HDFS service is up and running healthy, as I am able to run all hdfs dfs commands.
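In case it's useful, here is roughly how I'd expect to look the value up via the API. My guess is that active_name wants the NameNode role name that CM generated (something like "hdfs-NAMENODE-..."), not the hostname; the method and attribute names below are from my reading of the cm_api docs, so treat this as a sketch:

# Guesswork sketch: find the NAMENODE role running on the target host
# and use its CM-generated role name rather than the hostname.
target_host_id = api.get_host(hdfs_nn_host).hostId

active_nn_name = None
for role in hdfs_service.get_all_roles():
    if role.type == "NAMENODE" and role.hostRef.hostId == target_host_id:
        active_nn_name = role.name  # the CM role name, not the hostname
        break

print(active_nn_name)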
04-20-2016
11:31 AM
Hue version - 3.9
Oozie - 4.1.0

Hue.ini liboozie section (host omitted):

[liboozie]
remote_data_dir=/user/hue/oozie/workspaces
oozie_url=http://HOST:11000/oozie
security_enabled=true

I'm seeing the following stack traces in the Hue and Oozie logs:

[20/Apr/2016 14:05:36 -0400] kerberos_ ERROR handle_mutual_auth(): Mutual authentication unavailable on 404 response
[20/Apr/2016 14:05:36 -0400] kerberos_ ERROR handle_mutual_auth(): Mutual authentication unavailable on 404 response
[20/Apr/2016 14:05:36 -0400] kerberos_ ERROR handle_mutual_auth(): Mutual authentication unavailable on 500 response
[20/Apr/2016 14:05:36 -0400] views ERROR Error in config validation by liboozie: Oozie returned an HTTP Status 500 Tomcat error page (Apache Tomcat/6.0.44, "The server encountered an internal error that prevented it from fulfilling this request.") whose exception section is:

java.lang.UnsupportedOperationException
    org.apache.oozie.util.MetricsInstrumentation.getVariables(MetricsInstrumentation.java:333)
    org.apache.oozie.servlet.BaseAdminServlet.instrToJson(BaseAdminServlet.java:339)
    org.apache.oozie.servlet.BaseAdminServlet.sendInstrumentationResponse(BaseAdminServlet.java:396)
    org.apache.oozie.servlet.BaseAdminServlet.doGet(BaseAdminServlet.java:127)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:289)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
    org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:589)
    org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:552)
    org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
    org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)

note: The full stack trace of the root cause is available in the Apache Tomcat/6.0.44 logs. (error 500)

Hue then logs this traceback for the same request:

Traceback (most recent call last):
  File "/opt/cloudera/fs-hue/hue/desktop/core/src/desktop/views.py", line 445, in _get_config_errors
    for confvar, error in validator(request.user):
  File "/opt/cloudera/fs-hue/hue/desktop/libs/liboozie/src/liboozie/conf.py", line 86, in config_validator
    intrumentation = api.get_instrumentation()
  File "/opt/cloudera/fs-hue/hue/desktop/libs/liboozie/src/liboozie/oozie_api.py", line 304, in get_instrumentation
    resp = self._root.get('admin/instrumentation', params)
  File "/opt/cloudera/fs-hue/hue/desktop/core/src/desktop/lib/rest/resource.py", line 97, in get
    return self.invoke("GET", relpath, params, headers=headers, allow_redirects=True)
  File "/opt/cloudera/fs-hue/hue/desktop/core/src/desktop/lib/rest/resource.py", line 78, in invoke
    urlencode=self._urlencode)
  File "/opt/cloudera/fs-hue/hue/desktop/core/src/desktop/lib/rest/http_client.py", line 161, in execute
    raise self._exc_class(ex)
RestException: (the same Tomcat HTTP Status 500 error page as above) (error 500)

Any idea what is going wrong?
04-14-2016
08:07 AM
@Javier - I don't know the exact version the fix was released in, but I think the JIRA we were hitting was HDFS-7798.
01-05-2016
05:19 AM
1 Kudo
bulmanp - The private_key parameter should be the contents of the private key file (in your case, the 2nd option should have worked). Here is the working code I use:

# host_install() wants the key text itself, not a path to the key file.
with open("/root/.ssh/id_rsa", "r") as f:
    id_rsa = f.read()

# Passwordless certificate login
apicommand = cm.host_install(
    user_name="root",
    private_key=id_rsa,
    host_names=hostIds,
    cm_repo_url=cm_repo_url,
    java_install_strategy="NONE",
    unlimited_jce=True,
).wait()
12-07-2015
05:56 AM
Thanks, David. It turns out the fix for the error we were seeing wasn't included in the version of CDH we are running. Once we upgrade to a version that includes the fix, we should no longer see this issue.
12-04-2015
12:04 PM
MJ - How is the replication schedule first created using the Java API?
11-20-2015
06:37 AM
We have been seeing errors consistently in the NN logs related to checkpointing. Our NNs are not able to automatically perform a checkpoint - the only way is for us to put them in Safe Mode and manually run a Save Namespace command. We see these errors over and over in the logs:

Exception in doCheckpoint
java.io.IOException: Exception during image upload: org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: org.apache.hadoop.security.authentication.util.SignerException: Invalid signature
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:221)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.doWork(StandbyCheckpointer.java:353)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.access$700(StandbyCheckpointer.java:260)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread$1.run(StandbyCheckpointer.java:280)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:410)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.run(StandbyCheckpointer.java:276)
Caused by: org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: org.apache.hadoop.security.authentication.util.SignerException: Invalid signature
at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:294)
at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:222)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:207)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:204)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Exception in doCheckpoint
java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:450)
at org.apache.hadoop.io.Text.encode(Text.java:431)
at org.apache.hadoop.io.Text.writeString(Text.java:491)
at org.apache.hadoop.fs.permission.PermissionStatus.write(PermissionStatus.java:117)
at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writePermissionStatus(FSImageSerialization.java:99)
at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeINodeFileAttributes(FSImageSerialization.java:216)
at org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.write(FileDiff.java:81)
at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotFSImageFormat.saveINodeDiffs(SnapshotFSImageFormat.java:89)
at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotFSImageFormat.saveFileDiffList(SnapshotFSImageFormat.java:102)
at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.writeINodeFile(FSImageSerialization.java:196)
at org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.saveINode2Image(FSImageSerialization.java:332)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveINode2Image(FSImageFormat.java:1433)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveChildren(FSImageFormat.java:1335)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1393)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.saveImage(FSImageFormat.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:1279)
at org.apache.hadoop.hdfs.server.namenode.FSImage.saveLegacyOIVImage(FSImage.java:973)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:193)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.doWork(StandbyCheckpointer.java:353)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.access$700(StandbyCheckpointer.java:260)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread$1.run(StandbyCheckpointer.java:280)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:410)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.run(StandbyCheckpointer.java:276)

Has anyone seen this or found a solution for it? We are running CM 5.4.7 and CDH 5.4.0.
11-20-2015
05:56 AM
I found out what I was doing wrong - I had a node on the source cluster with the Hive Gateway role installed, but it wasn't configured 100% correctly. For some reason, when the BDR jobs were launched, they kept running on this node and immediately failing, so I wasn't getting any errors. The export metastore step of the Hive replication job will run on a source cluster node that has either the HiveServer or Hive Gateway role installed.
11-12-2015
10:23 AM
1 Kudo
I'm having issues with running Hive replication jobs (these worked previously), but due to some unknown system or configuration changes, these jobs are now aborting almost immediately in the "Export Remote Hive Metastore" phase. I've been hunting around on both the source and target clusters, and I'm unable to find any trace of log files for this job. Does anyone know where I should be looking for this information?
So far I've looked in:
/var/log/hive on local filesystems where Hive Metastore and Hive Server are running
/user/hdfs/.cm/hive on target cluster
/var/run/cloudera-scm-agent/process/* on all nodes in both clusters
11-04-2015
02:06 PM
It looks like I solved the issue - it seems the python CM API has changed for the host_install command. Before, it took a file name as the private_key value; now it expects the contents of the key as a string.
11-04-2015
12:15 PM
I'm experiencing this error as well when using the python CM API and calling the host_install command, providing the id_rsa key from the server I'm invoking the python script from. I tried copying the id_rsa file to the CM server's /root/.ssh directory as well, but that didn't help. Where should the private key be located when referencing it from the host_install command in the python CM API?
10-20-2015
05:47 AM
I was looking through the python CM API documentation and see a number of classes in endpoints.types that look to be related to replicating data (ApiReplicationCommand, ApiReplicationSchedule, etc.), but I don't see anything related to their usage. Is there a way to invoke HDFS or Hive replication jobs through the python CM API?
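For the record, this is the shape of thing I was hoping exists. The method names below are my guesses from skimming endpoints.services, and the host, cluster, and service names are placeholders, so treat this as a sketch rather than working code:

# Sketch only: method and attribute names are guesses from the cm_api
# source (endpoints.services); verify them against your API version.
from cm_api.api_client import ApiResource

api = ApiResource("cm-host.example.com", username="admin", password="admin")
cluster = api.get_cluster("cluster1")
hdfs = cluster.get_service("hdfs")

# List any replication schedules already defined for the service...
for schedule in hdfs.get_replication_schedules():
    print(schedule.id)

# ...then kick one off by id and block until it finishes.
cmd = hdfs.trigger_replication_schedule(schedule_id=1, dry_run=False)
cmd = cmd.wait()
print(cmd.success)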
09-10-2015
07:39 AM
Hi, I'm using CDH 5.4.1 and am trying to set up an s3distcp copy from a CDH cluster to S3. We are using the jets3t framework to do this, which requires that the jets3t.properties file exist in the /etc/hadoop/conf directory on each node in the cluster. I have tried manually creating this file, but it looks like it is being periodically overwritten by CDH or CM. How should I go about getting this file to persist in the /etc/hadoop/conf directory? Is it possible? A safety valve will not work in this situation, because an entirely different properties file needs to exist in the /etc/hadoop/conf folder, outside of the view of Cloudera.
07-23-2015
05:03 AM
Thanks, Michalis. I was able to get the currently activated parcel by searching for the product "CDH" and selecting the one whose stage is "ACTIVATED", using the CM API.
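In case it helps anyone else, a minimal sketch of that lookup. Here cluster is the ApiCluster object from the cm_api client, and the attribute names are as I remember them from the docs:

# Pick the activated CDH parcel out of the cluster's parcel list.
active_cdh = None
for parcel in cluster.get_all_parcels():
    if parcel.product == "CDH" and parcel.stage == "ACTIVATED":
        active_cdh = parcel
        break

if active_cdh is not None:
    print(active_cdh.version)  # the full parcel version string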
07-15-2015
08:06 AM
I'm using version 10 of the python CM API and am having issues getting a reference to the correct parcel in our cluster. Currently we have 4 or 5 parcels located in the cluster, and I am not able to select the correct one. I see the ApiCluster method get_parcel(self, product, version), but I don't know where to find the full parcel version to pass into this method using other API calls. A version string of "5.4.0" is not specific enough to find the correct parcel. Thanks, Tyler
06-03-2015
05:24 AM
DaveB - Were you ever able to solve this? I'm experiencing the same issue with cloudera_manager_metastore_canary_test_db_hive_hivemetastore* databases accumulating in Hive. When I try to drop the database with the cascade parameter, or drop the cm_test_table, I see the same errors you did.