Member since: 08-16-2016
Posts: 14
Kudos Received: 2
Solutions: 0
07-10-2017
11:27 PM
We have the superuser group defined as 'supergroup' in our configuration. However, this group does not exist on any of the nodes. If I have to set up this group and add a couple of other accounts that should have superuser access to HDFS, where should this Linux group be created? Should it be created on all nodes in the cluster, or is it sufficient to create the Linux group on the NameNode hosts only?
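For anyone reading along, the setting in question is dfs.permissions.superusergroup in hdfs-site.xml; a minimal sketch of the relevant property ('supergroup' is the default value):

```xml
<!-- hdfs-site.xml: names the Linux group whose members are HDFS superusers -->
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>supergroup</value>
</property>
```

With the default shell-based group mapping, group lookups are performed by the NameNode process, which is why the question of where the Linux group must exist matters.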
Labels: HDFS
07-06-2017
02:05 AM
We have a few commands which use the HDFSFindtool and are initiated from the hdfs user's crontab. There is also another crontab entry for the same user that executes 'hdfs dfs -ls'. These work without any issues, and we do not issue a kinit command before running them from the crontab. However, when we recently set up a shell script that issues an 'hdfs dfs -du' command from the hdfs user's crontab, it started throwing the GSS initiate failure below:

WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Does the hdfs user have to issue a kinit command before running the script? If so, why do the other commands work fine without it?
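For illustration, a crontab entry that obtains a Kerberos ticket from a keytab before running the command could look like the sketch below. The keytab path, principal, and schedule are placeholders, not values from our cluster:

```shell
# hypothetical crontab entry for the hdfs user -- adjust keytab/principal
0 2 * * * kinit -kt /etc/security/keytabs/hdfs.keytab hdfs@EXAMPLE.COM && hdfs dfs -du -s /user > /tmp/hdfs_du.out 2>&1
```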
Labels: Kerberos
11-22-2016
01:24 AM
@roczei Thank you for your response. Do you mean that, functionally, there will be no difference with a gateway set up outside the cluster? I could submit applications and access HDFS just as I would from a gateway within the cluster, correct?
11-21-2016
12:46 AM
2 Kudos
We have a requirement to add a new gateway to our Hadoop cluster in addition to the one we already have. I have a couple of queries if anyone can help.
1. Are there any concerns in adding multiple gateways to a CDH cluster?
2. I came across the article below describing how to set up a gateway outside the cluster:
https://cloudera-portal.force.com/articles/KB_Article/Install-Cloudera-Client-on-non-Cloudera-server?popup=false&navBack=H4sIAAAAAAAAAIuuVipWslLyzssvz0lNSU_1yM9NVdJRygaKFSSmp4ZkluSA-KVAvn58aaZ-NkyhPpCDosu-ODWxKDnDVqk2FgDeHqwIVQAAAA
We have a Kerberized cluster. Is it recommended to set up a gateway outside the cluster, and will it be possible to submit jobs from this external gateway? (We have Oozie and Httpfs services running on a different host within the cluster.) Are there any disadvantages to this approach? Thank you in advance for your valuable inputs.
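As a rough illustration of what an external gateway typically needs, the sketch below prints the sync commands for copying client configuration from an existing gateway. The hostname and file list are placeholders, not values from our cluster:

```shell
# Hypothetical sketch: sync client configs to a new external gateway.
# GW and the paths are placeholders -- adjust for your environment.
GW=existing-gw.example.com
for f in /etc/hadoop/conf /etc/krb5.conf; do
  # print the commands instead of running them (dry run)
  echo "rsync -a ${GW}:${f} ${f}"
done
```

A Kerberized external gateway would also need a working /etc/krb5.conf pointing at the same KDC, which is why it is included above.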
Labels: HDFS
11-04-2016
01:47 AM
Hi Michalis, It worked just fine for me. The dumping process took only a few seconds and there was no impact. Thank you!
... View more
10-27-2016
01:58 AM
Thank you, Michalis. That makes sense. I was confused because the database name was not specified in the Cloudera documentation. Also, do we have to take any precautions before dumping the databases (CM, Hive metastore, and Oozie)? Does it affect anything if I do it on a cluster while jobs are running?
10-26-2016
09:16 AM
The document says to run the command below as the root user from the host running the CM server:

# pg_dump -h hostname -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)

I have a query here. There is no database specified to be dumped (no -d switch); it only specifies the user as scm (-U). How does it dump the Cloudera Manager database? We use the embedded database in our cluster, and all databases (Hive metastore, Oozie, etc.) are configured on the same host where the Cloudera Manager server and the embedded PostgreSQL database run. If I have to back up the other databases, is it enough to provide only the user (-U) switch? Does it dump the database owned by that user by default? Also, would it create any performance issues if I dump the Cloudera Manager, Hive, and Oozie databases in a production environment?
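Worth noting: when no database name is given, pg_dump falls back to the libpq default, which is a database named after the connection user, so '-U scm' dumps the 'scm' database. A sketch of explicit per-database dumps follows; the hostname and the database/user names ('scm', 'hive', 'oozie') are assumptions, and the real values live in /etc/cloudera-scm-server/db.properties and each service's database settings:

```shell
# Hypothetical sketch: explicit per-database dumps against the embedded
# Cloudera Manager PostgreSQL on port 7432. Names below are placeholders.
HOST=cm-host.example.com            # placeholder for the CM server host
STAMP=$(date +%Y%m%d)               # datestamp for the backup filename
for DB in scm hive oozie; do        # assumed database names -- verify first
  # print the commands instead of running them (dry run)
  echo "pg_dump -h $HOST -p 7432 -U $DB -d $DB > /tmp/${DB}_db_backup.$STAMP"
done
```

pg_dump takes a consistent snapshot without blocking normal reads and writes, though it does add some I/O load while it runs.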
08-29-2016
07:25 AM
Thank you for the reply. In the current situation, the dataDir and dataLogDir are in the same location. We have the files below in the directory:

acceptedEpoch
currentEpoch
log.xxxxxx
snapshot.xxxx

If I have to move only the dataLogDir, do I just need to copy the log.xxx files to the new location on all ZooKeeper servers, update the directory in the configuration, and restart the ZooKeeper instances? Could you please confirm whether that is correct? The epoch files and snapshot.xxx belong in the dataDir, correct?
08-25-2016
06:29 AM
I am looking for guidance on moving the ZooKeeper dataDir and dataLogDir to a dedicated disk, as per best practice. I have been unable to find documentation that can help me. Currently, the disk holding the dataDir and dataLogDir is shared with other processes. Any assistance is appreciated.
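For reference, separating the two directories is a zoo.cfg change on each ZooKeeper server; a minimal sketch, where the paths are hypothetical examples only:

```shell
# zoo.cfg -- paths below are placeholders
dataDir=/data/zookeeper            # snapshots (and the myid file) live here
dataLogDir=/zk-txlog/zookeeper     # transaction logs, ideally on a dedicated disk
```

When dataLogDir is not set, ZooKeeper writes both snapshots and transaction logs under dataDir, which matches the shared-disk situation described above.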
Labels: Apache Zookeeper