Member since: 01-18-2016
Posts: 166
Kudos Received: 32
Solutions: 20
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 335 | 01-14-2025 06:30 PM
 | 1490 | 04-06-2018 09:24 PM
 | 1528 | 05-02-2017 10:43 PM
 | 4180 | 01-24-2017 08:21 PM
 | 25112 | 12-05-2016 10:35 PM
03-03-2025
05:43 AM
@Maulz - You can use Knox as a proxy through the cdp-proxy-api topology to connect to Hive or Impala with basic authentication (username/password), just like Hue does. Using "cdp-proxy-api" assumes Knox is configured for basic authentication rather than SAML, JWT, etc. If it isn't, you can manually create a new topology for basic authentication with Hive. Here's how to enable Knox to expose Hive with the cdp-proxy-api if it isn't configured yet (you may need to select "all" for the transport mode): https://docs.cloudera.com/cdp-private-cloud-base/7.3.1/securing-hive/topics/hive_secure_knox.html

It sounds like you have both AD and an MIT KDC. If the user you want to connect to Hive as is in the MIT KDC realm, not AD, you can use a different krb5.conf file, or you can set up a one-way trust between AD and the MIT KDC. You can create your own krb5.conf configured for the MIT KDC (or even for the one-way trust, but the trust has to be established between the AD and MIT KDCs) and point to it with:

export KRB5_CONFIG=/path/to/your/custom_krb5.conf

You can use the default ticket cache file from krb5.conf (default_ccache_name = FILE:/tmp/krb5cc_%{uid}) or set one via the KRB5CCNAME environment variable. Of course, you'll need these set for Python as well.
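In case a concrete example helps, here's a minimal Python sketch of the Knox basic-auth route, assuming a reasonably recent impyla; the hostname, port 8443, and the gateway/cdp-proxy-api/hive path are assumptions based on a default cdp-proxy-api topology, so adjust them to yours:

from impala.dbapi import connect

# For the MIT KDC route instead, point Kerberos at your custom config/cache
# before connecting, e.g.:
#   os.environ["KRB5_CONFIG"] = "/path/to/your/custom_krb5.conf"
#   os.environ["KRB5CCNAME"] = "FILE:/tmp/krb5cc_<uid>"

# Basic authentication (username/password) to Hive through Knox over HTTP transport.
conn = connect(
    host="your-knox-host.example.com",       # hypothetical Knox gateway host
    port=8443,                                # typical Knox port; verify yours
    use_ssl=True,                             # pass ca_cert=... to validate the cert
    use_http_transport=True,
    http_path="gateway/cdp-proxy-api/hive",   # path exposed by the topology
    auth_mechanism="PLAIN",
    user="your_user",
    password="your_password",
)
cur = conn.cursor()
cur.execute("SHOW DATABASES")
print(cur.fetchall())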
02-28-2025
11:55 AM
@Maulz - Check this document for how to query Hive from Python 3; you don't need Hue for this: https://docs.cloudera.com/cdsw/1.10.5/import-data/topics/cdsw-accessing-data-from-apache-hive.html Its example uses Kerberos, but you can change or remove the authentication settings, depending on your authentication requirements.
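Along the lines of that document, a minimal sketch with impyla (assuming it and its SASL dependencies are installed, you already ran kinit, and HiveServer2 listens on the default port 10000; the host name is a placeholder):

from impala.dbapi import connect

# Kerberos (GSSAPI) connection to HiveServer2; relies on a valid ticket cache.
conn = connect(
    host="your-hiveserver2-host.example.com",  # placeholder
    port=10000,
    auth_mechanism="GSSAPI",
    kerberos_service_name="hive",
)
cur = conn.cursor()
cur.execute("SHOW TABLES")
print(cur.fetchall())

For an unsecured cluster you could drop the Kerberos arguments or switch auth_mechanism, per your setup.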
01-14-2025
06:30 PM
@Seaport, Let's address the Kerberos issue before Ranger. Can you kinit as the hdfs user? (On the NN, use the hdfs keytab at /var/run/cloudera-scm-agent/process/<a_number>-hdfs-NAMENODE/hdfs.keytab.) Once you have an hdfs Kerberos ticket, can you list directories?

Did you properly configure sssd, integrated with AD in the example.com realm, on ALL cluster nodes? For the HDFS issue you're seeing, the user-to-group mapping via sssd is required on the active NN, but eventually you need it working on all nodes. If you run the command "id mysuperuser", is it in mysupergroup?

For the Solr issue, check CM -> Solr -> Configuration -> HDFS Data Directory. It should be something like /solr. If it's correct, select CM -> Solr -> Actions -> Create HDFS Home Dir, then restart Solr. Note that after you install Ranger, the service name, znode, and HDFS home dir will change to something like /solr-infra. If you need Solr for your own data (not for service infrastructure like Ranger and Atlas), install a separate Solr instance after installing Ranger. Good luck.
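If it's convenient to check from Python on the active NN, this mirrors what "id mysuperuser" reports, using the same NSS lookup path that sssd serves ("mysuperuser" being the placeholder from above):

import grp
import os
import pwd

user = "mysuperuser"  # placeholder user from the example above
gid = pwd.getpwnam(user).pw_gid
# getgrouplist(3) asks NSS (and therefore sssd) directly; no group enumeration needed.
print([grp.getgrgid(g).gr_name for g in os.getgrouplist(user, gid)])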
11-22-2024
10:25 AM
1 Kudo
@weixin As a test, make sure you have a Kerberos ticket and try using curl:

curl -u : --negotiate http://YOURHOST:PORT/jmx

You may need to open a support case for this. I also highly recommend upgrading to CDP 7.1.9.
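If you end up needing the JMX output from Python rather than curl, here's a minimal sketch assuming the requests and requests-kerberos packages are installed and you hold a valid ticket; host and port are placeholders, as in the curl example:

import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

# SPNEGO (negotiate) authentication, the equivalent of: curl -u : --negotiate
resp = requests.get(
    "http://YOURHOST:PORT/jmx",  # placeholder host and port
    auth=HTTPKerberosAuth(mutual_authentication=OPTIONAL),
)
resp.raise_for_status()
print(resp.json())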
02-28-2024
09:41 AM
When using two realms, there has to be a trust between the realms, and your krb5.conf has to be configured properly to handle both realms on both the client and the server. Setting this up isn't super difficult if you've done it once or twice, but it can be hard if it's new to you. The krb5.conf requires proper host or domain realm mapping (see the sketch below).

If you set up a one-way trust (it can also be a two-way trust), and assuming you use an MIT KDC for cluster service principals while AD is the other realm, then the MIT KDC has to trust AD, but AD doesn't have to trust the MIT KDC. To set up the trust you need to make configuration changes in both environments. Here's an example: https://community.cloudera.com/t5/Community-Articles/One-Way-Trust-MIT-KDC-to-Active-Directory/ta-p/247638

If the KDC trust isn't the issue, there's probably an issue with the driver configuration. And if this is being done on a Windows computer, you may need to configure the Windows machine to know about the other realm. I also recommend opening a Cloudera support case.
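As a rough illustration only (all realm and domain names below are made up), the realm definitions and host/domain-realm mapping in krb5.conf would look something like this for an MIT KDC cluster realm plus an AD realm:

[realms]
  CLUSTER.EXAMPLE.COM = {
    kdc = kdc1.cluster.example.com
    admin_server = kdc1.cluster.example.com
  }
  AD.EXAMPLE.COM = {
    kdc = ad1.ad.example.com
    admin_server = ad1.ad.example.com
  }

[domain_realm]
  .cluster.example.com = CLUSTER.EXAMPLE.COM
  .ad.example.com = AD.EXAMPLE.COM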
02-28-2024
09:25 AM
Hi, do you have a question? The HDP Sandbox is no longer available or supported.
01-19-2024
04:54 PM
2 Kudos
That's a lot of log, and some of the error messages you see are normal, so I'm not sure what your issue is. Do you see the Cloudera Management Service below the cluster services in CM (at the very bottom when you click Cloudera Manager, top left)? If so, click Instances and figure out which components/roles are not started. You can start them one by one, then look at the startup logs in the CM UI pop-up after each starts or fails. Check in the order of STDOUT, STDERR, and lastly ROLE LOG, which is the log after the role has started. You may need to check the Full Log.
01-17-2024
06:41 AM
1 Kudo
Check for errors in the Cloudera Manager server log file, /var/log/cloudera-scm-server/cloudera-scm-server.log. Also, I see that the URL is nitbucc-vad001:7180. You should use fully qualified domain names (e.g., nitbucc-vad001.xyz.local) for hosts, including in the playbook configurations. I'm not saying that's the issue, but you may run into problems later if you don't use FQDNs (especially if you secure the cluster).
06-19-2018
08:57 PM
Thanks, Pardeep. To make it 500x faster, pass 500 files per call to the hadoop command. By changing the second line above, we can do this instead:

$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
# Now batching with xargs -n 500 (or --max-args 500), 500 paths per hdfs call
$ cat /tmp/under_replicated_files | xargs -n 500 hdfs dfs -setrep 1
04-06-2018
09:24 PM
1 Kudo
The code for creating principals in AD is here: ambari-server/src/main/java/org/apache/ambari/server/serveraction/kerberos/ADKerberosOperationHandler.java