Member since: 01-18-2016
Posts: 166
Kudos Received: 32
Solutions: 20
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 335 | 01-14-2025 06:30 PM
 | 1490 | 04-06-2018 09:24 PM
 | 1528 | 05-02-2017 10:43 PM
 | 4180 | 01-24-2017 08:21 PM
 | 25112 | 12-05-2016 10:35 PM
03-03-2025
05:43 AM
@Maulz - You can use Knox as a proxy through the cdp-proxy-api topology to connect to Hive or Impala with basic authentication (username/password), just like Hue does. Using "cdp-proxy-api" assumes Knox is configured for basic authentication rather than SAML, JWT, etc. If it isn't, you can manually create a new topology for basic authentication with Hive. Here's how to enable Knox to expose Hive with the cdp-proxy-api if it isn't configured yet (you may need to select "all" for the transport mode): https://docs.cloudera.com/cdp-private-cloud-base/7.3.1/securing-hive/topics/hive_secure_knox.html

It sounds like you have both AD and an MIT KDC. If the user you want to connect to Hive as is in the MIT KDC realm, not AD, you can use a different krb5.conf file, or you can set up a one-way trust between AD and the MIT KDC. You can create your own krb5.conf configured for the MIT KDC (or even for the one-way trust, but the trust has to be established between the AD and MIT KDCs) and point to it with:

export KRB5_CONFIG=/path/to/your/custom_krb5.conf

You can use the default ticket cache file from krb5.conf (default_ccache_name = FILE:/tmp/krb5cc_%{uid}) or set one via the KRB5CCNAME environment variable. Of course, you'll need these set for Python as well.
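In case a concrete example helps, here's a minimal Python sketch of the Knox basic-auth route, assuming a reasonably recent impyla; the hostname, port 8443, and the gateway/cdp-proxy-api/hive path are assumptions based on a default cdp-proxy-api topology, so adjust them to yours:

from impala.dbapi import connect

# For the MIT KDC route instead, point Kerberos at your custom config/cache
# before connecting, e.g.:
#   os.environ["KRB5_CONFIG"] = "/path/to/your/custom_krb5.conf"
#   os.environ["KRB5CCNAME"] = "FILE:/tmp/krb5cc_<uid>"

# Basic authentication (username/password) to Hive through Knox over HTTP transport.
conn = connect(
    host="your-knox-host.example.com",       # hypothetical Knox gateway host
    port=8443,                                # typical Knox port; verify yours
    use_ssl=True,                             # pass ca_cert=... to validate the cert
    use_http_transport=True,
    http_path="gateway/cdp-proxy-api/hive",   # path exposed by the topology
    auth_mechanism="PLAIN",
    user="your_user",
    password="your_password",
)
cur = conn.cursor()
cur.execute("SHOW DATABASES")
print(cur.fetchall())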
02-28-2025
11:55 AM
@Maulz - Check this document for how to query Hive from Python 3; you don't need Hue for this: https://docs.cloudera.com/cdsw/1.10.5/import-data/topics/cdsw-accessing-data-from-apache-hive.html Its example uses Kerberos, but you can change or remove the authentication settings, depending on your authentication requirements.
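Along the lines of that document, a minimal sketch with impyla (assuming it and its SASL dependencies are installed, you already ran kinit, and HiveServer2 listens on the default port 10000; the host name is a placeholder):

from impala.dbapi import connect

# Kerberos (GSSAPI) connection to HiveServer2; relies on a valid ticket cache.
conn = connect(
    host="your-hiveserver2-host.example.com",  # placeholder
    port=10000,
    auth_mechanism="GSSAPI",
    kerberos_service_name="hive",
)
cur = conn.cursor()
cur.execute("SHOW TABLES")
print(cur.fetchall())

For an unsecured cluster you could drop the Kerberos arguments or switch auth_mechanism, per your setup.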
01-14-2025
06:30 PM
@Seaport, Let's address the Kerberos issue before Ranger. Can you kinit as the hdfs user? (On the NN, use the hdfs keytab at /var/run/cloudera-scm-agent/process/<a_number>-hdfs-NAMENODE/hdfs.keytab.) Once you have an hdfs Kerberos ticket, can you list directories?

Did you properly configure sssd, integrated with AD in the example.com realm, on ALL cluster nodes? For the HDFS issue you're seeing, the user-to-group mapping via sssd is required on the active NN, but eventually you need it working on all nodes. If you run the command "id mysuperuser", is it in mysupergroup?

For the Solr issue, check CM -> Solr -> Configuration -> HDFS Data Directory. It should be something like /solr. If it's correct, select CM -> Solr -> Actions -> Create HDFS Home Dir, then restart Solr. Note that after you install Ranger, the service name, znode, and HDFS home dir will change to something like /solr-infra. If you need Solr for your own data (not for service infrastructure like Ranger and Atlas), install a separate Solr instance after installing Ranger. Good luck.
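If it's convenient to check from Python on the active NN, this mirrors what "id mysuperuser" reports, using the same NSS lookup path that sssd serves ("mysuperuser" being the placeholder from above):

import grp
import os
import pwd

user = "mysuperuser"  # placeholder user from the example above
gid = pwd.getpwnam(user).pw_gid
# getgrouplist(3) asks NSS (and therefore sssd) directly; no group enumeration needed.
print([grp.getgrgid(g).gr_name for g in os.getgrouplist(user, gid)])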
11-22-2024
10:25 AM
1 Kudo
@weixin As a test, make sure you have a Kerberos ticket and try using curl:

curl -u : --negotiate http://YOURHOST:PORT/jmx

You may need to open a support case for this. I also highly recommend upgrading to CDP 7.1.9.
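If you end up needing the JMX output from Python rather than curl, here's a minimal sketch assuming the requests and requests-kerberos packages are installed and you hold a valid ticket; host and port are placeholders, as in the curl example:

import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

# SPNEGO (negotiate) authentication, the equivalent of: curl -u : --negotiate
resp = requests.get(
    "http://YOURHOST:PORT/jmx",  # placeholder host and port
    auth=HTTPKerberosAuth(mutual_authentication=OPTIONAL),
)
resp.raise_for_status()
print(resp.json())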
02-28-2024
09:41 AM
When using two realms, there has to be a trust between the realms, and your krb5.conf has to be configured properly to handle both realms on both the client and the server. Setting this up isn't super difficult if you've done it once or twice, but it can be hard if it's new to you. The krb5.conf requires proper host or domain realm mapping (see the sketch below).

If you set up a one-way trust (it can also be a two-way trust), and assuming you use an MIT KDC for cluster service principals while AD is the other realm, then the MIT KDC has to trust AD, but AD doesn't have to trust the MIT KDC. To set up the trust you need to make configuration changes in both environments. Here's an example: https://community.cloudera.com/t5/Community-Articles/One-Way-Trust-MIT-KDC-to-Active-Directory/ta-p/247638

If the KDC trust isn't the issue, there's probably an issue with the driver configuration. And if this is being done on a Windows computer, you may need to configure the Windows machine to know about the other realm. I also recommend opening a Cloudera support case.
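As a rough illustration only (all realm and domain names below are made up), the realm definitions and host/domain-realm mapping in krb5.conf would look something like this for an MIT KDC cluster realm plus an AD realm:

[realms]
  CLUSTER.EXAMPLE.COM = {
    kdc = kdc1.cluster.example.com
    admin_server = kdc1.cluster.example.com
  }
  AD.EXAMPLE.COM = {
    kdc = ad1.ad.example.com
    admin_server = ad1.ad.example.com
  }

[domain_realm]
  .cluster.example.com = CLUSTER.EXAMPLE.COM
  .ad.example.com = AD.EXAMPLE.COM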
02-28-2024
09:25 AM
Hi, do you have a question? The HDP Sandbox is no longer available or supported.
01-19-2024
04:54 PM
2 Kudos
That's a lot of log, and some of the error messages you see are normal, so I'm not sure what your issue is. Do you see the Cloudera Management Service below the cluster services in CM (at the very bottom when you click Cloudera Manager, top left)? If so, click Instances and figure out which components/roles are not started. You can start them one by one, then look at the startup logs in the CM UI pop-up after each starts or fails. Check in the order of STDOUT, STDERR, and lastly ROLE LOG, which is the log after the role has started. You may need to check the Full Log.
01-17-2024
06:41 AM
1 Kudo
Check for errors in the Cloudera Manager server log file, /var/log/cloudera-scm-server/cloudera-scm-server.log. Also, I see that the URL is nitbucc-vad001:7180. You should use fully qualified domain names (e.g., nitbucc-vad001.xyz.local) for hosts, including in the playbook configurations. I'm not saying that's the issue, but you may run into problems later if you don't use FQDNs (especially if you secure the cluster).
06-19-2018
08:57 PM
Thanks, Pardeep. To make it 500x faster, pass 500 files per call to the hadoop command. By changing the second line above, we can do this instead:

$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
# Now batching with xargs -n 500 (or --max-args 500), 500 paths per hdfs call
$ cat /tmp/under_replicated_files | xargs -n 500 hdfs dfs -setrep 1
04-06-2018
09:24 PM
1 Kudo
The code for creating principals in AD is here: ambari-server/src/main/java/org/apache/ambari/server/serveraction/kerberos/ADKerberosOperationHandler.java