Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 927 | 06-04-2025 11:36 PM |
| | 1532 | 03-23-2025 05:23 AM |
| | 760 | 03-17-2025 10:18 AM |
| | 2730 | 03-05-2025 01:34 PM |
| | 1809 | 03-03-2025 01:09 PM |
01-03-2021
03:36 PM
1 Kudo
@bvishal Sorry, I was away for a while.

1) Yes, I have entered the 'admin principal' in the same format, e.g. example/admin@EXAMPLE.AI, in the pop-up window. Somehow I feel your values are not correct: in the Ambari wizard you should enter either root/admin@EXAMPLE.AI or admin/admin@EXAMPLE.AI, depending on the value you gave when adding the admin principal when you initially ran kadmin.local.

2) Also, you checked the krb5.conf and found a section for the realm (EXAMPLE.COM) inside the [realms] part of the file. That part of the krb5.conf is wrong; it should be EXAMPLE.AI. Sample of /etc/krb5.conf:

```
[libdefaults]
  default_realm = EXAMPLE.AI
  dns_lookup_realm = false
  dns_lookup_kdc = false
  ticket_lifetime = 24h
  forwardable = true
  udp_preference_limit = 1000000
  default_tkt_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
  default_tgs_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1
  permitted_enctypes = des-cbc-md5 des-cbc-crc des3-cbc-sha1

[realms]
  EXAMPLE.AI = {
    kdc = kdc.EXAMPLE.AI
    admin_server = kdc.EXAMPLE.AI
    default_domain = EXAMPLE.AI
  }

[domain_realm]
  .example.ai = EXAMPLE.AI
  example.ai = EXAMPLE.AI

[logging]
  kdc = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmin.log
  default = FILE:/var/log/krb5lib.log
```

Replace all occurrences of EXAMPLE.COM with EXAMPLE.AI in the kdc.conf and kadm5.acl. Please let me know if you still need help.
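For the replacement step, a sed one-liner like the sketch below does it in place; the paths assume the standard MIT KDC layout under /var/kerberos/krb5kdc/, so adjust them to wherever your kdc.conf and kadm5.acl actually live:

```bash
# Replace every occurrence of the old realm with the new one,
# keeping .bak backups of the originals. Paths are the usual
# MIT KDC defaults; adjust if your layout differs.
sed -i.bak 's/EXAMPLE\.COM/EXAMPLE.AI/g' \
  /var/kerberos/krb5kdc/kdc.conf \
  /var/kerberos/krb5kdc/kadm5.acl

# Restart the KDC and admin server so the change takes effect.
systemctl restart krb5kdc kadmin
```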
01-03-2021
02:49 PM
@HoldYourBreath I now see what's happening: you need to start CM and all the roles on the Quickstart VM before you can connect successfully through HUE. I also think you are really short on memory: as you can see, Cloudera Express needs 8 GB of memory and 2 CPUs, while Cloudera Enterprise needs at least 10 GB and 2 CPUs (see the highlighted parts). I would advise you to spin up a Windows 10 VM in Azure and use that for your learning. Be aware that Cloudera no longer provides access to Quickstart; you have the CDP trial instead! Was your question answered? If so, make sure to mark the answer as the accepted solution. If you find a reply useful, kudo the answer by hitting the thumbs up button.
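If memory is the bottleneck, you can resize a VirtualBox VM from the command line before booting it; this is a sketch, and the VM name "Cloudera-Quickstart" is a placeholder for whatever your imported appliance is called:

```bash
# List registered VMs to find the exact name of your appliance.
VBoxManage list vms

# Give the (powered-off) VM 10 GB of RAM and 2 vCPUs -- the minimum
# mentioned above for Cloudera Enterprise. "Cloudera-Quickstart" is
# a placeholder; use the name from the listing above.
VBoxManage modifyvm "Cloudera-Quickstart" --memory 10240 --cpus 2
```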
01-03-2021
01:48 PM
@rohit_r_sharma Can you share the syntax you used for the topic creation? Is your cluster Kerberized? Is your Ranger Kafka plugin enabled? Please respond and tag me!
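For reference, on HDP-era clusters topic creation typically goes through the kafka-topics script; the sketch below uses placeholder hosts, topic name, and keytab/JAAS paths, and on a Kerberized cluster you need a ticket and a client JAAS config first:

```bash
# On a Kerberized cluster, authenticate first; the keytab path,
# principal, and JAAS config path are placeholders for your environment.
kinit -kt /etc/security/keytabs/kafka.service.keytab kafka/$(hostname -f)
export KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/kafka_client_jaas.conf"

# Create the topic; the ZooKeeper quorum and topic name are placeholders.
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create \
  --topic test-topic \
  --partitions 3 \
  --replication-factor 3 \
  --zookeeper zk-host.example.com:2181
```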
01-03-2021
01:45 PM
1 Kudo
@bvishal You don't really need to mix two different databases [PostgreSQL and MySQL]. You can use MySQL, or MariaDB (a free fork of MySQL), in the advanced database configuration for your cluster. MySQL has the typical SQL syntax; PostgreSQL is another world! You don't need to install MySQL on the Ambari agent hosts, because that would mean that with 20 nodes you would be running 20 MySQL/MariaDB databases. Usually, you install MySQL/MariaDB on the Ambari host and, apart from the Ambari database, you create the Hive, Oozie, Ranger, Ranger KMS, etc. databases there. If you are deploying with Ambari, then the ambari-agents are deployed and configured automatically by Ambari. Was your question answered? If so, make sure to mark the answer as the accepted solution. If you find a reply useful, kudo the answer by hitting the thumbs up button.
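As a rough sketch of that single-host layout, creating the service databases on the Ambari host looks something like this; every database name, user, and password below is a placeholder to adapt:

```bash
# Run on the Ambari host. All names and passwords are placeholders --
# substitute your own, and add the remaining service databases the same way.
mysql -u root -p <<'EOF'
CREATE DATABASE ambari;
CREATE DATABASE hive;
CREATE DATABASE oozie;
CREATE DATABASE ranger;
CREATE DATABASE rangerkms;

CREATE USER 'ambari'@'%' IDENTIFIED BY 'ChangeMe1!';
GRANT ALL PRIVILEGES ON ambari.* TO 'ambari'@'%';
CREATE USER 'hive'@'%' IDENTIFIED BY 'ChangeMe2!';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';

FLUSH PRIVILEGES;
EOF
```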
01-03-2021
12:17 PM
@PauloNeves Yes, the command `show databases` will list all databases in a Hive instance, whether you are authorized to access them or not. I am sure this is a cluster devoid of Ranger or Sentry, which are the two authorization tools in Cloudera!

Once the Ranger plugin is enabled, authorization is delegated to Ranger to provide fine-grained data access control in Hive, including row-level filtering and column-level masking. This is the recommended setting, as it makes your database administration easier by providing centralized security administration, access control, and detailed auditing of user access within Hadoop, Hive, HBase, and other components in the ecosystem. I had already enabled the Ranger plugin for Hive on my cluster, but all the same it confirms what I wrote above: once the Ranger plugin is enabled for a component, i.e. Hive, HBase, or Kafka, authorization is managed exclusively through Ranger.

Database listing before Ranger: below is what happens when my user sheltong has not explicitly been given authorization through Ranger (see screenshots). I see no databases, though I have over 8 of them. Compare the output of the hive user, who has explicit access to all the tables due to the default policy: he can see the databases.

Database listing after Ranger: after creating a policy explicitly giving the user sheltong access to 3 databases (policy granting explicit access to 3 databases), when I re-run `show databases`: bingo!

Back to your question: `show tables from forbidden_db` returns an empty list. That can be true, especially if the database is empty, i.e. has no tables, as in the screenshot below: though I have access to the database, it's empty. Once I create a table and re-run the statement, I am able to see the table.

I hope this demonstrates the power of Ranger and explains what you may be encountering. I am also thinking that if your cluster has the Ranger Hive plugin enabled, you could have select on the databases, but you will need at minimum an explicit select permission on the underlying tables to be able to see them.

Happy Hadooping
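If you want to reproduce the test yourself, a beeline session against HiveServer2 is enough; this is a sketch where the JDBC URL and host are placeholders for your environment:

```bash
# Connect as the user whose Ranger policies you are testing; the
# HiveServer2 host in the JDBC URL is a placeholder.
beeline -u "jdbc:hive2://hiveserver2.example.com:10000/default" -n sheltong \
  -e "SHOW DATABASES; SHOW TABLES FROM forbidden_db;"
# With the Ranger Hive plugin enabled, SHOW DATABASES returns only what
# the user's policies allow; SHOW TABLES is empty if the database has no
# tables OR the user lacks table-level select.
```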
01-03-2021
03:18 AM
@nishant2305 Can you share a walkthrough of your setup, including the generation of the cert using the TLS toolkit? Just wondering, does this host actually exist: ldap://ldap_hostname:389? And the associated LDIF entries: dc=example,dc=org and cn=admin,dc=example,dc=org. Please revert.
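Before pointing NiFi at it, you can sanity-check that host and the admin DN with a plain ldapsearch bind; this is just a sketch using the values from the config above, and -W will prompt for whatever password you set for cn=admin:

```bash
# Simple bind as the admin DN against the host from the config above.
# If this fails, NiFi will fail to authenticate against LDAP as well.
ldapsearch -x -H ldap://ldap_hostname:389 \
  -D "cn=admin,dc=example,dc=org" -W \
  -b "dc=example,dc=org" "(objectClass=*)"
```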
01-02-2021
04:07 PM
1 Kudo
@Chahat_0 Hadoop is designed to ensure that compute (NodeManagers) runs as close to the data (DataNodes) as possible; containers for jobs are usually allocated on the same nodes where the data is present. Hence, in a typical Hadoop cluster, both the DataNode and the NodeManager run on the same machine. The NodeManager is the ResourceManager's slave process, while the DataNode is the NameNode's slave process; the NameNode is responsible for coordinating HDFS functions.

- ResourceManager: runs as a master daemon and manages resource allocation in the cluster.
- NodeManager: runs as a slave daemon and is responsible for the execution of tasks on every single DataNode.
- NodeManagers manage the containers requested by jobs; DataNodes manage the data.

The NodeManager (NM) is YARN's per-node agent and takes care of the individual compute nodes in a Hadoop cluster. This includes keeping up to date with the ResourceManager (RM), overseeing containers' life-cycle management, monitoring resource usage (memory, CPU) of individual containers, tracking node health, managing logs, and running auxiliary services that may be exploited by different YARN applications. The NodeManager communicates directly with the ResourceManager.

The ResourceManager and NameNode are both master components [processes] that can run in a single or HA setup; they should run on separate, identical, usually higher-spec servers [nodes] compared to the DataNodes. ZooKeeper is another important component.

The ResourceManager and NodeManagers combine to form a data-computation framework: the ResourceManager acts as the scheduler and allocates resources amongst all the applications in the system, while a NodeManager, taking direction from the ResourceManager, runs on each node in the cluster and manages the resources available on that single node. The ApplicationMaster, a framework-specific library, is responsible for running a specific YARN job, negotiating resources from the ResourceManager, and working with the NodeManagers to execute and monitor containers.

Hope that helps
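You can see that per-node split directly from the YARN CLI; run this from any node with the client configs in place (the host:port NodeId below is a placeholder taken from the listing):

```bash
# List all NodeManagers known to the ResourceManager, including state
# and the containers/memory/vcores currently in use on each node.
yarn node -list -all

# Show detailed status (total vs. used memory and vcores) for one node;
# the host:port pair is a placeholder from the listing above.
yarn node -status worker01.example.com:45454
```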
01-02-2021
03:37 PM
1 Kudo
@HoldYourBreath I downloaded a CDH Quickstart VM and imported it into my new Oracle VirtualBox 6.1; please find the attached screenshots showing my configs.

- 01.JPG --> the network setup: Adapter 1 is a Bridged Adapter and Adapter 2 is NAT
- 02.JPG --> Bridged Adapter details
- 02b.JPG --> memory setting; I gave my Quickstart sandbox 16 GB and 2 CPUs (my host has 32 GB and 4 CPUs)

I started the Quickstart sandbox and was presented with the classic UI:

- 03.JPG --> CDH Quickstart default sandbox UI
- 04.JPG --> clicked into the console (see the arrow) and ran ifconfig; it clearly picked up a Bridged Adapter class C 192.168.x IP from my LAN, plus the default 10.x IP
- 05.JPG --> the host's hosts file entry with the FQDN, though I used the IP
- 05b.JPG --> combined UIs showing the VM, the hosts file, and CM opened in Chrome on port 7180
- 06.JPG --> started these default Quickstart roles
- 07.JPG --> roles all running OK
- 08.JPG --> detail of the HDFS overview on port 50070; I didn't make any changes to the firewall, etc.
- 09.JPG --> file browser
- 10.JPG --> HUE UI on port 8888
- 11.JPG --> files/docs in the HUE browser

I didn't encounter the same problems as you, but I want to remind you to ensure you have enough memory to allocate to your sandbox.
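For the hosts file step (05.JPG), the entry simply maps the VM's bridged IP to its FQDN; in this sketch, 192.168.1.50 is a placeholder for whatever IP ifconfig reported in your VM, and quickstart.cloudera is the Quickstart VM's default hostname:

```bash
# Append the VM's bridged IP and FQDN to the host machine's hosts file.
# Replace 192.168.1.50 with the 192.168.x IP from ifconfig in your VM.
echo "192.168.1.50  quickstart.cloudera  quickstart" | sudo tee -a /etc/hosts
```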
01-01-2021
02:05 PM
@prasanna06 Your problem resembles this known one: check your cluster UI to ensure that workers are registered and have sufficient resources. Happy Hadooping
01-01-2021
02:00 PM
@chhaya_vishwaka Can you confirm you went through all the prerequisites for adding classic clusters and checked against the Cloudera Support Matrix? Please revert.