Member since: 01-19-2017
Posts: 3681
Kudos Received: 633
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1640 | 06-04-2025 11:36 PM |
| | 2089 | 03-23-2025 05:23 AM |
| | 997 | 03-17-2025 10:18 AM |
| | 3776 | 03-05-2025 01:34 PM |
| | 2601 | 03-03-2025 01:09 PM |
08-28-2020
12:06 PM
@mahfooz That property can only be modified in the hive-site.xml cluster configuration file. You will then need to restart the services with stale Hive configuration, and it becomes a cluster-wide change rather than a runtime change. HTH
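For reference, a cluster-wide override in hive-site.xml is a property block like the sketch below. The property name and value here are placeholders, since the original question's property is not shown:

```xml
<!-- hive-site.xml: placeholder property, not the one from the question -->
<property>
  <name>hive.example.property</name>
  <value>new-value</value>
</property>
```

After editing, restart the affected Hive services so the stale configuration is picked up.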
08-14-2020
12:38 AM
1 Kudo
@mike_bronson7 Let me try to answer all three of your questions in one shot.

[snapshot]
ZooKeeper has two types of log files: snapshots and transaction logs. As changes are made to the znodes (i.e. additions or deletions of znodes), they are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem; this snapshot supersedes all previous logs. To put you in context, it's like the edit logs and the fsimage in the NameNode architecture: all changes made in HDFS are logged in the edit logs, and when a checkpoint kicks in, the Secondary NameNode merges the edit logs with the old fsimage to incorporate the changes made since the last checkpoint. So the ZooKeeper snapshot is analogous to the fsimage, as it contains the current state of the znode entries and ACLs.

Snapshot policy
In the command shared earlier, the snapshot count parameter is -n <count>. If you really want more headroom you can increase it to 5 or 7, but I think 3 suffices, so I use the autopurge feature and keep only 3 snapshots and 3 transaction logs. When enabled, the ZooKeeper auto-purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively, and deletes the rest. The default is 3, and the minimum value is 3.

Corrupt snapshots
ZooKeeper might be unable to read its database and fail to come up because of file corruption in the transaction logs of the ZooKeeper server; you will see an IOException while loading the ZooKeeper database. In such a case, make sure all the other servers in your ensemble are up and working. Use the four-letter "stat" command on the client port to see if they are in good health. After you have verified that all the other servers of the ensemble are up, you can go ahead and clean the database of the corrupt server.

Solution
Delete all the files in dataDir/version-2 and dataLogDir/version-2/, then restart the server.
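As a sketch, the health check and cleanup described above might look like this. The host names and data directories are assumptions; check your zoo.cfg for the real dataDir and dataLogDir:

```shell
# Four-letter "stat" command against the other ensemble members
# (default client port 2181 assumed) to confirm they are healthy
echo stat | nc zk2.example.com 2181
echo stat | nc zk3.example.com 2181

# Only after confirming the rest of the ensemble is up, wipe the
# corrupt server's database and restart it (paths are examples)
rm -f /hadoop/zookeeper/version-2/*
rm -f /hadoop/zookeeper/datalog/version-2/*
```

The wiped server will re-sync its state from the ensemble leader when it rejoins.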
Hope that helps
08-13-2020
01:36 PM
1 Kudo
@mike_bronson7 A ZooKeeper server will not remove old snapshots and log files when using the default configuration (auto-purge disabled); this is the responsibility of the operator, because every environment is different and the requirements for managing these files may differ from install to install.

The PurgeTxnLog utility implements a simple retention policy that administrators can use. In the example below, the last <count> snapshots and their corresponding logs are retained and the others are deleted. The value of <count> should typically be greater than 3, although that is not required; this provides 3 backups in the unlikely event a recent log has become corrupted. This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.

java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

Automatic purging of the snapshots and corresponding transaction logs was introduced in version 3.4.0 and can be enabled via the configuration parameters autopurge.snapRetainCount and autopurge.purgeInterval.

Hope that helps!
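For reference, enabling the automatic alternative mentioned above is a two-line zoo.cfg change (the 24-hour interval is just an example):

```
# keep only the 3 most recent snapshots (3 is both the default and the minimum)
autopurge.snapRetainCount=3
# run the purge task every 24 hours; 0 (the default) disables auto-purge
autopurge.purgeInterval=24
```

Restart the ZooKeeper servers after the change for it to take effect.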
08-11-2020
05:32 AM
@ashish_inamdar Have you enabled the Ranger Hive plugin? If so, ensure the Atlas user has a Ranger policy granting it the correct database and table permissions, because once the Ranger Hive plugin has been enabled you MUST use Ranger for authorization. Hope that helps
07-26-2020
11:18 AM
1 Kudo
@mike_bronson7 log.retention.bytes is a size-based retention policy for logs, i.e. the maximum size the topic's log is allowed to grow to. Segments are pruned from the log as long as the remaining segments don't drop below log.retention.bytes.

You can also specify retention parameters at the topic level.

To specify a retention time period per topic, use the following command:

kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.ms=[DesiredRetentionTimePeriod]

To specify a retention log size per topic, use the following command:

kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.bytes=[DesiredRetentionLogSize]

That should resolve your problem. Happy hadooping!
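To confirm the overrides took effect, kafka-configs.sh can also describe a topic's configuration. The topic name and ZooKeeper connection string below are placeholders:

```shell
# List the per-topic overrides currently set on "my-topic"
kafka-configs.sh --zookeeper zk1.example.com:2181 --describe \
  --entity-type topics --entity-name my-topic
```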
07-23-2020
02:16 AM
@focal_fossa Great to hear, happy hadooping! To help others, please mark the answer that resolved your problem as the best answer, so that anyone searching for a similar solution can use it to resolve similar issues.
07-22-2020
09:37 AM
1 Kudo
@focal_fossa To increase HDFS capacity, give dfs.datanode.data.dir more mount points or directories; the new disk needs to be formatted and mounted prior to adding the mount point in Ambari. In HDP using Ambari, add the new mount point to the comma-separated list of directories in the dfs.datanode.data.dir property, found in hdfs-site.xml (or in the advanced section, depending on the Ambari version). The more new disks you provide through the comma-separated list, the more capacity you will have. Preferably, every machine should have the same disk and mount point structure.

You will then need to run the HDFS balancer, which re-balances data across the DataNodes by moving blocks from over-utilized to under-utilized nodes.

Running the balancer without parameters:

sudo -u hdfs hdfs balancer

This runs with the default threshold of 10%, meaning the balancer will ensure that disk usage on each DataNode differs from the overall cluster usage by no more than 10%. You can use a different threshold:

sudo -u hdfs hdfs balancer -threshold 5

This specifies that each DataNode's disk usage must be (or will be adjusted to be) within 5% of the cluster's overall usage. This process can take a long time depending on the amount of data in your cluster.

Hope that helps
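As a sketch, after mounting the new disk, updating dfs.datanode.data.dir, and restarting the DataNodes, you might confirm the added capacity before rebalancing (assumes the hdfs superuser):

```shell
# Cluster-wide and per-DataNode capacity/usage summary; the new mount
# should be reflected in "Configured Capacity"
sudo -u hdfs hdfs dfsadmin -report

# Then rebalance with a 5% threshold, as above
sudo -u hdfs hdfs balancer -threshold 5
```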
07-21-2020
07:55 AM
@Stephbat Those are internal to Cloudera, and that confirms the myth that migrations/upgrades are never smooth; we still need humans 🙂 Please make those changes and let me know if your DataNodes fire up correctly.
07-21-2020
06:50 AM
@Stephbat Please can you check these two values: dfs.datanode.max.locked.memory and ulimit. The dfs.datanode.max.locked.memory property determines the maximum amount of memory a DataNode will use for caching. The "locked-in-memory size" corresponds to the ulimit (ulimit -l) of the DataNode user, which needs to be increased to match this parameter. Your current dfs.datanode.max.locked.memory is 2 GB, while the RLIMIT_MEMLOCK is 16 MB.

If you get the error "Cannot start datanode because the configured max locked memory size… is more than the datanode's available RLIMIT_MEMLOCK ulimit," it means the operating system is imposing a lower limit on the amount of memory you can lock than what you have configured. To fix this, you must adjust the ulimit -l value that the DataNode runs with. Usually, this value is configured in /etc/security/limits.conf, but it will vary depending on what operating system and distribution you are using, so please adjust the values accordingly. Remember that you will need memory for other things as well, such as the DataNode and application JVM heaps and the operating system page cache.

Once adjusted, the DataNode should start like a charm 🙂 Hope that helps
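A minimal sketch of the check and the limits.conf change, assuming the DataNode runs as the hdfs user (2097152 KB equals the 2 GB configured in dfs.datanode.max.locked.memory):

```shell
# Show the current locked-memory limit for this shell's user, in KB
# (or "unlimited")
ulimit -l

# Persist a higher limit by adding lines like these to
# /etc/security/limits.conf (as root), then restart the DataNode
# from a fresh login session:
#   hdfs  soft  memlock  2097152
#   hdfs  hard  memlock  2097152
```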
07-21-2020
01:56 AM
@focal_fossa Can you share which method you used to extend your VM disk? What is the VM disk file extension, vmdk or vdi? Note that VirtualBox does not allow resizing vmdk images. Does your disk show "Dynamically allocated storage" in the Virtual Media Manager? Please revert
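Since VirtualBox cannot resize VMDK images in place, one common workaround is to clone the disk to VDI first and grow that. The file names and the 100 GB target below are examples, not taken from your setup:

```shell
# Convert the VMDK to a resizable VDI, then grow it (size in MB)
VBoxManage clonemedium disk sandbox.vmdk sandbox.vdi --format VDI
VBoxManage modifymedium disk sandbox.vdi --resize 102400
# Afterwards, attach the new VDI to the VM in place of the VMDK
```

The guest OS will still need its partition and filesystem extended to see the new space.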