Member since: 12-20-2022
Posts: 81
Kudos Received: 19
Solutions: 8
My Accepted Solutions
Title | Views | Posted
---|---|---
| 131 | 04-02-2025 11:35 PM
| 127 | 03-23-2025 11:30 PM
| 114 | 03-06-2025 10:11 PM
| 513 | 10-29-2024 11:53 PM
| 405 | 07-31-2024 01:11 AM
04-15-2025
01:25 AM
Hi @Jaguar, can you please collect the ResourceManager logs, grep them for "Ranger", and check the results? Also, do you have the cm_yarn service plugin set up in Ranger?
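A rough sketch of that check (the log path is an assumption for a CM-managed cluster; adjust it to your environment):

    # Search the ResourceManager log for Ranger authorization activity.
    # The log path below assumes a CM-managed cluster; adjust to your layout.
    grep -i "ranger" /var/log/hadoop-yarn/*RESOURCEMANAGER*.log*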
04-02-2025
11:35 PM
1 Kudo
Hi @anonymous_123, yes, you can use Iceberg tables with Spark and authorize them with Ranger. You need to set two policies: one for the Iceberg metadata files, and one global policy granting Iceberg permissions on all tables. Please follow this document: https://docs.cloudera.com/runtime/7.3.1/iceberg-how-to/topics/iceberg-setup-ranger.html
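Once both policies are in place, a quick sanity check could look like this (db.iceberg_tbl is a placeholder table name, not from the document):

    # Hypothetical check: query an Iceberg table as the end user after the
    # two Ranger policies are set; db.iceberg_tbl is a placeholder name.
    spark-sql -e "SELECT * FROM db.iceberg_tbl LIMIT 10"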
04-02-2025
10:07 PM
Hi @satvaddi, please follow the steps below to set up the policies in RAZ for Spark. Spark does not have a Ranger plugin of its own, so the data accessed on S3 will be logged; beyond that, the table metadata access will be logged from HMS. Running the create external table [***table definition***] location 's3a://bucket/data/logs/tabledata' command in Hive requires the following Ranger policies:
- An S3 policy in the cm_s3 repo on s3a://bucket/data/logs/tabledata for the hive user to perform recursive read/write.
- An S3 policy in the cm_s3 repo on s3a://bucket/data/logs/tabledata for the end user.
- A Hive URL authorization policy in the Hadoop SQL repo on s3a://bucket/data/logs/tabledata for the end user.
Accessing the same external table location from spark-shell requires an S3 policy (Ranger policy) in the cm_s3 repo on s3a://bucket/data/logs/tabledata for the end user, as sketched below.
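A rough sketch of that spark-shell access (the parquet format is an assumption; the S3 path is the one from the example above):

    # Sketch: verify the end user's S3 policy by reading the table location from spark-shell.
    spark-shell
    scala> spark.read.format("parquet").load("s3a://bucket/data/logs/tabledata").count()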
03-24-2025
01:56 AM
In YARN, resource allocation discrepancies can occur due to the way resource calculation is handled. By default, resource availability is determined based on available memory. However, when CPU scheduling is enabled, resource calculation considers both available memory and vCores. As a result, in some scenarios, nodes may appear to allocate more vCores than the configured limit while simultaneously displaying lower available resources. This happens due to the way YARN dynamically assigns vCores based on workload demands rather than strictly adhering to preconfigured limits. Additionally, in cases where CPU scheduling is disabled, YARN relies solely on memory-based resource calculation. This may lead to negative values appearing in the YARN UI, which can be safely ignored, as they do not impact actual resource utilization.
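For reference, this behavior is controlled by the scheduler's resource calculator. A minimal capacity-scheduler.xml snippet to enable CPU scheduling (that is, to account for both memory and vCores) would be:

    <!-- DominantResourceCalculator considers both memory and vCores;
         the default DefaultResourceCalculator considers memory only. -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>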
03-23-2025
11:30 PM
1 Kudo
No, the job won't fail, because work-preserving recovery is enabled by default on both the YARN ResourceManager and the NodeManager.
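For reference, a sketch of the yarn-site.xml properties behind this (CDP ships them enabled; verify the values in your release):

    <!-- Work-preserving recovery: running containers survive an RM/NM restart. -->
    <property>
      <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>yarn.nodemanager.recovery.enabled</name>
      <value>true</value>
    </property>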
03-06-2025
10:11 PM
Hi @sdbags, you can recover the corrupted block if the replication factor is set to the default of 3, because HDFS can re-replicate the block from one of the remaining healthy replicas.
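To locate the affected files and confirm that healthy replicas exist (/path/to/file is a placeholder), you could run something like:

    # List files with corrupt blocks across the namespace.
    hdfs fsck / -list-corruptfileblocks
    # Inspect a specific file's blocks and replica locations.
    hdfs fsck /path/to/file -files -blocks -locations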
11-13-2024
02:06 AM
1 Kudo
Tagging @paras for CM.
10-29-2024
11:53 PM
1 Kudo
Hi @yoshio_ono, please check this article: https://my.cloudera.com/knowledge/How-to-calculate-memory-and-v-core-utilization-for-each-of-the?id=271149
10-10-2024
05:13 AM
1 Kudo
Hi @evanle96, this error is not an issue. In an HA setup, the client call usually goes to both NameNodes; the active NameNode acknowledges the call, while the standby NameNode throws this warning. So you can safely ignore the warning here.
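If you want to confirm which NameNode was active when the warning appeared, a quick check would be (nn1 and nn2 are placeholder NameNode IDs; list yours with hdfs getconf -confKey dfs.ha.namenodes.<nameservice>):

    # nn1 and nn2 are placeholder NameNode IDs from hdfs-site.xml.
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2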
08-16-2024
09:06 AM
Hi @husseljo, please mark this as the accepted solution if you found my answer helpful.