Member since
06-13-2021
485
Posts
13
Kudos Received
14
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 656 | 09-10-2024 05:04 AM |
 | 841 | 09-05-2024 07:42 AM |
 | 1466 | 08-02-2024 01:46 AM |
 | 1402 | 11-06-2023 06:49 AM |
 | 1368 | 09-21-2022 09:46 AM |
04-08-2025
01:34 AM
Introduction: In today's digital age, data security and compliance are paramount. Organizations handling sensitive information, especially in cloud environments, need robust tools to monitor and audit user activities. Cloudera Data Platform (CDP) offers a comprehensive auditing system to keep track of who is doing what in your environment. In this article, we'll delve into the world of audits in CDP, explaining why they are essential and how to use them effectively.

Why Audits Matter:
- Enhancing Security: Audits play a pivotal role in maintaining the security of your CDP environment. By keeping a detailed log of user activities, audits help you identify suspicious or unauthorized actions promptly.
- Compliance Requirements: Many industries and organizations have stringent regulatory requirements. Audits help organizations adhere to these regulations by providing a clear record of data access and modifications.
- Investigation and Troubleshooting: When issues arise, audits serve as valuable tools for investigation and troubleshooting. You can trace user actions to identify the root cause of problems.
- Insight into User Behavior: Audits provide insights into user behavior and the usage patterns of your CDP environment. This information can be used to optimize resource allocation and improve operational efficiency.

How to Access Audits: There are two primary methods to access audit information in CDP:
1. CDP Management Console: The CDP Management Console provides a user-friendly interface for managing and accessing audit data. Navigate to the 'Audits' section within the Management Console to retrieve audit events.
2. CDP Command Line Interface (CLI): If you prefer command-line access, you can use the CDP CLI. Use the following command to list audit events:

cdp audit list-events --from-timestamp Start-Time --to-timestamp End-Time --event-source iam --event-name "InteractiveLogin"

This command fetches audit events for the specified time range, event source, and event name. Below is an example for the Stop Data Hub Cluster event:

# cdp audit list-events --from-timestamp 2023-11-06T13:36:18.036Z --to-timestamp 2023-11-06T17:36:18.036Z --event-source datahub --event-name "stopCluster" --result-code "SUCCESS"
{
  "auditEvents": [
    {
      "version": "1.1.0",
      "id": "a0b57964-7bea-41d1-afc5-0fa7288d4868",
      "eventSource": "datahub",
      "eventName": "stopCluster",
      "timestamp": 1699280001302,
      "actorIdentity": {
        "actorCrn": "crn:altus:iam:us-west-1:65e2e6e4-60dc-######:user:######"
      },
      "accountId": "65e2e6e4-60dc-4358-91a1-cbdc804f6303",
      "requestId": "bfcb1a11-b0db-422c-af2e-b4db523a6681",
      "resultCode": "SUCCESS",
      "apiRequestEvent": {
        "responseParameters": "{ }",
        "mutating": false
      }
    }
  ]
}

Let's break down the command and its output step by step:
- cdp audit list-events: The command that lists audit events in CDP.
- --from-timestamp 2023-11-06T13:36:18.036Z: The start time for the audit event search; it specifies the date and time from which you want to retrieve audit events.
- --to-timestamp 2023-11-06T17:36:18.036Z: The end time for the audit event search; it specifies the date and time up to which you want to retrieve audit events.
- --event-source datahub: Filters the audit events by event source, keeping only events originating from "datahub".
- --event-name "stopCluster": Filters the audit events by event name, keeping only events named "stopCluster".
- --result-code "SUCCESS": Filters the audit events by result code, keeping only events with the result code "SUCCESS".

We can determine who performed the Data Hub stop activity from the actorIdentity field in the output: it contains information about the user or entity that triggered the event, including the user's Cloud Resource Name (CRN).

In conclusion, audits in CDP are essential for maintaining security, compliance, and operational efficiency. Whether you choose the Management Console or the CLI, accessing audit information is straightforward and invaluable for monitoring user activities within your environment.
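If you only need to know who triggered the event, you can pipe the same CLI call through jq. This is a minimal sketch, assuming jq is installed and the output follows the schema shown in the sample above; the flag values are the same ones used in the example.

# Print actor CRN, event name, and result code for each successful stopCluster event
cdp audit list-events \
  --from-timestamp 2023-11-06T13:36:18.036Z \
  --to-timestamp 2023-11-06T17:36:18.036Z \
  --event-source datahub \
  --event-name "stopCluster" \
  --result-code "SUCCESS" \
  | jq -r '.auditEvents[] | [.actorIdentity.actorCrn, .eventName, .resultCode] | @tsv'

Note that parts of the actor CRN may appear masked (as in the sample output) depending on how the output was redacted.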
09-10-2024
05:04 AM
1 Kudo
Hi @MaraWang Thanks for raising this query. Cloudera does not natively support JanusGraph or other third-party graph databases such as Neo4j within the CDP (Cloudera Data Platform) ecosystem, and as you mentioned, GraphX and GraphFrames are also not supported in CDP 7.2.18. However, there are a few options you can consider for graph-based analytics or computations in the Cloudera environment; for those, I request you to raise a query with your account team, as they are best suited to assist you here. Please help our community thrive. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them.
09-05-2024
07:42 AM
Hi @Choolake When a bash script that loads data into Hadoop stops unexpectedly, without completing and without any apparent reason or modification, it can be challenging to identify the root cause. If your script produces any logs, review them to identify at what point the script stopped; this can help pinpoint the problematic step or external dependency causing the issue. Enable verbose logging in the script to capture more details:

set -x

This prints each command as it is executed and can help trace where the script stops; a sketch is shown below.
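As an illustration only (the script name, log path, source file, and HDFS target below are placeholders, not taken from your environment), a load script with tracing and a persistent log file might look like this:

#!/usr/bin/env bash
set -euo pipefail   # stop on errors, undefined variables, and failed pipes
set -x              # trace every command as it runs

# Placeholder log location; adjust for your environment
LOG_FILE=/tmp/hdfs_load_$(date +%Y%m%d_%H%M%S).log
exec > >(tee -a "$LOG_FILE") 2>&1   # capture stdout and stderr in the log file

echo "Starting load at $(date)"
# Placeholder paths: replace with your actual source file and HDFS target
hdfs dfs -put /data/source/file.csv /user/hive/warehouse/staging/
echo "Load finished at $(date)"

With this in place, the log file records exactly which command was running when the script stopped.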
08-19-2024
01:42 AM
Hi @bigdatacm Thanks for bringing up the issue. To explicitly remove the entity from Atlas after dropping the Hive table, you can use the Atlas REST API or the Atlas UI to delete the entity. UI: you can manually delete the entity from the Atlas UI by searching for the table and removing it. REST API: see the sketch after this reply. Thanks, Shehbaz.
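For the REST API route, a minimal sketch against the Atlas v2 entity endpoint might look like the following; the Atlas host/port, credentials, qualifiedName value, and GUID are placeholders for your environment:

# Delete a hive_table entity by its unique qualifiedName (typically db.table@cluster)
curl -u admin:admin -X DELETE \
  "http://atlas-host:21000/api/atlas/v2/entity/uniqueAttribute/type/hive_table?attr:qualifiedName=default.my_table@my_cluster"

# Alternatively, delete by the entity GUID shown in the Atlas UI or search results
curl -u admin:admin -X DELETE \
  "http://atlas-host:21000/api/atlas/v2/entity/guid/<entity-guid>"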
08-02-2024
01:48 AM
2 Kudos
Hi @hadoopranger Thanks for bringing up the issue. To better assist you, could you please provide more details about the following:
- On which servers are you trying to install the Cloudera services?
- Is it a Public Cloud or a Private Cloud deployment?
08-02-2024
01:46 AM
1 Kudo
Hi @ipson Thanks for bringing up the issue. The error message indicates that the qualifiedName attribute is missing or incorrectly specified for the hive_process type. In Apache Atlas, qualifiedName is a mandatory attribute that uniquely identifies an entity within a cluster; a sketch of how it is supplied follows below.
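As an illustration only (the Atlas host, credentials, and attribute values below are placeholders, and a real hive_process usually also carries inputs/outputs references to existing lineage entities), a minimal payload that supplies the qualifiedName attribute might look like this:

# Create/update a hive_process entity, including the mandatory qualifiedName attribute
curl -u admin:admin -X POST -H "Content-Type: application/json" \
  "http://atlas-host:21000/api/atlas/v2/entity" \
  -d '{
        "entity": {
          "typeName": "hive_process",
          "attributes": {
            "qualifiedName": "insert_into_target_table@my_cluster",
            "name": "insert_into_target_table",
            "inputs": [],
            "outputs": []
          }
        }
      }'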
08-02-2024
01:42 AM
1 Kudo
Hi @Choolake Thanks for bringing up the issue. To better assist you, could you please provide more details about the specific scripts you need help with?
11-06-2023
06:49 AM
1 Kudo
Hi @Amn_468 Thanks for the query. This can be achieved with the help of the CDP CLI and CDP curl commands. We have the following options: CDP curl and CDP CLI. Please go through the documentation for these options and let us know if you have further queries. Thanks, SP.
04-23-2023
11:10 PM
In addition, as per your last comment, we can see you have a considerable amount of data under your /home directory, e.g.: 15G /home. Could you please check whether you can remove some of that data? A command for locating the largest items is shown below. Thanks
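To see which items under /home are taking up the most space (a generic sketch, not specific to your host), you could run:

# List the largest entries directly under /home, biggest first
du -sh /home/* 2>/dev/null | sort -rh | head -20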
04-13-2023
04:49 AM
Hi @itsmezeeshan No, it's not recommended to move these files, as your CM server is dependent on them. Could you please share the output of the below commands:

# df -h | tee -a /tmp/df.out
# du -hs /* | tee -a /tmp/du.out