Member since: 09-20-2023
Posts: 20
Kudos Received: 5
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 181 | 10-16-2024 12:10 PM |
|  | 766 | 10-30-2023 07:48 AM |
10-17-2024
04:56 AM
1 Kudo
Correction: 'Check the checkbox to Allow users to Run ML Runtimes'
10-16-2024
12:10 PM
Resolved. I had ML Runtimes Addons disabled. I went into CML > Site Administration > Settings and, under Feature Flags, unchecked the checkbox next to Allow users to Run ML Runtimes Addons. Then I started a new session with Spark enabled.
10-16-2024
07:20 AM
I have a CML project using a JupyterLab Runtime with Python 3.10 and I want to start a Spark cluster with my CDP Datalake. I'm using the predefined Spark Data Lake Connection in CML which looks like this:
```
import cml.data_v1 as cmldata

# Sample in-code customization of spark configurations
#from pyspark import SparkContext
#SparkContext.setSystemProperty('spark.executor.cores', '1')
#SparkContext.setSystemProperty('spark.executor.memory', '2g')

CONNECTION_NAME = "hiaa-dl"
conn = cmldata.get_connection(CONNECTION_NAME)
spark = conn.get_spark_session()

# Sample usage to run query through spark
EXAMPLE_SQL_QUERY = "show databases"
spark.sql(EXAMPLE_SQL_QUERY).show()
```
When I execute this I get the error:
```
IllegalArgumentException: The value of property spark.app.name must not be null
```
I'm using the predefined spark-defaults.conf which looks like this:
```
spark.executor.memory=1g
spark.executor.cores=1
spark.yarn.access.hadoopFileSystems=abfs://[container]@[storage-account].dfs.core.windows.net
```
Is there something else I need to configure in the CML session or at the data lake level?
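(A sketch of one possible workaround, since spark.app.name is a standard Spark property: pin it explicitly in the project's spark-defaults.conf alongside the settings above. The value below is a placeholder, not something CML generates.)
```
spark.app.name=[my-app-name]
```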
10-01-2024
04:46 AM
1 Kudo
Thanks, but the reason I'm trying to establish an ODBC connection is that I'm using R.
09-26-2024
08:26 AM
I want to establish an ODBC connection to my Impala data warehouse in a CML project. I'm running a CML session and have configured my odbc.ini and odbcinst.ini files, but I need to install the driver. Everything I see here https://www.cloudera.com/downloads/connectors/impala/odbc/2-7-2.html describes installing it via the installation wizard. Since CML runs a Linux pod, I tried downloading the .deb file for the ODBC driver and then executing `dpkg -i clouderaimpalaodbc_2.7.2.1011-2_amd64.deb`, but it has to run as root. If I run `su`, it asks for a password. I'm not sure how this password gets generated by the system; I tried my workload password and portal password, but neither worked. Is there a more principled way of connecting via ODBC in CML, or is there a way to run as root in CML?
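In case it helps, one root-free sketch is to extract the package rather than install it; `dpkg-deb -x` needs no root privileges. The extracted library path below is an assumption about the package layout, not verified:
```
# Unpack the .deb into a user-writable directory (no root needed).
dpkg-deb -x clouderaimpalaodbc_2.7.2.1011-2_amd64.deb "$HOME/impala-odbc"

# Then point odbcinst.ini at the extracted driver library, e.g.
# (the exact path inside the package is an assumption):
# Driver=/home/cdsw/impala-odbc/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
```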
09-26-2024
08:02 AM
I finally had success using HTTP transport. My .odbc.ini looks something like this:
```
[ODBC]
; Specify any global ODBC configuration here such as ODBC tracing.

[ODBC Data Sources]
Impala = Cloudera ODBC Driver for Impala

[Impala]
Driver=/opt/cloudera/impalaodbc/lib/universal/libclouderaimpalaodbc.dylib
Description=Cloudera Impala ODBC Driver DSN
Host=[datahub-name]-master0.hiaa-cdp.uvmh-kdle.a4.cloudera.site
Port=443
Schema=default
AuthMech=3
UseSASL=1
SSL=1
TransportMode=http
httpPath=[datahub-name]/cdp-proxy-api/impala
```
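A quick way to sanity-check a DSN like this from a terminal, assuming unixODBC's isql utility is available in the runtime image (credentials are placeholders):
```
# Open an interactive SQL shell against the [Impala] DSN defined above.
isql -v Impala [workload-user] [workload-password]
```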
07-16-2024
05:46 AM
I have two hive tables and I want to manually create a lineage relationship between them using the Atlas API. I'm trying to run this POST request:
```
curl --location 'https://[url]/atlas/api/atlas/v2/entity' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data-raw '{
  "entity": {
    "typeName": "hive_process",
    "qualifiedName": "my_etl_process@cluster",
    "name": "my_etl_process",
    "description": "ETL process from input_table to output_table",
    "attributes": {
      "inputs": [ { "guid": "9768381b-8783-49c3-850d-39bf1f14b73f" } ],
      "outputs": [ { "guid": "998de7ba-2254-418d-b405-656eba428643" } ]
    }
  }
}'
```
I'm getting this 404 response:
```
{
  "errorCode": "ATLAS-404-00-007",
  "errorMessage": "Invalid instance creation/updation parameters passed : hive_process.qualifiedName: mandatory attribute value missing in type Referenceable"
}
```
Any suggestions for how to manually create lineage using the API would be appreciated.
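For what it's worth, the error reads as if qualifiedName is expected inside the attributes map: in the v2 entity model, qualifiedName and name are ordinary attributes inherited from Referenceable rather than top-level fields. A sketch of the same request with those fields moved (not verified against a live cluster):
```
curl --location 'https://[url]/atlas/api/atlas/v2/entity' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data-raw '{
  "entity": {
    "typeName": "hive_process",
    "attributes": {
      "qualifiedName": "my_etl_process@cluster",
      "name": "my_etl_process",
      "description": "ETL process from input_table to output_table",
      "inputs": [ { "guid": "9768381b-8783-49c3-850d-39bf1f14b73f" } ],
      "outputs": [ { "guid": "998de7ba-2254-418d-b405-656eba428643" } ]
    }
  }
}'
```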
04-15-2024
06:17 AM
When I run this command via the cdp cli, it just returns a JSON response like this:
```
{
  "archiveName": "[MY_FLOW_NAME].tar.gz"
}
```
I want to get the actual flow definition.
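Assuming the command actually writes [MY_FLOW_NAME].tar.gz to the working directory (an assumption; the response above only reports the name), the flow definition could then be inspected with standard tar:
```
# List the archive contents, then extract them.
tar -tzf [MY_FLOW_NAME].tar.gz
tar -xzf [MY_FLOW_NAME].tar.gz
```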
03-26-2024
10:28 AM
I want to use the Atlas API to add/edit user-defined properties on a particular entity. I've previously had success using the API to edit properties that are already defined, such as an entity description:
```
curl --location --request PUT 'https://[host]/api/atlas/v2/entity/guid/[guid]?name=description' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic' \
--data '"a test table"'
```
However, it's not clear how to use the API to create and edit user-defined properties. When I change the URL query to a user-defined property name, it complains that the property is not defined for the entity type. For reference, in the UI this is done fairly easily.
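One avenue that may be worth trying, under the assumption that the UI's user-defined properties correspond to the entity's customAttributes map in the v2 model (which sits outside attributes, so the single-attribute PUT can't reach it): re-submit the entity with customAttributes populated. The typeName and property names below are placeholders:
```
curl --location 'https://[host]/api/atlas/v2/entity' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic' \
--data-raw '{
  "entity": {
    "guid": "[guid]",
    "typeName": "hive_table",
    "attributes": { "qualifiedName": "[qualified-name]" },
    "customAttributes": { "my_property": "some value" }
  }
}'
```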
Labels:
- Apache Atlas
03-26-2024
07:00 AM
That works. Thanks!