Member since: 09-20-2023
Posts: 20
Kudos Received: 5
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 184 | 10-16-2024 12:10 PM
 | 768 | 10-30-2023 07:48 AM
03-25-2024
05:24 AM
I'd like to have a process for automatically exporting the data flow definition and configuration without needing to go into the CDF UI. Right now, if I wanted the configuration, I'd click through the UI. My thought was to use the CDP CLI, but under the `df` command group there doesn't appear to be a command for this. Any suggestions?
03-25-2024
05:15 AM
I'm trying to use the CDP CLI (v0.9.106 on my local machine) to get KPI metrics from CDF. I'm using the command:

```
cdp df list-deployment-system-metrics --deployment-crn my-deployment-crn
```

but I'm struggling to correctly format the required `--metrics-time-period` argument, and the documentation isn't helpful here. I've tried a few things like:

```
--metrics-time-period "2024-03-24T12:00:00Z:2024-03-24T16:00:00Z"
--metrics-time-period LAST_24_HOURS
--metrics-time-period "yesterday"
```

But they all scream back:

```
An error occurred: No enum constant com.cloudera.dfx.metrics.TimeSpan.TimePeriod.yesterday (Status Code: 400; Error Code: INVALID_ARGUMENT; Service: df; Operation: listDeploymentSystemMetrics)
```

If someone could elaborate on what this command is expecting, or better yet, offer up some examples of successful executions, that'd be great!
02-29-2024
06:01 AM
Thanks @nhassell, that resolves the 'cannot locate driver' issue. Now I'm just hitting my head against pesky host/port configuration problems. For the life of me, I cannot get past:

```
[S1000][unixODBC][Cloudera][DriverSupport] (1170) Unexpected response received from server. Please ensure the server host and port specified for the connection are correct.
```

Obviously I'm doing something wrong. But your suggestion solved the issue around the driver library, so thanks.
02-13-2024
05:03 AM
1 Kudo
No, I'm still struggling with ODBC.
02-13-2024
04:59 AM
1 Kudo
I have an existing Iceberg table and I'm trying to evolve the schema by adding a new column:

```
alter table my_db.my_table add column insert_time timestamp
```

When I execute the above statement, I get a 'New column(s) have been added to the table.' message, but the column doesn't appear. I tried running a REFRESH and an INVALIDATE METADATA, and I tried restarting Impala, but it had no effect. I tried the same operation on a test table and it worked. I'm using Impala as the query engine and Azure for storage. Impala is running in a Data Hub cluster.
12-07-2023
12:06 PM
I want to set up a NiFi flow that gets data from a public RSS feed and loads it into a data lake. This RSS feed updates irregularly, and when it does update, it overwrites previous content. What processor(s) should I use to get data from the RSS feed (close to) when it has updated? Is it as simple as using InvokeHTTP repeatedly, checking for a change in output, then loading into the data lake if the content differs from the previous invocation? Is there another way if I don't want to make the HTTP request so frequently?
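The compare-against-last-poll idea above can be sketched outside NiFi. This is a minimal Python illustration of the change-detection logic (fingerprint the fetched body, only load when the fingerprint differs), not actual NiFi configuration; the feed bodies here are made-up placeholders:

```python
import hashlib
from typing import Optional, Tuple

def content_fingerprint(body: bytes) -> str:
    """Stable fingerprint of the feed body, compared across polls."""
    return hashlib.sha256(body).hexdigest()

def should_load(body: bytes, last_fingerprint: Optional[str]) -> Tuple[bool, str]:
    """Return (changed, new_fingerprint): load only when the content
    differs from the previous poll."""
    fp = content_fingerprint(body)
    return (fp != last_fingerprint, fp)

changed1, fp = should_load(b"<rss>v1</rss>", None)  # first poll: load
changed2, fp = should_load(b"<rss>v1</rss>", fp)    # unchanged: skip
changed3, fp = should_load(b"<rss>v2</rss>", fp)    # overwritten feed: load
```

To avoid fetching the full feed on every poll, many servers also honor conditional GETs (`If-Modified-Since` / `If-None-Match` headers), which return 304 without a body when nothing changed; whether the feed's server supports that is something you'd have to check.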
Labels: Apache NiFi, Cloudera DataFlow (CDF)
10-30-2023
07:48 AM
Resolved. The issue was that I was using an ExecuteSQL processor before the PutIceberg processor and I neglected to use logical types. The fix was to switch the property 'Use Avro Logical Types' to true.
10-30-2023
07:42 AM
I have a flow that queries data from a source table and loads it into an Iceberg table. The source data includes a Date type which gets read in as a string. The destination table has the same field, where the schema is also a Date. When I try to use the PutIceberg processor, I get the following error:

```
org.apache.nifi.serialization.record.util.IllegalTypeConversionException: Failed Conversion of Field [null] from String [2023-10-30] to LocalDate
Caused by: java.lang.NumberFormatException
```

The flow file is Avro and I'm using an AvroReader service for the PutIceberg. Here's a sample of the Avro:

```
[ {
  "scan_id" : "1698665963-82DD686DBC1979E6FF6C5443BA82456621DB5004",
  "queue_seq_id" : 558745163,
  "flight_airline_code" : "AC",
  "flight_number" : 8663,
  "flight_departure_date" : "2023-10-30",
  "scan_airport_code" : "YHZ",
  "scan_checkpoint_name" : "Transborder",
  "scan_stage_name" : "S2 - Screening Lines",
  "manual_entry" : "N",
  "wait_time_pax_queue" : "Main Queue",
  "wait_time_seconds" : 93,
  "included_in_counts" : "Y",
  "insert_time" : "2023-10-30 13:44:39.0",
  "scan_time" : "2023-10-30 11:39:23.0",
  "scan_hour" : 11,
  "scan_year" : 2023,
  "scan_month" : 10,
  "scan_day" : 30
} ]
```

The issue seems to be the 'flight_departure_date' field. Strangely enough, I have another flow that does essentially the exact same thing and I've had no issues with it there.
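For context on the error above: Avro's `date` logical type represents a date as an int counting days since the Unix epoch, rather than as a string. A rough Python sketch of that representation (my reading of why a plain string like "2023-10-30" can trip a NumberFormatException downstream; the function name is mine, not an Avro or NiFi API):

```python
from datetime import date

EPOCH = date(1970, 1, 1)

def to_avro_date(iso_string: str) -> int:
    """Encode an ISO date string the way Avro's `date` logical type
    stores it: an int of days since 1970-01-01."""
    return (date.fromisoformat(iso_string) - EPOCH).days

print(to_avro_date("2023-10-30"))  # 19660
```

With logical types disabled, the field travels as a bare string, so a consumer expecting the int encoding has nothing numeric to work with.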
10-23-2023
09:52 AM
1 Kudo
In Cloudera DataFlow, I'm using the `QueryDatabaseTable` processor with a maximum-value column to keep track of which records have already been processed. When I'm in Flow Designer testing the flow, I can see that it's maintaining state. Does this state persist if I deploy the flow, or does deploying clear the state? I expect it would be the latter, but want to check. Follow-up question: say I make changes to the flow and redeploy. Is the state cleared this time, or does it carry over from the last deployment? I'd like to be confident that making changes to the flow won't cause it to reprocess all the data again.
Labels: Apache NiFi, Cloudera DataFlow (CDF)
09-20-2023
04:55 AM
I'm trying to connect to Impala on a new Mac (M2 chip). I followed all the steps here for unixODBC. I confirmed that my unixODBC install is working and that the Cloudera driver installed properly. When I run `odbcinst -j` I get:

```
unixODBC 2.3.12
DRIVERS............: /opt/homebrew/etc/odbcinst.ini
SYSTEM DATA SOURCES: /opt/homebrew/etc/odbc.ini
FILE DATA SOURCES..: /opt/homebrew/etc/ODBCDataSources
USER DATA SOURCES..: /Users/willhipson/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8
```

Here's my odbcinst.ini:

```
[ODBC Drivers]
Cloudera ODBC Driver for Impala=Installed

[Impala]
Description=Cloudera ODBC Driver for Impala
Driver=/opt/cloudera/impalaodbc/lib/universal/libclouderaimpalaodbc.dylib
```

And here's the first few lines of my odbc.ini:

```
[ODBC]

[ODBC Data Sources]
Impala=Cloudera ODBC Driver for Impala

[Impala]
Description=Cloudera Impala ODBC Driver DSN
Driver=Impala
```

I have confirmed that the driver exists at `/opt/cloudera/impalaodbc/lib/universal/libclouderaimpalaodbc.dylib`. When I try to run `isql -v Impala` in the terminal, I get the error:

```
[01000][unixODBC][Driver Manager]Can't open lib '/opt/cloudera/impalaodbc/lib/universal/libclouderaimpalaodbc.dylib' : file not found
[ISQL]ERROR: Could not SQLConnect
```

I have been struggling to resolve this for several days and have scoured SO, blog posts, and this forum for help, but nothing seems to work. Hoping to resolve soon.
Labels: Apache Impala