Retrieving Impala data using SQL, only data records since last poll using the Date/Timestamp

New Contributor

I want to poll an Impala table using SQL queries and only want to retrieve data records added since I last polled. This would be using a Date/Timestamp column in the table. Does anyone know how to achieve this?

I am happy to store the last Date/Timestamp retrieved in my source system.

3 REPLIES


Hi @StuartM, I know it's not a direct answer, but this requirement sounds like a good fit for Kafka, which inherently supports the idea of "consumer offsets".

Community Manager

@StuartM, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager



Explorer

To retrieve only the records added since the last time you polled an Impala table, you can use a SQL query whose WHERE clause keeps rows with a timestamp greater than the last timestamp you retrieved. Here's a basic example, assuming your timestamp column is named timestamp_column:

SELECT *
FROM your_table
WHERE timestamp_column > 'last_poll_timestamp';

Replace 'last_poll_timestamp' with the actual timestamp value you stored from your last poll. Make sure the timestamp format matches the format stored in your table.
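
As a rough illustration of the format point: if the column is an Impala TIMESTAMP, a string literal shaped like yyyy-MM-dd HH:mm:ss (with optional fractional seconds) is the usual form. The sketch below renders a stored value that way in Python. The helper name to_sql_timestamp and the sample value are made up for the example, and you should check the result against how your own table stores its values.

```python
from datetime import datetime


def to_sql_timestamp(ts: datetime) -> str:
    """Render a datetime as 'YYYY-MM-DD HH:MM:SS.ffffff' (hypothetical helper)."""
    return ts.strftime("%Y-%m-%d %H:%M:%S.%f")


# Sample value standing in for the timestamp saved after the previous poll.
last_poll = datetime(2024, 3, 1, 14, 30, 5, 123000)
literal = to_sql_timestamp(last_poll)
```

The resulting string is what would take the place of 'last_poll_timestamp' in the query above.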

Here's a step-by-step guide:

  1. Store the latest timestamp you retrieved in your source system (the largest timestamp_column value among the rows returned by the previous poll).
  2. Use this stored timestamp to construct your SQL query, so that it returns only records with a greater timestamp.
  3. Execute the SQL query against your Impala table to retrieve the new records, then update the stored timestamp from the rows you just received.
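
The steps above can be sketched as a small polling function. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database so that it runs anywhere, and the id column, the sample rows, and the function name poll_new_rows are invented for the example. Against Impala you would pass a connection from whichever DB-API client you use, and the parameter placeholder style may differ from the ? shown here.

```python
import sqlite3


def poll_new_rows(conn, last_seen):
    """Return rows newer than last_seen, plus the new high-water mark."""
    cur = conn.cursor()
    # Bind the stored timestamp as a parameter instead of pasting it into the SQL.
    cur.execute(
        "SELECT id, timestamp_column FROM your_table "
        "WHERE timestamp_column > ? ORDER BY timestamp_column",
        (last_seen,),
    )
    rows = cur.fetchall()
    # Keep the old mark when nothing new arrived; otherwise take the latest row's timestamp.
    new_mark = rows[-1][1] if rows else last_seen
    return rows, new_mark


# Stand-in table (assumption: timestamps stored as sortable text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, timestamp_column TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "2024-03-01 10:00:00"), (2, "2024-03-01 11:00:00")],
)

# First poll starts from an early mark and picks up both rows.
first_batch, mark = poll_new_rows(conn, "1970-01-01 00:00:00")

# A row arrives later; the second poll returns only that row.
conn.execute("INSERT INTO your_table VALUES (3, '2024-03-01 12:00:00')")
second_batch, mark2 = poll_new_rows(conn, mark)
```

One thing to watch with a strict greater-than comparison: a row that arrives later with exactly the same timestamp as the stored mark would be skipped. If that can happen in your data, it may be worth tracking a unique key alongside the timestamp.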