Querying large datasets in Cloudera - Emmanuel Katto
Labels: Apache Impala
Created 10-07-2024 11:24 PM
Hello everyone, my name is Emmanuel Katto. Can anyone share their experience using Impala to query large datasets in Cloudera? What are some tips for optimizing query performance?
Looking forward to your tips/suggestions.
Thanks in advance!
Best,
Emmanuel Katto
Created 10-08-2024 04:27 PM
Hello,
Impala is designed to query large datasets out of the box. However, depending on your requirements, you need to allocate enough hardware and resources. Good starting points are https://impala.apache.org/docs/build/html/topics/impala_scalability.html, https://impala.apache.org/docs/build/html/topics/impala_resource_management.html, https://impala.apa... and https://impala.apache.org/docs/build/html/topics/impala_planning.html.
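As a rough illustration of the kind of preparation those pages discuss, here is a minimal Python sketch that only builds the Impala statements one might run before a heavy query: COMPUTE STATS to gather table statistics, SET MEM_LIMIT to cap per-node query memory, and EXPLAIN to inspect the plan. The helper name `build_tuning_statements`, the table `sales_db.orders`, the sample query, and the `2g` limit are all hypothetical placeholders, not values taken from this thread or the docs, so adjust them to your own cluster.

```python
from typing import List


def build_tuning_statements(table: str, query: str, mem_limit: str = "2g") -> List[str]:
    """Return Impala statements to run, in order, before a heavy query.

    This only builds strings; sending them to Impala (impala-shell, JDBC,
    or another client) is left to the caller.
    """
    return [
        # Gather table and column statistics so the planner can estimate sizes.
        f"COMPUTE STATS {table}",
        # Cap the memory a query may use on each node (value is an assumption).
        f"SET MEM_LIMIT={mem_limit}",
        # Print the execution plan without actually running the query.
        f"EXPLAIN {query}",
    ]


# Hypothetical table and query, for illustration only.
statements = build_tuning_statements(
    table="sales_db.orders",
    query="SELECT region, COUNT(*) FROM sales_db.orders GROUP BY region",
)

for stmt in statements:
    print(stmt + ";")
```

The printed statements can be pasted into impala-shell. Whether a memory cap is appropriate, and what value to use, depends on your workload and admission control setup.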
Let me know if that helps.
Created 10-08-2024 04:16 PM
@emmanuelkatto Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Impala experts @jAnshula @Boris G @Saurabhatiyal who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres, Community Moderator
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:
Created 10-11-2024 05:01 PM
@emmanuelkatto Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres, Community Moderator