Scratch file generation information

Contributor

Hello,

A simple question:

How can I find out which queries generate scratch files?

I've been inspecting the impalad logs, but I couldn't find any information about scratch file generation.

Regards,

Silva


Re: Scratch file generation information

Master Collaborator

Re: Scratch file generation information

Contributor

But I need to know which specific queries spill to disk and generate the scratch files. Is it possible to get that kind of information?

Re: Scratch file generation information

Master Collaborator
The Cloudera Manager queries page tracks bytes spilled to disk as one of its per-query metrics. CM also has a "Cluster utilization report" with aggregate information about how much data is spilled to disk over longer time windows. Finally, if you're looking at the scratch files themselves, the query ID is embedded in the file name (although that's an implementation detail and could change in the future).
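If you want to try the file-name route, here is a rough Python sketch that walks a scratch directory and sums file sizes per query ID. It rests on assumptions that are not confirmed anywhere in this thread: that the query ID appears in the file name as two hex groups joined by a colon (the way query IDs are usually printed), and that you pass in the scratch directory configured on that host. The function name `scratch_bytes_by_query` is just an illustrative helper, not part of Impala or CM. Since the naming is an implementation detail, adjust the pattern to whatever you actually see on disk.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: the query ID is embedded as "<hex>:<hex>" somewhere in the
# file name. This is an implementation detail and may differ per version.
QUERY_ID_RE = re.compile(r"[0-9a-f]{1,16}:[0-9a-f]{1,16}")


def scratch_bytes_by_query(scratch_dir):
    """Return {query_id: total bytes of scratch files currently on disk}."""
    totals = defaultdict(int)
    for path in Path(scratch_dir).rglob("*"):
        if not path.is_file():
            continue
        match = QUERY_ID_RE.search(path.name)
        if match is None:
            # File name does not look like it carries a query ID; skip it.
            continue
        try:
            totals[match.group(0)] += path.stat().st_size
        except FileNotFoundError:
            # Scratch files are short-lived and can vanish mid-scan.
            continue
    return dict(totals)
```

Calling `scratch_bytes_by_query(...)` with your scratch directory gives a snapshot of what is on disk at that moment only. Scratch files are removed when a query finishes, so for anything historical the per-query spill metric in CM is the more reliable source.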
Re: Scratch file generation information

Contributor
Thanks a lot for the info