Contributor
Posts: 86
Registered: ‎11-12-2015
Accepted Solution

Scratch file generation information

Hello,

A simple question: how can I know which queries generate scratch files?

I've been inspecting the impalad logs, but I couldn't find any information about scratch file generation.

Regards,

Silva

Cloudera Employee
Posts: 378
Registered: ‎07-29-2015

Re: Scratch file generation information

Contributor
Posts: 86
Registered: ‎11-12-2015

Re: Scratch file generation information

But I need to know which specific queries spill to disk and generate the scratch files. Is it possible to get that kind of information?

Cloudera Employee
Posts: 378
Registered: ‎07-29-2015

Re: Scratch file generation information

The Cloudera Manager queries page tracks bytes spilled to disk as one of its per-query metrics. CM also has a "Cluster utilization report" with some aggregate information about how much data is spilled to disk over longer time windows. And if you're looking at the scratch files themselves, the query ID is embedded in the file name (although that's an implementation detail and could change in the future).
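For the file-name route, here is a minimal sketch that totals scratch file sizes per query. It rests on an assumption that is not confirmed anywhere in this thread: that each scratch file is named with the query ID first, followed by an underscore and some suffix. The function name `scratch_bytes_by_query` and the `sep` parameter are made up for illustration, so check the names in your own scratch directory and adjust before relying on it.

```python
import os
from collections import defaultdict


def scratch_bytes_by_query(scratch_dir, sep="_"):
    """Sum scratch file sizes per query-ID prefix.

    ASSUMPTION: file names look like "<query_id><sep><suffix>".
    This naming is an implementation detail and may differ between
    Impala versions; verify it against real files first.
    """
    totals = defaultdict(int)
    for entry in os.scandir(scratch_dir):
        # Ignore subdirectories and anything that is not a regular file.
        if not entry.is_file():
            continue
        # Everything before the first separator is treated as the query ID.
        query_id = entry.name.split(sep, 1)[0]
        totals[query_id] += entry.stat().st_size
    return dict(totals)
```

You would call it as `scratch_bytes_by_query("/path/to/scratch/dir")`, where the path is a placeholder for whatever scratch directory your impalad is configured with. The result only covers files that exist on disk at the moment you run it, so the per-query metric in CM is still the better source for queries that have already finished.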
Contributor
Posts: 86
Registered: ‎11-12-2015

Re: Scratch file generation information

Thanks a lot for the info.