ParquetFileReader leading to Close_wait socket connection

Hi,
I am trying to read the metadata of a Parquet file. This is the code snippet I am using:
metaData = ParquetFileReader.readFooter(fs.getConf(), file);
 
This line leaves a connection open in the CLOSE_WAIT state, which I confirmed with the lsof -p pid command:
TCP rack162-hdp26-dev:36608->rack162-hdp26-dev:1019 (CLOSE_WAIT)
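To see whether the number of CLOSE_WAIT sockets grows as more files are processed, the lsof check above could be scripted. The following is only a minimal sketch, assuming lsof is on the PATH and the process ID is passed as the first argument; CloseWaitCounter and its methods are hypothetical names, not part of Parquet or Hadoop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Parquet or Hadoop.
public class CloseWaitCounter {

    // Counts the lsof output lines that report a socket in CLOSE_WAIT.
    public static long countCloseWait(List<String> lsofLines) {
        long count = 0;
        for (String line : lsofLines) {
            if (line.trim().endsWith("(CLOSE_WAIT)")) {
                count++;
            }
        }
        return count;
    }

    // Runs `lsof -p <pid>` and returns its output as a list of lines.
    // stderr is merged into stdout so the child cannot block on a full pipe.
    public static List<String> runLsof(long pid) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("lsof", "-p", Long.toString(pid))
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        p.waitFor();
        return lines;
    }

    // Usage (assumed): java CloseWaitCounter <pid>
    public static void main(String[] args) throws Exception {
        long pid = Long.parseLong(args[0]);
        System.out.println("CLOSE_WAIT sockets: " + countCloseWait(runLsof(pid)));
    }
}
```

Running it before and after a batch of footer reads would show whether each read leaves one more socket behind.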
 
When I run this code over a large number of files (more than 65536), I eventually hit a "too many open files" error and have to restart my application.
 
We tried replacing the code above with the following, but we still face the issue:
 
 
try (ParquetFileReader r = ParquetFileReader.open(fs.getConf(), file)) {
    logger.info("Getting metadata for: " + file);
    metaData = r.getFooter();
    // other code
}
 
Could you please suggest a solution?
PS: I have already tried parquet-hadoop jar versions 1.8.1, 1.10.1, and 1.11.1, and the issue occurs with all of them.
I have also tried various code snippets available online, but nothing seems to work.