Contributor
Posts: 25
Registered: ‎01-07-2016
Accepted Solution

Setting max S3 connections

Hi all,

 

The "Best Practices for Using Impala with S3" guide states: "Set the safety valve fs.s3a.connection.maximum to 1500 for impalad."

Can anyone clarify which safety valve field should be used, and with what syntax? I have read that this setting belongs in core-site.xml, but the Impala configuration in Cloudera Manager does not seem to have a safety valve for core-site.xml. The instructions mention a safety valve for impalad, but that safety valve seems to be for command-line arguments to impalad.

 

The problem we are trying to address is the following error, which we keep getting when using Impala to query data stored in S3:

hdfsSeek(desiredPos=503890631): FSDataInputStream#seek error:
com.cloudera.com.amazonaws.AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

 

We are using CDH 5.8.3

 

Thanks,

Petter

Cloudera Employee
Posts: 17
Registered: ‎01-29-2016

Re: Setting max S3 connections

Hi Pettax,

 

You should be able to find the safety valve in Cloudera Manager under the HDFS service, not the Impala service, because the S3AConnector used by Impala is managed by the HDFS service. The field is titled "Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml".
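
As for the syntax, here is a minimal sketch of what would go into that field, assuming the standard Hadoop property XML format used in core-site.xml and the value of 1500 quoted from the best-practices guide (adjust the value for your own workload):

```xml
<!-- Assumed entry for the core-site.xml safety valve: raises the size of the S3A HTTP connection pool -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>1500</value>
</property>
```

After saving the change, Cloudera Manager will most likely flag the configuration as stale, so expect to redeploy the client configuration and restart the affected services (including Impala) before the new limit takes effect.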

 

Let me know if you have any other issues.

 

- Sailesh

Contributor
Posts: 25
Registered: ‎01-07-2016

Re: Setting max S3 connections

Thank you Sailesh!

 

This solved my problem.

 

Br,

Petter
