How do I connect to S3 from Impala?

New Contributor

Hello,

Thank you for reading my question.

I want to query data stored in S3 from Impala. I ran the two tests below.

1. Test with a local file (CSV)

tab1.csv (local):

1,true,123.123,2012-10-24 08:55:00 
2,false,1243.5,2012-10-25 13:40:00
3,false,24453.325,2008-08-22 09:33:21.123
4,false,243423.325,2007-05-12 22:32:21.33454
5,true,243.325,1953-04-22 09:11:33

 

DROP TABLE IF EXISTS tab1;

CREATE EXTERNAL TABLE tab1
(
   id INT,
   col_1 BOOLEAN,
   col_2 DOUBLE,
   col_3 TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/sample_data/tab1';

 

As expected, this worked: I could query the table from Impala.

2. Test with S3 (CSV)

tab1.csv (Amazon S3):

1,true,123.123,2012-10-24 08:55:00 
2,false,1243.5,2012-10-25 13:40:00
3,false,24453.325,2008-08-22 09:33:21.123
4,false,243423.325,2007-05-12 22:32:21.33454
5,true,243.325,1953-04-22 09:11:33

 

 

DROP TABLE IF EXISTS tab1;

CREATE EXTERNAL TABLE tab1
(
   id INT,
   col_1 BOOLEAN,
   col_2 DOUBLE,
   col_3 TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://impaladata/s3test';

 

This did not work at all. I can see the table tab1 in Impala, but I cannot query it or see any of its data.

I also checked the connection between HDFS and S3, as shown below.

[hadoop@impala Impala_Test]$ hdfs dfs -ls /

Found 7 items
drwxrwxrwx - 0 1970-01-01 00:00 /s3test

 

 

Please help.

Thank you, and have a nice day :)

2 REPLIES

Re: How do I connect to S3 from Impala?

New Contributor

I have the same issue, and I also tried using a Hive table, both external and internal.

Since Impala could not use S3 directly, I thought a Hive table might help, but the result is the same: I can query these S3 tables from Hive but not from Impala, although Impala sees the tables properly.

Should I move the data to HDFS, then? Or do you know of any other option?

Regards,

jcsenciales

 

Re: How do I connect to S3 from Impala?

Contributor

Impala currently only reads data stored in HDFS, so it cannot query tables whose LOCATION points to S3.
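
Given that, the practical workaround is the one raised in the previous reply: copy the files from S3 into HDFS and point the table at the HDFS directory. Below is a minimal sketch, not a tested recipe. It assumes the bucket path from the question (s3://impaladata/s3test), the HDFS directory from the working test (/user/cloudera/sample_data/tab1), and a Hadoop installation that can already read that S3 URI (the URI scheme and credential setup vary by distribution). The run helper is hypothetical and only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: SRC and DST are taken from the question above. Adjust them
# (and the S3 URI scheme) to whatever your cluster actually supports.
SRC="s3://impaladata/s3test"
DST="/user/cloudera/sample_data/tab1"

# Hypothetical helper: prints the command unless RUN=1 is set, so the
# steps can be reviewed before they touch the cluster.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# -update copies the contents of SRC directly into DST rather than
# creating a nested DST/s3test directory.
run hadoop distcp -update "$SRC" "$DST"

# Confirm the CSV file arrived in HDFS.
run hdfs dfs -ls "$DST"
```

After the copy, recreate tab1 with the HDFS LOCATION from the first test. If the table already points at that directory, running REFRESH tab1 in impala-shell should make the new files visible.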
