
Can't connect w/ Amazon S3

New Contributor

I am having a horrible time trying to connect with Amazon S3 public datasets.

 

When I do the following in Pig, I get an error.

 

 hdfs@ip-172-31-35-24:~$ pig
2013-08-22 00:43:00,343 [main] INFO  org.apache.pig.Main - Apache Pig version 0.11.0-cdh4.3.0 (rexported) compiled May 27 2013, 20:40:22
2013-08-22 00:43:00,344 [main] INFO  org.apache.pig.Main - Logging error messages to: /var/lib/hadoop-hdfs/pig_1377132180339.log
2013-08-22 00:43:00,371 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /var/lib/hadoop-hdfs/.pigbootup not found
2013-08-22 00:43:00,696 [main] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2013-08-22 00:43:00,696 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://ip-172-31-35-24.us-west-2.compute.internal:8020
2013-08-22 00:43:01,633 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: ip-172-31-35-24.us-west-2.compute.internal:8021
2013-08-22 00:43:01,635 [main] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> cd s3://datasets.elasticmapreduce/ngrams/books
2013-08-22 00:43:13,413 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Wrong FS: s3://datasets.elasticmapreduce/ngrams/books, expected: hdfs://ip-172-31-35-24.us-west-2.compute.internal:8020
Details at logfile: /var/lib/hadoop-hdfs/pig_1377132180339.log
grunt>
grunt>
grunt>

 

Can someone please help me? I'm a newbie to this stuff.

 

Thanks.

3 REPLIES

Re: Can't connect w/ Amazon S3

Rising Star

Use the Pig version packaged with EMR (Elastic MapReduce). Was your Pig installed separately?

Re: Can't connect w/ Amazon S3

New Contributor

The version of Pig that I'm using is the one installed with CDH.

 

hdfs@ip-172-31-35-24:/tmp/pig-0.11.1$ pig
13/08/22 17:23:25 WARN pig.Main: Cannot write to log file: /tmp/pig-0.11.1/pig_1377192205317.log
2013-08-22 17:23:25,323 [main] INFO  org.apache.pig.Main - Apache Pig version 0.11.0-cdh4.3.0 (rexported) compiled May 27 2013, 20:40:22
2013-08-22 17:23:25,351 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /var/lib/hadoop-hdfs/.pigbootup not found
2013-08-22 17:23:25,681 [main] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2013-08-22 17:23:25,681 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://ip-172-31-35-24.us-west-2.compute.internal:8020
2013-08-22 17:23:26,625 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: ip-172-31-35-24.us-west-2.compute.internal:8021
2013-08-22 17:23:26,627 [main] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt>
grunt>

I looked at the Amazon EMR page ( http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/Pig_SupportedVersions.html ) and it looks like the version shipped with CDH is not compatible?

 

BTW, I do appreciate the help. I'm a total newbie at this stuff.

 

Thanks.

 

-brad w.

Re: Can't connect w/ Amazon S3

Master Guru

Are you facing issues only with commands such as "cd", or when running actual Pig queries too? If you have tried the latter, can you post the error you get from operators such as LOAD or STORE?
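
For context on the "Wrong FS" message in the original post: it is raised when a path's scheme and authority do not match the file system the client is currently bound to (here, the HDFS default). The Python sketch below imitates that comparison in a simplified form. It is not Hadoop's actual code; the function name `check_path` and the constant `DEFAULT_FS` are made up for illustration, and the real check handles more cases (default ports, missing authorities).

```python
from urllib.parse import urlparse


def check_path(default_fs: str, path: str) -> None:
    """Raise ValueError if `path` names a different file system than `default_fs`.

    Simplified illustration only; `check_path` is a hypothetical helper.
    """
    fs = urlparse(default_fs)
    target = urlparse(path)
    # A path without a scheme is resolved against the default file system,
    # so it always passes.
    if not target.scheme:
        return
    # Compare scheme and authority (host:port or bucket) case-insensitively.
    if (target.scheme.lower(), target.netloc.lower()) != (
        fs.scheme.lower(),
        fs.netloc.lower(),
    ):
        raise ValueError(f"Wrong FS: {path}, expected: {default_fs}")


# Default file system taken from the log in the original post.
DEFAULT_FS = "hdfs://ip-172-31-35-24.us-west-2.compute.internal:8020"

check_path(DEFAULT_FS, "/user/hdfs/data")  # no scheme, accepted
try:
    check_path(DEFAULT_FS, "s3://datasets.elasticmapreduce/ngrams/books")
except ValueError as err:
    print(err)
```

A failure on "cd" alone may therefore only mean that the shell is bound to the default file system, not necessarily that S3 is unreachable, which is why the errors from LOAD or STORE are the more useful thing to look at.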