Support Questions
Find answers, ask questions, and share your expertise
Announcements

Copying all files from a directory using pig.


New Contributor

Hi, I need to copy all files from a local directory to HDFS using Pig.

In the Pig script I'm using the copyFromLocal command with a wildcard in the source path, i.e. copyFromLocal /home/hive/Sample/* /user
It says the source path doesn't exist.

When I use copyFromLocal /home/hive/Sample/ /user, it creates another directory in HDFS named 'Sample', which I don't need.

But when I include the file name, i.e. /home/hive/Sample/sample_1.txt, it works.

I don't want a single file. I need to copy all the files in the directory without creating a new directory in HDFS.


Re: Copying all files from a directory using pig.

Master Guru
This is currently a limitation in Hadoop. Normally your shell expands the glob before the command runs, which is why wildcards work on the command line. Pig, however, is not invoked through the Unix shell; it passes the pattern into the code as-is, and copyFromLocal does not apply globbing to source paths today. See/follow https://issues.apache.org/jira/browse/HADOOP-7141 for more.
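One workaround (a sketch, assuming the /home/hive/Sample and /user paths from the question) is to let a real shell do the glob expansion, e.g. by running the copy from the OS shell before invoking Pig. The self-contained demonstration below uses local temp directories to show that it is the shell, not the copy command, that expands the wildcard:

```shell
# With a cluster available, running this from the OS shell (not from
# inside Pig) lets bash expand the glob before the command runs:
#   hdfs dfs -put /home/hive/Sample/* /user
#
# Local analogue, runnable anywhere: the shell expands the wildcard
# into individual file arguments, so the files land directly in the
# target with no extra 'Sample' directory being created.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/sample_1.txt" "$src/sample_2.txt"

cp "$src"/* "$dst"   # shell expands the glob before cp ever runs

ls "$dst"            # sample_1.txt  sample_2.txt
```

The same principle applies to copyFromLocal: when the expanded file list (rather than the literal `*` pattern) reaches the command, each file is copied individually into the target directory.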