Copying all files from a directory using Pig.

Hey, I need to copy all files from a local directory to HDFS using Pig.

In the Pig script I'm using the copyFromLocal command with a wildcard in the source path, i.e. copyFromLocal /home/hive/Sample/* /user
It says the source path doesn't exist.

When I use copyFromLocal /home/hive/Sample/ /user, it creates another directory in HDFS named 'Sample', which I don't need.

But when I include the file name, i.e. /home/hive/Sample/sample_1.txt, it works.

I don't need a single file. I need to copy all the files in the directory without creating a directory in HDFS.

Re: Copying all files from a directory using Pig.

This is currently a limitation in Hadoop. Normally your shell expands the glob before the command runs, which is why regular command-line use works. Pig, however, does not invoke the command through a Unix shell; it passes the pattern straight into the code, which does not apply globbing to source paths today. See/follow for more.
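
Since a single explicit file path does work, one possible workaround is to do the glob expansion yourself before Pig runs and hand Pig one copyFromLocal statement per file. Below is a minimal sketch of that idea in Python, not a tested recipe for any particular cluster. The helper name build_copy_statements is hypothetical, the paths are the ones from the question, and the trailing semicolon on each statement is an assumption about how you terminate statements in your script.

```python
import glob
import os


def build_copy_statements(local_dir, hdfs_dest):
    """Return one Pig copyFromLocal statement per regular file in local_dir.

    Hypothetical helper: it performs the wildcard expansion that the shell
    would normally do, so Pig only ever sees explicit file paths.
    """
    pattern = os.path.join(local_dir, "*")
    # Like a shell "*", this does not match hidden dotfiles.
    # Subdirectories are skipped so that no extra directory is created in HDFS.
    files = sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
    # Assumption: paths contain no whitespace, and statements end with ';'.
    return ["copyFromLocal {} {};".format(path, hdfs_dest) for path in files]


if __name__ == "__main__":
    # Paths from the question; adjust them for your environment.
    for statement in build_copy_statements("/home/hive/Sample", "/user"):
        print(statement)
```

The printed lines can be pasted into (or written out ahead of) the Pig script. Because each statement names a single file, the files should land directly under /user without an intermediate 'Sample' directory.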