I have multiple Parquet files in an HDFS directory and I want to merge them all into one file using Apache Pig. I don't know how many files the directory will contain, so I can't declare a variable for each file. Is there a way to load all the files in the same directory that share the same schema?
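Pig's LOAD accepts a directory or a glob pattern as its path, so you don't need one variable per file. Pointing LOAD at the directory reads every file in it, and you can use a glob such as myFile_* to load only files beginning with the name myFile_.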
Your final sentence suggests that files in the same directory may have different schemas. If that is the case, and files with the same schema have similar names, you can use a glob pattern such as myFile_* in the path to pull only the same-schema files from the directory.
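
A minimal sketch of both cases might look like the following. The directory /data/input is a hypothetical path, and ParquetLoader is assumed to be resolvable in your setup; depending on your Pig and Parquet versions you may need to REGISTER the parquet-pig jar and use a fully qualified class name instead.

```pig
-- Assumption: /data/input is a hypothetical HDFS directory; replace it with your own.
-- Assumption: ParquetLoader resolves in your environment (you may need to REGISTER
-- the parquet-pig jar and use the fully qualified class name).

-- Passing the directory loads every file in it into a single relation.
all_files = LOAD '/data/input' USING ParquetLoader();

-- A glob restricts the load to files whose names start with myFile_.
my_files = LOAD '/data/input/myFile_*' USING ParquetLoader();
```

Either relation can then be written out with STORE. Note that Pig writes its output as a directory of part files, so the number of output files depends on how many tasks the job runs.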