You can read multiple files into a single Spark RDD either with a glob pattern:
val data = sc.textFile("/user/pedro/pig_files/*txt")
or by pointing at the whole directory:
val data = sc.textFile("/user/pedro/pig_files")
From this point onwards the Spark RDD 'data' will have at least one partition per input file (more if a file spans multiple HDFS blocks). Spark is perfectly happy with that: spreading the data across partitions is what lets it parallelize whatever you do on the RDD.
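You can verify the partitioning yourself. A minimal sketch, assuming a running SparkContext `sc` and the directory from above:

```scala
// Read the files and inspect how many partitions Spark created.
// With small files you should see one partition per input file.
val data = sc.textFile("/user/pedro/pig_files")
println(data.getNumPartitions)
```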
Now, if you want to merge those files into one and write the result back to HDFS, just coalesce the RDD down to a single partition and save it.
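A sketch of that merge-and-rewrite step (the output path is just an example, pick your own):

```scala
// coalesce(1) shuffles everything into one partition, so the save
// produces a single part file inside the target folder.
data.coalesce(1).saveAsTextFile("/user/pedro/pig_files_merged")
```

Note that coalesce(1) funnels all the data through a single task, so only do this when the combined data comfortably fits on one executor.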
You cannot (easily) choose the name of the output file: you only specify the HDFS output folder, and Spark writes part-NNNNN files inside it.
Hope this helps