Support Questions

Subproduct files in HDFS

Hi:

I have a question about data layers in HDFS.

If I need to build subproducts (derived datasets), is it better to process the raw files with Pig, Spark, or R, convert them into transformed files, and load those into Hive? Or is it better to query the raw files directly and present the results with an analytics tool?

Thanks.

1 ACCEPTED SOLUTION

@Roberto Sancho You will get much better performance by transforming the files with simple processing in Pig or Hive and then creating ORC Hive tables on the transformed data.
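As a rough illustration of that flow, here is a minimal sketch in Python that only builds the two HiveQL statements involved: an external table over the raw files, then a CREATE TABLE ... STORED AS ORC AS SELECT that writes the transformed copy. The helper names (`raw_table_ddl`, `orc_ctas`), the table names, the columns, and the HDFS path are all hypothetical and not taken from the question; adapt them to your own data. The script does not connect to Hive, it just prints statements you could run through whatever Hive client you use.

```python
def raw_table_ddl(table, columns, location, delimiter=","):
    """Build HiveQL for an external table that reads the raw files in place.

    `columns` is a list of (name, hive_type) pairs. All names are hypothetical.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols})\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


def orc_ctas(target, source, select_exprs, where=None):
    """Build a CREATE TABLE ... STORED AS ORC AS SELECT statement.

    The SELECT list is where the simple transformation (casts, filters,
    renames) would go.
    """
    stmt = (
        f"CREATE TABLE {target} STORED AS ORC AS\n"
        f"SELECT {', '.join(select_exprs)}\n"
        f"FROM {source}"
    )
    if where:
        stmt += f"\nWHERE {where}"
    return stmt


if __name__ == "__main__":
    # Hypothetical raw layer: comma-delimited text files under /data/raw/events
    print(raw_table_ddl(
        "raw_events",
        [("id", "INT"), ("ts", "STRING"), ("value", "DOUBLE")],
        "/data/raw/events",
    ) + ";\n")
    # Hypothetical transformed layer: cleaned rows stored as ORC
    print(orc_ctas(
        "events_orc",
        "raw_events",
        ["id", "CAST(ts AS TIMESTAMP) AS ts", "value"],
        where="value IS NOT NULL",
    ) + ";")
```

Analytics tools would then query `events_orc` rather than the raw text files. ORC is a compressed, columnar format, so queries can read only the columns they need, which is the main reason it tends to outperform scanning raw delimited files.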
