Data Ingestion From Hive to GP

What is the best and fastest way to move data from Hive to Greenplum: the gpfdist protocol or the gphdfs protocol?


Re: Data Ingestion From Hive to GP

gpfdist serves files from a POSIX filesystem, while gphdfs connects directly to HDFS. Both protocols read files in parallel, so both are very fast. I would use gphdfs so that you don't have to export the data from HDFS to a POSIX filesystem before loading.
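
To make the difference concrete, here is a rough sketch that builds the two external table definitions side by side. Everything specific in it is an assumption for illustration: the host names, ports, file paths, table and column names are hypothetical, and it assumes the Hive table is stored as comma-delimited text. It only prints the SQL; you would run the statements in Greenplum yourself.

```python
def external_table_ddl(name, columns, location, delimiter=","):
    """Build a Greenplum CREATE EXTERNAL TABLE statement as a string.

    columns is a list of (column_name, column_type) pairs.
    location is a gphdfs:// or gpfdist:// URL.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    return (
        f"CREATE EXTERNAL TABLE {name} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )


# Hypothetical column layout of the Hive table being loaded.
columns = [("id", "int"), ("amount", "numeric"), ("sold_on", "date")]

# gphdfs: Greenplum reads the Hive table's file straight from HDFS.
# The NameNode host, port, and warehouse path are placeholders.
gphdfs_ddl = external_table_ddl(
    "ext_sales_hdfs",
    columns,
    "gphdfs://namenode.example.com:8020/apps/hive/warehouse/sales/000000_0",
)

# gpfdist: the data must first be exported to a local (POSIX) directory
# that a running gpfdist process serves. Host, port, and file name are
# placeholders.
gpfdist_ddl = external_table_ddl(
    "ext_sales_file",
    columns,
    "gpfdist://etl.example.com:8081/sales.csv",
)

print(gphdfs_ddl)
print()
print(gpfdist_ddl)
```

Once either external table exists, the load itself is an ordinary INSERT INTO your_table SELECT * FROM the external table. The only difference is the extra export step that the gpfdist route needs first.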
