Hadoop installation on GPDB to use GPHDFS protocol

Hello,

I am using Greenplum Database (GPDB) along with HDP 2.3.6. Greenplum provides the gphdfs protocol to connect to HDFS and access data stored there. To use gphdfs, the Hadoop binaries must be installed on the GPDB cluster. Is there any document that explains how to do that? I am following the document below, but it does not specify which binaries need to be installed for HDP.

https://discuss.pivotal.io/hc/en-us/articles/202635496-How-to-access-HDFS-data-via-GPDB-external-tab...

Re: Hadoop installation on GPDB to use GPHDFS protocol

Hi @Govind Tagai,

You can set up the remote HDP repositories by following this guide: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.6/bk_installing_manually_book/content/config-...

Once the repositories are set up, you can use yum to install the clients that you need. For example: yum install hadoop hadoop-hdfs hadoop-libhdfs hadoop-yarn hadoop-mapreduce hadoop-client openssl

Perform the above steps on every host in the Greenplum cluster.
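
If you have many hosts, a small loop can save some typing. The following is only a rough sketch, not a tested procedure: it assumes a hypothetical host file with one hostname per line, passwordless SSH to each host, and a user allowed to run sudo yum there. The function name and the DRY_RUN variable are my own, not part of Greenplum or HDP. By default it only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch: run the same yum install on every Greenplum host.
# Assumptions (not from the thread): a host file with one hostname per line,
# passwordless SSH, and sudo rights for yum on each host.

PACKAGES="hadoop hadoop-hdfs hadoop-libhdfs hadoop-yarn hadoop-mapreduce hadoop-client openssl"

# install_hadoop_clients HOSTFILE
# With DRY_RUN=1 (the default) the command for each host is printed only;
# with DRY_RUN=0 it is executed over SSH.
install_hadoop_clients() {
  local hostfile="$1"
  local host
  # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines.
    case "$host" in ''|'#'*) continue ;; esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh $host sudo yum install -y $PACKAGES"
    else
      # -n stops ssh from swallowing the rest of the host file on stdin.
      ssh -n "$host" "sudo yum install -y $PACKAGES"
    fi
  done < "$hostfile"
}
```

To try it, source the script and run install_hadoop_clients with the path to your host file, check the printed commands, and then repeat with DRY_RUN=0 in front of the call. The repositories still need to be configured on each host beforehand, as described in the guide linked above.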

For further reference, see: https://community.hortonworks.com/questions/10092/how-can-hadoop-client-libraries-be-added-to-a-node...