New Contributor
Posts: 2
Registered: ‎12-17-2014

Installing Teradata Connector in CDH4 Cloudera QuickStart VM

Hello everyone:

I'm trying to build a small proof of concept that uses Hadoop to store older data that currently lives in a Teradata DB. I want to copy the contents of a few tables to find out whether this is possible and feasible. I would like to use Sqoop with HUE to do that. Is that possible?

I'm new to Linux and Hadoop.

I've downloaded the VMware version of the Cloudera QuickStart VM for CDH 4 (CDH 5 runs too slowly on my PC) from

http://cloudera.com/content/cloudera/en/documentation/core/latest/topics/cloudera_quickstart_vm.html

It's up and running fine.

I understand that the next step is to install the Teradata connector in that VM so that Sqoop can communicate with our Teradata DB. I've found this tutorial in Cloudera's documentation:

http://www.cloudera.com/content/cloudera/en/documentation/connectors/latest/Teradata/Cloudera-Connec...

I'm following the "Installation without Cloudera Manager" section, since I have neither CDH 5 nor Internet access from the VM. I will explain what I have done at every step (for each step, the text from the documentation comes first, followed by what I did).

1) Install the Sqoop connector by opening the distribution archive in a convenient location such as /usr/lib. Opening the distribution creates a directory that contains the jar file of the compiled version of the connector. Note the path to this jar file. The directory that is created when the file is expanded varies according to which connector you are using. Examples of typical resulting paths include:

    Cloudera Connector Powered by Teradata 1.2cX: /usr/lib/sqoop-connector-teradata-1.2cX/sqoop-connector-teradata-1.2cX.jar
    Cloudera Connector for Teradata 1.2cX: /usr/lib/sqoop-td-connector-1.2cX/sqoop-td-connector-1.2cX.jar

I chose the Cloudera Connector for Teradata because it's free and this is only a small demo. I downloaded it from

http://www.cloudera.com/content/cloudera/en/downloads/connectors/sqoop/teradata/1-2c4-for-teradata.h...

and extracted its contents, so the jar is now at

/usr/lib/sqoop-td-connector-1.2c4/sqoop-td-connector-1.2c4.jar

2) Copy the Teradata JDBC drivers (terajdbc4.jar and tdgssconfig.jar) to the lib directory of Sqoop installation. You can obtain these drivers from the Teradata download website: http://downloads.teradata.com/download/connectivity/jdbc-driver. Without these drivers, the connector will not function correctly.

I downloaded the driver and copied the two jar files into

/usr/lib/sqoop/lib/

Is that path correct?
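One way to double-check step 2 is to confirm that both driver jars are actually present in the directory Sqoop loads libraries from. The sketch below is only an illustration: `check_teradata_jars` and `SQOOP_LIB_DIR` are hypothetical names, and `/usr/lib/sqoop/lib` is simply the directory used in this post, which may differ on other installations.

```shell
#!/bin/sh
# Sketch: report whether the two Teradata JDBC jars named in the documentation
# are present in a Sqoop lib directory.
# check_teradata_jars is a hypothetical helper, not part of Sqoop.
check_teradata_jars() {
    lib_dir="${1:-/usr/lib/sqoop/lib}"   # directory used in this post (assumption elsewhere)
    missing=0
    for jar in terajdbc4.jar tdgssconfig.jar; do
        if [ -f "$lib_dir/$jar" ]; then
            echo "found:   $lib_dir/$jar"
        else
            echo "missing: $lib_dir/$jar"
            missing=1
        fi
    done
    return "$missing"   # 0 only when both jars were found
}

# SQOOP_LIB_DIR is an illustrative override for installations that use another path.
check_teradata_jars "${SQOOP_LIB_DIR:-/usr/lib/sqoop/lib}" \
    || echo "Copy the missing jars before running Sqoop."
```

If either jar is reported missing, the connector would not be able to load the Teradata driver, which matches the warning in the documentation text above.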

3) Confirm that the managers.d directory exists in the Sqoop configuration directory.
  Note: Depending on how Sqoop is installed, its configuration directory may be in /etc/sqoop/conf, /usr/lib/sqoop/conf, or elsewhere if Sqoop was installed using the tar-ball distribution.

If the managers.d directory does not exist, create it and ensure the directory permissions are set to 755.

There was no managers.d directory in /etc/sqoop/conf, so I created it and set the appropriate permissions with chmod:

drwxr-xr-x  2 root root 4096 Dec 16 07:58 managers.d
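For reference, step 3 could be carried out roughly as in the sketch below. `SQOOP_CONF_DIR` is an illustrative variable, not something Sqoop itself requires here; on the VM described in this post it would be `/etc/sqoop/conf`, and writing there needs root. It defaults to a scratch directory so the commands can be tried safely.

```shell
#!/bin/sh
# Sketch: make sure managers.d exists with mode 755 inside the Sqoop
# configuration directory.
# SQOOP_CONF_DIR is an assumption for illustration; set it to /etc/sqoop/conf
# (as root) to apply this to the VM from the post.
SQOOP_CONF_DIR="${SQOOP_CONF_DIR:-$(mktemp -d)}"
MANAGERS_DIR="$SQOOP_CONF_DIR/managers.d"

mkdir -p "$MANAGERS_DIR"      # no error if the directory already exists
chmod 755 "$MANAGERS_DIR"     # rwxr-xr-x, as the documentation asks
ls -ld "$MANAGERS_DIR"        # permissions column should start with drwxr-xr-x
```

Setting the mode explicitly with `chmod` avoids depending on the current umask, which could otherwise leave the directory with different permissions.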

4) Create a text file in the managers.d directory with a descriptive name such as cldra_td_connector. Ensure the file permissions are set to 644.

I've created the file with the required permissions:

-rw-r--r-- 1 root root 113 Dec 16 08:02 cldra_td_connector

5) The cldra_td_connector file must have the connector class name followed by the complete path to the directory where the connector jar is located.
For example, for the Cloudera Connector Powered by Teradata 1.2cX:

com.cloudera.connector.teradata.TeradataManagerFactory= \
/usr/lib/sqoop-connector-teradata-1.2cX/sqoop-connector-teradata-1.2cX.jar

For example, for the Cloudera Connector for Teradata 1.2cX:

com.cloudera.sqoop.manager.TeradataManagerFactory= \
/usr/lib/sqoop-td-connector-1.2cX/sqoop-td-connector-1.2cX.jar

  Note: The preceding command is shown on two lines, but this must be entered in a single line.
The TeradataManagerFactory acts as a single point of delegation for invoking the connector bundled with this distribution. An alternate way to specify TeradataManagerFactory is to add the following inside a sqoop-site.xml file, which must be inside a classpath directory:

<configuration>
  <property>
    <name>sqoop.connection.factories</name>
    <value>com.cloudera.sqoop.manager.TeradataManagerFactory</value>
  </property>
</configuration>

This is the way to configure a Sqoop action to use the Teradata connector inside Oozie.

I've written the following in the /etc/sqoop/conf/managers.d/cldra_td_connector file (in a single line):

com.cloudera.sqoop.manager.TeradataManagerFactory=/usr/lib/sqoop-td-connector-1.2c4/sqoop-td-connector-1.2c4.jar
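Steps 4 and 5 can be sketched in shell as below, which also guards against accidentally splitting the entry over two lines. The class name and jar path are the ones from this post; `MANAGERS_DIR` is an illustrative variable that defaults to a scratch directory, whereas on the VM it would be `/etc/sqoop/conf/managers.d` and would need root.

```shell
#!/bin/sh
# Sketch: write the connector registration as ONE line and set mode 644.
# MANAGERS_DIR is an assumption for illustration; on the VM in this post it
# would be /etc/sqoop/conf/managers.d.
MANAGERS_DIR="${MANAGERS_DIR:-$(mktemp -d)/managers.d}"
FACTORY="com.cloudera.sqoop.manager.TeradataManagerFactory"
CONNECTOR_JAR="/usr/lib/sqoop-td-connector-1.2c4/sqoop-td-connector-1.2c4.jar"
CONNECTOR_FILE="$MANAGERS_DIR/cldra_td_connector"

mkdir -p "$MANAGERS_DIR"
# printf emits class=jar with no space or line break around '='.
printf '%s=%s\n' "$FACTORY" "$CONNECTOR_JAR" > "$CONNECTOR_FILE"
chmod 644 "$CONNECTOR_FILE"   # rw-r--r--, as the documentation asks

# Sanity checks: the file should hold exactly one line, and the jar it
# points to should exist on the machine where Sqoop runs.
echo "lines in file: $(wc -l < "$CONNECTOR_FILE" | tr -d ' ')"
[ -f "$CONNECTOR_JAR" ] || echo "warning: $CONNECTOR_JAR not found on this machine"
```

Written this way the file comes to 113 bytes including the trailing newline, which is consistent with the size shown in the `ls` listing for cldra_td_connector earlier in the post.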

After completing all the steps, I started the Sqoop service in my VM.

Next, I went to HUE and selected Sqoop from the upper menu. I clicked the "new job" button, and then the "add a new connection" button.

There, in the "connection" combobox, only "generic jdbc connector" appears.

I think that if I had done all the steps correctly, a "teradata" connector or something similar should appear there. Is that true?

Am I doing it right?

Thank you.

Posts: 1,827
Kudos: 406
Solutions: 292
Registered: ‎07-31-2013

Re: Installing Teradata Connector in CDH4 Cloudera QuickStart VM

Hue's Sqoop app uses the Sqoop2 service/framework, which does not yet support Teradata; Sqoop2 is still under development and is not yet on par with all the functions Sqoop1 provides via the CLI command "sqoop".

You may have more success by running a "Sqoop Action" via Hue's Oozie app (i.e. the Workflow Editor), which runs a Sqoop1 CLI-style job, but through Hue.
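To make the "Sqoop1 CLI style job" concrete, here is a minimal sketch that only assembles and prints such an import command (a dry run; it does not call Sqoop). The host, database, user, table and target directory are placeholders, not values from this thread, and `build_sqoop_import` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: assemble a Sqoop1 import command for Teradata and print it (dry run).
# build_sqoop_import is a hypothetical helper; every argument passed to it
# below is a placeholder and must be replaced with real values.
build_sqoop_import() {
    td_host="$1"; td_db="$2"; td_user="$3"; td_table="$4"; target_dir="$5"
    echo "sqoop import" \
         "--connect jdbc:teradata://$td_host/DATABASE=$td_db" \
         "--username $td_user -P" \
         "--table $td_table" \
         "--target-dir $target_dir" \
         "--num-mappers 1"
}

build_sqoop_import td-host.example.com sales_db demo_user OLD_ORDERS /user/cloudera/old_orders
```

In an Oozie Sqoop action the command is normally given without the leading `sqoop`, and the interactive `-P` prompt would have to be replaced by a non-interactive option such as `--password`, since nobody is there to type the password when the workflow runs.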
New Contributor
Posts: 2
Registered: ‎12-17-2014

Re: Installing Teradata Connector in CDH4 Cloudera QuickStart VM

Thank you for the reply.

I'll try it.
