Assumptions:

- You have a running HDP cluster with Sqoop installed

- You have basic knowledge of Sqoop and its parameters

Ingesting SAP HANA data with Sqoop

To ingest SAP HANA data, all you need is a JDBC driver. To the HDP platform, HANA is just another database - drop the JDBC driver in and you can plug & play.

1. Download the JDBC driver. This driver is not publicly available; it is only available to SAP HANA customers. Find it on SAP's customer-only website and download it.

2. Drop the JDBC driver into Sqoop's lib directory. For me, this is located at /usr/hdp/current/sqoop-client/lib
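Step 2 can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name install_hana_driver is hypothetical, and the jar name ngdbc.jar and the download location in the usage note are assumptions, so substitute whatever file you actually downloaded.

```shell
# install_hana_driver JAR LIB_DIR
# Hypothetical helper: copies the HANA JDBC driver jar into Sqoop's lib directory.
install_hana_driver() {
    jar="$1"
    lib_dir="$2"

    if [ ! -f "$jar" ]; then
        echo "driver jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "Sqoop lib directory not found: $lib_dir" >&2
        return 1
    fi

    # Copy the jar and make it readable by the user that runs Sqoop.
    cp "$jar" "$lib_dir/" && chmod 644 "$lib_dir/$(basename "$jar")"
}
```

With the lib directory from this article, a call might look like: install_hana_driver ~/Downloads/ngdbc.jar /usr/hdp/current/sqoop-client/lib (run it as a user with write access to that directory).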

3. Execute a Sqoop import. This command has many variations and command-line parameters; the following is one example.

sqoop import \
  --connect "jdbc:sap://<HANA_SERVER>:30015" \
  --driver com.sap.db.jdbc.Driver \
  --username <YOUR_USERNAME> \
  --password <PASSWORD> \
  --table "<TABLE_NAME>" \
  --target-dir=/path/to/hdfs/dir \
  -m 1 \
  -- --schema "<YOUR_SCHEMA_NAME>"

The '-m 1' argument limits Sqoop to a single map task, so drop it if you want parallelism. To parallelize the import, set -m to a higher number and pass the --split-by argument with a column name, which Sqoop uses to divide the rows among the map tasks.
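The parallel variant can be sketched as a wrapper around the same import command. This is a sketch, not a tested recipe for every HANA setup: the function name hana_parallel_import and the HANA_* and SQOOP_BIN environment variables are hypothetical names introduced here, and the split column should ideally be a numeric column with evenly distributed values.

```shell
# Binary to invoke; overridable so the command can be previewed with "echo".
SQOOP_BIN="${SQOOP_BIN:-sqoop}"

# hana_parallel_import SPLIT_COLUMN NUM_MAPPERS
# Hypothetical wrapper around the import shown in step 3. It expects
# HANA_SERVER, HANA_USER, HANA_PASSWORD, HANA_TABLE and HANA_SCHEMA to be set.
hana_parallel_import() {
    split_col="$1"
    mappers="$2"

    "$SQOOP_BIN" import \
        --connect "jdbc:sap://${HANA_SERVER}:30015" \
        --driver com.sap.db.jdbc.Driver \
        --username "$HANA_USER" \
        --password "$HANA_PASSWORD" \
        --table "$HANA_TABLE" \
        --target-dir /path/to/hdfs/dir \
        --split-by "$split_col" \
        -m "$mappers" \
        -- --schema "$HANA_SCHEMA"
}
```

For example, hana_parallel_import ID 4 would split the table on a column named ID (an assumed column name) across four map tasks. Setting SQOOP_BIN=echo first prints the assembled command instead of running it.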

If all goes well, Sqoop should start importing the data into your target directory.
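One way to check the result afterwards is to list the target directory in HDFS. The sketch below assumes the import job writes a _SUCCESS marker file on completion, which is the usual MapReduce default but can be disabled; the function name verify_import and the HDFS_BIN variable are hypothetical.

```shell
# Binary to invoke; overridable for dry runs.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# verify_import TARGET_DIR
# Hypothetical helper: lists the imported files and checks for the _SUCCESS marker.
verify_import() {
    target_dir="$1"

    # List the output files (typically part-m-00000, part-m-00001, ...).
    "$HDFS_BIN" dfs -ls "$target_dir" || return 1

    # "dfs -test -e" exits with 0 when the path exists.
    if "$HDFS_BIN" dfs -test -e "$target_dir/_SUCCESS"; then
        echo "import job completed: _SUCCESS marker present"
    else
        echo "no _SUCCESS marker in $target_dir" >&2
        return 1
    fi
}
```

A call such as verify_import /path/to/hdfs/dir should show one part file per map task.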

Happy Sqooping!

Last update: 11-02-2017 05:31 PM