Assumptions:

-You have a running HDP cluster with Sqoop installed

-Basic knowledge of Sqoop and its parameters

Ingesting SAP HANA data with Sqoop

To ingest SAP HANA data, all you need is a JDBC driver. To the HDP platform, HANA is just another database - drop the JDBC driver in and you can plug & play.

1. Download the JDBC driver. It is not publicly available: SAP provides it only to customers of the SAP HANA product, so you will need to download it from SAP's customer-only website.

2. Drop the JDBC driver into Sqoop's lib directory. For me, this is located at /usr/hdp/current/sqoop-client/lib

3. Execute a Sqoop import. This command has many variations and many command-line parameters, but the following is one such example.

sqoop import --connect "jdbc:sap://<HANA_SERVER>:30015" --driver com.sap.db.jdbc.Driver --username <YOUR_USERNAME> --password <PASSWORD> --table "<TABLE_NAME>" --target-dir=/path/to/hdfs/dir -m 1 -- --schema "<YOUR_SCHEMA_NAME>"

The '-m 1' argument limits Sqoop to a single map task, so don't use it if you want parallelism. To parallelize the import, set -m to a higher number and use the --split-by argument with a column name that Sqoop can use to divide the work among the map tasks.
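
The parallel variant described above can be sketched as follows. This is only an illustration: the helper name build_parallel_import, the split column ID, and the mapper count 4 are assumptions, and every <PLACEHOLDER> must be replaced with your own values. The function prints the command rather than running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Sketch: build a parallel Sqoop import command for SAP HANA.
# The function name, the split column, and the mapper count are assumptions;
# every <PLACEHOLDER> must be replaced before the command is run.

# build_parallel_import SPLIT_COLUMN NUM_MAPPERS
# Prints the command instead of executing it.
build_parallel_import() {
  local split_column="$1"
  local num_mappers="$2"
  echo "sqoop import" \
    "--connect \"jdbc:sap://<HANA_SERVER>:30015\"" \
    "--driver com.sap.db.jdbc.Driver" \
    "--username <YOUR_USERNAME> --password <PASSWORD>" \
    "--table \"<TABLE_NAME>\"" \
    "--target-dir=/path/to/hdfs/dir" \
    "--split-by ${split_column} -m ${num_mappers}" \
    "-- --schema \"<YOUR_SCHEMA_NAME>\""
}

# Example: split on a hypothetical numeric column named ID across 4 map tasks.
build_parallel_import ID 4
```

Sqoop queries the minimum and maximum of the split column and divides that range among the map tasks, so a column with evenly distributed values (a numeric key, for example) tends to balance the work best. Each map task opens its own connection to HANA, so pick a mapper count the database can comfortably serve.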

If all goes well, Sqoop should start importing the data into your target directory.
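
If the import instead fails because Sqoop cannot load the driver class com.sap.db.jdbc.Driver, a likely cause is that the jar from step 2 is not in Sqoop's lib directory. Below is a minimal pre-flight check. It assumes the jar's file name contains "ngdbc" (SAP usually ships it as ngdbc.jar; adjust the pattern if yours differs), and the helper name has_hana_driver is made up for this sketch.

```shell
#!/usr/bin/env bash
# Pre-flight check: is a HANA JDBC driver jar present in Sqoop's lib directory?
# Assumption: the jar file name contains "ngdbc"; change the pattern if not.

# has_hana_driver LIB_DIR
# Returns 0 if at least one matching jar exists in LIB_DIR, 1 otherwise.
has_hana_driver() {
  local lib_dir="$1"
  local jar
  for jar in "$lib_dir"/*ngdbc*.jar; do
    # If the glob matches nothing it stays literal, so test for a real file.
    [ -f "$jar" ] && return 0
  done
  return 1
}

# Default to the lib path used in this article; override with SQOOP_LIB.
SQOOP_LIB="${SQOOP_LIB:-/usr/hdp/current/sqoop-client/lib}"
if has_hana_driver "$SQOOP_LIB"; then
  echo "HANA JDBC driver found in $SQOOP_LIB"
else
  echo "No HANA JDBC driver found in $SQOOP_LIB"
fi
```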

Happy Sqooping!
