Hi, is there any connector from Teradata to Spark? We have scenarios where we need to get data from Teradata using Spark SQL. I am using Spark 1.6.0. Please let me know if anyone has tried connecting Teradata and Spark. Thanks!
Labels: Apache Spark
Created ‎10-27-2016 05:28 PM
Created ‎10-27-2016 06:48 PM
Sounds like a JDBC connection is in order. There is an API for creating a DataFrame from a JDBC connection:
jdbc(url: String, table: String, predicates: Array[String], connectionProperties: Properties): DataFrame
The issue with JDBC is that reading data from Teradata will be much slower than reading from HDFS. Is it possible to run a Sqoop job to move the data to HDFS before starting your Spark application?
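As a rough illustration of the jdbc(...) signature above, here is a minimal sketch of a predicate-partitioned read in Spark 1.6. The host name, the sales.orders table, and the order_year column are hypothetical placeholders, not details from this thread. Each predicate is used as the WHERE clause for one partition, so several queries can run against Teradata in parallel, which may soften the slowness mentioned above.

```scala
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SQLContext}

object TeradataPartitionedRead {
  // Hypothetical host and database; replace with your own.
  val url = "jdbc:teradata://td-host.example.com/DATABASE=sales,TMODE=TERA"

  // One predicate per partition. They should not overlap and should
  // together cover every row you want to read (hypothetical column).
  val predicates: Array[String] = Array(
    "order_year = 2014",
    "order_year = 2015",
    "order_year = 2016"
  )

  // Standard JDBC connection properties plus the Teradata driver class.
  def connectionProperties(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props.setProperty("driver", "com.teradata.jdbc.TeraDriver")
    props
  }

  // Returns a DataFrame with one partition per predicate.
  def readOrders(sqlContext: SQLContext, user: String, password: String): DataFrame =
    sqlContext.read.jdbc(url, "sales.orders", predicates, connectionProperties(user, password))
}
```

In spark-shell you could call TeradataPartitionedRead.readOrders(sqlContext, "my_user", "my_password") once the Teradata driver jars are on the classpath.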
Created ‎10-28-2016 02:10 AM
Hi Joe,
We want to use Spark SQL instead of Sqoop. I tried the Teradata JDBC driver, but I am unable to download the dependencies.
Thanks
<dependency>
  <groupId>com.teradata.jdbc</groupId>
  <artifactId>terajdbc4</artifactId>
  <version>15.10.00.22</version>
</dependency>
<dependency>
  <groupId>com.teradata.jdbc</groupId>
  <artifactId>tdgssconfig</artifactId>
  <version>15.00.00.22</version>
</dependency>
Created ‎10-31-2016 04:14 AM
Make sure you add the JAR to your classpath and include it when you run the application.
sc.addJar("yourDriver.jar")
val jdbcDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://<server_name>/TMODE=TERA,user=my_user,password=*****",
  "dbtable" -> "schema.table_name",
  "driver" -> "com.teradata.jdbc.TeraDriver"))
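For reference, sqlContext.load has been deprecated since Spark 1.4 in favour of the DataFrameReader API. The following is a sketch of the same read written with sqlContext.read, assuming Spark 1.6. The teradataUrl helper and the readTable function are hypothetical names introduced here for illustration. The helper assembles a URL in the Teradata form jdbc:teradata://<host>/NAME=value,NAME=value, and the credentials are passed as options, which Spark forwards to the driver as connection properties.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

object TeradataJdbcRead {
  // Hypothetical helper: builds jdbc:teradata://<host>/NAME=value,NAME=value
  def teradataUrl(host: String, params: Seq[(String, String)]): String = {
    val paramList = params.map { case (name, value) => s"$name=$value" }.mkString(",")
    if (paramList.isEmpty) s"jdbc:teradata://$host" else s"jdbc:teradata://$host/$paramList"
  }

  // Non-deprecated equivalent of sqlContext.load("jdbc", Map(...)).
  def readTable(sqlContext: SQLContext, host: String, user: String,
                password: String, table: String): DataFrame =
    sqlContext.read
      .format("jdbc")
      .options(Map(
        "url" -> teradataUrl(host, Seq("TMODE" -> "TERA")),
        "dbtable" -> table,
        "user" -> user,
        "password" -> password,
        "driver" -> "com.teradata.jdbc.TeraDriver"))
      .load()
}
```

Keeping the user name and password out of the URL string also makes it less likely that they end up in logs.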
Created ‎10-31-2016 06:26 AM
Yes, I am able to get the tables from Teradata. It's working fine. Thanks 🙂
Created ‎03-20-2017 12:24 PM
I added the jars in spark-defaults.conf:
spark.driver.extraClassPath /opt/spark/jars/terajdbc4.jar:/opt/spark/jars/tdgssconfig.jar
spark.executor.extraClassPath /opt/spark/jars/terajdbc4.jar:/opt/spark/jars/tdgssconfig.jar
But I get an invalid IP address error for the following command:
scala> val jdbcDF = sqlcontext.load("jdbc", Map("url" -> "jdbc:teradata://****my**:1025/john, TMODE=TERA, user=john, password=pass", "dbtable" -> "john.abc", "driver" -> "com.teradata.jdbc.TeraDriver"))
warning: there was one deprecation warning; re-run with -deprecation for details
2017-03-20.08:14:29.170 TERAJDBC4 ERROR [main] com.teradata.jdbc.jdk6.JDK6_SQL_Connection@1504b493 Connection to 9.26.74.151:1025 Mon Mar 20 08:14:29 EDT 2017 invalid IPv6 address
    at java.net.InetAddress.getAllByName(InetAddress.java:1169)
    at java.net.InetAddress.getAllByName(InetAddress.java:1126)
    at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF$Lookup.doLookup(TDNetworkIOIF.java:222)
    at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF$Lookup.isLiteralIpAddress(TDNetworkIOIF.java:248)
    at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.connectToHost(TDNetworkIOIF.java:335)
    at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.createSocketConnection(TDNetworkIOIF.java:155)
    at com.teradata.jdbc.jdbc_4.io.TDNetworkIOIF.<init>(TDNetworkIOIF.java:141)
    at com.terada
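A possible cause, offered as an assumption and not confirmed anywhere in this thread: the log shows the driver trying to resolve "9.26.74.151:1025" as a single host name, and Java rejects a host string containing a colon as an invalid IPv6 literal. The Teradata JDBC URL takes the port through the DBS_PORT parameter and the default database through the DATABASE parameter instead of the host:port/database form. A minimal sketch of the URL in that shape, reusing the address, port, and database name from the post above:

```scala
object TeradataUrlWithPort {
  // Values taken from the command and error log above.
  val host = "9.26.74.151"   // no ":port" suffix on the host
  val database = "john"
  val port = 1025            // 1025 is also the Teradata default port

  // Port and database go into the parameter list after the slash.
  val url: String =
    s"jdbc:teradata://$host/DATABASE=$database,DBS_PORT=$port,TMODE=TERA"
}
```

The user name and password can then be supplied as separate "user" and "password" entries in the options map passed to Spark, next to "url", "dbtable", and "driver".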
Created ‎11-12-2019 06:44 AM
Hi,
We don't provide any connectors from Teradata to Spark, but if you want to get data from Teradata into Spark, you can probably use any JDBC driver that Teradata provides.
Thanks
AKR
