Hello,
We are getting the exception below while reading a table from Netezza using Spark:
py4j.protocol.Py4JJavaError: An error occurred while calling o287.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 10.0 failed 4 times, most recent failure: Lost task 0.3 in stage 10.0 (TID 13, executor 10): org.netezza.error.NzSQLException: netezza.bad.value
at org.netezza.sql.NzResultSet.getDbosTimestamp(NzResultSet.java:4053)
at org.netezza.sql.NzResultSet.getTimestamp(NzResultSet.java:1578)
at org.netezza.sql.NzResultSet.getTimestamp(NzResultSet.java:1528)
Attaching the full stack trace.
We are using nzjdbc3.jar for the connection between Netezza and Spark, and the read call is below:
input_df = spark.read.format('jdbc').options(url='jdbc:netezza://server_name:port/dbname', user='', password='', driver='org.netezza.Driver', dbtable="(select * from schema_name.table_name limit 100) as t").load()
I am able to print the schema of the DataFrame, but when I perform an action such as show() or count(), it fails on the timestamp column for certain tables. Other tables work fine, and on the affected tables I can select every column except the timestamp columns.
We tried the following workaround:
1) Casting the timestamp column to StringType(): still fails.
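Since the trace fails inside NzResultSet.getTimestamp, a cast done on the Spark side probably runs too late: the driver has already tried to read the value as a timestamp. A variant that could be tried is casting inside the dbtable subquery, so the driver returns text instead. The following is only a minimal sketch: the column names id and created_ts are hypothetical, build_dbtable is an illustrative helper (not part of Spark or the Netezza driver), and it assumes Netezza accepts CAST(... AS VARCHAR(32)) for the affected column.

```python
def build_dbtable(table, columns, timestamp_columns, limit=100):
    """Build a JDBC `dbtable` subquery that casts the given timestamp
    columns to text on the database side (illustrative helper)."""
    select_items = []
    for col in columns:
        if col in timestamp_columns:
            # Assumption: the database can cast this column to VARCHAR(32).
            select_items.append(f"CAST({col} AS VARCHAR(32)) AS {col}")
        else:
            select_items.append(col)
    return f"(select {', '.join(select_items)} from {table} limit {limit}) as t"


query = build_dbtable(
    "schema_name.table_name",
    columns=["id", "created_ts"],        # hypothetical column names
    timestamp_columns={"created_ts"},
)
print(query)

# The string would then replace the original dbtable value (not run here):
# input_df = spark.read.format('jdbc').options(
#     url='jdbc:netezza://server_name:port/dbname', user='', password='',
#     driver='org.netezza.Driver', dbtable=query).load()
```

If the read succeeds this way, the text values can be inspected in Spark to find the rows that the driver rejects as a bad timestamp.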
What would be the fix for this issue?
Thanks