
pyodbc insert kudu and number types


Hey all!

 

CDH 5.10

 

I realize that Kudu plus pyodbc is not a supported combination, but I'm wondering if someone has run into this before. I'm using pyodbc to insert rows into a Kudu table. The DDL, pyodbc call, and error are below.

 

The issue: when the primary key column ("id") is type INT, the insert fails because the parameter gets bound as BIGINT, and Impala refuses to narrow it back to INT.

 

If I define the "id" column as BIGINT, all is fine. The inserted "id" values will never approach BIGINT range, so I'd prefer not to use that type.

 

Is there something wrong below, or is there a way to control the conversion that's attempted?

 

DDL**************

 

CREATE TABLE genomics.pipeline_status(
id INT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,
experiment_id INT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,
PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 3
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'='<server>.choa.org')
;

 

PYODBC CALL**************

try:
    # kerberos ticket already acquired
    conn = pyodbc.connect('DSN=IMPALA_DEV', autocommit=True)
    #print str(conn)
    with conn.cursor() as cur:
        cur.execute("insert into genomics.pipeline_status (id, experiment_id) values (?,?)",
                    (int(id), int(experiment_id)))
except pyodbc.Error as e:
    logger.critical('getting sync rundb results FAILED: ' + str(e) + '\n' + logPath)
finally:
    logger.info('end inserting rows')

 

ERROR**************

getting sync rundb results FAILED: ('HY000', "[HY000] [Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : AnalysisException: Possible loss of precision for target table 'genomics.pipeline_status'.\nExpression 'cast(5 as bigint)' (type: BIGINT) would need to be cast to INT for column 'id'\n (110) (SQLExecDirectW)")
/usr/local/scripts/genomics/logs/syncRunDbResults.py-2017_09_05_17_10_15.log
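
For what it's worth, two things I'm planning to try: casting explicitly in the statement text so the analyzer sees INT (assuming the driver passes the cast through to Impala), and pyodbc's cursor.setinputsizes(), which in newer pyodbc releases lets you declare the SQL type a parameter is bound as. This is an untested sketch, and whether the Cloudera Impala ODBC driver honors setinputsizes is an assumption on my part.

POSSIBLE WORKAROUNDS**************

import pyodbc

id, experiment_id = 5, 42  # placeholder values for the sketch

conn = pyodbc.connect('DSN=IMPALA_DEV', autocommit=True)
with conn.cursor() as cur:
    # Workaround 1: make the cast explicit in the SQL so Impala sees INT
    # no matter how the driver describes the parameter.
    cur.execute("insert into genomics.pipeline_status (id, experiment_id) "
                "values (cast(? as int), cast(? as int))",
                (int(id), int(experiment_id)))

    # Workaround 2 (assumes a pyodbc release that supports setinputsizes
    # and a driver that respects the declared bind type): declare
    # SQL_INTEGER for both parameters before executing.
    cur.setinputsizes([(pyodbc.SQL_INTEGER, 0, 0), (pyodbc.SQL_INTEGER, 0, 0)])
    cur.execute("insert into genomics.pipeline_status (id, experiment_id) values (?,?)",
                (int(id), int(experiment_id)))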