01-23-2022 02:54 AM
Hello everyone, and first of all I apologise for my English. I'm facing a big problem between IBM DataStage and HortonWorks.

Let me first explain IBM DataStage: it's an ETL tool that offers several connection types for importing/exporting data from a number of data source types. I'm trying to load data from IBM DataStage 11.7 into Hive using the Hive connector, but I'm encountering some strange behaviour. The Hive connector has a couple of configuration properties, the most important of which are, as I suspected:
- Record count = 2000
- Batch size = 2000

For a dataset with 8 columns and almost 1,000 rows, the data is inserted into Hive without problems. For a dataset with 200 columns and 20 million rows it behaves strangely: if I load only 10 of the columns it works, but with more than 10 columns the job fails once it reaches a multiple of the batch size (I mean after 2000, 4000 or 20000 rows) with:

'IIS-CONN-DAAPI-00099 Hive_Connector_7,0: java.lang.StringIndexOutOfBoundsException: String index out of bounds: 0 at java.lang.String.substring(String.java:2667)'

I'm sure this error isn't really about a string value, because with Batch size = 2000 the job loads almost 2000 rows into the Hive table, and if I increase the value to 4000 it loads almost 4000 rows before failing. Does anyone know the reason for this error? Thanks a lot.
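To show what I mean about the exception itself, here is a minimal, self-contained Java sketch, not the connector's real code (the class and method names are invented), of how String.substring throws this kind of StringIndexOutOfBoundsException when a column value is shorter than the caller expects, for example an empty string:

```java
// Hypothetical illustration only: mimics some step that strips a
// one-character prefix from every column value before writing it out.
public class SubstringBoundsDemo {
    static String stripPrefix(String columnValue) {
        // When columnValue is "" this throws java.lang.StringIndexOutOfBoundsException,
        // because beginIndex 1 is greater than the string length 0.
        return columnValue.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(stripPrefix("Xabc")); // fine: prints "abc"
        System.out.println(stripPrefix(""));     // throws StringIndexOutOfBoundsException
    }
}
```

In my job, though, the failure point moves with the batch size (2000, 4000, ...), which is why I suspect the Batch size / Record count settings rather than a single bad string value in the data.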