01-23-2022 02:54 AM
Hello everyone, and first of all I apologise for my English. I'm facing a big problem between IBM DataStage and HortonWorks.

Let me first explain IBM DataStage: it's an ETL tool that offers several connection types for importing/exporting data from a number of data source types. I'm trying to load data from IBM DataStage 11.7 into Hive using the Hive connector, but I'm encountering some strange behaviour. The Hive connector has a couple of configuration properties, the most important of which are, as I suspected:
- Record count = 2000
- Batch size = 2000

For a dataset with 8 columns and almost 1,000 rows, the data is inserted into Hive without problems. For a dataset with 200 columns and 20 million rows it behaves strangely: if I load only 10 of the columns it works, but with more than 10 columns the job fails once it reaches a multiple of the batch size (I mean after 2000, 4000 or 20000 rows) with:

'IIS-CONN-DAAPI-00099 Hive_Connector_7,0: java.lang.StringIndexOutOfBoundsException: String index out of bounds: 0 at java.lang.String.substring(String.java:2667)'

I'm sure this error isn't really about a string value, because with Batch size = 2000 the job loads almost 2000 rows into the Hive table, and if I increase the value to 4000 it loads almost 4000 rows before failing. Does anyone know the reason for this error? Thanks a lot.
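To show what I mean about the exception itself, here is a minimal, self-contained Java sketch, not the connector's real code (the class and method names are invented), of how String.substring throws this kind of StringIndexOutOfBoundsException when a column value is shorter than the caller expects, for example an empty string:

```java
// Hypothetical illustration only: mimics some step that strips a
// one-character prefix from every column value before writing it out.
public class SubstringBoundsDemo {
    static String stripPrefix(String columnValue) {
        // When columnValue is "" this throws java.lang.StringIndexOutOfBoundsException,
        // because beginIndex 1 is greater than the string length 0.
        return columnValue.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(stripPrefix("Xabc")); // fine: prints "abc"
        System.out.println(stripPrefix(""));     // throws StringIndexOutOfBoundsException
    }
}
```

In my job, though, the failure point moves with the batch size (2000, 4000, ...), which is why I suspect the Batch size / Record count settings rather than a single bad string value in the data.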