Created 06-16-2016 07:09 PM
The ExecuteSQL processor currently generates data in Avro format. When fetching data from Oracle, every column in the generated Avro comes through as a string. Is there a way to ensure it retains the original column's data type, or an equivalent type?
Created 06-16-2016 08:07 PM
It should be retaining types other than strings... Which column types are being converted to strings in your case?
It looks like it should be retaining BINARY, VARBINARY, LONGVARBINARY, ARRAY, BLOB, CLOB, byte, number, and boolean.
BigInteger and BigDecimal are converted to strings because Avro doesn't have a numeric type that can represent them.
For reference, the code that performs the conversion is here:
Created 06-17-2016 04:03 PM
@Bryan Bende Thanks Bryan. I double-checked all the details. My Oracle table has NUMBER columns. Here are the settings I am using to establish the DBCPConnectionPool. But in my Avro data, all of the NUMBER columns come through as strings. Is there any way to see the details of how NiFi generates the Avro field types?
Created 06-17-2016 04:42 PM
I'm wondering if it is related to the Oracle driver... the code in NiFi that I linked to above shows how we convert the fields to Avro. It starts by getting the value from the ResultSet like this:
final Object value = rs.getObject(i);
Then it checks what type value is, in the case of a number it does this:
else if (value instanceof Number || value instanceof Boolean) { rec.put(i - 1, value); }
So if the Oracle driver returns any subclass of Number, then that is what is given to Avro.
I think a good test would be to write a simple Java program that uses JDBC with the same driver you are using with NiFi, queries your table, and checks what getObject(i) returns from the ResultSet for your NUMBER columns. If it doesn't return Numbers, then that is the problem; if it does, then something else is happening after that.
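Something like this, as a rough sketch (the connection URL, credentials, and table name are placeholders, and it assumes the Oracle JDBC driver is on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class OracleTypeCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details - replace with your own
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM my_table WHERE ROWNUM <= 5")) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    Object value = rs.getObject(i);
                    // Print the column name and the runtime class the driver returned
                    System.out.println(meta.getColumnName(i) + " -> "
                            + (value == null ? "null" : value.getClass().getName()));
                }
            }
        }
    }
}

If the NUMBER columns come back as java.math.BigDecimal, that would explain the strings, since BigDecimal gets converted to a string on the NiFi side.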
Created 06-17-2016 05:00 PM
Actually, was just reading this:
https://docs.oracle.com/cd/B19306_01/java.102/b14188/datamap.htm
One of the rows says:
DEC, DECIMAL, NUMBER, NUMERIC | oracle.sql.NUMBER | java.math.BigDecimal
So if that is true then Oracle is returning BigDecimal for your NUMBER columns, and then NiFi's code is turning BigDecimal and BigInteger into Strings.
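Putting those two pieces together, the effective behavior for a NUMBER column looks roughly like this (a simplified sketch of what's described above, not the actual NiFi code):

final Object value = rs.getObject(i);
if (value instanceof BigDecimal || value instanceof BigInteger) {
    // Avro has no numeric type that can represent these, so they fall back to strings
    rec.put(i - 1, value.toString());
} else if (value instanceof Number || value instanceof Boolean) {
    // Other numerics (Integer, Long, Double, Float) and booleans keep their type
    rec.put(i - 1, value);
} else if (value != null) {
    // Anything else is also written as a string
    rec.put(i - 1, value.toString());
}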
Based on this, it looks like there may be a way to represent BigDecimal in Avro:
https://issues.apache.org/jira/browse/AVRO-1402
https://avro.apache.org/docs/1.7.7/spec.html#Logical+Types
We should look into using the decimal logical type in NiFi.
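For illustration, here is roughly what that could look like with the Avro Java library (this assumes Avro 1.8+, which adds Java support for logical types; the precision and scale are just example values):

import java.math.BigDecimal;
import java.nio.ByteBuffer;
import org.apache.avro.Conversions;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;

public class DecimalLogicalTypeExample {
    public static void main(String[] args) {
        // Attach a decimal(precision=10, scale=2) logical type to a bytes schema
        Schema decimalSchema = LogicalTypes.decimal(10, 2)
                .addToSchema(Schema.create(Schema.Type.BYTES));

        // Encode a BigDecimal into the byte representation Avro expects for that schema
        BigDecimal value = new BigDecimal("12345.67");
        ByteBuffer encoded = new Conversions.DecimalConversion()
                .toBytes(value, decimalSchema, decimalSchema.getLogicalType());

        System.out.println("Encoded " + value + " as " + encoded.remaining() + " bytes");
    }
}

A reader that understands the decimal logical type could then decode those bytes back into a BigDecimal instead of falling back to a string.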
Created 06-17-2016 05:08 PM
Created 06-22-2016 02:34 AM
Thank you @Bryan Bende for creating the JIRA for this issue. Are there any workarounds until it gets resolved in a new release? The only thing I can think of is creating a Hive table with the appropriate column types, then writing a SELECT query that casts to the correct data types and inserts into the new table. But that requires a post-processing step and breaks real-time ingestion.
Created 06-22-2016 01:23 PM
@Shishir Saxena Unfortunately I can't think of many good workarounds using out-of-the-box functionality, but it depends on what you are doing with the Avro after ExecuteSQL. Are you landing it in HDFS, or are you converting it to some other format and doing something else with it?
Created 05-26-2017 08:26 PM
It looks like the JIRA for this issue is still open. Two questions: 1) Where is this on the roadmap to get fixed? 2) Does this affect all Oracle data types, or just the numerics that convert to Java BigDecimal?