Hive 3.1.0 rounds long integers stored as string

New Contributor

UPDATE:

The issue also occurs with a plain Hive SQL INSERT:

INSERT INTO test_ids SELECT "12345678901234567890"

And the result is:

12345678901234567000

Original problem:

I'm using Hive 3.0 streaming to write string columns that hold long numeric IDs. Interestingly, these numbers are being rounded: when stored, the last few digits are saved as 0. This looks like the well-known JSON/JavaScript problem with large numbers, but my data is a string. Here is sample Scala code to reproduce:

import shadehive.org.apache.hadoop.hive.conf.HiveConf
import org.apache.hive.streaming.HiveStreamingConnection
import org.apache.hive.streaming.StrictJsonWriter

// Point the client at the local metastore and the default Hive catalog.
val hiveConf = new HiveConf()
hiveConf.set("hive.metastore.uris", "thrift://localhost:9083")
hiveConf.set("metastore.catalog.default", "hive")
hiveConf.setVar(HiveConf.ConfVars.HIVE_CLASSLOADER_SHADE_PREFIX, "shadehive")

// Each record is written as one strict JSON document.
val writer = StrictJsonWriter.newBuilder()
  .build()

// Open a streaming connection to default.test_ids.
val connection = HiveStreamingConnection.newBuilder()
  .withDatabase("default")
  .withTable("test_ids")
  .withRecordWriter(writer)
  .withHiveConf(hiveConf)
  .withAgentInfo("(my_test)")
  .connect()

connection.beginTransaction()
// rec1: a purely numeric 20-digit id, passed as a JSON string.
val rec1 = "{\"id\" : \"12345678901234567890\"}"
connection.write(rec1.getBytes())
// rec2: the same digits with a non-numeric prefix.
val rec2 = "{\"id\" : \"a12345678901234567890\"}"
connection.write(rec2.getBytes())
connection.commitTransaction()
connection.close()

The value 12345678901234567890 gets stored as 12345678901234567000, but the second value, which starts with the character 'a', is saved correctly. The Hive table is created as follows:

CREATE TABLE `test_ids`(`id` STRING)

Am I doing anything wrong? Or is there a workaround for this issue?

3 REPLIES

Re: Hive 3.1.0 rounds long integers stored as string

New Contributor

I have also posted the question on Stack Overflow, but there is no solution yet: https://stackoverflow.com/questions/54333105/hive-insert-to-string-column-rounds-the-numeric-string

I wonder whether something in my setup is causing this.

Re: Hive 3.1.0 rounds long integers stored as string

New Contributor

Just to be sure, I upgraded the cluster to HDP 3.1.0.0-78, and the issue is still there. Interestingly, if I add a non-numeric character, the value is saved correctly without rounding.

Re: Hive 3.1.0 rounds long integers stored as string

New Contributor

I finally realized that this rounding is not done in Hive, but in the Zeppelin UI, where I run my SELECT query and get the rounded result. There is an open bug for this issue: https://issues.apache.org/jira/browse/ZEPPELIN-1434

The rounding happens only when the value is shown in the UI, so the underlying data is correct.
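
For anyone who runs into the same symptom, here is a minimal Scala sketch of why a purely numeric 20-digit string can lose its last digits in a display layer while a prefixed one does not. It assumes the UI converts numeric-looking cells to a 64-bit floating-point number, as JavaScript does; the `viaDouble` helper and the `NumericDisplayDemo` object are hypothetical names for illustration and are not part of Zeppelin or Hive.

```scala
import scala.util.Try

object NumericDisplayDemo {
  // Hypothetical stand-in for a UI that treats numeric-looking cells as doubles.
  // Returns the exact integer value of the nearest double (fractions are
  // truncated), or the input unchanged when it does not parse as a finite number.
  def viaDouble(cell: String): String =
    Try(new java.math.BigDecimal(cell.toDouble).toBigInteger.toString)
      .getOrElse(cell)

  def main(args: Array[String]): Unit = {
    // A double has a 53-bit significand, so a 20-digit id cannot round-trip.
    println(viaDouble("12345678901234567890"))  // 12345678901234567168
    // A value with a non-numeric prefix never goes through the conversion.
    println(viaDouble("a12345678901234567890")) // a12345678901234567890
  }
}
```

The nearest double to the id is 12345678901234567168. JavaScript prints the shortest digit string that identifies that double, which is 12345678901234567000, matching what I saw in the UI. To check what is really stored, read the column with a client that keeps it as a string (for example beeline), or select an expression such as length(id) that does not depend on how the digits are rendered.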
