Member since 02-11-2019
Posts: 5
Kudos Received: 0
Solutions: 0
04-08-2019
10:05 AM
Hi All,
When I compare the FLOAT and DOUBLE data types, I see that when a value with n digits of precision in the fractional part (e.g. 2.56) is ingested, a subsequent query returns 2.559999942779541 for float, whereas it returns the exact 2-digit value for double, as shown in the example below. Is there a way to get the same behavior for float as for double?
The problem with float is that we need to round the output value explicitly when using it in different applications.
[hadoop-data.default.svc.cluster.local:21000] > create table parquet_table_name (x float, y float) STORED AS PARQUET;
[hadoop-data.default.svc.cluster.local:21000] > insert into TABLE test values(2.56,2.57);
Query: insert into TABLE test values(2.56,2.57)
Query submitted at: 2019-04-08 09:56:00 (Coordinator: https://hadoop-data-0:25000)
Query progress can be monitored at: https://hadoop-data-0:25000/query_plan?query_id=634a267b54e95cc1:2f111d6300000000
Modified 1 row(s) in 17.34s
[hadoop-data.default.svc.cluster.local:21000] > select * from test;
Query: select * from test
Query submitted at: 2019-04-08 09:56:25 (Coordinator: https://hadoop-data-0:25000)
Query progress can be monitored at: https://hadoop-data-0:25000/query_plan?query_id=cf4731ca88960669:972a651a00000000
+-------------------+------+
| x                 | y    |
+-------------------+------+
| 2.559999942779541 | 2.57 |
+-------------------+------+
Fetched 1 row(s) in 2.52s
[hadoop-data.default.svc.cluster.local:21000] >
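The extra digits in the x column are what one would expect from 32-bit IEEE 754 storage in general, not something specific to this table. As a minimal sketch (plain Python, not Impala code; the helper name `as_float32` is made up for illustration), the snippet below pushes 2.56 through a 32-bit float and widens it back to 64 bits, which is roughly what happens when a FLOAT value is displayed with double precision:

```python
import struct


def as_float32(value):
    """Round-trip a 64-bit Python float through 32-bit IEEE 754 storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = as_float32(2.56)

# The nearest 32-bit float to 2.56 is slightly below it; once widened back
# to 64 bits the difference becomes visible in the printed digits.
print(stored)

# Rounding to the known number of decimal places recovers the original value.
print(round(stored, 2))
```

The first printed value should be close to the one in the query output above, and the second should be 2.56. This only illustrates why explicit rounding is needed on the reading side when the column is a 32-bit float.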
Thanks,
Raju
Labels: Apache Impala
02-13-2019
11:54 AM
Thanks so much for the help. I will try out sorting and validate query performance.
02-12-2019
03:52 PM
If I am using dictionary encoding for the column, do I still need to write the data in sorted order in the Parquet file?
02-12-2019
09:38 AM
I am using parquet-cpp to write the Parquet file and then upload it to HDFS using WebHDFS. At the end, I use the "LOAD DATA" command to load the Parquet file into Impala. Is there any option in parquet-cpp to sort the data?
02-11-2019
12:46 PM
Hi All,
I was looking at this blog post, https://blog.cloudera.com/blog/2017/12/faster-performance-for-selective-queries/, which shows that using "SORT BY" during table creation can improve Impala query performance. As mentioned in the blog, this works only if we use "INSERT" or "CREATE TABLE with SELECT". In our use case, we create the Parquet file externally, upload it to HDFS, and then use the Impala "LOAD DATA" command. Is there a way we can use the "SORT BY" mechanism with this model of loading Parquet files?
Thanks,
Raju
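To make the idea behind sorting concrete, here is a toy model in plain Python. It is not parquet-cpp or Impala code: the functions `row_group_stats` and `groups_to_scan`, the group size, and the data are all made up for illustration. It assumes the reader keeps a min/max pair per row group and skips any group whose range cannot contain the value being searched for, and it compares how many groups remain when the same rows are written unsorted versus sorted:

```python
import random


def row_group_stats(values, rows_per_group):
    """Split values into consecutive row groups; return (min, max) per group."""
    groups = [
        values[i:i + rows_per_group]
        for i in range(0, len(values), rows_per_group)
    ]
    return [(min(g), max(g)) for g in groups]


def groups_to_scan(stats, target):
    """Count row groups whose [min, max] range could contain target."""
    return sum(1 for lo, hi in stats if lo <= target <= hi)


random.seed(0)
values = list(range(10_000))
random.shuffle(values)  # rows arrive in arbitrary order

unsorted_stats = row_group_stats(values, 1_000)
sorted_stats = row_group_stats(sorted(values), 1_000)

# Unsorted groups tend to span almost the whole value range, so few can be skipped.
print("unsorted:", groups_to_scan(unsorted_stats, 4_242))
# Sorted groups cover disjoint ranges, so only one can match a single value.
print("sorted:  ", groups_to_scan(sorted_stats, 4_242))
```

Under these assumptions, sorting the rows in the application before the file is written would give a similar narrowing of per-group ranges for externally created files, whichever writer produces them.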
Labels: Apache Impala