Member since: 04-03-2018
Posts: 6
Kudos Received: 0
Solutions: 0
06-22-2018
01:45 PM
Hi @Vinicius Higa Murakami, we have different outputs and I still don't know why. Any clues? Here is my ORC file dump:
File Version: 0.12 with HIVE_13083
18/06/22 14:05:55 INFO orc.ReaderImpl: Reading ORC rows from adl://test.azuredatalakestore.net/apps/hive/warehouse/ustemp.db/special_char/000000_0 with {include: null, offset: 0, length: 9223372036854775807}
18/06/22 14:05:55 INFO orc.RecordReaderImpl: Reader schema not provided -- using file schema struct<col1:varchar(11)>
Rows: 1
Compression: ZLIB
Compression size: 262144
Type: struct<col1:varchar(11)>
Stripe Statistics:
Stripe 1:
Column 0: count: 1 hasNull: false
Column 1: count: 1 hasNull: false min: 1ºTrimestr max: 1ºTrimestr sum: 11
File Statistics:
Column 0: count: 1 hasNull: false
Column 1: count: 1 hasNull: false min: 1ºTrimestr max: 1ºTrimestr sum: 11
Stripes:
Stripe: offset: 3 data: 20 rows: 1 tail: 35 index: 48
Stream: column 0 section ROW_INDEX start: 3 length 11
Stream: column 1 section ROW_INDEX start: 14 length 37
Stream: column 1 section DATA start: 51 length 14
Stream: column 1 section LENGTH start: 65 length 6
Encoding column 0: DIRECT
Encoding column 1: DIRECT_V2
File length: 246 bytes
Padding length: 0 bytes
Padding ratio: 0%
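For reference, a dump like the one above can be produced with Hive's ORC file dump utility, pointed at the file path from the log line:
hive --orcfiledump adl://test.azuredatalakestore.net/apps/hive/warehouse/ustemp.db/special_char/000000_0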
06-21-2018
09:43 AM
Hello @Vinicius Higa Murakami! Yes, you're right: my special character is consuming more bytes than it should for ORC files. With varchar(12) I don't have any problem. I also checked the ascii and md5 functions, and they return the same values for both tables. Did you find any differences with the describe extended command? Thanks
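A quick sketch of the character-vs-byte count, for anyone reproducing this (note: octet_length is only available in Hive 2.2+, so it's illustrative rather than something runnable on my 1.2 install):
select length('1ºTrimestre');       -- 11 characters
select octet_length('1ºTrimestre'); -- 12 bytes, since 'º' takes 2 bytes in UTF-8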
06-20-2018
10:28 AM
Hi @Vinicius Higa Murakami, I'm from Portugal 😄 Thanks for trying to repro the issue. There's something different; can you also check these?
hive --version
Hive 1.2.1000.2.6.1.0-129
describe extended special_char;
| Detailed Table Information | Table(tableName:special_char, dbName:ustemp, owner:hive, createTime:1529404515, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:col1, type:varchar(11), comment:null)], location:adl://azuredatalakestoretest.azuredatalakestore.net/apps/hive/warehouse/ustemp.db/special_char, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{totalSize=246, numRows=1, rawDataSize=95, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=1, transient_lastDdlTime=1529404574}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE) | |
I should mention that we're using Azure Data Lake Store as the data repository for Hadoop.
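Side note: describe formatted special_char; prints the same table metadata in a more readable layout, which might make comparing our outputs easier.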
06-19-2018
12:24 PM
Creating an ORC table with a varchar(11) column and inserting a value with a special character:
create table special_char (col1 varchar(11)) stored as orc;
insert into special_char values('1ºTrimestre');
select * from special_char;
+--------------------+--+
| special_char.col1 |
+--------------------+--+
| 1ºTrimestr |
+--------------------+--+
Creating a text-format table:
create table special_char_text (col1 varchar(11));
insert into special_char_text values('1ºTrimestre');
select * from special_char_text;
+--------------------+--+
| special_char_text.col1 |
+--------------------+--+
| 1ºTrimestre |
+--------------------+--+
Can someone explain why the value in the ORC table's column is truncated?
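A diagnostic sketch that may help show the difference at the byte level (hex and encode are built-in Hive functions available in 1.2):
select col1, length(col1), hex(encode(col1, 'UTF-8')) from special_char;      -- expect the truncated '1ºTrimestr'
select col1, length(col1), hex(encode(col1, 'UTF-8')) from special_char_text; -- expect the full '1ºTrimestre'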
Labels:
- Apache Hive
04-04-2018
11:46 AM
@Vani Is it possible to make LLAP the default namespace? I want to use a JDBC URL without the zooKeeperNamespace parameter.
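For context, this is the kind of URL I'd like to simplify (the ZooKeeper hostnames and namespace value below are just placeholders):
jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-hive2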