Member since: 05-02-2017
Posts: 360
Kudos Received: 65
Solutions: 22
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13343 | 02-20-2018 12:33 PM
 | 1499 | 02-19-2018 05:12 AM
 | 1858 | 12-28-2017 06:13 AM
 | 7135 | 09-28-2017 09:25 AM
 | 12162 | 09-25-2017 11:19 AM
02-13-2017
08:56 PM
Is that so? In relational databases I agree that there would be a significant difference, but in Hadoop I always thought it would read the entire record if the table is stored as TEXTFILE. Doesn't the mapper read the entire record in order to pass it to the reducer?
02-13-2017
07:03 PM
1 Kudo
I have a Hive managed table stored as TEXTFILE. Will there be any difference in performance between select col1, col2 from hive_tabl and select * from hive_tabl? Consider a table with 300 columns and 20 billion rows. Does the select clause impact performance? If so, can ORC storage overcome this, or what would be the best way to store the data so that performance does not depend on the select clause?
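For context, a minimal sketch (table and column names here are hypothetical, not from the original post) of how such a table could be converted to ORC, whose columnar layout means a query touching only two columns reads only those two column streams:

```sql
-- Hypothetical names: hive_tabl is the source TEXTFILE table.
-- CTAS copies its data into an ORC table with the same schema.
CREATE TABLE hive_tabl_orc STORED AS ORC
AS SELECT * FROM hive_tabl;

-- Against ORC, this reads only the col1 and col2 column streams
-- rather than scanning all 300 columns of every row.
SELECT col1, col2 FROM hive_tabl_orc;
```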
Labels:
- Apache Hadoop
02-09-2017
02:20 PM
Thanks Frank. I have tried both ways, but the compression ratio is still the same as the one given in the question.
02-07-2017
08:08 PM
I have a text file of 2.6 GB. I loaded it into a Hive table with TEXTFILE as the storage type. From the text Hive table, I loaded the data into an Avro-based Hive table, which is a Snappy-compressed table, using an INSERT INTO TABLE statement. Please feel free to ask if you need more details.
02-07-2017
08:22 AM
I have created a Hive Avro-based table with Snappy compression. The size of the Avro file is 2628 MB. The data in the Hive Avro-based table without Snappy compression is 2296 MB. I created one more Avro Hive table with Snappy compression and loaded the same data, but there is no big change in the size. Also, when I describe the table, the properties show the compression as 'No'. Please find the table properties below. Table Parameters:
COLUMN_STATS_ACCURATE True
avro.compress SNAPPY
transient_lastDdlTime 1486455066
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.avro.AvroSerDe
InputFormat: org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
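A commonly suggested approach for this situation (a sketch only; the table names below are hypothetical) is to set the Avro output codec at the session level before running the insert, rather than relying on a table property alone:

```sql
-- Session-level settings applied before the insert
-- (table names are hypothetical).
SET hive.exec.compress.output=true;  -- compress the job output
SET avro.output.codec=snappy;        -- codec for the Avro container files

INSERT OVERWRITE TABLE avro_snappy_table
SELECT * FROM text_source_table;
```

Note also that the "Compressed: No" field in the DESCRIBE output is reportedly not a reliable indicator for Avro tables; comparing the resulting file sizes on HDFS is a better check of whether compression actually took effect.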
Labels:
- Apache Hive
01-04-2017
06:45 PM
@Sergey Soldatov Fixed-width files are files in which no delimiters are available. Each column's data occupies a specific length, but with no delimiters.
01-04-2017
02:34 PM
Can we create an external Hive table on top of a fixed-width file? If yes, how can it be done?
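One common approach is RegexSerDe, with one capture group per fixed-width column. A sketch under assumed inputs (the column names, widths, and HDFS path below are hypothetical):

```sql
-- External table over a fixed-width file: each capture group in
-- input.regex maps to one column. Widths and path are hypothetical.
CREATE EXTERNAL TABLE fixed_width_demo (
  name   STRING,
  city   STRING,
  amount STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '(.{10})(.{15})(.{8})')
STORED AS TEXTFILE
LOCATION '/path/to/fixed_width_data';
```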
Labels:
- Apache Hadoop
- Apache Hive
09-25-2016
12:40 PM
3 Kudos
I'm trying to load a Hive table. I have two different sources which have to be loaded into the same target. Is it OK if I run those jobs in parallel?
Labels:
- Apache Hadoop
- Apache Hive
09-07-2016
08:20 AM
1 Kudo
I need to know which performs better in Hive: EXISTS or IN?
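For illustration, a sketch of the two forms plus the LEFT SEMI JOIN rewrite often recommended for this pattern in Hive (the table and column names are hypothetical):

```sql
-- Hypothetical tables: orders(customer_id, ...), customers(customer_id).

-- IN form:
SELECT o.*
FROM orders o
WHERE o.customer_id IN (SELECT c.customer_id FROM customers c);

-- EXISTS form (correlated subquery):
SELECT o.*
FROM orders o
WHERE EXISTS (SELECT 1 FROM customers c
              WHERE c.customer_id = o.customer_id);

-- LEFT SEMI JOIN: Hive's native semi-join form, often cited as
-- the most predictable performer of the three for this pattern.
SELECT o.*
FROM orders o
LEFT SEMI JOIN customers c ON (o.customer_id = c.customer_id);
```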
Labels:
- Apache Hadoop
- Apache Hive
08-11-2016
01:24 PM
I have a PDF file. I have copied the file from the local system to HDFS, but I need to convert the PDF file into a Hive table. Is there any way to do this in Hive? I know we can handle the same in Pig.
Labels:
- Apache Hadoop
- Apache Hive
- Apache Pig