About balavignesh_nag

kumarvaibhav199 · ‎05-12-2017

The number of mappers involved in a job equals the number of input splits, and the number of input splits depends on your block size and file size. If the file size is 256 MB and the block size is 128 MB, the job will involve 2 mappers. @Bala Vignesh N V
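The arithmetic in this reply can be sketched in a few lines of Python. This is only a rough estimate under a simplifying assumption: one split per block-sized chunk of a single file. The function name `estimate_mappers` is hypothetical, and real input formats may deviate (for example through minimum/maximum split size settings or combined splits).

```python
import math


def estimate_mappers(file_size_mb: float, split_size_mb: float = 128) -> int:
    """Estimate map tasks for one file as ceil(file size / split size).

    Simplified model: assumes the split size equals the block size and
    ignores any split-size tuning the job might apply.
    """
    if file_size_mb <= 0 or split_size_mb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(file_size_mb / split_size_mb)


# The example from the post: a 256 MB file with 128 MB blocks.
print(estimate_mappers(256, 128))  # 2

# A file that does not divide evenly needs one extra split for the remainder.
print(estimate_mappers(300, 128))  # 3
```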

balavignesh_nag · ‎05-11-2017

@Ashnee Sharma Good article. Thanks for sharing!

balavignesh_nag · ‎05-05-2017

@Vinay R Glad it helped you. If you think it solves your problem then please accept the answer.

balavignesh_nag · ‎04-29-2017

@Wynner At last!! Yeah!! Thanks a ton!

balavignesh_nag · ‎04-20-2017

Hi @Simran Kaur, there is no way that the first column can be treated as the column name. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file. Even if the structure changes, there is no need to change the old data, and new data can be inserted into the same Hive table.

Points to be noted:
1. An external table has to be used.
2. You might need a stage table before loading into the external Hive table, which should be in Avro/Parquet format.

Steps:
1. Create an external table, stored as Avro/Parquet, with the columns you have.
2. Load the CSV into the stage table, and then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2.

This way no manual work is needed to modify the existing data, as Avro by default will show 'null' for columns which are available in the table but not in the file. The only manual work is to drop and re-create the table DDL. Let me know if you need any details. And if you feel it answers your question, then please accept the answer.
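The behaviour this reply relies on, old files showing 'null' for columns that were added to the table later, can be illustrated with a small Python sketch. It is a conceptual model only and does not use Hive or any Avro library; the column names and the `read_row` helper are hypothetical.

```python
# Hypothetical table definition after re-creating the DDL with an extra column.
TABLE_COLUMNS = ["id", "name", "city"]  # "city" was added later


def read_row(record: dict, columns=TABLE_COLUMNS) -> dict:
    """Project a stored record onto the table's columns.

    Columns present in the table but absent from the record come back as
    None, mirroring the 'null' described in the post.
    """
    return {col: record.get(col) for col in columns}


old_record = {"id": 1, "name": "a"}                  # written before the change
new_record = {"id": 2, "name": "b", "city": "Pune"}  # written after the change

print(read_row(old_record))  # {'id': 1, 'name': 'a', 'city': None}
print(read_row(new_record))  # {'id': 2, 'name': 'b', 'city': 'Pune'}
```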

pminovic · ‎04-16-2017

You can use one of the following:

regexp_replace(s, "\\[\\d*\\]", "");
regexp_replace(s, "\\[.*\\]", "");

The former works only on digits inside the brackets, the latter on any text. Escapes are required because both square brackets are special characters in regular expressions. For example:

hive> select regexp_replace("7 September 2015[456]", "\\[\\d*\\]", "");
7 September 2015
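To try the two patterns outside Hive, here is a rough Python equivalent using the standard `re` module. It is a sketch, not the Hive implementation: Hive uses Java regular expressions, and the helper names below are hypothetical. In a Python raw string a single backslash is enough, whereas the Hive string literal needs the doubled `\\`.

```python
import re


def strip_bracketed_digits(s: str) -> str:
    """Remove bracketed groups that contain only digits, e.g. "[456]"."""
    return re.sub(r"\[\d*\]", "", s)


def strip_bracketed_text(s: str) -> str:
    """Remove everything from the first "[" to the last "]" (greedy match)."""
    return re.sub(r"\[.*\]", "", s)


print(strip_bracketed_digits("7 September 2015[456]"))  # 7 September 2015
print(strip_bracketed_text("7 September 2015[abc]"))    # 7 September 2015

# With several bracket groups the two patterns differ: ".*" is greedy and
# also swallows the text between the first "[" and the last "]".
print(strip_bracketed_digits("a[1] b[2] c"))  # a b c
print(strip_bracketed_text("a[1] b[2] c"))    # a c
```

If a value can contain more than one bracketed group, the digit-only pattern is the safer choice because it cannot match across groups.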

kelvintong718 · ‎04-13-2017

@Michael Young Thanks! That worked like a charm. I still have no idea why it doesn't let me upload using the HDFS UI, so if you know why, I would love to know.

yangjy · ‎04-18-2017

This is a longer regex; it assumes the log_entry contains two IP addresses.

balavignesh_nag · ‎04-02-2017

Thanks @Scott Shaw. Does it mean I have to update the metadata each time after I truncate the partition? Even if the metadata exists it should not display wrong results. In my case select distinct country from mytable should display only India.

sbomma · ‎04-03-2017

Are these tables external tables? In the case of external tables, you would have to manually clean the folders by removing the files and folders that are referenced by the table (using the hadoop fs -rm command).

Last Visited	‎10-03-2019 09:01 AM

Member Since	‎05-02-2017 01:47 PM
Posts	360
Kudos received	64

Cloudera Community

Re: what is the best way to get ftp file to hdfs c...

Re: when yarn communicates with the namenodes when...

Re: [TEZ] are partition, sort and shuffle built-in...

Re: CASE statement Error in Beeline HIVE

Re: hive query to display Week of the timestamp an...

Re: How Files loaded through a Hive table can be d...

Re: Getting error while doing distcp with two secu...

Re: Hive Insert overwrite directory producing junk...

Re: NiFi installation

Re: column names in a hive table

Re: How to remove '[' from a column

Re: Why can't I upload a simple text file onto HDF...

Re: Hive RegEx finding second pattern

Re: Bug: Partioning In Hive

Re: I am new to hive. how to delete Hive default f...