Member since
05-02-2017
360
Posts
65
Kudos Received
22
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 13349 | 02-20-2018 12:33 PM |
 | 1506 | 02-19-2018 05:12 AM |
 | 1862 | 12-28-2017 06:13 AM |
 | 7140 | 09-28-2017 09:25 AM |
 | 12180 | 09-25-2017 11:19 AM |
04-18-2017
06:07 AM
@Pardhu T You might want to check this link. It's related to the ticket here, which you might want to look at.
04-17-2017
07:21 PM
@Saikrishna Tarapareddy Do you mean that 172.16.1.4 should be generated? The other way of doing it is to split the data and then use a regexp on the second string you get from the split. Example: SPLIT('hive:hadoop',':') returns ["hive","hadoop"].
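A minimal HiveQL sketch of that split-then-regexp idea; the table and column names (src_table, raw_col) are placeholders for illustration, not from the thread:

```sql
-- Split on ':' and take the second element, then pull the IP-looking part with a regexp.
-- src_table and raw_col are hypothetical names used only for this sketch.
SELECT
  SPLIT(raw_col, ':')[1] AS second_part,
  REGEXP_EXTRACT(SPLIT(raw_col, ':')[1], '(\\d+\\.\\d+\\.\\d+\\.\\d+)', 1) AS ip_part
FROM src_table;
```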
04-16-2017
11:51 AM
Hi @Constantin Stanca Thanks. At present I'm using the combination of substr and instr only; I just wanted to know if there are any other possibilities. My current solution is substr('28 May 2016[35]', 1, instr('28 May 2016[35]', '[') - 1).
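For reference, the expression from the post as a runnable statement (the sample literal is taken from the thread):

```sql
-- substr/instr approach: keep everything before the first '['.
SELECT substr('28 May 2016[35]', 1, instr('28 May 2016[35]', '[') - 1) AS cleaned;
-- returns '28 May 2016'
```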
04-15-2017
02:36 PM
Hi.. Is there a way to find '[' from a column. I have a field which has a value of '28 May 2016[3]' and I need the output as '28 May 2016' I tried with regexp and split but while using '[' im facing an error. Also please dont suggest substr because my value will change and it will contain like '7 September 2015[456]' , '2 Sep 2014[34]'. Is there any way out in hive?
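One possible regexp-based sketch (not from the thread): '[' starts a character class in a regex, so it has to be escaped in the pattern, which is likely why the regexp attempts errored out. The table and column names (dates_table, raw_date) are placeholders:

```sql
-- Escape '[' as \\[ inside the Hive pattern string to match a literal bracket.
-- dates_table and raw_date are hypothetical names used only for this sketch.
SELECT regexp_replace(raw_date, '\\[[0-9]+\\]', '') AS cleaned_date
FROM dates_table;
-- '28 May 2016[3]'        -> '28 May 2016'
-- '7 September 2015[456]' -> '7 September 2015'
```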
Labels:
- Apache Hadoop
- Apache Hive
04-13-2017
01:57 PM
@Kelvin Tong Copy your file from Windows to the host 127.0.0.1. Then copy it from the local directory to HDFS using hadoop fs -put <local file path> <hdfs path>
04-13-2017
01:07 PM
@Kelvin Tong Are you trying to copy the file from Windows to Hadoop? I think the file you are trying to copy needs to be present on the server on which Hadoop is installed.
04-12-2017
11:06 AM
@Jonathan Samelson Glad you have solved it. Be careful when choosing the block size: if you are going to deal with large chunks of data, it's better to choose a higher value. But if you're just getting to know HDFS with a small amount of data, a smaller block size won't affect the process.
04-11-2017
03:10 PM
@Christopher Daumer Check that the delimiter you used when creating the external table is correct. Also, could you share the sample data and the DDL you used to create the Hive table? I don't think the compression property or the file size is the reason for the issue.
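As an illustration of the delimiter check, a minimal sketch of an external table DDL; the table name, columns, and location are hypothetical, not taken from the thread:

```sql
-- FIELDS TERMINATED BY must match the actual delimiter in the files,
-- otherwise rows come back as NULLs or as a single mangled column.
CREATE EXTERNAL TABLE example_ext (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/example_ext';
```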
04-10-2017
07:17 AM
@Anand Pawar It's kind of tricky here. You can keep the header when storing the file in HDFS. While processing the data for analysis, you should remember that the file contains a header and that it should be skipped, or else it will cause errors. As mentioned above, if you use the skip-header table property, Hive will skip it by default. However, the base data lying underneath the Hive table will still contain the header, which can be used for any further processing. In short: when storing the data you can have a header, but when processing it you should not. If you feel this answers your question, then please accept the answer.
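A minimal sketch of the skip-header property mentioned above; the table name, columns, and location are placeholders:

```sql
-- skip.header.line.count tells Hive to ignore the first line of each file at query
-- time; the header still physically remains in the files under the table location.
CREATE EXTERNAL TABLE example_with_header (
  order_id   INT,
  order_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/data/example_with_header'
TBLPROPERTIES ('skip.header.line.count' = '1');
```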
04-09-2017
06:44 PM
Hi @kerra Bucketing is enforced by default in Hive 2.x and above; on earlier versions, set hive.enforce.bucketing = true;. The main reason for the setting is that it lets the correct number of reducers and the CLUSTER BY column be selected automatically based on the table definition. Otherwise, you would need to set the number of reducers to be the same as the number of buckets, as in set mapred.reduce.tasks = 256;, and add a CLUSTER BY ... clause to the SELECT.
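A small sketch of a bucketed load under those settings; the table and column names (users_bucketed, users_staging, user_id) are hypothetical:

```sql
-- On Hive 1.x this property must be enabled so the insert uses one reducer per bucket;
-- on Hive 2.x bucketing is always enforced.
SET hive.enforce.bucketing = true;

CREATE TABLE users_bucketed (
  user_id INT,
  name    STRING
)
CLUSTERED BY (user_id) INTO 256 BUCKETS
STORED AS ORC;

-- The bucketing column and reducer count are picked up from the table definition,
-- so no manual mapred.reduce.tasks or CLUSTER BY is needed here.
INSERT OVERWRITE TABLE users_bucketed
SELECT user_id, name FROM users_staging;
```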