Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 1987 | 02-12-2020 03:17 PM
 | 1370 | 08-10-2017 09:42 AM
 | 10125 | 07-28-2017 03:57 AM
 | 2161 | 07-19-2017 02:43 AM
 | 1671 | 07-13-2017 11:42 AM
12-03-2016
05:39 PM
@Ram M It seems you are using the Cloudera distribution. Can you try running the same command as the cloudera user, with a small change in the table location, like this: /user/cloudera/hadoop_practise/emp1
12-03-2016
05:35 PM
1 Kudo
So, as I understand it, your Hive insert query spins up two stages processed by two MR jobs, and the last job failed, resulting in inconsistent data in the destination table. A Spark job also consists of stages, but there is lineage between the stages, so if one stage fails after the executor exhausts its retry attempts, the complete job fails.
12-03-2016
05:21 PM
1 Kudo
@Ram M Can you try creating the table like this: create external table emp1(eno int, ename string, sal string) row format delimited fields terminated by '~' location '/tmp'; This will create a table named emp1 on HDFS under the /tmp location. It seems you are following some Cloudera documentation to create the table on an HDP platform, but the cloudera user won't exist on HDP; that's why you are facing the problem.
12-03-2016
05:12 PM
Can you post the command you are using to create the table? Do you have permission on the directory where you are creating the external table?
12-01-2016
11:21 AM
@jayaprakash gadi Why don't you implement a companion method in the Auction class to handle null values?
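A minimal sketch of that idea, assuming a hypothetical Auction shape (the real class and its fields aren't shown in the thread). In Java terms, a companion method becomes a static factory that substitutes safe defaults for null inputs before constructing the object:

```java
// Hypothetical Auction: field names and defaults here are assumptions,
// not the actual class from the thread.
public class Auction {
    final String auctionId;
    final double bid;
    final String bidder;

    private Auction(String auctionId, double bid, String bidder) {
        this.auctionId = auctionId;
        this.bid = bid;
        this.bidder = bidder;
    }

    // Companion-style factory: normalize nulls instead of letting them
    // propagate into downstream processing.
    public static Auction fromRaw(String auctionId, String bid, String bidder) {
        String id = (auctionId == null) ? "unknown" : auctionId;
        double amount = (bid == null) ? 0.0 : Double.parseDouble(bid);
        String who = (bidder == null) ? "anonymous" : bidder;
        return new Auction(id, amount, who);
    }
}
```

Centralizing the null handling in one factory keeps the parsing code out of every call site.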
11-30-2016
06:45 AM
1 Kudo
The RM stores some application state in memory to render the UI; how much it keeps is controlled by yarn.resourcemanager.max-completed-applications, whose default value is 10000, so at any time the RM needs somewhere around ~1 GB of memory to hold these applications. You can try lowering the value of yarn.resourcemanager.max-completed-applications to see the drop in RM heap utilization.
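For reference, the property can be lowered in yarn-site.xml (or via Ambari); the value 1000 below is just an illustrative choice, not a recommendation from the thread:

```xml
<!-- yarn-site.xml: cap how many completed applications the RM keeps in memory -->
<property>
  <name>yarn.resourcemanager.max-completed-applications</name>
  <!-- default is 10000; a lower value reduces RM heap usage -->
  <value>1000</value>
</property>
```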
11-28-2016
09:47 AM
At the session level, just run this command: set hive.default.fileformat=TextFile; and then run your query.
11-28-2016
09:44 AM
You can change your default file format by running: set hive.default.fileformat=TextFile;
11-21-2016
12:12 PM
1 Kudo
1. Once HIVE 2.1 is enabled, can the cluster support HIVE 1.2 and 2.1 simultaneously? Yes, it can.
2. Can HIVE 1.2 and HIVE 2.1 run well in one cluster? Yes; for Hive 2 you need to use the HiveServer2 URL for Interactive mode, while for 1.2 you use the regular URL, connecting to HS2 on port 10000/10010.
3. Can I disable HIVE 2.1 later? Yes.
11-16-2016
07:21 AM
1 Kudo
@Gobi Subramani
The bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets. For an int column, the hash function is the value itself. Say, for example, user_id were an int and there were 25 buckets: each row lands in bucket user_id mod 25, so user_id 26 hashes to 26 mod 25 = 1, user_id 27 to bucket 2, and so on.
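The mod arithmetic above can be sketched as follows (the 25-bucket count and user_id values are just the example's numbers; Hive's actual hash for non-int column types is different):

```java
// Sketch of Hive's bucket assignment: hash_function(column) mod num_buckets.
// For an int column, the hash of the value is the value itself.
public class Bucketing {
    static int bucketFor(int userId, int numBuckets) {
        // floorMod keeps the result non-negative even for negative ids.
        return Math.floorMod(userId, numBuckets);
    }
}
```

So with 25 buckets, user_id 26 is assigned bucket 1, and user_id 50 wraps back to bucket 0.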