Member since: 01-12-2016
123 Posts
12 Kudos Received
1 Solution

My Accepted Solutions

Title | Views | Posted |
---|---|---|
 | 885 | 12-12-2016 08:59 AM |
04-12-2019
01:40 PM
What are an empty bag, an empty tuple, and null in Pig? How are they represented, and which operations generate an empty tuple or an empty bag? I am asking because I am not clear on when to use the datafu.pig.bags.NullToEmptyBag() and EmptyBagToNullFields functions. Can someone give an example to aid my understanding?
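For context, a minimal Pig sketch of where these two DataFu functions usually matter; the jar name/version, file names, and schemas below are assumptions for illustration, not from the original post.
-- Hedged sketch: COGROUP produces EMPTY bags for unmatched keys (never null),
-- while FLATTEN of an empty bag silently drops the record.
REGISTER 'datafu-pig-1.3.3.jar';  -- hypothetical jar name/version
DEFINE EmptyBagToNullFields datafu.pig.bags.EmptyBagToNullFields();
DEFINE NullToEmptyBag       datafu.pig.bags.NullToEmptyBag();

users  = LOAD 'users'  USING PigStorage(',') AS (id:int, name:chararray);
orders = LOAD 'orders' USING PigStorage(',') AS (id:int, amount:double);

grouped = COGROUP users BY id, orders BY id;

-- Convert the empty orders bag into one all-null tuple so users without orders
-- survive the FLATTEN (left-outer-join behaviour).
joined = FOREACH grouped GENERATE FLATTEN(users), FLATTEN(EmptyBagToNullFields(orders));

-- NullToEmptyBag goes the other direction: if a bag-typed field can be null
-- (for example after an outer join on a relation that already stores a bag),
-- replacing it with {} keeps COUNT and other bag operations from failing.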
Labels:
- Apache Pig
01-15-2019
10:08 AM
Hi all, any input on my clarifications? I have faced this scenario one more time.
12-08-2018
02:39 PM
This is from the Hive manual: "Recover Partitions (MSCK REPAIR TABLE). Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (say by using the hadoop fs -put command) or removed from HDFS, the metastore (and hence Hive) will not be aware of these changes to partition information unless the user runs ALTER TABLE table_name ADD/DROP PARTITION commands on each of the newly added or removed partitions, respectively." Doubt: how do I add new partitions directly using hadoop fs -put? Can someone give an example of this? To my knowledge, I only know about ALTER TABLE ... ADD PARTITION and dynamic partitioning.
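For what it's worth, a minimal sketch of what "adding a partition directly with hadoop fs -put" can look like; the table name, partition column, and warehouse path below are hypothetical.
# Create a partition-style directory under the table's HDFS location and copy data in
# (table name 'sales', partition column 'dt', and the path are hypothetical).
hadoop fs -mkdir -p /apps/hive/warehouse/sales/dt=2018-12-01
hadoop fs -put sales_2018-12-01.csv /apps/hive/warehouse/sales/dt=2018-12-01/

# The metastore does not yet know about the new directory, so register it:
hive -e "MSCK REPAIR TABLE sales;"
# (equivalently: ALTER TABLE sales ADD PARTITION (dt='2018-12-01');)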
Labels:
- Apache Hive
11-24-2018
11:37 AM
In the MapReduce flow, which occurs first: shuffling or sorting? To my knowledge, shuffling occurs first and then sorting; correct me if I am wrong. Can anybody explain these two steps? Below is a statement from the Definitive Guide: "MapReduce makes the guarantee that the input to every reducer is sorted by key. The process by which the system performs the sort—and transfers the map outputs to the reducers as inputs—is known as the shuffle."
Labels:
- Apache Hadoop
11-15-2018
08:56 AM
@Aditya Sirna Do you mean that if we are familiar with Python, we can work on Spark? Is Python alone sufficient for real-time projects, or do I need to learn Scala or Java?
11-15-2018
05:47 AM
Could anybody guide me on the learning path for Spark?
I am familiar with Hadoop, Hive, Pig, Sqoop, Oozie, Python, and HBase, but I do not know much about Java.
Do I need to learn both Java and Scala to start with Spark?
I am completely confused about where to start.
Labels:
- Apache Spark
10-14-2018
08:49 AM
How do I apply "rows between unbounded preceding and unbounded following" in Pig? Currently, I am using the code below to calculate the cumulative sum:
A = load 'T' AS (si:chararray, i:int, d:long, f:float, s:chararray);
C = foreach (group A by si) {
Aord = order A by d;
generate flatten(Stitch(Aord, Over(Aord.f, 'sum(float)')));
}
D = foreach C generate s, $5;
This is equivalent to the SQL statement:
select s, sum(f) over (partition by si order by d) from T;
I know I need to modify the Over(Aord.f, 'sum(float)') clause, but I am not sure what exactly I need to change.
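One hedged workaround, reusing the relation names from the post: a frame of unbounded preceding to unbounded following is simply the total over the whole partition, so for an aggregate like SUM it can be computed over the ordered bag directly instead of through Over (check the Over UDF javadoc of your piggybank version for native window-frame arguments).
-- Hedged sketch: the whole-partition frame for SUM equals the group total,
-- attached to every row of the group.
C = foreach (group A by si) {
    Aord = order A by d;
    generate flatten(Aord), (float)SUM(Aord.f) as total_f;
}
D = foreach C generate $4 as s, $5 as total_f;  -- $4 is s, $5 is the partition-wide sum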
Labels:
- Apache Pig
10-13-2018
05:13 AM
I have set the number of reducers to 2, but Hive is still executing with 1. Can anybody help with this?
set hive.exec.reducers.max=2;
hive (default)> insert overwrite directory '/input123456'
> select count(*) from partitioned_user;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201810122125_0003, Tracking URL = http://ubuntu:50030/jobdetails.jsp?jobid=job_201810122125_0003
Kill Command = /home/naresh/Work1/hadoop-1.2.1/libexec/../bin/hadoop job -kill job_201810122125_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-10-12 21:36:24,774 Stage-1 map = 0%, reduce = 0%
2018-10-12 21:36:32,825 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 4.12 sec
2018-10-12 21:36:41,919 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 4.12 sec
2018-10-12 21:36:42,926 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 6.38 sec
MapReduce Total cumulative CPU time: 6 seconds 380 msec
Ended Job = job_201810122125_0003
Moving data to: /input123456
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 6.38 sec HDFS Read: 354134 HDFS Write: 5 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 380 msec
OK
_c0
Time taken: 37.199 seconds
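A hedged observation on the log above: hive.exec.reducers.max only caps the reducer count, and a bare COUNT(*) needs a single final reducer to produce its one-row result anyway. To pin an exact number, the log itself points at mapred.reduce.tasks; a sketch where the extra reducers can actually be used (the gender column is hypothetical):
-- Hedged sketch: force an exact reducer count rather than only capping it.
SET mapred.reduce.tasks=2;
INSERT OVERWRITE DIRECTORY '/input123456'
SELECT gender, COUNT(*)      -- 'gender' is a hypothetical column of partitioned_user
FROM partitioned_user
GROUP BY gender;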
Labels:
- Apache Hive
10-11-2018
02:24 AM
How can I get the list of functions available in any jar file? Let us say I have piggybank.jar; it contains Reverse, UnixToISO(), etc. Is there a command to list the functions available in a jar file rather than searching Google for them?
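Since every Pig UDF is just a Java class inside the jar, listing the class entries of the archive gets most of the way there; a minimal sketch using standard JDK tooling (the jar file name is taken from the question):
# List every class packaged in the jar; each UDF corresponds to one class file.
jar tf piggybank.jar | grep '\.class$'

# Narrow down to, say, the string evaluation functions (Reverse lives here)
jar tf piggybank.jar | grep 'evaluation/string'

# unzip -l works as well if the jar tool is not on the PATH
unzip -l piggybank.jar | grep '\.class$'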
Labels:
- Apache Pig
03-09-2017
12:55 PM
I am new to Avro, and any input on the clarifications below is appreciated. These questions came up while reading an article on Avro with Hive (Article Link).
1) We are not mentioning an .avsc schema file here, but I have read other articles where an .avsc file is specified in the table properties. Is that mandatory or optional?
hive> CREATE EXTERNAL TABLE user_profile (id BIGINT, name STRING, bday STRING) STORED AS avro;
2) For the changes below, do I need to recreate the .avsc file each and every time? Is it a one-time activity, or do I need to recreate the .avsc file for every step (a, b, c)? You can see that a number of operations can be allowed as a simple requirement change:
a) Adding a new column to a table (the "country" column in the 2nd file)
b) Dropping a column from a table (the "id" column in the 3rd file)
c) Renaming a column (the "birthday" column in the 4th file)
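For reference, a hedged sketch contrasting the two styles the question is about; the second table name and the HDFS path are hypothetical. With STORED AS AVRO, Hive derives the Avro schema from the declared columns, so an explicit .avsc is optional; avro.schema.url (or avro.schema.literal) is how a table is pinned to an external schema file instead.
-- Style 1: no .avsc file; Hive generates the Avro schema from the column list.
CREATE EXTERNAL TABLE user_profile (id BIGINT, name STRING, bday STRING)
STORED AS AVRO;

-- Style 2: columns driven by an external .avsc file (path is hypothetical).
CREATE EXTERNAL TABLE user_profile_from_avsc
STORED AS AVRO
TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/user_profile.avsc');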
Tags:
- article
- Avro
- Data Processing
- data-processing
Labels:
- Apache Hive
03-07-2017
08:37 AM
Thanks for the comments. I will definitely do it, starting from this post.
03-03-2017
08:48 AM
Thanks for the input. What is the problem with my relation C? STRSPLIT generates a tuple as output, and here it will consist of two fields. (a1:chararray, a1of1:chararray) is also a tuple, since it is enclosed in parentheses and likewise consists of two fields.
03-02-2017
02:23 PM
My input file, a.txt, is below:
aaa.kyl,data,data
bbb.kkk,data,data
cccccc.hj,data,data
qa.dff,data,data
A = LOAD '/pigdata/a.txt' USING PigStorage(',') AS (a1:chararray, a2:chararray, a3:chararray);
How do I resolve the error below, and what is the reason for it?
ERROR:
C = FOREACH A GENERATE STRSPLIT(a1,'\\u002E') as (a1:chararray, a1of1:chararray),a2,a3;
2017-02-03 00:45:42,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable schema: left is "a1:chararray,a1of1:chararray", right is ":tuple()"
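For context, a hedged sketch of one common fix: STRSPLIT returns a single tuple-typed field, so the two-field AS schema cannot be applied to it directly; FLATTEN the tuple first and the AS clause can then name the resulting two fields.
-- Hedged sketch: flatten the tuple produced by STRSPLIT before naming its fields.
C = FOREACH A GENERATE
        FLATTEN(STRSPLIT(a1, '\\u002E')) AS (a1:chararray, a1of1:chararray),
        a2, a3;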
Labels:
- Apache Hadoop
- Apache Pig
02-15-2017
03:05 PM
From the Pig textbook:
Tuple:
A = load 'input' as (t:tuple(x:int, y:int));
B = foreach A generate t.x, t.$1;
Bag: when you project fields in a bag, you are creating a new bag with only those fields:
A = load 'input' as (b:bag{t:(x:int, y:int)});
B = foreach A generate b.x;
This will produce a new bag whose tuples have only the field x in them.
How can I get more information about this, i.e., referencing a relation by using the dot operator (out_max.ninetieth)?
I cannot find anything about it in the Pig manual; any input on this is appreciated.
02-15-2017
01:04 PM
Thanks @Artem Ervits. The plays relation does not contain the alias ninetieth, and I understand it is generated in step 4. How can we use ninetieth in step 5 when plays does not contain that alias? In trim_outliers = foreach plays generate ... (here we need to select an alias from plays), can I select an alias from any other relation while using a foreach ... generate statement?
02-15-2017
09:11 AM
I am new to Pig, and any input is really appreciated. The plays relation does not contain a field called ninetieth, so how can we use out_max.ninetieth in step 5?
Labels:
- Apache Hadoop
- Apache Pig
02-09-2017
12:42 PM
The process for implementing SCD Type 1 is: identify new records and insert them into the dimension table, then identify changed records and update the dimension table. How do I implement SCD Type 1 in Hive? I have found the following pieces but am not sure how to assemble a complete solution:
a) We can generate a surrogate key using datafu.pig.hash.SHA() in Pig, or in Hive using row_number (http://www.remay.com.br/blog/hdp-2-2-how-to-create-a-surrogate-key-on-hive/).
b) For change capture, I can use a full outer join to identify new and updated records.
c) To use UPDATE statements in Hive, the table must have the transactional property enabled and use the ORC format.
I want to do this either in Hive or Pig. The source and target are as below.
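To make item b) concrete, here is a hedged HiveQL sketch of how SCD Type 1 is often expressed without UPDATE statements: rebuild the dimension with a FULL OUTER JOIN so the staged value wins wherever it exists. All table and column names are hypothetical.
-- Hedged SCD Type 1 sketch: staged rows overwrite matching dimension rows,
-- new keys are inserted, untouched rows are carried over unchanged.
INSERT OVERWRITE TABLE dim_customer
SELECT
    COALESCE(s.customer_id, d.customer_id) AS customer_id,
    COALESCE(s.name,        d.name)        AS name,
    COALESCE(s.city,        d.city)        AS city
FROM dim_customer d
FULL OUTER JOIN stg_customer s
  ON d.customer_id = s.customer_id;
-- If your Hive version objects to reading and overwriting the same table in one
-- statement, write into a staging copy of the dimension and swap it in instead.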
Tags:
- hadoop
- Hadoop Core
- HDFS
- Hive
- Pig
Labels:
- Apache Hadoop
- Apache Hive
- Apache Pig
01-24-2017
09:32 AM
Hi @cduby, thanks for the input. For loading the base table we are following: sqoop import --> create external table --> load the data into base_table (ORC format) from the external table. One small clarification on this: when I run sqoop import, it creates many part-m* files plus a _SUCCESS file and other files, since a MapReduce job is triggered. The external table should contain only the data from the part-m* files, so do I need to delete the other files (_SUCCESS, job.xml), or is there another option to make the external table skip them?
sqoop import --connect jdbc:teradata://{host name}/Database=retail
--connection-manager org.apache.sqoop.teradata.TeradataConnManager --username dbc
--password dbc --table SOURCE_TBL --target-dir /user/hive/base_table -m 1
CREATE TABLE base_table (
id STRING,
field1 STRING,
modified_date DATE)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS ORC;
01-23-2017
04:11 PM
This is a superb article, and I have the following clarifications: https://community.hortonworks.com/articles/29410/four-step-strategy-for-incremental-updates-in-apac.html
Clarification 1: base_table is not an external table, and how the data is loaded into base_table during the first run is not clear. Could you please provide input on this? We are not using any LOAD or INSERT INTO statement. I think that after running the statement below, we have to manually load the data from the files in /user/hive/incremental_table/incremental_table into base_table using a LOAD DATA statement:
sqoop import --connect jdbc:teradata://{host name or ip address}/Database=retail --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username dbc --password dbc --table SOURCE_TBL --target-dir /user/hive/incremental_table -m 1
Clarification 2: During the first run only base_table will be loaded, and there is no need to implement the reconcile, compact, and purge processes since we do not have incremental data yet. Please correct me if I am wrong.
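If it helps, a hedged sketch of what that manual first-run load could look like, assuming the paths from the article and a base_table whose file format matches the sqoop output:
-- Hedged sketch: LOAD DATA INPATH moves the sqoop output files into the table.
LOAD DATA INPATH '/user/hive/incremental_table/incremental_table'
INTO TABLE base_table;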
Labels:
- Apache Hive
- Apache Sqoop
01-23-2017
04:04 PM
base_table is not an external table, and how the data is loaded into base_table during the first run is not clear. Could you please provide input on this? We are not using any LOAD or INSERT INTO statement. I think that after running the statement below, we have to manually load the data from the files in
/user/hive/incremental_table/incremental_table into base_table:
sqoop import --connect jdbc:teradata://{host name or ip address}/Database=retail --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username dbc --password dbc --table SOURCE_TBL --target-dir /user/hive/incremental_table -m 1
01-13-2017
03:51 PM
a) Thanks for the input, @Artem Ervits; your input is always appreciated. I will go for a coordinator job with time- and data-availability-based scheduling, but I still have the following clarifications.
Clarification 1: Suppose I use the command below to trigger the coordinator job. Is running this command a one-time activity in production, since the coordinator will trigger based on its frequency on day 2, or do I need to run the command on day 2 as well? Please correct me if I am wrong.
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config /path/to/job.properties -run
<coordinator-app name="my_first_job" start="2014-01-01T02:00Z"
end="2014-12-31T02:00Z" frequency="${coord:days(1)}"
xmlns="uri:oozie:coordinator:0.4">
Clarification 2: How do I implement conditional logic in the Oozie workflow so that if there is new data the actions run, and otherwise it proceeds to the end action?
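Regarding clarification 2, a hedged sketch of how such a check is commonly wired into a workflow: a decision node using the Oozie fs EL functions. The newDataDir property and the action names are hypothetical.
<!-- Hedged sketch: branch to the actions only when the input directory has data. -->
<decision name="check-new-data">
    <switch>
        <case to="process-data">${fs:exists(newDataDir) and fs:dirSize(newDataDir) gt 0}</case>
        <default to="end"/>
    </switch>
</decision>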
01-12-2017
09:43 AM
Hi @Santhosh B Gowda, thanks for the input.
a) My question is how to run the command below in production, since we should not run it manually:
oozie job --oozie http://host_nameofoozieserver:8080/oozie -D oozie.wf.application.path=hdfs://namenodepath/pathof_workflow_xml/workflow.xml -run
b) I know about the coordinator, but at this point I am not sure whether I should use data or time triggers; currently we are running Flume continuously.
01-12-2017
08:57 AM
Currently we are running an Oozie workflow (consisting of Hive, Pig, and Sqoop actions) with the command below in the dev environment. In the production environment we should not run it manually. Can I create a shell script for the command below and run that script via the crontab scheduler? Is this approach correct, and if so, what should the timing of the script be? If not, what is the right approach for running the command below in production?
oozie job --oozie http://host_nameofoozieserver:8080/oozie -D oozie.wf.application.path=hdfs://namenodepath/pathof_workflow_xml/workflow.xml -run
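For illustration only, a hedged sketch of the cron-wrapper idea being asked about; the script path, log location, and schedule are hypothetical, and a coordinator (discussed in the other posts of this thread) is usually the preferred approach.
#!/bin/bash
# run_workflow.sh - hypothetical wrapper around the Oozie CLI submission.
oozie job --oozie http://host_nameofoozieserver:8080/oozie \
    -D oozie.wf.application.path=hdfs://namenodepath/pathof_workflow_xml/workflow.xml \
    -run >> /var/log/oozie_workflow_submit.log 2>&1
A crontab entry such as 0 1 * * * /path/to/run_workflow.sh would then submit the workflow once a day at 01:00.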
Labels:
- Apache Oozie