Member since: 07-15-2014
Posts: 57
Kudos Received: 9
Solutions: 6

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6733 | 06-05-2015 05:09 PM |
| | 1625 | 12-19-2014 01:03 PM |
| | 3163 | 12-17-2014 08:23 PM |
| | 8102 | 12-16-2014 03:07 PM |
| | 13603 | 08-30-2014 11:14 PM |
12-09-2015
07:32 AM
The role-level APIs carry the state, but you're querying at the service level. Use the role IDs returned by the service-level call to then query the roles directly.
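For concreteness, here is a minimal sketch of that two-step lookup against the Cloudera Manager REST API; the host, port, API version, credentials, and the cluster/service names below are placeholders, so adjust them to your deployment.

```bash
# Minimal sketch -- hypothetical CM host, credentials, cluster and service names.
CM=http://cm-host:7180/api/v10
AUTH=admin:admin

# Step 1: list the roles of the service to obtain their role names (IDs).
curl -s -u "$AUTH" "$CM/clusters/Cluster1/services/hdfs1/roles" \
  | jq -r '.items[].name' \
  | while read -r role; do
      # Step 2: query each role directly; the role-level object carries the state.
      curl -s -u "$AUTH" "$CM/clusters/Cluster1/services/hdfs1/roles/$role" \
        | jq '{name: .name, roleState: .roleState, healthSummary: .healthSummary}'
    done
```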
01-17-2015
03:59 PM
1 Kudo
Cloudera doesn't have an official position on this tool yet. The project page does mention that Hadoop 2.0 is not supported, so I doubt it will work with CDH 5.2.1. You're of course welcome to provide feedback if you do try it. Note that Cloudera Manager does allow you to monitor several metrics across the cluster, so if you have specific questions about tasks, we may be able to assist without an extra tool.
01-06-2015
11:56 PM
I looked at the URL you gave me and it's still very thin for service providers. It just says: "Service providers deliver Apache Hadoop-based solutions and services to end users interested in leveraging Cloudera's training resources, support and brand."
* How is the support offered to a partner any different from the normal Cloudera support I would get if I purchased the Data Hub Edition?
* Are there regular training events, or just one training given at the time of signing the partner agreement? If there are periodic trainings, is there a schedule for them?
* Can you elaborate on what "brand" means above?
12-29-2014
06:22 PM
When managing multiple clusters with the same instance of Cloudera Manager, each cluster can be on a different version and can be upgraded independently of the others. While having all three environments managed by the same Cloudera Manager seems attractive, it doesn't offer the flexibility of having three truly independent clusters, each with its own Cloudera Manager instance.
12-24-2014
02:57 PM
I had to delete the directories in HDFS manually. It could be that the kite-dataset delete command only does a logical delete, i.e. it only removes the metadata. In any case, running kite-dataset delete and then deleting the data manually in HDFS works for me.
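For reference, this is roughly the sequence I mean; the dataset name and HDFS path below are hypothetical examples, not the actual ones from my cluster.

```bash
# Hypothetical dataset name and path -- substitute your own dataset and location.
kite-dataset delete events                      # removes the dataset (metadata)
hdfs dfs -rm -r /user/hive/warehouse/events     # then remove the leftover files in HDFS
```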
12-19-2014
01:03 PM
1 Kudo
I figured it out. There is a directory called /dfs; I deleted this directory on each machine, retried, and it worked perfectly.
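In case it helps anyone else, this is roughly what I mean; the hostnames below are placeholders, and deleting /dfs wipes the old HDFS data, so only do this on a cluster you are rebuilding anyway.

```bash
# Hypothetical hostnames; assumes passwordless SSH. This destroys the old HDFS
# data directories, so only run it on a cluster you are rebuilding.
for host in node1 node2 node3; do
  ssh "$host" 'sudo rm -rf /dfs'
done
```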
12-17-2014
08:23 PM
OK, I resolved the problem. On hd3work the call to python -c 'import socket; socket.gethostbyname(socket.getfqdn())' was failing. I troubleshot for a very long time and could not figure out why the problem was occurring, so I deleted the network adapter of the VM and recreated it. Then it suddenly started to work.
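For anyone hitting the same thing, these are the checks I would run on each host; they should all resolve to the host's real IP rather than 127.0.0.1.

```bash
# Hostname-resolution sanity checks; run on each cluster host (hd3work here).
hostname -f                                   # the FQDN the host reports
getent hosts "$(hostname -f)"                 # should map to the real IP, not 127.0.0.1
python -c 'import socket; print(socket.gethostbyname(socket.getfqdn()))'
```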
12-16-2014
03:07 PM
This command did not solve the issue, so I deleted my cluster and rebuilt it using the right IP addresses.
07-29-2014
04:35 PM
Thank you so much. Your answer is absolutely correct. I went to each server and ran:

* nn1: service zookeeper-server init --myid=1 --force
* nn2: service zookeeper-server init --myid=2 --force
* jt1: service zookeeper-server init --myid=3 --force

Earlier I had chosen an ID of 1 on every machine. I also corrected my zoo.cfg to make sure the entries were right. Now it works and I am able to run sudo -u hdfs hdfs zkfc -formatZK. Thank you so much!
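For completeness, the zoo.cfg entries that have to line up with those --myid values look roughly like this; I'm assuming a package-based install where zoo.cfg lives at /etc/zookeeper/conf/zoo.cfg and the default ZooKeeper peer/election ports.

```bash
# server.N must match the --myid chosen on each host; default ports assumed.
cat >> /etc/zookeeper/conf/zoo.cfg <<'EOF'
server.1=nn1:2888:3888
server.2=nn2:2888:3888
server.3=jt1:2888:3888
EOF
```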
07-27-2014
06:43 AM
1 Kudo
(1) The "driver" part of run/main code that sets up and submits a job executes where you invoke it. It does not execute remotely. (2) See (1), cause it invalidates the supposition. But for the actual Map and Reduce code execution instead, the point is true. (3) This is true as well. (4) This is incorrect. All output "collector" received data is stored to disk (in an MR-provided storage termed 'intermediate storage') after it runs through the partitioner (which divides them into individual local files pertaining to each target reducer), and the sorter (which runs quick sorts on the whole individual partition segments). (5) Functionally true, but it is actually the Reduce that "pulls" the map outputs stored across the cluster, instead of something sending reducers the data (i.e. push). The reducer fetches its specific partition file from all executed maps that produced one such file, and merge sorts all these segments before invoking the user API of reduce(…) function. The merge sorter does not require that the entire set of segments fit into memory at once - it does the work in phases if it does not have adequate memory. However, if the entire fetched output does not fit into the alloted disk of the reduce task host, the reduce task will fail. We try a bit to approximate and not schedule reduces on such a host, but if no host can fit the aggregate data, then you likely will want to increase the number of reducers (partitions) to divide up the amount of data received per reduce task as a natural solution.