About Roland

Roland · ‎11-30-2015

Jadair, your suggestion is one option on our list. We were hoping to avoid reloading large amounts of data between uses. I will look at your references. Amazon's EMR service provides the capability I am looking for. However, I very much wanted to use Cloudera on EC2 for our solution, which is why I included these tests in our Proof of Concept. Thanks for the valuable input you have provided over the last couple of weeks. - rd

Roland · ‎11-30-2015

I have built a 6 node cluster for a Proof of Concept on AWS using Director. The intent of this set of tests is to determine whether we can build a Cloudera cluster with Director, run a series of jobs, use CM and EC2 to stop all the servers when done, and then restart them when further jobs need to be executed, i.e., a predefined lab cluster that is charged only when in use. I am having problems with corrupt HDFS files on restart, and I am trying to pinpoint the right place to look for clues. Is it a 'connection' between Director and the cluster (something defined when the cluster was built), or something about using EBS and restarting the servers? Any ideas or places to look for clues would be helpful.

Here is some background:

Setup: 1 Director, 1 CM, 1 Master, 3 Workers and 1 Gateway. I am using CM/CDH 5.4.7. CM and Master are m3.xlarge, Gateway and Director are m3.medium, and Workers are m3.large. I built a new AMI that is based on ami-8767d1ec, but that I have upgraded to Java 1.8_45. The root drive has been adjusted to use 250 GB EBS.

Problem: The original cluster is built and starts fine. CM shows 100% green. We can add data, run MapReduce jobs on YARN, examine results, etc. I then stop the CM cluster services, stop the CM Management services, and go into EC2 to stop all servers. I then restart the machines, go into the CM server, and restart the CM cluster and management services. All services come up fine initially with the exception of HDFS, which starts with an error on the canary test. Eventually HDFS goes into safe mode, Hive also triggers a metastore canary test error, both services are flagged red, and the NameNode goes down. Leaving safe mode and restarting the NameNode brings me back to the original error, but eventually HDFS finds itself back in safe mode without a NameNode.

Here are some messages from the NameNode UI:

Safe mode is ON. The reported blocks 193 needs additional 192 blocks to reach the threshold 0.9990 of total blocks 385. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
878 files and directories, 385 blocks = 1263 total filesystem object(s).
Heap Memory used 166.71 MB of 990.75 MB Heap Memory. Max Heap Memory is 990.75 MB.
Non Heap Memory used 46.81 MB of 47.38 MB Commited Non Heap Memory. Max Non Heap Memory is 130 MB.

I examined the missing blocks and the majority are blocks that were put there by CDH: Oozie directories, etc. Any ideas on how I might track this down? Thanks - rd
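The safe-mode message quoted above can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes the required block count is the threshold fraction of the total rounded up to a whole block, which may not be exactly how the NameNode rounds internally.

```python
import math


def blocks_still_needed(reported, total, threshold):
    """Estimate how many more blocks must be reported before safe mode can end.

    Assumption (not taken from the post): the required count is
    total * threshold rounded up to a whole block.
    """
    required = math.ceil(total * threshold)
    return max(0, required - reported)


# Figures taken from the NameNode message in the post.
needed = blocks_still_needed(reported=193, total=385, threshold=0.999)
fraction_reported = 193 / 385
print(needed, round(fraction_reported, 4))
```

With the figures from the post, only about half of the 385 blocks had been reported by the three live DataNodes, so the question of where the other half went (for example, which volumes the DataNodes were writing to before the instances were stopped) seems a reasonable place to start looking.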

Roland · ‎11-29-2015

Thank you. I took your advice and tweaked the AMI.

Roland · ‎11-21-2015

I have been completing a set of prototypes using Director to build a cluster on AWS (so this is all dev work). I have defined a private AMI based on the Cloudera ami-8767d1ec (from the Cloudera GitHub) that I am using for the cluster nodes (basically I installed Java 1.8 and loaded some additional software). Can I use rootVolumeSizeGB to expand the disk size defined in the AMI, or is the AMI definition the maximum (i.e., do I need to tweak the AMI to add disk space)? Also, how can I manage the size DFS uses for its data directories? Does Director/CM have keywords or commands for defining that at build time that I can include in my Director AWS script? I am using the batch script mechanism to create my cluster. Thanks - rd

Roland · ‎10-25-2015

Thanks for mentioning that. I checked and you were right: I had CDH: 5. I will change it to 5.4.7. - rd

Roland · ‎10-21-2015

Most times my cluster comes up without a hitch, but since yesterday I have had intermittent problems with the CDH repository. I am using a variation on aws.reference.conf, and in two of four runs over the last two days I have received the following message, both as output and in the logs:

* java.lang.IllegalArgumentException: CDH=5.4.7-1.cdh5.4.7.p0.3 not found in list of all parcels. ...

Here is the entry in my config specifying the parcel:

parcelRepositories: ["http://archive.cloudera.com/cdh5/parcels/5.4.7/"]

Should I be specifying a different repository, or should my version entry be different to be more foolproof? Thanks - rd
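One way to investigate a "not found in list of all parcels" error is to compare the requested version against the parcel names a repository advertises. The sketch below assumes the repository publishes a manifest.json containing a "parcels" list whose entries have a "parcelName" field of the form PRODUCT-VERSION-DISTRO.parcel. The helper name and the sample manifest are hypothetical and are not real repository data.

```python
import json

# Hypothetical manifest contents, shaped like a parcel repository's manifest.json.
SAMPLE_MANIFEST = """
{
  "parcels": [
    {"parcelName": "CDH-5.4.7-1.cdh5.4.7.p0.3-el6.parcel"},
    {"parcelName": "CDH-5.4.7-1.cdh5.4.7.p0.3-precise.parcel"}
  ]
}
"""


def parcel_versions(manifest_text, product="CDH"):
    """Return the set of full version strings listed for a product."""
    manifest = json.loads(manifest_text)
    versions = set()
    for entry in manifest.get("parcels", []):
        name = entry.get("parcelName", "")
        if not name.startswith(product + "-") or not name.endswith(".parcel"):
            continue
        # Strip "PRODUCT-" and ".parcel", then split off the trailing distro tag.
        body = name[len(product) + 1 : -len(".parcel")]
        version, _, _distro = body.rpartition("-")
        if version:
            versions.add(version)
    return versions


available = parcel_versions(SAMPLE_MANIFEST)
print("5.4.7-1.cdh5.4.7.p0.3" in available)  # full version string
print("5" in available)                      # a bare major version is not listed
```

Under these assumptions, a version entry is only found when it corresponds to what the repository at that URL actually lists, so checking that the version entry and the repository URL agree is a cheap first step.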

Roland · ‎10-14-2015

Jadair, I have been able to convert the external HOCON callback references to actual service roles in my config file. I found the role types by looking at the roles assigned in the original cluster using the Director UI. I was semi-successful with this approach: some of my role type guesses were wrong. Is there an easy way to find the valid role types that Director looks for in the config for these roles? Here are the errors I am receiving, including the role types that are being rejected:

* Invalid role type(s) specified. Ignored during role creation: SPARK_ON_YARN: HISTORYSERVER HIVE: HIVEMETASTORESERVER HUE: HUESERVER OOZIE: OOZIESERVER HDFS: HTTPFSSERVER YARN: JOBHISTORYSERVER ... done
* Automatically configuring services and roles ... done
* Applying custom configurations of services ... done
* Configuring credentials for S3 access ... done
* Creating Hive Metastore Database ............................ done
* Calling firstRun on cluster Eastern-BD-Cluster ... done
* Suspended due to failure ...
Logging out...

Thanks - rd

Roland · ‎10-12-2015

Great. I will do some playing with that today. Thanks for the reply. - rd

Roland · ‎10-11-2015

I am building a POC cluster on AWS and would like to spread some CM roles across specific workers in my cluster. In the aws.reference.conf file, I see the sections for master and workers. Is there a way in the workers section to single out specific workers so that I can assign different roles to each? I have googled it and can't find a document that lays out the choices for the config file's workers section for my use case. Thanks - rd

Last Visited	‎06-14-2017 10:52 AM

Member Since	‎03-15-2015 05:23 AM
Posts	16
Kudos received	3

Cloudera Community

Re: Parcel Repository Error

Re: Corrupted HDFS on Restart of Cluster Built wit...

Corrupted HDFS on Restart of Cluster Built with Di...

Re: Looking for details on rootVolumeSizeGB

Looking for details on rootVolumeSizeGB

Re: Parcel Repository Error

Parcel Repository Error

Re: Director Config File Role Assignment

Re: Director Config File Role Assignment

Re: Director Config File Role Assignment

Director Config File Role Assignment