Member since: 01-20-2014
Posts: 578
Kudos Received: 102
Solutions: 94

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5724 | 10-28-2015 10:28 PM
 | 2724 | 10-10-2015 08:30 PM
 | 4749 | 10-10-2015 08:02 PM
 | 3542 | 10-07-2015 02:38 PM
 | 2340 | 10-06-2015 01:24 AM
09-20-2014
08:06 AM
The error message here might hold the key. Can you verify why it might not be executable? Did you change permissions at some point?

/opt/cloudera-manager/cm-5.1.2/lib64/cmf/service/common/cloudera-config.sh: line 172: /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh: Permission denied
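In case it helps, here is a minimal sanity check you could run on that host. It assumes the path from the error above, and the 755 mode is an assumption about what the parcel script normally carries, so compare against a healthy host before changing anything:

# Inspect the script and its parent directory named in the error
ls -ld /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta
ls -l /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh

# Restore read/execute permission (755 is an assumed mode; check a working host first)
chmod 755 /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh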
09-20-2014
04:34 AM
Are you able to provide us with the logs from the ZooKeeper instance (/var/log/zookeeper)? They should tell us why it's not starting. Please paste the logs into pastebin and provide the URL here. You only need to provide the section covering the last startup attempt and the failure.
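For example, something along these lines would capture the relevant section (the exact log file name varies between releases, so the wildcard is an assumption):

# List the ZooKeeper log files and grab the tail covering the last startup attempt
ls -l /var/log/zookeeper/
tail -n 200 /var/log/zookeeper/*.log   # exact file name varies by release/host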
09-18-2014
07:35 PM
Thank you for the update, glad you were able to resolve the problem.
09-18-2014
02:31 AM
Since you're using VMware, you could very well do what QuickStart basically is:
- Create one VM and configure all services as you wish.
- Take a snapshot or export the appliance.
- Clone it ten times for ten virtual machines.
When you want to update CDH, just update the master image and repeat the process.
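If it helps with the export step, VMware's ovftool can package the master VM into a portable appliance. The paths and names below are placeholders, so treat this as a sketch rather than the exact commands for your environment:

# Export the configured master VM as an OVA appliance (paths are placeholders)
ovftool /vmfs/volumes/datastore1/cdh-master/cdh-master.vmx /tmp/cdh-master.ova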
09-18-2014
02:11 AM
I am not aware of a way to get the QuickStart VM to work with ESXi. Is there anything specific you need from the QuickStart VM? Why not create a blank VM with CentOS 6.4 and install CDH + CM from scratch? The whole install process is pretty easy:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installation-Guide/Cloudera-Manager-Installation-Guide.html
If you do happen to try this and run into any issues, please start a new thread and we'll be happy to assist.
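For reference, the guided install from that guide boils down to roughly the following on the Cloudera Manager host. The installer URL is the one the CM 5 guide pointed to at the time, so double-check it against the doc for your version:

# Download and run the Cloudera Manager installer (URL per the CM 5 installation guide)
wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin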
09-17-2014
07:35 PM
Are you able to try the VMware image with the Player product and let us know if it works for you?
http://www.vmware.com/products/player
http://www.cloudera.com/content/support/en/downloads/quickstart_vms/cdh-5-1-x1.html
09-15-2014
02:41 AM
The advantage of using HAR files is not in saving disk space but in reducing metadata. Please read the blog link I pasted earlier. Quoting from it:

===
A small file is one which is significantly smaller than the HDFS block size (default 64MB). If you’re storing small files, then you probably have lots of them (otherwise you wouldn’t turn to Hadoop), and the problem is that HDFS can’t handle lots of files. Every file, directory and block in HDFS is represented as an object in the namenode’s memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use about 3 gigabytes of memory. Scaling up much beyond this level is a problem with current hardware. Certainly a billion files is not feasible. Furthermore, HDFS is not geared up to efficiently accessing small files: it is primarily designed for streaming access of large files. Reading through small files normally causes lots of seeks and lots of hopping from datanode to datanode to retrieve each small file, all of which is an inefficient data access pattern.
===
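To make that rule of thumb concrete, the arithmetic behind the 3 GB figure works out like this (illustrative numbers only):

# 10 million files, each occupying its own block:
#   ~10M file objects + ~10M block objects in the namenode's heap
files=10000000
objects=$(( files * 2 ))          # one file object + one block object per file
echo $(( objects * 150 )) bytes   # 3,000,000,000 bytes, i.e. roughly 3 GB of heap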
09-15-2014
01:48 AM
If you use HAR to combine 8 smaller files (each less than 1 M), they would together occupy just one block. More than the disk space saved, you save on metadata storage (on the namenode and datanodes), and that is far more significant for performance in the long term.
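If you want to try it, the archive is built with the hadoop archive tool. The user names and paths below are just placeholders:

# Pack the small-files directory under /user/foo into one HAR stored in /user/foo/archives
hadoop archive -archiveName files.har -p /user/foo small-files /user/foo/archives

# The archived files remain readable through the har:// scheme
hadoop fs -ls har:///user/foo/archives/files.har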
09-15-2014
01:44 AM
A block on the file system isn't a fixed-size file with padding; rather, it is just a unit of storage. A block can be at most 128 MB (or whatever is configured), so if a file is smaller, it will occupy only the space it actually needs. In my previous response, I said 8 small files would take up 3 GB of space. This is incorrect. The space taken up on the cluster is still just the file size, times 3 for the replicas. Regardless of file size, you can divide the size by the block size (default 128 MB) and round up to the next whole number; this gives you the number of blocks. So in this case, the 3922-byte file uses one block to store its contents.
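As a quick sketch of that block-count rule, using the 3922-byte jar from the earlier reply:

# number of blocks = ceil(file size / block size)
size=3922                        # bytes, from the -ls example below
block=$(( 128 * 1024 * 1024 ))   # 128 MB default block size
echo $(( (size + block - 1) / block ))   # prints 1 -> the file sits in a single block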
09-15-2014
12:12 AM
> The HDFS block size in my system is set to be 128m. Does it mean that
> if I put 8 files less than 128m to HDFS, they would occupy 3G disk
> space (replication factor = 3) ?

Yes, this is right. HDFS blocks are not shared among files.

> How could I know the actual occupied space of HDFS file ?

The -ls command tells you this. In the example below, the jar file is 3922 bytes long.

# sudo -u hdfs hadoop fs -ls /user/oozie/share/lib/sqoop/hive-builtins-0.10.0-cdh4.7.0.jar
-rw-r--r-- 3 oozie oozie 3922 2014-09-14 06:17 /user/oozie/share/lib/sqoop/hive-builtins-0.10.0-cdh4.7.0.jar

> And how about I use HAR to archive these 8 files ? Can it save some
> space ?

Using HAR is a good idea. More ideas about dealing with the small files problem are in this link:
http://blog.cloudera.com/blog/2009/02/the-small-files-problem/