Member since 07-12-2013
435 Posts
117 Kudos Received
82 Solutions
My Accepted Solutions

| Views | Posted |
|---|---|
| 2330 | 11-02-2016 11:02 AM |
| 3618 | 10-05-2016 01:58 PM |
| 8275 | 09-07-2016 08:32 AM |
| 8884 | 09-07-2016 08:27 AM |
| 2520 | 08-23-2016 08:35 AM |
			
    
	
		
		
11-02-2016 11:02 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
There's a file at /var/lib/cloudera-quickstart/tutorial/js/config.js that you can edit to manually override the detection. Currently it likely contains the line:

    var managed = true;

I'd recommend changing it to:

    var managed = 'express';

That should unlock the other parts of the tutorial. Do note that the only parts 'express' unlocks are some sections on checking the health of the services required for each step. The 'enterprise' option of CM will also add a section on using Navigator to audit access to the data and trace the lineage of data sets.
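If you'd rather script the change than edit the file by hand, here is a minimal Python sketch. It assumes the file really does contain an assignment of the form `var managed = ...;` as quoted above; the helper names `set_managed` and `patch_file` are made up for illustration and are not part of the tutorial.

```python
import re

# Matches the assignment quoted in the post, e.g. `var managed = true;`
MANAGED_LINE = re.compile(r"^(\s*var\s+managed\s*=\s*)[^;]*;", re.MULTILINE)


def set_managed(config_text, mode="express"):
    """Return config_text with the `managed` assignment set to a quoted mode."""
    new_text, count = MANAGED_LINE.subn(
        lambda m: f"{m.group(1)}'{mode}';", config_text
    )
    if count == 0:
        # Assumption violated: the file does not look like the post describes.
        raise ValueError("no `var managed = ...;` line found")
    return new_text


def patch_file(path, mode="express"):
    """Rewrite the config file in place (hypothetical helper, run as root)."""
    with open(path) as f:
        original = f.read()
    with open(path, "w") as f:
        f.write(set_managed(original, mode))
```

Calling `patch_file("/var/lib/cloudera-quickstart/tutorial/js/config.js")` would apply the change. Keep a backup copy of the file first in case the tutorial page stops loading.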
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-05-2016 01:58 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
CDH (and Cloudera Manager) are supported on Ubuntu 14.04. You can follow the standard documentation, which includes the necessary details wherever the procedure differs between Linux distributions. See http://www.cloudera.com/documentation/enterprise/latest/topics/installation_installation.html.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-07-2016 08:32 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
The easiest way would be to download and install the JDK version you want from Oracle's website. They offer RPM packages, which should work in the VM, or a tarball that you can extract yourself anywhere you like. Once it's installed, make a note of the directory it installed to: the RPMs will install under /usr/lib/jvm, /usr/java, or a similar location. The directory will include the version in its name and should have a bin/ directory underneath it. Set JAVA_HOME to that directory in /etc/profile and restart any shell sessions you have open. If you want CDH to use that JDK as well, export JAVA_HOME in /etc/default/bigtop-utils.
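To track down the install directory, a small Python sketch like the following may help. It only checks the two locations mentioned above and treats any subdirectory containing bin/java as a JDK home; the function names are illustrative, and your install may live somewhere else entirely.

```python
import os


def find_jdk_homes(roots=("/usr/java", "/usr/lib/jvm")):
    """Return subdirectories of the given roots that contain bin/java."""
    homes = []
    for root in roots:
        if not os.path.isdir(root):
            continue  # this location does not exist on this machine
        for name in sorted(os.listdir(root)):
            candidate = os.path.join(root, name)
            if os.path.isfile(os.path.join(candidate, "bin", "java")):
                homes.append(candidate)
    return homes


def export_line(java_home):
    """Build the line to add to /etc/profile or /etc/default/bigtop-utils."""
    return f"export JAVA_HOME={java_home}"
```

Symlinks such as a `latest` or `default` entry are followed, so the same JDK can show up more than once in the result. Pick the entry whose name carries the version you installed.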
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-07-2016 08:27 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
SSH in the VM listens on port 22 by default, but you're connecting to port 2222 on your host machine. If you're using VirtualBox, you can set up port forwarding so that port 2222 on your host machine is forwarded to port 22 in the VM. This is probably the easiest solution, but it isn't configured out of the box. The alternative is to configure the VM to use something other than NAT for the virtual network. If you switch it to bridged networking or a similar option, it will get its own IP address that you can use to connect to port 22 from your host machine.
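The port-forwarding option can also be set from the command line with VBoxManage instead of the VirtualBox GUI. The Python sketch below only builds the command. The VM name "QuickStart VM" and the rule name "guestssh" are placeholders, and `modifyvm` expects the VM to be powered off.

```python
def natpf_rule(name, host_port, guest_port, protocol="tcp"):
    """Format a NAT rule: name,protocol,host IP,host port,guest IP,guest port.

    The two IP fields are left empty so VirtualBox uses its defaults.
    """
    return f"{name},{protocol},,{host_port},,{guest_port}"


def port_forward_command(vm_name, host_port=2222, guest_port=22):
    """Build a VBoxManage call that forwards host_port to guest_port on NIC 1."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--natpf1", natpf_rule("guestssh", host_port, guest_port),
    ]


# "QuickStart VM" is a placeholder; use the name shown in your VirtualBox list.
command = port_forward_command("QuickStart VM")
```

You could pass `command` to `subprocess.run(command, check=True)`. After that, connecting to port 2222 on localhost from the host should reach sshd in the VM.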
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-23-2016 08:35 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Depending on what you're doing, the Cloudera Management Services are likely not needed for your project. They deal with monitoring the various services. Without them it is harder to tell from the Cloudera Manager home page whether a service is healthy, but if they crash after 5 minutes it shouldn't affect any of the services themselves.

In my experience with the VM, often one service will fail and take the others down with it (often it's the Host Monitor). I'd look at the monitoring data for the services to see which one is going down first, and then dig deeper in its logs to see what the problem is. 8 GB should not be seen as plenty, but as the absolute bare minimum required. If you're running all of the Cloudera Manager services and putting load on Flume, Kafka and Spark / YARN, I'd expect your VM to be straining to keep up. These are all services designed to run on fairly large clusters, not minimal VMs, so it will struggle with certain projects. I'd recommend adding more memory if you're able to, since that is likely the reason one of the Cloudera Management Services isn't keeping up.
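To confirm how much memory the VM actually sees, you can read /proc/meminfo from inside the guest. This is a rough Python sketch that assumes the usual Linux `MemTotal: <n> kB` line; the function name is illustrative.

```python
def mem_total_gib(meminfo_text):
    """Extract MemTotal from /proc/meminfo-style text and convert kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


def read_vm_memory(path="/proc/meminfo"):
    """Read the guest's memory total (only works on Linux)."""
    with open(path) as f:
        return mem_total_gib(f.read())
```

The reported total is usually a little lower than the amount assigned to the VM, because the kernel reserves some memory for itself.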
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-11-2016 07:24 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		5 Kudos
		
	
				
		
	
		
					
The term gateway may be used in lots of contexts. It usually refers to a machine or service that acts as an entry point to other services. For example, your entire cluster might be behind a firewall which blocks all inbound traffic, except that it allows you to log in to one of the machines. From that machine, you can submit jobs or interact with any of the services in the cluster. That machine would be called a "gateway". Often in a Cloudera context, a gateway is just that: a machine that you're supposed to log into to carry out some tasks that aren't possible from outside the cluster. Cloudera Manager might manage the machine (meaning it deploys configuration to it and does basic health checks) but not run any CDH services on it.

The NFS gateway is a similar idea. It connects to your HDFS cluster and exposes the filesystem via the NFS protocol. So you might not expose all of the HDFS ports to your network, but you might expose just the NFS service, and it therefore acts as a gateway.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-21-2016 08:36 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
VirtualBox can take snapshots of a VM, which you can restore at a later date.
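Snapshots can be taken from the GUI or with VBoxManage. As a rough Python sketch, the commands look like this; the VM name and snapshot name are placeholders you would replace with your own.

```python
def take_snapshot_command(vm_name, snapshot_name):
    """Build the VBoxManage call that records the VM's current state."""
    return ["VBoxManage", "snapshot", vm_name, "take", snapshot_name]


def restore_snapshot_command(vm_name, snapshot_name):
    """Build the VBoxManage call that rolls the VM back to a snapshot."""
    return ["VBoxManage", "snapshot", vm_name, "restore", snapshot_name]


# Placeholder names; use the VM name shown in your VirtualBox list.
take = take_snapshot_command("QuickStart VM", "clean-install")
restore = restore_snapshot_command("QuickStart VM", "clean-install")
```

Either list can be passed to `subprocess.run(..., check=True)`. The VM should be shut down before restoring.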
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-20-2016 03:40 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
The QuickStart VM includes a tutorial that will walk you through a use case where you:

- ingest some data into HDFS from a relational database using Sqoop, and query it with Impala
- ingest some data into HDFS from a batch of log files, ETL it with Hive, and query it with Impala
- ingest some data into HDFS from a live stream of logs and index it for searching with Solr
- perform link strength analysis on the data using Spark
- build a dashboard in Hue
- if you run the scripts to migrate to Cloudera Enterprise, also audit access to the data and visualize its lineage

That sounds like it will cover most of what you're looking for.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-13-2016 07:14 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							Note that there are many variables in that tutorial you'll need to replace  with your own values. A copy of the tutorial with all the blanks filled in  and the required datasets are available in the QuickStart VM.  
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-06-2016 09:39 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
I'm not sure I've seen this particular problem before, but I'd suggest comparing the SHA-1 hashes to be sure the file isn't compromised. The hashes are shown when you download the file. For the 5.7.0-0 VirtualBox image it's 1309591109ebd9b1e44c89bd064b12d8b00feeb6. My copy of the file matches and is slightly smaller than yours, so unless there's a difference in how file sizes are reported on different operating systems, I would suspect your download is corrupted. As Cy said, we do recommend using a download manager. Browsers tend to have inferior support for recovering from problems during the download, and you see that more often on large files like this.
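If you don't have a SHA-1 tool handy, the comparison can be done with Python's standard hashlib module. This is a minimal sketch that reads the file in chunks so a multi-gigabyte image doesn't have to fit in memory; the function names and the file path you pass in are your own.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA-1 with the published value, ignoring case."""
    return sha1_of_file(path) == expected_hex.strip().lower()
```

For the 5.7.0-0 VirtualBox image you would call `matches_published_hash(path_to_download, "1309591109ebd9b1e44c89bd064b12d8b00feeb6")`. A `False` result points to a corrupted or incomplete download.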
						
					
				
			
			
			
			
			
			
			
			
			
		 
        













