Member since: 07-12-2013

435 Posts | 117 Kudos Received | 82 Solutions

        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
|  | 2340 | 11-02-2016 11:02 AM |
|  | 3630 | 10-05-2016 01:58 PM |
|  | 8291 | 09-07-2016 08:32 AM |
|  | 8909 | 09-07-2016 08:27 AM |
|  | 2521 | 08-23-2016 08:35 AM |

12-14-2015 11:30 AM | 1 Kudo

So once you've started Cloudera Manager, it's only running the management services, not CDH (the full stack uses much more memory than most users have on their laptops, so it's better to have you start only what you know you'll need). Once you CAN connect to CM, you will need to start the services you want via the web UI or the CM API (there is also a command to start every service on the cluster).

Now as for why you can't connect: after CM starts, it does take a couple of minutes to open the port, because it runs a lot of checks before it will accept any user input. However, if 'service cloudera-scm-server status' has said it's up for several minutes, the next thing I'd check is which interface it's bound to. I'd expect it to bind to everything (including localhost), but also try 'quickstart.cloudera' (since you're in the container, that should resolve, and it may actually be a different IP address than localhost/127.0.0.1 depending on how the network interfaces are presented). You can also run 'sudo lsof -i | grep 7180', which should show you details of whatever is listening on that port. Failing all that, check the logs in /var/log/cloudera-scm-server and see if anything has gone wrong there.
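For reference, the checks above boil down to something like this (a sketch; the exact log file name under /var/log/cloudera-scm-server is an assumption and may differ between releases):

```
# Confirm the Cloudera Manager server process is up; it can take a few
# minutes after this reports "running" before port 7180 accepts connections.
sudo service cloudera-scm-server status

# See what, if anything, is listening on the CM port and which interface
# it is bound to.
sudo lsof -i | grep 7180

# If nothing is listening after several minutes, look for errors in the logs.
sudo tail -n 100 /var/log/cloudera-scm-server/cloudera-scm-server.log
```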
						
					
12-14-2015 11:05 AM

Port forwarding in Docker can be tricky; see the "Networking" section here: https://hub.docker.com/r/cloudera/quickstart/. You need to instruct Docker to forward any ports you want to use when you start the container (e.g. 8888 for Hue, 7180 for Cloudera Manager), and then you have to look up which port number on your host maps to that port number on the guest. So if you instruct Docker to launch your container with '-p 7180', from the guest's perspective it's listening on that port. However, on your host machine it will be assigned a different port (that way, many containers can run the same services without their ports conflicting). You would need to run 'docker port <container id> 7180', and it will show you the interface the port was bound to (usually 0.0.0.0, meaning it's listening on all interfaces / IP addresses) and the host port, which might be 31000 or something in that neighborhood. In which case, 31000 is actually the port you need to connect to.
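Putting that together, the flow looks roughly like this (a sketch; the full run command and its flags are documented on the page linked above, and <container id> is whatever 'docker ps' shows for your container):

```
# Publish the guest ports you care about when starting the container;
# Docker picks the host-side port numbers for you.
docker run --hostname=quickstart.cloudera --privileged=true -t -i -d \
    -p 8888 -p 7180 cloudera/quickstart /usr/bin/docker-quickstart

# Find the container's ID or name, then ask which host port maps to 7180.
docker ps
docker port <container id> 7180
# e.g. "0.0.0.0:31000" means Cloudera Manager is reachable on host port 31000.
```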
						
					
12-14-2015 09:25 AM

The difference is that Cloudera Manager runs the Hadoop services independently of Linux's service management (because it manages them with a cluster-wide context rather than the host-only context that Linux's service management has). So once you start Cloudera Manager, you will see that all of the Hadoop services are stopped according to Linux: they're being managed by Cloudera Manager, not Linux, anymore. The reason it's done in two steps is that most users of the VM don't have the need, or sufficient memory on their laptops, to run the entire stack on a single node including Cloudera Manager, so by default the image runs a CDH-only deployment with the services managed by Linux. For users who can and want to run Cloudera Manager, /home/cloudera/cloudera-manager will disable all of those Linux-managed services and enable CM instead.
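On the QuickStart VM that switch looks roughly like this (a sketch; hadoop-hdfs-namenode is just one example of the packaged Linux services, and any flags the script accepts vary by VM version):

```
# Stop and disable the Linux-managed CDH services, then bring up
# Cloudera Manager to manage the cluster instead.
sudo /home/cloudera/cloudera-manager

# Afterwards Linux reports the packaged services as stopped; that is
# expected, since Cloudera Manager now owns them.
sudo service hadoop-hdfs-namenode status
```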
						
					
12-09-2015 03:04 PM | 1 Kudo

You need to register for a new access code every time you deploy a cluster.
						
					
12-09-2015 09:29 AM

Sorry, it's not - I'm trying to get that updated.
						
					
12-04-2015 04:07 PM

Remember that schema and data are two separate things in Hadoop. The files in the directory are simply data files. For tables to show up in Hive or Impala, you have to import or define the schema for those tables in the Hive Metastore. I believe the reason you're not seeing the tables is that the logs you posted show Hive constantly struggling with garbage collection. My guess is that Sqoop tried to import the schema into Hive but timed out - but I can't know for sure unless you can post the output of the Sqoop command.

To be clear - are you running a QuickStart VM? I'm a little unclear on exactly what your environment is.
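To illustrate the schema/data split: files already sitting in HDFS only become a Hive table once a schema is registered in the Metastore, for example with an external table definition like the sketch below (the table name, columns, delimiter, and path are made-up placeholders, not taken from your job):

```
# Register a schema in the Hive Metastore over files that already exist in
# HDFS; nothing is copied, Hive just learns how to read what is there.
hive -e "
CREATE EXTERNAL TABLE categories (
  category_id   INT,
  category_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/categories';
"
```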
						
					
12-04-2015 03:41 PM

To answer your other question though, I wouldn't expect a different data format to make a difference here. There's enough competition for memory on the system that Hive is constantly doing garbage collection, and that shouldn't have anything to do with what format Sqoop is using for the data.
						
					
12-04-2015 03:04 PM

Well, there are a lot of variables, so a simple "minimum requirement" is a tough number to give. The tutorial was originally written for a 4-node cluster with 16 GB of RAM per node, and that's a little bit small for the master node. The QuickStart VM has a version of the tutorial with a smaller dataset. You can get away with 4 GB (but this includes the graphical desktop, so let's say 3 GB for a server) if you don't use Cloudera Manager and manage everything yourself (note that this is pretty complex). If you use the "Cloudera Express" option for Cloudera Manager, 8 GB is the absolute minimum, and if you're going to try out "Cloudera Enterprise" you need at least 10 GB. But the number of nodes, exactly which services you're running, exactly what else is going on on the machines, etc. all affect this.
						
					
12-04-2015 12:05 PM

I was referring to the output of your Sqoop command - it is printed to the terminal, not written to a log file. However, the log snippets you did post do indicate a potential problem: if Hive was pausing too much for garbage collection, then Sqoop might have given up / timed out when doing the import. You may not have enough memory for the services to run well.
						
					
12-04-2015 10:43 AM

Can you post the output of your Sqoop job? I'm wondering if there were errors when it was doing the --hive-import part. There are two stages: writing the files in the new data format to HDFS, and then defining the schema for the tables in Hive's Metastore. It sounds like that second stage failed...
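For context, a minimal sketch of the kind of job in question (the connection string, credentials, and table name are placeholders, not your actual command):

```
# Stage 1: Sqoop writes the table's data files into HDFS.
# Stage 2 (--hive-import): Sqoop creates the matching table definition in
# the Hive Metastore. Errors in either stage show up in the terminal output.
sqoop import \
  --connect jdbc:mysql://quickstart.cloudera/retail_db \
  --username retail_dba --password cloudera \
  --table categories \
  --hive-import
```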
						
					