Member since: 07-12-2013
Posts: 435
Kudos Received: 117
Solutions: 82
My Accepted Solutions
Views | Posted |
---|---|
1949 | 11-02-2016 11:02 AM |
3008 | 10-05-2016 01:58 PM |
7627 | 09-07-2016 08:32 AM |
8050 | 09-07-2016 08:27 AM |
1999 | 08-23-2016 08:35 AM |
11-02-2016
11:02 AM
1 Kudo
There's a file at /var/lib/cloudera-quickstart/tutorial/js/config.js that you can edit to manually override the detection. Currently it likely contains the line: var managed = true; I'd recommend changing it to: var managed = 'express'; That should unlock the other parts of the tutorial. Do note that the only parts 'express' unlocks are some sections on checking the health of the services required for each step. The 'enterprise' option of CM will also add a section on using Navigator to audit access to the data and trace the lineage of data sets.
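If you'd rather script the edit than open the file by hand, something like the following might do it. This is only a sketch: it assumes the file really does contain a single `var managed = ...;` line as described above, and the function names `override_managed` and `apply_override` are made up for illustration.

```python
import re
from pathlib import Path

# Path taken from the post above; adjust it if your VM differs.
CONFIG_PATH = Path("/var/lib/cloudera-quickstart/tutorial/js/config.js")

def override_managed(text, value="express"):
    """Return text with the first `var managed = ...;` line set to the quoted value."""
    pattern = re.compile(r"^(\s*var\s+managed\s*=\s*)[^;]*;", re.MULTILINE)
    return pattern.sub(lambda m: f"{m.group(1)}'{value}';", text, count=1)

def apply_override(path=CONFIG_PATH, value="express"):
    """Rewrite the config file in place, keeping a .bak copy of the original."""
    original = path.read_text()
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(override_managed(original, value))
```

Calling `apply_override()` inside the VM (with enough permissions to write to that directory) would make the change; text without a matching line is left untouched.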
10-05-2016
01:58 PM
CDH (and Cloudera Manager) are supported on Ubuntu 14.04. You can follow the standard documentation: it will include the necessary details when the procedure differs on different Linux distributions. See http://www.cloudera.com/documentation/enterprise/latest/topics/installation_installation.html .
09-07-2016
08:32 AM
1 Kudo
The easiest way would be to download and install the JDK version you want from Oracle's website. They offer RPM packages which should work in the VM, or a tarball that you can extract yourself anywhere you like. Once it's installed, make a note of the directory it installed to: the RPMs will install under /usr/lib/jvm or /usr/java or something like that. The directory will include the version in the name, and should have a /bin/ directory underneath it. With that directory, you'll want to update the value of JAVA_HOME in /etc/profile and restart any shell sessions you have open. If you want CDH to use that JDK as well, export JAVA_HOME in /etc/default/bigtop-utils.
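To find the directory to use for JAVA_HOME, a small script along these lines could list the candidates. It is a rough sketch: the two root directories are just the likely locations mentioned above, and `find_jdk_homes` and `export_line` are hypothetical helper names.

```python
from pathlib import Path

# Likely RPM install locations mentioned in the post; yours may differ.
DEFAULT_ROOTS = ("/usr/lib/jvm", "/usr/java")

def find_jdk_homes(roots=DEFAULT_ROOTS):
    """Return directories directly under the roots that contain bin/java, sorted by path."""
    homes = []
    for root in map(Path, roots):
        if not root.is_dir():
            continue  # skip roots that don't exist on this machine
        for candidate in root.iterdir():
            if (candidate / "bin" / "java").is_file():
                homes.append(candidate)
    return sorted(homes)

def export_line(java_home):
    """Build the shell line to put in /etc/profile or /etc/default/bigtop-utils."""
    return f"export JAVA_HOME={java_home}"
```

Symlinks such as a `latest` or `default` entry will show up in the list too, since they also resolve to a directory with bin/java.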
09-07-2016
08:27 AM
SSH in the VM will listen on port 22 by default. You're hitting port 2222 on your host machine. If you're using VirtualBox, you can set up port forwarding in VirtualBox so that port 2222 on your host machine is forwarded to port 22 in the VM (this is probably the easiest solution, but it isn't done out of the box). The alternative is to configure the VM to use something other than NAT for the virtual network. If you configure it for bridged networking or a similar option, it will get its own IP address that you can use to connect to port 22 from your host machine.
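A quick way to tell whether anything is listening on the forwarded port is a plain TCP connection test from the host. This is a minimal sketch using only the standard library; `port_is_open` is an illustrative name, and a successful connection only shows that something accepts connections there, not that it is SSH.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_is_open("127.0.0.1", 2222)` on the host should return False until the forwarding rule is in place. With bridged networking you would test port 22 on the VM's own IP address instead.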
08-23-2016
08:35 AM
1 Kudo
Depending on what you're doing, the Cloudera Management Services are likely not needed for your project. They deal with monitoring the various services. Without them it's harder to tell from the Cloudera Manager home page whether a service is healthy, but if they crash after 5 minutes it shouldn't affect any of the services themselves. In my experience with the VM, often one service will fail and that impacts the others (often it's the Host Monitor). I'd look at the monitoring data for the services to see which one is going down first, and then dig deeper in its logs to see what the problem is. 8 GB should not be seen as plenty, but as the absolute bare minimum required. If you're running all of the Cloudera Manager services and putting load on Flume, Kafka and Spark / YARN, I'd expect your VM to be straining to keep up. These are all services designed to run on fairly large clusters, not minimal VMs, so it will struggle with certain projects. I'd recommend adding more memory if you're able to: that is likely the reason one of the Cloudera Management Services isn't keeping up.
08-11-2016
07:24 AM
5 Kudos
The term gateway may be used in lots of contexts - it usually refers to a machine or service that acts as an entry point to other services. For example, your entire cluster might be behind a firewall which blocks all inbound traffic, except that it allows you to log in to one of the machines. From that machine, you can submit jobs or interact with any of the services in the cluster. That machine would be called a "gateway". Often in a Cloudera context, a gateway is just that: a machine that you're supposed to log into to carry out some tasks that aren't possible from outside the cluster. Cloudera Manager might manage the machine (meaning it deploys configuration to it and does basic health checks) but not run any CDH services on it. The NFS gateway is a similar idea. It connects to your HDFS cluster and exposes the filesystem via the NFS protocol. So you might not expose all of the HDFS ports to your network, but you might expose just the NFS service, and it therefore acts as a gateway.
06-21-2016
08:36 AM
VirtualBox has the ability to take snapshots of VMs that you can restore to at a later date.
06-20-2016
03:40 PM
The QuickStart VM includes a tutorial that will walk you through a use case where you:
- ingest some data into HDFS from a relational database using Sqoop, and query it with Impala
- ingest some data into HDFS from a batch of log files, ETL it with Hive, and query it with Impala
- ingest some data into HDFS from a live stream of logs and index it for searching with Solr
- perform link strength analysis on the data using Spark
- build a dashboard in Hue
- if you run the scripts to migrate to Cloudera Enterprise, also audit access to the data and visualize its lineage
That sounds like it will cover most of what you're looking for.
06-13-2016
07:14 AM
Note that there are many variables in that tutorial you'll need to replace with your own values. A copy of the tutorial with all the blanks filled in and the required datasets are available in the QuickStart VM.
06-06-2016
09:39 AM
I'm not sure I've seen this particular problem before, but I'd suggest comparing the SHA-1 hashes to be sure it's not compromised. The hashes can be found when you download the file. For the 5.7.0-0 VirtualBox image it's 1309591109ebd9b1e44c89bd064b12d8b00feeb6. My copy of the file matches and is slightly smaller than yours, so unless there's a difference in how file sizes are reported on different operating systems, I would suspect your download is corrupted. As Cy said, we do recommend using a download manager. Browsers tend to have inferior support for recovering from problems during the download, and you see that more often on large files like this.