Member since
07-12-2013
435
Posts
117
Kudos Received
82
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2340 | 11-02-2016 11:02 AM |
| | 3630 | 10-05-2016 01:58 PM |
| | 8294 | 09-07-2016 08:32 AM |
| | 8918 | 09-07-2016 08:27 AM |
| | 2522 | 08-23-2016 08:35 AM |
08-03-2015
02:03 PM
@jkestelyn wrote: Sorry, we do not make archived VMs available. As an alternative, consider Cloudera Live (cloudera.com/live), which provides free access to a full multi-node cluster on AWS for two weeks (you are responsible for AWS charges after that).

To clarify: the free hosting is on GoGrid, for 2 weeks after signing up. If you use AWS, the cluster is in your own account, and there is no free hosting - only free software. But yes, the CDH 4 VM is no longer available, I'm afraid.
07-31-2015
01:05 PM
The previous post was about running the tutorial on Cloudera Live (which is a fully-distributed cloud cluster), which uses Cloudera Manager. The QuickStart VM is intended to be much smaller, so Cloudera Manager is actually off by default. So unless you've run 'Launch Cloudera Manager' on the desktop, the fact that you can't connect to Cloudera Manager isn't a sign that anything's wrong.

Also, it sounds like you might not be on the latest QuickStart VM. A more recent CDH version made it possible for Sqoop to do all the work of creating the tables for you, so this section becomes significantly simpler.

If Impala is down, Hue should be saying so on its home page. But you can make sure all the services are up via Linux's service command, e.g.:

sudo service impala-server status
sudo service impala-state-store status
sudo service impala-catalog status

If all the services are up, let's try some simpler queries to figure out what's going wrong, because I'm not sure what this error is actually meant to be telling you. Let's first see what tables already exist:

show tables;

I would also try entering the queries one by one manually. It's possible something funny is going on with the copy / paste button and it's introducing some invalid characters. If you can reproduce this error with single queries, that also helps narrow down where the problem may be.
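The service checks mentioned in this post could be wrapped in a small loop. This is only a sketch: the service names are the ones given in the post, while `RUNNER` and `check_impala_services` are names made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: check the three Impala services named in the post.
# RUNNER and check_impala_services are illustrative names, not part of CDH.
# RUNNER defaults to "echo", so the script only prints what it would run;
# set RUNNER="" on the QuickStart VM to actually query the services.
RUNNER="${RUNNER-echo}"

check_impala_services() {
    for svc in impala-server impala-state-store impala-catalog; do
        # RUNNER is unquoted on purpose: an empty value must expand to nothing.
        $RUNNER sudo service "$svc" status
    done
}

check_impala_services
```

Running it with `RUNNER= sh script.sh` (script name is a placeholder) would execute the real status checks instead of printing them.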
07-29-2015
07:37 AM
Yeah, you're right - that's a misprint. It should be the number of shards, i.e. the number of machines running solrd. I'll get that fixed.
07-28-2015
02:30 PM
These lines suggest there's a good chance the core problem is the 64-bit architecture:

>> 2015-07-28T12:08:21.237-07:00| vmx| I120: [msg.cpuid.noLongmode2] This virtual machine is configured for 64-bit guest operating systems. However, 64-bit operation is not possible.

Are you running a 32-bit operating system by any chance? Have you been able to run other 64-bit VMs before? The log also says your machine supports VT-x, but it's disabled. I would enable it - it's a BIOS setting.
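If the host machine happens to run Linux (an assumption; the post doesn't say), a rough way to see what the CPU advertises is to look at the flags in /proc/cpuinfo. The sketch below uses a made-up helper, `has_cpu_flag`. Note that these flags describe what the kernel reports about the CPU, and may not tell you whether VT-x is switched on in the BIOS.

```shell
#!/bin/sh
# Sketch: report whether a Linux host CPU advertises 64-bit long mode ("lm")
# and hardware virtualization ("vmx" for Intel VT-x, "svm" for AMD-V).
# has_cpu_flag is an illustrative helper; the optional second argument lets
# you point it at a saved copy of /proc/cpuinfo.
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    # -w matches the flag as a whole word within the "flags" lines.
    grep -E '^flags' "$file" 2>/dev/null | grep -qw "$flag"
}

for f in lm vmx svm; do
    if has_cpu_flag "$f"; then
        echo "$f: present"
    else
        echo "$f: not reported"
    fi
done
```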
07-24-2015
06:35 AM
1 Kudo
According to the Sqoop documentation, the --hive-overwrite option should also allow you to do this without manually dropping the tables first, but I haven't tested that myself.
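For reference, an import using that option might look something like the sketch below. It is untested, like the suggestion itself. The JDBC URL, user name and table name are placeholders I've assumed for illustration, and `build_import_cmd` is a made-up helper that only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build a Sqoop import command that uses --hive-overwrite.
# The JDBC URL, user and table name are placeholders (assumptions),
# not values taken from the post. The command is printed, not executed.
CONNECT_URL="jdbc:mysql://quickstart.cloudera:3306/retail_db"  # assumed
DB_USER="retail_dba"                                            # assumed
TABLE="customers"                                               # hypothetical

build_import_cmd() {
    # -P makes Sqoop prompt for the password instead of taking it as an argument.
    echo "sqoop import --connect $CONNECT_URL --username $DB_USER -P --table $TABLE --hive-import --hive-overwrite"
}

build_import_cmd
```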
07-24-2015
06:32 AM
1 Kudo
/user/hive/warehouse stores the data files, but the metadata (information about the structure and location of the data files) is managed by Hive. Connect to either Impala or Hive (you'll find instructions for doing so later in Tutorial Exercise 1 or Tutorial Exercise 2, depending on which version of the tutorial you're using). Once connected, run 'show tables;' and you'll see a list of the tables it has metadata for. For each of these tables (assuming there isn't other data, unrelated to the tutorial, already stored there), run 'drop table <table_name>;'. When none of the tables from retail_db are shown by 'show tables;', the Sqoop job should be able to succeed.
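If there are a lot of tables, the drop statements could be generated rather than typed. A minimal sketch, assuming the table names arrive one per line; `make_drop_statements` is a made-up helper, and the table names in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: turn a list of table names (one per line on stdin) into
# DROP TABLE statements. make_drop_statements is an illustrative helper.
make_drop_statements() {
    while IFS= read -r table; do
        # Skip blank lines so stray whitespace in the listing is harmless.
        [ -n "$table" ] || continue
        printf 'DROP TABLE IF EXISTS %s;\n' "$table"
    done
}

# Example with hypothetical table names; on the VM the list could instead
# come from:  hive -S -e 'show tables;'
printf 'customers\norders\n' | make_drop_statements
```

Review the generated statements before running them, since anything unrelated to the tutorial would be dropped too.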
07-17-2015
06:46 AM
1 Kudo
If you want to have a virtual cluster, I would strongly recommend just starting with vanilla Linux VMs, downloading and running the Cloudera Manager installer on one of them, and building a new cluster. You may be surprised at how easy it is, once you have the VMs networked together properly. You would have to reset so much on the QuickStart VM to get it to incorporate a copy of itself as another node in the cluster that it would actually be harder than starting from scratch.

The QuickStart VM is designed to "just work" as robustly as possible regardless of how the virtual network is set up, and that requires it to assume that it is just a single node. So be aware that you're going to run into some issues if you try this, and we do not try to cater to this use case.

Specifically, you're going to run into a lot of networking issues. The VM has the hostname quickstart.cloudera 'baked' into it. To add another node, you would need another hostname, and that's going to require changing so many config files and resetting so many services that you would basically be starting from scratch anyway.

You would also need to be careful with IP addresses. If another network device is not available early enough in the boot, the VM will use 127.0.0.1, which works fine for a single node, but that's not how you want machines to refer to themselves in a distributed system, because as soon as it's resolved elsewhere it's wrong. So you'd need to make sure the VM had an externally routable IP (e.g. use bridged networking, or a similar option) and was rebooted (in my experience, you have to reboot twice after making the change) in order to have the correct networking device available early enough in the boot process.

Not to mention, this is all in theory - I don't know that anyone has successfully done this. Again, it's so much easier to just install using Cloudera Manager on top of some new Linux VMs.
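To see whether a machine is referring to itself by a loopback address, something like the following sketch could be run on it. `is_loopback` is a made-up helper, and the output depends entirely on the host it runs on.

```shell
#!/bin/sh
# Sketch: flag loopback addresses (127.0.0.0/8 or ::1), which are fine on a
# single node but wrong once other machines try to use them.
# is_loopback is an illustrative helper name.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

# Look up this machine's own hostname; fall back gracefully if tools are missing.
host="$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo localhost)"
addr="$(getent hosts "$host" 2>/dev/null | awk '{print $1; exit}')"

if [ -z "$addr" ]; then
    echo "$host: could not be resolved"
elif is_loopback "$addr"; then
    echo "$host resolves to $addr (loopback): other nodes cannot use this"
else
    echo "$host resolves to $addr"
fi
```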
07-08-2015
08:05 AM
1 Kudo
So in the tutorial you used Sqoop to import data from MySQL, right? Sqoop also supports Oracle (and a number of other data sources such as other relational databases, mainframes, etc.) and you can also use Sqoop to export the data back to a relational database. I'd suggest you have a look at Sqoop's documentation to see all the various options, etc. Sqoop in CDH is currently based on Sqoop 1.4.5 (with some other fixes / improvements back-ported): http://sqoop.apache.org/docs/1.4.5/index.html. There's also "Sqoop 2" which is still being developed but is available in CDH. It uses a client-server model instead of just the CLI tool. It was Sqoop 1 which you would've seen in the tutorial, though.
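As a rough idea of what an export looks like, here is a sketch that prints a `sqoop export` command. All of the values are placeholders I've assumed for illustration, and `build_export_cmd` is a made-up helper. For Oracle, the connect string would be an Oracle JDBC URL instead, and the matching JDBC driver would need to be available to Sqoop.

```shell
#!/bin/sh
# Sketch: print a Sqoop export command that writes HDFS data back to a
# relational table. Every value below is a placeholder (assumption).
EXPORT_URL="jdbc:mysql://quickstart.cloudera:3306/retail_db"  # assumed
EXPORT_USER="retail_dba"                                       # assumed
EXPORT_TABLE="order_summary"                                   # hypothetical
EXPORT_DIR="/user/hive/warehouse/order_summary"                # hypothetical

build_export_cmd() {
    # The target table must already exist in the database before exporting.
    echo "sqoop export --connect $EXPORT_URL --username $EXPORT_USER -P --table $EXPORT_TABLE --export-dir $EXPORT_DIR"
}

build_export_cmd
```

Depending on how the files were written, you may also need to tell Sqoop about the field delimiters; the documentation linked in the post covers those options.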
07-06-2015
05:02 PM
When the deployment finished running, you should have received an email with a link to resources and credentials, etc. If you haven't seen it, check your spam folder. If you can't find it, send me a private message with the email address you used to sign up, and I'll see what I can do to help.
07-03-2015
12:19 PM
1 Kudo
DFS Master (HDFS NameNode) is port 8020. The YARN ResourceManager is port 8032. I'm not that familiar with the Hadoop plugin, but you should clarify whether you want to be using MapReduce from Hadoop 2.x (where YARN acts as the scheduler, and you submit MapReduce jobs through YARN's ports) or MR1. When they say "Map/Reduce Master", to me that sounds like MR1, when MapReduce ran its own daemons. If it's MR1 you want to be using, you would actually want to use the JobTracker port, which is 8021.

Even though MR1 is supported in CDH 5, we recommend Hadoop 2 / YARN for production, and MR1 is not running in the QuickStart VM by default. Some work would be required to shut down the YARN daemons and start the MR1 daemons: specifically, stopping the hadoop-yarn-resourcemanager and hadoop-yarn-nodemanager services, uninstalling the hadoop-conf-pseudo package and installing the hadoop-0.20-conf-pseudo package instead, and then starting the hadoop-0.20-mapreduce-jobtracker and hadoop-0.20-mapreduce-tasktracker services.

>> I also specify Host with ip of Clouder CDH5 VMware ip

Make sure that you can ping that IP from your host machine. By default, the VM uses "NAT", which means you can't connect from your host machine. You'll want to use a "bridged" network or something similar instead so that you can initiate connections from your host machine.
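The YARN-to-MR1 switch described in this post could be sketched as the script below. The service and package names are the ones from the post; using yum is my assumption (that the VM is RPM-based), and `RUNNER` and `switch_to_mr1` are made-up names. By default it only prints the commands, so nothing changes unless you opt in.

```shell
#!/bin/sh
# Sketch of the YARN -> MR1 switch described in the post, for the QuickStart VM.
# Service and package names come from the post; using yum is an assumption.
# RUNNER and switch_to_mr1 are illustrative names.
# RUNNER defaults to "echo", so nothing is changed unless you set RUNNER="".
RUNNER="${RUNNER-echo}"

switch_to_mr1() {
    # 1. Stop the YARN daemons.
    for svc in hadoop-yarn-resourcemanager hadoop-yarn-nodemanager; do
        $RUNNER sudo service "$svc" stop
    done
    # 2. Swap the pseudo-distributed configuration packages.
    $RUNNER sudo yum remove -y hadoop-conf-pseudo
    $RUNNER sudo yum install -y hadoop-0.20-conf-pseudo
    # 3. Start the MR1 daemons.
    for svc in hadoop-0.20-mapreduce-jobtracker hadoop-0.20-mapreduce-tasktracker; do
        $RUNNER sudo service "$svc" start
    done
}

switch_to_mr1
```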