Member since: 07-12-2013
Posts: 435
Kudos Received: 117
Solutions: 82

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2340 | 11-02-2016 11:02 AM |
| | 3632 | 10-05-2016 01:58 PM |
| | 8296 | 09-07-2016 08:32 AM |
| | 8924 | 09-07-2016 08:27 AM |
| | 2522 | 08-23-2016 08:35 AM |
10-23-2015 12:03 PM

There's a network ACL that blocks access from outside the cluster to any ports that are not secured with credentials. You can find the network ACL in your AWS account (look up the specific ID under Resources in CloudFormation if you need to) and edit the rules to meet your needs, but be aware that opening those ports exposes unsecured services.
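If you'd rather script it, a rough sketch with the AWS CLI might look like this (the ACL ID, rule number, port, and CIDR below are all placeholders to adapt to your stack):

# List network ACLs to find the one for the cluster's VPC (the ID is also
# under Resources in the CloudFormation stack)
aws ec2 describe-network-acls

# Placeholder example: allow inbound TCP on 21050 (Impala) from one trusted IP
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0123456789abcdef0 \
  --rule-number 110 \
  --protocol tcp \
  --port-range From=21050,To=21050 \
  --cidr-block 203.0.113.10/32 \
  --ingress \
  --rule-action allow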
10-23-2015 11:36 AM

Have you followed the instructions in step 9 of the tutorial to select the default schema? http://www.tableau.com/cloudera-tableau-9
10-22-2015 06:18 AM

You should probably be using port 21050. (There are some cases where you'd use 21000; see the documentation for details: http://www.cloudera.com/content/www/en-us/documentation/archive/impala/2-x/2-1-x/topics/impala_odbc.html. The version of the ODBC driver currently used in Cloudera Live / Tableau clusters is 2.5.28.1008.) You should connect to any of the Worker Nodes listed on your Guidance Page, because those are the nodes running impalad. No username should be given, because no authentication is currently configured for that service. When you see these tables in Hue, are you using the Impala Query Editor app or the Hive Query Editor?
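For reference, a DSN-less ODBC connection string for Impala would look roughly like this (the host is a placeholder; AuthMech=0 means no authentication, which matches the Live cluster's default):

Driver=Cloudera ODBC Driver for Impala;Host=<worker-node-address>;Port=21050;AuthMech=0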
10-21-2015 01:13 PM
So the Impala ODBC driver is installed on the Windows server that hosts Tableau Desktop. The Hive ODBC driver is separate. You can download a Windows installer for it here: http://www.cloudera.com/content/www/en-us/downloads/connectors/hive/odbc/2-5-16.html.html. You can also find out more about Tableau and ODBC drivers here: http://kb.tableau.com/articles/knowledgebase/hadoop-hive-connection
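Once the Hive driver is installed, a minimal DSN-less connection string would look something like this (the host is a placeholder; HiveServerType=2 targets Hive Server 2, and AuthMech=0 means no authentication):

Driver=Cloudera ODBC Driver for Apache Hive;Host=<manager-node-address>;Port=10000;HiveServerType=2;AuthMech=0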
10-20-2015 03:23 PM

Oh, ODBC - my mistake. I have not used ODBC much with Hive, so hopefully someone else can provide some insight there; I'd have to do some digging.

As for the Sqoop issue, there are two things I would check. First, can you log into Cloudera Manager (using the link and credentials from the final email you received)? The first screen should show the general health of the cluster in a box on the left. Most of the services should have a green circle or a little black square. If a lot of the services are marked in yellow or red, then something may be wrong with the cluster in general.

Next, there should be two entries related to Sqoop. One will be called "Sqoop 2", marked with a little black box because it's stopped by default (this is a service that is separate from the CLI tool as you're using it in the tutorial). The other should just be called "Sqoop" or "Sqoop Client" or something like that. It will be marked with a grey circle (since it's just a CLI tool, it doesn't have a status, per se). Do you see that? If not, click the button above this box to "Add a service" and select "Sqoop 1 Client". It'll ask you which hosts to deploy the tool on; just select all of them, and click through the menus to complete the deployment. Then try running Sqoop again.

I can't imagine why Sqoop wouldn't be available on the command line already if everything else got set up right, but try this and see what happens. Maybe you'll find another clue along the way...
10-20-2015 02:19 PM

When you run 'hostname' from the machine you're running Sqoop on, what do you see? On that machine, /usr/bin/sqoop should be the executable, and it should ultimately be calling /opt/cloudera/parcels/CDH/bin/sqoop. Do you see either of those files? Sqoop is bundled with everything else that gets installed; it would be very surprising to me if you got this far and had something missing, so I suspect you're not on the right machine.
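A quick check on that machine would be something like this (the readlink is just to see where the symlink ultimately points):

hostname
ls -l /usr/bin/sqoop
readlink -f /usr/bin/sqoop    # should resolve somewhere under /opt/cloudera/parcels/CDH
sqoop version                 # prints the bundled Sqoop version if the client is installed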
10-20-2015 02:15 PM

ec2-user is the user ID to use for SSH, not JDBC. I don't believe you need to specify a user at all for JDBC connections (as I said, nothing is modified about Hive from the default configuration; there's no special authentication set up for that service). If you do need to specify a user, try 'cloudera' or 'admin'. I'm afraid I don't have a cluster handy to confirm, but one or both of those should work.
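As a quick sanity check from inside the cluster, Beeline should be able to connect with no user at all; something like this, with the host as a placeholder:

beeline -u jdbc:hive2://<manager-node-ip>:10000/default -e 'SHOW TABLES;'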
10-20-2015 01:43 PM
You must have 'sqoop' before 'import-all-tables'. The full command in the tutorial is as follows:

sqoop import-all-tables \
  -m 3 \
  --connect jdbc:mysql://cloudera1:3306/retail_db \
  --username=retail_dba \
  --password=cloudera \
  --compression-codec=snappy \
  --as-parquetfile \
  --warehouse-dir=/user/hive/warehouse \
  --hive-import

>> Please explain why I need to change userid, pwd, db_name and mysql driver location. I am using Cloudera Live on AWS and I want to use the existing databases.

I'm not sure what steps in the tutorial you're referring to here.

>> With Impala, when I connect, I don't see any schema.

The Sqoop command will import some data. If you haven't already imported data, you should not see any schema in Impala.

>> Please tell me the userid/pwd to connect through Hive Server 2.

There isn't a password set up for Hive Server 2. You may find this thread helpful: http://community.cloudera.com/t5/Cloudera-Live-End-to-end/Cannot-connect-to-Hive-thru-JDBC-Connection-refused/m-p/33158#U33158
10-20-2015 11:38 AM

Regarding PuTTY, have you read through EC2's documentation on connecting to Linux instances from Windows? http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-connect-to-instance-linux.html#using-putty. You need to convert the .pem file (the key you selected when deploying the CloudFormation template) to PuTTY's .ppk format, and then configure your connection to use that file for authentication.

As for the issue connecting from Tableau, I would recommend you try using the private IP of the Manager Node to connect to Hive. If you're using the public IP, a set of firewall rules gets applied, and they will block access to Hive since the service is not secured by default in Live clusters. From inside the cluster, however, all access to private IPs should be open. Also note that Hive Server 2 is running on the Manager Node; this is distinct from Impala (which the Tableau tutorial in Cloudera Live has you connect to), which runs on all of the Worker Nodes instead. Hope that helps!
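On the PuTTY point: if you have PuTTY's command-line tools available (e.g. the puttygen binary on Linux), the conversion is a one-liner; on Windows, the PuTTYgen GUI does the same thing. The file names here are placeholders:

puttygen my-cluster-key.pem -O private -o my-cluster-key.ppk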
10-19-2015 03:11 PM

The JDBC string might be a bit different for Hive Server 2, actually. This is what used to be used when Beeline (a CLI JDBC client) was used instead of Hue for some steps in the tutorial:

beeline -u jdbc:hive2://[Manager Node IP Address]:10000/default -n admin -d org.apache.hive.jdbc.HiveDriver

The `-n admin` may not be correct to use in this case, but perhaps /default is what's required for your JDBC connection string?