Created 10-16-2015 12:22 PM
I am trying to install Cloudera Live on AWS with Tableau. The stack creation is complete, and I see 6 instances running on my account. I did not receive any email with instructions on how to access Cloudera. Can someone suggest how I can check whether the installation is complete?
Mark
Created 10-27-2015 03:27 PM
Created 10-30-2015 08:46 AM
Created 10-18-2015 12:44 PM
Please close this issue
Mark
Created 10-19-2015 05:12 AM
Did you solve the issue Mark? If so, please share the solution in case it can assist others. 🙂
Created 10-19-2015 05:54 AM
Created 10-19-2015 08:01 AM
Thanks for the reply. The Cloudera Live install went through fine. I could connect to all three environments (Hue, Manager, and Navigator). I have tried querying through Hive and Impala, and both work. Now I am trying to use Sqoop to transfer data from MySQL to Hadoop, and I need help with the Manager Node IP:port, userid, and pwd. I will be using PuTTY to connect.
Created 10-19-2015 08:09 AM
Created 10-19-2015 01:06 PM
Hi Sean,
Thanks for your reply. I used the IP address (54.172.147.35) and userid (ec2-user) through PuTTY. I get this error message:
"Disconnected: No supported authentication methods available (server sent: publickey,gssapi-keyex,gssapi-with-mic)"
Can you help me with this issue?
Regarding Tableau, I can access the tool and can log into the system. Then I select "Cloudera Hadoop" as the server and enter 54.172.147.35 for the server with port 10000.
I select "HiveServer" for Type. Authentication is greyed out, so I cannot enter a userid or pwd. I click OK and get a window with this error message:
"An error occurred while communicating with the Cloudera Hadoop data source '54.172.147.35'"
I would appreciate it if you could let me know where I am making a mistake in the workflow.
Thanks,
Mark
Created 10-20-2015 11:38 AM
Regarding PuTTY, have you read through EC2's documentation on connecting to Linux instances from Windows? http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-connect-to-instance-linux.html#using-putty. It seems you need to go through a process of converting the .pem file (the key you selected when deploying the CloudFormation template) to a PuTTY-specific .ppk format, and then configure your connection to use that file for authentication.
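For anyone following along from Linux or macOS (or with the command-line puttygen installed), the key conversion and login described above can be sketched roughly as follows. The key file name and the host placeholder are assumptions for illustration; ec2-user is the default login on these instances. The functions only print the commands so they can be reviewed before running.

```shell
# Sketch only: the key file name and host are placeholders, not cluster values.

# Print the puttygen command that converts a .pem key to PuTTY's .ppk format.
# On Windows the same conversion is done in the PuTTYgen GUI (Load, then
# "Save private key").
ppk_convert_cmd() {
  printf 'puttygen %s -O private -o %s\n' "$1" "${1%.pem}.ppk"
}

# Print the OpenSSH equivalent of the PuTTY login. OpenSSH rejects keys that
# are readable by other users, so run `chmod 400` on the .pem file first.
ssh_cmd() {
  printf 'ssh -i %s %s@%s\n' "$1" "${3:-ec2-user}" "$2"
}

ppk_convert_cmd mykeypair.pem
ssh_cmd mykeypair.pem MANAGER_NODE_PUBLIC_DNS
```

In PuTTY itself, the resulting .ppk file is selected under Connection > SSH > Auth before opening the session.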
As for the issue connecting from Tableau, I would recommend you try using the Private IP for the Manager Node to connect to Hive. If you're using the public IP, a bunch of firewall rules get applied, and they will block access to Hive since the service is not secured by default in Live clusters. However, from inside the cluster, all access to private IPs should be open. Also note that Hive Server 2 is running on the Manager Node: this is distinct from Impala (which the Tableau tutorial in Cloudera Live has you connect to), which is running on all of the Worker Nodes instead.
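One way to check the two endpoints described above from a machine inside the cluster's network is sketched below. The host names are placeholders, port 21050 is Impala's usual HiveServer2-protocol port rather than a value confirmed for this cluster, and the reachability check assumes a netcat that supports -z and -w.

```shell
# Sketch only: host names are placeholders; 21050 is an assumed Impala port.

# Build a HiveServer2 JDBC URL; port defaults to 10000, database to "default".
hs2_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/%s\n' "$1" "${2:-10000}" "${3:-default}"
}

# Succeed if a TCP connection to host:port opens within 3 seconds.
port_open() {
  nc -z -w 3 "$1" "$2"
}

hs2_jdbc_url MANAGER_PRIVATE_IP

# From a node inside the cluster:
#   port_open MANAGER_PRIVATE_IP 10000 && echo "HiveServer2 reachable"
#   port_open WORKER_PRIVATE_IP 21050 && echo "Impala reachable"
#   beeline -u "$(hs2_jdbc_url MANAGER_PRIVATE_IP)" -n "$USER"
```

If the port check fails against the public IP but succeeds against the private IP, that points to the firewall rules rather than to the driver.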
Hope that helps!
Created 10-20-2015 12:27 PM
Hi Sean,
Thanks for your reply. I tried using Putty and I can connect now. I still need to run the script to move tables from mysql to HDFS.
With Tableau, I am still getting the same error as shown below. I am using Cloudera Hadoop as the server. I am also using the private IP (10.0.0.81) and left the port at 10000.
Please let me know if I need to make any other changes.
Mark
The drivers necessary to connect to this database are not properly installed.
To connect to this database, perform the following steps:
Detailed Error Message:
Created 10-20-2015 12:50 PM
1. Please refer to https://www.cloudera.com/content/www/en-us/developers/get-started-with-hadoop-tutorial/exercise-1.ht... for ingest (MySQL to HDFS).
It is straightforward: you only need to change the DB name, user ID, password, and MySQL driver location, and you should be good.
2. For the ODBC driver: did you install it from the link below?
http://www.cloudera.com/content/www/en-us/downloads/connectors/hive/odbc/2-5-16.html.html
Created 10-20-2015 01:34 PM
Hi Sean:
With Putty:
I am able to connect through Putty with ec2-user as userid. I ran the script and I get an error:
-bash: import-all-tables: command not found
Please explain why I need to change userid,pwd,db_name and mysql driver location. I am using cloudera live on AWS and I want to use the existing databases.
With Tableau:
I am able to connect using Impala but not through Hive. With Impala, when I connect, I don't see any schema.
I am using Hive Server 2 in ODBC configuration. It doesn't connect to the server. Please tell me the userid/pwd to connect through Hive Server 2.
Mark
Created 10-20-2015 01:43 PM
You must have 'sqoop' before 'import-all-tables'. The full command in the tutorial is as follows:
sqoop import-all-tables \
    -m 3 \
    --connect jdbc:mysql://cloudera1:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import
>> Please explain why I need to change userid,pwd,db_name and mysql driver location. I am using cloudera live on AWS and I want to use the existing databases.
I'm not sure what steps in the tutorial you're referring to here.
>> With Impala, when I connect, I don't see any schema.
The Sqoop command will import some data. If you haven't already imported data, you should not see any schema in Impala.
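A rough way to confirm that the import produced tables is sketched below. It assumes the warehouse path from the --warehouse-dir option in the Sqoop command, and that impala-shell is run on (or pointed at) a node running an Impala daemon. The helper name is hypothetical. Tables created through Hive generally do not show up in Impala until its metadata is refreshed.

```shell
# Sketch only: run the commented commands on the cluster after the import.
WAREHOUSE_DIR=/user/hive/warehouse   # same path as --warehouse-dir

# Count directory entries in a listing produced by `hadoop fs -ls`.
# Directory lines start with "d" in the permissions column; the
# "Found N items" header line is not counted.
count_table_dirs() {
  grep -c '^d'
}

# On the cluster:
#   hadoop fs -ls "$WAREHOUSE_DIR" | count_table_dirs
#   impala-shell -q 'invalidate metadata; show tables;'
```

A count of zero suggests the import has not run or wrote somewhere else. A non-zero count with no tables in Impala suggests the metadata has not been refreshed yet.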
>> Please tell me the userid/pwd to connect through Hive Server 2.
There isn't a password set up for Hive Server 2. You may find this thread helpful: http://community.cloudera.com/t5/Cloudera-Live-End-to-end/Cannot-connect-to-Hive-thru-JDBC-Connectio...
Created 10-20-2015 02:15 PM
Thanks for the reply.
I tried with Sqoop in front and I get the error:
-bash: sqoop: command not found
It looks like I am not in the right environment or the master node I am connecting to doesn't have sqoop installed.
Please check and let me know.
Mark
Created 10-20-2015 02:19 PM
Created 10-20-2015 03:01 PM
Hi Sean,
I appreciate your feedback. With regard to PuTTY, I am not good at Unix. All I did was connect to ec2-52-91-172-186.compute-1.amazonaws.com using mykeypair.ppk, with ec2-user as the userid. When I entered /opt/cloudera/parcels/CDH/bin/sqoop at the prompt, I got an error: no such file or directory. I need your help with commands if you want me to check something.
With regard to Tableau, I tried to create an ODBC connection using the Cloudera ODBC driver for Hive. I did not use a JDBC driver. Can you confirm whether I should use ODBC or JDBC for connecting to the Manager Node?
Mark
Created 10-20-2015 03:23 PM
Created 10-21-2015 05:13 AM
Hi Sean,
With regards to Sqoop:
I added the new service Sqoop Client 1 and it seems to be running. I went back to PuTTY and ran the script again. I still get the same error:
-bash: Sqoop: Command not found
Is there any other way I can test if Sqoop is running?
Thanks for your help.
Mark
Created 10-21-2015 11:03 AM
Hi Sean,
I think I finally found out where the problem was. I was not connecting to the Manager node in Putty. I just did that and the script is running. I will let you know once it finishes. I am hoping new tables will be created and I can query through hive or impala. Thanks for your help.
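For others who hit "command not found", a quick check of whether the node you are logged into has Sqoop at all can be sketched like this. The parcel path is the one mentioned earlier in the thread, and the helper name is hypothetical.

```shell
# Sketch only: print where a tool lives, checking PATH first and then an
# optional fallback location; return non-zero if it is in neither place.
find_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    command -v "$1"
  elif [ -n "${2:-}" ] && [ -x "$2" ]; then
    printf '%s\n' "$2"
  else
    return 1
  fi
}

# uname -n prints the host name, which helps confirm which node this is.
find_tool sqoop /opt/cloudera/parcels/CDH/bin/sqoop \
  || echo "sqoop not found on $(uname -n); this may not be the Manager Node"
```

If the tool is missing, compare the host name printed here with the Manager Node listed in the AWS console before reinstalling anything.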
I still need to work through Tableau. Please let me know if you find the solution on the right driver/connectivity parameters I should use.
Mark
Created 10-21-2015 01:05 PM
Hi Sean
I could get everything on the server to work and I finished all tutorial exercises.
The only outstanding issue is connectivity from Tableau. I am connecting from a Windows machine using Remote Desktop. I am not sure why I even need an ODBC driver on my machine. It looks like something in the connectivity parameters is not correct. I would appreciate your help.
Mark
Created 10-21-2015 01:13 PM
So the Impala ODBC driver is installed on the Windows server that hosts Tableau Desktop. The Hive ODBC driver is separate. You can download a Windows installer for it here: http://www.cloudera.com/content/www/en-us/downloads/connectors/hive/odbc/2-5-16.html.html.
You can also find out more about Tableau and ODBC drivers here: http://kb.tableau.com/articles/knowledgebase/hadoop-hive-connection
Created 10-22-2015 05:13 AM
Hi Sean,
Thanks for the information. The problem I had with Impala was that I could connect, but I could not see any of the tables that I can see through Hue. Can you tell me which host IP, port, and userid I should use?
I will install the ODBC driver for Hive and try connecting.
Please reply when you get a chance.
Mark