Support Questions


Help with Cloudera Live on AWS

Explorer

I am trying to install Cloudera Live on AWS with Tableau. The stack creation is complete, and I see six instances running on my account, but I did not receive any email with instructions on how to access Cloudera. Can someone suggest how I can check whether the installation is complete?

 

Mark

2 ACCEPTED SOLUTIONS

Guru
Glad it's working. You should make the rules as specific or as general as
your needs dictate. I had forgotten about the rule that allowed all
outbound traffic, simply so any request originating in the cluster would
succeed (since the ephemeral ports for Linux are allowed inbound traffic).
The default firewall is quite strict about incoming traffic...
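
If you want to double-check exactly what the cluster's security group currently allows, one way is the AWS CLI. This is only a minimal sketch - the group ID below is a placeholder; use whichever security group the Cloudera Live stack attached to your instances:

# List the inbound and outbound rules for the cluster's security group.
# sg-0123456789abcdef0 is a placeholder; find the real ID in the EC2 console.
aws ec2 describe-security-groups \
    --group-ids sg-0123456789abcdef0 \
    --query 'SecurityGroups[0].{Inbound:IpPermissions,Outbound:IpPermissionsEgress}'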


Explorer

Hi Sean,

 

Thanks for your suggestion. I will create a new post.

 

Mark


51 REPLIES

Explorer

Hi Sean,

 

Is there any way I can restore just the Orders HDFS file? I have been making changes to this table and it looks like I have corrupted it. I know I need to move the table from the MySQL database using Sqoop. I would appreciate your reply.

 

Mark

Explorer

Hi Sean,

 

I am sorry for another post. I found the command to move just one table from MySQL to HDFS, but I am running into one issue.

 

I dropped the table using Hive, but its directory still shows up under hadoop fs -ls /user/hive/warehouse/. I tried to delete the file, but I don't seem to have the right permissions. Can you delete this file for me?

 

Mark

Guru
So the issue with metadata changes not showing up in clients is probably
just because Impala caches metadata to make requests faster. When there's a
metadata change, you can simply issue the 'invalidate metadata;' command.
I'm not sure what you mean when you say it worked on the server though.
Maybe it worked in the Hive app in Hue? Hive doesn't cache metadata like
that - it looks it up for every query.

Yeah the Sqoop command can be modified to import a specific table instead
of all of them.
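
Roughly, a single-table import would look something like the following - just a sketch, with the MySQL host, database name, and credentials as placeholders you'd replace with whatever your cluster actually uses:

# Import only the orders table from MySQL into /user/hive/warehouse and register it in Hive.
# <mysql-host>, <database>, <user>, and <password> are placeholders.
sqoop import \
    --connect jdbc:mysql://<mysql-host>/<database> \
    --username <user> --password <password> \
    --table orders \
    --warehouse-dir /user/hive/warehouse \
    --hive-import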

I can't do anything on your cluster for you - once you get the credentials,
Cloudera doesn't keep the SSH keys to log in in the future. But you should
be able to do anything in HDFS if you preface the command with 'sudo -u
hdfs'. So 'sudo -u hdfs hadoop fs -rm -r /user/hive/warehouse/orders', for
instance.

Explorer

Hi Sean,

 

I was able to move just the one table after I deleted the HDFS file from the directory. Thanks for your help.

 

Mark

Explorer

Hi Sean,

 

I am using the Metastore Manager to copy an HDFS file and create a new table in Hive. But this table's directory (orders) contains three different Parquet files. I am using the "Create a new table from a file" utility and I am getting the error message shown below. Can you tell me how I can create a Hive table from this kind of file?

 

Mark

 

"

Failed to open file '/user/hive/warehouse/orders_new': [Errno 21] Is a directory: '/user/hive/warehouse/orders_new'

None

Guru

This thread's getting quite long and touching a lot of different types of questions - I'd suggest you post that issue under the Hue forum - I don't know the answer. I'd show them the output of `hadoop fs -ls /user/hive/warehouse/orders_new` so they can see the specific files under that directory and permissions, etc.

Explorer

Hi Sean,

 

Thanks for your suggestion. I will create a new post.

 

Mark

Rising Star

1. For the Impala issue: try refreshing the metadata by running the command below in Impala.

invalidate metadata;

 

2. Regarding "I am able to connect through PuTTY with ec2-user as the user ID. I ran the script and I get an error:"

-bash: import-all-tables: command not found

 

Are you able to invoke Sqoop at all?

 

I am not sure which MySQL database you are connecting to. That's why I mentioned you might need to change the password, username, database name, etc.
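
In case it helps: "-bash: import-all-tables: command not found" usually just means the line was run as a standalone command rather than as a Sqoop subcommand. A rough sketch of what I mean (the connection details are placeholders you would need to adjust):

# Confirm Sqoop itself is on the PATH:
sqoop version

# Run import-all-tables as a subcommand of sqoop, not on its own.
# <mysql-host>, <database>, <user>, and <password> are placeholders.
sqoop import-all-tables \
    --connect jdbc:mysql://<mysql-host>/<database> \
    --username <user> --password <password> \
    --warehouse-dir /user/hive/warehouse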

 

 

Explorer

Just to clarify:

 

I used the private IP to connect using the Cloudera Live ODBC driver. I tried different user IDs, including ec2-user, and I was not able to connect to HiveServer2.

 

When I try to connect to Impala directly in Tableau, I can connect using ec2-user as the ID, but I cannot access the tables that I can see through Hue/Hive.

 

Please let me know what I am missing in the workflow.

 

Mark

Guru
ec2-user is the user ID to use for SSH, not JDBC. I don't believe you need to specify a user at all for JDBC connections (as I said, nothing at all is modified about Hive from the default configuration - there's no special authentication set up for that service). If you need to specify a user, try 'cloudera' or 'admin'. I'm afraid I don't have a cluster handy to confirm, but one or both of those should work.
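
For what it's worth, a quick way to sanity-check HiveServer2 from the cluster itself is beeline - a sketch assuming the default HiveServer2 port 10000 and no special authentication, with <private-ip> as a placeholder:

# Connect with no explicit user (default, unauthenticated HiveServer2):
beeline -u jdbc:hive2://<private-ip>:10000/default

# Or pass a user explicitly with -n if the driver insists on one:
beeline -u jdbc:hive2://<private-ip>:10000/default -n cloudera

If I remember right, Tableau's Hive connection also goes to port 10000 by default, while the Impala connection uses port 21050, so it's worth confirming which port the ODBC driver is pointed at.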