Load data in cloudera Hadoop running on AWS - New to Hadoop

New Contributor

Dear Experts,

 

I am quite new to Hadoop, with only 3 weeks of learning so far. I have a couple of questions below and would appreciate your guidance.

 

Things done so far:

 

1. I have Cloudera Hadoop running on AWS (Ubuntu 14).

 

2. I have successfully installed Cloudera Manager and created a cluster. The web interfaces for CM and Hue are both working.

 

Questions:

 

1. I want to load some sample .csv file data into HDFS. How can I do that? Are there any tutorials for loading data into HDFS?

 

2. Can I use Hue to load the data and then use Hive to build tables on top of it?

 

Thanks.

 

Re: Load data in cloudera Hadoop running on AWS - New to Hadoop

Super Guru

Hi @Chandra3,

 

One way to get started is to use Hue's examples. Log into Hue as a superuser and the front page will list a series of steps. "Step 2" is Examples. Clicking on Hive will install example data with Hive tables. The other entries will likewise install their own example data.

 

You can also click Data Browsers --> Metastore Tables.

From there you can import files to create Hive tables.
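
If you would rather work from a terminal than through Hue, the same flow (copy a CSV into HDFS, then put a Hive table on top of it) can be sketched roughly as below. This is only an illustration, not an official recipe: the file name ratings.csv, the HDFS directory /user/chandra/ratings, the table name and the column list are all assumptions made up for the example, and the script only prints the commands and the HiveQL so you can review them before running anything on a cluster node.

```python
import shlex


def hdfs_put_commands(local_csv, hdfs_dir):
    """Return the `hdfs dfs` commands that create hdfs_dir and upload local_csv."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        ["hdfs", "dfs", "-put", local_csv, hdfs_dir],
    ]


def external_table_ddl(table, columns, hdfs_dir):
    """Build a HiveQL statement for an external table over CSV files in hdfs_dir.

    columns is a list of (name, hive_type) pairs. Names are backtick-quoted
    so that reserved words such as `timestamp` are accepted by Hive.
    """
    cols = ", ".join("`{}` {}".format(name, hive_type) for name, hive_type in columns)
    return (
        "CREATE EXTERNAL TABLE IF NOT EXISTS {} ({}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE LOCATION '{}'"
    ).format(table, cols, hdfs_dir)


if __name__ == "__main__":
    # Hypothetical file, directory and schema, chosen only for illustration.
    target_dir = "/user/chandra/ratings"
    for cmd in hdfs_put_commands("ratings.csv", target_dir):
        print(" ".join(shlex.quote(part) for part in cmd))
    schema = [
        ("userId", "INT"),
        ("movieId", "INT"),
        ("rating", "DOUBLE"),
        ("timestamp", "BIGINT"),
    ]
    print(external_table_ddl("ratings", schema, target_dir) + ";")
```

The printed hdfs commands need to be run as a user with write permission on the target directory, and the HiveQL can be pasted into the Hive editor in Hue. If the CSV has a header row, Hive will read it as data unless you handle it, for example by adding TBLPROPERTIES ('skip.header.line.count'='1') to the statement.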

 

Depending on what you are looking to test, there may be other useful resources but the above are pretty quick and simple.

Others may have example data they use.  Many datasets are available online.

One we commonly use in training is movie data:

 

http://files.grouplens.org/datasets/movielens/ml-latest-README.html

 

Other GroupLens datasets: https://grouplens.org/datasets/

 

Ben

 

 

Re: Load data in cloudera Hadoop running on AWS - New to Hadoop

Cloudera Employee

Hi, Chandra. The Hue Guide might also have some helpful information: 

 

https://www.cloudera.com/documentation/enterprise/latest/topics/hue.html

Re: Load data in cloudera Hadoop running on AWS - New to Hadoop

New Contributor

Many thanks. It was really useful for beginners like me.

 

Do you have any idea how I can access this data from other reporting tools? In our case, we use the BIRST reporting tool. However, I am not able to establish a connection to the Hive tables.

 

Is there a link I can refer to for setting up the JDBC connection?

 

Warm Regards,

Chandra.
