Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Please, I need help with this project. I need a guide on how to go about it. Thanks.


Problem Statement

A. Find the top 5 categories with the maximum number of videos uploaded.

B. Find the top 10 rated videos.

C. Find the most viewed videos.

Dataset

http://www.edureka.co/medias/6cchxi6to4

Dataset Description

Column 1: Video ID, 11 characters.

Column 2: Uploader of the video (string).

Column 3: Interval between the day YouTube was established and the upload date of the video (integer).

Column 4: Category of the video (string).

Column 5: Length of the video (integer).

Column 6: Number of views of the video (integer).

Column 7: Rating of the video (float).

Column 8: Number of ratings given to the video.

Column 9: Number of comments on the video (integer).

Column 10: IDs of videos related to the uploaded video.

Please, I need a guide on how to go about this. Any help will be appreciated.

1 ACCEPTED SOLUTION


Hi,

For example:

* load this data into Hive,

* run queries such as: select category, count(*) as number_of_videos from youtube_data group by category order by number_of_videos desc limit 5;
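
If it helps to try this kind of query before loading anything into Hive, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a local stand-in. The table name youtube_data and the category column come from the query above; every other column name, the sample rows, and the limit of 10 for "most viewed" are assumptions made only for illustration. Note that counting videos per category needs a GROUP BY on category. The SQL is plain enough that it should carry over to Hive with little or no change, but treat that as something to verify on your own setup.

```python
import sqlite3

# Hypothetical sample rows following columns 1-9 of the dataset description.
# Column 10 (related video ids) is left out because none of the queries need it.
SAMPLE_ROWS = [
    # video_id, uploader, upload_interval, category, length, views, rating, num_ratings, num_comments
    ("vid00000001", "alice", 100, "Music", 200, 9000, 4.9, 300, 40),
    ("vid00000002", "bob", 120, "Music", 180, 1500, 3.2, 50, 5),
    ("vid00000003", "carol", 150, "Comedy", 95, 7200, 4.5, 210, 33),
    ("vid00000004", "dave", 200, "Sports", 300, 400, 2.8, 12, 1),
    ("vid00000005", "erin", 230, "Music", 240, 12000, 4.1, 500, 80),
    ("vid00000006", "frank", 260, "Comedy", 60, 800, 3.9, 20, 2),
]

# A. Top 5 categories by number of videos (GROUP BY is required for the count).
TOP_CATEGORIES = """
SELECT category, COUNT(*) AS number_of_videos
FROM youtube_data
GROUP BY category
ORDER BY number_of_videos DESC, category
LIMIT 5
"""

# B. Top 10 rated videos; video_id is only a tie-breaker for stable output.
TOP_RATED = """
SELECT video_id, rating
FROM youtube_data
ORDER BY rating DESC, video_id
LIMIT 10
"""

# C. Most viewed videos; the limit of 10 is an assumption, the task gives none.
MOST_VIEWED = """
SELECT video_id, views
FROM youtube_data
ORDER BY views DESC, video_id
LIMIT 10
"""


def build_table(rows):
    """Create an in-memory youtube_data table and fill it with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE youtube_data (
               video_id TEXT, uploader TEXT, upload_interval INTEGER,
               category TEXT, length INTEGER, views INTEGER,
               rating REAL, num_ratings INTEGER, num_comments INTEGER)"""
    )
    conn.executemany("INSERT INTO youtube_data VALUES (?,?,?,?,?,?,?,?,?)", rows)
    return conn


def run(conn, query):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_table(SAMPLE_ROWS)
    for title, query in [("Top categories", TOP_CATEGORIES),
                         ("Top rated", TOP_RATED),
                         ("Most viewed", MOST_VIEWED)]:
        print(title)
        for row in run(conn, query):
            print("  ", row)
```

With the sample rows above, Music comes out as the largest category with 3 videos. On the real dataset you would replace the sample rows with the loaded file and keep the three queries as they are.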

 

Regards

Andrzej

