Short Description:

Teragen and Terasort Performance testing on AWS

Article

Use this article with extreme care, and do not treat it as a benchmark. I ran a quick 1 terabyte TeraGen test on AWS simply to see what kind of performance I could get from MapReduce on AWS with VERY LITTLE configuration tweaking or tuning.

On my GitHub page you will find the following:

  • TeraGen script
  • Hadoop, YARN, MapReduce, and Capacity Scheduler configurations used during testing
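
The actual script lives on the GitHub page and is not reproduced here. As a rough sketch only, the Python snippet below works out the row count for a 1 terabyte run and prints the three example-job invocations (teragen, terasort, teravalidate). It assumes that 1 terabyte means 10^12 bytes and relies on TeraGen writing 100-byte rows. The jar path and HDFS directories are hypothetical placeholders, not the ones used in this test.

```python
# Sketch: derive the TeraGen row count for ~1 TB and build the three commands.
# Assumption: "1 terabyte" means 10**12 bytes. TeraGen rows are 100 bytes each.
ROW_BYTES = 100
TARGET_BYTES = 10**12
rows = TARGET_BYTES // ROW_BYTES

# Hypothetical placeholders: adjust to your distribution and HDFS layout.
EXAMPLES_JAR = "/path/to/hadoop-mapreduce-examples.jar"
BASE_DIR = "/benchmarks/tera"


def tera_commands(row_count, jar=EXAMPLES_JAR, base=BASE_DIR):
    """Return the teragen, terasort and teravalidate command lines as lists."""
    gen_dir = f"{base}/gen"
    sort_dir = f"{base}/sort"
    report_dir = f"{base}/validate"
    return [
        ["hadoop", "jar", jar, "teragen", str(row_count), gen_dir],
        ["hadoop", "jar", jar, "terasort", gen_dir, sort_dir],
        ["hadoop", "jar", jar, "teravalidate", sort_dir, report_dir],
    ]


for cmd in tera_commands(rows):
    print(" ".join(cmd))
```

Each printed line can be run from a node with the Hadoop client installed. The output directories must not already exist in HDFS.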

Hardware: (Master & Datanode)

1 Master, 3 Data nodes

d2.4xlarge: 16 vCPUs, 122 GB RAM, up to 12 x 2000 GB storage

TeraGen Results:

1 hr, 6 min, 38 sec

Job Counters:

5664-teragen-counters.jpg

Terasort Results:

1 hr, 34 min, 20 sec

5666-terasort.jpg

Teravalidate Results:

25 min, 27 sec

5667-teravalidate.jpg
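
To put the three run times in perspective, the sketch below converts them into approximate logical throughput. It assumes the data set is exactly 10^12 bytes and divides evenly across the 3 data nodes. It ignores HDFS replication and shuffle traffic, so the figures are a rough guide and not a measurement of disk or network bandwidth.

```python
# Sketch: convert the reported wall-clock times into approximate throughput.
# Assumption: the data set is exactly 10**12 bytes (1 TB, decimal).
TOTAL_BYTES = 10**12
DATA_NODES = 3  # from the hardware section: 1 master, 3 data nodes


def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s duration into total seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Durations as reported in the results above.
durations = {
    "teragen": to_seconds(1, 6, 38),
    "terasort": to_seconds(1, 34, 20),
    "teravalidate": to_seconds(0, 25, 27),
}


def throughput_mb_s(seconds, total_bytes=TOTAL_BYTES):
    """Logical throughput in MB/s (10**6 bytes per second)."""
    return total_bytes / seconds / 1e6


for phase, secs in durations.items():
    aggregate = throughput_mb_s(secs)
    per_node = aggregate / DATA_NODES
    print(f"{phase}: {secs} s, {aggregate:.1f} MB/s aggregate, "
          f"{per_node:.1f} MB/s per data node")
```

Under these assumptions, the aggregate figures come out to roughly 250 MB/s for TeraGen, 177 MB/s for Terasort, and 655 MB/s for Teravalidate.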

Version history
Revision #:
2 of 2
Last update:
‎08-17-2019 11:27 AM