Short Description:

TeraGen and TeraSort performance testing on AWS


This article should be used with extreme care. Do not use it as a benchmark. I ran a quick 1-terabyte TeraGen test on AWS simply to see what kind of performance I could get from MapReduce on AWS with VERY LITTLE configuration tweaking or tuning.

On my GitHub page you will find the following:

  • teragen script
  • hadoop, yarn, mapred, and capacity scheduler configurations used during testing
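
For readers who want to size a similar run, here is a minimal sketch of how the row count for a 1 TB TeraGen job works out. This is not the script from the GitHub page. It relies on TeraGen writing fixed 100-byte rows and taking a row count plus an output directory as arguments. The jar name and the HDFS output path are placeholders, and the decimal definition of a terabyte (10^12 bytes) is an assumption.

```python
import shlex

ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def teragen_rows(target_bytes: int) -> int:
    """Return the number of 100-byte rows needed for target_bytes of data."""
    return target_bytes // ROW_BYTES


def teragen_command(rows: int, jar: str, out_dir: str) -> str:
    """Build the command line: hadoop jar <jar> teragen <rows> <out_dir>.

    jar and out_dir are placeholders; adjust them to your distribution.
    """
    return shlex.join(["hadoop", "jar", jar, "teragen", str(rows), out_dir])


ONE_TB = 10**12  # assumption: decimal terabyte

rows = teragen_rows(ONE_TB)
print(rows)
print(teragen_command(rows, "hadoop-mapreduce-examples.jar", "/tmp/teragen-1tb"))
```

The sketch only prints the command rather than running it, so it can be checked without a cluster.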

Hardware: (Master & Datanode)

1 Master, 3 Data nodes

d2.4xlarge: 16 vCPUs, 122 GB RAM, (max) 12 x 2000 GB storage

TeraGen Results:

1hrs, 6mins, 38sec

Job Counters:


TeraSort Results:

1hrs, 34mins, 20sec


TeraValidate Results:

25mins, 27sec
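
To put the three durations in perspective, the sketch below converts them into rough logical throughput. It assumes the data set is exactly 10^12 bytes and that all three phases processed the full data set, and it divides evenly across the 3 data nodes. It ignores HDFS replication, so the actual disk and network traffic during writes would be higher than these figures.

```python
DATASET_BYTES = 10**12  # assumption: 1 TB, decimal
DATA_NODES = 3          # from the hardware section above


def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an h/m/s duration to total seconds."""
    return hours * 3600 + minutes * 60 + seconds


def throughput_mb_s(elapsed_s: int, total_bytes: int = DATASET_BYTES) -> float:
    """Logical throughput in MB/s (1 MB = 10^6 bytes)."""
    return total_bytes / elapsed_s / 1e6


# Durations as reported in the results above
runs = {
    "TeraGen": to_seconds(1, 6, 38),
    "TeraSort": to_seconds(1, 34, 20),
    "TeraValidate": to_seconds(0, 25, 27),
}

for name, elapsed in runs.items():
    total = throughput_mb_s(elapsed)
    print(f"{name}: {elapsed} s, {total:.1f} MB/s cluster-wide, "
          f"{total / DATA_NODES:.1f} MB/s per data node")
```

Under these assumptions the runs work out to roughly 250 MB/s for TeraGen, 177 MB/s for TeraSort, and 655 MB/s for TeraValidate across the cluster. As with the rest of this article, treat these as rough numbers, not a benchmark.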


Last update: 08-17-2019 11:27 AM