Member since 03-16-2017
3 Posts | 1 Kudos Received | 1 Solution
04-01-2017 04:23 PM
@Constantin Stanca Any thoughts on this?
03-30-2017 05:23 AM
1) Please define the actual size and performance numbers that you encountered.
Ans.
Data Volume | Time elapsed for TEZ | Average Time for TEZ | Time elapsed for MR | Average Time for MR |
---|---|---|---|---|
1900 records | 46.350 secs | 41.626 secs | 63.666 secs | 56.176 secs |
1900 records | 40.341 secs | | 55.633 secs | |
1900 records | 38.189 secs | | 49.230 secs | |
91914 records | 32.049 secs | 32.097 secs | 52.920 secs | 51.236 secs |
91914 records | 32.088 secs | | 49.030 secs | |
91914 records | 32.156 secs | | 51.760 secs | |
993168 records | 850.01 secs | 861.781 secs | 611.625 secs | 635.781 secs |
993168 records | 865.230 secs | | 691.751 secs | |
993168 records | 872.110 secs | | 672.285 secs | |
993168 records | 868.995 secs | | 567.466 secs | |
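As a quick check on the figures above, here is a minimal Python sketch that recomputes the per-engine averages and the TEZ/MR ratio for each data volume. It assumes the posted values alternate TEZ then MR for each run, which is how the posted averages line up; the `runs` dictionary and the `summarize` helper are illustrative names, not part of the original test setup. Recomputed this way, the TEZ mean for the 993168-record case comes out near 864 secs rather than the posted 861.781 secs, so one of the posted figures may have been mistyped.

```python
from statistics import mean

# Elapsed times in seconds, copied from the figures above.
# Assumption: values alternate TEZ, MR for each run of the same data volume.
runs = {
    1900: {"tez": [46.350, 40.341, 38.189],
           "mr": [63.666, 55.633, 49.230]},
    91914: {"tez": [32.049, 32.088, 32.156],
            "mr": [52.920, 49.030, 51.760]},
    993168: {"tez": [850.01, 865.230, 872.110, 868.995],
             "mr": [611.625, 691.751, 672.285, 567.466]},
}


def summarize(runs):
    """Return {records: (tez_avg, mr_avg, tez_avg / mr_avg)}."""
    summary = {}
    for records, times in runs.items():
        tez_avg = mean(times["tez"])
        mr_avg = mean(times["mr"])
        summary[records] = (tez_avg, mr_avg, tez_avg / mr_avg)
    return summary


for records, (tez_avg, mr_avg, ratio) in summarize(runs).items():
    # A ratio below 1 means TEZ finished faster on average.
    faster = "TEZ" if ratio < 1 else "MR"
    print(f"{records:>7} records: TEZ {tez_avg:.3f}s, "
          f"MR {mr_avg:.3f}s, TEZ/MR {ratio:.2f} ({faster} faster)")
```

A ratio above 1 on the largest data set is the behaviour the question is about: TEZ is ahead on the two smaller volumes and behind on the largest one.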
2) Clarify which test beds you are referring to and how you used them.
Ans. In the statistics table above:
Operation 1 creates a lateral view on a small data set.
Operation 2 joins 3 tables of intermediate data volume.
Operation 3 joins 4 tables of large data volume in an inner query, with aggregation on top of that.
3) Clarify what type of test case you execute. It is important to clarify because some tests can be disk I/O intensive, while others can be memory intensive.
Ans.
1. The above jobs ran in parallel, i.e. 10 jobs in parallel in TEZ mode and 10 jobs in parallel in MR mode.
2. The above results are the output of multiple test iterations performed on different test beds.
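For anyone who wants to reproduce a run like this, the following is a rough Python sketch of launching 10 copies of a query in parallel and timing each one. It is not the harness used for the numbers above. It assumes the Hive CLI is on the path and that the engine is switched per run with the `hive.execution.engine` property; the function names `hive_command`, `timed_run` and `run_parallel` are made up for illustration, and the real test queries are not shown in this thread.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def hive_command(engine, query):
    # Assumed invocation: Hive CLI with the execution engine set per run.
    return ["hive", "--hiveconf", f"hive.execution.engine={engine}",
            "-e", query]


def timed_run(command, runner=subprocess.run):
    """Run one command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    runner(command, check=True)
    return time.perf_counter() - start


def run_parallel(engine, query, jobs=10, runner=subprocess.run):
    """Launch `jobs` copies of the query at once; return each elapsed time."""
    command = hive_command(engine, query)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(timed_run, command, runner)
                   for _ in range(jobs)]
        return [future.result() for future in futures]
```

Calling `run_parallel("tez", query)` and then `run_parallel("mr", query)` with the same query text would give two lists of 10 timings to compare. The `runner` parameter is only there so the timing logic can be exercised without a cluster.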
03-16-2017 08:45 AM
1 Kudo
We are doing some analysis on MR vs TEZ. TEZ does better than MR on small and medium data volumes, but MR beats TEZ on large volumes. We have seen this multiple times on different test beds. Please advise.
Labels:
- Apache Hadoop
- Apache Tez