Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Can we get better performance for Hive queries by using SSD?


One of my clients is using Azure-based IaaS for their HDP cluster. They are open to paying for more expensive storage to get better performance.

Is it recommended to put some of the data in Hive tables on SSD to get a performance boost? Also, what are the steps to point the temporary storage used by Tez/MR jobs at SSD?

1 ACCEPTED SOLUTION

2 REPLIES


My understanding is that yarn.nodemanager.local-dirs is the setting to point at SSD in order to get better performance on shuffle and other temporary data. I would also like to confirm this.
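For illustration, here is a rough sketch of what that setting might look like as a yarn-site.xml property. The SSD mount points below are hypothetical and would need to match wherever the SSD volumes are actually mounted on each NodeManager host. The Python helper is just a convenient way to generate the fragment and is not part of any Hadoop tooling. On an Ambari-managed HDP cluster the value would normally be changed through the YARN configuration screens rather than by editing the file by hand.

```python
import xml.etree.ElementTree as ET


def yarn_local_dirs_property(ssd_dirs):
    """Build a <property> element setting yarn.nodemanager.local-dirs.

    YARN expects a comma-separated list of local directories.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "yarn.nodemanager.local-dirs"
    ET.SubElement(prop, "value").text = ",".join(ssd_dirs)
    return prop


# Hypothetical SSD mount points; replace with the real paths on your nodes.
ssd_dirs = [
    "/mnt/ssd1/hadoop/yarn/local",
    "/mnt/ssd2/hadoop/yarn/local",
]

prop = yarn_local_dirs_property(ssd_dirs)
fragment = ET.tostring(prop, encoding="unicode")
print(fragment)
```

The printed element would go inside the <configuration> block of yarn-site.xml, and the NodeManagers would need a restart to pick it up.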
