If I submit a job, I see only one worker node in the ResourceManager. I doubt that all the data it is looking for is on a single host.
Is there a way to find out which nodes a specific job used?
You can see the details in the ResourceManager UI (http://rmhost:8088/ on a typical setup). Look for the application you are running, and you will see which nodes the mappers and reducers run on.
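If you would rather script this than click through the UI, here is a minimal sketch that counts task attempts per node for a finished job. It assumes the MapReduce JobHistory Server REST API is reachable and that task attempts report a `nodeHttpAddress` field; the function names `fetch_json` and `nodes_for_job` are made up for this example, so adjust them to your cluster.

```python
import json
from collections import Counter, defaultdict
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body (hypothetical helper)."""
    with urlopen(url) as resp:
        return json.load(resp)


def nodes_for_job(history_base, job_id, fetch=fetch_json):
    """Return {node address: {task type: attempt count}} for one job.

    history_base is the JobHistory Server base URL (an assumption about
    your setup); fetch can be swapped out for testing.
    """
    base = f"{history_base}/ws/v1/history/mapreduce/jobs/{job_id}"
    tasks = (fetch(f"{base}/tasks").get("tasks") or {}).get("task", [])

    per_node = defaultdict(Counter)
    for task in tasks:
        reply = fetch(f"{base}/tasks/{task['id']}/attempts")
        attempts = (reply.get("taskAttempts") or {}).get("taskAttempt", [])
        for attempt in attempts:
            # Count this attempt under the node that ran it, by task type.
            per_node[attempt["nodeHttpAddress"]][task["type"]] += 1

    return {node: dict(counts) for node, counts in per_node.items()}
```

You would call it as something like `nodes_for_job("http://historyhost:19888", "<your job id>")`, where `historyhost` and port 19888 are placeholders for wherever your history server runs. If the result has a single key, all map and reduce attempts really did run on one node. For a job that is still running, the history server may not have it yet, and the same information should be available through the ResourceManager proxy for that application instead.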