
Hadoop, Hive, Spark, what's going onnnn?

New Contributor

I started a new job and will soon be working with Hive and PySpark to pull data from the company's big data lake. I have lots of experience with Python and SQL but not much with big data systems. Can anyone recommend good books to help a data scientist understand how to work with Hadoop systems? It would be extra helpful if they go into detail on Hive and Spark.

1 REPLY

Guru
@ElistonCole,

If you have access to O'Reilly, the Hive book below is useful. It was published in 2018, so it is fairly recent compared with the others:
https://learning.oreilly.com/library/view/apache-hive-essentials/9781788995092/

For Spark, the book below, published this year, is good:
https://learning.oreilly.com/library/view/learning-spark-2nd/9781492050032/

Hope that helps.

Cheers
Eric