Saturday, November 2, 2013

Book Review: Hadoop Operations and Cluster Management Cookbook

I just had the opportunity to dig a little deeper into the Hadoop Operations and Cluster Management Cookbook. Its 368 pages and eight chapters follow this path: big data, preparing, configuring, managing, hardening, monitoring, tuning, and running on AWS.
The book claims to provide 60 recipes and targets everyone who has to maintain a Hadoop cluster. It also refers to many tools in the ecosystem, such as Flume, Oozie, Sqoop, Spark, Storm, Ambari, etc. The book is full of the typical Packt hints. I really liked the installation part, which is full of color pictures (especially the Amazon chapter) that show, e.g., how to select the right packages, and it covers even small steps in appropriate detail (e.g. the SSH configuration). Many topics, such as import/export, logging, and monitoring, are written so practically that you feel an intense urge to install something like Ganglia. Newcomers who have never had Hadoop installed will find the opening chapters 1-3 valuable. Everyone else who would like to go into depth and find new tips and tricks will surely find plenty in chapters 4-8, e.g. load balancing, job analysis, block size, startup time, etc.

I have rarely seen a book with as much hands-on material and installation code as this one (which is a good sign, because doing is much more fun than reading plain text). Although I have only run a few small clusters in a very unsophisticated mode for short, specific tasks (and hence my knowledge is limited), I am sure that it would take years to find operations and management knowledge NOT covered in this book. And I am still unable to find any weaknesses.

To summarize: the book sets out to explain 60 tasks and tricks in Hadoop, and it succeeds superbly. I know of no other book that takes the reader by the hand, step by step, through Hadoop. Congratulations.
