Could have been better if the authors had focused on architectural concerns only
Unlike other books on Hadoop, Hadoop Application Architecture tries to provide the high-level view architects need to decide which tools to use to satisfy the “-ibilities”. The idea of the book is excellent and the authors are uniquely qualified for the task, since they help clients and partners integrate Hadoop into their ecosystems. The result is an interesting book, but I was expecting more.
As the book progresses, the authors slip into simple framework introductions and the main subject of the book (architecture) is set aside. For example, instead of basic examples on Oozie (how to create a linear flow, how to set the timezone), I would have preferred that the authors discuss the limitations of the tool. Moreover, the content is uneven. Some chapters contain too much code while others do not contain enough, and some cover advanced topics while others merely introduce their subject like a “Getting Started” guide. This is unfortunate because technical details do not help the reader form a high-level view of the subject. I think the book would have been better if the authors had resisted the temptation to include technical details easily found elsewhere and had instead developed the case studies and the comparison of different solutions. Note that the subject of Machine Learning is completely absent.
Ultimately, Hadoop Application Architecture provides just enough information to make an informed choice about the tools covered in the book, but because the Hadoop ecosystem is evolving at a fast pace, the book will soon become outdated (if it is not already). Whether you’re designing a new Hadoop application or planning to integrate Hadoop into your existing data infrastructure, I recommend this book after you have read the classic Hadoop: The Definitive Guide, even though many of your questions will remain unanswered at the end of it.