CDN$ 48.09
  • List Price: CDN$ 58.34
  • You Save: CDN$ 10.25 (18%)

Hadoop in Action Paperback – Dec 25 2010

Amazon Price: CDN$ 48.09
New from: CDN$ 29.49
Used from: CDN$ 47.96

Product Details

  • Paperback: 325 pages
  • Publisher: Manning Publications; 1 edition (Dec 25 2010)
  • Language: English
  • ISBN-10: 1935182196
  • ISBN-13: 978-1935182191
  • Product Dimensions: 18.7 x 1.7 x 23.5 cm
  • Shipping Weight: 590 g
  • Amazon Bestsellers Rank: #291,820 in Books

Product Description

About the Author

Chuck Lam is a Senior Engineer at RockYou!. Chuck received his B.S. from San Jose State University and his Ph.D. in Electrical Engineering from Stanford University, where his thesis topic was computational data acquisition.

Customer Reviews

Most Helpful Customer Reviews (14 reviews)
19 of 22 people found the following review helpful
Good introduction to Hadoop ecosystem, March 24 2012
By Erik Gfesser
Format: Paperback Verified Purchase
After checking out reviews of what O'Reilly and Apress had to offer with regard to Hadoop, I ended up purchasing this book based on positive reviews, my past positive experiences with the Manning "In Action" series of texts in general, such as "Spring in Action" and "Java Persistence with Hibernate", formerly "Hibernate in Action" (see my reviews), and the fact that this book was the most recently published on the subject. In short, this text is well organized and covers Hadoop well, but potential readers should be aware that about one-third of what Lam has to offer here is ancillary to Hadoop rather than about Hadoop itself. Including the larger ecosystem within which Hadoop sits makes sense to me, and I do not think this aspect of the book detracts from what the author provides in any way.

The author provides a good introduction to Hadoop in the first three chapters, which includes a discussion on differences between Hadoop and traditional technologies in this space, such as relational databases, as well as a tour of Hadoop building blocks, working with files in the Hadoop Distributed File System (HDFS), and the anatomy of a MapReduce program. The next three chapters contain the bulk of the text, which focuses on writing MapReduce programs, and includes segments on chaining MapReduce jobs, joining data from different sources, creating a Bloom filter, and monitoring, debugging, and tuning.

The next two chapters offer a short cookbook in which the author presents 5 different general MapReduce techniques (Lam admits that specialized MapReduce techniques can be found rather easily by Googling, and that he does not intend this cookbook to be comprehensive in any way), as well as a chapter on managing Hadoop, followed by four chapters on running Hadoop in the cloud, brief introductions on programming with Pig (a Hadoop extension that provides a language called Pig Latin) and using Hive (a package built on top of Hadoop that provides a SQL-like language called HiveQL), and a chapter that discusses four Hadoop case studies from the New York Times, China Mobile, StumbleUpon, and IBM (the case study from IBM takes up about 50% of the discussion, and the case study from the New York Times is less than a page).

Be aware that at the time of this review, this book was published over a year ago. One of the common complaints I read about what O'Reilly and Apress have to offer in this space is that their counterparts to this book cover older versions of Hadoop. In chapter 4, Lam mentions that "one of the main design goals driving toward Hadoop's major 1.0 release is a stable and extensible MapReduce API. As of this writing, version 0.20 is the latest release and is considered a bridge between the older API (that we use throughout this book) and this upcoming stable API. The 0.20 release supports the future API while maintaining backward-compatibility with the old one by marking it deprecated."

"Future releases after 0.20 will stop supporting the older API. As of this writing, we don't recommend jumping into the new API yet for a couple reasons: (1) Many of Hadoop's own library classes in 0.20 aren't written under the new API yet. You won't be able to use those classes if your MapReduce code uses the new API in 0.20. (2) Many still consider the most production-ready and stable version of Hadoop as of this writing to be 0.18.3. Some users are warming up to version 0.20, but we suggest you wait a little longer before going full production with it." The author follows up by writing that "by the time you read this the situation may be different. In this section we cover the changes the new API presents. Fortunately, almost all the changes affect only the basic MapReduce template. We rewrite the template under the new API to enable you to use it in the future."

Exactly two weeks ago today, Hadoop 1.0.1 was released after 6 years of development. Between the version that this book covers and this most recent version, several intermediate versions were released, which provide bug fixes, improvements, optimizations, and new features, as well as support for some of the offerings in the Hadoop ecosystem. More timely information on open source technologies that enjoy wide community support is always going to be more readily available on the internet, especially via blog posts, but in my opinion this fact does not detract from the value of this text, which still serves as a good introduction to the Hadoop ecosystem, especially for those more comfortable starting out with a published text. Just be aware that you will be quickly referring to other materials after you make your way through this text.

The portions that I especially appreciated about what Lam has to offer include his presentations in chapter 5 on reduce-side joining and creating a Bloom filter, the cookbook that he provides in chapter 7 that includes segments on passing job-specific parameters to tasks, probing for task-specific information, partitioning into multiple output files, inputting from and outputting to a database, and keeping all output in sorted order, as well as chapters 9, 10, and 11, which discuss the larger Hadoop ecosystem, especially the introduction to Pig Latin. Recommended to anyone looking for an introduction to the Hadoop ecosystem of technologies who understands that published texts such as this one cannot contain information about the latest releases.
24 of 29 people found the following review helpful
outdated! Oct. 19 2012
By G. Franklin
Format: Paperback
This book covers Hadoop version 0.20, which is quite outdated relative to the current stable version of 1.x and the coming 2.x. This should have been made clear on the cover, in the preface, or in the product description, but the version is not mentioned until page 28, where the book starts discussing how to configure Hadoop. In my opinion, authors and publishers should make it clear right upfront what version of the product is covered, to help readers make an informed choice. In addition, the examples are mostly based on the ubiquitous word-counting example that everyone uses, which is quite boring. If you just want to read about Hadoop and don't plan to actually run any samples, this book is fine. But if you also want to try some samples not based on the word-counting example, you might want to check out another book titled "Hadoop Essentials: A Quantitative Approach", which is based on the latest stable release of 1.0.3.
16 of 19 people found the following review helpful
Hadoop book for normal people, Dec 14 2010
By Amazon Customer
Format: Paperback Verified Purchase
I really love this book; it is made for normal people just trying to get something done. The streaming coverage is pretty good, and it's the best book for Python type of people I've seen. Lots of configuration information, very practical. I can't really review the Java examples, but I did like the very practical examples on simple combiners. I think this book, in combination with the newer version of "definitive guide" (make sure to get the recent one), really makes a solid statement on the Hadoop front. I think both books are mandatory for anyone doing anything serious in Hadoop.
3 of 3 people found the following review helpful
scattered book. little to no benefit over on-line information, Jan. 31 2013
By R. Singer
Format: Paperback Verified Purchase
This book has little to no value over what you can just read on-line. The writing style wasn't great and it only covers the surface. No real descriptions of how things really work, performance issues, etc.
2 of 2 people found the following review helpful
4 out of 5 stars: Great book, but outdated. Nov. 24 2014
By Nelson Estrada
Format: Paperback Verified Purchase
Great intro to Hadoop and the Hadoop ecosystem. The reason I give this 4 stars is that this book is fairly outdated. The Hadoop world is moving at the speed of light, and a book published 3-4 years ago will not give you the necessary skills to work with today's versions/APIs of MapReduce/HDFS/etc. If you want more than a conceptual understanding of Hadoop, I would wait for the second edition (which is expected to come out next year) or find another book.