
Hadoop in Action Paperback – Dec 25 2010



There is a newer edition of this item:

Hadoop in Action
CDN$ 63.60
This title has not yet been released.







About the Author

Chuck Lam is a Senior Engineer at RockYou!. Chuck received his B.S. from San Jose State University and his Ph.D. in Electrical Engineering from Stanford University, where his thesis topic was computational data acquisition.




Customer Reviews

There are no customer reviews yet on Amazon.ca

Most Helpful Customer Reviews on Amazon.com (beta)

Amazon.com: 14 reviews
18 of 21 people found the following review helpful
Good introduction to Hadoop ecosystem March 24 2012
By Erik Gfesser - Published on Amazon.com
Format: Paperback Verified Purchase
After checking out reviews of what O'Reilly and Apress had to offer with regard to Hadoop, I ended up purchasing this book based on positive reviews, my past positive experiences with the Manning "In Action" series of texts in general, such as "Spring in Action" and "Java Persistence with Hibernate", formerly "Hibernate in Action" (see my reviews), and the fact that this book was the most recently published on the subject. In short, this text is well organized and covers Hadoop well, but potential readers should be aware that about one-third of what Lam has to offer here is ancillary to Hadoop rather than about Hadoop itself. Including the larger ecosystem within which Hadoop sits makes sense to me, and I do not think this aspect of the book detracts from what the author provides in any way.

The author provides a good introduction to Hadoop in the first three chapters, which includes a discussion on differences between Hadoop and traditional technologies in this space, such as relational databases, as well as a tour of Hadoop building blocks, working with files in the Hadoop Distributed File System (HDFS), and the anatomy of a MapReduce program. The next three chapters contain the bulk of the text, which focuses on writing MapReduce programs, and includes segments on chaining MapReduce jobs, joining data from different sources, creating a Bloom filter, and monitoring, debugging, and tuning.

The next two chapters offer a short cookbook in which the author presents 5 different general MapReduce techniques (Lam admits that specialized MapReduce techniques can be found rather easily by Googling, and that he does not intend this cookbook to be comprehensive in any way), as well as a chapter on managing Hadoop. These are followed by four chapters: one on running Hadoop in the cloud, brief introductions to programming with Pig (a Hadoop extension that provides a language called Pig Latin) and using Hive (a package built on top of Hadoop that provides a SQL-like language called HiveQL), and a chapter that discusses four Hadoop case studies from the New York Times, China Mobile, StumbleUpon, and IBM (the case study from IBM takes up about 50% of the discussion, and the case study from the New York Times is less than a page).

Be aware that at the time of this review, this book was published over a year ago. One of the common complaints I read about what O'Reilly and Apress have to offer in this space is that their counterparts to this book cover older versions of Hadoop. In chapter 4, Lam mentions that "one of the main design goals driving toward Hadoop's major 1.0 release is a stable and extensible MapReduce API. As of this writing, version 0.20 is the latest release and is considered a bridge between the older API (that we use throughout this book) and this upcoming stable API. The 0.20 release supports the future API while maintaining backward-compatibility with the old one by marking it deprecated."

"Future releases after 0.20 will stop supporting the older API. As of this writing, we don't recommend jumping into the new API yet for a couple reasons: (1) Many of Hadoop's own library classes in 0.20 aren't written under the new API yet. You won't be able to use those classes if your MapReduce code uses the new API in 0.20. (2) Many still consider the most production-ready and stable version of Hadoop as of this writing to be 0.18.3. Some users are warming up to version 0.20, but we suggest you wait a little longer before going full production with it." The author follows up by writing that "by the time you read this the situation may be different. In this section we cover the changes the new API presents. Fortunately, almost all the changes affect only the basic MapReduce template. We rewrite the template under the new API to enable you to use it in the future."

Exactly two weeks ago today, Hadoop 1.0.1 was released after 6 years of development. Between the version that this book covers and this most recent version, several intermediate versions were released, which provide bug fixes, improvements, optimizations, and new features, as well as support for some of the offerings in the Hadoop ecosystem. More timely information on open source technologies that enjoy wide community support is always going to be more readily available on the internet, especially via blog posts, but in my opinion this fact does not detract from the value of this text, which still serves as a good introduction to the Hadoop ecosystem, especially for those more comfortable starting out with a published text. Just be aware that you will quickly be referring to other materials after you make your way through this text.

The portions that I especially appreciated about what Lam has to offer include his presentations in chapter 5 on reduce-side joining and creating a Bloom filter, the cookbook that he provides in chapter 7 that includes segments on passing job-specific parameters to tasks, probing for task-specific information, partitioning into multiple output files, inputting from and outputting to a database, and keeping all output in sorted order, as well as chapters 9, 10, and 11, which discuss the larger Hadoop ecosystem, especially the introduction to Pig Latin. Recommended to anyone looking for an introduction to the Hadoop ecosystem of technologies who understands that published texts such as this one cannot contain information about the latest releases.
23 of 28 people found the following review helpful
outdated! Oct. 19 2012
By G. Franklin - Published on Amazon.com
Format: Paperback
This book covers Hadoop version 0.20, which is quite outdated relative to the current stable version of 1.x and the coming 2.x. This should have been made clear on the cover or in the preface or product description, but it is not mentioned until page 28, where the book starts talking about configuring Hadoop. In my opinion, authors and publishers should make it clear right up front what version of the product is covered to help readers make an informed choice. In addition, the examples are mostly based on the ubiquitous word-counting example that everyone uses, which is quite boring. If you just want to read about Hadoop and don't plan to actually run any samples, this book is fine. But if you also want to try some samples not based on the word-counting example, you might want to check out another book titled "Hadoop Essentials: A Quantitative Approach", which is based on the latest stable release of 1.0.3.
16 of 19 people found the following review helpful
Hadoop book for normal people Dec 14 2010
By Amazon Customer - Published on Amazon.com
Format: Paperback Verified Purchase
I really love this book, it is made for normal people just trying to get something done. The streaming coverage is pretty good, it's the best book for Python type of people I've seen. Lots of configuration information - very practical. I can't really review the Java examples, but I did like the very practical examples on simple combiners. I think this book in combination with the newer version of "definitive guide" (make sure to get the recent one), really makes a solid statement on the Hadoop front. I think both books are mandatory for anyone doing anything serious in Hadoop.
11 of 14 people found the following review helpful
Excellent Intro to Hadoop Jan. 7 2011
By Kenneth DeLong - Published on Amazon.com
Format: Paperback
This book is extremely well-written and clear, as well as being very pragmatic and useful. You can really understand how to set up and use Hadoop. I've read many other articles on Hadoop and MapReduce, but after reading this book I thought "why couldn't those articles have explained it that clearly?"
3 of 3 people found the following review helpful
scattered book. little to no benefit over on-line information Jan. 31 2013
By R. Singer - Published on Amazon.com
Format: Paperback Verified Purchase
This book has little to no value over what you can just read on-line. The writing style wasn't great and it only covers the surface. No real descriptions of how things really work, performance issues, etc.

