Programming Collective Intelligence: Building Smart Web 2.0 Applications Paperback – Aug 16 2007
About the Author
Toby Segaran is the author of Programming Collective Intelligence, a very popular O'Reilly title. He was the founder of Incellico, a biotech software company later acquired by Genstruct. He currently holds the title of Data Magnate at Metaweb Technologies and is a frequent speaker at technology conferences.
Most Helpful Customer Reviews on Amazon.com (beta)
My area of strength happens to be neural networks (my MS thesis topic was in the subject), so I will focus on that. In a few pages of the book, the author describes how the most popular of all neural networks, backpropagation, can be used to map a set of search terms to a URL. One might do this, for example, to try to find the page best matching the search terms. Rather than doing what nearly all other authors do and proving the math behind the backprop training algorithm, he mentions what it does and goes on to present Python code that implements the stated goal.
The upside of the approach is clear -- if you know the theory of neural networks, and are not sure how to apply it (or want to see an example of how it can be applied), then this book is great for that. His example of adaptively training a backprop net using only a subset of the nodes in the network was interesting, and I learned from it. Given all the reading I have done over the years on the subject, that was a bit of a surprise for me.
However, don't take this book as being the "end all, be all" for understanding neural networks and their applications. If you need that, you will want to augment this book with writings that cover some of the other network architectures (SOM, Hopfield, etc.) that are out there. The same goes for the other topics that it covers.
In the end, this book is a great introduction to what is available for those new to machine learning, and shows better than any other book how it applies to Web 2.0. Major strengths of this book are its broad coverage, and the practicality of its contents. It is a great book for those who are struggling with the theory, and/or those who need to see an example of how the theory can be applied in a concise, practical way.
To the author: I expect this book will get a second edition, as the premise behind the book is such a good one. If that happens, perhaps beef up the equations a bit in the appendix, and cite some references or a bibliography for those readers interested in some more in-depth reading about the theory behind all these wonderful techniques. (The lack of a bibliography is why I gave it 4 stars out of 5; I really think that those who are new to the subject would benefit greatly from knowing what sits on your bookshelf.)
My favorite part is how he shows us code (gives it to us!) that goes out into the world, grabs masses of data and does interesting things with it. The use of a hierarchical clustering algorithm to dig into people's intrinsic desires in life as expressed on Zebo is worth the price of the book alone. The graph that shows a strong connection between "wife", "kids", and "home" but a different connection between "husband", "children", and "job" is IMHO just fascinating.
Gems like that make this book worth reading cover to cover. After that it can happily hang out on your shelf as a reference anytime you need to build something to mine user data and extract the wisdom of crowds.
Introduction to Collective Intelligence; Making Recommendations; Discovering Groups; Searching and Ranking; Optimization; Document Filtering; Modeling with Decision Trees; Building Price Models; Advanced Classification - Kernel Methods and SVMs; Finding Independent Features; Evolving Intelligence; Algorithm Summary; Third-Party Libraries; Mathematical Formulas; Index
In each of the chapters, Segaran takes a type of capability, be it decision-making or filtering, and shows how a programming language can be used to build that feature. His examples are all in Python, so it helps if you are already familiar with that language if you want to actually work with the code. But even if you don't know Python, the examples are clear and detailed enough that you can follow along and get the gist of what's happening. I personally think that it would help immensely if you had a background in mathematics and statistics. You can use the code here without having a detailed understanding of math, but I'm sure much of this would be more deeply appreciated if you already know about such things as Tanimoto similarity scores, Euclidean distances, or Pearson coefficients.
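For readers unfamiliar with the three measures named above, here is a minimal Python sketch of each. It is an illustration using the standard textbook definitions, not the book's own code; the function names and the choice to return 0.0 for degenerate inputs are assumptions made for this example.

```python
from math import sqrt


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length numeric vectors.

    Returns 0.0 when either vector has zero variance (an assumption made
    here to avoid dividing by zero).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def tanimoto(items_a, items_b):
    """Tanimoto (Jaccard) similarity of two collections treated as sets:
    size of the intersection divided by size of the union."""
    set_a, set_b = set(items_a), set(items_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)
```

Euclidean distance is smaller for more similar items, while the Pearson and Tanimoto scores are larger, so the first is usually inverted or rescaled before being compared with the other two.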
From my perspective (a non-Python programmer *without* the math background), I was more interested in understanding the overall picture about things like how ranking systems work or how recommendation engines are structured. While there was more detail than I needed (or understood), I still felt as if I accomplished my goal. I have a much greater appreciation for what companies like Google and Amazon have done in building web applications that allow the knowledge and wisdom of groups to be gathered and applied to my own preferences.
Statistical programmers will probably find years of entertainment here. :) "Normal" programmers will expand their horizons, too.
Unfortunately the book suffers from a number of problems. As other reviews have noted, the editing is very sloppy indeed: many of the code samples don't work, due to basic editing errors such as variables with different names on different lines of code, and frequently the text describing the code does not match the code it surrounds. Additionally, while the code samples aim to be brief and informal, they are written in a poor coding style: readability is sacrificed frequently for very small space savings. Variable names follow the "wordsruntogether" style rather than using underscores_like_this or camelCase (as normal Python style would recommend), the author doesn't even bother to put spaces around his equals signs, and expressions are telescoped onto one line where a little more space would have made the example much clearer (particularly to anyone meeting Python for the first time). Coverage of what are frequently fairly advanced mathematical topics is quite superficial, and the author tries to maintain the fiction that you can understand this with only a high-school or perhaps college 101 math background. This just isn't true, and results in a book that fails everyone: those who know the math are annoyed and unsatisfied at the depth, while those who don't are probably mystified and confused.
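To make the style complaint concrete, here is a small hypothetical contrast. Neither function is taken from the book; both are invented for illustration, and the names `avgscore` and `average_score` are assumptions. The first is written in the cramped style described above, the second with underscores, spacing, and one step per line.

```python
# Cramped style of the kind described above (hypothetical, not from the book):
# run-together names, no spaces around operators, everything on one line.
def avgscore(scorelist):return sum(scorelist)/len(scorelist) if scorelist else 0


# The same behavior with descriptive snake_case names and room to breathe.
def average_score(score_list):
    """Return the mean of score_list, or 0 for an empty list."""
    if not score_list:
        return 0
    return sum(score_list) / len(score_list)
```

Both versions behave identically; the only difference is how much effort a first-time Python reader needs to see that.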
As well as being mostly non-functional for reasons of poor editing, many of the code samples also no longer work because they rely on APIs of websites such as Facebook, Hot or Not and Zillow which have changed or which no longer function. While we can't really fault the author for this 3+ years after publication, it does add to the reasons not to buy the book now. Fortunately the Kindle "sample chapter" for this book is about 75% of the book's content, so you can read most of it for free. It would be nice if O'Reilly's errata page for the book were maintained, and also if the Kindle edition had the errata applied to it. But I guess that is too much to ask.
It's worth reading the free chapters in the Kindle sample if you want a quick skim of these topics and aren't going to bother with the code samples, but don't spend money on the book or time on the code.
Look for similar items by category
- Books > Computers & Technology > Computer Science > Artificial Intelligence > Human Vision & Language Systems
- Books > Computers & Technology > Computer Science > Artificial Intelligence > Machine Learning
- Books > Computers & Technology > Programming > Algorithms
- Books > Computers & Technology > Programming > Languages & Tools
- Books > Computers & Technology > Programming > Software Design, Testing & Engineering > Software Development
- Books > Computers & Technology > Software
- Books > Computers & Technology > Web Development > Web 2.0
- Books > Textbooks > Computer Science & Information Systems > Algorithms
- Books > Textbooks > Computer Science & Information Systems > Artificial Intelligence
- Books > Textbooks > Computer Science & Information Systems > Computer Science
- Books > Textbooks > Computer Science & Information Systems > Programming Languages