17 of 18 people found the following review helpful
4.0 out of 5 stars
pure fun, Mar 3 2011
By H. Smith "profhal" - Published on Amazon.com
This review is from: Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites (Paperback)
Mining the Social Web does a great job of introducing a wide variety of techniques and a wealth of resources for exploring freely available social data and personal information. If you are willing to spend the time tinkering with the examples, the book is pure fun. It offers a nice complement to Segaran's Programming Collective Intelligence: Building Smart Web 2.0 Applications. The two books overlap, but where they do, they offer different perspectives on and explanations of common techniques (e.g., TF-IDF, cosine similarity, Jaccard index). If you are well-versed in data mining the web, you may find much of the discussion familiar. If you have only been casually engaged to date, your toolbox will fill quickly.
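For readers who have not met two of the techniques named above, here is a minimal sketch of the Jaccard index and cosine similarity in plain Python. It is not taken from either book; the function names and the two sample token lists are made up for illustration, and cosine similarity is computed on raw term counts rather than TF-IDF weights.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample documents, already tokenized on whitespace.
doc1 = "mining the social web".split()
doc2 = "mining the semantic web".split()

print(jaccard(doc1, doc2))
print(cosine(doc1, doc2))
```

Both measures return a value between 0 and 1 for these inputs, with higher values meaning the two token lists share more terms.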
In order to work with the book's examples related to LinkedIn and Facebook, you really need to have a robust collection of connections. In terms of the source code itself, most of it worked as is. I wasn't able to install the Buzz library, which limited my interaction with the material in chapter 7, and I opted not to get involved with the LinkedIn or Facebook examples, but I found the discussions around them easy to follow. By far my favorite chapter in the book was chapter 8, "Blogs et al.: Natural Language Processing (and Beyond)..." It was quite fascinating and caused my reading list to grow considerably.
11 of 12 people found the following review helpful
5.0 out of 5 stars
A book that covers an awesome lot of ground, Feb 7 2011
By Ricardo Bánffy - Published on Amazon.com
This book covers a lot of ground. At times it is a bit vertiginous in the number of subjects and technologies it touches per chapter, and it is not always easy to follow. It can also introduce so many interesting things that, by the time you have finished becoming familiar with all of them, after wandering for hours on the web, jumping from interesting technology to interesting technology, you may have forgotten what took you to these places and wonder where you were in the book. Time spent reading it is, however, time very well spent. When you finish it, you will have at least a cursory familiarity with tools like OAuth, CouchDB, Redis, MapReduce, NumPy (and the Python programming language, although it will help you a lot if you know your way around Python before you start the book), Graphviz, SIMILE widgets, NLTK, and various service APIs and data formats, and you will be well equipped to explore those rich datasets on your own. The chapters are well compartmentalized, and it's easy to pick chapters to read according to your needs. I know that, when I face the problems they tackle, I will do exactly that.
If you do any kind of analysis and visualization of socially generated data on the web, this book is a good pick. Even if your datasets are not from the web, you may find the parts on analysis and visualization very interesting.
5 of 5 people found the following review helpful
5.0 out of 5 stars
Easy to read. I tore through it, Mar 8 2011
By Wiebe de Jong - Published on Amazon.com
Some basic programming ability is a must for this book, as the first page starts with installing the Python development tools. If you don't know Python, that is okay since all the code is easy to follow. Everything you need to develop and run the examples is described step by step with clear instructions at every point.
Once you get comfortable with the basics, the author quickly moves from topic to topic, giving a good introduction to many aspects of how to mine data and generate useful conclusions. Some of the examples include:
accessing your Twitter feed with OAuth,
processing feeds to determine influence,
using set-wise operations with Redis to determine which of your friends are also followers,
storing data in CouchDB,
using map-reduce to determine the most popular mentions and topics,
natural language processing,
and seeing data with various visualization tools.
And that was just for Twitter.
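To give a flavor of two items in that list, here is a rough sketch using only the Python standard library. It is not the book's code: built-in sets stand in for Redis set operations, a Counter stands in for the map-reduce step, and all screen names and tweets are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical screen names; with Redis these would live in two sets
# and be intersected on the server instead.
friends = {"alice", "bob", "carol"}
followers = {"bob", "carol", "dave"}

# Friends who also follow you back: a plain set intersection.
mutual = friends & followers

# Hypothetical tweets; count @mentions, ignoring case.
tweets = [
    "@bob nice post",
    "thanks @carol and @Bob",
    "hello world",
]
mention_counts = Counter(
    name.lower()
    for tweet in tweets
    for name in re.findall(r"@(\w+)", tweet)
)

print(sorted(mutual))
print(mention_counts.most_common(2))
```

The same intersection and tally scale up once the data no longer fits in one process, which is where tools like Redis and map-reduce come in.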
The book continues on with examples of processing mailboxes, LinkedIn, Google Buzz, blogs, Facebook, and the Semantic Web. The examples show how easy it is to gather and analyze data from all these social web sites.
With a good breadth of coverage, I highly recommend this book for anyone wanting to learn to process and visualize large amounts of data, either from the social web or any other data source.