Spidering Hacks (Paperback)

by Kevin Hemenway (Author), Tara Calishain (Author)
4.6 out of 5 stars (8 customer reviews)
List Price: CDN$ 38.95
Price: CDN$ 24.54 & eligible for FREE Super Saver Shipping on orders over CDN$ 39.
You Save: CDN$ 14.41 (37%)
Usually ships within 3 to 5 weeks.
Ships from and sold by Amazon.ca. Gift-wrap available.

Ordering for Christmas? This item requires additional time to ship and will arrive after December 25. Need a last-minute gift? Send an Amazon.ca Gift Certificate.

12 new from CDN$ 15.69; 7 used from CDN$ 5.52

Customers Who Bought This Item Also Bought

CSS: The Definitive Guide

by Eric A Meyer
4.5 out of 5 stars (2)  CDN$ 37.16
Product Description

The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side-by-side without the constraints of a browser, then "Spidering Hacks" is for you.

"Spidering Hacks" takes you to the next level in Internet data retrieval--beyond search engines--by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented--you'll learn how to scrape and repurpose raw data so you can view it in a way that's meaningful to you.

Written for developers, researchers, technical assistants, librarians, and power users, "Spidering Hacks" provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content. By the time you finish "Spidering Hacks," you'll be able to:

Aggregate and associate data from disparate locations, then store and manipulate the data as you like

Gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites

Integrate third-party data into your own applications or web sites

Make your own site easier to scrape and more usable to others

Keep up-to-date with your favorite comic strips, news stories, stock tips, and more without visiting the site every day

Like the other books in O'Reilly's popular Hacks series, "Spidering Hacks" brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data.



About the Author

Tara Calishain is the editor of ResearchBuzz, a weekly newsletter on Internet searching. She's also a regular columnist for SEARCHER and has written for a variety of other publications. Her author/co-author credits include Google Hacks and Official Netscape Guide to Internet Research.



Customer Reviews

8 Reviews
5 star: (5)
4 star: (3)
3 star: (0)
2 star: (0)
1 star: (0)

Average Customer Review
4.6 out of 5 stars (8 customer reviews)

Most helpful customer reviews

 
4.0 out of 5 stars Many examples of how to use spiders, April 8 2004
By W Boudville (Terra, Sol 3)
The book has a nice collection of case studies on how to gather data from disparate websites. You might consider this as showing a simple way for you to use Web Services.

Spidering is the way that search engines gather their data. But you do not have to be Altavista or Google to use spiders. Nor do you have to be scanning a large fraction of the Web. The authors demystify spiders. If you can follow their examples, then you get concrete instances of usage that might help your particular application.

Thoughtfully, the examples are mostly written in Perl, with a few in Java. These languages should be familiar to many. Though even if you don't know them, the logic of the code can still be useful. (That is, you can treat the code as pseudocode.)

While spiders are probably best known as being used by search engines, they are really only the starting point for the latter. The much harder problems start when you have the data amassed by a spider. Now you have to efficiently find correlations between the various web pages. You should be aware that the book does not discuss these in any significant depth. Not surprising, because these are outside the scope of the book. The examples do show how to use the data found by spiders. But most of these are for web pages that sit in a given domain, so the pages are closely affiliated in content and structure.

5.0 out of 5 stars Lots of great ideas, Mar 22 2004
By Jack D. Herrington "engineer and author" (Silicon Valley, CA)
Once in a long while you get a book that inspires you with a lot of great small ideas. Spidering Hacks is just that type of book. The web has a wealth of structured and semi-structured data that is just waiting to be mined with automated tools. This book not only teaches you how to get the data out of these sources, but gives you ideas about where to look for information and what to do with it.

This book demonstrates everything I like in a technical book. It not only describes how things are done. It also gives practical examples of how the technology can be useful in the real world, and presents them enthusiastically. It makes you want to go out and implement all of the ideas and to keep on going with some of your own.

Nitpicks I have with the book are minor. The 'Hacks' format seems imposed; for example, hack #8 is about installing CPAN. I don't think that section should be left out, but I don't think it's a hack either. But hey, I don't care that much about the structure as long as it isn't an imposing flaw and the content within the structure is great, as it is with this book.

Have to say, O'Reilly is on a roll with the Hacks series. They have all been fine books.

5.0 out of 5 stars Example-filled and easy-to-follow, Mar 7 2004
By Midwest Book Review (Oregon, WI USA)
The knowledgeable collaboration of Kevin Hemenway and Tara Calishain, Spidering Hacks: 100 Industrial-Strength Tips & Tools is an extensive, 402-page instructional guidebook and reference to Internet data retrieval through the use of spiders and scrapers. Including information on methodology, philosophies, and ethical considerations, as well as freely available modules, scripts, frameworks, and templates, information on how to build alternative interfaces to online databases, how to keep one's data current and share it in a user-friendly manner, and so much more, Spidering Hacks is an example-filled, easy-to-follow, highly recommended computer shelf resource.


Most recent customer reviews

4.0 out of 5 stars Rich samples, fits your specific needs if you're a Perl lover
If you are a Perl lover looking for a book to help you extract content from this huge, resource-rich Internet, this book fits your needs quite well.
Published on Feb 25 2004 by Otto Yuen

4.0 out of 5 stars Good book with a light start
The 'Hacks' series from O'Reilly seems to be breeding as fast as viruses in a Windows network: every time you turn around, there's another one.
Published on Feb 14 2004 by A Williams

5.0 out of 5 stars Great Book
Are you ready to be the next Google? It is widely known that Google pulled out in front of (and largely obsoleted) major search engine players like Altavista and Yahoo largely...
Published on Jan 5 2004

5.0 out of 5 stars A fresh idea
Spidering Hacks, like other O'Reilly "Hacks" books, lives up to the tradition. This book shows some of the Internet gurus' tips and tricks.
Published on Dec 29 2003

5.0 out of 5 stars fun to read
Like other O'Reilly Hacks books, this one is easy to read and follow. Inside this book, you can find lots of ways of automating Perl scripts to do things for you...
Published on Dec 9 2003
