The fascination of this topic is that it makes you see the Web in a different way, not as a set of pages for users to browse, but as a huge database for your programs to explore. The most robust technique for querying Web sites programmatically is through XML Web Services, but this approach is in its infancy. LWP (the libwww-perl library) takes a different route, called screen-scraping. In essence, your Perl code pretends to be a browser and grabs HTML for processing. Using LWP you could write a command-line program to search your favourite auction site, fetch news headlines, or check multiple retail sites for the best prices. As the author acknowledges, the problem with screen-scraping is its brittleness: if the target Web site adopts a new look, your code breaks. There are also interesting fair-usage issues. Even so, it's a powerful technique with many possible applications. This clear and concise guide comes complete with typically terse Perl code examples. Topics include LWP basics, posting form data, processing results with regular expressions, using trees to process HTML, imitating different browser types, and supporting cookies programmatically. An appendix offers handy information such as HTTP status codes, character tables, and MIME types. LWP is large, and although this title does not attempt to cover all the modules, it does provide all you need to start coding your own Web-mining programs. --Tim Anderson
If you are unfamiliar with LWP and web scraping, or HTML parsing using tokens and trees, I strongly recommend this book. Published on March 15 2003 by Matthew D. Huwiler
As a web programmer, I had worked on several projects involving web automation and simple crawlers even before I read "Perl & LWP". Published on Aug. 7 2002 by "sherzodr"
I was definitely interested when I first heard that O'Reilly were publishing a book on LWP. LWP is a definitive collection of Perl modules covering everything you could think of... Published on July 16 2002 by Gavin