Processing XML with Java: A Guide to SAX, DOM, JDOM, JAXP, and TrAX [Paperback]

Elliotte Rusty Harold
5.0 out of 5 stars (13 customer reviews)
List Price: CDN$ 67.99
Price: CDN$ 42.83 & FREE Shipping. Details
You Save: CDN$ 25.16 (37%)

Book Description

Publication Date: Nov. 5 2002 | ISBN-10: 0201771861 | ISBN-13: 978-0201771862 | Edition: 1
A complete guide to writing Java programs that read and write XML documents. Shows developers how to save XML documents, read XML documents, communicate with network servers that send and receive XML data, and integrate XSLT into their programs. Softcover.

Product Description

From the Inside Flap

One night five developers, all of whom wore very thick glasses and had recently been hired by Elephants, Inc., the world's largest purveyor of elephants and elephant supplies, were familiarizing themselves with the company's order processing system when they stumbled into a directory full of XML documents on the main server. "What's this?" the team leader asked excitedly. None of them had ever heard of XML before, so they decided to split up the files among them and try to figure out just what this strange and wondrous new technology was. The first developer, who specialized in optimizing Oracle databases, printed out a stack of FMPXMLRESULT documents generated by the FileMaker Pro database where all the orders were stored, and began poring over them. "So this is XML! Why, it's nothing novel. As anyone can see who's able, an XML document is nothing but a table!"

"What do you mean, a table?" replied the second developer, well versed in object-oriented theory and occupied with a collection of XMI documents that encoded UML diagrams for the system. "Even a Visual Basic programmer could see that XML documents aren't tables. Duplicates aren't allowed in a table relation, unless this is truly some strange mutation. Classes and objects are what these documents are. Indeed, it should be obvious on the very first pass. An XML document is an object, and a DTD is a class."

"Objects? A strange kind of object, indeed!" said the third developer, a web designer of some renown, who had loaded the XHTML user documentation for the order processing system into Mozilla. "I don't see any types at all. If you think this is an object, then it's your software I refuse to install. But with all those stylesheets there, it should be clear to anyone not sedated that XML is just HTML updated!"

"HTML? You must be joking," said the fourth, a computer science professor on sabbatical from MIT, who was engrossed in an XSLT stylesheet that validated all of the other documents against a Schematron schema. "Look at the clean nesting of hierarchical structures, each tag matching its partner as it should. I've never seen HTML that looks this good. What we have here is S-expressions, which is certainly nothing new. Babbage invented this back in 1882!"

"S-expressions?" queried the technical writer, who was occupied with documentation for the project, written in DocBook. "Maybe that means something to those in your learned profession. But to me, this looks just like a FrameMaker MIF file. However, locating the GUI does seem to be taking me a while."

And so they argued into the night, none of them willing to give an inch, all of them presenting still more examples to prove their points, but none bothering to look at the others' examples. Indeed, they're probably still arguing today. You can even hear their shouts from time to time on xml-dev. Their mistake, of course, was in trying to force XML into the patterns of technologies they were already familiar with rather than taking it on its own terms. XML can store data, but it is not a database. XML can serialize objects, but an XML document is not an object. Web pages can be written in XML, but XML is not HTML. Functional (and other) programming languages can be written in XML, but XML is not a programming language. Books are written in XML, but that doesn't make XML desktop publishing software.

XML is something truly new that has not been seen before in the world of computing. There have been precursors to it, and there are always fanatics who insist on seeing XML through database (or object, or functional, or S-expression) colored glasses. But XML is none of these things. It is something genuinely unique and new in the world of computing; and it's possible to understand it only when you're willing to accept it on its own terms, rather than forcing it into yesterday's pigeonholes.

There are a lot of tools, APIs, and applications in the world that pretend XML is something more familiar to developers--that it's just a funny kind of database, or just like an object, or just like remote procedure calls. These APIs are occasionally useful in very restricted and predictable environments; however, they are not suitable for processing XML in its most general format. They work well in their limited domains, but they fail when presented with XML that steps outside the artificial boundaries they've defined. XML was designed to be extensible, but sadly many of the tools designed for XML aren't nearly as extensible as XML itself.

This book is going to show you how to handle XML in its full generality. It pulls no punches. It does not pretend that XML is anything except XML, and it shows you how to design your programs so that they handle real XML in all its messiness: valid and invalid, mixed and unmixed, typed and untyped, and both all and none of these at the same time. To that end, this book focuses on APIs that don't try to hide the XML. In particular, there are three major Java APIs that correctly model XML, as opposed to modeling a particular class of XML documents or some narrow subset of XML. These are

  • SAX, the Simple API for XML
  • DOM, the Document Object Model
  • JDOM, a Java native API

These APIs are the core of this book. In addition, I cover a number of preliminaries and supplements to the basic APIs, including

  • XML syntax
  • DTDs, schemas, and validity
  • XPath
  • XSLT and the TrAX API
  • JAXP, a combination of SAX, DOM, and TrAX with a few factory classes

And, since we're going to need a few examples of XML applications to demonstrate the APIs, I also cover XML-RPC, SOAP, and RSS in some detail. However, the techniques this book teaches are hardly limited to those three applications.
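To give a flavor of the first API listed above, here is a minimal sketch of event-based parsing with SAX, using the JAXP factory classes to obtain a parser. It is not an example from the book: the class name SaxSketch, the method elementNames, and the sample order document are all invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, for illustration only.
class SaxSketch {

  // Returns the element names in the order the parser reports them,
  // separated by single spaces.
  static String elementNames(String xml) {
    final StringBuilder names = new StringBuilder();
    // The parser calls startElement() once for each start-tag it reads.
    DefaultHandler handler = new DefaultHandler() {
      @Override
      public void startElement(String uri, String localName,
       String qName, Attributes atts) {
        if (names.length() > 0) names.append(' ');
        names.append(qName);
      }
    };
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.newSAXParser().parse(
       new InputSource(new StringReader(xml)), handler);
    } catch (ParserConfigurationException | SAXException
     | IOException e) {
      throw new IllegalStateException("could not parse document", e);
    }
    return names.toString();
  }

  public static void main(String[] args) {
    // Made-up sample document.
    String xml = "<order><item>peanuts</item><item>hay</item></order>";
    System.out.println(elementNames(xml)); // prints "order item item"
  }
}
```

The handler never sees the document as a whole. It only receives callbacks as the parser moves through the input, which is what "event-based" means in practice.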

Who You Are

This book is written for experienced Java developers who want to integrate XML into their systems. Java is the ideal language for processing XML documents. Its strong Unicode support in particular made it the preferred language for many early implementers. Consequently, more XML tools have been written in Java than in any other language. More open source XML tools are written in Java than in any other language. More developers process XML in Java than in any other language.

Processing XML with Java™ will teach you how to

  • Save XML documents from applications written in Java
  • Read XML documents produced by other programs
  • Search, query, and update XML documents
  • Convert legacy flat data into hierarchical XML
  • Communicate with network servers that send and receive XML data
  • Validate documents against DTDs, schemas, and business rules
  • Combine functional XSLT transforms with traditional imperative Java code

This book is intended for Java developers who need to do anything with XML. It teaches the fundamentals and advanced topics, leaving nothing out. It is a comprehensive course in processing XML with Java that takes developers from having little knowledge of XML to designing sophisticated XML applications and parsing complicated documents. The examples cover a wide range of possible uses, including file formats, data exchange, document transformation, database integration, and more.

What You Need to Know

This is not an introductory book with respect to either Java or XML. I assume you have substantial prior experience with Java and preferably some experience with XML. On the Java side, I freely use advanced features of the language and its class library without explanation or apology. Among other things, I assume you are thoroughly familiar with the following:

  • Object-oriented programming, including inheritance and polymorphism.
  • Packages and the CLASSPATH. You should not be surprised by classes that do not have main() methods or that are not in the default package.
  • I/O including streams, readers, and writers. You should understand that System.out is a horrible example of what really goes on in Java programs.
  • The Java Collections API including hash tables, maps, sets, iterators, and lists.

In addition, in one or two places in this book I use some SQL and JDBC. These sections are relatively independent of the rest of the book, however, and chances are if you aren't already familiar with SQL, then you don't need the material in these sections anyway.

What You Need to Have

XML is deliberately architecture, platform, operating system, GUI, and language agnostic (in fact, more so than Java). It works equally well on Mac OS, Windows, Linux, OS/2, various flavors of Unix, and more. It can be processed with Python, C++, Haskell, ECMAScript, C#, Perl, Visual Basic, Ruby, and of course Java. No byte-order issues need concern you if you switch between PowerPC, X86, or other architectures. Almost everything in this book should work equally well on any platform that's capable of running Java.

Most of the material in this book is relatively independent of the specific Java version. Java 1.4 bundles SAX, DOM, and a few other useful classes into the core JDK. However, these are easily installed in earlier JVMs as open source libraries from the Apache XML Project and other vendors. For the most part, I used Java 1.3 and 1.4 when testing the examples; therefore, it's possible that a few of the classes and methods used are not available in earlier versions. In most cases, it should be fairly obvious how to backport them. All of the basic XML APIs except TrAX should work in Java 1.1 and later. TrAX requires Java 1.2 or later.
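Because the JAXP factory classes and the DOM interfaces have shipped with the JDK since Java 1.4, a tree-based parse needs nothing beyond the standard library on such a JVM. The following is a minimal sketch rather than an example from the book; the class name DomSketch, the method countElements, and the sample document are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class, for illustration only.
class DomSketch {

  // Parses the whole document into a tree, then counts the elements
  // with the given tag name.
  static int countElements(String xml, String name) {
    try {
      DocumentBuilderFactory factory
       = DocumentBuilderFactory.newInstance();
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc
       = builder.parse(new InputSource(new StringReader(xml)));
      NodeList matches = doc.getElementsByTagName(name);
      return matches.getLength();
    } catch (ParserConfigurationException | SAXException
     | IOException e) {
      throw new IllegalStateException("could not parse document", e);
    }
  }

  public static void main(String[] args) {
    // Made-up sample document.
    String xml = "<order><item>peanuts</item><item>hay</item></order>";
    System.out.println(countElements(xml, "item")); // prints 2
  }
}
```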

How to Use This Book

This book is organized as an advanced tutorial that can also serve as a solid and comprehensive reference. Chapter 1 covers the bare minimum material needed to start working with XML, although for the most part this is not intended as a comprehensive introduction, but more as a review for readers who already have read other, more basic books. Chapter 2 introduces RSS, XML-RPC, and SOAP, the XML applications used for examples throughout the rest of the book. This is followed by two chapters on generating XML from your own programs (a subject all too often presented as a lot more complicated than it actually is). Chapter 3 covers generating XML directly from code, and Chapter 4 covers converting legacy data in other formats to XML. The remaining bulk of the book is devoted to the major APIs for processing XML:

  • The event-based SAX API
  • The tree-based DOM API
  • The tree-based JDOM API
  • XPath APIs for searching XML documents
  • The TrAX API for XSLT processing

Finally, the book finishes with an appendix providing quick references to the main APIs.

If you have limited experience with XML, I suggest that you read at least the first five chapters in order. From that point forward, if you have a particular API preference, you may begin with the part that covers the major API you're interested in:

  • Chapters 6 to 8 cover SAX.
  • Chapters 9 to 13 cover DOM.
  • Chapters 14 and 15 cover JDOM.

Once you're comfortable with one or more of these APIs, you can read Chapters 16 and 17 on XPath and XSLT. However, those APIs and chapters do require some knowledge of at least one of the three major APIs.
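As a rough illustration of how XSLT and Java code fit together through TrAX, the sketch below applies a tiny stylesheet through the javax.xml.transform package. It is not taken from the book: the class name TraxSketch, the method itemsAsText, the stylesheet, and the sample document are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class, for illustration only.
class TraxSketch {

  // A made-up stylesheet that writes the text of every item element,
  // each followed by a semicolon.
  static final String STYLESHEET
   = "<xsl:stylesheet version='1.0'"
   + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
   + "<xsl:output method='text'/>"
   + "<xsl:template match='/'>"
   + "<xsl:for-each select='//item'>"
   + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
   + "</xsl:for-each>"
   + "</xsl:template>"
   + "</xsl:stylesheet>";

  // Runs the stylesheet over the input and returns the result text.
  static String itemsAsText(String xml) {
    try {
      TransformerFactory factory = TransformerFactory.newInstance();
      Transformer transformer = factory.newTransformer(
       new StreamSource(new StringReader(STYLESHEET)));
      StringWriter out = new StringWriter();
      transformer.transform(
       new StreamSource(new StringReader(xml)),
       new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException("transform failed", e);
    }
  }

  public static void main(String[] args) {
    String xml = "<order><item>peanuts</item><item>hay</item></order>";
    System.out.println(itemsAsText(xml)); // prints "peanuts;hay;"
  }
}
```

The Java side stays imperative (build a factory, load a stylesheet, run it), while the selection logic lives in the stylesheet.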

The Online Edition

The entire book is available online in plain-vanilla HTML at my Cafe con Leche web site. Every word of this book is there. Nothing has been held back or left out. I do hope you also find the printed book useful and choose to buy it--it's certainly cheaper than the paper and toner you'd use up printing out all 1,120 pages from your laser printer--but you are by no means obligated to do so. My goal is to make this material as broadly available and useful as possible.

The online version has no protection other than copyright law and your own good will. You don't need to register to read it, or to download some special electronic key that becomes invalid when you buy a new laptop (and that probably wouldn't run on Linux or a Mac in the first place). I want people to read and use this book. I do not want to put up silly roadblocks that make it less useful than it could be. I do ask, as a courtesy, that you do not republish the online edition on your own server. Doing so makes it extremely difficult for me to keep the book up to date. If you want to save a few pages on your laptop so you can read this book on an airplane, I don't really mind. But please don't pass out your own copies to anyone else. Instead, refer your friends and colleagues to the web site or the printed book.

Some Grammatical Notes

The rules of English grammar were laid down, written in stone, and encoded in the DNA of elementary school teachers long before computers were invented. Unfortunately, this means that sometimes I have to decide between syntactically correct code and syntactically correct English. When forced to do so, English normally loses. This means that sometimes a punctuation mark appears outside a quotation mark when you'd normally expect it to appear inside, a sentence begins with a lowercase letter, or something similarly unsettling occurs. For the most part, I've tried to use various typefaces to make the offending phrase less jarring. In particular, please note the following:

  • Italicized text is used for emphasis, the first occurrence of an important term, titles of books and other cited works, words in languages other than English, words as words themselves (for example, Booboisie is a very funny word), Java system properties, host names, and resolvable URLs.
  • Monospaced text is used for XML and Java source code, namespace URLs, system prompts, and program output.
  • Italicized monospace text is used for pieces of XML and Java source code that should be replaced by some other text.
  • Bold monospaced text is used for literal text that the user types at a command line, as well as for emphasis in code.

It's not just English grammar that gets a little squeezed, either. The necessities of fitting code onto a printed page rather than a computer screen have occasionally caused me to deviate from the ideal Java coding conventions. The worst problem is line length. I can fit only 65 characters across the page in a line of code. To try to make maximum use of this space, I indent each block by two spaces and indent line continuations by one space, rather than the customary four spaces and two spaces respectively. Even so, I still have to break lines where I otherwise would prefer not to. For example, I originally wrote this line of code for Chapter 4:

result.append(" " + amount + "\r\n");

To fit it on the page, however, I had to split it into two pieces, like this:

result.append(" ");
result.append(amount + "\r\n");

This wasn't too bad, but sometimes even this wasn't enough and I had to remove indents from the front of the line that would otherwise be present. This occasionally forced the indentation not to line up as prettily as it otherwise might, as in this example from Chapter 3:

wout.

For example, in Chapter 4, I found I needed to remove a few characters from this line:

OutputStreamWriter wout = new OutputStreamWriter(out, "UTF8");

On reflection I realized that nowhere did the program actually need to know that wout was an OutputStreamWriter as opposed to merely a Writer. Thus I could easily rewrite the offending line as follows:

Writer wout = new OutputStreamWriter(out, "UTF8");

This follows the general object-oriented principle of using the least-specific type that will suit. This polymorphism makes the code more flexible in the future should I find a need to swap in a different kind of Writer.
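The benefit of declaring the variable as a Writer can be sketched as follows. This is an illustrative example under stated assumptions, not code from the book: the class WriterSketch and its methods are hypothetical, and the XML declaration written here is just sample output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical class, for illustration only.
class WriterSketch {

  // Depends only on Writer, so any subclass can be passed in.
  static void writeDeclaration(Writer wout) throws IOException {
    wout.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n");
    wout.flush();
  }

  // Calls the same method with a byte-backed Writer and an
  // in-memory Writer, then compares what each one received.
  static boolean sameOutputFromBothWriters() {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      Writer wout = new OutputStreamWriter(out, "UTF8");
      writeDeclaration(wout);

      Writer inMemory = new StringWriter();
      writeDeclaration(inMemory);

      return out.toString("UTF-8").equals(inMemory.toString());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(sameOutputFromBothWriters()); // prints true
  }
}
```

Because writeDeclaration() asks only for a Writer, swapping the OutputStreamWriter for a StringWriter (handy in tests) requires no change to the method itself.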

Contacting the Author

I always enjoy hearing from readers, whether with general comments, specific ways I could improve the book, or questions related to the book's subject matter. Because this book is being published in its entirety online, it is possible for me to reprint at least the online edition much faster than can be done with a traditional paper book. Thus corrections and errata are especially helpful because I have a real chance to fix them. Before sending in a correction, please do check the online edition to see if I have already fixed the problem.

Please send all comments, inquiries, bouquets, and brickbats to elharo@metalab.unc.edu. I get a lot of e-mail, so I can't promise to answer them all; but I do try. It's helpful if you use a subject line that clearly identifies yourself as a reader of this book. Otherwise, your message may accidentally get misidentified as spam I don't want or bulk mail I don't have time to read and be dropped in the bit bucket before I see it. Also, please make absolutely sure that your message uses the correct reply-to address and that the address will be valid for at least several months after you send the message. There's nothing quite as annoying as taking an hour or more to compose a detailed response to an interesting question, only to have it bounce because the reader sent the email from a public terminal or changed their ISP. But please do write to me. I want to hear from you.

Elliotte Rusty Harold
Brooklyn, New York
June 7, 2002




From the Back Cover

Praise for Elliotte Rusty Harold’s Processing XML with Java

“The sophistication and language are very appropriate for Java and XML application developers. You can tell by the way the author writes that he too is a developer. He delves very deeply into the topics and has really taken things apart and investigated how they work. I especially like his coverage of ‘gotchas,’ pitfalls, and limitations of the technologies.”

        —John Wegis, Web Engineer, Sun Microsystems, Inc.

“Elliotte has written an excellent book on XML that covers a lot of ground and introduces current and emerging technologies. He helps the novice programmer understand the concepts and principles of XML and related technologies, while covering the material at a level that’s deep enough for the advanced developer. With a broad coverage of XML technologies, lots of little hints, and information I haven’t seen in any other book on the topic, this work has become a valuable addition to my technical library.”

        —Robert W. Husted, Member, Technical Staff, Requisite Technology, Inc.

“The code examples are well structured and easy to follow. They provide real value for someone writing industrial-strength Java and XML applications. The time saved will repay the cost of this book a hundred times over.

“The book also contains more of the pearls of wisdom we’ve come to expect from Elliotte Rusty Harold—the kind of pointers that will save developers weeks, if not months, of time.”

        —Ron Weber, Independent Software Consultant

Written for Java programmers who want to integrate XML into their systems, this practical, comprehensive guide and reference shows how to process XML documents with the Java programming language. It leads experienced Java developers beyond the basics of XML, allowing them to design sophisticated XML applications and parse complicated documents.

Processing XML with Java™ provides a brief review of XML fundamentals, including XML syntax; DTDs, schemas, and validity; stylesheets; and the XML protocols XML-RPC, SOAP, and RSS. The core of the book comprises in-depth discussions on the key XML APIs Java programmers must use to create and manipulate XML files with Java. These include the Simple API for XML (SAX), the Document Object Model (DOM), and JDOM (a Java native API). In addition, the book covers many useful supplements to these core APIs, including XPath, XSLT, TrAX, and JAXP.

Practical in focus, Processing XML with Java™ is filled with over two hundred examples that demonstrate how to accomplish various important tasks related to file formats, data exchange, document transformation, and database integration. You will learn how to read and write XML documents with Java code, convert legacy flat files into XML documents, communicate with network servers that send and receive XML data, and much more. Readers will find detailed coverage of the following:

  • How to choose the right API for the job
  • Reading documents with SAX
  • SAX filters
  • Validation in several schema languages
  • DOM implementations for Java
  • The DOM Traversal Module
  • Output from DOM
  • Reading and writing XML documents with JDOM
  • Searching XML documents with XPath
  • Combining XSLT transforms with Java code
  • TrAX, the Transformations API for XML
  • JAXP, the Java API for XML Processing

In addition, the book includes a convenient quick reference that summarizes the major elements of all the XML APIs discussed. A related Web site, located at http://www.cafeconleche.org/books/xmljava/, contains the entire book in electronic format, as well as updates and links referenced in the book.

With thorough coverage of the key XML APIs and a practical, task-oriented approach, Processing XML with Java™ is a valuable resource for all Java programmers who need to work with XML.





Customer Reviews

5.0 out of 5 stars
4 star: 0 | 3 star: 0 | 2 star: 0 | 1 star: 0
Most helpful customer reviews
5.0 out of 5 stars Excellent Book Nov. 7 2007
Format:Paperback
I have been going through many books and forums using Google, with marginal results. I finally stumbled on "Processing XML with Java".

If I had this book from the beginning I would have saved myself many many hours of frustration. Clear, concise and best of all, nice examples that work!

To boot, you get an online version that is searchable with Google and always updated.
5.0 out of 5 stars Excellent!!! June 26 2004
Format:Paperback
If only every technical book was written this well! Anyone who is working with Java and XML should have a copy of this book. Highly example driven with clear explanations, the author makes using XML in your Java programs a breeze. Even better, the author has a style that makes the book fun to read as you feel like you are learning all sorts of secrets from an XML insider.
The book starts with a quick introduction to XML and then gets into how to create XML documents in your programs. The first four chapters cover everything you need to know about creating XML whether it is for XML-RPC, SOAP, or simply to store in a file. The next section covers parsing XML documents. SAX and DOM are compared and then the next eight chapters discuss these two methods of parsing documents, explaining how to use them, comparing them, and helping you determine how to decide which technique to use for which situation. The section on DOM explains not just how to parse documents using DOM but also how to create new documents. The final chapters of the book cover JDOM, XPATH, and XSLT.
Did I mention that this book is full of examples? The author doesn't rely on simply explaining how something works or how to use a technology (even though his explanations are excellent), he has examples to demonstrate everything he discusses. Each example builds upon the previous example and makes learning the techniques easy and enjoyable.
5.0 out of 5 stars Elliotte Rusty Harold is my favorite author March 19 2004
By DK
Format:Paperback
Lucidity and clear explanation of the fundamentals are E.R. Harold's hallmarks. Mr. Harold has authored several Java and XML books, and all of them are a pleasure to read.
5.0 out of 5 stars An excellent choice Aug. 15 2003
Format:Paperback
I really like reading this book. It is easy to read and understand. The author does a good job of describing the XML technologies related to JAVA. This book has a lot of code to analyze. This book is a must have for the experienced developer who wants to do JAVA with XML. I have a message for the experienced developer: THE CODE WILL CHALLENGE YOU; IT CHALLENGED ME!!!
Michael
5.0 out of 5 stars A huge amount of topics and API Aug. 14 2003
Format:Paperback
This is definitely a valuable resource for anybody dealing with XML and Java, written by one of the best tech writers in town. The author covers in detail a huge number of topics and APIs, so many that you couldn't ask for more.
Be advised that some basic understanding of XML and intermediate Java skills are required to get the best out of this book.
5.0 out of 5 stars An excellent choice Aug. 6 2003
Format:Paperback
I bought this book when it first came out. I really enjoyed reading it. The book is well written. It has a lot of useful code.
The author provides code that can be used in the real world of JAVA and XML. I liked the book's section on JDOM. This book shows the differences between DOM and JDOM. Also, this book has a lot of information on SAX, DOM, JDOM, and it shows the differences when using each. I would recommend this book to anyone who wants to learn JAVA and XML. Make sure you are an experienced developer before purchasing this book.
Michael
5.0 out of 5 stars Excellent Value April 17 2003
Format:Paperback
This book is an excellent resource for combining these two technologies, XML and Java. The author starts with the assumption that the reader is conversant in XML and has at least an intermediate skill level with Java. The first chapter of the book serves as an XML refresher. The author uses this chapter to reach a common understanding of terms with the reader. The first part of the book covers many of the issues of managing XML from Java and introduces two XML-based services, XML-RPC and SOAP.
The remainder of the book is devoted to the various APIs for parsing XML, hence the subtitle "A Guide to SAX, DOM, JDOM, JAXP, and TrAX". Throughout the book the author creates clear code examples and very readable text. This serves to develop understanding and insight in the reader. This particular technical topography is under continuous change. Adapting to these changes will be much easier after having read this book.
A lot of tips and "gotchas" are shared in the book, but it is arranged so that the developer can grab what he needs or sit and camp awhile. The book text is available at the author's website, but I prefer to read the paper copy. If you are going to use XML and Java together, this book would be a good investment.
Most recent customer reviews
5.0 out of 5 stars XML as high art - THE classic guide on modern XML
I bought this book with high expectations. I have read Elliotte Rusty Harold's XML in a Nutshell book from O'Reilly twice. He is an exceptional technology writer.
Published on Feb. 5 2003
5.0 out of 5 stars Readability without compromise
I preordered the book and have enjoyed reading it. I did not expect to just read end to end, but its style and humor have kept me going.
Published on Dec 17 2002 by "cnew202"
5.0 out of 5 stars Recommended especially for newbies & beginners
Today the XML landscape has become quite large. I can't even count the XML-related specs and protocols. Every day a new X.. is popping up.
Published on Dec 11 2002 by O
5.0 out of 5 stars Excellent piece
This is a great book. One couldn't ask for more when it comes to XML and Java processing. I was following the pre-releases and was even more satisfied when I saw the 'fat' final...
Published on Dec 1 2002 by Ivan S. Georgiev
5.0 out of 5 stars Very readable, complete, and up-to-date
I found everything in this book that is required as a Java XML developer. Very well written and contains a good number of examples.
Published on Nov. 29 2002 by Darshan Singh
5.0 out of 5 stars Attractively lucid and comprehensive
It used to be that to get a job as a Java programmer, all you typically needed was knowledge of Java itself plus some general background in computer science.
Published on Nov. 22 2002 by W Boudville