Refactoring Databases: Evolutionary Database Design Hardcover – Mar 3 2006
From the Back Cover
Refactoring has proven its value in a wide range of development projects–helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems.
Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design–without changing semantics. You’ll learn how to evolve database schemas in step with source code–and become far more effective in projects relying on iterative, agile methodologies.
This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone database applications as well as sophisticated multi-application scenarios. You’ll master every task involved in refactoring database schemas, and discover best practices for deploying refactorings in even the most complex production environments.
The second half of this book systematically covers five major categories of database refactorings. You’ll learn how to use refactoring to enhance database structure, data quality, and referential integrity; and how to refactor both architectures and methods. This book provides an extensive set of examples built with Oracle and Java and easily adaptable for other languages, such as C#, C++, or VB.NET, and other databases, such as DB2, SQL Server, MySQL, and Sybase.
Using this book’s techniques and examples, you can reduce waste, rework, risk, and cost–and build database systems capable of evolving smoothly, far into the future.
About the Author
Scott W. Ambler is a software process improvement (SPI) consultant living just north of Toronto. He is founder and practice leader of the Agile Modeling (AM) (www.agilemodeling.com), Agile Data (AD) (www.agiledata.org), Enterprise Unified Process (EUP) (www.enterpriseunifiedprocess.com), and Agile Unified Process (AUP) (www.ambysoft.com/unifiedprocess) methodologies. Scott is the (co-)author of several books, including Agile Modeling (John Wiley & Sons, 2002), Agile Database Techniques (John Wiley & Sons, 2003), The Object Primer, Third Edition (Cambridge University Press, 2004), The Enterprise Unified Process (Prentice Hall, 2005), and The Elements of UML 2.0 Style (Cambridge University Press, 2005). Scott is a contributing editor with Software Development magazine (www.sdmagazine.com) and has spoken and keynoted at a wide variety of international conferences, including Software Development, UML World, Object Expo, Java Expo, and Application Development. Scott graduated from the University of Toronto with a Master of Information Science. In his spare time Scott studies the Goju Ryu and Kobudo styles of karate.
Pramod J. Sadalage is a consultant for ThoughtWorks, an enterprise application development and integration company. He first pioneered the practices and processes of evolutionary database design and database refactoring in 1999 while working on a large J2EE application using the Extreme Programming (XP) methodology. Since then, Pramod has applied the practices and processes to many projects. Pramod writes and speaks about database administration on evolutionary projects, the adoption of evolutionary processes with regard to databases, and evolutionary practices’ impact upon database administration, in order to make it easy for everyone to use evolutionary design in regards to databases. When he is not working, you can find him spending time with his wife and daughter and trying to improve his running.
Top Customer Reviews
I should say I disagree with the reviewer's opinion on Scott Ambler's work. I have read Ambler's articles since the late 1990s (Ronin Intl) and have bought most of the books written or co-authored by Scott Ambler. I do believe most of the concepts/ideas implemented in ORM ([...]) frameworks such as Hibernate/NHibernate, JDO, JPA are there thanks to articles/whitepapers like the ones Ambler has published. I might be wrong but the Object-Relational Impedance Mismatch was first coined by Ambler.
My short advice: read Codd, then read Scott Ambler, but don't get stuck in the "SPROCs and cursors" world; move on.
As far as "money grubbing" goes, you don't make a lot of money writing books. Sorry to burst anyone's bubble on that issue. ;-)
He is just trying to jump on the refactoring bandwagon and find a profitable niche with the NDBA (near-DBA) community: OO programmers who do not really understand data concepts. Refactoring is not supposed to be a politically acceptable buzzword for perpetual recoding in the hope that you will eventually approximate the undefined business requirements you did not take the time to collect in the first place.
Most Helpful Customer Reviews on Amazon.com (beta)
This brings me to the second purpose of this book. Many DBAs view their jobs as protectors of the data. While that is admirable, they sometimes forget that they are part of a software development team whose job is to provide value to the organization through the development of new (and enhancement of existing) applications. One of the best DBAs I ever worked with viewed himself as a "Data Valet." He said his job was to make sure the data was presented to applications when and where they wanted and to protect the doors from getting dinged while under his care. Through its first five chapters and then the refactorings that follow, this book will help DBAs expand their view of their role in the organization from one of simply protecting data to one of enhancing the value of data to the organization.
This book is one that you'll keep on your reference shelf for many years to come. Highly recommended.
Several of the structural refactorings are just simple database schema changes: rename/drop column/table/view. Adding is not really a refactoring, so add column/table/view are cataloged as 'transformations' (changes that do affect the application), a distinction that strikes me as a little clumsy. Some structural refactorings are more interesting: merge/split columns/tables, move column, introduce/remove surrogate key, introduce calculated column, introduce associative table.
The data quality refactorings include introduce/drop default values, null or check constraints, standardize codes, formats and data types, use consistent keys and lookup tables. Most of these are common best practices, and seeing them cataloged as refactorings didn't yield any new insights for me. Only replacing type codes with flags was of special interest.
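For readers who have not met the idea, replacing a type code with flags might look roughly like the sketch below. It uses Python's built-in sqlite3 module rather than the book's Oracle and Java, and the account table, its 'C'/'S' codes and the flag column names are all hypothetical, not taken from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical starting point: one column holding a single-letter type code.
    CREATE TABLE account (
        id        INTEGER PRIMARY KEY,
        type_code TEXT NOT NULL          -- assumed codes: 'C' checking, 'S' savings
    );
    INSERT INTO account (id, type_code) VALUES (1, 'C'), (2, 'S'), (3, 'C');

    -- Step 1: add one flag column per code value.
    ALTER TABLE account ADD COLUMN is_checking INTEGER NOT NULL DEFAULT 0;
    ALTER TABLE account ADD COLUMN is_savings  INTEGER NOT NULL DEFAULT 0;

    -- Step 2: backfill the flags from the existing codes
    -- (a comparison evaluates to 0 or 1 in SQLite).
    UPDATE account
       SET is_checking = (type_code = 'C'),
           is_savings  = (type_code = 'S');
""")

# The old type_code column is left in place here; removing it would be a
# separate, later step once no application reads it.
flags = conn.execute(
    "SELECT id, is_checking, is_savings FROM account ORDER BY id"
).fetchall()
print(flags)
```

The old column is kept deliberately in this sketch, so existing queries against type_code keep working while new code switches to the flags.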
Referential integrity refactorings include the obvious add/drop foreign keys with optional cascading delete, but also using triggers to create a change history and hard vs. soft deletes. Using before and after delete triggers to implement soft deletes is probably the best example in the book.
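To give a feel for a soft delete, here is a minimal sketch in Python with the built-in sqlite3 module. It is not the book's Oracle code: instead of before and after delete triggers on the table, it swaps in a view with an INSTEAD OF DELETE trigger, which is the simplest way I know to get the same effect in SQLite. The customer_data table, the customer view and the is_deleted column are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table; is_deleted marks rows that were "deleted".
    CREATE TABLE customer_data (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        is_deleted INTEGER NOT NULL DEFAULT 0
    );

    -- Applications read and delete through this view, which hides
    -- soft-deleted rows.
    CREATE VIEW customer AS
        SELECT id, name FROM customer_data WHERE is_deleted = 0;

    -- A DELETE against the view flags the row instead of removing it.
    CREATE TRIGGER customer_soft_delete
    INSTEAD OF DELETE ON customer
    BEGIN
        UPDATE customer_data SET is_deleted = 1 WHERE id = OLD.id;
    END;
""")
conn.executemany(
    "INSERT INTO customer_data (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)

# The application issues an ordinary DELETE.
conn.execute("DELETE FROM customer WHERE id = 1")

visible = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
stored = conn.execute(
    "SELECT id, is_deleted FROM customer_data ORDER BY id"
).fetchall()
print(visible)  # rows the application still sees
print(stored)   # rows physically kept, with their flags
```

The point of the design is that the application's DELETE statement does not change; only what the database does with it changes.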
Architectural refactorings include using CRUD methods (i.e. stored procedures or functions to select/insert/update/delete records), query functions that return cursor refs, interchanging methods and views, implementing methods in stored procedures, using materialized views and using local mirror tables vs. remote 'official' data sources. All these are common design techniques and the discussion of motivation and tradeoffs is particularly relevant.
The final section on method refactorings is more abbreviated and covers typical code refactorings. These qualify for inclusion only because databases include stored procedures, but they have nothing to do with schema evolution.
An important aspect of this book is that the catalog of refactorings is presented in the context of the evolutionary database development described in the first five chapters. This approach emphasises iterative development, automated regression testing, configuration control of schema objects and easy availability of personalized application database environments for developers. Refactorings and transformations are intended to be applied one by one, with an automated regression test suite used to maintain confidence that a change does not introduce an application defect. Change control and a change tracking mechanism are essential to manage the application of schema changes to integration, QA and production environments.
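The "one refactoring, then rerun the tests" loop can be sketched as follows, again in Python with sqlite3 rather than the book's tooling. The cust table, the legacy_query function and the choice to leave a view under the old table name are assumptions made for illustration only.

```python
import sqlite3


def legacy_query(conn):
    """Stand-in for an existing application query that must keep working."""
    return conn.execute("SELECT id, name FROM cust ORDER BY id").fetchall()


def rename_table_refactoring(conn):
    """Rename cust to customer, leaving a view under the old name."""
    conn.executescript("""
        ALTER TABLE cust RENAME TO customer;
        CREATE VIEW cust AS SELECT id, name FROM customer;
    """)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO cust (id, name) VALUES (1, 'Ada'), (2, 'Grace');
""")

before = legacy_query(conn)        # behaviour before the change
rename_table_refactoring(conn)     # apply exactly one refactoring
after = legacy_query(conn)         # old code path, via the view
new_rows = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()

# Regression check: old and new access paths return the same data.
assert before == after == new_rows
```

In a real project the final assertion would live in the regression suite and run after every schema change, not inline next to the change itself.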
What do I like about this book? The catalog of refactorings is thorough (some might say pedantic) which makes it a good learning tool for new database developers and DBAs, and as a shared reference for communicating on larger projects and in larger organizations. Experienced DBAs working on smaller projects are less likely to find it useful.
What don't I like? Relatively little is provided about the tools required to make regular refactoring practical; the authors simply state that these are being worked on. utPLSQL is not mentioned at all. The discussion on tracking changes is thin (but check out the LiquiBase project on SourceForge). No guidance is provided on how you might use Ant to build and maintain developer database environments. Little is covered on the tough topic of building and maintaining test data sets. A final pet peeve: no discussion of refactoring across multiple schemas shared by an application suite.
In summary this book sketches out some important ideas but much work remains to be done. The catalog takes a number of established techniques and best practices and places them in a new framework which at least provides value to some for now.
The first five chapters describe how to go about database refactoring. Chapter 1 overviews the idea that you can evolve your database schema in small steps, a radical departure for many traditional DBAs. It also overviews the need for supporting techniques such as agile data modeling, database regression testing, and version control of your data models and scripts. I would have liked to see more coverage of these topics, but at least the modeling material is covered in Ambler's Agile Modeling book and there are some great SCM books out there.
Chapters 2 and 3 walk through the process of implementing a database refactoring, first through the simple situation where there is only a handful of applications accessing the database. I guess this sort of thing happens in smaller companies, but most of the time you really have to worry about scores of applications accessing your database, which is a much harder situation. This is actually the focus of Chapter 3 and of the presented solutions in Chapters 6 through 11, which provide reference implementations for all of the database refactorings. This approach reveals the true strength of the book: it reflects actual experience in large organizations, not just the theoretical pie in the sky stuff you see from other authors.
Chapter 4 focuses on deploying database refactorings in production, providing detailed instructions for how to roll refactorings between various sandboxes. It importantly describes how to merge the refactorings of several teams together. If you have 100 applications accessing a shared database, then potentially you need to manage the refactorings coming from 100 different development teams. Of course it would never be that bad, but even merging refactorings from 10 teams would be tough. This might be where the technique falls apart because many companies likely don't have data managers who are skilled enough to do this sort of thing efficiently enough to keep up with agile developers. We need new tools, so hopefully companies like Oracle will build something for us.
Chapter 5 describes a collection of best practices and challenges pertaining to refactoring databases. The authors share their experiences as well as identify potential issues, such as a lack of tooling and agile expertise within the data community, that could slow adoption of this technique. My guess is that the smarter teams within companies will start doing this stuff right away, since for the most part it's pretty easy technically, but that bigger companies will struggle to change as they always do.
Chapters 6+ are reference descriptions for the individual refactorings. Each one is described using a UML data model, a detailed text description and source code. The UML notation is a little strange at first, although once you get used to it you can see how it's a much better notation than Crow's feet. The source code examples are detailed; I guess the authors want to be thorough and provide a complete solution so that there's no question how to implement each refactoring. The application examples are written in Java or Hibernate, but they're simple enough that you could see how to implement them in C#, C++, Ruby, or even VB. The database code is Oracle, but once again it's pretty straightforward, so you can easily see how it would work in other DBs like Sybase or MySQL.
All in all, if you're a DBA or agile programmer you need to seriously think about buying this book.
The more interesting part of the book talks about how to manage and evolve a database in general (e.g. keep a table that tracks all changes that have been applied). But this part doesn't go quite as far as I'd hoped it would, e.g. there is no discussion of how to track down who is using what parts of the database prior to refactoring (proxy driver? access stats?), and the discussion is limited to relational databases (which may not even be the best choice for rapidly evolving data models).
btw there is an interesting open source tool called LiquiBase (apparently inspired by this book) that attempts to help manage (and deploy) database "refactorings" as described in this book.
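The change-tracking table mentioned above can be illustrated with a small sketch. This is neither the book's code nor how LiquiBase works internally; it is a minimal Python/sqlite3 version of the idea, and the schema_change_log table, the numbered CHANGES list and the apply_pending function are all hypothetical.

```python
import sqlite3

# Hypothetical, illustrative changes, each keyed by a sequence number.
CHANGES = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)"),
    (2, "ALTER TABLE customer ADD COLUMN first_name TEXT"),
    (3, "UPDATE customer SET first_name = fname"),
]


def apply_pending(conn, changes):
    """Apply changes not yet recorded in the log; return the ids applied."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS schema_change_log (
            change_id  INTEGER PRIMARY KEY,
            applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
    """)
    applied = {row[0] for row in
               conn.execute("SELECT change_id FROM schema_change_log")}
    newly_applied = []
    for change_id, sql in sorted(changes):
        if change_id in applied:
            continue  # already applied to this database
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_change_log (change_id) VALUES (?)",
            (change_id,),
        )
        newly_applied.append(change_id)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = apply_pending(conn, CHANGES)   # fresh database: all changes run
second_run = apply_pending(conn, CHANGES)  # rerun: nothing left to apply
print(first_run, second_run)
```

Because the log lives inside the database it describes, the same script can be pointed at a developer sandbox, QA or production and will only run whatever that particular database is missing.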
Look for similar items by category
- Books > Computers & Technology > Databases > Database Design
- Books > Computers & Technology > Programming > Software Design, Testing & Engineering > Object-Oriented Design
- Books > Computers & Technology > Programming > Software Design, Testing & Engineering > Software Development
- Books > Computers & Technology > Programming > Software Design, Testing & Engineering > Structured Design
- Books > Computers & Technology > Software > Databases
- Books > Qualifying Textbooks - Fall 2007 > Computers & Internet
- Books > Textbooks > Computer Science & Information Systems > Database Storage & Design
- Books > Textbooks > Computer Science & Information Systems > Object-Oriented Software Design
- Books > Textbooks > Computer Science & Information Systems > Software Design & Engineering