Who hates migrations? If you have used migrations to any extent, your hand better be up. A lot of migration-greif comes from this fundamental flaw: Migrations are static, your application is dynamic. Yet migrations depend on the application in order to run. The following issues are examples of this phenomenon:

  1. Failing migrations as a result of a removed column
  2. Failing migrations as a result of methods being called that no longer exist
  3. Extensive use of SQL in your migrations to avoid loading Doctrine objects
  4. Loss of bowel control at the mentioning of “Doctrine Migrations”

These issues can consume development hours, leave your development team in disarray, and cost a fortune in new boxer briefs. The simple solution: Sandboxing

At the Ruby on Rails Meetup last thursday, I saw a revolutionary piece of code that blew my PHP-paradigmed mind:

class MigrateFoo < ActiveRecord::Migration
  # this is significant!
  class Foo < ActiveRecord
    # declare what the migration needs
  end

  def self.up
    # do up logic
  end
end

Let me unpack this. The model being migrated (Foo) is namespaced inside the migration. When Foo is referenced inside the migration, the namespaced model MigrateFoo::Foo is instanciated. You can now declare everything you need in the migration, to separate your migration's dependency on your application, and it will always be there when the migration needs it. We can use this concept in PHP / Symfony / Doctrine as well. Here is an example situation:

  • We have a model called Book
  • We have a model called Author
  • A One-To-Many relationship exists between Book and Author
  • We realize after launching that we need a Many-To-Many relationship between the two, in the case of co-authorship (duh)

Now, in order to migrate, we need to

  • first: add the new link table
  • second: move all existing author_ids out of the book table and into the book_author table
  • third remove the now-deprecated author_id field from the book table

This is all fine until you run into one small (enormous) problem with step two. If you migrate before you build your models, doctrine is not aware of the book_author > author_id column, and will throw an error. If you migrate after you build your models, doctrine is not aware of the book > author_id column, and will throw an error. Its an old-fashion Damned-If-You-Do-Damned-If-You-Don't scenario. At no point in our application timeline will our models have both relationships. However, our model needs to be aware of both these relationships in the migration in order to work with the data.

In the past, I've executed straight SQL in my migration in order to bypass the ORM entirely. This works fine in some situations, but what if you have complex queries already built to handle the implementation logic? What if you need to use the ORM?

Solution: Sandbox those migrations!

This is actually a very simple problem to fix. All you need to do is sandbox your migrations. After running php symfony doctrine:generate-migration migrate_book_author_data task, I should have a file named something like /path/to/project/lib/migration/doctrine/1274632922_migrate_book_author_data.php. Perfect! Now, lets add the following models:

class Migration_1274632922_Book extends Book
{
  public function setUp()
  {
    parent::setUp();
    $this->hasColumn('author_id', 'integer', '4', array('length' => '4', 'type' => 'integer'));
  }
}

class Migration_1274632922_BookTable extends BookTable
{}

If you're running PHP 5.3, I would definitely recommend namespacing the classes rather than adding prefixes. For example, \1274632922\BookTable, or even \Migration\1274632922\BookTable, to keep your code cleaner.

Now, we cannot add these classes directly in our 1274632922_migrate_book_author_data.php file. Doctrine interprets every class as a migration class, regardless of whether or not it extends Doctrine_Migration_Base. I put them in /path/to/project/lib/migration/model/1274632922/Book.php. Now, include your extended class files in the methods of your migration, and you can migrate as expected.

Note: At the time this migration is run, we assume a previous migration has already added the book_author link table. We have both the book_author table and the author_id field simultaneously present in our database schema.

class migrateBookAuthorData extends Doctrine_Migration_Base
{
  public function preUp()
  {
    include_once(dirname(__FILE__).'/../model/1274632922/Book.php');

    // Fix Data
    $books =  Doctrine::getTable('Migration_1274632922_Book')->findAll();

    foreach ($books as $book) {
      if ($book['author_id']) {
        $bookAuthor = new BookAuthor();
        $bookAuthor['author_id'] = $book['author_id'];
        $bookAuthor['book_id'] = $book['id'];
        $bookAuthor->save();
      }
    }
  }

  public function up()
  {
    $this->removeColumn('book', 'author_id');
  }

  public function down()
  {
    $this->addColumn('book', 'author_id', 'integer', '4', array());
  }

  public function postDown()
  {
    include_once(dirname(__FILE__).'/../model/1274632922/Book.php');

    // Fix Data
    $book =  Doctrine::getTable('Migration_1274632922_Book')->findAll();

    foreach ($books as $book) {
      if ($book['Authors']->count()) {
        $author = $book['Authors']->getFirst();
        $book['author_id'] = $author['id'];
        $book->save();
      }
    }
  }
}

Our migration moves the ids from the book > author_id column to the book_author > author_id column in the preUp() method, and then removes the book > author_id column in the up() method. We do the exact opposite in the down method.

By extending the object and calling the extended object in our migrations, we no longer care what fields are in our application-level models. Our migrations are sandboxed, and will always have the methods/columns it needs to carry out the migration successfully.

1 comment

  • elbouillon - May 28, 2010

    GREAT! Incredibly simple and straighfoward! I’m in love with this method.
    If I understood, you keep a new table after the migration too, in case ot reversing.

Add comment