Doctrine/ORM - Here be dragons

Published on 2016-02-15

Doctrine is one of the many PHP packages I've taken for granted over the last year. I've counted on it as a solid and reliable abstraction layer for my persistence. I'm now finally disabused of the notion that it is infallible. That's not to say that I actually thought it was bug free in the first place, but I was doing a darn fine job of not thinking about its bugs and failures, really just avoiding the topic as a whole, perhaps subconsciously. I guess it's a side effect, or maybe even a trap, of using existing packages to solve the pesky implementation details.

If you're using Doctrine2, have you ever taken a look at the issue tracker on GitHub? Until now I had not taken the time to really look at the number and type of issues there. Presently, there are 435 open issues marked with the Bug label. That is not to say that it is bad software or that the maintainers are doing a bad job (quite the contrary, as there are 2244 closed bug issues). But it stands as a reminder that the package carries a sizable backlog of known bugs.

What is interesting is that the package works well for most of the "core" functionality: things like persistence of entities, associations and the like. It is when you start using the more peripheral features that you get into troubled waters.

Last week, I wanted to use collections with criteria for filtering associations. Specifically, I had a Therapist entity with a unidirectional many-to-many association to facilities, and I wanted to get a list of those facilities matching a specified name. The "typical" solution that would probably be suggested on Stack Overflow is to write a custom repository method that fetches the facilities with a DQL statement and returns the result, maybe using a paginator. But Doctrine boasts a feature where, if you define your associations as extra-lazy, accessing the association property gives you a lazy collection that waits until you actually need the data to fetch it from the database. Additionally, the collection supposedly supports filtering when you provide a Criteria object that defines conditionals and orderings. Unfortunately, through trial and error I discovered that the ManyToManyPersister, which is responsible for providing this functionality, suffers from 5 separate, compounding bugs.

  • Collections on the non-owning side of an association were broken
  • Limit/offset of the Criteria were being ignored
  • OrderBy of the Criteria was being ignored
  • Field to column name mappings were being ignored
  • All the where conditionals of the Criteria were being treated as an Equal To operation, meaning if your where clause was "count > 3" it was treated as "count = 3"

While not a bug, there is also an implementation detail I dislike: the matching method on PersistentCollection returns a new "filtered" ArrayCollection with the entire result hydrated in memory. I'd prefer if it returned another PersistentCollection that didn't hydrate until you iterated over it.
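For context, here is a minimal sketch of the kind of Criteria being described, run against a plain in-memory ArrayCollection rather than a database-backed association. It assumes doctrine/collections is installed through Composer and is written against the 1.x-era API; the Facility class and its "beds" field are hypothetical stand-ins, not the real entity. It only illustrates the intended semantics (a non-equality condition, an ordering, and a limit/offset) that the ManyToManyPersister was getting wrong.

```php
<?php
// Sketch only: assumes doctrine/collections was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Criteria;

// Hypothetical stand-in for the Facility entity; "beds" is an invented field.
class Facility
{
    private $name;
    private $beds;

    public function __construct($name, $beds)
    {
        $this->name = $name;
        $this->beds = $beds;
    }

    public function getName() { return $this->name; }
    public function getBeds() { return $this->beds; }
}

$facilities = new ArrayCollection([
    new Facility('North Clinic', 2),
    new Facility('East Clinic', 5),
    new Facility('West Clinic', 8),
    new Facility('Central Clinic', 4),
]);

// A "greater than" condition, an ordering, and a limit/offset.
$criteria = Criteria::create()
    ->where(Criteria::expr()->gt('beds', 3))
    ->orderBy(['name' => Criteria::ASC])
    ->setFirstResult(0)
    ->setMaxResults(2);

// matching() returns a new collection holding only the matching elements.
$matched = $facilities->matching($criteria);

$names = [];
foreach ($matched as $facility) {
    $names[] = $facility->getName();
}

echo implode(', ', $names), PHP_EOL;
```

With an in-memory collection the "greater than" condition, the ordering and the limit are all honoured, which is the behaviour one would expect from the database-backed collection as well.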

Currently, there are pull requests to address the first four bugs in the list above, so they will likely be fixed eventually. But for right now, it means the criteria solution is just off the table.

At the very least, I've now got extra motivation to contribute back to open source software as this is a package I like using and a feature I'd really like to make use of. So I'll probably pick up the hammer and see if I can patch up some of this fuckiness with a pull request of my own.

Using DBUnit with Doctrine ORM

Published on 2015-12-26

I decided to make use of DBUnit to test my persistence implementations that utilize the Doctrine ORM framework. It was my first time, so there were a few missteps.

DBUnit is actually a PHPUnit extension, so I needed to install an additional Composer package before I could write test cases that extend the DBUnit test case.

composer require "phpunit/dbunit"

Then I started by extending the DBUnit test case and stubbing the two abstract methods getConnection and getDataSet.

class PostStorageTest extends \PHPUnit_Extensions_Database_TestCase {
    protected function getConnection() {
        // todo get $pdo instance
        return $this->createDefaultDBConnection($pdo, ':memory:');
    }

    protected function getDataSet() {
        return $this->createFlatXMLDataSet(__DIR__ . '/fixtures/default_data.xml');
    }
}

I didn't need to stub getDataSet as it was a trivial implementation. Basically I just gave it an XML file that describes the starting state of the data in the test tables when tests run. So I threw in some initial records that I'll be able to use to test reads against.

Because I'm using SQLite and an in-memory database, I realized I needed to use the same PDO instance that my EntityManager was using. This might not be necessary if I were using a MySQL database or maybe even a non-memory SQLite database, but with the memory option each PDO instance has a separate data store. So I needed to instantiate an EntityManager. There is another caveat: DBUnit expects the schema to be initialized before the tests run (i.e. the tables, views, procedures etc. should already be created). This resulted in moving the EntityManager instantiation to setUpBeforeClass.

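use Doctrine\ORM\EntityManager;
use Doctrine\ORM\Tools\SchemaTool;

class PostStorageTest extends \PHPUnit_Extensions_Database_TestCase {
    /** @var  EntityManager */
    protected static $entity_manager;
    /** @var  SchemaTool */
    protected static $schema_tool;

    public static function setUpBeforeClass() {
        $factory = new EntityManagerFactory(['driver' => 'pdo_sqlite', 'memory' => true]);
        self::$entity_manager = $factory->create(true);
        self::$schema_tool = new SchemaTool(self::$entity_manager);
        // Create the schema from the current model metadata, as described below
        self::$schema_tool->createSchema(
            self::$entity_manager->getMetadataFactory()->getAllMetadata()
        );
    }

    public static function tearDownAfterClass() {
        // Drop everything so a persistent test database would be left clean
        self::$schema_tool->dropDatabase();
    }

    protected function getConnection() {
        // Reuse the EntityManager's PDO instance so both see the same in-memory database
        $pdo = self::$entity_manager->getConnection()->getWrappedConnection();
        return $this->createDefaultDBConnection($pdo, ':memory:');
    }

    protected function getDataSet() {
        return $this->createFlatXMLDataSet(__DIR__ . '/fixtures/default_data.xml');
    }
}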

I used EntityManagerFactory, which is not described here, but is a simple factory that creates an EntityManager instance with the provided connection parameters. It just does the bare minimum to create the manager configured with the proper model paths (for the coolsurfin project). Then I used Doctrine's SchemaTool to create the schema for my tests. I was initially concerned here because I knew I wanted to use Doctrine Migrations to manage my database change scripts. I was worried about how I would configure and execute the migration scripts for the test database (and also how slow it would be to run through all the scripts). But I realized that for testing I don't need to migrate the database instance at all, because with the test instance I'm not concerned about keeping data or maintaining its integrity. Instead I can just use the current state of the schema (based on the model metadata). So I just initialize the schema using the EntityManager instance metadata.

I also ended up adding a drop database call in the tearDownAfterClass. In hindsight this might be unnecessary because other tests will have their own EntityManager and therefore their own PDO instance (memory). But perhaps if at a later date I switched to a persistent MySQL test instance, then I know the test will clean up after itself.

With the setup of the test database instance complete I was then able to write my first test method.

class PostStorageTest extends \PHPUnit_Extensions_Database_TestCase {
    // other member variables
    /** @var  PostStorageInterface */
    protected $storage;

    // previously discussed setUpBeforeClass/tearDownAfterClass/getConnection/getDataSet methods

    protected function setUp() {
        // Let DBUnit load the fixture data before each test
        parent::setUp();
        $this->storage = new PostStorage(self::$entity_manager);
    }

    public function test_it_persists_model() {
        $post = new Post();
        $post->setContent('hello world');
        $post->setCreated(new DateTime('2010-04-24 17:15:23', new DateTimeZone('UTC')));
        // save $post via $this->storage here (the storage call is omitted from this listing)

        $queryTable = $this->getConnection()->createQueryTable(
            'posts', 'SELECT * FROM posts'
        );
        // assertTablesEqual compares tables, so pull the posts table out of the data set
        $expectedTable = $this->createFlatXMLDataSet(__DIR__ . '/fixtures/persisted_post.xml')
            ->getTable('posts');
        $this->assertTablesEqual($expectedTable, $queryTable);
    }
}

I instantiate the concrete Storage implementation in the setUp method because that code won't change and will be necessary for each test. I then wrote my test using the storage instance like I would use it in real source code. I also created another data fixture for what the table data should contain after the new model was persisted. I learned an interesting thing here that wasn't quite documented. When defining row data in the fixtures, the data has to be provided with the columns in the order that they will come back in the query result, because assertTablesEqual doesn't consider data tables with the same data but a different column order as equal. That probably makes sense in cases where the code references result columns by index # (shitty!) but makes it kind of a pain to make sure the fixture data is in the correct column order.

All in all, the experience of setting up this test case was not difficult. The biggest stumbling block was the incorrect first inclination to use the migrations process to set up the schema for tests. After I realized I should just use the current state described by the models, everything worked itself out quite nicely. Even debugging the issue with data set column order was easy, as the failed expectation message was very verbose and useful. It rendered both the expected and actual data tables on screen and it was immediately clear what the problem was.

These notes reference the coolsurfin repository and the PostStorageTest specifically. Feel free to check them out for more context.

A new blogging platform

Published on 2015-11-30

So, this site has been down for a while, since I accidentally canceled its hosting account.

I finally got around to standing something back up in its place. What you see now is a new blogging platform written from scratch with Slim framework (RC 3).

Now I don't have an excuse for not blogging.

I'll fill some content by talking about what I learned with this little exercise.

Spoiler: micro-frameworks are neat

New Copyright Model: Distribution oriented

Published on 2009-09-10

I recently watched the HAR panel discussion on copyright model issues, via TorrentFreak.

Below I describe my naive approach to a "fair" copyright model. I think this model provides the right balance between distribution and protecting the author's rights.

A new copyright model

  1. Once created, any work is automatically copyrighted by the author.
  2. Copyright ownership does not and cannot transfer. Ownership exists from the time of the work’s creation to the death of the author.
  3. Copyright affords the author the right to a monetary percentage (based on the ratio of contribution in collections of work) of any income directly attributed to the usage or distribution of the owned work. The default percentage amount is 25%. However, the author of a copyrighted work may enter into contracts with distributors that alter the default percentage.
  4. Copyright does not grant the right to specify who can or cannot distribute the work or for what amount the work is distributed.
  5. Copyrighted work can be obtained, used, or distributed for free, so long as no monetary gain results from that material.

The goal of this copyright model is to open up copyrighted material in a fair and reasonable way to any party.

Benefits for the author

Because copyrighted material can no longer be held and distributed exclusively, authors open up more sources of distribution and thus more opportunity for profit on their work. Authors have the comfort of knowing that they are entitled to a percentage of any profit made in part through their copyrighted material, and as such have a legal channel to enforce that right.

Benefits to education

Educators no longer have to worry about finding material for their curriculum that fits a budget. Any material can be obtained by the instructor and made available to the students, either by posting it on the course home page or by passing out copies during class. Educators are also free to make a profit by compiling their own compendium of works for the course curriculum that students must purchase from the book store (as is done already), with the simple stipulation that the regulated percentage of the price goes to the authors of the material featured in the collection.

Benefits for business (Small and Large)

Because no person or company can obtain exclusive distribution rights, as a company you have the opportunity to obtain and distribute copyrighted material that is doing well and join in on profiting from said work.

Large companies have a bargaining chip with the author: they can offer wide distribution in exchange for a reduced distribution percentage (i.e. sell a lot, for a little).

Small companies have the benefit of competing with large companies by capitalizing on the creative freedom of the distribution model. For example, creating start up web sites with subscriber downloadable content. Since copyrighted material doesn’t have a set in stone base price, they can choose to offer as little or as much content via subscription or pay per download to create a competitive pricing model against large physical media distribution companies.

Individual usage benefits

Because monetary issues only come into play when profiting off of distributed material, obtaining material is a lot simpler. For one, due to the competitive nature of capitalistic markets, you’re going to find material that is cheaper and far easier to obtain.

Due to the nature of the distribution license, all material is potentially freely available. Only a single person, possibly starting with the author himself, is needed to seed the initial distribution of material for free. Once you (or any person) get a hold of copyrighted material (either by paying for it or by obtaining it for free) you are free to pass the material on to as many or as few people as you want (either for free or for a fee [so long as you pay the author a percentage of the profit]). A site like The Pirate Bay that is NOT ad supported, subscription supported, or otherwise making money is totally legit. An ad-supported Pirate Bay becomes legit simply by paying the authors of the content distributed through its site a percentage of the ad revenue.

The only issue here is determining what percentage of distribution any material represents, how much total revenue was generated and passing forward the author’s royalty percentage.


For example: you release a compilation album of 10 songs by 10 different artists at $100. Regardless of what the album costs to produce, you are selling it for $100. $100 / 10 = $10 per artist as the revenue base. Multiplied by the distribution royalty percentage of 25%, $10 * 0.25 = $2.50, so each artist is entitled to $2.50 for each album sold. Variations of this example involve a single artist with more than one song, plus contractual stipulations reducing the distribution royalty percentage.
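The arithmetic above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the artist labels, and the choice to weight each artist by number of tracks (as a stand-in for "ratio of contribution") are all hypothetical, and the 25% default is the rate proposed in the model.

```php
<?php
/**
 * Royalty owed to each contributor for one sale.
 *
 * $tracksByArtist maps an artist label to the number of tracks contributed
 * (an assumed way of measuring the "ratio of contribution").
 */
function royaltiesPerSale(float $price, array $tracksByArtist, float $rate = 0.25): array
{
    $totalTracks = array_sum($tracksByArtist);
    if ($totalTracks <= 0) {
        throw new InvalidArgumentException('At least one track is required.');
    }

    $royalties = [];
    foreach ($tracksByArtist as $artist => $tracks) {
        // Share of the sale price attributed to this artist
        $revenueBase = $price * $tracks / $totalTracks;
        $royalties[$artist] = round($revenueBase * $rate, 2);
    }

    return $royalties;
}

// The compilation from the example: 10 artists, one song each, sold at $100.
$compilation = [];
for ($i = 1; $i <= 10; $i++) {
    $compilation["artist_$i"] = 1;
}
$royalties = royaltiesPerSale(100.00, $compilation);
printf("%.2f\n", $royalties['artist_1']); // prints 2.50

// Variation: one artist contributes 2 of the 10 songs.
$varied = royaltiesPerSale(100.00, ['artist_a' => 2, 'artist_b' => 8]);
printf("%.2f\n", $varied['artist_a']); // prints 5.00
```

A contract that lowers the royalty percentage is just a different value for the rate argument, for example 0.10 instead of the default.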