When using Doctrine or any other large OO library or framework the number of files that need to be included on a regular HTTP request rises significantly. 50-100 includes per request are not uncommon. This has a significant performance impact because it results in a lot of disk operations. While this is generally no issue in a dev environment, it's not suited for production. The recommended way to handle this problem is to bundle the most-used classes of your libraries into a single file for production, stripping out any unnecessary whitespaces, linebreaks and comments. This way you get a significant performance improvement even without a bytecode cache (see next section). The best way to create such a bundle is probably as part of an automated build process i.e. with Phing.
Maybe the most important rule is to only fetch the data you actually need. This may sound trivial but laziness or lack of knowledge about the possibilities that are available often lead to a lot of unnecessary overhead.
Take a look at this example:
<code type="php">
$record = $table->find($id);
</code>
How often do you find yourself writing code like that? It's convenient but it's very often not what you need. The example above will pull all columns of the record out of the database and populate the newly created object with that data. This not only means unnecessary network traffic but also means that Doctrine has to populate data into objects that is never used. I'm pretty sure you all know why
<code>
SELECT * FROM ...
</code> is bad in any application and this is also true when using Doctine. In fact it's even worse when using Doctrine because populating objects with data that is not needed is a waste of time.
Another important rule that belongs in this category is: **Only fetch objects when you really need them**. Until recently this statement would make no sense at all but one of the recent additions to Doctrine is the ability to fetch "array graphs" instead of object graphs. At first glance this may sound strange because why use an object-relational mapper in the first place then? Take a second to think about it. PHP is by nature a prodecural language that has been enhanced with a lot of features for decent OOP. Arrays are still the most efficient data structures you can use in PHP. Objects have the most value when they're used to accomplish complex business logic. It's a waste of resources when data gets wrapped in costly object structures when you have no benefit of that. Take a look at the following pseudo-code that fetches all comments with some related data for an article, passing them to the view for display afterwards:
</code> Can you think of any benefit of having objects in the view instead of arrays? You're not going to execute business logic in the view, are you? One parameter can save you a lot of unnecessary processing:
<code type="php">
... ->execute(array(1), Doctrine::FETCH_ARRAY);
</code> This will return a bunch of nested php arrays. It could look something like this, assuming we fetched some comments:
<code>
array(5) (
[0] => array(
'title' => 'Title1',
'message' => 'Hello there! I like donuts!',
'author' => array(
'first_name' => 'Bart',
'last_name' => 'Simpson'
)
),
[1] => array(
'title' => 'Title2',
'message' => 'Hullo!',
'author' => array(
'first_name' => 'Homer',
'last_name' => 'Simpson'
)
),
...
)
</code> Here 'author' is a related component of a 'comment' and thus results in a sub-array. If you always use the array syntax for accessing data, then the switch to array fetching requires nothing more than adding the additional parameter. The following code works regardless of the fetching style:
</code> **Array fetching is the best choice whenever you need data read-only like passing it to the view for display. And from my experience, most of the time when you fetch a large amount of data it's only for display purposes. And these are exactly the cases where you get the best performance payoff when fetching arrays instead of objects.**
There are two possible ways when it comes to using DQL. The first one is writing the plain DQL queries and passing them to Doctrine_Connection::query($dql). The second one is to use a Doctrine_Query object and its fluent interface. The latter should be preferred for all but very simple queries. The reason is that using the Doctrine_Query object and it's methods makes the life of the DQL parser a little bit easier. It reduces the amount of query parsing that needs to be done and is therefore faster.
**Efficient relation handling**
When you want to add a relation between two components you should **NOT** do something like the following:
<code type="php">
// Assuming a many-many between role - user
$user->roles[] = $newRole;
</code> This will load all roles of the user from the database if they're not yet loaded! Just to add one new link! Do this instead:
<code type="php">
// Assuming a many-many between role - user, where UserRoleXref is the cross-reference table
A bytecode cache like APC will cache the bytecode that is generated by php prior to executing it. That means that the parsing of a file and the creation of the bytecode happens only once and not on every request. This is especially useful when using large libraries and/or frameworks. Together with file bundling for production this should give you a significant performance improvement. To get the most out of a bytecode cache you should contact the manual pages since most of these caches have a lot of configuration options which you can tweak to optimize the cache to your needs.