Zend_Db

Database Replication Adapter for Zend Framework Applications

Last updated: 21 Feb, 2010

Database replication is an option that allows the content of one database to be replicated to another database or databases, providing a mechanism to scale out the database. Scaling out the database allows more activities to be processed and more users to access the database by running multiple copies of the databases on different machines.

The problem with monolithic database designs is that they don’t establish an infrastructure that allows for rapid changes in business requirements. Here is where database replication comes into play. Replication can be used effectively for many different purposes, such as separating data entry and reporting, distributing load across servers, providing high availability, etc.

Zf_Orm_DataSource is a Zend Framework Replication Adapter class flexible enough to support the most commonly used replication scenarios:

Single-Master Replication

In the simplest replication scenario, the master copy of directory data is held in a single read-write replica on one server called the supplier server. The supplier server also maintains a changelog for this replica. On another server, called the consumer server, there can be multiple read-only replicas.

Configuration array:

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.slave-1.com'),
        array('host' => 'db.slave-2.com')
    )
);

// or ...

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'dbname'         => 'test',
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com', 'username' => 'user1', 'password'=>'pass1'),
        array('host' => 'db.slave-1.com', 'username' => 'user2', 'password' => 'pass2'),
        array('host' => 'db.slave-2.com', 'username' => 'user3', 'password' => 'pass3')
    )
);

In the setup above, all writes will go to the master connection and all reads will be randomly distributed across the available slaves.
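The routing rule can be sketched roughly as follows. This is only an illustration, not the adapter's actual code: the function name pickHost() is hypothetical, and the sketch assumes that the first master_servers entries of servers are the masters and the rest are slaves.

```php
<?php
// Hypothetical sketch: choose a host for a 'master' (write) or 'slave' (read)
// connection. Assumes the first 'master_servers' entries are the masters.
function pickHost(array $config, $type)
{
    $masters = array_slice($config['servers'], 0, $config['master_servers']);
    $slaves  = array_slice($config['servers'], $config['master_servers']);

    $pool = ($type === 'master') ? $masters : $slaves;
    if (empty($pool)) {
        throw new RuntimeException("No $type servers configured");
    }

    // array_rand() returns a random key of the pool
    $server = $pool[array_rand($pool)];
    return $server['host'];
}

$config = array(
    'master_servers' => 1,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.slave-1.com'),
        array('host' => 'db.slave-2.com'),
    ),
);

echo pickHost($config, 'master'), "\n"; // always db.master-1.com
echo pickHost($config, 'slave'), "\n";  // db.slave-1.com or db.slave-2.com
```

With a single master the write pool has exactly one entry, so only reads are actually randomized.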

Multi-Master Replication

In a multi-master configuration, two or more supplier servers each hold a read-write replica. This type of configuration can work with any number of consumer servers. Each consumer server holds a read-only replica. The consumers can receive updates from all the suppliers. The consumers also have referrals defined for all the suppliers to forward any update requests that the consumers receive.

$config = array(
    'adapter'        => 'Pdo_Mysql',
    'driver_options' => array(PDO::ATTR_TIMEOUT=>5),
    'username'       => 'root',
    'password'       => 'root',
    'dbname'         => 'test',
    'master_servers' => 2,
    'master_read'    => true,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.master-2.com')
    )
);
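The effect of master_read can be pictured with a small sketch. The helper readPool() is hypothetical, and the sketch assumes that 'master_read' => true means the masters may also serve reads, which is what the configuration above appears to rely on since it lists no slaves.

```php
<?php
// Hypothetical sketch: build the list of hosts eligible for reads.
// Assumes 'master_read' => true means masters may also serve reads.
function readPool(array $config)
{
    $masters = array_slice($config['servers'], 0, $config['master_servers']);
    $slaves  = array_slice($config['servers'], $config['master_servers']);

    $pool = $slaves;
    if (!empty($config['master_read'])) {
        $pool = array_merge($masters, $slaves);
    }

    $hosts = array();
    foreach ($pool as $server) {
        $hosts[] = $server['host'];
    }
    return $hosts;
}

$config = array(
    'master_servers' => 2,
    'master_read'    => true,
    'servers'        => array(
        array('host' => 'db.master-1.com'),
        array('host' => 'db.master-2.com'),
    ),
);

print_r(readPool($config)); // both masters are eligible for reads
```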

Using a distributed memory caching system

Database connections are expensive and it’s very inefficient for an application to try to connect to a server that is down or not responding. A distributed memory caching system can help alleviate this problem by keeping a list of all the failed connections in memory, sharing that information across multiple servers and allowing the application to access it before attempting to open a connection.

To enable this option, you have to pass an instance of the Memcached adapter class:

class Bootstrap extends Zend_Application_Bootstrap_Bootstrap
{
    protected function _initCache()
    {
        ...
    }

    protected function _initDatabase()
    {
        // make sure the cache resource has been initialized first
        $this->bootstrap('cache');

        $config = include APPLICATION_PATH . '/config/database.php';
        $cache = $this->getResource('cache');
        $dataSource = new Zf_Orm_DataSource($config, $cache, 'cache_tag');
        Zend_Registry::set('dataSource', $dataSource);
    }
}
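The failed-connection list itself could look something like the sketch below. It is not the adapter's implementation: the class FailedHostList and its methods are hypothetical, a plain array stands in for the shared cache (such as Memcached), and the expiry time after which a host is retried is an assumption.

```php
<?php
// Hypothetical sketch: remember hosts that recently failed so the
// application can skip them. A plain array stands in for the shared cache;
// in a real setup the list would be shared across servers.
class FailedHostList
{
    private $failedUntil = array(); // host => Unix timestamp
    private $ttl;

    public function __construct($ttl = 60)
    {
        $this->ttl = $ttl; // seconds before a failed host is retried (assumed)
    }

    public function markFailed($host, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        $this->failedUntil[$host] = $now + $this->ttl;
    }

    public function isFailed($host, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        return isset($this->failedUntil[$host]) && $this->failedUntil[$host] > $now;
    }

    // Return only the hosts not currently marked as failed.
    public function filter(array $hosts, $now = null)
    {
        $available = array();
        foreach ($hosts as $host) {
            if (!$this->isFailed($host, $now)) {
                $available[] = $host;
            }
        }
        return $available;
    }
}

$failed = new FailedHostList(60);
$failed->markFailed('db.slave-1.com', 1000);

// at t=1010 only db.slave-2.com is returned; from t=1060 both are available
print_r($failed->filter(array('db.slave-1.com', 'db.slave-2.com'), 1010));
```

Checking this list before opening a connection is what saves the application from waiting on a server that is already known to be down.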

And here is a short example of how the Replication Adapter might be used in a ZF application:

class TestDao
{
    public function fetchAll()
    {
        $db = Zend_Registry::get('dataSource')->getConnection('slave');
        $query = $db->select()->from('test');
        return $db->fetchAll($query);
    }

    public function insert($data)
    {
        $db = Zend_Registry::get('dataSource')->getConnection('master');
        $db->insert('test', $data);
        return $db->lastInsertId();
    }
}

Source Code:
http://fedecarg.com/repositories/show/replicationadapter


Author: Federico

by News Robot on August 28, 2010

Database Abstraction Layers Must Live!

I come preaching true hope, against the fallacies.

I’ve heard the arguments for and against database abstraction layers (DALs) time and time again. I must say first, I agree with them all, both sides, equally. Interestingly, I can put the vocal proponents of each side of the argument in one of two boxes: a programmer guy box, or a database guy box. For some unknown reason though, they never seem to see eye to eye.

Honestly though, I like to put myself in the middle of that argument. I see both sides. I think fine-tuning an application’s core business with vendor-specific features is tremendously important; after all, that is why there are so many competing database vendors. Generally speaking of database-driven projects, I feel like planning to use a specific vendor up front, knowing its pros and cons, and tailoring an application to the chosen database’s strengths can only help in the long run. Also, I feel that building a database model first, before any code, offers more performance and scalability advantages than code-first development does.

That said, I also see value in using a database as a simple data-store when the actual database is not a key component of the overall application. That’s right, it is completely valid to say that the data-storage & database component of an application sometimes is not the key component; a database guy probably will never agree with you there. Just as there are programmers who swear by this code first, database later mantra, there are database developers that will swear by the database first, code later mantra.

The fact is, each project is unique. It’s this uniqueness of projects and their execution that ultimately shapes the perspectives of developers as well as the tools they write and consume. To say that one mantra is clearly a better choice over another is simply being ignorant.

The Use Case of Abstraction Layers

To be honest, I don’t really buy the “I might switch database vendors at some point” argument either, as Jeremy Zawodny points out. For larger projects (on the scale of the Facebooks, the Twitters, etc.), switching the database underneath after a project has been in production is a monumental task, regardless of whether you have an abstraction layer or not. Chances are you used some of the database-specific features, not to mention that you now have a large set of mission-critical data that also has to be ported. Long story short, it’s never as easy as swapping out the abstraction layer’s database adapter.

What I will buy, though, is that there are some problems that fall in the thicker end of the Pareto Principle that can be solved with a database abstraction layer. For the uninitiated, the Pareto Principle is effectively the 80/20 rule. Applied to software use cases, the 80% is the majority of use cases. These use cases are generally not that interesting in terms of database interaction. To give it a label, we can call these the CRUD, BREAD, or <<insert your favorite terminology here>> operations. That is not to say that these operations are not important, but they are not special. In fact, they are so un-special that we can just about apply a standard query syntax (SQL-92) to them and expect that the query is both portable between databases and common across applications that wish to use them.

This is where database abstraction fits in. As a developer, you’ll come across this problem time and time again. A large portion of an application is CRUD screens, and the smaller, more interesting part of your application is your reporting screens. With an abstraction layer, we are able to code against a unified API as well as have a layer that will produce consistent and vendor-compatible queries. This allows us to build more specialized data access layers (patterns) for multiple database vendors with great ease. You want Table Gateway: done. You want Row Gateway: done. You want Active Record: done. Each can be implemented to tackle the 80% part of the 80/20 rule when applied to the database-centric business code of an application.

The Slow Path & The Fast Path

When I talk about this 80/20 rule in terms of the applications we write, I like to further refine the terminology so that it is easier to visualize. The most prominent terms that help developers visualize the 80/20 rule in their application are the slow path and the fast path of the application. Each path has a set of characteristics that sets it apart from the other:

Slow Path:

  • Performance is not of primary importance
  • Has an interactive nature
  • Validation and verification of data are of high priority
  • Application to data-store interactions are fairly trivial
  • Does not comprise the application’s core business logic

Fast Path:

  • Performance is of importance
  • Limited interactive nature, information flow is fairly static (non-interactive)
  • Flow of information consists of already verified and validated data (originates from the database)
  • Application to data-store interaction can become complex (JOINs, SUB-SELECTS, VIEWS)
  • Is the core business of the application

To get a better understanding of how the terms are applied, let’s look at a typical web application. Generally speaking, there are a few web-based forms that users interact with. These forms are the entry point of a code path that does not get a lot of throughput. This is generally because forms are submitted by people, and people can only type and submit forms so fast. In addition to being a less traveled code path, it also has a few checks along the way: validation of data and verification of data. Typically, the problems of verification and validation of data are not too unique to the application being executed. In fact, the web forms, validation and verification problems have been solved over and over again by various libraries.

On the other side of the equation, there is the aggregation and merging of the stored data (which inevitably came from the aforementioned web forms). Since the unique aggregation and processing of this data is the core business of said application, it stands to reason that this code path will be more heavily traveled by users. This is the fast path. The problems solved in this code path are generally unique, and since they are unique, it’s hard to find an off-the-shelf solution to them.

Since this is where the money is to be made, it also stands to reason that developers should concentrate their efforts on the fast path of their application. This means they should solve the slow path problems of their application with existing tried and tested solutions. This includes generic forms solutions, validation and verification libraries and, yes, database abstraction layers.

Getting Cozy With Zend_Db, a Database Abstraction Layer

Now that we’ve made a use case for DALs, what would one look like? Well, I’ll use Zend Framework’s Zend_Db as my example.

The connection code:

$dbAdapter = Zend_Db::factory(
    'Pdo_Mysql', // could also be Pdo_Sqlite, Mysqli, Db2, or even Oracle
    array(
        'username' => 'test_user',
        'password' => 'test_pwd',
        'dbname'   => 'test'
    )
);

You’ll note that since this factory takes a standardized array, it makes it trivial to swap out various connection information for different adapters.

Simple queries:

$data = array(
    'name'        => 'Remember the Milk',
    'description' => '2% Milk',
    'due_on'      => '2009-07-15',
);
$dbAdapter->insert('todo_list', $data); // insert that data

// update the row that was just inserted
$lastInsertId = $dbAdapter->lastInsertId('todo_list');
$dbAdapter->update('todo_list', array('completed' => 'YES'), 'id = ' . (int) $lastInsertId);

// delete the same row
$dbAdapter->delete('todo_list', 'id = ' . (int) $lastInsertId);

Here you’ll notice the generic and abstracted nature of this API. Since several database interactions, such as INSERT, UPDATE and DELETE, are consistent across the board, it makes sense that we can create a generic API for handling them. These interactions (INSERT, UPDATE and DELETE) are the mutation methods of a database and, as such, represent the predominant way of getting data into a system.
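To see why these operations generalize so easily, here is a minimal sketch of how a generic insert helper could turn a column-to-value array into a parameterized statement. This is not Zend_Db’s actual implementation; the function buildInsert() is hypothetical, and vendor-specific identifier quoting is deliberately left out.

```php
<?php
// Hypothetical sketch of how a generic insert() helper could turn a
// column => value array into a parameterized SQL statement. Identifier
// quoting, which differs per vendor, is deliberately left out.
function buildInsert($table, array $data)
{
    $columns      = array_keys($data);
    $placeholders = array_fill(0, count($columns), '?');

    $sql = 'INSERT INTO ' . $table
         . ' (' . implode(', ', $columns) . ')'
         . ' VALUES (' . implode(', ', $placeholders) . ')';

    // The values are bound separately when the statement is executed
    return array($sql, array_values($data));
}

list($sql, $bind) = buildInsert('todo_list', array(
    'name'   => 'Remember the Milk',
    'due_on' => '2009-07-15',
));

echo $sql, "\n"; // INSERT INTO todo_list (name, due_on) VALUES (?, ?)
```

Nothing in the generated statement is specific to one vendor, which is the point: the only part an adapter would need to customize is how identifiers are quoted and how the values are bound.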

For all intents and purposes though, simple SELECTs are fairly standardized too. They are standardized enough to complement the INSERT, UPDATE, and DELETE abstractions, so that we can find the actual rows on which to perform these mutation operations.

Now that we have a simple and consistent API for doing simple SELECTs, INSERTs, UPDATEs, and DELETEs, we can implement something a little more interesting: the table & row gateway:

Zend_Db_Table_Abstract::setDefaultAdapter($dbAdapter);
$userTable = new Zend_Db_Table('user'); // ZF 1.9 feature
$userRowset = $userTable->find(5); // find user by id 5 (primary key); returns a rowset
$userRow = $userRowset->current(); // the single matching row
echo $userRow->username;

Immediately, you should see the inherent value in the above example. Rudimentary and common tasks can now be handled with a consistent and simple API. But what happens when you’ve started using this DAL, and you want to use a vendor specific feature? Well..

// assuming what you want is really REPLACE or INSERT IGNORE from mysql
$dbAdapter->query('INSERT IGNORE INTO configuration (name, value) VALUES (?, ?)', array($name, $value));

// OR
$dbAdapter->query('REPLACE INTO configuration (name, value) VALUES (?, ?)', array($name, $value));

As you can see, the query method of our database adapter will allow us to pass custom SQL into the database thus taking advantage of vendor specific features.

What if you want to combine both paradigms for ultimate flexibility?


// assuming Zend_Db_Table_Row, with a FriendshipReference rule
$friendRowset = $currentUserRow->findDependentRowset('User', 'FriendshipReference');

// collect friend ids, cast to integers so they are safe to embed in SQL
$friendIds = array();
foreach ($friendRowset as $friendRow) {
    $friendIds[] = (int) $friendRow->related_user_id;
}

$inClause = ' IN (' . implode(',', $friendIds) . ')';

$select = $dbAdapter->select();
$select
    ->from('user', array(
        'user_id',
        'related_user_id',
        'became_friends_on'
        ))
    ->where('user_id ' . $inClause);

// interact with driver directly (assumes the Mysqli adapter, so getConnection() returns a mysqli object)
$mysqli = $dbAdapter->getConnection();
$mysqli->query('CREATE TEMPORARY TABLE friend ('
        . ' `user_id` int(11) NOT NULL,'
        . ' `related_user_id` int(11) NOT NULL,'
        . ' `became_friends_on` DATE NOT NULL'
        . ' ) ENGINE=MEMORY;'
    );
$mysqli->query('INSERT INTO friend ' . (string) $select);

// query the new temporary friend table
$friendTable = new Zend_Db_Table('friend');
$rows = $friendTable->fetchAll(
    'became_friends_on > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)',
    'became_friends_on'
    );

While the above example is “a bit out there”, it does show that even with a DAL, if it’s flexible enough, you can code as close to or as far away from the database as you like. Ultimately the mantra here is: let’s get the job done in the most effective, efficient and sound way possible.

Conclusions

Simply put, a database abstraction layer is just another tool in the toolbox. You don’t have to completely change your paradigm of programming, nor do you have to apply an all-or-none approach to using a DAL. When applied correctly, you can build out the slow path of your application in little to no time, while leaving extra time for developing and fine-tuning the fast path of your application. And to keep code from becoming unruly, simply apply some best-practices code organization to your project.

Author: Ralph Schindler

by News Robot on August 27, 2010

Michelangelo van Dam’s Blog: Zend Framework data models

Since data mappers were introduced in the Zend Framework Quickstart, lots of developers have been confused about how to use the same pattern with relations. In one of his recent posts, Michelangelo van Dam shows how to model relations between tables and how to create complex data models built from data scattered over a few dependent tables.

I was struggling getting my data models (as described in the Zend Framework Quickstart) to work with relations. My first solution was to create a database view that merged that data using joins to collect this data in a format that I could use in my data models. This was going great until I looked at my database where it contained over 20 views (along with 20 data models, mappers and db table gateways) ! So I said to myself there had to be another way.

His post is interesting reading for developers who would like to follow the data mapper pattern presented in the Zend Framework Quickstart and keep their code clean.

Federico Cargnelutti’s blog: Database Replication Adapter for Zend Framework Applications

Scalability problems are the kind of problems many developers and entrepreneurs would like to have. If you’re already dealing with such problems, you will have noticed an important feature missing in Zend_Db: support for database replication.

Database replication is an option that allows the content of one database to be replicated to another database or databases, providing a mechanism to scale out the database. Scaling out the database allows more activities to be processed and more users to access the database by running multiple copies of the databases on different machines.

In a recent post on his blog, Federico Cargnelutti presents his implementation of a Zend_Db replication adapter. It supports single-master and multi-master architectures, as well as connection status caching. Read his post and code before implementing your own replication adapter; the work may already be done for you.

Federico Cargnelutti’s Blog: Zend Framework DAL DAOs and DataMappers

Zend Framework’s MVC implementation is often criticized because its models are not models in the sense the pattern defines. The Zend Framework team does not even pretend that the framework has models, although it suggests using Zend_Db component classes as a quick and dirty replacement. That does not mean the community can’t do anything about it.

In his latest post, Federico Cargnelutti shows how DAOs, a DAL and Data Mappers can be implemented in Zend Framework. He explains the purpose of each, proposes a directory structure, and shows an example implementation of all the classes.

His approach is very interesting and ready to use. Who knows, maybe it will be the spark for a new component proposal: the model.