[WIP] Second level cache #580

Closed
FabioBatSilva wants to merge 71 commits into doctrine:master from FabioBatSilva:second-level-cache

@FabioBatSilva
Member

Hi guys. :)

After looking into some implementations, I ended up with the following solution for the second-level cache.

There is a lot of work to do before merging it, but I'd like to get your thoughts before I go any further with this approach.
I hope my drafts are good enough to explain the idea:

Cache strategies

* READ_ONLY (DEFAULT)   : Read-only cache can do reads, inserts and deletes; it cannot perform updates or employ any locks.
* NONSTRICT_READ_WRITE  : Nonstrict read-write cache doesn't employ any locks but can do reads, inserts, updates and deletes.
* READ_WRITE            : Read-write cache locks the entity before update/delete.

classes / interfaces

  • Region :
    Defines a contract for accessing an entity/collection data cache. (Doesn't employ any locks.)
  • ConcurrentRegion :
    Defines a contract for a concurrently managed data region. (Locks the data before update/delete.)
  • CacheKey / EntityCacheKey / CollectionCacheKey / QueryCacheKey :
    Defines the entity / collection / query key to be stored in the cache region.
  • EntityHydrator / CollectionHydrator :
    Build cache entries and rebuild entities/collections from the cache.
  • CacheFactory :
    Factory for second-level cache components.

Collection Caching

The most common use case is to cache entities, but we can also cache relationships.
A "collection cache" caches the primary keys of the entities that are members of a collection (OneToMany/ManyToMany), and each element is cached in its own region.

Only identifiers are cached for a collection. When a collection is read from the second-level cache, Doctrine creates proxies based on the cached identifiers; if the application needs to access an element, Doctrine goes to the cache to load the element data.
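
To make the identifier-only idea concrete, here is a minimal sketch. It is not the code from this PR: the key format, the plain arrays standing in for regions, and the `loadCachedCollection()` helper are all hypothetical. It also skips proxy creation and resolves every element eagerly, and it assumes that a missing element means the whole collection falls back to the database.

```php
<?php

// Hypothetical in-memory regions; plain arrays stand in for a real cache backend.
$entityRegion = [
    'City:1' => ['id' => 1, 'name' => 'Sao Paulo'],
    'City:2' => ['id' => 2, 'name' => 'Campinas'],
];

// The collection entry for the "cities" of State 1 holds identifiers only.
$collectionRegion = [
    'State:1:cities' => [1, 2],
];

/**
 * Resolve a cached collection: read the identifier list, then look each
 * element up in the entity region. Returns null on any miss (an assumption
 * of this sketch) so the caller can fall back to the database.
 */
function loadCachedCollection(
    array $collectionRegion,
    array $entityRegion,
    string $collectionKey,
    string $elementClass
): ?array {
    if (!isset($collectionRegion[$collectionKey])) {
        return null; // collection was never cached
    }

    $elements = [];
    foreach ($collectionRegion[$collectionKey] as $id) {
        $entityKey = $elementClass . ':' . $id;
        if (!isset($entityRegion[$entityKey])) {
            return null; // element evicted from its own region
        }
        $elements[] = $entityRegion[$entityKey];
    }

    return $elements;
}
```

Because the element data lives only in the entity region, updating a City touches one cache entry no matter how many collections reference it.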

Query Cache

The query cache does not cache the state of the actual entities in the result set;
it caches only identifier values for an individual query.
So the query cache should always be used in conjunction with the second-level cache.

Query Cache validation

UpdateTimestampsCacheRegion (Hibernate approach) :
  • The timestamp cache region keeps track of the last update for each table (updated on each table modification).
  • A single timestamps region is used by all query cache instances.
  • Most Hibernate cache implementations recommend not configuring a cache timeout at all.
  • When a query is loaded from the cache, the timestamp region is checked for all tables in the query.
  • If the timestamp of the last update on a table is greater than the time the query results were cached,
    the entry is removed and the query goes straight to the database.
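
The validation rule above could be sketched roughly as follows. This is an illustration only: the `TimestampValidatedQueryCache` class and its method names are hypothetical and are not part of this PR or of Hibernate, and timestamps are passed in explicitly to keep the example deterministic.

```php
<?php

/**
 * Hypothetical sketch of timestamp-based query cache validation.
 * One instance plays both roles: the timestamps region and the query region.
 */
final class TimestampValidatedQueryCache
{
    /** @var array<string, float> last update time per table */
    private $tableTimestamps = [];

    /** @var array<string, array> cached query entries keyed by query key */
    private $entries = [];

    /** Record that a table was modified at the given time. */
    public function touchTable(string $table, float $time): void
    {
        $this->tableTimestamps[$table] = $time;
    }

    /** Cache the identifiers of a query result together with the tables it read. */
    public function put(string $queryKey, array $ids, array $tables, float $time): void
    {
        $this->entries[$queryKey] = ['ids' => $ids, 'tables' => $tables, 'time' => $time];
    }

    /** Return the cached identifiers, or null if the entry is missing or stale. */
    public function get(string $queryKey): ?array
    {
        if (!isset($this->entries[$queryKey])) {
            return null;
        }

        $entry = $this->entries[$queryKey];
        foreach ($entry['tables'] as $table) {
            $lastUpdate = $this->tableTimestamps[$table] ?? 0.0;
            if ($lastUpdate > $entry['time']) {
                // A table changed after the result was cached: drop the entry.
                unset($this->entries[$queryKey]);
                return null;
            }
        }

        return $entry['ids'];
    }
}
```

A null return is the signal to run the query against the database and cache the fresh identifiers again.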

OPERATIONS

INSERT :

*************************************************************************************************
UnitOfWork#commit
    Connection#beginTransaction
    Persister#executeInserts
    Connection#commit
    CachedPersister#afterTransactionComplete
        -> Region#put
*************************************************************************************************
                    | READ-ONLY             | NONSTRICT-READ-WRITE      | READ-WRITE            |
-------------------------------------------------------------------------------------------------
pre-insert          |                       |                           |                       |
-------------------------------------------------------------------------------------------------
on-insert           |                       |                           |                       |
-------------------------------------------------------------------------------------------------
after-transaction   | add item to the cache | add item to the cache     | add item to the cache |
-------------------------------------------------------------------------------------------------

UPDATE :

*************************************************************************************************
UnitOfWork#commit
    Connection#beginTransaction
    CachedPersister#update
        -> Region#lock
        -> execute
    Connection#commit
    CachedPersister#afterTransactionComplete
        -> Region#put
        -> Region#unlock
*************************************************************************************************
                    | READ-ONLY             | NONSTRICT-READ-WRITE      | READ-WRITE            |
-------------------------------------------------------------------------------------------------
pre-update          |                       |                           | lock item             |
-------------------------------------------------------------------------------------------------
on-update           | throws exception      |                           |                       |
-------------------------------------------------------------------------------------------------
after-transaction   |                       |  update item cache        | remove item cache     |
-------------------------------------------------------------------------------------------------

DELETE :

*************************************************************************************************
UnitOfWork#commit
    Connection#beginTransaction
    CachedPersister#delete
        -> Region#lock
        -> execute
    Connection#commit
    CachedPersister#afterTransactionComplete
        -> Region#evict
*************************************************************************************************
                    | READ-ONLY             | NONSTRICT-READ-WRITE      | READ-WRITE            |
-------------------------------------------------------------------------------------------------
pre-remove          |                       |                           |                       |
-------------------------------------------------------------------------------------------------
on-remove           |                       |                           | lock item             |
-------------------------------------------------------------------------------------------------
after-transaction   | remove item cache     |  remove item cache        | remove item cache     |
-------------------------------------------------------------------------------------------------

USAGE :

<?php

/**
 * @Entity
 * @Cache("NONSTRICT_READ_WRITE")
 */
class State
{
    /**
     * @Id
     * @GeneratedValue
     * @Column(type="integer")
     */
    protected $id;
    /**
     * @Column
     */
    protected $name;
    /**
     * @Cache()
     * @ManyToOne(targetEntity="Country")
     * @JoinColumn(name="country_id", referencedColumnName="id")
     */
    protected $country;
    /**
     * @Cache()
     * @OneToMany(targetEntity="City", mappedBy="state")
     */
    protected $cities;
}
<?php

$em->persist(new State($name, $country));
$em->flush();                                // Put into cache

$em->clear();                                // Clear entity manager

$state   = $em->find('Entity\State', 1);     // Retrieve item from cache
$country = $state->getCountry();             // Retrieve item from cache
$cities  = $state->getCities();              // Load from database and put into cache

$state->setName("New Name");
$em->persist($state);
$em->flush();                                // Update item cache

$em->clear();                                // Clear entity manager

$em->find('Entity\State', 1)->getCities();   // Retrieve from cache


$em->getCache()->containsEntity('Entity\State', $state->getId()); // Check if the cache exists
$em->getCache()->evictEntity('Entity\State', $state->getId());    // Remove an entity from cache
$em->getCache()->evictEntityRegion('Entity\State');               // Remove all entities from cache

$em->getCache()->containsCollection('Entity\State', 'cities', $state->getId());   // Check if the cache exists        
$em->getCache()->evictCollection('Entity\State', 'cities', $state->getId());      // Remove an entity collection from cache
$em->getCache()->evictCollectionRegion('Entity\State', 'cities');                 // Remove all collections from cache

TODO :

  • Handle many to many collection
  • Handle inheritance
  • Remove/add collection items on update
  • Improve region tests
  • Improve cached persisters coverage
  • Implement xml / yml / php drivers
  • Implement transaction region (Improve cache drivers)
  • Implement query cache (need more tests)
  • Read and write region when using query cache
  • Update documentation
  • Implement cache hits and misses log
  • .... ????

@doctrinebot

Hello,

thank you for posting this Pull Request. I have automatically opened an issue on our Jira Bug Tracker for you with the details of this Pull Request. See the link:

http://doctrine-project.org/jira/browse/DDC-2295

Member

Why countable? Most caches cannot know the number of entries.

@beberlei
Member

So far it looks awesome, specifically that it doesn't affect the code path of the current implementation of UnitOfWork much. It would be really awesome if we could keep it that way. With regard to the changes in the persisters, we might need to employ a strategy pattern to get this working better. Think of the getEntityPersister() method detecting which code strategy to inject into the persister. I want to avoid cluttering the persisters even more; they already decide on so many different configurations.

Regarding the implementation, I made comments on the cache; the mapEntityKeys() implementation is not good the way it is.

How will locking of items work? Can you explain? It seems to be a critical thing and I don't see how it can be implemented. Java has some pretty powerful caches.

Otherwise keep up the good work, and try not to affect the current code paths too much ;)

@beberlei
Member

@FabioBatSilva just thinking - is it possible to refactor the persisters from inheritance into composition?

The inherited code is either SQL building or update/insert related. By splitting the persisters API up into more single responsibilities, it might be easier to introduce caching in a clean way. Currently, just for by-id queries, it's rather simple; I suppose the persisters require many more changes for the other caches.

Member

The typehint is wrong. It should either be the ClassMetadata interface from Common or ClassMetadataInfo (which implements all the logic in the ORM and is the parent of ClassMetadata for BC reasons)

@beberlei
Member

@stof sorry to barge in, but we should focus code reviews on the functional level for now. This PR will take another 1-2 months to get stable and will be very tricky. It would be much easier to follow in that period if the comments were related to the implementation only for now.

@FabioBatSilva
Member Author

@beberlei

How will locking of items work? Can you explain? It seems to be a critical thing and i don't see how it can be implemented. Java has some pretty powerful caches.

If I got it correctly, Hibernate supports two types of concurrently managed data.

read/write

  • It basically locks the item while trying to update the cache
  • UPDATE
    • lockItem :
      • Stop any other transactions reading or writing this item from/to the cache. (assuming that no other transaction holds a simultaneous lock).
      • Send them straight to the database instead.
    • afterUpdate
      • Update item cache.
    • unlockItem
      • Release the lock on the item. Other transactions may now read / write the cache

Transactional -> (This one we CAN NOT implement.)

  • Provides support for fully transactional cache providers based on JTA.
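
As a rough illustration of the read/write flow described above (lockItem, afterUpdate, unlockItem), here is a single-process sketch. The `LockingRegionSketch` class, its method signatures, and the token scheme are hypothetical and are not the ConcurrentRegion interface from this PR; a real implementation needs a lock that is shared between processes.

```php
<?php

/**
 * Hypothetical in-memory region illustrating the read/write strategy.
 * While an item is locked, readers get a miss and go to the database.
 */
final class LockingRegionSketch
{
    /** @var array<string, mixed> cached items */
    private $items = [];

    /** @var array<string, string> lock token per locked key */
    private $locks = [];

    /** Return the cached value, or null on a miss or while the item is locked. */
    public function get(string $key)
    {
        if (isset($this->locks[$key])) {
            return null; // locked: send the reader straight to the database
        }

        return $this->items[$key] ?? null;
    }

    /** Store a value; refused while the item is locked by another holder. */
    public function put(string $key, $value, ?string $lockToken = null): bool
    {
        if (isset($this->locks[$key]) && $this->locks[$key] !== $lockToken) {
            return false;
        }
        $this->items[$key] = $value;

        return true;
    }

    /** Acquire the lock; returns a token, or null if the item is already locked. */
    public function lock(string $key): ?string
    {
        if (isset($this->locks[$key])) {
            return null;
        }
        $token = uniqid('lock_', true);
        $this->locks[$key] = $token;

        return $token;
    }

    /** Release the lock; only the holder of the matching token may do so. */
    public function unlock(string $key, string $token): bool
    {
        if (!isset($this->locks[$key]) || $this->locks[$key] !== $token) {
            return false;
        }
        unset($this->locks[$key]);

        return true;
    }
}
```

An update would then call `lock()` before executing the SQL, `put()` with the token after the transaction commits, and `unlock()` last, which mirrors the UPDATE sequence in the PR description.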

@FabioBatSilva
Member Author

The inherited code is either SQL Building, or update/insert related. By splitting the persisters API up in more single responsibilities it might be easier to introduce caching in a clean way. Currently just for by id queries its rather simple, i suppose the persisters require much more changes for the other caches.

Seems good.
While I was looking into the Hibernate implementation, I saw an interesting approach to handling these responsibilities.
They use commands to encapsulate each persister operation,
so persisters are accessible inside the commands and are used basically for SQL building; the operations themselves are executed inside commands like InsertCommand / UpdateCommand.

@austinh

austinh commented Mar 20, 2013

Does anyone know when this second level cache will be released? I was looking to implement something like this myself, especially because result caching of collections in 2.3 causes duplication in the cache (too much memory) and outdated objects (inconvenient for users), and this second level cache implementation seems to solve all those problems. I'm not sure if I should wait for the upgrade (it depends on how long it is before 2.5).

@FabioBatSilva
Member Author

Wow !!

71 Commits, 141 Files Changed..

I'm going to rebase and reopen this PR..

@Majkl578
Contributor

Majkl578 commented Oct 1, 2013

I don't get why you didn't just force-push... :)

@FabioBatSilva FabioBatSilva deleted the second-level-cache branch December 20, 2013 20:14
@skafandri

Very nice work!!
Does anyone have an estimate of when this will be released?

@guilhermeblanco
Member

@skafandri this already got into the 2.5 branch (master) as of #808

@skafandri

Awesome! Many thanks!

@jarrettj

Hi, awesome work.

I'm using ZF2 + doctrine currently. Could you please tell me how to enable this feature using configuration files? I use module.config.php to configure my entitymanager.

Followed the documentation at http://doctrine-orm.readthedocs.org/en/latest/reference/second-level-cache.html but it does not have an example of how to pass the setting as a configuration option.

I've tried putting it in the module.config.php in the doctrine=>configuration=>orm_default section as second_level_cache_enabled, but that does not work.

Thanks.

Cheers.

@FabianKoestring

@jarrettj I think you have to wait for this PR to be merged:
doctrine/DoctrineORMModule#295

@skafandri

@jarrettj in your entity manager config add:

    second_level_cache:
        enabled: true
        region_cache_driver: memcached
        regions:
            region_name:
                lifetime: 3600
                type: default

@jarrettj

@skafandri have you actually got it to work in zend framework 2 + doctrine? I tried adding your options to module.config.php:

        'configuration' => array(
            'orm_default' => array(
                'generate_proxies'  => true,
                'proxy_dir'         => __DIR__ . '/../src/Application/Proxy',
                'proxy_namespace'   => 'Application\Proxy',
                'filters'           => array(),
                'string_functions'   => array(
                    'REGEXP'  => 'Application\DoctrineFunction\Regexp'
                ),
                'second_level_cache' => array(
                    'enabled'             => true,
                    'region_cache_driver' => 'memcached',
                    'regions'             => array(
                        'region_name' => '',
                        'lifetime'    => 3600,
                        'type'        => 'default'
                    )
                ),
            )
        ),

Sorry for being dumb! No luck. My errors:

Fatal error: Uncaught exception 'Zend\Stdlib\Exception\BadMethodCallException' with message 'The option "second_level_cache" does not have a matching "setSecondLevelCache" setter method which must be defined' in /var/www/litigation/vendor/zendframework/zendframework/library/Zend/ServiceManager/ServiceManager.php on line 1093
( ! ) Zend\Stdlib\Exception\BadMethodCallException: The option "second_level_cache" does not have a matching "setSecondLevelCache" setter method which must be defined in /var/www/litigation/vendor/zendframework/zendframework/library/Zend/Stdlib/AbstractOptions.php on line 110

Thanks. Cheers.

@FabianKoestring

@jarrettj He is talking about the entity manager config. If you want to use it with the ZF2 module, you definitely have to wait for doctrine/DoctrineORMModule#295.

@jarrettj

Cool @FabianKoestring thanks! Then I'll patiently wait. :)

@jarrettj

Well, the waiting is over @bakura10 thanks for merging 2nd level cache into doctrine-orm-module. Added the following to module.config.php:

                'second_level_cache' => array(
                    'enabled'               => true,
                    'default_lifetime'      => 7200,
                    'default_lock_lifetime' => 500,

And updated my composer.json:

        "doctrine/doctrine-orm-module": "master",
        "doctrine/orm": "2.5.*@dev",

Hope that helps anyone else using ZF2 + doctrine.

@jarrettj

How would you set the file lock region directory to use the READ_WRITE mode?

@bakura10
Member

Setting the cache usage is done in your entity with annotations. Setting up the second level cache is a two-step process: you mark your entity as cacheable, with an optional region, and you configure the region through arrays.

@jarrettj

@bakura10 All my entities have the Cache("READ_WRITE") annotation. But that does not work. It throws an error saying:

Fatal error: Uncaught exception 'LogicException' with message 'If you what to use a "READ_WRITE" cache an implementation of "Doctrine\ORM\Cache\ConcurrentRegion" is required, The default implementation provided by doctrine is "Doctrine\ORM\Cache\Region\FileLockRegion" if you what to use it please provide a valid directory, DefaultCacheFactory#setFileLockRegionDirectory(). ' in /var/www/litigation/vendor/doctrine/orm/lib/Doctrine/ORM/Cache/DefaultCacheFactory.php on line 225

The configuration documentation does not have any details on how to set this file lock region directory.

If I use the Cache("NONSTRICT_READ_WRITE") annotation, it works, but it does not cache relationships, even when I add the Cache("NONSTRICT_READ_WRITE") annotation to the associations and their target entity classes.

I thought that if I tried Cache("READ_WRITE") instead, the entity associations might work.

@bakura10
Member

Looks like an option is missing. I'll have a look at this after lunch

@jarrettj

Cool man :)

@bakura10
Member

@jarrettj , addressed by doctrine/DoctrineORMModule#378

Thanks for reporting!

@jarrettj

Only happy to help dude! Will have a look later.

@jarrettj

File lock directory option works! :) Associations still don't work, though. :(

I'll have a closer look tomorrow. Thanks for the help so far, guys.

@jarrettj

All jesus, all working! Just did a restart of memcache. This is Freakin Awesome! :)

@bakura10
Member

Good to know! :) Second Level Cache is indeed an awesome feature!

@guilhermeblanco
Member

@bakura10 @jarrettj SLC will leverage its power once all cache drivers implement multi get/save support. =D
