
Posted on: 2019-09-29 2019-11-19
Tags: Caching, Hibernate

Hibernate Second-Level Cache

1. Overview

One of the advantages of database abstraction layers such as ORM (object-relational mapping) frameworks is their ability to transparently cache data retrieved from the underlying store. This helps eliminate database-access costs for frequently accessed data.

Performance gains can be significant if read/write ratios of cached content are high, especially for entities which consist of large object graphs.

In this article, we explore the Hibernate second-level cache.

We explain some basic concepts and as always we illustrate everything with simple examples. We use JPA and fall back to Hibernate native API only for those features that are not standardized in JPA.

2. What Is a Second-Level Cache?

Like most other fully-equipped ORM frameworks, Hibernate has the concept of a first-level cache. It is a session-scoped cache which ensures that each entity instance is loaded only once in the persistence context.

Once the session is closed, first-level cache is terminated as well. This is actually desirable, as it allows for concurrent sessions to work with entity instances in isolation from each other.

On the other hand, second-level cache is SessionFactory-scoped, meaning it is shared by all sessions created with the same session factory. When an entity instance is looked up by its id (either by application logic or by Hibernate internally, e.g. when it loads associations to that entity from other entities), and if second-level caching is enabled for that entity, the following happens:

If an instance is already present in the first-level cache, it is returned from there
If an instance is not found in the first-level cache, and the corresponding instance state is cached in the second-level cache, then the data is fetched from there and an instance is assembled and returned
Otherwise, the necessary data are loaded from the database and an instance is assembled and returned

Once the instance is stored in the persistence context (first-level cache), it is returned from there in all subsequent calls within the same session until the session is closed or the instance is manually evicted from the persistence context. Also, the loaded instance state is stored in L2 cache if it was not there already.
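The lookup order described above can be sketched with plain maps. This is a simplified model and not Hibernate's implementation: the class name TwoLevelLookupSketch, the SketchSession type and the simulated database are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the lookup order; NOT Hibernate's implementation.
class TwoLevelLookupSketch {

    // Stands in for the SessionFactory-scoped second-level cache.
    static final Map<Long, String> secondLevel = new ConcurrentHashMap<>();

    // Counts simulated database round trips.
    static int databaseHits = 0;

    // Stands in for a session with its own first-level cache.
    static class SketchSession {
        private final Map<Long, String> firstLevel = new HashMap<>();

        String find(long id) {
            String entity = firstLevel.get(id);      // 1. first-level cache
            if (entity != null) {
                return entity;
            }
            entity = secondLevel.get(id);            // 2. second-level cache
            if (entity == null) {
                entity = loadFromDatabase(id);       // 3. database
                secondLevel.put(id, entity);         // store state for other sessions
            }
            firstLevel.put(id, entity);              // keep it for this session
            return entity;
        }
    }

    // Simulated database load.
    static String loadFromDatabase(long id) {
        databaseHits++;
        return "Foo#" + id;
    }

    public static void main(String[] args) {
        SketchSession first = new SketchSession();
        first.find(1L);   // served by the simulated database
        first.find(1L);   // served by the first-level cache
        SketchSession second = new SketchSession();
        second.find(1L);  // served by the second-level cache
        System.out.println("database hits: " + databaseHits);
    }
}
```

In this model, a second session finds the state in the shared map and never touches the simulated database, which is the effect the second-level cache is meant to achieve.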

3. Region Factory

Hibernate second-level caching is designed to be unaware of the actual cache provider used. Hibernate only needs to be provided with an implementation of the org.hibernate.cache.spi.RegionFactory interface which encapsulates all details specific to actual cache providers. Basically, it acts as a bridge between Hibernate and cache providers.

In this article we use Ehcache as a cache provider, which is a mature and widely used cache. You can pick any other provider of course, as long as there is an implementation of a RegionFactory for it.

We add the Ehcache region factory implementation to the classpath with the following Maven dependency:

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-ehcache</artifactId>
    <version>5.2.2.Final</version>
</dependency>

Check for the latest version of hibernate-ehcache before adding it. However, make sure that the hibernate-ehcache version matches the Hibernate version used in your project: if you use hibernate-ehcache 5.2.2.Final as in this example, then Hibernate itself should also be 5.2.2.Final.

The hibernate-ehcache artifact has a dependency on the Ehcache implementation itself, which is thus transitively included in the classpath as well.

4. Enabling Second-Level Caching

With the following two properties we tell Hibernate that L2 caching is enabled and we give it the name of the region factory class:

hibernate.cache.use_second_level_cache=true
hibernate.cache.region.factory_class=org.hibernate.cache.ehcache.EhCacheRegionFactory

For example, in persistence.xml it would look like:

<properties>
    ...
    <property name="hibernate.cache.use_second_level_cache" value="true"/>
    <property name="hibernate.cache.region.factory_class"
      value="org.hibernate.cache.ehcache.EhCacheRegionFactory"/>
    ...
</properties>

To disable second-level caching (for debugging purposes for example), just set hibernate.cache.use_second_level_cache property to false.
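As a sketch of toggling the setting from code rather than from persistence.xml, the same two properties could be collected into a map. JPA accepts such a map as overrides in Persistence.createEntityManagerFactory(String, Map); the example below only builds the map so that it runs without JPA on the classpath. The class name CacheSettingsSketch and the persistence unit name "my-unit" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Builds the two settings from this section; class and method names are illustrative.
class CacheSettingsSketch {

    static Map<String, Object> secondLevelCacheSettings(boolean enabled) {
        Map<String, Object> settings = new HashMap<>();
        settings.put("hibernate.cache.use_second_level_cache", String.valueOf(enabled));
        settings.put("hibernate.cache.region.factory_class",
            "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        return settings;
    }

    public static void main(String[] args) {
        // With JPA on the classpath, the map could be passed as overrides, e.g.:
        // Persistence.createEntityManagerFactory("my-unit", secondLevelCacheSettings(false));
        // ("my-unit" is a hypothetical persistence unit name.)
        System.out.println(secondLevelCacheSettings(false));
    }
}
```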

5. Making an Entity Cacheable

In order to make an entity eligible for second-level caching, we annotate it with Hibernate specific @org.hibernate.annotations.Cache annotation and specify a cache concurrency strategy.

Some developers consider that it is a good convention to add the standard @javax.persistence.Cacheable annotation as well (although not required by Hibernate), so an entity class implementation might look like this:

@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "ID")
    private long id;

    @Column(name = "NAME")
    private String name;

    // getters and setters
}

For each entity class, Hibernate will use a separate cache region to store state of instances for that class. The region name is the fully qualified class name.

For example, Foo instances are stored in a cache named com.baeldung.hibernate.cache.model.Foo in Ehcache.

To verify that caching is working, we may write a quick test like this:

Foo foo = new Foo();
fooService.create(foo);
fooService.findOne(foo.getId());
int size = CacheManager.ALL_CACHE_MANAGERS.get(0)
  .getCache("com.baeldung.hibernate.cache.model.Foo").getSize();
assertThat(size, greaterThan(0));

Here we use Ehcache API directly to verify that com.baeldung.hibernate.cache.model.Foo cache is not empty after we load a Foo instance.

You could also enable logging of SQL generated by Hibernate and invoke fooService.findOne(foo.getId()) multiple times in the test to verify that the select statement for loading Foo is printed only once (the first time), meaning that in subsequent calls the entity instance is fetched from the cache.

6. Cache Concurrency Strategy

Based on use cases, we are free to pick one of the following cache concurrency strategies:

READ_ONLY: Used only for entities that never change (an exception is thrown if an attempt to update such an entity is made). It is very simple and performant, and well suited to static reference data
NONSTRICT_READ_WRITE: Cache is updated after a transaction that changed the affected data has been committed. Thus, strong consistency is not guaranteed and there is a small time window in which stale data may be obtained from cache. This kind of strategy is suitable for use cases that can tolerate eventual consistency
READ_WRITE: This strategy guarantees strong consistency which it achieves by using ‘soft’ locks: When a cached entity is updated, a soft lock is stored in the cache for that entity as well, which is released after the transaction is committed. All concurrent transactions that access soft-locked entries will fetch the corresponding data directly from database
TRANSACTIONAL: Cache changes are done in distributed XA transactions. A change in a cached entity is either committed or rolled back in both database and cache in the same XA transaction
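To make the READ_WRITE description more concrete, here is a toy, single-threaded model of soft locking. It is not Hibernate's implementation: the class SoftLockSketch, the lock marker and the in-memory "database" map are assumptions for illustration, and database transaction isolation is deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of soft locking; not Hibernate's implementation.
class SoftLockSketch {

    // Marker stored in the cache while an update is in flight.
    static final Object SOFT_LOCK = new Object();

    final Map<Long, Object> cache = new ConcurrentHashMap<>();
    final Map<Long, String> database = new ConcurrentHashMap<>();
    int databaseReads = 0;

    // Reader: use the cache unless the entry is missing or soft-locked.
    String read(long id) {
        Object cached = cache.get(id);
        if (cached != null && cached != SOFT_LOCK) {
            return (String) cached;
        }
        databaseReads++;
        String value = database.get(id);
        // Populate the cache only when no soft lock is present on the entry.
        if (cached != SOFT_LOCK && value != null) {
            cache.putIfAbsent(id, value);
        }
        return value;
    }

    // Writer, step 1: soft-lock the entry and change the row.
    void beginUpdate(long id, String newValue) {
        cache.put(id, SOFT_LOCK);
        database.put(id, newValue);
    }

    // Writer, step 2: after commit, replace the lock with the committed state.
    void commit(long id) {
        cache.put(id, database.get(id));
    }
}
```

While the lock marker is in place, every read in this model goes to the simulated database, so a reader never sees a cached value that the writer is about to replace.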

7. Cache Management

If expiration and eviction policies are not defined, the cache could grow indefinitely and eventually consume all available memory. In most cases, Hibernate leaves cache management duties like these to cache providers, as they are indeed specific to each cache implementation.

For example, we could define the following Ehcache configuration to limit the maximum number of cached Foo instances to 1000:

<ehcache>
    <cache name="com.baeldung.hibernate.cache.model.Foo" maxElementsInMemory="1000" />
</ehcache>
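What a size limit does can be sketched with a small bounded map. This is only a model of the idea, assuming least-recently-used eviction (one of the policies Ehcache supports); the class BoundedRegionSketch is hypothetical and has nothing to do with Ehcache's internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A map that evicts the least recently used entry once a limit is exceeded.
class BoundedRegionSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxElements;

    BoundedRegionSketch(int maxElements) {
        super(16, 0.75f, true); // true = iterate in access order (LRU)
        this.maxElements = maxElements;
    }

    // Called by LinkedHashMap after each insertion.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxElements;
    }
}
```

With a limit of 1000, inserting the 1001st entry would drop whichever entry was read or written least recently.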

8. Collection Cache

Collections are not cached by default, and we need to explicitly mark them as cacheable. For example:

@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Foo {

    // ...

    // @Cacheable applies to classes only, so the collection carries just the Hibernate annotation
    @org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    @OneToMany
    private Collection<Bar> bars;

    // getters and setters
}

9. Internal Representation of Cached State

Entities are not stored in second-level cache as Java instances, but rather in their disassembled (hydrated) state:

Id (primary key) is not stored (it is stored as part of the cache key)
Transient properties are not stored
Collections are not stored (see below for more details)
Non-association property values are stored in their original form
Only id (foreign key) is stored for ToOne associations

This reflects the general design of the Hibernate second-level cache, in which the cache model mirrors the underlying relational model. That design is space-efficient and makes it easy to keep the two synchronized.
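The rules above can be illustrated with a small model in which an entity is reduced to an array of values keyed by its id. The classes below (DisassembledStateSketch and its nested Foo and Owner) are hypothetical, and the array layout is an assumption rather than Hibernate's actual cache entry format.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a disassembled cache entry; the layout is illustrative only.
class DisassembledStateSketch {

    // Hypothetical entities.
    static class Owner {
        long id;
        String name;
    }

    static class Foo {
        long id;
        String name;
        Owner owner;     // ToOne association
        String scratch;  // treated as a transient property in this model
    }

    // Entity region: the id is the key, not part of the stored state.
    static final Map<Long, Object[]> fooRegion = new HashMap<>();

    // Store plain values as-is and only the foreign key for the association.
    static void disassemble(Foo foo) {
        Long ownerId = foo.owner == null ? null : Long.valueOf(foo.owner.id);
        fooRegion.put(foo.id, new Object[] { foo.name, ownerId });
    }

    // Build a fresh instance from the cached state; owners are resolved by id.
    static Foo assemble(long id, Map<Long, Owner> ownersById) {
        Object[] state = fooRegion.get(id);
        if (state == null) {
            return null;
        }
        Foo foo = new Foo();
        foo.id = id;
        foo.name = (String) state[0];
        foo.owner = state[1] == null ? null : ownersById.get((Long) state[1]);
        return foo;
    }
}
```

Each read produces a new instance from the stored values, which is why sessions never share the same Java object through the second-level cache.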

9.1. Internal Representation of Cached Collections

We already mentioned that we have to explicitly indicate that a collection (OneToMany or ManyToMany association) is cacheable, otherwise it is not cached.

Actually, Hibernate stores collections in separate cache regions, one for each collection. The region name is a fully qualified class name plus the name of collection property, for example: com.baeldung.hibernate.cache.model.Foo.bars. This gives us the flexibility to define separate cache parameters for collections, e.g. eviction/expiration policy.

Also, it is important to mention that only ids of entities contained in a collection are cached for each collection entry, which means that in most cases it is a good idea to make the contained entities cacheable as well.
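A short model shows why caching the contained entities matters. The collection region below holds only element ids, so every element missing from the entity region costs a simulated database load. CollectionRegionSketch and its members are illustrative names, not Hibernate API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model: a collection region stores ids, the entity region stores state.
class CollectionRegionSketch {

    // Region name = owner class name + "." + collection property name.
    static String regionName(Class<?> owner, String property) {
        return owner.getName() + "." + property;
    }

    // Collection region: owner id -> ids of the contained entities.
    static final Map<Long, List<Long>> barsRegion = new HashMap<>();

    // Entity region for the contained entities: id -> cached state.
    static final Map<Long, String> barRegion = new HashMap<>();

    static int databaseHits = 0;

    // Resolve each cached id; uncached elements fall through to the "database".
    static List<String> loadBars(long fooId) {
        List<String> result = new ArrayList<>();
        List<Long> ids = barsRegion.getOrDefault(fooId, Collections.<Long>emptyList());
        for (Long barId : ids) {
            String bar = barRegion.get(barId);
            if (bar == null) {
                databaseHits++;
                bar = "Bar#" + barId;
            }
            result.add(bar);
        }
        return result;
    }
}
```

If none of the contained entities are cached, a cached collection of N ids leads to N individual loads in this model, which can be worse than not caching the collection at all.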

10. Cache Invalidation for HQL DML-Style Queries and Native Queries

When it comes to DML-style HQL (insert, update and delete HQL statements), Hibernate is able to determine which entities are affected by such operations:

entityManager.createQuery("update Foo set … where …").executeUpdate();

In this case all Foo instances are evicted from L2 cache, while other cached content remains unchanged.

However, when it comes to native SQL DML statements, Hibernate cannot guess what is being updated, so it invalidates the entire second-level cache:

session.createNativeQuery("update FOO set … where …").executeUpdate();

This is probably not what you want! The solution is to tell Hibernate which entities are affected by native DML statements, so that it can evict only entries related to Foo entities:

Query nativeQuery = entityManager.createNativeQuery("update FOO set ... where ...");
nativeQuery.unwrap(org.hibernate.SQLQuery.class).addSynchronizedEntityClass(Foo.class);
nativeQuery.executeUpdate();

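We have to fall back to the Hibernate native SQLQuery API, as this feature is not (yet) defined in JPA. Note that SQLQuery is deprecated as of Hibernate 5.2 in favor of org.hibernate.query.NativeQuery, which offers the same method.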

Note that the above applies only to DML statements (insert, update, delete and native function/procedure calls). Native select queries do not invalidate cache.
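The difference between the two invalidation scopes can be modelled with a map of regions. This is a sketch under the assumption that each entity has one region; InvalidationSketch and its methods are hypothetical names and do not mirror Hibernate's internal classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of region-level invalidation after a DML statement.
class InvalidationSketch {

    // Region name -> (entity id -> cached state).
    final Map<String, Map<Long, Object>> regions = new HashMap<>();

    void put(String region, long id, Object state) {
        regions.computeIfAbsent(region, r -> new HashMap<>()).put(id, state);
    }

    int size(String region) {
        Map<Long, Object> entries = regions.get(region);
        return entries == null ? 0 : entries.size();
    }

    // Known affected regions: clear only those (HQL DML, or native DML with
    // synchronized entity classes). No regions given: clear everything.
    void afterDml(String... affectedRegions) {
        if (affectedRegions.length == 0) {
            regions.values().forEach(Map::clear);
            return;
        }
        for (String region : affectedRegions) {
            Map<Long, Object> entries = regions.get(region);
            if (entries != null) {
                entries.clear();
            }
        }
    }
}
```

Calling afterDml("Foo") leaves other regions untouched, while afterDml() with no arguments empties every region, which mirrors the cost of an unqualified native update.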

11. Query Cache

Results of HQL queries can also be cached. This is useful if you frequently execute a query on entities that rarely change.

To enable query cache, set the value of hibernate.cache.use_query_cache property to true:

hibernate.cache.use_query_cache=true

Then, for each query you have to explicitly indicate that the query is cacheable (via an org.hibernate.cacheable query hint):

entityManager.createQuery("select f from Foo f")
  .setHint("org.hibernate.cacheable", true)
  .getResultList();

11.1. Query Cache Best Practices

Here are some guidelines and best practices related to query caching:

As is the case with collections, only ids of entities returned as a result of a cacheable query are cached, so it is strongly recommended that second-level cache is enabled for such entities.
There is one cache entry per each combination of query parameter values (bind variables) for each query, so queries for which you expect lots of different combinations of parameter values are not good candidates for caching.
Queries that involve entity classes for which there are frequent changes in the database are not good candidates for caching either, because they will be invalidated whenever there is a change related to any of the entity classes participating in the query, regardless of whether the changed instances are cached as part of the query result or not.
By default, all query cache results are stored in org.hibernate.cache.internal.StandardQueryCache region. As with entity/collection caching, you can customize cache parameters for this region to define eviction and expiration policies according to your needs. For each query you can also specify a custom region name in order to provide different settings for different queries.
For all tables that are queried as part of cacheable queries, Hibernate keeps last update timestamps in a separate region named org.hibernate.cache.spi.UpdateTimestampsCache. Being aware of this region is very important if you use query caching, because Hibernate uses it to verify that cached query results are not stale. The entries in this cache must not be evicted/expired as long as there are cached query results for the corresponding tables in query results regions. It is best to turn off automatic eviction and expiration for this cache region, as it does not consume lots of memory anyway.
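The interplay between cached query results and update timestamps can be sketched as follows. This is a simplified model, not Hibernate's code: it assumes one table per query, uses a logical clock instead of real timestamps, and the class QueryCacheSketch with all its members is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a query cache validated against per-table update timestamps.
class QueryCacheSketch {

    static class CachedResult {
        final long cachedAt;
        final String table;
        final List<Long> ids; // only entity ids are cached

        CachedResult(long cachedAt, String table, List<Long> ids) {
            this.cachedAt = cachedAt;
            this.table = table;
            this.ids = ids;
        }
    }

    // Query results region: one entry per query text + parameter values.
    final Map<List<Object>, CachedResult> queryRegion = new HashMap<>();

    // Update timestamps region: table name -> time of last change.
    final Map<String, Long> updateTimestamps = new HashMap<>();

    private long clock = 0; // logical clock

    static List<Object> key(String query, Object... params) {
        List<Object> key = new ArrayList<>();
        key.add(query);
        key.addAll(Arrays.asList(params));
        return key;
    }

    void cache(String table, List<Long> ids, String query, Object... params) {
        queryRegion.put(key(query, params), new CachedResult(++clock, table, ids));
    }

    void tableUpdated(String table) {
        updateTimestamps.put(table, ++clock);
    }

    // Returns the cached ids, or null on a miss or when the result is stale.
    List<Long> lookup(String query, Object... params) {
        CachedResult result = queryRegion.get(key(query, params));
        if (result == null) {
            return null;
        }
        Long lastUpdate = updateTimestamps.get(result.table);
        // A missing timestamp is read as "no update", so evicting it hides staleness.
        if (lastUpdate != null && lastUpdate > result.cachedAt) {
            return null;
        }
        return result.ids;
    }
}
```

In this model, removing a table's entry from updateTimestamps makes an outdated result look fresh again, which illustrates why the timestamps region should not be subject to eviction or expiration.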

12. Conclusion

In this article we looked at how to set up Hibernate second-level cache. We saw that it is fairly easy to configure and use, as Hibernate does all the heavy lifting behind the scenes making second-level cache utilization transparent to the application business logic.

The implementation of this Hibernate Second-Level Cache Tutorial is available on GitHub. This is a Maven-based project, so it should be easy to import and run as it is.
