Batch / bulk insert/update in JPA/Hibernate with flush and clear


Use the EntityManager to flush and clear:

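int batchSize = 1000;
for (int i = 0; i < taloes.size(); i++) {
    TalaoAIT talaoAIT = taloes.get(i);
    entityManager.persist(talaoAIT);  // assumption: entityManager is the injected EntityManager
    if (i > 0 && i % batchSize == 0) {
        entityManager.flush();  // write the pending batch to the database
        entityManager.clear();  // detach the flushed objects to free memory
    }
}
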
For Hibernate

Basically, just replace the EntityManager with the Hibernate Session.

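    public void saveNiidsMessages(List<SrcNiidsXmlEntity> entities) {
        Session session = getSession();
        int count = 0;
        for (SrcNiidsXmlEntity entity : entities) {
            session.persist(entity);  // assumption: new entities are saved with persist()
            if (++count % BATCH_SIZE == 0) {  // BATCH_SIZE: assumed constant, e.g. 50
                // flush a batch of inserts and release memory:
                session.flush();
                session.clear();
            }
        }
    }

    protected Session getSession() {
        return sessionFactory.getCurrentSession();
    }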

When making new objects persistent, flush() and then clear() the session regularly in order to control the size of the first-level cache.
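
To see why regular flush() and clear() calls keep the first-level cache small, here is a minimal sketch that runs without a database. UnitOfWork, CountingContext and BatchWriter are hypothetical names made up for this illustration; UnitOfWork only stands in for the persist/flush/clear calls of an EntityManager or Session and is not part of JPA or Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three EntityManager/Session calls used in batching.
interface UnitOfWork<T> {
    void persist(T entity);
    void flush();
    void clear();
}

// Toy in-memory context: it counts objects instead of talking to a database.
class CountingContext<T> implements UnitOfWork<T> {
    private final List<T> managed = new ArrayList<>();  // toy first-level cache
    private final List<T> pending = new ArrayList<>();  // persisted but not yet flushed
    int written = 0;      // how many objects were "written" by flush()
    int peakManaged = 0;  // largest size the toy cache ever reached

    public void persist(T entity) {
        managed.add(entity);
        pending.add(entity);
        peakManaged = Math.max(peakManaged, managed.size());
    }

    public void flush() {
        written += pending.size();
        pending.clear();
    }

    public void clear() {
        managed.clear();
        pending.clear();
    }
}

class BatchWriter {
    // Persists all items, flushing and clearing after every batchSize items.
    static <T> void saveAll(UnitOfWork<T> ctx, List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int count = 0;
        for (T item : items) {
            ctx.persist(item);
            if (++count % batchSize == 0) {
                ctx.flush();  // write the current batch
                ctx.clear();  // drop the written objects from the cache
            }
        }
        ctx.flush();  // write the final, possibly partial batch
        ctx.clear();
    }
}
```

With this loop the toy cache never holds more than batchSize objects at once, no matter how long the input list is. The final flush() after the loop matters: without it, a last partial batch would be left unwritten.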

Hibernate suggests a batch size of 20-50. However, I found that 1500 works well in some of my scenarios.
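
The flush interval in the loop is usually matched to Hibernate's JDBC batch size setting. As a small sketch, the settings could be collected like this. The property keys hibernate.jdbc.batch_size, hibernate.order_inserts and hibernate.order_updates are real Hibernate settings; the BatchSettings class and its method name are hypothetical and exist only for this example.

```java
import java.util.Properties;

// Hypothetical helper that collects the batching-related Hibernate settings.
class BatchSettings {
    static Properties forBatchSize(int batchSize) {
        Properties props = new Properties();
        // Number of statements Hibernate sends to the database in one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        // Group statements by entity so that more of them fit into one batch.
        props.setProperty("hibernate.order_inserts", "true");
        props.setProperty("hibernate.order_updates", "true");
        return props;
    }
}
```

In a JPA setup, such properties can be passed as the second argument of Persistence.createEntityManagerFactory(String, Map), or the same keys can be put into persistence.xml. Which value works best depends on the scenario, so it is worth measuring a few sizes.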

More on flush and clear

Hibernate manages the persistent objects within a transaction in the so-called session. In JPA, the EntityManager takes over this task. In the following, the term “EntityManager” is also used as a synonym for the Hibernate session, as both have a similar interface. As long as an object is attached to an EntityManager, all changes to the object are synchronized with the database automatically. This is called flushing. The point in time at which objects are synchronized with the database is not guaranteed: the later a flush occurs, the more room the EntityManager has for optimization, because, for example, several updates to an object can be bundled to avoid SQL statements.

If you call clear(), all objects currently managed by the EntityManager are detached, and their state is no longer synchronized with the database. As long as the objects are not explicitly attached again, they are ordinary Java objects whose changes have no effect on the database. In many applications that use Hibernate or JPA, flush() and clear() are frequently called explicitly, which often has fatal effects on the performance and maintainability of the application. A manual call of flush() should be made unnecessary by a clean application design; it is similar to a manual call of System.gc(), which requests a garbage collection. In both cases, the normal, optimized operation of the technology is disrupted. For Hibernate and JPA this generally means that more updates are issued than would be necessary if the EntityManager had chosen the point in time itself.

A call of clear(), in many cases preceded by a manual flush(), detaches all objects from the EntityManager. For this reason, you should define clear architecture and design guidelines about where clear() may be called. A typical usage scenario for clear() is batch processing, where working with unnecessarily large sessions should be avoided. Apart from that, the call should be noted explicitly in the Javadoc of the method; otherwise the application could show unpredictable behaviour, since calling the method can wipe the complete EntityManager context. Afterwards, the objects must be re-attached to the context of the EntityManager. Normally, the state of the objects has to be re-read from the database for this. Depending on the fetching strategies, there are cases in which the state of the objects must be read manually so that all associations are attached again. In the worst case, even modified object data will not be saved permanently.
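
The worst case mentioned above, modified data that never reaches the database, can be illustrated with a toy model. This is only a sketch of the attach/flush/clear idea and not how Hibernate is implemented: Account and ToyContext are hypothetical classes, and a plain map stands in for the database table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity used only for this illustration.
class Account {
    final long id;
    String owner;

    Account(long id, String owner) {
        this.id = id;
        this.owner = owner;
    }
}

// Toy persistence context: attached objects are synchronized on flush().
class ToyContext {
    final Map<Long, String> database = new HashMap<>();          // stands in for a table
    private final Map<Long, Account> managed = new HashMap<>();  // attached objects

    // Attach an object so that its state is written on the next flush().
    void attach(Account account) {
        managed.put(account.id, account);
    }

    // Synchronize the state of every attached object with the "database".
    void flush() {
        for (Account account : managed.values()) {
            database.put(account.id, account.owner);
        }
    }

    // Detach everything; changes that were not flushed before are never written.
    void clear() {
        managed.clear();
    }

    boolean isAttached(Account account) {
        return managed.get(account.id) == account;
    }
}
```

In this model, a change made after the last flush() and before clear() is lost, and later changes to the detached object have no effect until it is attached again. With a real EntityManager, re-attaching is typically done with merge() or by reading the object again.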

Reference 1


