How to do bulk (multi row) inserts with JpaRepository?

To get a bulk insert with Spring Boot and Spring Data JPA you need only two things:

  1. set the option spring.jpa.properties.hibernate.jdbc.batch_size to appropriate value you need (for example: 20).

  2. use saveAll() method of your repo with the list of entities prepared for inserting.

Working example is here.

Regarding the transformation of the insert statement into something like this:

INSERT INTO table VALUES (1, 2), (3, 4), (5, 6)

the such is available in PostgreSQL: you can set the option reWriteBatchedInserts to true in jdbc connection string:

jdbc:postgresql://localhost:5432/db?reWriteBatchedInserts=true

then jdbc driver will do this transformation.

Additional info about batching you can find here.

UPDATED

Demo project in Kotlin: sb-kotlin-batch-insert-demo

UPDATED

Hibernate disables insert batching at the JDBC level transparently if you use an IDENTITY identifier generator.


You can configure Hibernate to do bulk DML. Have a look at Spring Data JPA - concurrent Bulk inserts/updates. I think section 2 of the answer could solve your problem:

Enable the batching of DML statements Enabling the batching support would result in less number of round trips to the database to insert/update the same number of records.

Quoting from batch INSERT and UPDATE statements:

hibernate.jdbc.batch_size = 50

hibernate.order_inserts = true

hibernate.order_updates = true

hibernate.jdbc.batch_versioned_data = true

UPDATE: You have to set the hibernate properties differently in your application.properties file. They are under the namespace: spring.jpa.properties.*. An example could look like the following:

spring.jpa.properties.hibernate.jdbc.batch_size = 50
spring.jpa.properties.hibernate.order_inserts = true
....

The underlying issues is the following code in SimpleJpaRepository:

@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}

In addition to the batch size property settings, you have to make sure that the class SimpleJpaRepository calls persist and not merge. There are a few approaches to resolve this: use an @Id generator that does not query sequence, like

@Id
@GeneratedValue(generator = "uuid2")
@GenericGenerator(name = "uuid2", strategy = "uuid2")
var id: Long

Or forcing the persistence to treat the records as new by having your entity implement Persistable and overriding the isNew() call

@Entity
class Thing implements Pesistable<Long> {
    var value: Int,
    @Id
    @GeneratedValue
    var id: Long = -1
    @Transient
    private boolean isNew = true;
    @PostPersist
    @PostLoad
    void markNotNew() {
        this.isNew = false;
    }
    @Override
    boolean isNew() {
        return isNew;
    }
}

Or override the save(List) and use the entity manager to call persist()

@Repository
public class ThingRepository extends SimpleJpaRepository<Thing, Long> {
    private EntityManager entityManager;
    public ThingRepository(EntityManager entityManager) {
        super(Thing.class, entityManager);
        this.entityManager=entityManager;
    }

    @Transactional
    public List<Thing> save(List<Thing> things) {
        things.forEach(thing -> entityManager.persist(thing));
        return things;
    }
}

The above code is based on the following links:

  • http://www.hameister.org/SpringBootUsingIdsForBulkImports.html
  • http://www.hameister.org/SpringBootBulkImportWithCrudRepository.html
  • https://vladmihalcea.com/the-best-way-to-do-batch-processing-with-jpa-and-hibernate/