Using lazy-loading in JPA to speed up your application

Hibernate makes working with the database a lot easier. You just define your entities and the library takes care of retrieving data and persisting your objects. This is true for all modern JPA implementations, such as Hibernate or EclipseLink, and for libraries built on top of them, such as Spring Data JPA. The more complex your objects are, the greater the benefit of using such libraries. However, there is a big downside to this approach, one that can make your application feel slow or sluggish.

An example of slow loading times

Let’s assume that you have a really complex object structure that you need to persist. For academic purposes, we will model the structure of a university. We have different classes, each with its own students and courses. Each student can also take optional courses from other classes. There are also teachers, assignments, and many more. A simplified version could look something like the classes below. We will assume that they are annotated with @Entity and have all the extra boilerplate code needed.

public class UniversityClass {
    private String name;
    private List<Student> students;
    private List<Course> courses;
}

public class Student {
    private String name;
    private List<Assignment> assignments;
    private List<Course> optional;
}

public class Course {
    private String name;
    private Teacher teacher;
}

public class Teacher {
    private String name;
    private List<Course> teachesCourses;
}

public class Assignment {
    private Course course;
    private Student student;
    private Double grade;
}

The structure can be even more complex, with many circular relations between elements or many layers. All this needs to be modeled in the database and thanks to the @Entity annotation, as well as the relationship ones (@OneToMany, @ManyToMany, @ManyToOne, @OneToOne), things are not that complicated. You can easily start work, model everything and have basic operations up and running in no time.

After that, you deploy your application and start entering data. You easily reach 4000 students spread over multiple classes, with many teachers. You open a beautifully designed list where each class should be shown in alphabetical order and, to your surprise, it takes 30+ seconds to display, even though you only need the names. What is happening?

The problem: all data is loaded

Upon investigation, you see that when you get the list of classes you execute many DB queries that retrieve a lot of information. You need the classes (their names actually) but you end up retrieving all the students in the database, all their assignments, the teachers, and essentially all the stored information. Why is this happening?

Hibernate looks at the structure of your UniversityClass objects and sees that, in order to have them fully built, it needs the students and the courses as well. So it retrieves them, along with all the data needed to have those fully built. Essentially, you are executing queries against the database to get all the information needed for your full object structure.

This happens whenever the relations are fetched eagerly: Hibernate does not know which objects you truly need, so it retrieves everything. I have had this happen in my own projects, where simple operations on a small set of objects took seconds just because a complex structure was being retrieved, even when it was not needed.
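
To make the cascade more concrete, here is a minimal plain-Java sketch of it. It does not use JPA or a real database: EagerFetchSketch and its queryCount field are hypothetical names invented for this illustration, and each increment stands in for one round trip to the database.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java analogy of eager fetching; no JPA or database involved.
final class EagerFetchSketch {
    // Counts simulated round trips to the database.
    int queryCount = 0;

    static final class Student {
        final String name;
        final List<String> assignments;

        Student(String name, List<String> assignments) {
            this.name = name;
            this.assignments = assignments;
        }
    }

    static final class UniversityClass {
        final String name;
        final List<Student> students;

        UniversityClass(String name, List<Student> students) {
            this.name = name;
            this.students = students;
        }
    }

    // One simulated query per student to fetch that student's assignments.
    List<String> loadAssignments(String studentName, int count) {
        queryCount++;
        List<String> assignments = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            assignments.add(studentName + "-assignment-" + i);
        }
        return assignments;
    }

    // One simulated query per class for its students, plus the nested assignment queries.
    List<Student> loadStudents(String className, int count, int assignmentsPerStudent) {
        queryCount++;
        List<Student> students = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String studentName = className + "-student-" + i;
            students.add(new Student(studentName, loadAssignments(studentName, assignmentsPerStudent)));
        }
        return students;
    }

    // One simulated query for the class list, then everything reachable from each class.
    List<UniversityClass> loadClasses(int classCount, int studentsPerClass, int assignmentsPerStudent) {
        queryCount++;
        List<UniversityClass> classes = new ArrayList<>();
        for (int i = 0; i < classCount; i++) {
            String className = "class-" + i;
            classes.add(new UniversityClass(className, loadStudents(className, studentsPerClass, assignmentsPerStudent)));
        }
        return classes;
    }

    public static void main(String[] args) {
        EagerFetchSketch db = new EagerFetchSketch();
        List<UniversityClass> classes = db.loadClasses(3, 4, 2);
        for (UniversityClass universityClass : classes) {
            System.out.println(universityClass.name); // only the names are actually used
        }
        System.out.println("Simulated queries: " + db.queryCount);
    }
}
```

With 3 classes of 4 students each, the sketch runs 1 + 3 + 12 = 16 simulated queries just to print three names. A real JPA provider may combine some of these into joins, so the exact number will differ, but the growth pattern is the same.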

The fix: lazy-loading

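JPA offers the possibility to retrieve certain objects only when they are needed. This is called lazy loading (or lazy fetching) of data. In our scenario, we only need the names, so we can lazy load the Students and the Courses. This tells Hibernate not to fetch those entities from the database until they are actually accessed.

To do this, we set the fetch attribute on the relationship annotations. Keep in mind that JPA already defaults @OneToMany and @ManyToMany to lazy fetching, while @ManyToOne and @OneToOne default to eager, so look both for relations explicitly marked FetchType.EAGER and for to-one relations left at their default. Declaring the fetch type explicitly, our UniversityClass now looks like this: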

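// Imports, the @Id field, getters and setters are omitted for brevity.
@Entity
public class UniversityClass {
    private String name;

    // Loaded only when the collection is first accessed.
    @OneToMany(fetch = FetchType.LAZY)
    private List<Student> students;

    // Loaded only when the collection is first accessed.
    @OneToMany(fetch = FetchType.LAZY)
    private List<Course> courses;
}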

Now, the students and courses will only be retrieved from the database when they are first accessed, saving precious processing time and DB queries whenever they are not needed. So, in our scenario where only the names are needed, a single simple query is enough to build the list.
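
The idea behind lazy loading can be sketched in plain Java with a small wrapper that defers a load until the value is first requested. This is only an analogy, not how Hibernate is implemented: Lazy, LazyFetchSketch and the student loader are hypothetical names made up for the example, and Hibernate itself triggers the load on the first access to the collection's contents rather than in the getter.

```java
import java.util.List;
import java.util.function.Supplier;

// Plain-Java analogy of lazy fetching; no JPA or database involved.
final class LazyFetchSketch {
    // Defers a load until get() is first called, then caches the result.
    static final class Lazy<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded = false;

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        T get() {
            if (!loaded) {
                value = loader.get(); // the simulated query runs here, once
                loaded = true;
            }
            return value;
        }
    }

    static final class UniversityClass {
        private final String name;
        private final Lazy<List<String>> students;

        UniversityClass(String name, Supplier<List<String>> studentLoader) {
            this.name = name;
            this.students = new Lazy<>(studentLoader);
        }

        // Reading the name never touches the student loader.
        String getName() {
            return name;
        }

        // The first call triggers the load; later calls reuse the cached list.
        List<String> getStudents() {
            return students.get();
        }
    }

    public static void main(String[] args) {
        int[] queries = {0}; // counts simulated student queries
        UniversityClass physics = new UniversityClass("Physics", () -> {
            queries[0]++;
            return List.of("Ana", "Bob");
        });

        System.out.println(physics.getName() + ", student queries so far: " + queries[0]);
        System.out.println(physics.getStudents() + ", student queries so far: " + queries[0]);
    }
}
```

As long as only getName() is called, the loader never runs, which mirrors the list-of-names scenario above.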

One common mistake

One mistake that is often made relates to the mapping from the entity to a transfer object. Many times you have a UniversityClass (the entity) and a UniversityClassDto that is used for the transfer to the outside (either via JSON or some other representation). Obviously, you need to map between the two. This is done either with simple mappers that call the getters and setters of each, with BeanUtils.copyProperties(), or with a more advanced mapping library like MapStruct or Dozer.

The problem is that when those mappers are called, they try to map everything, which triggers the same big fetches, only a bit later. The mappers don’t know that the objects are lazy-loaded and not truly needed. To fix this, make sure you use intelligent mappers and DTOs that contain only the data needed for your specific use case. Otherwise, you do not fix the problem; you just pass it on to a different part of the application.
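
A minimal sketch of the difference, again in plain Java without JPA or a mapping library. UniversityClassNameDto, toNameDto, toFullDto and the studentLoads counter are hypothetical names introduced only for this illustration; the entity simulates a lazy collection by counting how often its students are loaded.

```java
import java.util.List;
import java.util.function.Supplier;

// Plain-Java analogy of DTO mapping over a lazy relation; no JPA involved.
final class DtoMappingSketch {
    // Stand-in for the entity: students are loaded on first access and counted.
    static final class UniversityClass {
        private final String name;
        private final Supplier<List<String>> studentLoader;
        private List<String> students;
        private int studentLoads = 0;

        UniversityClass(String name, Supplier<List<String>> studentLoader) {
            this.name = name;
            this.studentLoader = studentLoader;
        }

        String getName() {
            return name;
        }

        List<String> getStudents() {
            if (students == null) {
                studentLoads++; // simulated query for the lazy collection
                students = studentLoader.get();
            }
            return students;
        }

        int getStudentLoads() {
            return studentLoads;
        }
    }

    // Use-case-specific DTO for the class list: it carries only the name.
    static final class UniversityClassNameDto {
        final String name;

        UniversityClassNameDto(String name) {
            this.name = name;
        }
    }

    // Full DTO: mapping it has to read the students as well.
    static final class UniversityClassDto {
        final String name;
        final List<String> students;

        UniversityClassDto(String name, List<String> students) {
            this.name = name;
            this.students = students;
        }
    }

    // Touches only getName(), so the lazy collection stays unloaded.
    static UniversityClassNameDto toNameDto(UniversityClass entity) {
        return new UniversityClassNameDto(entity.getName());
    }

    // Calls getStudents(), which triggers the simulated fetch.
    static UniversityClassDto toFullDto(UniversityClass entity) {
        return new UniversityClassDto(entity.getName(), entity.getStudents());
    }

    public static void main(String[] args) {
        UniversityClass entity = new UniversityClass("Physics", () -> List.of("Ana", "Bob"));

        UniversityClassNameDto listItem = toNameDto(entity);
        System.out.println(listItem.name + ", student loads: " + entity.getStudentLoads());

        UniversityClassDto full = toFullDto(entity);
        System.out.println(full.students + ", student loads: " + entity.getStudentLoads());
    }
}
```

A generic mapper that copies every property behaves like toFullDto here, so for a list view a narrow DTO such as the name-only one keeps the lazy relations untouched.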

