Fixing the N+1 query problem

👋 Introduction

In an earlier blog post, I touched upon the cardinal sin of performance issues - N+1 queries. In this blog post I am going to explain through a few examples what the N+1 query problem is, and also suggest how to fix each in your application code.

The N+1 query problem is sometimes hard to diagnose from reading your code - which is another reason why you should check the generated sql from your application. With the N+1 problem, you will often find that the issue is not the performance of the query generated, but instead the sum of number of executions and that query time. Therefore, the larger the value of N, the more queries that will be executed, and the larger the performance impact. Unless you test your application at scale, you may find that you do not flush out some of these issues until much later in the development cycle..

🧐 Examples

Consider the following car_garage and car tables which from a one-to-many relationship;

Schema

FetchType.EAGER

Using FetchType.EAGER is generally best avoided because you are going to fetch more data than is required, but also because this fetch type is prone to N+1 issues. The good news, however, this is not always used by default but if your entity relationship is using @ManyToOne or @OneToOne then unfortunately this is used the default value.

So if your mappings look like this;

@ManyToOne
@JoinColumn(name = "garage_id", nullable = false)
private CarGarage garage;

Then you are using the FetchType.EAGER strategy, and every time you fetch the car entity;

@Query("SELECT c FROM Car c")
List<Car> findAllCars();

Then you are going to trigger the N+1 issue;

select car0_.id as id1_0_, 
       car0_.garage_id as garage_i5_0_, 
       car0_.make as make2_0_, 
       car0_.model as model3_0_, 
       car0_.registration as registra4_0_ 
from car car0_

select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?

Notice how the additional SELECT statements are executed too because the car_garage association has to be fetched after loading the list of car entities.

FetchType.LAZY

Almost contradictory to the above example, switching to FetchType.LAZY can also cause the N+1 query problem, for example if switching the CarGarage association to;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "garage_id", nullable = false)
private CarGarage garage;

Fetching a list of car entities will now only execute one sql statement, but when referencing the lazy-loaded association;

List<Car> cars = (List<Car>) carRepository.findAllCars();
Set<String> garageNames = cars.stream()
        .map(Car::getGarage)
        .map(CarGarage::getName)
        .collect(Collectors.toSet());

You will again trigger the N+1 query problem;

select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?
select cargarage0_.id as id1_1_0_, cargarage0_.name as name2_1_0_ from car_garage cargarage0_ where cargarage0_.id=?

🛠 Fixing the N+1 issue

The first step you should take to fix the N+1 issue should be to ask yourself; “Do I even need to load this data?”

If the answer is no, you should ensure that your @ManyToOne (or @OneToMany) association has its fetchType set to FetchType.LAZY - in fact, it is better to use this by default to avoid the N+1 issue creeping in across all of your entity relationships.

If you do, there are a few ways of solving this, and it depends on a number of things - including inspecting the execution plan used for any of these options and choosing the one that is most optimal for your example, while also weighing up the pros and cons of each solution.

Using JOIN FETCH

By using JOIN FETCH we can force the generated query to use an INNER JOIN to load the car_garage rows from the database, and therefore eradicating the N+1 issue;

@Query("SELECT c FROM Car c JOIN FETCH c.garage")
List<Car> findAllCars();

Will generate the follow query;

select car0_.id as id1_0_0_, 
       cargarage1_.id as id1_1_1_, 
       car0_.garage_id as garage_i5_0_0_, 
       car0_.make as make2_0_0_, 
       car0_.model as model3_0_0_, 
       car0_.registration as registra4_0_0_, 
       cargarage1_.name as name2_1_1_ 
from car car0_ 
inner join car_garage cargarage1_ on car0_.garage_id=cargarage1_.id 
Entity Graphs

Entity graphs are another way of loading the garage association and avoiding the N+1 issue, I will cover these in more detail in a later post.

By specifying a new entity graph for the garage relationship.

@NamedEntityGraph(
    name = "Car.garage",
    attributeNodes = { @NamedAttributeNode(value = "garage") }
)
public class Car {

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "garage_id", nullable = false)
private CarGarage garage;

// ETC

And updating the query to make use of this entity graph.

@EntityGraph(value = "Car.garage", type = EntityGraph.EntityGraphType.LOAD)
List<Car> findAll();

Then the following query is generated;

select car0_.id as id1_0_0_, 
       cargarage1_.id as id1_1_1_, 
       car0_.garage_id as garage_i5_0_0_, 
       car0_.make as make2_0_0_, 
       car0_.model as model3_0_0_, 
       car0_.registration as registra4_0_0_, 
       cargarage1_.name as name2_1_1_ 
from car car0_ 
left outer join car_garage cargarage1_ on car0_.garage_id=cargarage1_.id

⚡️ Conclusion

The N+1 can be subtle, and often undiagnosed for some time - it is best to do your best to avoid it by always defaulting to FetchType.LAZY in your entity relationships, this way you have to adjust your code to fetch the extra information - rather than always having it there when you don’t need it to be!