In order to optimize the performance of web applications that interact with
databases, developers should aim to minimize both database queries and
memory usage. While minimizing database queries is a common goal, reducing
memory usage can also have a significant impact on performance. However,
developers often prioritize minimizing queries over minimizing memory usage,
which can result in unintended consequences.
One of the ways this can happen is when developers use techniques like
eager-loading to limit database queries, but end up loading too much data
into memory. This can lead to increased memory usage, slower performance,
and even crashes.
To address this issue, developers can use strategies like lazy-loading and
pagination to balance the goals of minimizing database queries and memory
usage. By implementing these techniques, developers can improve the
performance of their web applications while avoiding memory issues.
The Problem
Imagine we have a blog application that has a posts table and a user,
class User extends Model
{
public function posts()
{
return $this->hasMany(Post::class);
}
}
class Post extends Model
{
public function user()
{
return $this->belongsTo(User::class);
}
}
So how do we go about creating the users page above? In particular, how do
we get the last post date? The easy answer here might be to do the
following:
$users = User::all();
@foreach ($users as $user)
<tr>
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($lastPost = $user->posts()->latest()->first())
{{ $lastPost->created_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
Using the approach discussed can create an N+1 issue, resulting in multiple
queries for each user displayed. This can impact the application's
performance, but we can address this issue by using eager-loading techniques
and caching strategies to retrieve the data more efficiently.
$users = User::with('posts')->get();
@foreach ($users as $user)
<tr>
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($lastPost = $user->posts->last())
{{ $lastPost->created_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
The proposed solution involves only two database queries, one for the users
and one for the corresponding post records. However, this approach can lead
to memory issues despite avoiding the N+1 problem.
How about denormalizing and caching?
In the previous example, we used eager-loading to retrieve the posts for each user. However, this approach can lead to memory issues if the number of posts for each user is large. To address this issue, we can use a denormalization strategy to store the last post date in the users table. This will allow us to retrieve the last post date without loading the associated posts into memory.
Schema::table('users', function (Blueprint $table) {
$table->timestamp('last_post_at')->nullable();
});
$users = User::all();
@foreach ($users as $user)
<tr>
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($user->last_post_at)
{{ $user->last_post_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
Although denormalization can be an effective solution, it is important to
recognize that caching can be more complex than simply denormalizing data.
While there may be situations where denormalization is appropriate, it
should not be the default solution to address limitations in the ORM. It is
essential to explore other strategies and approaches to achieve better
caching results.
My preferred solution!
An alternative solution to address the issue discussed in the text is to use subqueries. With subqueries, we can select additional columns (attributes) directly in the database query, such as the users query. This approach can help us avoid performance issues related to N+1 queries and memory usage.
$users = User::query()
->addSelect(['last_post_at' => Post::select('created_at')
->whereColumn('user_id', 'users.id')
->latest()
->take(1)
])
->withCasts(['last_post_at' => 'datetime'])
->get();
@foreach ($users as $user)
<tr>
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($user->last_post_at)
{{ $user->last_post_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
By utilizing a subquery in this manner, we can retrieve all the necessary
information for our users page in a single query. This technique offers
significant performance benefits by minimizing both database queries and
memory usage, while also eliminating the need for caching.
Introducing dynamic relationships
So, we have a subquery that gives us the last post date for each user, but
what if we need more information about the last post, like its excerpt? One
approach is to create another subquery scope, but that could become
cumbersome if we have many attributes to retrieve.
Alternatively, we could create a belongs-to relationship called lastPost
that returns a Login model instance. However, since we don't want to
denormalize and add a last_post_id column to the users table, we'll use a
subquery to select the foreign key. Eloquent will treat it like a real
column, and we can define additional functionality like accessors or
relationships. Here's the code to implement it:
class User extends Model
{
public function lastPost()
{
return $this->belongsTo(Post::class);
}
public function scopeWithLastPost($query)
{
$query->addSelect(['last_post_id' => Post::select('id')
->whereColumn('user_id', 'users.id')
->latest()
->take(1)
])->with('lastPost');
}
}
$users = User::withLastPost()->get();
<table>
<tr>
<th>Name</th>
<th>Email</th>
<th>Last Post</th>
</tr>
@foreach ($users as $user)
<tr
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($user->lastPost)
{{ $user->lastPost->created_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
</table>
Gotchas
This technique has one limitation that needs to be taken into account. By default, you won't be able to lazy-load dynamic relationships because the scope won't be added automatically.
User::first()->lastPost; // will return null
If you'd like lazy-loading to work, you can still do this by adding a global scope to your model. However, this will add the scope to all queries, which may not be desirable. If you need to lazy-load the relationship, you can always use the withLastPost scope.
Before we conclude, you may have a question in mind: Could we have avoided all this effort by using a has-one relationship instead? The answer is no, and here's why.
class User extends Model
{
public function lastPost()
{
return $this->hasOne(Post::class)
->latest();
}
}
$users = User::with('lastPost')->get();
<table>
<tr>
<th>Name</th>
<th>Email</th>
<th>Last Post</th>
</tr>
@foreach ($users as $user)
<tr>
<td>{{ $user->name }}</td>
<td>{{ $user->email }}</td>
<td>
@if ($user->lastPost)
{{ $user->lastPost->created_at->format('M j, Y \a\t g:i a') }}
@else
Never
@endif
</td>
</tr>
@endforeach
</table>
Initially, it may seem like a has-one relationship would solve the problem.
When we access the lastPost relationship on our users, it returns the
correct last post instance. However, upon examining the generated query, we
can identify an issue.
select * from "users"
select * from "posts" where "posts"."user_id" in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) order by "created_at" desc limit 10
As you can see, it does not put a limit of 1 on the subquery. This means
that Eloquent will load all the posts for each user, which is not what we
want. We only want to load the last post for each user.
Adding limit on the has one has no effect and only returns the first post for all users.