Optimising Entity Framework with Projection-Friendly Mapping
Discovering more efficient ways to handle data queries can be quite enlightening. I recently had an “aha” moment with Entity Framework projections that I’d like to share, especially since it’s a pattern I’ve been using for quite some time.
I’ve set up an example ASP.NET Core Web API with the following entities:
public class CategoryEntity
{
public int Id { get; set; }
public string Name { get; set; } = null!;
public string? Description { get; set; }
public List<BookEntity> Books { get; set; } = null!;
}
public class AuthorEntity
{
public int Id { get; set; }
public string Name { get; set; } = null!;
public string? Biography { get; set; }
public string? Image { get; set; }
public List<BookEntity> Books { get; set; } = null!;
}
public class BookEntity
{
public int Id { get; set; }
public string Name { get; set; } = null!;
public string Blurb { get; set; } = null!;
public string? Isbn { get; set; }
public string? Publisher { get; set; }
public DateTime? PublishedOn { get; set; }
public decimal? Price { get; set; }
public string? CoverImage { get; set; }
public string? PreviewLink { get; set; }
public string? PurchaseLink { get; set; }
public string? AudibleLink { get; set; }
public int AuthorId { get; set; }
public AuthorEntity Author { get; set; } = null!;
public int CategoryId { get; set; }
public CategoryEntity Category { get; set; } = null!;
}
On projects I have worked on, to transfer data, we often create DTOs with MapFromEntity
methods, which help maintain a clean separation of concerns:
public record AuthorDto
{
public int Id { get; init; }
public string Name { get; init; } = null!;
public List<BookDto> Books { get; init; } = new();
public static AuthorDto MapFromEntity(AuthorEntity entity)
{
return new()
{
Id = entity.Id,
Name = entity.Name,
Books = entity.Books.Select(BookDto.MapFromEntity).ToList()
};
}
}
public record BookDto
{
public int Id { get; init; }
public string Name { get; init; } = null!;
public int CategoryId { get; init; }
public string CategoryName { get; init; } = null!;
public static BookDto MapFromEntity(BookEntity entity)
{
return new()
{
Id = entity.Id,
Name = entity.Name,
CategoryId = entity.CategoryId,
CategoryName = entity.Category.Name,
};
}
}
These methods are then used in EF queries to fetch and map data:
List<AuthorDto> authors = await _dbContext.Authors
.Include(_ => _.Books)
.ThenInclude(_ => _.Category)
.Select(_ => AuthorDto.MapFromEntity(_))
.ToListAsync();
Have a look at the SQL query that was generated by EF. Note how this pulls out all the data from the relevant Authors
, Books
, and Categories
tables, even though we’re only interested in a select few columns:
SELECT [a].[Id], [a].[Biography], [a].[Image], [a].[Name], [t].[Id], [t].[AudibleLink], [t].[AuthorId], [t].[Blurb], [t].[CategoryId], [t].[CoverImage], [t].[Isbn], [t].[Name], [t].[PreviewLink], [t].[Price], [t].[PublishedOn], [t].[Publisher], [t].[PurchaseLink], [t].[Id0], [t].[Description], [t].[Name0]
FROM [Authors] AS [a]
LEFT JOIN (
SELECT [b].[Id], [b].[AudibleLink], [b].[AuthorId], [b].[Blurb], [b].[CategoryId], [b].[CoverImage], [b].[Isbn], [b].[Name], [b].[PreviewLink], [b].[Price], [b].[PublishedOn], [b].[Publisher], [b].[PurchaseLink], [c].[Id] AS [Id0], [c].[Description], [c].[Name] AS [Name0]
FROM [Books] AS [b]
INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]
The issue should have become apparent to us by the fact that we have to use Include
for the related data, and the fact that we were able to breakpoint and debug into MapFromEntity
methods which should have suggested to us that all the data was being returned from the database before performing the mapping locally in memory.
If we change the EF query to:
List<AuthorDto> authors = await _dbContext.Authors
.Select(author => new AuthorDto
{
Id = author.Id,
Name = author.Name,
Books = author.Books.Select(book => new BookDto
{
Id = book.Id,
Name = book.Name,
CategoryId = book.CategoryId,
CategoryName = book.Category.Name,
}).ToList()
})
.ToListAsync();
Then the resulting SQL query is:
SELECT [a].[Id], [a].[Name], [t].[Id], [t].[Name], [t].[CategoryId], [t].[CategoryName], [t].[Id0]
FROM [Authors] AS [a]
LEFT JOIN (
SELECT [b].[Id], [b].[Name], [b].[CategoryId], [c].[Name] AS [CategoryName], [c].[Id] AS [Id0], [b].[AuthorId]
FROM [Books] AS [b]
INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]
Now, with this approach, we’re only querying the data that we actually need and making proper use of projection. Excellent.
The problem is that I find the MapFromEntity
method to be very useful for a clean separation of concerns.
We could have used AutoMapper, and we have on previous projects, but we moved away from that for various reasons like lack of static code analysis. Since getting rid of AutoMapper, we’ve never looked back, but that’s a different story.
To allow us to still use our MapFromEntity
pattern, we can use Expressions
instead:
public record BookDto
{
public int Id { get; init; }
public string Name { get; init; } = null!;
public int CategoryId { get; init; }
public string CategoryName { get; init; } = null!;
public static readonly Expression<Func<BookEntity, BookDto>> MapFromEntity = entity => new BookDto
{
Id = entity.Id,
Name = entity.Name,
CategoryId = entity.CategoryId,
CategoryName = entity.Category.Name,
};
}
public record AuthorDto
{
public int Id { get; init; }
public string Name { get; init; } = null!;
public List<BookDto> Books { get; init; } = new();
public static readonly Expression<Func<AuthorEntity, AuthorDto>> MapFromEntity = entity => new AuthorDto
{
Id = entity.Id,
Name = entity.Name,
Books = entity.Books.AsQueryable().Select(BookDto.MapFromEntity).ToList()
};
}
Now, we can use our MapFromEntity
pattern efficiently:
List<AuthorDto> authors = await _dbContext.Authors
.Select(AuthorDto.MapFromEntity)
.ToListAsync();
And we achieve the same optimised SQL as with direct projection (i.e. newing up the DTO in the EF query):
SELECT [a].[Id], [a].[Name], [t].[Id], [t].[Name], [t].[CategoryId], [t].[CategoryName], [t].[Id0]
FROM [Authors] AS [a]
LEFT JOIN (
SELECT [b].[Id], [b].[Name], [b].[CategoryId], [c].[Name] AS [CategoryName], [c].[Id] AS [Id0], [b].[AuthorId]
FROM [Books] AS [b]
INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]
By embracing Expressions
, we can retain our clean MapFromEntity
pattern and still enjoy the benefits of EF Core’s projection capabilities. This approach keeps our codebase efficient and maintainable.