Optimising Entity Framework with Projection-Friendly Mapping

Introduction

Discovering more efficient ways to handle data queries can be quite enlightening. I recently had an “aha” moment with Entity Framework projections that I’d like to share, especially since it’s a pattern I’ve been using for quite some time.

The Initial Setup

I’ve set up an example ASP.NET Core Web API with the following entities:

public class CategoryEntity
{
    public int Id { get; set; }

    public string Name { get; set; } = null!;

    public string? Description { get; set; }

    public List<BookEntity> Books { get; set; } = null!;
}

public class AuthorEntity
{
    public int Id { get; set; }

    public string Name { get; set; } = null!;

    public string? Biography { get; set; }

    public string? Image { get; set; }

    public List<BookEntity> Books { get; set; } = null!;
}

public class BookEntity
{
    public int Id { get; set; }

    public string Name { get; set; } = null!;

    public string Blurb { get; set; } = null!;

    public string? Isbn { get; set; }

    public string? Publisher { get; set; }

    public DateTime? PublishedOn { get; set; }

    public decimal? Price { get; set; }

    public string? CoverImage { get; set; }

    public string? PreviewLink { get; set; }

    public string? PurchaseLink { get; set; }

    public string? AudibleLink { get; set; }

    public int AuthorId { get; set; }
    public AuthorEntity Author { get; set; } = null!;

    public int CategoryId { get; set; }
    public CategoryEntity Category { get; set; } = null!;
}

On projects I have worked on, to transfer data, we often create DTOs with MapFromEntity methods, which help maintain a clean separation of concerns:

public record AuthorDto
{
    public int Id { get; init; }

    public string Name { get; init; } = null!;

    public List<BookDto> Books { get; init; } = new();

    public static AuthorDto MapFromEntity(AuthorEntity entity)
    {
        return new()
        {
            Id = entity.Id,
            Name = entity.Name,
            Books = entity.Books.Select(BookDto.MapFromEntity).ToList()
        };
    }
}

public record BookDto
{
    public int Id { get; init; }

    public string Name { get; init; } = null!;

    public int CategoryId { get; init; }

    public string CategoryName { get; init; } = null!;

    public static BookDto MapFromEntity(BookEntity entity)
    {
        return new()
        {
            Id = entity.Id,
            Name = entity.Name,
            CategoryId = entity.CategoryId,
            CategoryName = entity.Category.Name,
        };
    }
}

These methods are then used in EF queries to fetch and map data:

List<AuthorDto> authors = await _dbContext.Authors
    .Include(_ => _.Books)
    .ThenInclude(_ => _.Category)
    .Select(_ => AuthorDto.MapFromEntity(_))
    .ToListAsync();

Have a look at the SQL query that was generated by EF. Note how this pulls out all the data from the relevant Authors, Books, and Categories tables, even though we’re only interested in a select few columns:

SELECT [a].[Id], [a].[Biography], [a].[Image], [a].[Name], [t].[Id], [t].[AudibleLink], [t].[AuthorId], [t].[Blurb], [t].[CategoryId], [t].[CoverImage], [t].[Isbn], [t].[Name], [t].[PreviewLink], [t].[Price], [t].[PublishedOn], [t].[Publisher], [t].[PurchaseLink], [t].[Id0], [t].[Description], [t].[Name0]
FROM [Authors] AS [a]
LEFT JOIN (
    SELECT [b].[Id], [b].[AudibleLink], [b].[AuthorId], [b].[Blurb], [b].[CategoryId], [b].[CoverImage], [b].[Isbn], [b].[Name], [b].[PreviewLink], [b].[Price], [b].[PublishedOn], [b].[Publisher], [b].[PurchaseLink], [c].[Id] AS [Id0], [c].[Description], [c].[Name] AS [Name0]
    FROM [Books] AS [b]
    INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]

The issue should have become apparent to us by the fact that we have to use Include for the related data, and the fact that we were able to breakpoint and debug into MapFromEntity methods which should have suggested to us that all the data was being returned from the database before performing the mapping locally in memory.

If we change the EF query to:

List<AuthorDto> authors = await _dbContext.Authors
    .Select(author => new AuthorDto
    {
        Id = author.Id,
        Name = author.Name,
        Books = author.Books.Select(book => new BookDto
        {
            Id = book.Id,
            Name = book.Name,
            CategoryId = book.CategoryId,
            CategoryName = book.Category.Name,
        }).ToList()
    })
    .ToListAsync();

Then the resulting SQL query is:

SELECT [a].[Id], [a].[Name], [t].[Id], [t].[Name], [t].[CategoryId], [t].[CategoryName], [t].[Id0]
FROM [Authors] AS [a]
LEFT JOIN (
    SELECT [b].[Id], [b].[Name], [b].[CategoryId], [c].[Name] AS [CategoryName], [c].[Id] AS [Id0], [b].[AuthorId]
    FROM [Books] AS [b]
    INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]

Now, with this approach, we’re only querying the data that we actually need and making proper use of projection. Excellent.

The problem is that I find the MapFromEntity method to be very useful for a clean separation of concerns.

We could have used AutoMapper, and we have on previous projects, but we moved away from that for various reasons like lack of static code analysis. Since getting rid of AutoMapper, we’ve never looked back, but that’s a different story.

Retaining MapFromEntity with Expressions

To allow us to still use our MapFromEntity pattern, we can use Expressions instead:

public record BookDto
{
    public int Id { get; init; }

    public string Name { get; init; } = null!;

    public int CategoryId { get; init; }

    public string CategoryName { get; init; } = null!;

    public static readonly Expression<Func<BookEntity, BookDto>> MapFromEntity = entity => new BookDto
    {
        Id = entity.Id,
        Name = entity.Name,
        CategoryId = entity.CategoryId,
        CategoryName = entity.Category.Name,
    };
}

public record AuthorDto
{
    public int Id { get; init; }

    public string Name { get; init; } = null!;

    public List<BookDto> Books { get; init; } = new();

    public static readonly Expression<Func<AuthorEntity, AuthorDto>> MapFromEntity = entity => new AuthorDto
    {
        Id = entity.Id,
        Name = entity.Name,
        Books = entity.Books.AsQueryable().Select(BookDto.MapFromEntity).ToList()
    };
}

Now, we can use our MapFromEntity pattern efficiently:

List<AuthorDto> authors = await _dbContext.Authors
    .Select(AuthorDto.MapFromEntity)
    .ToListAsync();

And we achieve the same optimised SQL as with direct projection (i.e. newing up the DTO in the EF query):

SELECT [a].[Id], [a].[Name], [t].[Id], [t].[Name], [t].[CategoryId], [t].[CategoryName], [t].[Id0]
FROM [Authors] AS [a]
LEFT JOIN (
    SELECT [b].[Id], [b].[Name], [b].[CategoryId], [c].[Name] AS [CategoryName], [c].[Id] AS [Id0], [b].[AuthorId]
    FROM [Books] AS [b]
    INNER JOIN [Categories] AS [c] ON [b].[CategoryId] = [c].[Id]
) AS [t] ON [a].[Id] = [t].[AuthorId]
ORDER BY [a].[Id], [t].[Id]

Conclusion

By embracing Expressions, we can retain our clean MapFromEntity pattern and still enjoy the benefits of EF Core’s projection capabilities. This approach keeps our codebase efficient and maintainable.