EF CTP4 Tips & Tricks: Mapping to an Existing Database

Posted on July 18, 2010. Filed under: Entity Framework, Visual Studio | Tags: , , , , , , |

This is the third in a series of posts about the recently released Entity Framework Feature CTP4 (AKA “EF Magic Unicorn Edition”).

Most of the samples/demos of Code First so far have focused on creating a new database that is generated from a set of classes, however Code First also can be used in scenarios where you have an existing database.

For the majority of folks the existing designer/edmx based reverse engineering process (often referred to as Database First) already provides a good story for generating a model based on an existing database, with the option to tweak the object model and mapping. There are however some folks who would rather specify mapping using code than a designer.

You can download the completed VS2010 project from this post.

The Schema

Let’s start by looking at the schema we are going to map to, I’ve intentionally used table/column naming that makes it clear what is happening when we start mapping in code… I’m definitely not implying it’s good database design to post-fix table names with “Table” Smile

The schema is pretty straightforward, there are a couple of things to call out:

  • PostTagsTable represents a many to many relationship between Tags and Posts
  • UserTable is used to store two types of Users
    • Normal Users will have a value of “U” in the user_type column and should have ‘null’ in the admin_from column
    • Administrators will have a value of “A” in the user_type column and will have a valid date value in the admin_from column

 

The Object Model

Let’s define a set of classes that we are going to use EF to map to. We want our classes to be completely independent of whatever data access technology we are using so I’m just writing my own POCO classes.

The object model is pretty self explanatory, a couple of points:

  • For the Post->Blog relationship we are not exposing a foreign key property on Post but for the Post->User relationship we are (Post.AuthorId)
    (This is purely to show the difference in mapping, I’d recommend being consistent one way or the other throughout your model)
  • For the Post->User relationship we are only exposing a navigation property on one side (i.e. there is a Post.Author but no User.Posts)
  • There is no class corresponding to the PostTagsTable, just collection properties on the Post and Tag classes to represent the relationship
  • Although Users and Admins are stored in the same table in the database we have split them out into an inheritance hierarchy in the object model
public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }

    public ICollection<Post> Posts { get; set; }
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public string Extract { get; set; }
    public string Body { get; set; }

    public Blog Blog { get; set; }

    public ICollection<Tag> Tags { get; set; }

    public int AuthorId { get; set; }
    public User Author { get; set; }
}

public class Tag
{
    public string TagText { get; set; }

    public ICollection<Post> Posts { get; set; }
}

public class User
{
    public int UserId { get; set; }
    public string UserName { get; set; }
}

public class Admin : User
{
    public DateTime AdminFrom { get; set; }
}

The Context

Before we dive into mapping, we need to define a context to use for interaction with the database.

A couple of points to note:

  • I’m exposing a DbConnection based constructor just to keep everything in code for this post

    The easier alternative is to use the default constructor and add a connection string in your app/web.config file with the name “BlogContext”

    EF will match the app/web.config entry with the context name at runtime and use it

  • I’m overriding the protected OnModelCreating method which gives us the opportunity to tweak the model that is built via convention, this method is where we will add our mapping code
public class BlogContext : DbContext
{
    public BlogContext(DbConnection connection)
        : base(connection)
    { }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // TODO: Insert mapping here
    }

    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
    public DbSet<Tag> Tags { get; set; }
    public DbSet<User> Users { get; set; }
    public DbSet<Admin> Admins { get; set; }
}

Switch Off Database Initialization

DbContext has some smarts built in to try and help you out with managing database initialization, because we are taking care of managing the database schema we should turn that off.

The initialization strategy is an AppDomain wide setting for each context type so this code should execute prior to constructing any instances of the context in your application:

Database.SetInitializer<BlogContext>(null);

Mapping

If we tried to use our context for data access now we’d get a bunch of exceptions because the database we defined at the start doesn’t match what EF is expecting. For example Code First has used a convention to decide that the table for Blog should be called Blogs, which is fine if we let Code First create the database but not so good if we already have a table called BlogTable. In fact the first exception we get would tell us the Code First doesn’t know what the primary key property is for Tag because it doesn't follow the Id or <ClassName>Id convention.

There is however a bunch of stuff that the conventions will correctly take care of for us, like detecting that Blog.BlogId is a primary key and set as an Identity column in the database. Let’s take a look at the mapping code we would add in the OnModelCreating method for each of our five classes.

Blog

Blog is pretty simple to configure, first we are specifying that Name is a required property in the database and has a maximum length of 50.

modelBuilder.Entity<Blog>()
    .Property(b => b.Name)
    .IsRequired()
    .HasMaxLength(50);

Secondly we are specifying which table it maps to and how the properties map to the columns. Notice the anonymous type projection syntax for specifying which properties are mapped to the database and what their respective column names are. On the upside this syntax is concise and reads pretty well if you look at the code, on the down side it isn’t very discoverable when you look at the method signature for MapSingleType; public EntityMap MapSingleType(Expression<Func<TEntity, object>> propertyMap). Any feedback or comments you have on this syntax will help us shape the Fluent API in the future.

modelBuilder.Entity<Blog>()
    .MapSingleType(b => new { blog_id = b.BlogId, name = b.Name })
    .ToTable("BlogTable");

Post

Post is probably the most interesting mapping, we have a few properties to configure which are just variations on code we have already seen.

modelBuilder.Entity<Post>()
    .Property(p => p.Title)
    .IsRequired()
    .HasMaxLength(50);

modelBuilder.Entity<Post>()
    .Property(p => p.Extract)
    .HasMaxLength(500);

modelBuilder.Entity<Post>()
    .Property(p => p.Body)
    .IsRequired();

Next we are configuring the many to many relationship between Post and Tags, here we see the relationship API in action and also the syntax for specifying how the join table of a many to many relationship is controlled.

modelBuilder.Entity<Post>()
    .HasMany(p => p.Tags)
    .WithMany(t => t.Posts)
    .Map("PostTagsTable", (p, t) => new
    {
        post_id = p.PostId,
        tag_id = t.TagText
    });

The one to many relationship between Post and User only exposed a navigation property on Post. We can still specify that Users have many Posts, even though there is no User.Posts property, by using the parameterless overload of HasMany(). We also exposed a foreign key property on the Post class so we need to specify that Post.Author and Post.AuthorId both represent the same relationship, we use the HasConstraint method to do this.

modelBuilder.Entity<Post>()
    .HasRequired<User>(p => p.Author)
    .WithMany()
    .HasConstraint((p, u) => p.AuthorId == u.UserId);

Finally we specify the table and column mappings. Note how the syntax for specifying foreign key columns differs depending on whether a corresponding foreign key property was exposed on the object. If there is a foreign key property in the object mdoel then we can just map that directly to the column (author_id = p.AuthorId), if not then we can dot through the navigation property to the primary key of the related entity (blog_id = p.Blog.BlogId).

modelBuilder.Entity<Post>()
    .MapSingleType(p => new
    {
        post_id = p.PostId,
        title = p.Title,
        extract = p.Extract,
        body = p.Body,
        author_id = p.AuthorId,
        blog_id = p.Blog.BlogId
    })
    .ToTable("PostTable");

Tag

Because Tag doesn’t have an obvious primary key property for Code First to detect by convention we need to configure which property is the key.

modelBuilder.Entity<Tag>()
    .HasKey(t => t.TagText);

Apart from the key the rest is just a simple table and column mapping.

modelBuilder.Entity<Tag>()
    .MapSingleType(t => new
    {
        tag_id = t.TagText
    })
    .ToTable("TagTable");

User & Admin

We map User and Admin at the same time because they form part of the same inheritance hierarchy. Note that we are now using the MapHierarchy method rather than MapSingleType. We then use the Case<T> methods to specify how the two types map to the same table. We specify the values that should go in the discriminator column (user_type) by specifying constants in each mapping fragment. Also note that we don’t  need to reconfigure properties as we move up the inheritance hierarchy (i.e. in the Admin block we don’t need to map ‘user_id’ or ‘user_name’ because we already mapped them for the base type).

modelBuilder.Entity<User>()
    .Property(u => u.UserName)
    .IsRequired()
    .HasMaxLength(50);

modelBuilder.Entity<User>()
    .MapHierarchy()
    .Case<User>(u => new
    {
        user_id = u.UserId,
        user_name = u.UserName,
        user_type = "U"
    })
    .Case<Admin>(a => new
    {
        admin_from = a.AdminFrom,
        user_type = "A"
    })
    .ToTable("UserTable");

Access Data

With all the mapping specified we are now ready to start accessing data:

using (var connection = new SqlConnection(@"Server=.\SQLEXPRESS;Database=ExistingBlogDb;Trusted_Connection=True;"))
{
    using (var context = new BlogContext(connection))
    {
        var blog = new Blog { Name = "romiller.com" };
        var me = new Admin { UserName = "Rowan", AdminFrom = DateTime.Today };
        var ctp4 = new Tag { TagText = "CTP4" };

        var post = new Post
        {
            Title = "Mapping to an existing database",
            Extract = "Blah blah blah",
            Body = "Lost of useful info",
            Blog = blog,
            Author = me,
            Tags = new List<Tag> { ctp4 }
        };

        context.Posts.Add(post);
        context.SaveChanges();
    }
}

Configuration Classes

Obviously our OnModelCreating method is getting pretty huge and this is only a small model, this is where derived configuration classes come in handy.

For example I could define the following configuration for Blog:

public class BlogConfig : EntityConfiguration<Blog>
{
    public BlogConfig()
    {
        this.Property(b => b.Name)
            .IsRequired()
            .HasMaxLength(50);

        this.MapSingleType(b => new { blog_id = b.BlogId, name = b.Name })
            .ToTable("BlogTable");
    }
}

And then just register it in the OnModelCreating method:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Configurations.Add(new BlogConfig());
    ...
}

Code Generation

If we had a big database with a lot of tables then writing this configuration code is going to become a painful (probably unrealistic) task, in much the same way mapping a huge model to a database from scratch using the designer or xml wouldn’t be a fun task. For this very reason the designer will automatically generate a default 1:1 mapping for you as a starting point. So the logical next step would be to have something analyze your database and give you an object model and mapping code that you could then start to tweak… for a little side project I’m playing around with a generator that does exactly this, it’s not quite ready to share yet but I’ll post it up shortly.

Summary

Mapping to an existing database is achievable with Code First, although the edmx/designer based Database First approach is probably a better fit for the majority of folks. If your database schema varies a lot from the default Code First conventions then you will end up writing quite a bit of code. It is possible to write a generator that will spit out a starting point to reduce the amount of code you need to write by hand, I’ll be sharing the results of a pet project on this shortly.

About these ads

Make a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

19 Responses to “EF CTP4 Tips & Tricks: Mapping to an Existing Database”

RSS Feed for RoMiller.com Comments RSS Feed

Thanks for this nice post!
I have a doubt. How did you mapped the Blog > Post relationship? How should I map an one-to-many rel anyway?

I didn’t need to configure the Blog->Post relationship because everything was correctly detected by convention, that said there is no reason I couldn’t have configured it anyway as follows:

modelBuilder.Entity()
.HasRequired(p => p.Blog)
.WithMany(b => b.Posts)

I have a doubt. Does this configuration still working properly with EF 4.1 Code-First Approach? I tried the configuration above, but raise a run-time configuration error when I try to submit changes.
Thanks in advance for your help.

Thanks – this is a really helpful post.
I’ve read all the posts here plus the ado.net blog posts on CTP4, and there are two areas where I could use a little help:
1: Using this in an asp.net setting, can you give some guidance on what to do at app startup (build the model?) and what to do per request (create a context?). I’m not quite understanding how to do each of these.
2: A little while back K Scott Allen wrote a great article for MSDN about testability in EF4. Do we still follow that guidance – is there a better way to do this now?
(Hint: these might make good blog posts!)
Thanks!

Hi James,

1) DbContext will take care of caching the model for you within an AppDomain (if you set a breakpoint in the OnModelCreating method you’d see it only gets called the first time you construct your context in each AppDomain). Because the model is cached it’s pretty cheap to construct contexts so you can have one per request, or even just create them as needed within the page.
2) In CTP4 the patterns are the same as with EF in general, except you would use IDbSet instead of IObjectSet. We are looking at adding some more functionality in future releases to make testing easier.

~Rowan

What would happen if we changed the column name in the database schema from: “PostTable.author_id” to lets say… “PostTable.AuthorId” or another arbitrary name?
The following mapping code would no longer be usable because the new column name could no longer be determined by convention:
modelBuilder.Entity().HasRequired(p => p.Author).WithMany().

Is it possible to set the author_id column name explicitly using the fluent api?

Oh well.. I guess the solution stared me right in the eyes.
What I was looking for was something similar to the mapping on column PostTable.blog_id. This solved my problem.

Good article btw.
It shows a lot of relationship mapping examples that I was not able to find documentation on anywhere else.
Keep up the good work.

thanks for excellent article. this gives me lot of ideas for my own project. thanks.

I’m trying to create a simple example following this post, but I’ve hit a roadblock.

I have two classes: Sku and Revision (i.e. an iPhone SKU might have several revisions… Rev.A, Rev.B and so forth).

Here’s the Sku class:
public class Sku
{
public int SkuId { get; set; }
public string ModelNumber { get; set; }

public virtual ICollection Revisions { get; set; }
}

And here’s the Revision class:
public class Revision
{
public int RevisionId { get; set; }
public string Name { get; set; }

public int SkuId { get; set; }
public Sku ParentSKU { get; set; }
}

The tables in my db are as follow:
SKU (id, name, modelNumber)
Revision (id,name,skuId)

Here’s how I’ve mapped them:

builder.Entity()
.MapSingleType(s => new {
id = s.SkuId,
name = s.Name,
modelNumber = s.ModelNumber
})
.ToTable(“SKU”);

builder.Entity().HasKey(s => s.SkuId);

builder.Entity()
.MapSingleType(r => new
{
id = r.RevisionId,
name = r.Name,
skuId = r.SkuId
})
.ToTable(“Revision”);

builder.Entity()
.HasRequired(r => r.ParentSKU)
.WithMany(s => s.Revisions);

and here are my two questions:

1) When I try to enumerate Revisions, I get the following inner exception: Invalid Object Name “dbo.ParentSKU_Revisions”. It seems like it thinks it’s a many to many relationship defined in a third table… what am I doing wrong?

2) So if I understood right, the mapping goes is defined in the detail entity? Wouldn’t it make more sense to define the mapping that actually defines the ICollection?

One more thing, if using the DataContext as suggested here, at some point you’ll get the “there’s already an open DataReader…” error. The way I got around this was to declare a “;MultipleActiveResultSets=True;” on my connection string, but I’m not sure this is the best practice. Any thoughts?

Hi,

1) This is a bug in CTP4, to work around it you need to also specify the Foreign Key constraint when you configure the relationship:

builder.Entity()
.HasRequired(r => r.ParentSKU)
.WithMany(s => s.Revisions)
.HasConstraint((r, s) => r.SkuId == s.SkuId);

2) I’m not quite sure what you are asking, do you mean that the above code starts with the Revision rather than the SKU? If so then you can write the code either way around, i.e. the following achieves the same thing:

builder.Entity()
.HasMany(s => s.Revisions)
.WithRequired(r => r.ParentSKU)
.HasConstraint((r, s) => r.SkuId == s.SkuId);

Hope this helps,
~Rowan

Regarding the open data reader error, specifying “MultipleActiveResultSets=True” is the correct thing to do.

I am trying to do a project for a customer using .NET 4. For many reasons I cant use any CTPs at this time. I have gone down the model-first route, which is mostly good, but have come across this problem. I want several read only properties (calculated fields in db terms) but cannot get the fields included in the metadata of the EF model. Is there a way to use the code-first features to make my property part of the metadata using released code?

Hi Ronald,

Do the properties have an underlying column in the database? If so then just add a property in the designer and mark it’s “StoreGeneratedPattern” as “Computed”. If not then you can add extra properties to your classes by defining a partial class (code generation makes all classes partial by default).

Hope this helps,
~Rowan

What happen if you have a many-to-many table that hold compound/composite PK for 3 (not 2) different tables? Your code above links to the Posts and Tags, what if I have another PK from table “Categories”?

modelBuilder.Entity()
.HasMany(p => p.Tags)
.WithMany(t => t.Posts)
.Map(“PostTagsTable”, (p, t) => new
{
post_id = p.PostId,
tag_id = t.TagText
});

Thanks.

Hi,

EF in general only supports 2-way relationships, you would need to have an entity that represents the join table in your model with three relationships off to the related entities.

~Rowan

Hi Rowan, thanks for the info. Do you have an example on how to accomplish that? Thanks again!

For a larger DB would it be possible to use Database first approach to make a model then have EF4 generate the POCO classes for you. Then delete the edmx file? So then you would be essentially doing code first?

Is your side project ready that does the code generation?


Where's The Comment Form?

    About

    Rowan works as a Program Manager for the ADO.NET Entity Framework team at Microsoft. He speaks at technical conferences and blogs at romiller.com. Rowan lives in Seattle, Washington with his wife Athalie. Prior to moving to the US he resided in the small state of Tasmania in Australia. Outside of technology Rowan's passions include snowboarding, mountain biking, horse riding, rock climbing and pretty much anything else that involves being active. The primary focus of his life, however, is to follow Jesus.

    RSS

    Subscribe Via RSS

    • Subscribe with Bloglines
    • Add your feed to Newsburst from CNET News.com
    • Subscribe in Google Reader
    • Add to My Yahoo!
    • Subscribe in NewsGator Online
    • The latest comments to all posts in RSS

    Meta

Liked it here?
Why not try sites on the blogroll...

Follow

Get every new post delivered to your Inbox.

Join 166 other followers

%d bloggers like this: