Testing Entity Framework context in-memory

I’m a great believer in having your tests as integrated and as “end-to-end” as possible, avoiding mocks on anything that does not cross a port or boundary, à la https://blogs.msdn.microsoft.com/ericgu/2014/12/01/unit-test-success-using-ports-adapters-and-simulators/

One thing I have always found is that this becomes tricky when it comes to mocking out your data access, particularly with EF. You can quite easily abstract away your data access layer, and that’s absolutely fine. However, I always found I lost something doing this, especially if a test encompasses more than one bit of database access – you start dealing in fake data too often. Not to mention that if you do anything funkier than simple reads and writes in the layer you’re abstracting, it isn’t going to be covered.

Entity Framework 7 / Core provides an in-memory provider for your collections, which is great for this: https://prashantbrall.wordpress.com/2015/08/31/entity-framework-7-in-memory-testing/ – so look into that if you’re on that version.

I am, however, still using EF6. Some third-party libraries exist to do this, like Effort: https://github.com/tamasflamich/effort, which works nicely. I like to have a wee bit more control over the mocked instance, though, so I have started creating my own library for mocking your context.

MockEF.<Framework>

https://github.com/MartinMilsom/MockEF

https://www.nuget.org/packages/MockEF.Moq/

https://www.nuget.org/packages/MockEF.Rhino/

I found I like to deal with the context in the same way I would any other mocked service or object, so I decided to create variations of the library for various popular mocking frameworks – to date, just Rhino.Mocks and Moq. Let’s say we are using MockEF.Rhino. The library will then use Rhino to create its mock of the context, i.e. an in-memory collection version of the context is created by Rhino’s MockRepository, and so any Rhino functions you like to use still work. This also has the added benefit of being able to mock anything that exists on your context’s interface – so if you have a bunch of methods of your own on there, these are just stubbed in the same way you always would (there’s a small sketch of this at the end of the Getting Started section below).

Getting Started

To set up your context mock, simply call:

var context = new ContextBuilder<IMyContext>()
      .Setup(x => x.Authors, new List<Author> { new Author { Id = 1, Name = "bob" }})
      .Setup(x => x.Books)
      .GetContext();

The type argument of ContextBuilder must be an interface. Then, to use this in your code, you can set up a context factory, like so:

public interface IFactory
{
  IMyContext Create();
}

public class Factory : IFactory
{
  public IMyContext Create()
  {
    return new MyContext();
  }
}

//class that needs to use the context
public class Example
{
  private IFactory _factory;
  public Example(IFactory factory)
  {
    _factory = factory;
  }

  public bool MethodThatUsesContext()
  {
    //Important: your context interface MUST implement IDisposable. 
    //Firstly because this won't work otherwise; secondly because you should anyway.
    using (var context = _factory.Create())
    {
      //Use context here.
    }
    return true;
  }
}

//When testing you can then stub your factory to return the context you have set up, like so:
var factory = MockRepository.GenerateMock<IFactory>();
factory.Stub(x => x.Create()).Return(myMockedContext); //context built using Rhino.MockEF.
//then
var result = new Example(factory).MethodThatUsesContext();
Assert.IsTrue(result);


//When running the code outside of tests, in your DI registrations you can use something like:
container.Register<IFactory, Factory>();
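
Because the context returned by GetContext() is an ordinary Rhino mock (when using MockEF.Rhino), any extra members you have declared on your context interface can still be stubbed in the usual Rhino way. A minimal sketch, assuming a hypothetical GetAuthorCount() method on IMyContext (it is not part of the library):

myMockedContext.Stub(x => x.GetAuthorCount()).Return(1);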

Current Limitations

Currently, any calls to .Find(..) will only look for fields with the [PrimaryKey] attribute – if your key is set up as something else, or configured elsewhere, it’s likely to fail.

.SaveChanges() and .SaveChangesAsync() will do pretty much nothing. Any adds or updates are applied to the in-memory collection immediately, which is not the same behaviour as the real world. A good example of this is that the Attach(..) and Add(..) methods do the same thing.
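
A rough sketch of what that means in a test (assuming the context exposes Authors as an IDbSet<Author>, as in the earlier example):

var context = new ContextBuilder<IMyContext>()
      .Setup(x => x.Authors, new List<Author> { new Author { Id = 1, Name = "bob" }})
      .GetContext();

context.Authors.Add(new Author { Id = 2, Name = "alice" });

//No SaveChanges() call needed – the add is applied to the in-memory collection straight away.
Assert.AreEqual(2, context.Authors.Count());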

To be Done

Plan to add more mocking frameworks.

Extension methods on libraries. Rather than manually creating a ContextBuilder – hide the methods behind an extension applied to the mocking framework’s library.

Test support for versions < EF6


Keep an Eye on Your Memory Consumption When using Akka.Net

I was recently working on a system that essentially involved using Akka to repeatedly call an API over HTTP, transform the data and store the result. Pretty simple stuff. Using Akka for this seemed ideal, as some of the data transforms were a little complex and I was dealing with a lot of data and a huge number of requests. I had set up the system to start its process every 5 minutes, using the built-in scheduler – like so:


system
    .Scheduler
    .Schedule(
        TimeSpan.FromSeconds(0),
        TimeSpan.FromMinutes(5),
        someActor, someMessage);

What I did wrong

Now, to keep it simple, let’s say I had three actor types: CollectDataActor, TransformDataActor and SaveDataActor. The process simply involved the CollectDataActor calling the API, then telling the TransformDataActor to do its thing, which in turn tells the SaveDataActor to, well, save the data.

What I was doing to achieve this was calling:


var actor = Context.ActorOf(Props.Create<CollectDataActor>());
actor.Tell(myMessage);

This would then collect the data, which came back as a List<>, and then essentially create a TransformDataActor for every item in that list and tell each one to process its item.

Why this is a problem

As I mentioned at the top of the post, I was scheduling this process to run every 5 minutes. This meant that every single time the process ran, new actors were created at each stage: a new actor to collect, and a huge number to transform and subsequently store. This resulted in memory consumption increasing and increasing over time. Not good.

Fix

At first my solution was to kill all the actors once I was done with them. However, this became hard to manage as the system grew, and it didn’t work at all well for re-sending dead messages – it was a hack, to be honest.
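
Roughly what that throwaway approach looked like (just a sketch of the hack, not a recommendation):

var actor = Context.ActorOf(Props.Create<TransformDataActor>());
actor.Tell(item);
//...and once we believe the actor has finished with the message:
actor.Tell(PoisonPill.Instance);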

The solution I used instead was to create all of the actors I’d need before the process starts running. I knew I’d need only one CollectDataActor per run, a shed load of TransformDataActors, and the same number of SaveDataActors (one for each data item I transform).

What I did was create a router with a round-robin pool for both the TransformDataActors and the SaveDataActors. This is a feature of Akka.Net that keeps a pool of a given actor type; when you message the router, it hands the message to the actors in the pool in turn (if you want messages routed to the least-busy actor instead, there is a SmallestMailboxPool). Then, rather than creating an actor each time from the CollectDataActor, I can just select the router by its path and send the message to it.

Code:


//setup code.
var transformProps = Props.Create<TransformDataActor>().WithRouter(new RoundRobinPool(50));
var transformRouter = system.ActorOf(transformProps, "transformDataRouter");

var saveProps = Props.Create<SaveDataActor>().WithRouter(new RoundRobinPool(50));
var saveRouter = system.ActorOf(saveProps, "saveDataRouter");

With these set up, the CollectDataActor code can look a little something like:


public class CollectDataActor : ReceiveActor
{
    public CollectDataActor()
    {
         Receive<object>(message => Handle(message));
    }

    private void Handle(object message)
    {
        List<TypeMcType> data = CollectData();

        foreach (var item in data)
        {
           var actorPath = "akka://your-system-name/user/transformDataRouter";
           var actor = Context.ActorSelection(actorPath);
           actor.Tell(item);
           //NOTE: here we use "ActorSelection", we do not create a new actor
           // -  this is the key difference!
        }
    }
}
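
The TransformDataActor can then pass its result on to the save router in exactly the same way. A sketch (Transform() here is just a stand-in for whatever your transform logic is):

public class TransformDataActor : ReceiveActor
{
    public TransformDataActor()
    {
        Receive<TypeMcType>(item => Handle(item));
    }

    private void Handle(TypeMcType item)
    {
        var transformed = Transform(item);

        //Again: select the existing router, don't create a new actor.
        Context.ActorSelection("akka://your-system-name/user/saveDataRouter").Tell(transformed);
    }

    private TypeMcType Transform(TypeMcType item)
    {
        //stand-in for the real transform logic.
        return item;
    }
}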

Authenticate to Mongo Database that isn’t “admin”

If you have set up your Mongo instance by adding a user to the “admin” database to authenticate against, you may have run into some confusion about how to connect and authenticate against another database on that server.

Let’s say you are trying to use the “local” database. If you’re using the command line, you need to add the --authenticationDatabase parameter, so your connection would look something like:

mongo myserver.com --username martin.milsom --authenticationDatabase admin -p myPassword

The extra parameter lets Mongo know which database the user should be authenticated against.

Now, if you are using the C# driver for this, as I was, then the answer is to include the equivalent option in the connection URL – in a connection string it is called authSource – for example:

var connectionString = "mongodb://martin.milsom:myPassword@myServer.com:27017/?authSource=admin";
var database = new MongoClient(connectionString).GetDatabase("local");

Note here, the URL parameter "?authSource=admin" plays the same role as --authenticationDatabase does on the command line.
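
If you prefer building the client from settings rather than a connection string, something like this should be equivalent (a sketch – note that older 2.x versions of the driver expose a Credentials collection instead of the single Credential property):

var credential = MongoCredential.CreateCredential("admin", "martin.milsom", "myPassword");

var settings = new MongoClientSettings
{
    Server = new MongoServerAddress("myServer.com", 27017),
    Credential = credential //the first argument above is the authentication database.
};

var database = new MongoClient(settings).GetDatabase("local");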

A Few Extra AutoMapper Features

AutoMapper is a really handy tool used to map properties from one object to another. It’s very useful in situations such as mapping your domain objects into simplified POCO classes you would expose from an API. For more, visit their site. Having played with it a bit, I have stumbled across a couple of cool little features you may not know about.

Flattening

This can be useful if the structures of your two objects do not match. Say, for example, you have the classes:


    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public string PostCode { get; set; }
    }
    public class Phone
    {
        public string Home { get; set; }
        public string Mobile { get; set; }
    }
    public class Contact
    {
        public string Name { get; set; }
        public Address Address { get; set; }
        public Phone Phone { get; set; }
    }
    public class FlattenedContact
    {
        public string Name { get; set; }
        public string AddressStreet { get; set; }
        public string AddressCity { get; set; }
        public string AddressPostCode { get; set; }
        public string Home { get; set; }
        public string Mobile { get; set; }
    }

You’ll notice that the property naming convention on ‘FlattenedContact’ is inconsistent – this is on purpose, to show how AutoMapper handles each case differently.

The simplest way to map is just to use

Mapper.CreateMap<Contact, FlattenedContact>();

And then call

var contact = Mapper.Map<FlattenedContact>(myContact);

Except it’s not quite that simple. AutoMapper, in this instance, will not match the properties coming from the ‘Phone’ class, because the names of the properties they correspond to on the flattened object (‘Home’ & ‘Mobile’) are not prefixed with ‘Phone’ (the name of the source property). The ‘Address’ properties, on the other hand, will be matched, because they are named things like ‘AddressCity’. Makes sense? If you do not wish to rename your properties, there is something you can do to tell AutoMapper:

            Mapper.CreateMap<Contact, FlattenedContact>()
                .ForMember(dest => dest.Mobile, opt=> opt.MapFrom(src=> src.Phone.Mobile))
                .ForMember(dest => dest.Home, opt=> opt.MapFrom(src=> src.Phone.Home));

Using this mapper instead of the simple one above gives AutoMapper the information it needs, so all will be fine and dandy!
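
A quick usage sketch, assuming myContact is a populated Contact:

var flat = Mapper.Map<FlattenedContact>(myContact);
//flat.AddressCity == myContact.Address.City   (matched by the flattening convention)
//flat.Mobile == myContact.Phone.Mobile        (matched by the explicit ForMember rule)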

Un-flattening

As the heading may suggest, this is the opposite of what we achieved before. We will stick with the same example classes; imagine this time we want to convert from ‘FlattenedContact’ to ‘Contact’. To do this, you need to set up a simple map for each custom type within the complex object and add rules so that AutoMapper applies them when mapping that object. Put simply, you create a map from ‘FlattenedContact’ to both ‘Address’ and ‘Phone’. It will look a little like this:

 Mapper.CreateMap<FlattenedContact, Address>();
 Mapper.CreateMap<FlattenedContact, Phone>();
 Mapper.CreateMap<FlattenedContact, Contact>()
      .ForMember(dest=> dest.Phone, opt=> opt.MapFrom(src=>src))
      .ForMember(dest=> dest.Address, opt=> opt.MapFrom(src=> src));

Then simply call:

var contact = Mapper.Map<Contact>(myFlattenedContact);

As with flattening, this only works for one of the conventions. In this case only the Phone class is mapped correctly; the address properties are not, because AutoMapper does not recognise the property name “AddressCity” as a match for just “City”. You will have to do some explicit member mapping for this. One option is to map them individually on the FlattenedContact-to-Address map:

 Mapper.CreateMap<FlattenedContact, Address>()
      .ForMember(dest => dest.City, opt => opt.MapFrom(src => src.AddressCity))
      .ForMember(dest => dest.PostCode, opt => opt.MapFrom(src => src.AddressPostCode))
      .ForMember(dest => dest.Street, opt => opt.MapFrom(src => src.AddressStreet));

Alternatively, you could tell AutoMapper to look out for a specified prefix in the property names, like so:

 Mapper.RecognizePrefixes("Address");
 Mapper.CreateMap<FlattenedContact, Address>();
 Mapper.CreateMap<FlattenedContact, Phone>();
 Mapper.CreateMap<FlattenedContact, Contact>()
      .ForMember(dest => dest.Phone, opt => opt.MapFrom(src => src))
      .ForMember(dest => dest.Address, opt => opt.MapFrom(src => src));

Reverse Map

This is a nice and simple one. If you’re mapping from one object to another, you can call

Mapper.CreateMap<Source, Destination>().ReverseMap();

Instead of

Mapper.CreateMap<Source, Destination>();
Mapper.CreateMap<Destination, Source>();

Creating Your Own Rules

Sometimes you may want to do something more complicated during your map, such as casting or parsing a property. This can be done by setting up a rule. To make this happen, you create a class that inherits from AutoMapper’s ‘TypeConverter’ class:

 public class CommaDelimitedStringConverter : TypeConverter<string, IEnumerable<string>>
 {
   //A very simple converter: it takes a comma-delimited
   //string and outputs it as a list of strings.
   protected override IEnumerable<string> ConvertCore(string source)
   {
        return source.Split(',').ToList();
   }
 }

and then register this rule using

 Mapper.CreateMap<string, IEnumerable<string>>()
    .ConvertUsing<CommaDelimitedStringConverter>();
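
Which you can then exercise like any other map:

var parts = Mapper.Map<IEnumerable<string>>("red,green,blue");
//parts now contains "red", "green" and "blue".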

Alternatively you can just pass in a Func, for example:

Mapper.CreateMap<string, int>().ConvertUsing(Convert.ToInt32);

EDIT: As of version 4.2 of AutoMapper, registrations are no longer static and look a little more like:

 Mapper.Initialize(cfg =>
 {
     cfg.CreateMap<FlattenedContact, Contact>();
     cfg.CreateMap<MyType, MyOtherType>();
 });
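
If you want to avoid the static Mapper entirely, 4.2 also supports an instance-based configuration – a sketch of the same registrations:

 var config = new MapperConfiguration(cfg =>
 {
     cfg.CreateMap<FlattenedContact, Contact>();
     cfg.CreateMap<MyType, MyOtherType>();
 });

 var mapper = config.CreateMapper();
 var contact = mapper.Map<Contact>(myFlattenedContact);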

Happy Mapping!

An Educated Guess is Still a Guess

In the real world…

Two weeks ago I moved house. I, obviously, needed to pack up my room. To do this task I had estimated both the time it would take and the amount of transportable storage I would need. I decided I could perform the task over three evenings, first packing the electrical devices, then some odd bits and bobs I had lying around and lastly, I would pack clothes. I had three big cardboard boxes, two large suitcases as well as a couple of rucksacks.

The good news was, I fit all of my stuff into the storage I had. Admittedly with some squashing of suitcases and some overflowing boxes – but I did it!!

Unfortunately, the time I allotted myself was nowhere near enough and I ended up packing until the early hours in the morning before I moved – but, hey, we all do overtime every now and then.

Even more unfortunate for me was that this move did not work out from the get go, and I’m now sat here surrounded by half-packed stuff – ready to move again. Joy.

The Educated Guess

Now that I’m packing again I felt this time I knew exactly how long it would take and what level of storage I would need. I had exactly the same amount of stuff as before and knew where everything was, easy!

Sadly, incorrect! The packing took even longer. I guess the stress of the double move had a negative effect on my motivation for the task (I guess that’s why I’m sat writing this and not packing!). Also, the boxes and suitcases I had did not suffice and I had to acquire another large box and another bag. How annoying!

So, what am I getting at?

In the software world most of us estimate and size our work items before doing them, and usually this leads to us committing to do a task in that time. If we relate this way of working back to my real-world moving example…

On the first move the perception would be that I “succeeded”. All of the stuff that needed to be moved appeared to be moved. Awesome. Except:

  • I crammed so much stuff into my suitcase that the zip broke when I opened it. Doh!
  • My boxes were so full that stuff fell out when carrying them – and I still can’t find my computer mouse…
  • I stayed up so late that move day was a real slog.

So, from this I would like to draw your attention to the dangers of doing anything (and at any cost) to stick to your estimate. That missing computer mouse could be a very damaging bug you overlook. Not only that, I “worked overtime” to get the job done, which caused a huge detriment to my ability the following day – if moving was my day job I certainly wouldn’t be able to keep this up every week.

Second time around is worse: I again underestimated both the time and the storage I needed. What’s the reason for this? Simple: this task, despite appearances, is not the same as the last one. Firstly, I have changed – let’s face it, moving twice in a fortnight is no small amount of stress, and no matter the task, motivation is key. Secondly, the task itself is different. Okay, I’m moving the same stuff, but everything starts in a different place than before. This is the same when writing software: no two tasks are the same. Sure, you can draw similarities, but ultimately you’re just guessing again – and just because one task is similar to another, you can still be wildly far off your estimate.

Is Estimation Really Worthwhile?

Getting better at estimating work is a skill most programmers seem to seek out. There are some great tips on how to give more accurate estimates, none more so than here:

http://simpleprogrammer.com/2014/10/27/5-ways-software-developers-can-become-better-estimation/

But how much time must we spend estimating tasks? We can all sit researching the code base and assessing the backlog items until we’re blue in the face, but ultimately you will never know how long something will take until it’s done. Spending a large amount of your time preparing to give an estimate seems a lot like waste, to me.


Why?

So why do we get hung up on it? A lot of us run a SCRUM process, and at the end of our sprints we’re either all very happy or very depressed. Our mood is dependent on whether we finished the work that we, ourselves, estimated we could do before we started the sprint. If we do everything on time – we estimated well. If we have stuff left to do – we estimated badly. So why would we want to feel depressed after a two-week slog of producing (I hope) great work, because of a speculative call we made at the beginning of the process? I think it is much better to judge our work on how good the work is – simple. But to do this we need to stop putting so much importance on our estimations – or at the very least prevent them from becoming a binding contract between ourselves and the business. After all, a certain agile principle does state:

“Working software is the primary measure of progress.”


Another way

I’ve been reading a fair bit recently on the idea of NoEstimates, which I believe can be attributed to Neil Killick. Read:

http://neilkillick.com/2013/01/31/noestimates-part-1-doing-scrum-without-estimates/

or check out #NoEstimates for a whole lot of discussion on this.

I tend to hear a lot that estimations are required, otherwise we will not know whether we can deliver on time. This is fair; some people have clients and customers to keep happy. However, I have previously worked in a team that did not estimate their work: we ran a Cranked (www.crankedalliance.org) process. In this process we did not fix the time of our iterations, but their scope. By doing this we could still forecast work, without once having to guess how long a piece of work would take. All we did was take the range of time our iterations took (removing outliers) and multiply it by the number of work items a feature would require. Simple.

For example, say we needed to implement a login feature. This feature would require creating the back-end code, some form of UI, and making sure we restrict access to the application for non-logged-in users. That’s three pieces of work to me. The team’s iterations take 2-3 days on average, so that equates to 6 to 9 days (3 × [2 to 3 days]). Okay, it’s a simple example, but the idea works. You keep your work items small and focused, and you ultimately don’t have to waste any more time estimating.

The beauty of removing this concern from us programmers and techy folk is that we can just concentrate on making the product beautiful, without spending our time or our emotional well-being on sticking to our estimations.

And this is just one thing you can do to help remove the waste in estimating – there is a whole bunch of material out there 🙂

Put simply

Let’s start making the quality of our code and the quality of the products we create our primary concern, not our estimations!!


More Than Just ‘Passing’ the Sprint

When running a SCRUM process you will have a product backlog with a list of items. Each of these items will be sized at a planning meeting before it is brought into a sprint iteration. All pretty standard. The point I would like to talk about is the sizing. The size covers everything: the design, the coding, the testing and whatever else you have to do. This means that if you have not tested the code, the item is not complete.

The problem

A situation arose once where a bunch of code was written to satisfy a product backlog item, but it arrived late on the last day of the sprint cycle and no manual testing was done. This was handled not by declaring the backlog item unfinished, but by carrying the testing work through to the next sprint as a new item – so it appeared the original item had been completed.

The problem is not that work was incomplete, or that work was carried forward. That happens; it’s fine. The issue is firstly with calling untested work ‘done’, and secondly with de-coupling the coding from the testing. By doing this you sacrifice collaboration between the developer and the tester – and that should not happen. Collaboration is key to agile.

What’s the motive?

In my opinion it’s down to the fear of ‘failing’. But what really fails? Ultimately, what we want to achieve is creating great software; Agile/SCRUM are enablers for this. ‘Failing’ a sprint iteration because some work is incomplete does not mean you have failed to create great software. It just means that this time round things didn’t go perfectly (when do they?), and one feature will be slightly later than expected. By measuring the success of your software on the number of times you get a green tick next to the sprint, you’re likely to fall into traps like the one in this example, where you have to fudge the principles in order to get ‘results’. Ultimately you’re sacrificing your software so that the audits read well. To me, that’s a huge loss of sight of what’s really important: the software being the best it can be.

The Solution

The SCRUM guide states:

“All incomplete Product Backlog Items are re-estimated and put back on the Product Backlog. The work done on them depreciates quickly and must be frequently re-estimated.”

So there’s your answer. It’s the same backlog item. You re-estimate it. If part of the work is done (yes, that means it’s tested), then it is likely to be smaller next time around.

To me, the solution is also not to get too hung up on ‘passing’ a sprint. If you tinker with the results just to look good in the short term, you will most likely never get a good grasp of your team’s velocity, and you will find yourself in the same situation many more times.