Monday, February 21, 2011

Testing serializability

Serialization in .NET is pretty straightforward. It is most commonly used to (de)serialize XML or to send byte streams over teh interwebs. The latter is an important part of how the Windows Communication Foundation (WCF) works - as well as its predecessor, .NET Remoting.

In order to (de)serialize a type with one of the runtime's formatters, such as the BinaryFormatter, you have to decorate it with the SerializableAttribute. This tells the runtime that it may take an object of that type and all of its data and flatten it into a series of bytes (hence the term "serialized"), which can then be written as text into an XML file or chopped into packets and sent over the wire.

To do this, the runtime must know how to handle each and every member of the class. Value types such as int, double or enums, as well as strings (a special-case reference type), are taken care of out of the box since the runtime already knows how to serialize them. Fields whose type is a custom class, however, need a little more work: those classes are expected to be decorated with the SerializableAttribute as well. This means that as soon as you start serializing complex object graphs, every one of those complex types needs to be marked with said attribute.

using System;

// must be marked serializable, even if it implements ISerializable
[Serializable]
public class MySerializableClass
{
    private double someDouble;
}

This is where stuff can get hairy. Say you have a main class that happens to be the "root" of that object graph. And someone on your team makes a modification to it, e.g. refactors some of it into new classes that are added as fields. Then those new classes must have the SerializableAttribute, otherwise you will end up getting exceptions. At runtime. No line of defense at compile time! What a mess. I would have expected Visual Studio to offer some help here, or at least have a nice feature in ReSharper to make all affected types [Serializable] recursively.

using System;

[Serializable]
public class MySerializableClass
{
    private double someDouble;
     
    // unless marked [NonSerialized], its type is expected to be serializable at runtime
    private ThatOtherSerializableClass complexObject;
}

// now this guy has to be marked serializable, too!
[Serializable]
public class ThatOtherSerializableClass
{
    private string SomeString { get; set; }
}
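
The comment in the listing above hints that a field can also be excluded from serialization. Here is a minimal sketch of that, assuming the formatter-based serialization of the .NET Framework: the NonSerializedAttribute tells the formatter to skip a field, so the field's type does not need to be marked. CacheHelper is a hypothetical class made up for this example.

```csharp
using System;

[Serializable]
public class MySerializableClass
{
    private double someDouble;

    // Skipped by the formatter; the field is null after deserialization.
    [NonSerialized]
    private CacheHelper cache;
}

// Hypothetical helper that is deliberately NOT marked [Serializable].
public class CacheHelper { }
```

This only makes sense for data that can be rebuilt after deserialization, such as caches or handles. Anything you actually need on the other side still has to be serializable.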

At some point I got fed up with debugging cryptic exceptions in log files for hours just to find that someone had made a change to that (way too complex) class and had forgotten to mark a new component serializable. In order to move the debugging closer to compile time I wrote a very simple test that is now executed for every change in the source control repository. Here it is:

using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using NUnit.Framework;

[TestFixture]
public class MySerializableClassTest
{
    [Test]
    public void Serialize_BinaryDeserialize_ThrowsNoSerializationException()
    {
        var serializer = new MySerializableClass();
        var stream = new MemoryStream();
        var formatter = new BinaryFormatter();

        try
        {
            Assert.DoesNotThrow(() => formatter.Serialize(stream, serializer),
                "Class and all of its components must be [Serializable]");
        }
        finally
        {
            // close even if test fails
            stream.Close();
        }
    }
}

What it does should be quite obvious to those of you who actually looked through it. For those who just skimmed to the explanation (boo!), here goes. MySerializableClass is some type that is marked serializable and consists of members that should be serializable as well. This is usually the root of the object graph that you will be serializing. The test uses NUnit's Assert.DoesNotThrow assertion method to make sure that serializing an instance does not cause a SerializationException (or any other exception) to be thrown.

You might notice the try/finally block in there. Yes, I am aware that handling exceptions in tests is usually a bad smell. It usually means that the test catches exceptions, which makes it less readable, more complicated to comprehend (=maintain!) and could even falsify the test by swallowing one of NUnit's own exceptions. In this case I am just going the extra mile and ensuring that the MemoryStream the object is serialized into is closed, even if the test fails. When it fails, NUnit throws an exception, and without the finally the stream would not be closed.

You can probably omit the try/finally block and the call to Close(), since those resources will be cleaned up by the runtime sooner or later. That part does hurt readability a lot. It is your call which one you prefer; I usually go for readability in tests. So, for the sake of completeness, here is a version without that try/finally - much more readable.

using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using NUnit.Framework;
 
[TestFixture]
public class MySerializableClassTest
{
    [Test]
    public void Serialize_BinaryDeserialize_ThrowsNoSerializationException()
    {
        var serializer = new MySerializableClass();
        var stream = new MemoryStream();
        var formatter = new BinaryFormatter();

        Assert.DoesNotThrow(() => formatter.Serialize(stream, serializer),
            "Class and all of its components must be [Serializable]");
    }
}
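
If you want the stream closed deterministically without paying for a try/finally, a using block is one possible middle ground. The following is only a sketch under a few assumptions: the fixture and test names are made up, MySerializableClass is reduced to a stub so the listing stands alone, and it targets the .NET Framework (where the BinaryFormatter is available) with NUnit 2.5 or later.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using NUnit.Framework;

// Stub standing in for the real root of the object graph.
[Serializable]
public class MySerializableClass
{
    private double someDouble;
}

[TestFixture]
public class MySerializableClassUsingTest
{
    [Test]
    public void Serialize_InsideUsingBlock_ThrowsNothing()
    {
        var graph = new MySerializableClass();
        var formatter = new BinaryFormatter();

        // using disposes the stream even if the assertion fails
        using (var stream = new MemoryStream())
        {
            Assert.DoesNotThrow(() => formatter.Serialize(stream, graph),
                "Class and all of its components must be [Serializable]");
        }
    }
}
```

The using block compiles down to the same try/finally, so the behavior matches the first version while the test body stays almost as short as the second.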

That's it: a simple test that can save lots of time. Now we know about the issue almost at compile time. If you run the tests before you check in code (and you should!), you will find it before any damage is done. At the very latest it should show up on your continuous integration server (assuming you have one; and you probably should!), which is still much better than at runtime.

One more thing. I consider this test to be an integration test, not a unit test. It all runs in-memory, which could classify it as a unit test - but depending on the size and complexity of the serialized graph it can be a time-consuming test (several seconds!) and involve many other classes (the members that should be marked serializable). Therefore, if you have a place (i.e. a project) dedicated to integration tests, I recommend putting it there.

For details on how serialization works in .NET, check out the ubiquitous MSDN documentation.

(Update: It's worth mentioning that this approach will only tell you if the class and its declared member types are serializable. It will not, however, cover the fact that the references in the serialized class could hold derived types at run-time. Since the SerializableAttribute is not inherited, you could still end up with run-time exceptions when the actual types are not serializable.)
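
To make the point about derived types concrete, here is a minimal sketch. The Shape and Circle classes are made up for illustration, and it assumes the .NET Framework, where Type.IsSerializable reports whether a type carries the attribute.

```csharp
using System;

// The base class carries the attribute.
[Serializable]
public class Shape { }

// The derived class does not repeat it, and the attribute is not inherited.
public class Circle : Shape { }

public static class InheritanceDemo
{
    public static void Main()
    {
        Console.WriteLine(typeof(Shape).IsSerializable);   // True
        Console.WriteLine(typeof(Circle).IsSerializable);  // False
    }
}
```

A field declared as Shape passes the test as long as it holds a plain Shape. Put a Circle in there at run-time and the formatter will refuse to serialize it.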

Monday, February 14, 2011

Just do it

Blogs became really popular several years ago. Now that the hype is over, it should be safe for me to create one without being accused of riding the hype wave. (Note to self for 2015: Have a look at that Twitter-thingy.) But where do I begin? Many topics have been building up in my head, and I have often thought to myself "this would make a neat blog post". Getting something started can be tough, a phenomenon known to have killed many aspiring projects (not only software ones) before they even began.

Starting a new project is not so different from me creating this new blog. You are full of ideas and have a pretty good picture of what you want the result to be like, but wonder where to take that very first step to get there. It can be very tempting and a lot of fun to research, plan and design for a young and innocent project. It is like a greenfield wonderland where we can dream up anything we would like to do. Go for it. Go nuts. But only a little. Otherwise you will find yourself trapped inside the infamous procrastination loop instead of getting any real work done. Timeboxing your preparations can help.


[Image: The infamous procrastination loop]

Just do it - this catchy and slightly dingy slogan is something people probably should keep in mind when starting something. To avoid what is known as "startup fatigue" it is important that you stop talking about something and actually start doing it. You will soon find that many things you were dreaming of turn out to be more complicated or even entirely different than you thought they would be. And - assuming that there is a customer somewhere upstream - it gets even worse: all your beautiful little plans will be shattered by changing requirements. It is important to discover these real problems early. You can fix things while you go.

This is what I am now doing with this blog. I am churning out a first post. That makes it far more valuable to you than the mere idea of a blog. You can actually use it, read it. My English might not be perfect, and my plans for this blog are not very detailed. Maybe it's slightly confusing, badly structured or full of typos. But bear with me. You might not like it now, but I hope to get feedback from you so I can make it better. I will tackle challenges as they come up instead of trying to predict all possible issues that might arise. And who knows, maybe some of you will find one or two useful things in it, too. If there are no readers and thus no feedback, then at least I did not waste too much time polishing it for a null-audience.

At this point you might think this sounds like a logical and reasonable way to approach a task. That's because it is. And I am surely not the first one to point that out. Some call it common sense. Others call it pragmatic. Still others call it agile. Brand it what you like, it is just my preferred way of approaching a problem. Not only for blogs, but also in software. That just happens to be what I do for a living. So that is what I intend to write about here. Smart & lazy ways of solving problems. Stuff I come across, think about, like or dislike. This was the first step, and the subsequent ones will be much easier now that I actually got started.
