Hooking into session start events in an HTTP module

According to the MSDN documentation you can't handle Session Start events in an HTTP module. This is because when you initialise an HTTP module you hook into events exposed by the HttpApplication class, and HttpApplication doesn't expose any events relating to the starting of sessions.

As a refresher, the Init method will typically look something like this:

public void Init(HttpApplication context)
{
    // Hook into the HttpApplication events you want to respond to
    context.BeginRequest += this.BeginRequest;
}

So it's not possible… right? Wrong, but only if you don't mind a slightly dirty hack.

I should probably make the standard disclaimer for this sort of thing - it works on my machine! There are no guarantees that it will work in your environment, and I would strongly recommend testing it fully!

The trick is getting access to the SessionStateModule in the Init method, and hooking into the Start event from there, like this:

public void Init(HttpApplication context)
{
    var module = context.Modules["Session"] as SessionStateModule;
    if (module != null)
    {
        module.Start += this.Session_Start;
    }
}

private void Session_Start(object sender, EventArgs e)
{
    // Respond to the session start event however you need
}

This works by relying on the fact that there's a module in the application's HttpModuleCollection called Session - a fairly safe bet unless you're really messing with the httpModules definition in the .NET Framework's web.config file. Note, however, it could break if a new version of the framework comes along and names the module differently, but it hasn't changed in any of the framework versions that have been released to date.

Article on using SqlBulkCopy with POCOs

I realised that I never pushed this on my blog at all, so belatedly I will.

Last year I wrote an article for Developer Fusion that discussed how you could make use of SqlBulkCopy to do high performance inserts when you were working with POCOs, rather than inserting them entity by entity using an ORM.

The bottom line was that inserting 10,000 records took only 57ms using the generic SqlBulkCopy approach I describe, rather than 2159ms to insert them on a record-by-record basis.
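To give a flavour of what a generic approach can involve, here is a minimal sketch of one common technique: using reflection to turn a list of POCOs into a DataTable, which is one of the inputs SqlBulkCopy accepts. This is an illustration only and may well differ from the approach taken in the article; PocoTableBuilder and SamplePerson are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

// Hypothetical POCO used purely for illustration.
public class SamplePerson
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PocoTableBuilder
{
    // Builds a DataTable with one column per public instance property of T.
    public static DataTable ToDataTable<T>(IEnumerable<T> items)
    {
        var table = new DataTable(typeof(T).Name);
        PropertyInfo[] properties =
            typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo property in properties)
        {
            // DataTable columns cannot be typed as Nullable<T>, so unwrap them.
            Type columnType =
                Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType;
            table.Columns.Add(property.Name, columnType);
        }

        foreach (T item in items)
        {
            var values = new object[properties.Length];
            for (int i = 0; i < properties.Length; i++)
            {
                // Null property values are stored as DBNull.
                values[i] = properties[i].GetValue(item, null) ?? DBNull.Value;
            }

            table.Rows.Add(values);
        }

        return table;
    }
}
```

The resulting table could then be handed to SqlBulkCopy.WriteToServer on an open connection, with DestinationTableName set to the target table.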

You can check the article out here: Using SqlBulkCopy for high performance inserts

DynaCache - just like page output caching, but for classes

Anyone who has done any serious work with any ASP.NET based framework will know that page output caching is a great feature. For those not familiar with it, the basic premise is that it makes sure that the generation of content is done only once for a set of parameters, and that all subsequent requests with the same parameters are served up from the cache for a specified period of time.

Wouldn't it be nice if you could do a similar thing for methods on classes, just by applying an attribute like this?

[CacheableMethod(30)]
public virtual string GetData(int id)

Thanks to a little library called DynaCache, you can!

How does it work?

Say you have a class called TestClass:

image

The LoadData method is marked as virtual, and has an attribute called CacheableMethod applied to it, indicating the number of seconds the results should be cached for.

The CacheableMethod attribute is the first interaction with the DynaCache framework - the second is with a class called Cacheable:

image

Calling Cacheable.CreateType<TestClass>() at runtime creates a new class called CacheableTestClass, deriving from TestClass and overriding any methods with the CacheableMethod attribute, resulting in a class hierarchy like this:

image

This new class is created using Reflection.Emit and only exists in memory, so you can't use a reflector-like program to see what's generated, but if you could it would look something like this:

public class CacheableTestClass : TestClass
{
    private IDynaCacheService cacheService;
    
    public CacheableTestClass(IDynaCacheService cacheService)
    {
        this.cacheService = cacheService;
    }

    public override string LoadData(int id)
    {
        string cacheKey = String.Format(CultureInfo.InvariantCulture, "TestClass_LoadData(Int32).{0}", id);
        object result;
        if (!this.cacheService.TryGetCachedObject(cacheKey, out result))
        {
            result = base.LoadData(id);
            this.cacheService.SetCachedObject(cacheKey, result, 200);
        }
        
        return (string)result;
    }
}

Notice the IDynaCacheService parameter in the constructor? That's the third piece of the DynaCache framework - an instance of a class capable of interacting with whatever backing cache is being used. Out of the box DynaCache includes a concrete implementation of this called MemoryCacheService - it's just a wrapper around a .NET 4 MemoryCache instance. There's no reason why you shouldn't create your own though, e.g. for the ASP.NET or Windows Azure cache.
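As a rough idea of what a custom cache service might involve, here is a minimal dictionary-backed sketch. ISketchCacheService is a stand-in whose two members are inferred from the calls in the generated code above; the real IDynaCacheService may differ, so treat the signatures as assumptions. DictionaryCacheService is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for IDynaCacheService; member signatures are assumed from the
// generated code shown earlier and may not match the real interface.
public interface ISketchCacheService
{
    bool TryGetCachedObject(string cacheKey, out object result);
    void SetCachedObject(string cacheKey, object data, int duration);
}

public class DictionaryCacheService : ISketchCacheService
{
    // Each entry stores its expiry time (UTC) alongside the cached value.
    private readonly Dictionary<string, KeyValuePair<DateTime, object>> entries =
        new Dictionary<string, KeyValuePair<DateTime, object>>();

    private readonly object syncRoot = new object();

    public bool TryGetCachedObject(string cacheKey, out object result)
    {
        lock (this.syncRoot)
        {
            KeyValuePair<DateTime, object> entry;
            if (this.entries.TryGetValue(cacheKey, out entry) && entry.Key > DateTime.UtcNow)
            {
                result = entry.Value;
                return true;
            }

            // Missing or expired: drop any stale entry and report a miss.
            this.entries.Remove(cacheKey);
            result = null;
            return false;
        }
    }

    public void SetCachedObject(string cacheKey, object data, int duration)
    {
        lock (this.syncRoot)
        {
            // The duration is treated as a number of seconds.
            DateTime expiry = DateTime.UtcNow.AddSeconds(duration);
            this.entries[cacheKey] = new KeyValuePair<DateTime, object>(expiry, data);
        }
    }
}
```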

Making it simple with dependency injection

What this all means is that at runtime you need to be using CacheableTestClass rather than TestClass, otherwise all the generated caching code will never be used. Although it's possible to construct and use these types yourself, the simplest and best way to do that is to use a dependency injection framework, such as Ninject, StructureMap, etc.

For the sake of illustration, I'm going to use Ninject. The original configuration would have simply mapped ITestClass to TestClass, like this:

kernel.Bind<ITestClass>().To<TestClass>();

Using DynaCache is only marginally more complicated - you just configure your kernel like this:

kernel.Bind<IDynaCacheService>().To<MemoryCacheService>();
kernel.Bind<ITestClass>().To(Cacheable.CreateType<TestClass>());

The first line configures the cache service to pass to instances of CacheableTestClass, whilst the second binds ITestClass to the cacheable version of TestClass.

That's all there is to it! Now every time an instance of ITestClass is required, Ninject will construct and return an instance of CacheableTestClass - the rest of your code that consumes ITestClass will automatically make use of the dynamically constructed caching code.

Where to get DynaCache

You can get it from the CodePlex project site, or you can install it into your project using the NuGet command:

Install-Package DynaCache

Summary

Hopefully all the detail hasn't put you off - this whole article essentially boils down to just these three steps:

  1. Make the methods overridable and apply the CacheableMethod attribute to them.
  2. Configure your DI framework to return an instance of IDynaCacheService, either MemoryCacheService or one you have implemented yourself.
  3. Configure your DI framework to return the result of Cacheable.CreateType for your type.

I'd be really interested in feedback for DynaCache - let me know your thoughts in comments below, or on the project discussion board.

Tutorial: Using LIFTI in an MVC 3 web application

Updated 25/02/2012 - it was highlighted that some of the search phrases used towards the end of this article were not returning the expected results. That was down to me making assumptions about which words would be stemmed; these examples have been updated.

This tutorial will take you through an end-to-end implementation of a simple web site to manage a list of employees. LIFTI will be used to perform full text searching of plain text information associated with the employees.

Whilst the site will be built upon MVC 3, Entity Framework Code First and Ninject, the use of these technologies is largely arbitrary - you should be able to switch them out for any other appropriate frameworks.

Important: This tutorial relies on you having NuGet installed - trust me, it makes the process of setting up your project dependencies so much easier.

Getting started

To get started, create a new empty MVC 3 application so you have a basic structure for the web application to be built from.

image

image

Now use NuGet to add LIFTI - open the Package Manager Console (View/Other Windows/Package Manager Console) and type:

Install-Package LIFTI

The LIFTI assembly will be downloaded and added as a reference to your project.

Install the EntityFramework.SqlServerCompact and Ninject.MVC3 packages using the package manager:

Install-Package EntityFramework.SqlServerCompact
Install-Package Ninject.MVC3

Adding these packages will automatically pull through all these packages (some of them are the dependencies of the two you added):

  • EntityFramework
  • WebActivator
  • SqlServerCompact
  • EntityFramework.SqlServerCompact
  • Ninject
  • Ninject.MVC3

Both Ninject.MVC3 and EntityFramework.SqlServerCompact packages add code files into a folder called App_Start. The SqlServerCompact class configures the default connection factory for the entity framework library to use the SQL Server Compact connection factory; the NinjectMVC3 class allows you to configure your dependency injection - you'll come onto that later.

Create a model and data context

Add Employee.cs to the Models folder of the project containing:

public class Employee
{
    [Key]
    public int EmployeeId { get; set; }

    [StringLength(100), Required]
    public string Name { get; set; }

    [Required]
    public DateTime DateOfEmployment { get; set; }

    public string Notes { get; set; } 
}

Add EmployeeDataContext.cs to the Models folder:

public class EmployeeDataContext : DbContext
{
    public DbSet<Employee> Employees
    {
        get;
        set;
    }
}

Scaffold out the site

Make sure that you have built the project, then right-click on the Controllers folder and select Add Controller…:

  • Name the controller EmployeesController
  • Make sure the template "Controller with read/write actions and views…" is selected
  • Select Employee as the model
  • Select EmployeeDataContext as the data context
  • Press Add

The scaffolded actions and views will be created for you (very handy in tutorials like this!).

Before you run the project, make sure you have added the App_Data ASP.NET folder to the project by right clicking on the project and selecting Add > Add ASP.NET Folder > App_Data.

At this point you should be able to run the project to make sure that everything you've done so far is correct. You should be able to fire up the project and navigate to http://localhost:xxxx/Employees where xxxx is the port number for your project:

image

Ok, so it doesn't look very sexy, but it's a website with a SQL Server Compact database sitting behind it, all up and running in just a few steps.

Introducing LIFTI

The full text index that you will be using is an updatable full text index that will contain the IDs of employees indexed against the text in their notes. More specifically, this will be an instance of PersistedFullTextIndex<int> - this type of index is backed by a file store, which means that if the web application is stopped and started, the index will not lose all its data and will be able to pick up where it left off. Significantly this means that you will be able to keep the index and database in sync, assuming that whenever the database is updated you also update the index.

Under most circumstances there should only ever be one instance of your index in memory. This means that all of your code, whatever thread it is on, should interact with the same index (don't worry, LIFTI's implementation of the full text index is thread safe). You could implement this in any number of ways:

  • Using the singleton pattern
  • Storing the index in a static variable somewhere that can be accessed by the depending code
  • Storing the index in Application state
  • Using dependency injection to provide one common index to any depending code

Like all good developers these days, I'm sure you'll want to take the dependency injection route! The first step to this is to add your index into the dependency injection framework.

Open App_Start\NinjectMVC3.cs and change the RegisterServices method so it looks like this:

private static void RegisterServices(IKernel kernel)
{
    string filePath = Path.Combine(
        (string)AppDomain.CurrentDomain.GetData("DataDirectory"), 
        "Index.dat");

    kernel.Bind<IUpdatableFullTextIndex<int>>()
        .To<PersistedFullTextIndex<int>>()
        .InSingletonScope()
        .WithConstructorArgument("backingFilePath", filePath)
        .OnActivation((IUpdatableFullTextIndex<int> i) =>
        {
            i.WordSplitter = new StemmingWordSplitter();
            i.QueryParser = new LiftiQueryParser();
        });
}

The filePath variable is configured so that the index data file (Index.dat) will be stored in the App_Data folder for the application. (That's where "DataDirectory" points by default in a web application.)

Then the Ninject kernel is instructed that:

  • Whenever an instance of IUpdatableFullTextIndex<int> is requested (Bind)
  • Map it to an instance of PersistedFullTextIndex<int> (To)
  • And re-use it globally, i.e. only ever create one instance (InSingletonScope)
  • When it is constructed, pass the path to the index data to the "backingFilePath" parameter (WithConstructorArgument)
  • And finally, when it is activated, set the WordSplitter and QueryParser properties to instances of StemmingWordSplitter and LiftiQueryParser, respectively. (OnActivation)

Now you just have to consume the index and use it in the controller.

Updating the index

Open the EmployeesController class and add a constructor:

private IUpdatableFullTextIndex<int> index;
public EmployeesController(IUpdatableFullTextIndex<int> index)
{
    this.index = index;
}

Ninject (in conjunction with the nice dependency resolution in MVC 3) will take care of providing your controller with the relevant instance of the index whenever it is constructed.

There are 3 places in the controller that you need to interact with the index to keep it in sync with the database:

  • Creating an employee - the index should be updated if notes are provided for the employee
  • Updating an employee - the index should be updated if the employee has notes, or have any indexed text removed if the employee no longer has any notes
  • Deleting an employee - any previously indexed notes for the employee should be removed

Update the HttpPost Create method so it looks like this:

[HttpPost]
public ActionResult Create(Employee employee)
{
    if (ModelState.IsValid)
    {
        db.Employees.Add(employee);
        db.SaveChanges();

        if (!String.IsNullOrEmpty(employee.Notes))
        {
            this.index.Index(employee.EmployeeId, employee.Notes);
        }

        return RedirectToAction("Index");  
    }

    return View(employee);
}

Then the HttpPost Edit method:

[HttpPost]
public ActionResult Edit(Employee employee)
{
    if (ModelState.IsValid)
    {
        db.Entry(employee).State = EntityState.Modified;
        db.SaveChanges();

        if (String.IsNullOrEmpty(employee.Notes))
        {
            this.index.Remove(employee.EmployeeId);
        }
        else
        {
            this.index.Index(employee.EmployeeId, employee.Notes);
        }

        return RedirectToAction("Index");
    }
    return View(employee);
}

And finally, the HttpPost Delete method:

[HttpPost, ActionName("Delete")]
public ActionResult DeleteConfirmed(int id)
{            
    Employee employee = db.Employees.Find(id);
    db.Employees.Remove(employee);
    db.SaveChanges();

    this.index.Remove(employee.EmployeeId);

    return RedirectToAction("Index");
}

Try the application out again - you should be able to create, update and delete employees without any problems, and the full text index will be built up in the background.

Searching for employees

The last step is to allow users of your site to search for interesting text within the employee notes.

Update the Views\Employees\index.cshtml file with a search textbox just after the <h2> tag:

<h2>Index</h2>

@using (Html.BeginForm()) {
<p>
    @Html.Label("searchCriteria", "Search employee notes:")
    @Html.TextBox("searchCriteria") 
    <input type="submit" value="Search" />
</p>
}

Now add a new method to the EmployeesController to handle the posting of the search text:

[HttpPost]
public ViewResult Index(string searchCriteria)
{
    if (String.IsNullOrEmpty(searchCriteria))
    {
        return Index();
    }

    var matchingIds = this.index.Search(searchCriteria).ToArray();
    var employees = this.db.Employees
        .Where(e => matchingIds.Contains(e.EmployeeId))
        .ToList();

    return View(employees);
}

The astute amongst you will notice that the data context isn't injected in the same way that the index is - good spot. This tutorial is long enough, so I'll leave that as an exercise for you to fix up if it's bugging you that much!

Testing the application

Build and run the project, then create these employees:

Name  | Date of employment | Notes
------|--------------------|------
Ralph | 12/08/2008 | Often arriving late to work and frequently takes long lunch breaks
Tracy | 02/02/2010 | New employee, very diligent worker, and no-one is doubting their commitment, but sometimes acts suspicious when asked about the amount of sick leave taken
Bob   | 23/11/2003 | Works long hours. Arrives early to work and generally stays later than others and works through lunch. Has a tendency to break the build with a high frequency though.
Andy  | 10/08/2007 | Very clean desk - there are doubts that he doesn't actually do anything at work


Try some of these search criteria on the index page:

doubts

Andy is obviously matched - in his notes he has "doubts" specified exactly. However, there's more going on here because Tracy is also matched. This is because you're using the stemming word splitter to process words in the index, and that automatically removes some word suffixes, such as the "s" from "doubts" and the "ing" from Tracy's "doubting". This means that when you searched for "doubts" you were actually searching for anything that stemmed to "doubt".
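To make the idea concrete, here is a deliberately naive suffix-stripping sketch. It is not LIFTI's stemmer, which is far more thorough; ToyStemmer is a hypothetical class that only shows how two different words can reduce to the same indexed form.

```csharp
using System;

public static class ToyStemmer
{
    // Strips a trailing "ing" or a trailing single "s" - nothing more.
    public static string Stem(string word)
    {
        string lower = word.ToLowerInvariant();

        if (lower.Length > 5 && lower.EndsWith("ing", StringComparison.Ordinal))
        {
            return lower.Substring(0, lower.Length - 3);
        }

        if (lower.Length > 3
            && lower.EndsWith("s", StringComparison.Ordinal)
            && !lower.EndsWith("ss", StringComparison.Ordinal))
        {
            return lower.Substring(0, lower.Length - 1);
        }

        return lower;
    }
}
```

With this toy version both "doubts" and "doubting" come out as "doubt", which is why a search for one finds notes containing the other.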

emp

Nothing will come back - not even Tracy. This is because the default behaviour for the LIFTI query parser is to match words exactly.

emp*

Tracy will be returned - she is the only employee with notes containing a word that starts with "emp".

lunch & break (or lunch break)

Both Ralph and Bob will be returned - both of these contain derivatives of "lunch" and "break" in their notes.

"lunch break"

Only Ralph contains a phrase that contains "lunch" followed immediately by a derivative of "break".

… your search criteria here

The LIFTI search engine is quite powerful and there are loads of different search permutations you could try out, so try creating a few more employees and searching using some of the other operators, such as or (|) or near (~).

Entity Framework Code First: "The path is not valid. Check the directory for the database"

If you create a new MVC application and try to go down the Code First with SQL Server Compact route, you might encounter this error when you first start the application:

The path is not valid. Check the directory for the database. [ Path = …\WebSample\App_Data\YourDatabase.sdf ]

This error might be a surprise because by default the code first approach should create the database for you if it doesn't already exist.

Whilst this is true, what it apparently won't do is create any parent folders that the database will be stored under - in this case the App_Data folder. Adding this folder is easily accomplished by right-clicking on your project and selecting Add/ASP.NET Folder/App_Data. Once you've done this, everything should work as expected.

As an aside, the reason it's getting created in the App_Data folder is because that's the default location for "DataDirectory" in a web application. You might have seen this mentioned in sample connection strings like this:

<connectionStrings>
    <add name="MyDataContext" 
        providerName="System.Data.SqlServerCe.4.0" 
        connectionString="Data Source=|DataDirectory|MyDatabase.sdf" />
</connectionStrings>

You might get this error even if you're not explicitly specifying the connection string, as a very similar connection string will be used by convention if you don't provide one.

LIFTI XmlWordSplitter

The XmlWordSplitter is a new word splitter class in the latest release of LIFTI. I created it mainly because it was required for the persisted index sample, but it seemed too useful to keep out of the core framework.

At a very high level the XmlWordSplitter just enumerates words contained within elements in a piece of XML text. This means that element names, attributes and their associated values will not be indexed. For example, consider the following XML:

image

The xml splitter will return the following words:

Word   | Word index
-------|-----------
THE    | 0, 6
QUICK  | 1
BROWN  | 2
FOX    | 3
JUMPED | 4
OVER   | 5
LAZY   | 7
DOG    | 8

Importantly, notice that the word "the" is reported at positions 0 and 6 - the word index is relative to the first word in the document, regardless of whether there are XML elements that interrupt the flow of the text.
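The continuous numbering can be sketched with LINQ to XML. This is not the XmlWordSplitter implementation, just an illustration of the behaviour described; XmlWordIndexSketch is a hypothetical name, and words are split on whitespace only and upper-cased to match the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlWordIndexSketch
{
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    // Returns each word found in text nodes along with the positions it occurs at.
    // Element names and attributes are never visited, so they are not indexed.
    public static Dictionary<string, List<int>> Split(string xml)
    {
        var result = new Dictionary<string, List<int>>();
        int position = 0;

        foreach (XText textNode in XDocument.Parse(xml).DescendantNodes().OfType<XText>())
        {
            string[] words = textNode.Value.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                string key = word.ToUpperInvariant();
                List<int> positions;
                if (!result.TryGetValue(key, out positions))
                {
                    positions = new List<int>();
                    result[key] = positions;
                }

                // The counter carries on across element boundaries.
                positions.Add(position++);
            }
        }

        return result;
    }
}
```

Feeding it a made-up document such as `<doc><p>The quick <b>brown</b> fox</p><p>jumped over the lazy dog</p></doc>` gives the same words and positions as the table above.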

To stem or not to stem?

One question that sprang to mind when developing this was whether the splitter should stem the words it returned, like the StemmingWordSplitter, or just return them verbatim, like the basic WordSplitter does. (The above example would be representative of the latter; a stemming word splitter would have returned words like "jump" instead of "jumped".)

Taking this question a step further, what if someone has put together their own custom word splitter and they want the xml splitter to behave like that?

To cater for this I decided to defer the splitting of text contained within XML nodes to a child IWordSplitter implementation. So when you construct an XmlWordSplitter, you do so like this:

var wordSplitter = new StemmingWordSplitter();
var xmlSplitter = new XmlWordSplitter(wordSplitter);

So if you don't want the stemming word splitter behaviour for returned words, you just need to swap it out for a different implementation. Neat.

Splitting search words

Previously LIFTI would always use the same word splitter implementation when splitting words that were being indexed and words that were being searched upon. Introducing the XmlWordSplitter had an interesting side-effect - although you were wanting to index text contained in XML, you probably didn't want to search for words contained in an XML format.

To handle this I added the SearchWordSplitter property to the IFullTextIndex interface - this allows you to specify a different word splitting implementation that should be used when splitting words in a search string. As a small token of my respect to backwards compatibility, if this property isn't specified or is set to null, then the splitter specified in the WordSplitter property is used, meaning that behaviour is unaffected for existing code.
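The fallback can be pictured with a small sketch. The types here are stand-ins rather than LIFTI's own, and whether LIFTI applies the fallback in the property getter or at search time is an assumption on my part; only the observable behaviour is taken from the description above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for LIFTI's IWordSplitter.
public interface ISketchWordSplitter
{
    IEnumerable<string> Split(string text);
}

public class WhitespaceSplitter : ISketchWordSplitter
{
    public IEnumerable<string> Split(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}

// Stand-in for an index exposing both splitter properties.
public class SketchIndex
{
    private ISketchWordSplitter searchWordSplitter;

    public ISketchWordSplitter WordSplitter { get; set; }

    public ISketchWordSplitter SearchWordSplitter
    {
        // Falls back to the indexing splitter when nothing specific was set.
        get { return this.searchWordSplitter ?? this.WordSplitter; }
        set { this.searchWordSplitter = value; }
    }
}
```

Existing code that only ever sets WordSplitter keeps using that splitter for searches, so nothing changes for it.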

Changes to the LIFTI API

This post relates to the breaking changes between version 0.4 and 0.5 of LIFTI. LIFTI is a full-text indexing library for .NET - find out more on its CodePlex site.

The latest release of LIFTI has several breaking changes that will affect you, so I wanted to take some time to explain not only what they are, but why I made them. I'll start with the what, then move onto the why.

No more constructor delegates

Previously when you wanted to construct a full text index, updatable or otherwise, you would have had to provide a delegate that was capable of taking the type of item in the index and returning the text it should be indexed against, like this:

var index = new FullTextIndex<Customer>(c => c.Name);

This is no longer the case - constructing an index is now as simple as it can be:

var index = new FullTextIndex<Customer>();

If you have been using LIFTI you'll probably be aware that the reason the constructor used to take a delegate was to provide some of the Index methods with the text an item should be indexed against; this leads me nicely onto the next change.

A rationalised set of Index methods

There used to be 5 Index methods - these have been reduced to 4, which can be broken into two categories, indexing keys and indexing arbitrary classes.

Indexing "keys" rather than "items"

It's often useful (and when it comes to using a persisted index, just plain sensible) to store just a key to an item in the index. That is to say, rather than storing instances of a "Movie" class in the index, you store just the movie ids, e.g. integers. There are two Index methods that support this:

// Explicitly pass the key and text values
index.Index(movie.MovieId, movie.Description);
index.Index(movieId, description);

// Or you can index an enumerable of keys like this
var ids = new[] { 23, 44, 192 };
index.Index(ids, i => LoadDescriptionForMovie(i));

Indexing arbitrary classes in the index

There are two Index methods on the IFullTextIndex interface that you can use to index instances of Movie directly, should you so wish:

// Pass the movie instance and use delegates to extract the 
// relevant information
var movie = new Movie { MovieId = 1, Description = "Best movie ever!" };
index.Index(movie, m => m.MovieId, m => m.Description);

// The equivalent "index many":
var movies = new[] 
{
    new Movie { MovieId = 1, Description = "Best movie ever!" },
    new Movie { MovieId = 2, Description = "Worst movie ever!" }
};
index.Index(movies, m => m.MovieId, m => m.Description);

No more Reindex methods

I'm hoping you'll not miss them though. Instead, when you're using an updatable index, either UpdatableFullTextIndex or PersistedFullTextIndex, calls to any of the Index methods will automatically remove the item from the index prior to indexing, if it's already there.
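The "remove, then index" behaviour is easy to picture with a toy inverted index. ToyUpdatableIndex is a hypothetical class and bears no relation to LIFTI's internals; it only demonstrates the semantics described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ToyUpdatableIndex
{
    private readonly Dictionary<string, HashSet<int>> keysByWord =
        new Dictionary<string, HashSet<int>>();

    private readonly Dictionary<int, string[]> wordsByKey =
        new Dictionary<int, string[]>();

    public void Index(int key, string text)
    {
        // Any previously indexed text for this key is removed first,
        // so callers never need a separate "reindex" call.
        this.Remove(key);

        string[] words = text.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        this.wordsByKey[key] = words;

        foreach (string word in words)
        {
            HashSet<int> keys;
            if (!this.keysByWord.TryGetValue(word, out keys))
            {
                keys = new HashSet<int>();
                this.keysByWord[word] = keys;
            }

            keys.Add(key);
        }
    }

    public bool Remove(int key)
    {
        string[] words;
        if (!this.wordsByKey.TryGetValue(key, out words))
        {
            return false;
        }

        foreach (string word in words)
        {
            this.keysByWord[word].Remove(key);
        }

        this.wordsByKey.Remove(key);
        return true;
    }

    public List<int> Search(string word)
    {
        HashSet<int> keys;
        return this.keysByWord.TryGetValue(word.ToLowerInvariant(), out keys)
            ? keys.ToList()
            : new List<int>();
    }
}
```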

Serialization namespace is gone

That's right, gone, along with all the serialization classes in it. If you were using it, I really am sorry, but there were good reasons to remove it, and I think there are much better alternatives now.

Ok, so why?

First up, I'll deal with the changes to the constructor and the Index methods, because they are both closely related.

Constructors and Indexing

While I was writing the original serialization code (even before the persisted full text index work) it became apparent that under most circumstances it was going to be best to store a simple value type (e.g. int) in the index, rather than an arbitrary class (e.g. Customer). The reason for this was twofold:

  1. Primitive types are just a lot easier to serialize - most of the time LIFTI can handle primitive types without any configuration.
  2. If you're serializing the full text index somewhere, the chances are the classes are going to be persisted somewhere else, probably a database of some sort. Persisting the classes in the serialized index is a bad case of data duplication and things will definitely get out of sync sooner or later.

Although it's possible, storing a simple id in the index doesn't lend itself naturally to using a delegate to read out the related text. It would usually mean having to call out to another method, like this:

var index = new FullTextIndex<int>(i => GetTextForCustomerId(i));

Ok, you could write it like this, but it feels even more odd:

var index = new FullTextIndex<int>(GetTextForCustomerId);

Indeed, sometimes it may not even be possible to write a delegate ahead of time to return the text for an id - you might just have access to an id and a piece of text at the time it comes to perform the indexing.

Taking all this into account, hopefully it's fairly clear why I decided to remove the delegate from the constructor and, because some of the Index methods relied on there being a pre-defined way of getting hold of the text for a key value, why it was necessary to rationalise the Index methods.

But couldn't I just have added a separate overload for the constructor, or allow the delegate to be null? Well, yes, naturally I could, and I tried it for a while, but I felt that it made the API a bit more confusing. Some of the Index methods had to throw exceptions at runtime if no delegate was provided upon construction - not very nice at all. At least this way you know where you stand - each and every Index method must be given enough information to identify a key value and its associated text.

Getting rid of the Reindex methods

I did this primarily because I was a little uncomfortable just throwing an exception if an Index method was called and the key already existed in the index - in some circumstances it felt like this was the wrong behaviour. Realistically, all this does is save users of the API from needing to check whether an item exists and adjust their behaviour depending on the result.

Getting rid of old Serialization classes

The decision to do this was influenced by a couple of factors. The first was the fact that I had just implemented the persisted index, which covers exactly what the old serialization code did with the added benefit that you don't need to remember to serialize the index when your application exits and you don't have to manually deserialize it when the application starts. The deserialization point is particularly interesting; the old serialization process required that the entire index was loaded into memory before it could be used - the persisted index lazy loads parts of the index as it is accessed, which for large indexes can make it available for use in a much shorter space of time. I'll cover more on this lazy loading in a later post.

Another factor was that in order to support the old serialization process I had to expose elements of the full text index class that I wasn't really happy exposing. For example in 0.4 the RootNode property of the IFullTextIndex interface had a setter - allowing the root node to be changed this way was a bit scary and had big implications for the persisted index implementation. There were probably other ways around this, but this combined with the first point made the decision fairly easy.

Wrap up

This is the first time I have had to make significant breaking changes to the API - I'm not going to promise it's the last, but I think it is approaching something that resembles a stable state.

Some of these changes may be controversial, and I'm sure that other people will have differing opinions on how it should have been done. I'd be really interested to hear if the changes have been significantly problematic for you - either leave me a comment here, or start a thread on the discussions board. I want to hear all feedback, positive or negative!

Debugging "The agent process was stopped while the test was running"

I've encountered the unhelpful "The agent process was stopped while the test was running" MSTest error result a couple of times recently, and I thought that I'd share a few of my findings and approaches to debugging the unit tests that cause them.

The MSTest "Error" Result

In my experience, the Error result usually manifests itself for one of two reasons:

  • A critical error occurred during the execution of the test. This might be due to something caused by the code executed by the test, for example a stack overflow error, but it may also occur when the test framework is unable to execute the test for some reason, e.g. it is unable to copy dependent files.
  • A test caused some code to be executed on another thread, and an exception occurred on that thread, not the main test one.

The first of these is easy to overcome, as I'll show later; the second, however, can under some circumstances be much trickier to diagnose.

Viewing the Test run error information

The error results are usually presented in the Test Results window like this:

image

Side note: you can see that I've grouped the tests by their result - this is my preferred approach to viewing my unit tests in this window.

When the test results contain an error, you can get more details by clicking on the "Test run error" hyperlink. This shows you the error details for each of the tests that ended up with the Error result, and may well provide you with enough information to solve your problem:

image

Here you can see that an error on the background thread caused the test to fail. If you're lucky, this window should give you enough information to fix your problem.

Background threads

Something that's important to consider is that if your code makes use of background threads that live on beyond the lifetime of the test, an exception raised on this thread can cause the error to be reported against a subsequent test!

Take these sample tests:

[TestMethod]
public void BadBackgroundTestCausingDelayedError()
{
    // Create another thread and execute it - this test will pass,
    // but the exception raised by the thread will cause ANOTHER test to fail.
    var thread = new Thread(() =>
        {
            DoSomething();
        });

    thread.Start();
}

[TestMethod]
public void GoodTest1()
{
    // Pretend to do some work
    Thread.Sleep(2000);
}

private void DoSomething()
{
    throw new Exception("I'm sorry Dave, you can't do that.");
}

When executed in order, you'll see this in the test results:

image

Uh, oh. The test that did nothing wrong was blamed for the failure of the bad one. Looking at the Test run error report will help a little in this case - the call stack looks like this:

One of the background threads threw exception: 
System.Exception: I'm sorry Dave, you can't do that.
   at TestProject1.UnitTest1.DoSomething() in UnitTest1.cs:line 56
   at TestProject1.UnitTest1.<BadBackgroundTestCausingDelayedError>b__0() in UnitTest1.cs:line 20
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.runTryCode(Object userData)
   at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

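If you look carefully, you can spot a hint of the test that the exception originated from: the second frame of the call stack is the compiler-generated method for the lambda, <BadBackgroundTestCausingDelayedError>b__0, and its name includes the name of the test that started the thread.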
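If the test owns the thread, one way to stop the failure drifting onto a later test is to wait for the thread and surface its exception on the test thread. The following is a minimal sketch of that idea, not something MSTest provides: the BackgroundWork class and RunAndWait method are hypothetical names of my own, and it only helps when the test itself creates the thread.

```csharp
using System;
using System.Threading;

public static class BackgroundWork
{
    // Runs the action on a new thread, waits for it to finish, and rethrows
    // any exception it raised on the calling (test) thread.
    public static void RunAndWait(Action action)
    {
        Exception captured = null;

        var thread = new Thread(() =>
        {
            try
            {
                action();
            }
            catch (Exception ex)
            {
                // Capture the exception instead of letting it escape the thread.
                captured = ex;
            }
        });

        thread.Start();
        thread.Join(); // The thread no longer outlives the caller.

        if (captured != null)
        {
            // Wrap the original so its stack trace survives as InnerException.
            throw new InvalidOperationException(
                "Background thread failed: " + captured.Message, captured);
        }
    }
}
```

Calling BackgroundWork.RunAndWait(DoSomething) from BadBackgroundTestCausingDelayedError should make that test fail with the real exception, rather than leaving GoodTest1 to take the blame.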

No call stack in the results window

You can't always infer the test at the root of the problem from this call stack. Indeed, there are times when you don't get any stack trace information - like this:

image

When this happens, I've found that it's usually because the exception thrown from the other thread is a custom exception, and the test framework is unable to deserialize it into the test AppDomain. If you dig around in the debug output you might find exception details that indicate this:

E, 1020, 17, 2011/04/25, 22:34:14.366, MGLAPTOP\QTAgent32.exe, 
    Unhandled Exception Caught, reporting through Watson: 
    System.Runtime.Serialization.SerializationException: 
    Unable to find assembly 'TestProject1, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.
   at System.Runtime.Serialization.Formatters.Binary.BinaryAssemblyInfo.GetAssembly()
   ...
   at System.Runtime.Remoting.Channels.CrossAppDomainSerializer.DeserializeObject(MemoryStream stm)
   at System.AppDomain.Deserialize(Byte[] blob)
   at System.AppDomain.UnmarshalObject(Byte[] blob)
The program '[1020] QTAgent32.exe: Program Trace' has exited with code 0 (0x0).
The program '[1020] QTAgent32.exe: Managed (v4.0.30319)' has exited with code -2 (0xfffffffe).
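When the results window gives you nothing, another option is to record unhandled exceptions yourself before the process goes down. The sketch below hooks AppDomain.CurrentDomain.UnhandledException and appends the exception text to a file. The UnhandledExceptionLog class and the log path are hypothetical, and I'm assuming the handler gets registered in the same AppDomain that the tests run in (for example from an assembly-level initialize method).

```csharp
using System;
using System.IO;

public static class UnhandledExceptionLog
{
    // Formats the exception object as plain text, including its stack trace.
    public static string Describe(object exceptionObject)
    {
        return "Unhandled exception at " + DateTime.Now.ToString("o")
            + Environment.NewLine + exceptionObject + Environment.NewLine;
    }

    // Registers a handler that appends every unhandled exception to logPath.
    public static void Install(string logPath)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            File.AppendAllText(logPath, Describe(e.ExceptionObject));
        };
    }
}
```

Because the text is written as a plain string inside the process that threw, it doesn't depend on the exception type being deserialized anywhere else.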

Debugging the tests

If you still haven't got to the bottom of your test Error, then you can always debug the tests. Before you start debugging, make sure that you're catching all exceptions from the Debug/Exceptions menu (or Ctrl-Alt-E):

image

Hopefully these tips will help you identify the cause of your unit test errors - good luck!

Using Windows Live Mesh to synchronize your draft blog posts

I use Windows Live Writer to put together all my blog posts, but I also use multiple computers. I'll sometimes find myself writing a draft post on one computer and wanting to carry on editing it on another.

Live Writer stores draft posts in Documents\My Weblog Posts\Drafts, and in the past I've just manually copied these draft files from one computer to another. It recently occurred to me, however, that I could use Windows Live Mesh to automate this. I'll show you how you can do it too.

Windows Live Mesh is part of Windows Live Essentials 2011 - you might not have Live Mesh if you're still using an older version of Essentials, or you explicitly chose not to install it as part of the latest version.

First, open Windows Live Mesh. If it's already running, then you'll see its icon in the system tray ( image), otherwise launch it from the Start orb/button. Click Sync a folder:

image

Browse to Documents\My Weblog Posts and press Sync. Selecting the My Weblog Posts folder will synchronize both drafts and recent post entries.

image

All you have to do then is select the machines that you want to synchronize to:

image

If you want, you can even synchronize it to your SkyDrive, but I'm not bothering because having my draft posts backed up across 3 different computers seems like enough redundancy for me.

Once you've set Mesh up like this, any new posts that you create through Live Writer will automatically be synchronized and you can carry on working on your next big blog post from whichever machine you end up at.

Describing the LIFTI persistence file format

This post will break down how the data in a LIFTI persisted full text index is structured on disk. It might be a bit dry for some, so I've tried to spice it up as much as possible with pretty pictures!

Bird's-eye view

image

The file is broken up into two areas, headers and page data. Each data page is 8KB in length and contains 1 or more entries. (The only time you might find a page that is not marked as unused and still contains 0 entries is when the index is completely empty.)

There are two types of data pages, item index pages and index node pages.

Item index pages contain entries that describe the actual items (keys) indexed within the full text index. Each item is allocated its own unique ID, an Int32, which is used to refer to the item elsewhere in the file.

Index node pages contain information about the nodes in the n-ary tree. As with the entries in the item index pages, each node is allocated its own unique ID.

File headers

image

The index header starts with a known array of 6 bytes followed by the version number of the persisted index file format. This section is used to verify that the file is indeed a LIFTI data file, and that the current assembly is capable of reading it.
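As a rough illustration of that check, here is a minimal sketch of reading the index header. It is not LIFTI's actual code: the IndexHeaderReader class is a hypothetical name, the marker bytes are passed in because their real values aren't given here, and I'm assuming the version number is stored as a little-endian Int32.

```csharp
using System;
using System.IO;

public static class IndexHeaderReader
{
    // Verifies the marker bytes and returns the file format version.
    // expectedMarker: the known byte sequence at the start of the file.
    // maxSupportedVersion: the newest format version this code understands.
    public static int ReadVersion(Stream stream, byte[] expectedMarker, int maxSupportedVersion)
    {
        var reader = new BinaryReader(stream);

        byte[] marker = reader.ReadBytes(expectedMarker.Length);
        if (marker.Length != expectedMarker.Length)
        {
            throw new InvalidDataException("The file is too short to contain the index header.");
        }

        for (int i = 0; i < marker.Length; i++)
        {
            if (marker[i] != expectedMarker[i])
            {
                throw new InvalidDataException("The file does not start with the expected marker bytes.");
            }
        }

        // Assumption: the version is an Int32 written straight after the marker.
        int version = reader.ReadInt32();
        if (version > maxSupportedVersion)
        {
            throw new NotSupportedException("Unsupported file format version: " + version);
        }

        return version;
    }
}
```

Failing fast on either check means the rest of the reader never has to cope with a file it can't understand.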

The page manager header contains the total number of data pages that the file currently contains (including any unused pages) and pointers to the first index node data page, and the first item data page.

Also contained here are the next sequential IDs that will be allocated to new entries contained within the different varieties of data pages.

Data page headers

Both data page types start with a header, structured in the following way:

image

A byte indicating the type of page is followed by the next and previous page numbers. From this you might correctly infer that data pages are effectively doubly-linked lists, i.e. you can traverse them forwards and backwards. It's important to note that, because of the way that pages are allocated (or split, if you want to use a SQL Server term), the logical order of pages can differ significantly from their physical order.

By way of a trivial example, the pages may be physically ordered like this:

image

But are logically ordered like this:

image

The data page header also contains the internal IDs of the first and last entries in the page. Entries are stored in ascending order of their internal IDs throughout the data pages, so being able to reference the first and last entry IDs in this way allows a page containing a specific ID to be located quickly by performing a binary search, without loading the data in the page's body.
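To make the lookup concrete, here is a minimal sketch of a binary search over page headers that have already been arranged in their logical order. The PageRange and PageLocator types are hypothetical and only carry the fields the search needs; how LIFTI actually holds the headers in memory isn't covered here.

```csharp
using System;
using System.Collections.Generic;

// The subset of a data page header needed to locate an entry.
public struct PageRange
{
    public int PageNumber;   // Physical page number within the file
    public int FirstEntryId; // Internal ID of the first entry in the page
    public int LastEntryId;  // Internal ID of the last entry in the page
}

public static class PageLocator
{
    // headers must be in logical order, i.e. ascending by entry ID.
    // Returns the physical number of the page whose ID range covers entryId,
    // or -1 if no page covers it.
    public static int FindPage(IList<PageRange> headers, int entryId)
    {
        int low = 0;
        int high = headers.Count - 1;

        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            PageRange page = headers[mid];

            if (entryId < page.FirstEntryId)
            {
                high = mid - 1;
            }
            else if (entryId > page.LastEntryId)
            {
                low = mid + 1;
            }
            else
            {
                return page.PageNumber;
            }
        }

        return -1;
    }
}
```

Only the header values are compared, so the page bodies never need to be read during the search. Note that the position in the list (logical order) and the PageNumber it returns (physical order) are deliberately separate things.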

Finally, the data page header describes the number of entries in the page, and the current size of the page, including the data in the page header. The size of the page will always be less than or equal to 8KB. Any unused space in a data page will be in an unknown state and not necessarily zeroed out.
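Pulling the header fields together, a reader for a data page header might look something like the sketch below. Treat the layout as an assumption: the field order follows the description above, and every field after the type byte is assumed to be an Int32, which may not match the real file. The DataPageHeader and DataPageHeaderReader names are hypothetical.

```csharp
using System;
using System.IO;

public struct DataPageHeader
{
    public byte PageType;    // Item index page or index node page
    public int NextPage;     // Page number of the logically next page
    public int PreviousPage; // Page number of the logically previous page
    public int FirstEntryId; // Internal ID of the first entry in the page
    public int LastEntryId;  // Internal ID of the last entry in the page
    public int EntryCount;   // Number of entries in the page
    public int CurrentSize;  // Bytes used, including this header
}

public static class DataPageHeaderReader
{
    public const int PageSize = 8192; // 8KB data pages

    // Reads one header from the reader's current position.
    // Assumption: all fields after the type byte are little-endian Int32s.
    public static DataPageHeader Read(BinaryReader reader)
    {
        var header = new DataPageHeader();
        header.PageType = reader.ReadByte();
        header.NextPage = reader.ReadInt32();
        header.PreviousPage = reader.ReadInt32();
        header.FirstEntryId = reader.ReadInt32();
        header.LastEntryId = reader.ReadInt32();
        header.EntryCount = reader.ReadInt32();
        header.CurrentSize = reader.ReadInt32();

        if (header.CurrentSize > PageSize)
        {
            throw new InvalidDataException("A data page cannot be larger than 8KB.");
        }

        return header;
    }
}
```

Because unused space isn't zeroed, anything beyond CurrentSize bytes in the page should be ignored rather than interpreted.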

Item index pages

image

The item entries contained within an item index page are just the internal ID of the item followed by the item data itself.

The item data will be whatever key data is stored in your full text index. For example, if you were storing file paths in your index, the item data would be a serialized string, whereas if you were storing integer IDs, the item data would simply be that integer.

Index node pages

If you consider the in-memory structure of the full text index (see the original LIFTI article for more information about this) you might imagine something like this for an index of URLs against their content:

image

Each node in the tree has one or more references, either to another node or, in the case of the end nodes, to an indexed item.

Index node pages contain two kinds of entries that reflect these references, referenced item entries and referenced index node entries:

image

Referenced item entries

In addition to the internal index node ID and the ID of the referenced item, the word index at which the word was matched is also persisted. (Word index positions are used by positional query operators, such as near and preceding.)

One word may be matched in multiple positions for any given item - each one of these matches results in a separate entry.

Referenced index node entries

These entries contain the index node ID and the ID of the referenced index node. They also store the character associated with the referenced index node.

Both these entries are best explained by example, so consider the index below - the internal node and item IDs are the numbers in red:

image

The referenced index node entries that you would see in the file are:

Index node ID   Referenced index node ID   Matching character
0               1                          A
1               2                          P
2               4                          P

And the referenced item entries would be:

Index node ID   Referenced item ID   Matching word position
4               22                   12
4               22                   86
4               37                   2
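To show how the two kinds of entry work together, here is a small sketch that follows the referenced index node entries one character at a time and then collects the referenced item entries for the node it ends up at. The type and method names are hypothetical, it scans simple in-memory lists rather than reading pages, and it assumes that the root node has ID 0, as in the example above.

```csharp
using System;
using System.Collections.Generic;

public struct NodeReference
{
    public int NodeId;           // Index node ID
    public int ReferencedNodeId; // Referenced index node ID
    public char Character;       // Matching character
}

public struct ItemReference
{
    public int NodeId;       // Index node ID
    public int ItemId;       // Referenced item ID
    public int WordPosition; // Matching word position
}

public static class EntryWalker
{
    // Walks from the root node along the characters of the word and returns
    // the item references stored against the final node (empty if no match).
    public static List<ItemReference> Find(
        string word, IList<NodeReference> nodeRefs, IList<ItemReference> itemRefs)
    {
        var matches = new List<ItemReference>();
        int nodeId = 0; // Assumption: the root node's ID is 0

        foreach (char c in word)
        {
            bool found = false;
            foreach (NodeReference r in nodeRefs)
            {
                if (r.NodeId == nodeId && r.Character == c)
                {
                    nodeId = r.ReferencedNodeId;
                    found = true;
                    break;
                }
            }

            if (!found)
            {
                return matches; // No node for this character sequence
            }
        }

        foreach (ItemReference r in itemRefs)
        {
            if (r.NodeId == nodeId)
            {
                matches.Add(r);
            }
        }

        return matches;
    }
}
```

With the entries from the two tables above, looking up "APP" walks nodes 0, 1, 2 and 4, and returns the three item references: item 22 at word positions 12 and 86, and item 37 at word position 2.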

Summary

At a high level, that's all there is to the contents of the persisted file. It's just a series of entries across a series of pages. Of course when it comes to managing the data there is a lot more that could be discussed, but I'll save that for another post.