Choosing a BDD framework for .Net

tl;dr

Specs aren’t written by developers: choose between SpecFlow and NBehave.

Specs are written by developers: my personal recommendation is NSpec, with StoryQ as the runner-up.

Introduction

BDD, or behavior-driven development, is the practice of writing an executable, testable specification that describes the application’s behavior. This specification is often written with a fluent interface, in a DSL, or in plain English (or rather something close to plain English).

While in TDD the focus is on writing tests that single out units of the application, BDD focuses on writing tests about the behavior of the application. This can also be thought of as testing the application’s features.

Before presenting the frameworks, I will make a distinction between xBehave- and xSpec-style frameworks.

xBehave

xBehave frameworks are about writing user level stories in a form comprehensible by anyone. These stories can be written by the users themselves or by a group consisting of developers, users and testers.

These frameworks typically use a story defined in a DSL close to English and then map this story to a test written in code by the developers. Cucumber is a well-known example of such a framework.

xSpec

xSpec frameworks are usually about developers writing tests in code using an approach that favors testing behavior and functionality. These tests are usually closer to unit tests in both appearance and granularity but feature some differences.

Both approaches could be combined on a single project. The user stories could be converted to xBehave tests, while the developers could adopt an xSpec framework to replace or complement xUnit tests.

Available frameworks

Here is a compilation of xSpec and xBehave BDD frameworks for .Net. The list is current as of this writing, and while I tried my best to be thorough, I realized while compiling it that there are just too many frameworks out there; I am sure I missed a few.

By the way, it’s not an error in the table: I found two frameworks named NSpec which appear to be totally independent. The first one listed is the more active and more popular of the two. The second is an older project that hasn’t seen any recent activity.

Name                           | Still active? | Type    | License             | NuGet package
SpecFlow                       | Yes           | xBehave | BSD style           | Yes
specunit-net                   | No            | xSpec   | Apache License 2.0  | No
Machine.Specifications / MSpec | Yes           | xSpec   | xUnit and MS-PL     | Yes
NSpec                          | Yes           | xSpec   | MIT License         | Yes
NSpec (older project)          | No            | xSpec   | zlib/libpng license | No
NBehave                        | Yes           | xBehave | BSD 3               | Yes
StoryQ                         | No            | ***     | MIT                 | Yes
NSpecify                       | No (never took off) | xSpec | ?             | No
.NetSpec                       | No            | xSpec   | None                | No
xbehave.net                    | Yes           | xSpec   | MIT                 | Yes
SubSpec                        | No            | xSpec   | MS-PL               | Yes
NJasmine                       | Yes           | xSpec   | MIT                 | Yes
SpecsFor                       | Yes           | xSpec   | None                | Yes
Behavioral                     | No            | xSpec   | BSD                 | Yes
NaturalSpec                    | Yes           | xSpec   | MIT / MS-PL         | Yes
BDDish                         | No            | xSpec   | None                | Yes

*** This framework is harder to classify. Based on my interpretation of xSpec/xBehave and my understanding of the framework, I personally place it in a gray zone. It suggests starting with a plain-text story, but that story isn’t directly used by the code: the specifications live directly in C# test code, contrary to other, more “pure” xBehave frameworks. It is a sort of hybrid, though it leans much more to the xSpec side than to the xBehave side.

Narrowing the selection

First of all, I would eliminate the following frameworks from investigation: specunit-net, NSpec (older project), NSpecify and .NetSpec. They are inactive projects, most of which never really made it to “production status”.

NaturalSpec is in F#, which is great for those doing F#. There is a link on its GitHub page to a post about using it with C# objects, but I would personally prefer a library that targets C# for a C# project.

I wasn’t able to see much activity around BDDish (last commit to GitHub more than 2 years ago, only 3 questions on StackOverflow, not much documentation or examples). I would skip this one as well.

As a side note, out of the remaining, more serious contenders, xbehave.net and SubSpec are strikingly similar. At first glance, the two projects seem to be closely related. If you are considering one of them, you might as well look into the other.

Making a selection for an xBehave framework

The first question you need to ask yourself is whether you want to write xBehave-style tests in a language close to plain text, with specs that can be written by non-developers.

If so, the field is pretty slim: your choice is between SpecFlow and NBehave.

StoryQ could also be considered if SpecFlow or NBehave are not satisfactory.

Both SpecFlow and NBehave use Gherkin (the DSL used by Cucumber) to write their specs. Both store specs in *.feature files, and both use attributes on methods to bind the steps to test code.

[Given("I am not logged in")]
public void LogOut()
{

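For context, here is roughly what the plain-text side might look like in a *.feature file. This is a hypothetical sketch (the feature and the other steps are made up for illustration); the important part is that the Given step’s text matches the attribute above:

Feature: Authentication
  Scenario: Visiting the site anonymously
    Given I am not logged in
    When I visit the home page
    Then I should be asked to log in
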
Both have documentation available on GitHub.

SpecFlow seems to have more documentation, and it is also recommended in the top two rated answers of this StackOverflow question.

On the other hand, the SpecFlow documentation is pretty dry on code examples. I personally preferred NBehave’s documentation over SpecFlow’s.

Making a selection for an xSpec framework

If you do not need your specs to be written in a plain-text-like format, you can save time and work by using an xSpec framework. You won’t need to write the binding code, and writing tests will be easier for those already familiar with an xUnit framework.

On the xSpec side, you have a lot of options.

I narrowed my search to the following frameworks: MSpec, NSpec, SpecsFor, StoryQ and xBehave.net.

The following table compares those frameworks by their total number of questions on StackOverflow. I also rated the official documentation on a scale of 1 to 3, with 1 being the poorest and 3 the best score. This rating is largely subjective and represents my appreciation of the available documentation. It should be noted that I prefer a style which includes code samples and gets you up and running quickly.

Name        | Questions on StackOverflow | Documentation rating
MSpec       | 519*                       | 1 / 3
NSpec       | 62                         | 3 / 3
SpecsFor    | 5                          | 3 / 3
StoryQ      | 36                         | 3 / 3
xBehave.net | 5                          | 2 / 3

*: I added together the number of questions for the MSpec and Machine.Specifications tags. Even looking at either tag alone, it is the clear winner here.

My personal recommendation for an xSpec framework is NSpec.

After trying each of these frameworks on a simple two-test scenario, here is my personal ranking:

  1. NSpec
  2. StoryQ
  3. SpecsFor

Here is a code sample of two simple tests using NSpec:

namespace NSpecTests
{
    class TicTacToe_specifications : nspec
    {
        void given_a_new_board()
        {
            it["A new tic tac toe board is empty"] = () => ticTacToeBoard.IsEmpty.should_be_true();

            context["When placing a first X"] = () => 
            {
                before = () => ticTacToeBoard.PlaceXat(1);
                it["Board should not be empty"] = () => ticTacToeBoard.IsEmpty.should_be_false();
            };
        }
        TicTacToeBoard ticTacToeBoard = new TicTacToeBoard();
    }
}

And here is the output generated by NSpecRunner, NSpec’s default test runner.

Output :
TicTacToe specifications
  given a new board
    A new tic tac toe board is empty
    When placing a first X
      Board should not be empty

2 Examples, 0 Failed, 0 Pending

Here are the same two tests and their output in StoryQ:

namespace StoryQTests
{   
    [TestClass]
    public class TicTacToe_specifications
    {
        TicTacToeBoard ticTacToeBoard;
        bool result;

        [TestMethod]
        public void given_a_new_board()
        {
            new Story("A new tic tac toe board is empty")
                .InOrderTo("to start a new game")
                .AsA("player")
                .IWant("to have an empty board")
                .WithScenario("new board")
                .Given(ANewBoard)
                .When(CheckingIfBoardIsEmpty)
                .Then(ItShouldBeTrue)
                .Execute();

            new Story("After placing a first X")
                .InOrderTo("to make my first move")
                .AsA("player")
                .IWant("to have a non empty board")
                .WithScenario("new board")
                .Given(ANewBoard)
                .And(PlacingAFirstX)
                .When(CheckingIfBoardIsEmpty)
                .Then(ItShouldBeFalse)
                .Execute();
        }
        
        public void ANewBoard() { ticTacToeBoard = new TicTacToeBoard(); }
        public void PlacingAFirstX() { ticTacToeBoard.PlaceXat(1); }
        public void CheckingIfBoardIsEmpty() { result = ticTacToeBoard.IsEmpty; }
        public void ItShouldBeTrue() { Assert.IsTrue(result); }
        public void ItShouldBeFalse() { Assert.IsFalse(result); }
    }
}

And the output:

Story is A new tic tac toe board is empty
  In order to start a new game
  As a player
  I want to have an empty board

      With scenario new board
        Given a new board                 => Passed
        When checking if board is empty   => Passed
        Then it should be true            => Passed
Story is After placing a first X
  In order to make my first move
  As a player
  I want to have a non empty board

      With scenario new board
        Given a new board               => Passed
          And placing a first x         => Passed
        When checking if board is empty => Passed
        Then it should be false         => Passed

I tried to emulate the style of the StoryQ examples, but I’m not sure I am using StoryQ as efficiently as possible.

NSpec

Pros:

  • concise
  • easy to get started with
  • makes me think of RSpec
  • it kicks ass

Cons:

  • can’t use the MSTest runner
  • must manually rebuild the test library before each NSpecRunner run
  • tests will fail if certain conventions aren’t followed (e.g. underscores in method names)

StoryQ

Pros:

  • uses MSTest or NUnit test runners and standard test attributes
  • great tools
  • focus on user stories

Cons:

  • not sure if I am using it correctly
  • very verbose

Conclusion

For my conclusion read the tl;dr section at the start of this post.

AutoMoq

When writing unit tests, you must supply an instance of each dependency that gets called by the method being tested.

When using a mocking framework like Moq, you can instead supply a mocked instance of the dependency.

One of the problems with this approach is that every time you add or remove a dependency in the code under test, you must go back through all your tests and update them to deal with the change. One common scenario is modifying tests to supply a new mocked instance.

If you have experienced this before, you know it can feel like a drag to spend the extra time modifying all your existing tests.

This is where an auto-mocking library like AutoMoq comes into play (not to be confused with AutoFixture.AutoMoq).

AutoMoq creates a default mocked instance for each dependency that gets called by your test. This default mocked instance returns default values for all calls made to its methods, unless you have defined an explicit setup on a method.
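
As a minimal sketch of that default behavior (reusing the IRequestQueue interface and the ProcessType enum from the tests below, with no setups defined):

var mocker = new AutoMoqer();

// No setup has been defined, so the loose mock returns default values;
// for a method returning a Guid that means Guid.Empty.
var queue = mocker.GetMock<IRequestQueue>().Object;
var next = queue.GetNextRequest(ProcessType.Pending); // Guid.Empty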

Here is a regular test using only Moq:

[TestMethod]
public void TestMethod_without_AutoMoq()
{
    // Arrange
    var mockSomeDependency = new Mock<ISomeDependency>();
    var mockAnotherDependency = new Mock<IAnotherDependency>();
    var mockRequestQueue = new Mock<IRequestQueue>();

    var objectUnderTest = new ClassUnderTest(mockSomeDependency.Object,
                                             mockAnotherDependency.Object,
                                             mockRequestQueue.Object);

    var testGuid = Guid.NewGuid();

    mockRequestQueue.Setup(mock => mock.GetNextRequest(ProcessType.Pending)).Returns(testGuid);

    // Act
    var result = objectUnderTest.AssignRequest(ProcessType.Pending);

    // Assert
    Assert.AreEqual(testGuid, result);
}

Here, SomeDependency and AnotherDependency are dependencies that get called when AssignRequest is invoked, but whose return values we do not care about. Even though we are not using their return values, we must still define mocks for these dependencies or we will get a NullReferenceException.

And here is how you would convert this test to use AutoMoq:

[TestMethod]
public void SameTest_with_AutoMoq()
{
    // Arrange    
    var mocker = new AutoMoqer();    
    var testGuid = Guid.NewGuid();
 
    mocker.GetMock<IRequestQueue>()
				  .Setup(mock => mock.GetNextRequest(ProcessType.Pending))
										.Returns(testGuid);
 
    var objectUnderTest = mocker.Resolve<ClassUnderTest>();
 
    // Act
    var result = objectUnderTest.AssignRequest(ProcessType.Pending);

    // Assert
    Assert.AreEqual(testGuid, result);
}

AutoMoq not only saves on the length of the test (as you don’t have to specify all the dependencies), it also cuts down on the time needed to refactor your tests when changes are introduced.

While the test is definitely shorter and less cluttered, the real magic comes when you add another dependency to ClassUnderTest. In the AutoMoq test, you don’t have to do any refactoring unless you need to specify an explicit value for a method on the new dependency.
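
To illustrate, here is a hypothetical version of ClassUnderTest after a fourth dependency is added (INewDependency and this implementation are made up for the example; the real class is not shown in this post):

public interface INewDependency
{
    void Log(string message);
}

public class ClassUnderTest
{
    private readonly IRequestQueue requestQueue;
    private readonly INewDependency newDependency;

    public ClassUnderTest(ISomeDependency someDependency,
                          IAnotherDependency anotherDependency,
                          IRequestQueue requestQueue,
                          INewDependency newDependency) // newly added
    {
        this.requestQueue = requestQueue;
        this.newDependency = newDependency;
    }

    public Guid AssignRequest(ProcessType processType)
    {
        // AutoMoq supplies a default mock for INewDependency automatically,
        // so SameTest_with_AutoMoq compiles and passes unchanged, while the
        // Moq-only test needs a fourth mock added by hand.
        newDependency.Log("assigning next request");
        return requestQueue.GetNextRequest(processType);
    }
}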

Note about MEF and AutoMoq

Note that AutoMoq injects dependencies supplied through the constructor. If you use MEF, dependencies injected via attributes on properties will not be managed by AutoMoq; only dependencies injected through an ImportingConstructor will be.
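
Here is a hedged sketch of the difference (MefClassUnderTest is hypothetical):

using System.ComponentModel.Composition;

public class MefClassUnderTest
{
    private readonly IRequestQueue requestQueue;

    // MEF property injection: AutoMoq does not populate this property,
    // so it stays null when the class is resolved through AutoMoqer.
    [Import]
    public ISomeDependency SomeDependency { get; set; }

    // Constructor injection: AutoMoq sees this parameter and
    // supplies a default mock for it.
    [ImportingConstructor]
    public MefClassUnderTest(IRequestQueue requestQueue)
    {
        this.requestQueue = requestQueue;
    }
}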

Unit testing in Erlang with EUnit

To write unit tests in Erlang, you can use EUnit, xUnit’s Erlang sibling.

Writing unit tests in Erlang is really fun and you can do some pretty neat stuff. We will start at the very beginning, but I will end up showing you how to write a function so that Erlang generates tests for you!

To start off, you can write your tests either in the module you are testing or in a separate module. Coming from C#, my first thought was that tests should always live in a different file, to separate the two concerns (testing and the actual code).

After playing a bit with EUnit, my opinion has shifted: both approaches have their merits, and I will use the two in conjunction going forward.

For this post I have created a simple module to test. The double and triple functions just double and triple an input parameter, and my_reverse is a custom implementation of lists:reverse. I kept the functions simple as I did not want the focus to be on the module under test, but rather on how to write the tests themselves.

Here is the module in question:

-module(sut).
-export([double/1, triple/1, my_reverse/1]).

double(X) -> X * 2.

triple(X) -> X * 3.

my_reverse(X) -> my_reverse(X, []).
my_reverse([], Acc) -> Acc; 
my_reverse([Head | Tail], Acc) -> my_reverse(Tail, [Head | Acc]).

For the test module, define a module with the same name as the one you are testing plus the _tests suffix. Following this convention allows EUnit to find the tests for your module without you referring to the test module explicitly (useful, for example, if you have tests embedded in your normal module as well as tests defined in an external module).

The test module should include the EUnit header just after its -module declaration:

-include_lib("eunit/include/eunit.hrl").

Simple test functions

You can define simple test functions as follows:

% simple test functions
call_double_test() -> sut:double(2).
double_2_test() -> 4 = sut:double(2).
double_4_test() -> 8 = sut:double(4).
double_fail_test() -> 7 = sut:double(3).

The functions need to have the _test suffix, which allows EUnit to find and run them. While you could write similar test functions without EUnit, EUnit affords you the convenience of easily running and getting feedback on your tests. Also, as I will cover shortly, EUnit allows for better ways to write test functions.

Here, call_double_test will only check that the function doesn’t crash, while the others use pattern matching to verify the results.

To run your tests, just compile your test module and then call its test() function. This function is made available when you include the EUnit header.

Here is a complete example if you want to follow along:

-module(sut_tests).
-include_lib("eunit/include/eunit.hrl").

% simple test functions
call_double_test() -> sut:double(2).
double_2_test() -> 4 = sut:double(2).
double_4_test() -> 8 = sut:double(4).
double_fail_test() -> 7 = sut:double(3).

Calling the tests:

15> sut_tests:test().

And here is the output:

sut_tests: double_fail_test...*failed*
::error:{badmatch,6}
  in function sut_tests:double_fail_test/0


=======================================================
  Failed: 1.  Skipped: 0.  Passed: 3.
error

Assert macros

The next step is to use assert macros. These are a step up from bare test functions and make the assertions more readable and more xUnit-like.

% assert macros
triple_3_test() -> ?assert(sut:triple(3) =:= 9).
triple_fail_test() -> ?assert(sut:triple(3) =:= 10).

Note that there is a plethora of other macros, including assertNot and assertMatch. Consult the EUnit documentation for the full list.
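
For example, sticking with the sut module from above (these two tests are my own sketch, not taken from the EUnit documentation):

% assertNot checks that an expression is false;
% assertMatch checks that a value matches a pattern
double_not_five_test() -> ?assertNot(sut:double(2) =:= 5).
reverse_match_test() -> ?assertMatch([3, 2, 1], sut:my_reverse([1, 2, 3])).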

Test generators

This is where the fun begins. Before, we specified individual functions (or macros) that tested the module. With test generators, we can specify a list of tests and EUnit will run all of them.

Test generators use the _test_ suffix rather than _test. The test macros themselves also gain a leading underscore, e.g. ?_assert rather than ?assert.

We could have a generator that generates a single test:

double_gen_test_() -> ?_assert(sut:double(3) =:= 6).

Or, to minimize typing, we could group all related tests in a single list like this:

double_gens_test_() -> [?_assert(sut:double(2) =:= 4),
                        ?_assert(sut:double(3) =:= 6),
                        ?_assert(sut:double(4) =:= 8),
                        ?_assert(sut:double(5) =:= 10)].

In this example, we have grouped four tests in a single list.

This is ok, I guess, but it also opens up the possibility for something else…

Programmatically generating your tests

Since test generators operate on lists of tests, we can use a regular list comprehension to programmatically create our list.

Here is a first example:

double_many_gen_test_() -> [?_assert(sut:double(X) =:= X * 2) || X <- lists:seq(1, 10)].

As the output shows:

62> sut_tests:test().
  All 10 tests passed.
ok

This single line of code generated ten tests: call double with the values 1 through 10 and check the output.

Next consider a list comprehension that creates a list of lists to test our reverse function:

reverse_gen2_test_() -> [?_assert(sut:my_reverse(List) =:= lists:reverse(List))
                         || List <- [lists:seq(1, Max) || Max <- lists:seq(1, 10)]].

If we break this one apart,

[lists:seq(1, Max) || Max <- lists:seq(1, 10)].

will generate the following list of lists:

[[1],
 [1,2],
 [1,2,3],
 [1,2,3,4],
 [1,2,3,4,5],
 [1,2,3,4,5,6],
 [1,2,3,4,5,6,7],
 [1,2,3,4,5,6,7,8],
 [1,2,3,4,5,6,7,8,9],
 [1,2,3,4,5,6,7,8,9,10]]

Then it will compare our implementation of my_reverse with Erlang’s lists:reverse for all ten lists.

Even though this is really cool, be careful not to abuse it, as most of the time simple tests will be much clearer.

But in situations where you can define a function to generate loads of test data, a list comprehension can be a real time-saver.

Sorting algorithms in Ruby, interlude

Hello.

The next installment of Sorting algorithms in Ruby will come out tomorrow night.

Before it does, I wanted to use a separate post to present something that will be used in tomorrow’s post, which is already very long as it is.

Reusing test logic

Since the next two algorithms, heap sort and quicksort, are more complex than my previous sorts, I wanted to make sure I had a robust test suite.

When you consider the test cases for a given sorting algorithm, you can see that they apply to any sorting algorithm. For this reason, I moved the actual testing code into a separate class named SortTest, which I will be able to reuse for the algorithms in the following installments.

Here it is:

class SortTest 
	def initialize(sort_algorithm)
		@sort_algorithm = sort_algorithm
	end
	
	def test_empty_array
		empty_array = []
		
		return test(empty_array, empty_array)		
	end
	
	def test_array_with_one_element
		array_with_one_element = [1]
		
		return test(array_with_one_element, array_with_one_element.dup)
	end
	
	def test_ordered_array
		ordered_array = [1, 2, 4, 5, 6, 7]
				
		return test(ordered_array, ordered_array.dup)
	end
	
	def test_array_with_even_number_of_elements
		array_with_even_number_of_elements = [10, 23, 43, 45, 8, 2, 4, 9]
		sorted = array_with_even_number_of_elements.sort
		
		return test(array_with_even_number_of_elements, sorted) 
	end
	
	def test_array_with_odd_number_of_elements
		array_with_odd_number_of_elements = [56, 3, 45, 11, 2, 3, 1]
		sorted = array_with_odd_number_of_elements.sort
		
		return test(array_with_odd_number_of_elements, sorted) 
	end
	
	def test_medium_sized_array
		medium_sized_array = [6, 3, 5, 7, 1, 2, 8, 39, 23, 12, 34, 11, 32, 4, 13]
		sorted = medium_sized_array.sort
		
		return test(medium_sized_array, sorted)
	end
	
	def test_reversed_sorted_array
		reversed_sorted_array = [45, 33, 28, 25, 24, 19, 15, 14, 13, 12, 11, 10, 8, 4, 2, 1]
		sorted = reversed_sorted_array.sort
		
		return test(reversed_sorted_array, sorted) 
	end
	
	def test_large_sized_array
		large_sized_array = [5, 2, 3, 8, 1, 3, 12, 10, 11, 6, 17, 13, 4, 9, 18 , 7, 15, 22, 40, 32, 34]
		sorted = large_sized_array.sort
		
		return test(large_sized_array, sorted) 
	end
	
	def test_array_with_repeating_elements
		array_with_repeating_elements = [3, 5, 2, 7, 5, 2, 4, 5, 6, 5]
		sorted = array_with_repeating_elements.sort
		
		return test(array_with_repeating_elements, sorted)
	end
	
private
	def test(initial_array, expected_result)
		actual_result = @sort_algorithm.sort(initial_array)
		result = actual_result == expected_result
		
		if not result
			puts
			puts "Expected result #{expected_result.inspect}"
			puts "Actual result #{actual_result.inspect}"			
		end
		
		return result
	end
end

And here is an example of its usage:

require 'test/unit'
require './heap_sort.rb'
require './sort_test.rb'

class HeapSortTest < Test::Unit::TestCase	
	def setup
		heap_sort = HeapSort.new
		@sort_test = SortTest.new(heap_sort)
	end
	
	def test_empty_array
		assert @sort_test.test_empty_array, "Empty array not correctly sorted"
	end
	
	def test_array_with_one_element
		assert @sort_test.test_array_with_one_element, "Array with one element not correctly sorted"
	end
	
	def test_ordered_array
		assert @sort_test.test_ordered_array, "Ordered array not correctly sorted"
	end
	
	def test_array_with_even_number_of_elements
		assert @sort_test.test_array_with_even_number_of_elements, "Array with even number of elements not correctly sorted"
	end
	
	def test_array_with_odd_number_of_elements
		assert @sort_test.test_array_with_odd_number_of_elements, "Array with odd number of elements not correctly sorted"
	end
	
	def test_medium_sized_array
		assert @sort_test.test_medium_sized_array, "Medium sized array not correctly sorted"
	end
	
	def test_reversed_sorted_array
		assert @sort_test.test_reversed_sorted_array, "Reversed sorted array not correctly sorted"
	end
	
	def test_large_sized_array
		assert @sort_test.test_large_sized_array, "Large sized array not correctly sorted"
	end
	
	def test_array_with_repeating_elements
		assert @sort_test.test_array_with_repeating_elements, "Array with repeating elements not correctly sorted"
	end
end

The reason I added the private test method in SortTest and custom assert messages in HeapSortTest is that assert alone does not do a good job of reporting test failures.

The assert function will report a failure in this way:

1) Failure:
test_empty_array(HeapSortTest) [heap_sort_test.rb:12]:
false is not true.

Normally, if you use assert_operator or assert_equal you get a more meaningful message, but with a simple assert the output is a little dry.
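
For comparison, here is a hypothetical sketch of what assert_equal reports on failure (the exact format varies between Test::Unit versions):

require 'test/unit'

class AssertEqualDemoTest < Test::Unit::TestCase
	def test_assert_equal_failure_message
		# On failure, assert_equal prints both values, e.g.:
		#   <[1, 2, 4, 5, 6, 7]> expected but was <[1, 2, 4, 5, 6]>.
		assert_equal [1, 2, 4, 5, 6, 7], [1, 2, 4, 5, 6]
	end
end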

With the message parameter and the additional private test method, here is the actual result:

Started

Expected result [1, 2, 4, 5, 6, 7]
Actual result [1, 2, 4, 5, 6]
F
Finished in 0.00767 seconds.

1) Failure:
test_ordered_array(QuickSortTest) [quick_sort_out_of_place_test.rb:20]:
Ordered array not correctly sorted.
is not true.

This is much better: you can clearly see what the problem is at a quick glance. Ideally, all test results should be this easy to read. With such clear messages, our tests will not only show us the errors, they will also serve as a first line of debugging information.