Project on Github and code reviews

Hello,

I recently (in the last few weeks) released a very small project on Github called SummonMonster. I might work on it more later, but for now it does pretty much what I wanted and I am working on something else bigger, so updating it is not a big priority.

This project is an Erlang. I like to play around with Ruby and Erlang at home but for the last few years, I spent most of my time at work with C#. Needless to say my Erlang and Ruby skills aren’t currently up to par with my C# skills.

That’s why I wanted to have the code reviewed. Code reviews are really great. Learning a new language and it’s conventions is always faster when you can have your work reviewed by a peer.

To do this I used a great site which is currently in beta on the stackexchange network. It’s currently called codereview.stackexchange.com and I suggest you give it a whirl if you are in a similar situation.

Code reviews

Of course, code reviews are better done in person. Since I am on this subject, I will give a few guidelines I personally adhere to for code reviews.

First, code reviews are for everyone. No matter the experience or knowledge of both participants, reviews can always be beneficial. The reviewed person gets a chance to explain his ideas, might find mistakes himself in his code and gives information to the reviewer. Following a review, the reviewer is in a better position to maintain the code in question or answer questions on it when the original author is not available.

All of this suggests that reviews are beneficial even if a person is getting reviewed by someone who is more green. This is a great time for the a new member of the team to learn the standards and conventions used on a particular team or project.

Having everyone submit to code reviews also sends a positive message to new members of a team who might be reluctant to being reviewed. Older members can lead by example.

Secondly, it is always better to review the code at hand rather than the programmer and to question the implementation rather than the person who wrote the code.

Third, how the reviews should be conducted and how/if/when corrections are to be made should be decided in advance.

A policy I like is to have most propositions be optional, especially the ones who would result in a major refactoring. In these cases it might be more efficient to simply correct the reviewed person so that future developments will be free of a certain mistake.

Finally make sure to have a flexible code review policy. A lot of times, check-ins are trivial and do not necessitate a review. Imposing a mandatory review will result in time misspent and a negative view on the review process.

Sorting algorithms in Erlang, Part 2

Hello again. Today I continue my series on sorting. In the last instalment I reimplemented in Erlang two iterative sorting algorithms that I had previously done in Ruby. Today, I am going to re implement two recursive algorithms.

Before going into this, I was thinking the recursive algorithms would be a more natural fit for the language and easier to implement. No surprises here, it turned out as expected.

Merge sort

Here is my Merge sort implementation in Erlang:

merge_sort([]) -> [];
merge_sort([X | []]) -> [X];	

merge_sort(ToSort) -> {FirstList, SecondList} = lists:split(round(length(ToSort) / 2), ToSort),
					  lists:merge(merge_sort(FirstList), merge_sort(SecondList)).

You could say I cheated a bit here. In the Ruby version, I chose to implement my own merge method rather than use a prebuilt like I just did here in Erlang…

To be fair in my comparison, I also implemented a merge function.

merge(X, []) -> X;
merge([], Y) -> Y;
merge([X | TailX], [Y | TailY]) when X < Y -> [X] ++ merge(TailX, [Y | TailY]);
merge([X | TailX], [Y | TailY]) when X > Y -> [Y] ++ merge([X | TailX], TailY).

You just have to replace lists:merge with merge in the previous code and everything works the same. I could have only shown you the implementation with the custom merge, but the version with the built-in merge is very striking. It’s quick, clear and to the point.

The merge function is different in both implementations, the sort function on the other hand is very similar in logic.

If you go back to the Ruby version you will see it is a bit longer than the Erlang one. This is because I used intermediary variables in the Ruby version. If you add them to the Erlang one they become very similar.

merge_sort([]) -> [];
merge_sort([X | []]) -> [X];	

merge_sort(ToSort) -> {FirstList, SecondList} = lists:split(round(length(ToSort) / 2), ToSort),
					  FirstSortedList = merge_sort(FirstList),
					  SecondSortedList = merge_sort(SecondList),
					  lists:merge(FirstSortedList, SecondSortedList).

Here is the Ruby version (without the comments) for reference:

def sort(to_sort)
	if to_sort.length <= 1 
		then return to_sort 
	end
		
	second_array = to_sort.slice!((to_sort.length / 2.0).round..to_sort.length)
				
	first_sorted_array = sort(to_sort)
	second_sorted_array = sort(second_array)

	return merge(first_sorted_array, second_sorted_array)
end

Quicksort

When I originally started the series I mentioned I was going to implement the algorithms without resorting to reading other implementations or pseudo-code. I would try as much as possible to work with the textual descriptions of the algorithms. I have pretty much followed this. In some rare cases tough, I did glean at some pseudo-code.

Unfortunately, while reading the official Erlang documentation, I stumbled upon a quicksort implementation on the list comprehension page. It was very short and I couldn’t stop reading it before it was too late. Rather than pretend I never saw it (and also because it’s very cool) I will present it as is:

sort([Pivot|T]) ->
    sort([ X || X <- T, X < Pivot]) ++
    [Pivot] ++
    sort([ X || X <- T, X >= Pivot]);
sort([]) -> [].

(original source)

As you can see it uses the first element as a pivot, something that isn’t optimal but that I also did in my Ruby implementation. I like the implementation very much. It’s short, concise and very elegant.

I also wrote my own implementation, it’s not as short but it uses the middle element as the pivot. This prevent some of the performance problems when the list is already sorted or nearly sorted.

Here it is:

quicksort([]) -> [];

quicksort(ToSort) -> Pivot = lists:nth(round(length(ToSort) / 2), ToSort),
					 SmallerElements = lists:filter(fun(X) -> X < Pivot end, ToSort),
					 LargerElements = lists:filter(fun(X) -> X > Pivot end, ToSort),
					 quicksort(SmallerElements) ++ [Pivot] ++ quicksort(LargerElements).

Conclusion

As you you can see with merge sort, you can sometimes come up with similar approaches in two different paradigms. In some languages, for example C#, not only can you apply algorithms commonly found in functional languages, but the language integrates some functional concepts as well.

When you aren’t used to thinking in functional terms it can be challenging to use these concepts to their fullest. Learning to program in a functional language is not only an end to itself, it also helps with applying new concepts in languages you are already familiar with.

Sorting algorithms in Erlang, Part 1

Sorting algorithms in Erlang, Part 1

Ok, so I am continuing my series of posts on sorting algorithms. In the first parts, I implemented some classic sorting algorithms in Ruby, an OO language. Right now, I will implement similar algorithms in a purely functional language, Erlang.

I will not explain the basis for the algorithms in these posts as I have already done so in the Ruby posts.

Differences in implementations

Implementing Bubble sort in Erlang was really hard for me. While implementing it in Ruby was dead simple, in Erlang this algorithm was one of the hardest, if not the hardest for me to figure out. Going into this, I was already expecting the iterative algorithms to be harder to implement in a functional style, but not this much. There is more code and it is harder to follow then a procedural or OO implementation.

If you were to give the task of implementing a sorting algorithm to a class of green CS students, in a procedural or OO language, without asking them to focus on performance, you would probably get a few bubble sorts. Bubble sort, Insertion sort and Selection sort are all very intuitive in a procedural paradigm.

I don’t see someone coming up with Bubble sort intuitively in a functional language. Clearly this is a sign of how different paradigms shape our way of coming up with solutions. The languages and more importantly the paradigms we use ,shape our very way of thinking.

For the purely iterative algorithms to work in Erlang, I had to change them into recursive algorithms. The concepts of in-place and out of place sorting were also thrown out the window. “while” loops in other languages often depend on mutable state to work, in Erlang these must be replaced with recursion as there is no mutable state.

Bubble sort

Here is the code for what I came up with. Disclaimer, I am still a novice with Erlang and there are better/shorter implementations on the web.

% empty list an single element list
bubble_sort([]) -> [];
bubble_sort([X | []]) -> [X];

bubble_sort(ToSort) -> {RList, RSwapped} = bubble_sort(ToSort, false),
					   case RSwapped of
					   		true -> bubble_sort(RList);
					   		false -> RList
					   end.

% We have come to the last element of the list
bubble_sort([X, Y | []], Swapped) -> case X > Y of
							  			true -> {[Y, X], true};
										false -> {[X, Y], Swapped} 
							   	     end;

% We are working through the list
bubble_sort([X, Y | Tail], Swapped) -> case X > Y of
										true -> {NewTail, RSwapped} = bubble_sort([X | Tail], true), {[Y] ++ NewTail, RSwapped};
										false -> {NewTail, RSwapped} = bubble_sort([Y | Tail], Swapped), {[X] ++ NewTail, RSwapped} 
							  		 end.  

The algorithm works by defining that an empty list and a list with a single element is already sorted. In any other case, we traverse the list recursively with bubble_sort/2, swapping the elements when returning from the chain of recursive calls.

Swapped is passed along to monitor if any swaps have taken place. This way we can tell the bubble_sort/1 to recursively start another chain of recursive calls to bubble_sort/2 and go through the whole list again.

Insertion sort

Insertion sort was much easier to write in Erlang than Bubble sort. I also find it a lot easier on the eyes to. Contrary to Bubble sort, I did not find implementing Insertion sort in Erlang harder than in Ruby. This solution feels more natural in Erlang than the previous one.

Note that the code is simplified because it relies on the max/1 function. Without relying on this built-in function the code would have been more dense.

Of course, this isn’t a “pure” textbook insertion sort, but you could call it a functional Insertion sort.

insertion_sort([]) -> [];
insertion_sort([X | []]) -> [X];

insertion_sort(ToSort) -> insertion_sort(ToSort, []).

insertion_sort([], Sorted) -> Sorted;

insertion_sort(Unsorted, Sorted) -> Max = lists:max(Unsorted),
									insertion_sort(lists:delete(Max, Unsorted), 
                                                   [Max] ++ Sorted).

The function insertion_sort/1 calls insertion_sort/2. It’s first parameter is the list to sort and the second parameter the sorted list.

See you next time as I implement more sorting algorithms.

Sorting algorithms in Erlang, Part 2

Unit testing in Erlang with EUnit

To write unit tests in Erlang, you can use EUnit, xUnit’s erlang sibbling.

Writing unit tests in Erlang is really fun and you can do some pretty neat stuff. We will start at the very beginning but I will end up showing you how you can write a function so Erlang generates tests for you!

To start off, you can write your tests in the module you are testing or another separate module. Coming from C#, my first thoughts were that the tests should always be in a different file to separate the two concerns (testing and the actual code).

After playing a bit with EUnit, my opinion has shifted and I think both approaches have their merits and I will use the two in conjunction going forward.

For this blog I have created a simple module to test. These functions just double and triple an input parameter and my_reverse is a custom implementation of lists:reverse. I wrote simple functions as I did not want to focus on the module under test but rather on how to write the tests themselves.

Here is the module in question:

-module(sut).
-export([double/1, triple/1, my_reverse/1]).

double(X) -> X * 2.

triple(X) -> X * 3.

my_reverse(X) -> my_reverse(X, []).
my_reverse([], Acc) -> Acc; 
my_reverse([Head | Tail], Acc) -> my_reverse(Tail, [Head | Acc]).

For the test module, you should define a module with the same name as the one you are testing plus the _tests suffix. Following this convention will allow EUnit to find the tests of your module without referring to the test module (for example if you have tests embedded in you normal module as well as tests defined in an external module).

This module should contain an include of EUnit just after your -module declaration:

-include_lib("eunit/include/eunit.hrl").

Simple test functions

You can define simple test functions likewise:

% simple test functions
call_double_test() -> sut:double(2).
double_2_test() -> 4 = sut:double(2).
double_4_test() -> 8 = sut:double(4).
double_fail_test() -> 7 = sut:double(3).

The functions need to have the _test suffix which will allow EUnit to find and run them. While you could do similar tests functions without EUnit, EUnit affords you the convenience of easily running and getting feedback on your tests. Also as I will cover afterwards, EUnit allows for better ways to write test functions.

Here the call_double_test will only check that the function doesn’t crash while the others will use patter matching to verify the results.

To run your tests, just compile your test module and then call the test() function. This function is made available when you import EUnit.

Here is a complete example if you want to follow along:

-module(sut_tests).
-include_lib("eunit/include/eunit.hrl").

% simple test functions
call_double_test() -> sut:double(2).
double_2_test() -> 4 = sut:double(2).
double_4_test() -> 8 = sut:double(4).
double_fail_test() -> 7 = sut:double(3).

Calling the tests:

15> sut_test:test().

And here is the output:

sut_test: double_fail_test...*failed*
::error:{badmatch,6}
  in function sut_test:double_fail_test/0


=======================================================
  Failed: 1.  Skipped: 0.  Passed: 3.
error

Assert macros

The next step is to use assert macros. These are a step above the test functions and make the assertions more readable and more xUnit like.

% assert macros
triple_3_test() -> ?assert(sut:triple(3) =:= 9).
triple_fail_test() -> ?assert(sut:triple(3) =:= 10).

Note that there is a plethora of other macros including assertNot and assertMatch. You can consult the EUnit documentation for a full list.

Test generators

This is where the fun begins. Before, we specified functions (or macros) that tested the module. With test generators we can specify a list of tests and EUnit will run all of these.

Test generators use the _test_ suffix rather than _test. The test macros themselves also have a leading underscore, ie: ?_assert, rather than ?assert.

We could have a generator that generates a single test:

double_gen_test_() -> ?_assert(sut:double(3) =:= 6).

Or to minimize typing we could group all related tests in a single list likewise:

double_gens_test_() -> [?_assert(sut:double(2) =:= 4),
						?_assert(sut:double(3) =:= 6),
						?_assert(sut:double(4) =:= 8),
						?_assert(sut:double(5) =:= 10)].

In this previous example we have grouped four test functions in a single list.

This is ok, I guess, but it also opens up the possibility for something else…

Programmatically generating your tests

Since test generators operate on lists of tests, we can use regular list comprehension to programmatically create our list.

Here is a first example:

double_gen_test_() -> [?_assert(sut:double(X) =:= X * 2) || X <- lists:seq(1, 10)].

As the output shows:

62> sut_tests:test().
  All 10 tests passed.
ok

This single line of code generated ten tests. Call double with values 1 through 10 and check the output.

Next consider a list comprehension that creates a list of lists to test our reverse function:

reverse_gen2_test_() -> [?_assert(sut:my_reverse(List) =:= lists:reverse(List)) || List <- [lists:seq(1, Max) || Max <- lists:seq(1, 10)]].

If we break this one apart,

[lists:seq(1, Max) || Max <- lists:seq(1, 10)].

Will generate the following output:

[[1],
 [1,2],
 [1,2,3],
 [1,2,3,4],
 [1,2,3,4,5],
 [1,2,3,4,5,6],
 [1,2,3,4,5,6,7],
 [1,2,3,4,5,6,7,8],
 [1,2,3,4,5,6,7,8,9],
 [1,2,3,4,5,6,7,8,9,10]]

Then it will compare our implementation of my_reverse with Erlang’s lists:reverse for all ten lists.

Even tough this is really cool, be careful not to abuse this, as most of time simple tests will be much clearer.

But in some situations where you can define a function to generate loads of data, using list comprehension can be a time-saving solution.