Sunday, 20 May 2018

PeanutButter.Utils: the dictionaries

Dictionaries, HashMaps, whatever you want to call them -- they can be one of the most useful constructs in any language. In Javascript, the dictionary interface to objects makes a lot of dynamic code simpler. In fact, it was the Javascript paradigm which served as the inspiration for the PeanutButter.Utils member DictionaryWrappingObject. This class was originally made to facilitate some of the functionality in PeanutButter.DuckTyping: a library for duck-typing arbitrary objects onto interfaces which aren't directly related. For example, if you had an anonymous object with the correct "shape", you could duck-type it onto an interface which another part of the system requires, without having to manually create your own implementation of that interface and the code to copy data / forward method calls -- all of which PeanutButter.DuckTyping can do, with varying amounts of flexibility in matching methods and properties, according to your needs.
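For instance, a rough sketch of that kind of duck-typing, assuming the DuckAs<T>() extension method from PeanutButter.DuckTyping (the exact namespaces and overloads may differ a little from this sketch):

public interface IHasName
{
  string Name { get; }
}

// requires the PeanutButter.DuckTyping extensions namespace
var data = new { Name = "Bob" };
// duck-type the anonymous object onto the interface -- no
// hand-written adapter class required
var ducked = data.DuckAs<IHasName>();
Console.WriteLine(ducked.Name); // "Bob"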

But I'm not here to talk about PeanutButter.DuckTyping today -- as interesting as it was to write and as useful as it's proven to be in a couple of projects since then.

So, back to DictionaryWrappingObject: this class provides the familiar IDictionary<string, object> interface over any other object so you can enumerate through the properties or perform functions like querying property names without directly using reflection yourself. And, of course, you can just use very Javascript-y syntax:

var obj = new { id = 1, name = "bob" };
var wrapper = new DictionaryWrappingObject(obj);
var name = wrapper["name"];
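And since the wrapper implements the full IDictionary<string, object> interface, the usual dictionary operations work too -- for example, enumerating the wrapped object's properties:

// enumerate the wrapped object's properties as key/value pairs
foreach (var kvp in wrapper)
{
  Console.WriteLine($"{kvp.Key} = {kvp.Value}");
}
// or just query the property names
var propertyNames = wrapper.Keys; // "id", "name"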

You can also construct with a case-insensitive StringComparer to make those lookups a little fuzzier:

var obj = new { Id = 1, Name = "Mary" };
var wrapper = new DictionaryWrappingObject(
  obj, StringComparer.OrdinalIgnoreCase
);
var id = wrapper["ID"];
var name = wrapper["name"];

Another useful dictionary construct that I stumbled across in Python is the default dictionary -- implemented in PeanutButter.Utils, unsurprisingly, as DefaultDictionary. This dictionary allows you to specify a default value to return for the case where requested keys aren't found:

var animalsInZoo = new DefaultDictionary<string, bool>(false);
animalsInZoo["Camel"] = true;
animalsInZoo["Panda"] = true;

//... some time later:
var haveCamels = animalsInZoo["Camel"]; // true
var haveSnakes = animalsInZoo["Snake"]; // false, not KeyNotFoundException!

DefaultDictionary can be a little smarter than having a static value for the default:

// set up the default dictionary such that students with a name starting
// "A" exist.
var students = new DefaultDictionary<string, bool>(
  k => k.StartsWith("A")
);
students["Anna"] = false;
students["Mary"] = true;
// ... elsewhere ...

var haveAnna = students["Anna"]; // false, explicitly set
var haveAndrew = students["Andrew"]; // true: default value provider
var haveMary = students["Mary"]; // true: explicitly set
var haveStewart = students["Stewart"]; // false: default value provider

DefaultDictionary is especially useful in conjunction with MergeDictionary, which takes one or more other dictionaries with the same key/value types and merges them, returning the value from the first dictionary in the merge list to have a value. So we could have:

var config = new Dictionary<string, string>()
{
  ["host"] = "database-machine",
  ["port"] = "123"
};
var defaults = new DefaultDictionary<string, string>(k =>
{
  switch (k)
  {
    case "host":
      return "localhost";
    case "port":
      return "3306";
    case "user":
      return "mysql";
    case "password":
      return "super-secret";
    default:
      return "";
  }
});
var final = new MergeDictionary<string, string>(
  config, defaults
);

// ... elsewhere ...
var dbConfig = new 
{
  host = final["host"],            // database-machine
  port = int.Parse(final["port"]), // 123
  user = final["user"],            // mysql (default)
  password = final["password"]     // super-secret (default)
};

Finally, there is the CaseWarpingDictionary, which acts as a wrapper around another dictionary to change the case-sensitivity of its keys. This is especially useful when you want a dictionary that is a little more forgiving (case-insensitive) than the one you're working with. CaseWarpingDictionary can be constructed with either a boolean instructing whether or not the result is case-sensitive, or with a StringComparer, so you can, for instance, switch from Ordinal to CurrentCultureIgnoreCase.
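As a minimal sketch of that in action -- assuming here that the StringComparer-flavoured constructor takes the dictionary to wrap followed by the comparer, and that keys are strings with a single generic parameter for the value type (check the xmldoc for the exact overloads):

var strict = new Dictionary<string, string>()
{
  ["Host"] = "localhost"
};
// strict["host"] would throw a KeyNotFoundException

// generic arity and constructor shape are assumptions here
var forgiving = new CaseWarpingDictionary<string>(
  strict, StringComparer.OrdinalIgnoreCase
);
var host = forgiving["host"]; // "localhost"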

Last words

The IDictionary<TKey, TValue> interface is not particularly difficult to implement, but it is quite convenient to consume. I'd recommend that any dev wanting to learn something about the internals of .NET implement some kind of IDictionary<TKey, TValue> at some point. For one thing, it will give you an appreciation for the good old Dictionary (:
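If you'd like a gentle way into that exercise, one approach is to start with a trivial implementation that just delegates everything to an inner Dictionary, then swap in your own storage piece by piece. This is plain C#, not anything from PeanutButter:

using System.Collections;
using System.Collections.Generic;

public class PassThroughDictionary<TKey, TValue> : IDictionary<TKey, TValue>
{
  // all the real work is delegated here -- replace this with your
  // own storage to actually learn something (:
  private readonly IDictionary<TKey, TValue> _inner = new Dictionary<TKey, TValue>();

  public TValue this[TKey key]
  {
    get => _inner[key];
    set => _inner[key] = value;
  }

  public ICollection<TKey> Keys => _inner.Keys;
  public ICollection<TValue> Values => _inner.Values;
  public int Count => _inner.Count;
  public bool IsReadOnly => _inner.IsReadOnly;

  public void Add(TKey key, TValue value) => _inner.Add(key, value);
  public void Add(KeyValuePair<TKey, TValue> item) => _inner.Add(item);
  public void Clear() => _inner.Clear();
  public bool Contains(KeyValuePair<TKey, TValue> item) => _inner.Contains(item);
  public bool ContainsKey(TKey key) => _inner.ContainsKey(key);
  public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex) => _inner.CopyTo(array, arrayIndex);
  public bool Remove(TKey key) => _inner.Remove(key);
  public bool Remove(KeyValuePair<TKey, TValue> item) => _inner.Remove(item);
  public bool TryGetValue(TKey key, out TValue value) => _inner.TryGetValue(key, out value);
  public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() => _inner.GetEnumerator();
  IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}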

Sunday, 29 April 2018

Unit test coverage and why it matters

Good unit test coverage can help you to improve your code in ways you might not expect. I'm not talking about just chasing some mythical value, like the agile team which subscribes to a contract of "85% coverage over the project" -- though chasing a number (the theoretical 100% coverage, which I've only achieved in one project ever) can lead you to some interesting discoveries.

Obviously, we can have bogus coverage:

using System.Net;
using NUnit.Framework;
using static NExpect.Expectations;
using NExpect;

[TestFixture]
public class TestTheThing
{
  [Test]
  public void ShouldDoTheStuff()
  {
    // Arrange
    var sut = new TheThing();
    // Act
    var result = sut.DidTheStuff();
    // Assert
    Expect(result).To.Be.True();
  }
}

public class TheThing
{
  public bool DidTheStuff()
  {
    try
    {
      WriteOutSomeFile();
      MakeSomeWebCall();
      return true;
    }
    catch (WebException)
    {
      return false;
    }
  }
  // ... let's imagine that WriteOutSomeFile 
  // and MakeSomeWebCall are defined below...
}

The above test doesn't actually check that the correct file was written out -- or even that the correct web request happened. It technically provides full coverage of the class (if WriteOutSomeFile and MakeSomeWebCall have no branches), but it's anemic: the coverage doesn't tell us much.

So coverage is not a number which definitively tells you that your code (or your tests) is good. However, examining coverage reports (particularly the line-by-line analysis) has helped me to discover at least three classes of error:

I found bugs I didn't know I had

When I was still at Chillisoft, I'd just finished a unit of code (TDD, of course) to my satisfaction and decided to run coverage on it for interest's sake. I was convinced that I'd done a good job, writing one test before each line of production code which was required. To my dismay, I found that there was one line which wasn't covered. Shame on me, I thought, and went back to the test fixture, where I found a test describing the exact situation that line should be handling. Ok, so I have this test, it names the situation, but there's no coverage on the line?

(Remember: coverage reports can be faulty. It's rare, but it's worthwhile re-running your reports just to make double-sure that what you're seeing is correct.)

I re-ran my coverage, but that one line remained red. And the more I looked at it, the more it looked like it should actually be causing a test to fail. So I re-examined the test which was supposed to be covering it, only to find… I'd made a mistake in that test, and it wasn't actually running through the branch with the uncovered line.

The fault here most likely comes down to one faulty TDD cycle where I hadn't gotten a good "red" before my "green". Still, examining the coverage report made me find the error and fix it before the code got anywhere near production. This experience is why I advocate running coverage whenever you reach a point where you expect the current unit of work to be complete -- to find any holes in testing, or defects in logic hidden in those holes. It's why I (convincingly) argued for all Chillisoft programmers to be granted an Ultimate license of Resharper, which has dotCover built right into the test runner. We had coverage reports running at the CI server, but I wanted every developer to be able to check coverage quickly, so that they could also discover flaws in their code before that code got to production -- or even to another developer's machine!

I found dead code

Just recently, I finally got the gulp build tasks for NExpect to include coverage by default when running npm test. And to my dismay, NExpect only had about 78% coverage. Which I thought was odd, because NExpect was built very much test-first: indeed, the general method of operation was to write out a test with the desired expression and then provide the code to make that expression happen. So, for example:

Expect(someCollection).To.Contain
  .Exactly(1).Deep.Equal.To(search);

would have started out with most of those words red (thanks to Resharper), and they would unhighlight as I got the class/interface linkage together to make the words flow as desired. I expected coverage to be closer to 90% (I did expect some places to have been missed, never having scrutinised coverage reports for NExpect before), but 78%? I had some work to do.

In addition to finding a few minor bugs that I didn't know I had (particularly with wording of failure messages in a few cases), I found that I had bits of code which theoretically should have been under test, but which weren't covered. Especially stuff like:

[Test]
public void ComparingFloats()
{
  Expect(1.5f).To.Be.Greater.Than(1.4f)
    .And.Less.Than(1.6f);
}

which works as expected, but never hits the Than extension methods for continuations of float.

The answer became obvious upon hovering over the usages -- each Than was expecting to operate on values of type double as the subject (actual) value (ie, the value being tested; in this case, 1.5f). This is because Expect upcasts certain numeric types (floats to doubles, ints to longs, etc) so that the comparison code doesn't require casting from the consumer (since NExpect continuations hold the type of the subject all the way through, instead of downcasting to object and hoping for the user to provide reasonable values). There's nothing wrong (that I can tell) with this approach -- it works well -- but it does mean that the Than extension methods expecting to operate on float and int subjects would never be used. They were dead code! So I could safely remove them. One of the ways to make code better is to remove the unnecessary bits (:

I found holes in my API

This is, again, from working on NExpect where, upon providing coverage for one variant of syntax, I would find that I hadn't implemented another. For example, NExpect has no opinion on which of these is better:

Expect(1).To.Not.Equal(2);
Expect(1).Not.To.Equal(2);

All NExpect does is prevent silliness like:

Expect(1).Not.To.Not.Equal(1);

NExpect is designed around user-extensibility as one of the primary goals. As such, there are some "dangling" words, like A, An, and Have so that the user can provide her own expressive extensions:

var dog = animalFactory.CreateDog();
Expect(dog).Not.To.Be.A.Cat();
Expect(dog).To.Be.A.Dog();
Expect(dog).To.Be.A.Mammal();

Where the user can use Matchers or Composition to provide the logic for the Cat, Dog, and Mammal extension method assertions. NExpect doesn't actually provide extensions on these "danglers" -- they're literally just there for custom extension.
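As a rough sketch of what such an extension could look like -- this assumes NExpect's AddMatcher / MatcherResult extension points and invents an Animal/Cat model purely for illustration, so treat it as a shape rather than copy-paste material (the NExpect wiki covers the canonical way to write matchers):

public static class AnimalMatchers
{
  // IA<T> is the dangling "A" continuation; AddMatcher and
  // MatcherResult are assumed here from NExpect's matcher plumbing;
  // Animal and Cat are hypothetical domain types
  public static void Cat(this IA<Animal> a)
  {
    a.AddMatcher(actual =>
    {
      var passed = actual is Cat;
      return new MatcherResult(
        passed,
        $"Expected {actual} {(passed ? "not " : "")}to be a Cat"
      );
    });
  }
}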

Whilst running coverage, I found that one variant of Contain wasn't covered, and when I wrote a test to go through all three (positive, negative, alt. negative), I found that there were missing implementations! Which I naturally implemented (:

Using coverage to make your code better

Coverage reports like those generated by dotCover, or by the combination of OpenCover and ReportGenerator, can not only give you confidence in your code (and a fuzzy feeling inside at a number which shows that you do care about automated testing) -- they can also help you to make your code (and tests) better. And make you better, going forward, because you learn more about the mistakes you make along the way.

If you want to get started relatively easily and you're in the .net world, you can use gulp-tasks as a submodule in your git repo. Follow the instructions in the start folder and get to a point where you can run npm run gulp cover-dotnet (or make this your test script in package.json). This should:

  • build your project
  • run your tests, using OpenCover and NUnit
  • generate html reports using ReportGenerator, under a buildreports folder

You can always check out NExpect to see how I get it done there (:

Sunday, 15 April 2018

What's in PeanutButter.Utils, part 2

Metadata extensions

I just wanted to chip away at my promise to explain more of the bits in PB, so I thought I'd pick a little one (though I've found it to be quite useful): metadata extensions.

At some point, I wanted to be able to attach some arbitrary information to an object which I didn't want to extend or wrap, and which some code, far down the line, would want to read. If C# were Javascript, I would have just tacked on a property:

someObject.__whatDidTheCowSay = "moo";

But C# is not Javascript. I could have maintained some global IDictionary somewhere, but, even though I wanted it to support a feature in NExpect, where the code wouldn't have a running lifetime of any significance, it still felt like a bad idea to keep hard references to things within NExpect. The code associating the metadata has no idea of when that metadata won't be necessary any more -- and neither does the consumer.

Then I came across ConditionalWeakTable which looked very interesting: it's a way of storing data where the keys are weak references to the original objects, meaning that if the original objects are ready to GC, they can be collected and the weak reference just dies. In other words, I found a way to store arbitrary data referencing some parent object and the arbitrary data would only be held in memory until the end of the lifetime of the original object.

That's exactly what I needed.
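For reference, raw ConditionalWeakTable usage looks something like this (a minimal sketch -- the MetadataExtensions described below wrap this kind of bookkeeping for you):

using System.Runtime.CompilerServices;

// keys are held weakly: once the subject is collected, the
// associated value becomes collectable too
var metadata = new ConditionalWeakTable<object, string>();

var subject = new object();
metadata.Add(subject, "moo");

if (metadata.TryGetValue(subject, out var stored))
{
  Console.WriteLine(stored); // "moo"
}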

So was born the MetadataExtensions class, which provides the following extension methods on all objects:

  • SetMetadata<T>(string key, object value)
  • GetMetadata<T>(string key)
  • HasMetadata<T>(string key)

which we can use as follows:

public void MethodWantingToStoreMetadata(
  ISomeType objectWeWantToStoreStateAgainst)
{
  objectWeWantToStoreStateAgainst
    .SetMetadata("__whatDidTheCowSay", "moo");
}

// erstwhile, elsewhere:

public void DoSomethingInterestingIfNecessary(
  ISomeType objectWhichMightHaveMetadata)
{
  if (objectWhichMightHaveMetadata
        .HasMetadata<string>("__whatDidTheCowSay"))
  {
    var theCowSaid = objectWhichMightHaveMetadata
                       .GetMetadata<string>("__whatDidTheCowSay");
    if (theCowSaid == "moo")
    {
      Console.WriteLine("The cow is insightful.");
    } 
    else if (theCowSaid == "woof")
    {
      Console.WriteLine("That ain't no cow, son.");
    }
  }
}

And, of course, as soon as the associated object can be collected by the garbage collector (remembering that the reference to this object, maintained within PB, is weak), that object is collected and the associated metadata (if not referenced elsewhere, of course) is also freed up. This mechanism has facilitated some interesting behavior in NExpect, and I hope that it can be helpful to others too.

Markdown all things

Whilst blogger.com provides a fairly good blogging platform for regular writing, I've found that it's rather painful for technical blogging. In particular, code blocks are a mission. In the past, I've wrapped code in <pre><code> … </code></pre> and let highlight.js do all the heavy lifting of making that actually look readable. HighlightJs has been fantastic at that, but overall it still hasn't been as smooth a process as I would have liked. I still tended to write the non-code parts in the WYSIWYG html editor, and had to switch to the source view to work on code parts.

When I blog, I literally want to get out the information as quickly as possible, in a readable format. I'm not here to fight with styling.

So I was quite happy to stumble across showdown. A little Javascript in my template and suddenly I could write in possibly the simplest format ever: markdown. I had quick and easy access to simple styling elements (lists, headings, etc) as well as code blocks. All good, but not automagick out of the box.

I thought to myself, "I'm sure I can't be the only person who wants this", and "It would be nice if that auto-bootstrapping of markdown+code could be done anywhere, not just from within my blogger template".

So, as is so common within the open-source world, I stand upon the very tall, very broad shoulders of highlight.js and showdown to present auto-markdown: a script you can include on any page to convert any element with the markdown class to be rendered as markdown. It can even be configured (script versions and code theme) via some global variables, so you don't have to fiddle with the code if you don't want to.

I trialed it with my last post and it's how I'm writing now -- I just add a shell pre tag with the markdown class and get on with the writing, without any more fighting with the html editor. As a bonus: even if the script fails for some reason (such as if the user has Javascript disabled or GitHub doesn't supply my script in time), the blog is still in a readable format: markdown.

If you're interested, follow the instructions in the README.md. Feel free to open issues if you encounter any -- for instance, I encountered some stickiness with generics in code blocks. Also, bear in mind that markdown requires html-escaping for chevrons (ie, when embedding xml).

Feel free to share it as much as you like. If you don't feel comfortable referencing my code directly, fork my repo and keep your own copy (:

Now, if only blogger's html editor had a vi mode…

Thursday, 12 April 2018

What's in PeanutButter.Utils, exactly?

PeanutButter.Utils is a package which pretty-much evolved as I had common problems that I was solving day-to-day. People joining a team that I was working on would be exposed to bits of it and, like a virus, those bits would propagate across other code-bases. Some people asked for documentation, which I answered with a middle-ground of xmldoc, which most agreed was good enough. People around me got to know of the more useful bits in PeanutButter.Utils or would ask me questions like "Does PeanutButter.Utils have something which can do [X]?". I kind of took the ubiquity amongst my team-mates for granted.

Fast-forward a little bit, and I've moved on to another company, where people don't know anything about the time-savers in PeanutButter.Utils -- and it occurs to me that that statement probably applies to pretty-much most people -- so I thought it might be worthwhile to have some kind of primer on what you can expect to find in there. An introduction, if you will. I think there's enough to break the content down into sections, so we can start with:

Disposables

One of the patterns I like most in the .net world is that of IDisposable. It's a neat way to ensure that something happens at the end of a block of code, irrespective of what goes on inside that code. The code could throw or return early -- it doesn't matter: whatever happens in the Dispose method of the IDisposable declared at the top of a using block will be run. Usually we use this for cleaning up resources (eg database connections), but it struck me that there were some other convenient places to use it. Most generically, if you wanted to run something simple at the start of a block of code and run something else at the end (think of toggling something on for the duration of a block of code), you could use the convenient AutoResetter class:

using (new AutoResetter(
    () => ToggleFeatureOn(), 
    () => ToggleFeatureOff()))
{
    // code inside here has the feature toggled on
}
// code over here doesn't -- and the feature is 
//    toggled back off again even if the code 
//    above throws an exception.

It's very simple -- but it means that you can get the functionality of an IDisposable by writing two little lambda methods.

You can also have a variant where the result from the first lambda is fed into the second:

using (new AutoResetter(
    () => GetCounterAndResetToZero(),
    originalCount => ResetCounterTo(originalCount)))
{
    // counter is zero here
}
// counter is reset to original value here

Cool.

Other common problems that can be solved with IDisposable are:

Ensuring mutexes / semaphores are reset, even if an exception is encountered

For this, we can use AutoLocker:

using (new AutoLocker(someMutex))
{
}
using (new AutoLocker(someSemaphore))
{
}
using (new AutoLocker(someSemaphoreLite))
{
}

Temporary files in tests

using (var tempFile = new AutoTempFile())
{
   File.WriteAllBytes(
       tempFile.Path,
       Encoding.UTF8.GetBytes("moo, said the cow")
   );
   // we can run testing code against the file here
}
// file is gone here, like magick!

This uses Path.GetTempFileName() under the hood by default -- so you don't have to care about where the file actually lives. Of course, there are constructor overloads to:

  • create the file populated with data (string or bytes)
  • create the file in a different location (not the system temp data location)
  • create the file with a specific name

AutoTempFile also exposes the file's contents via properties (see the sketch after this list):

  • StringData for string contents
  • BinaryData for the contents as a byte array
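For example, a minimal sketch assuming the string-contents constructor overload mentioned above:

using (var tempFile = new AutoTempFile("moo, said the cow"))
{
   // the file already exists on disk with that content
   Console.WriteLine(tempFile.Path);       // somewhere in the system temp area
   Console.WriteLine(tempFile.StringData); // "moo, said the cow"
}
// and the file is cleaned up again here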

There is also an AutoTempFolder if you want a scratch area to work in for a period of time. When it is disposed, it and all its contents are deleted.
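Something like this (a sketch which assumes AutoTempFolder exposes its location via a Path property, much like AutoTempFile does):

using (var folder = new AutoTempFolder())
{
    // Path is assumed here to point at the scratch folder
    var scratchFile = Path.Combine(folder.Path, "scratch.txt");
    File.WriteAllText(scratchFile, "temporary working data");
    // do whatever needs a real filesystem here
}
// the folder and everything in it are gone here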

Similarly, AutoDeleter is an IDisposable which can keep track of multiple files you'd like to delete when it is disposed:

using (var deleter = new AutoDeleter())
{
    // some files are created, then we can do:
    deleter.Add(@"C:\Some\File");
    deleter.Add(@"C:\Some\Other\File");
}
// and here, those files are deleted. If they can't be 
//    deleted, (eg they are locked by some process),
//    then the error is quietly suppressed. 
//    `AutoDeleter` works for folders too.

Other disposables

As much as I love the using pattern, it can lead to some "arrow code", like this venerable ADO.NET code:

using (var conn = CreateDbConnection())
{
    conn.Open();
    using (var cmd = conn.CreateCommand())
    {
        cmd.CommandText = "select * from users";
        using (var reader = cmd.ExecuteReader())
        {
            // read from the reader here
        }
    }
}

Sure, many people don't use ADO.NET "raw" like this any more -- it's just an easy example which comes to mind. I've seen far worse "nest-denting" of using blocks too.

This can be flattened out a bit with AutoDisposer:

using (var disposer = new AutoDisposer())
{
    var conn = disposer.Add(CreateDbConnection());
    conn.Open();
    var cmd = disposer.Add(conn.CreateCommand());
    cmd.CommandText = "select * from users";
    var reader = disposer.Add(cmd.ExecuteReader());
    // read from the db here
}
// reader is disposed
// cmd is disposed
// conn is disposed

AutoDisposer disposes of items in reverse-order in case of any disposing dependencies.

So that's part 1 of "What's in PeanutButter.Utils?". There are other interesting bits, like:

  • extension methods to
    • make some operations more convenient
      • do conversions
      • work with Stream objects
      • work with DateTime values
    • facilitate more functional code (eg .ForEach for collections)
    • let you use async lambdas in your LINQ (SelectAsync and WhereAsync)
    • test and manipulate strings
  • the DeepEqualityTester, which is at the heart of NExpect's .Deep and .Intersection equality testing
  • MemberExpression helpers
  • Reflection tidbits
  • reading and writing arbitrary metadata for any object you encounter (think of it like adding property data to any object)
  • Some pythonic methods (Range and a (imo) more useful Zip than the one bundled in LINQ)
  • dictionaries
    • DictionaryWrappingObject lets you treat any object like you would in Javascript, with text property indexes
    • DefaultDictionary returns default values for unknown keys
    • MergeDictionary allows layering multiple dictionaries into one "view"
    • CaseWarpingDictionary provides a decorator dictionary for when the dictionary you have does indexing with inconvenient case rules

I hope to tackle these in individual posts (:
