Monday 6 November 2017

What's new in PeanutButter

I realise that it's been a while (again) since I've posted an update about new things in PeanutButter (GitHub, Nuget). I've been slack!

Here are the more interesting changes:
  1. PeanutButter.Utils now has a netstandard2.0 target in the package. So you can use those tasty, tasty utils from netcore2.0
  2. This facilitated adding a netstandard2.0 target for PeanutButter.RandomGenerators -- so now you can use the GenericBuilder and RandomValueGen, even on complex types. Yay! Spend more time writing interesting code and less time thinking of a name for your Person object (:
  3. Many fixes to DeepClone(), the extension method for objects, which gives you back a deep-cloned version of the original, including support for collections and cloning the object's actual type, not the type inferred by the generic usage - so a downcast object is properly cloned now.
  4. DeepEqualityTester can be used to test the shapes of objects without caring about actual values -- simply set OnlyCompareShape to true. Combined with setting FailOnMissingProperties to false, a deep equality test between an instance of your model and an instance of an api model can let you know whether your models have at least the required fields to satisfy the target api (a rough sketch follows after this list).
  5. Random object generation always could use NSubstitute to generate instances of interfaces, but now also supports using PeanutButter.DuckTyping's Create.InstanceOf<T>() to create your random instances. Both strategies are determined at runtime from available assemblies -- there are no hard dependencies on NSubstitute or PeanutButter.DuckTyping.
  6. More async IEnumerable extensions (which allow you to await on a Linq query with async lambdas -- but bear in mind that these won't be translated to, for example, Linq-to-Sql code to run at the server-side):
    1. SelectAsync
    2. ToArrayAsync
    3. WhereAsync
    4. AggregateAsync
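To make item 4 above concrete, here's a rough sketch of how a shape-only comparison might look. The OnlyCompareShape and FailOnMissingProperties flags come straight from the description above; the constructor shape and the AreDeepEqual() call are my assumptions about the API, so check the PeanutButter.Utils source for the exact signatures.

using PeanutButter.Utils;

public static class ShapeCheck
{
  // returns true when localModel has (at least) the structure the api model expects
  public static bool HasRequiredApiFields(object localModel, object apiModel)
  {
    var tester = new DeepEqualityTester(localModel, apiModel) // assumed constructor
    {
      OnlyCompareShape = true,          // compare structure, ignore values
      FailOnMissingProperties = false   // don't fail just because one side has extra properties
    };
    return tester.AreDeepEqual();       // assumed method name
  }
}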
There was also some cleanup (I defaulted R# to show warnings, not just errors -- and squashed around 2000 of them) and I moved all projects over to the newer PackageReference mechanism for csproj files.

So, yeah, stuff happened (:

Sunday 8 October 2017

Everything sucks. And that's OK.

There is no perfect code, no perfect language, no perfect framework or methodology. Everything is, in some way, flawed.

This realization can be liberating -- when you accept that everything is flawed in some way, it puts aside petty arguments about the best language, editor, IDE, framework, database -- whatever some zealot is peddling as being the ultimate solution to your problems. It also illuminates the need for all of the different options out there -- new programming languages are born sometimes on a lark, but often because the creator wanted to express themselves and their logical intent better than they could with the languages that they knew. Or perhaps they wanted to remove some of the complexity associated with common tasks (such as memory allocation and freeing or asynchronous logic). Whatever the reason, there is (usually) a valid one.

The same can be said for frameworks, databases, libraries -- you name it. Yes, even the precious pearls that you've hand-crafted into existence.

In fact, especially those.

We can be blind to the imperfections in our own creations. Sometimes it's just ego in the way. Sometimes it's a blind spot. Sometimes it's because we tie our own self-worth to the things we create.

"You are not your job, you're not how much money you have in the bank. You are not the car you drive. You're not the contents of your wallet. You are not your fucking khakis."

For the few that didn't recognize the above, it was uttered by Tyler Durden from the cult classic Fight Club. There are many highlights one could pick from that movie, but this one has been on my mind recently. It refers to how people define themselves by their possessions, finding their identity in material items which are, ultimately, transient.

Much like how we're transient, ultimately unimportant in the grand scale of space and time that surrounds us. Like our possessions, we're just star stuff, on loan from the universe whilst we experience a tiny fraction of it.

I'd like to add another item to the list above:


You are not the code you have written.



This may be difficult to digest. It may stick in your throat, but some freedom comes in accepting this.

As a developer, you have to be continually learning, continually improving, just to remain marginally relevant in the vast expanse of languages, technologies, frameworks, companies and problems in the virtual sea in which our tiny domains float. If you're not learning, you're left behind the curve. Stagnation is a slow death which can be escaped by shaking up your entire world (eg forcing learning by hopping to a new company) or embraced by simply accepting it as your fate as you move into management, doomed to observe others actually living out their passions as they create new things and you report on it. From creator to observer. You may be able to balance this for a while, you may even have the satisfaction of moving the pieces on the board, but you've given up something of yourself, some part of your creator spirit. You can still learn here -- but you can also survive just fine as you are.

Or at least that's how it looks from the outside. And that's pretty-much how I've heard it described from the inside. I wouldn't know, personally. I'm too afraid to try. I like making things.

This journey of constant learning and improvement probably applies to other professions too, especially those requiring some level of creativity and craftsmanship. You either evolve to continue creating or you fade away into obscurity.

And if you are passionate about what you're doing, if you are continually learning, continually hungry to be better at what you do, continually looking for ways to evolve your thought processes and code, then inevitably, you have to look back on your creations of the past and feel...

Displeased.

Often I can add other emotions to the mix: embarrassed, appalled, even loathing. But at the very least, looking back on something you've created, you should be able to see how much better you'd be able to do it now. This doesn't mean that your past creations have no value -- especially if they are actually useful and in use. It just means that a natural part of continual evolution is the realization that everything you've ever done, everything you ever will do, given enough distance of time, upon reflection, sucks.

It starts when you recognize that code you wrote a decade ago sucks. It grows as you realize that code you wrote 5 years, even 2 years ago sucks. It crescendos as you realize that the code you wrote 6 months ago sucks -- indeed, even the code you wrote a fortnight ago could be done better with what you've learned in the last sprint. 

It's not a bad thing. The realization allows you to divorce your self-worth from your creations. If anything, you could glean some of your identity from the improvements you've been able to make throughout your career. Because, if you realize that your past creations, in some way or another, all suck, if you realize that this truth will come to pass for your current and future creations, then you have to also come to the conclusion that recognizing deficiencies in your past accomplishments highlights your own personal evolution.

If you look back over your past code and you don't feel some kind of disappointment, if you can't point out the flaws in your prior works, then I'd have to conclude that you're either stagnating or you're deluding yourself -- perhaps out of pride, perhaps because you've attached your self-worth to the creations you've made. Neither is a good place to be -- you're either becoming irrelevant or you're unteachable and you will become irrelevant.

If you can come to accept this as truth, it also means that you can accept criticism with valid reasoning as an opportunity to learn instead of an attack on your character. 

I watched a video where the speaker posits that code has two audiences: the compiler and your co-workers. The compiler doesn't care about style or readability. The compiler simply cares about syntactical correctness. I've tended to take this a little further with the mantra:


Code is for co-workers, not compilers.



I can't claim to be the original author -- but I also can't remember where I read it. In this case, it does boil down to a simple truth: if your code is difficult for someone else to extend and maintain, it may not be the sparkling gem you think it is. If a junior tasked with updating the code gets lost, calls on a senior for help and that senior can't grok your code -- it's not great. If that senior points that out, then they are doing their job as the primary audience of that code. This is not a place for conflict -- it is a place for learning. Yes, sometimes code is just complex. Sometimes language or framework features are not obvious to an outside programmer looking in. But it's unusual to find complex code which can't be made understandable. Einstein said:

“If you can't explain it to a six year old, you don't understand it yourself.” 

And I'd say this extends to your code -- if your co-workers can't extend it, learn of the (potentially complex) domain, work with the code you've made, then you're missing a fundamental reason for the code to exist: explaining the domain to others through modelling it and solving problems within it.

Working with people who haven't accepted this is difficult -- you can't point out flaws that need to be fixed or, in extreme cases, even fix them without inciting their wrath. You end up having to guerilla-code to resolve code issues or just bite your tongue as you quietly sweep up after them. Or worse -- working around the deficiencies in their code because they insist you depend on it whilst simultaneously denying you the facility to better it.

At some point, we probably all felt the pang when someone pointed out a flaw in our code. Hopefully, as we get older and wiser, this falls away. Personally, I think that divorcing your self-image from your creations, allowing yourself to be critical of the things you've made -- this is one of the marks of maturity that defines the difference between senior developer and junior developer. Not to say that a junior can't master this already -- more to say that I question the "seniority" of a senior who can't do this. It's just one of the skills you need to progress. Like typing or learning new languages.

All of this isn't to say that you can't take pride in your work or that there's no point trying to do your best. We're on this roundabout trying to learn more, to evolve, to be better. You can only get better from a place where you were worse. One of the best sources of learning is contemplated failure. You can also feel pride in the good parts of what you've created, as long as that is tempered by a realistic, open view on the not-so-good parts.

You may even like something you've made, for a while, at least. I'm currently in a bit of a honeymoon phase with NExpect (GitHub, Nuget): it's letting me express myself better in tests, it's providing value to a handful of other people -- but this too shall pass. At some point, I'm going to look at the code and wonder what I was thinking. I'm going to see a more elegant solution, and I'm going to see the inadequacies of the code. Indeed, I've already experienced this in part -- but it's been overshadowed by the positive stuff that I've experienced, so I'm not quite at the loathing state yet.

You are not your fucking code. When its faults become obvious, have the grace to learn instead of being offended.

Monday 18 September 2017

Fluent, descriptive testing with NExpect

The post is loaded dynamically; if it doesn't load properly, you can read it here: https://github.com/fluffynuts/blog/blob/master/20180918_IntroducingNExpect.md

This week in PeanutButter

Nothing major, really -- two bugfixes, which may or may not be of interest:

  1. PropertyAssert.AreEqual allows for comparison of nullable and non-nullable values of the same underlying type -- which especially makes sense when the actual value being tested (eg nullable int) is being compared with some concrete value (eg int). 
  2. Fix for the object extension DeepClone() -- some production code showed that Enum values weren't being copied correctly. So that's fixed now.
    If you're wondering what this is, DeepClone() is an extension method on all objects which provides a copy of that object, deep-cloned (so all reference types are new instances and all value types are copied), much like lodash's _.cloneDeep() for Javascript. This can be useful for comparing a "before" and "after" snapshot around a bunch of mutations, especially using the DeepEqualityTester from PeanutButter.Utils or the object extension DeepEquals(), which does deep equality testing, much like you'd expect -- a quick sketch follows below.
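A quick sketch of that before/after pattern (my own illustration -- the Order type and the discount logic are made up):

using System;
using PeanutButter.Utils;

public class Order
{
  public decimal Total { get; set; }
}

public class OrderProcessor
{
  public void ApplyDiscount(Order order)
  {
    var before = order.DeepClone();   // full copy: reference types re-created, value types copied

    order.Total *= 0.9m;              // mutate the original

    // deep value comparison between the snapshot and the mutated instance
    var unchanged = before.DeepEquals(order);
    Console.WriteLine(unchanged ? "nothing changed" : "order was mutated");
  }
}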

There's also been some assertion upgrading -- PeanutButter consumes, and helps to drive, NExpect, an assertions library modelled after Chai for syntax and Jasmine for user-space extensibility. Head on over to Github to check it out -- though it's probably time I wrote something here about it (:

Friday 4 August 2017

This week in PeanutButter

Ok, so I'm going to give this a go: (semi-)regularly blogging about updates to PeanutButter in the hopes that perhaps someone sees something useful that might help out in their daily code. Also so I can just say "read my blog" instead of telling everyone manually ^_^

So this week in PeanutButter, some things have happened:

  • DeepEqualityTester fuzzes a little on numerics -- so you can compare numerics of equal value and different type correctly (ie, (int)2 == (decimal)2). This affects the {object}.DeepEquals() and PropertyAssert.AreDeepEqual() methods.
  • DeepEqualityTester can compare fields now too. PropertyAssert.AreDeepEqual will not use this feature (hey, the name is PropertyAssert!), but {object}.DeepEquals() will, by default -- though you can disable this.
  • DuckTyper could duck Dictionaries to interfaces and objects to interfaces -- but now will also duck objects with Dictionary properties to their respective interfaces where possible.
  • String extensions for convenience:
    • ToKebabCase()
    • ToPascalCase()
    • ToSnakeCase()
    • ToCamelCase()
  • DefaultDictionary<TKey, TValue> - much like Python's defaultdict, this provides a dictionary where you give a strategy for what to return when a key is not present. So a DefaultDictionary<string, bool> could have a default value of true or false instead of throwing exceptions on unknown keys.
  • MergeDictionary<TKey, TValue> - provides a read-only "view" on a collection of similarly typed IDictionary<TKey, TValue> objects, resolving values from the first dictionary they are found in. Coupled with DefaultDictionary<TKey, TValue>, you can create layered configurations with fallback values.
  • DuckTyping can duck from string values to enums
And I'm quite sure the reverse (enum to string) will come for cheap-or-free. So there's that (: You might use this when you have, for example, a Web.Config with a config property "DefaultPriority" and you would like to end up with an interface like:

public enum Priorities
{
  Low,
  Medium,
  High
}
public interface IConfig
{
  Priorities DefaultPriority { get; }
}


And a Web.Config line like:
<appSettings>
  <add key="DefaultPriority" value="Medium" />
</appSettings>


Then you could, somewhere in your code (perhaps in your IOC bootstrapper) do:
 var config = WebConfigurationManager.AppSettings.DuckAs<IConfig>();

(This already works for string values, but enums are nearly there (:). You can also use FuzzyDuckAs<T>(), which will allow type mismatching (to a degree; eg a string-backed field can be surfaced as an int) and will also give you freedom with your key names: whitespace and casing don't matter (along with punctuation). (Fuzzy)DuckAs<T>() also has options for key prefixing (so you can have "sections" of settings, with a prefix, like "web.{setting}" and "database.{setting}"). But all of that isn't really from this week -- it's just useful for lazy devs like me (:

Saturday 25 March 2017

C# Evolution

(and why you should care)

I may be a bit of a programming language nut. I find different languages interesting not just because they are semantically different or because they offer different features or even because they just look pretty (I'm lookin' at you, Python), but because they can teach us new things about the languages we already use.

I'm of the opinion that no technology with any staying power has no value to offer. In other words, if the tech has managed to stay around for some time, there must be something there. You can hate on VB6 as much as you want, but there has to be something there to have made it dominate desktop programming for the years that it did. That's not the focus of this discussion though.

Similarly, when new languages emerge, instead of just rolling my eyes and uttering something along the lines of "Another programming language? Why? And who honestly cares?", I prefer to take a step back and have a good long look. Creating a new language and the ecosystem required to interpret or compile that language is a non-trivial task. It takes a lot of time and effort, so even when a language seems completely impractical, I like to consider why someone might spend the time creating it. Sure, there are outliers where the only reason is "because I could" (I'm sure Shakespeare has to be one of those) or "because I hate everyone else" (Brainfuck, Whitespace and my favorite to troll on, Perl -- I jest, because Perl is a powerhouse; I just haven't ever seen a program written in Perl which didn't make me cringe, though Larry promises that is supposed to change with Perl 6).

Most languages are born because someone found the language they were dealing with was getting in the way of getting things done. A great example is Go, which was dreamed up by a bunch of smart programmers who had been doing this programming thing for a while and really just wanted to make an ecosystem which would help them to get stuff done in a multi-core world without having to watch out for silly shit like deadlocks and parallel memory management. Not that you can't hurt yourself in even the most well-designed language (indeed, if you definitely can't hurt yourself in the language you're using, you're probably bound and gagged by it -- but I'm not here to judge what you're into).

Along the same lines, it's interesting to watch the evolution of languages, especially as languages evolve out of fairly mundane, suit-and-tie beginnings. I feel like C# has done that -- and will continue to do so. Yes, a lot of cool features aren't unique or even original -- deconstruction in C#7 has been around for ages in Python, for example -- but that doesn't make them any less valuable.

I'm going to skip some of the earlier iterations and start at where I think it's interesting: C#5. Please note that this post is (obviously) opinion. Whilst I'll try to cover all of the feature changes that I can dig up, I'm most likely going to focus on the ones which I think provide the programmer the most benefit.

C#5

C#5 brought async features and caller information. Let's examine the latter before I offer up what will probably be an unpopular opinion on the former.

Caller information allowed providing attributes on optional function parameters such that, if the caller didn't provide a value, the called code could glean:
  • Caller file path
  • Caller line number
  • Caller member name
This is a bonus for logging and debugging of complex systems, but also allowed tricks whereby the name of a property could be automagically passed into, for example, an MVVM framework method call in your WPF app. This makes refactor-renaming easier and removes some magic strings, not to mention making debug logging way way simpler. Not ground-breaking, but certainly [CallerMemberName] became a friend of the WPF developer. A better solution for the exact problem of property names and framework calls came with nameof in C#6, but CallerMember* attributes are still a good thing.
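For reference, a minimal sketch of the caller-information attributes in action (the Log class and the view model are just illustrations of my own):

using System;
using System.Runtime.CompilerServices;

public static class Log
{
  // the compiler fills these parameters in at the call site unless the caller overrides them
  public static void Debug(
    string message,
    [CallerMemberName] string member = null,
    [CallerFilePath] string file = null,
    [CallerLineNumber] int line = 0)
  {
    Console.WriteLine("{0}:{1} [{2}] {3}", file, line, member, message);
  }
}

public class PersonViewModel
{
  private string _name;
  public string Name
  {
    get { return _name; }
    set { _name = value; OnPropertyChanged(); }  // [CallerMemberName] supplies "Name"
  }

  private void OnPropertyChanged([CallerMemberName] string propertyName = null)
  {
    Log.Debug(propertyName + " changed");
  }
}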



C#5 also brought async/await. On the surface, this seems like a great idea. If only C#'s async/await worked anything like the async/await in Typescript, where, under the hood, there's just a regular old promise and no hidden bullshit. C#'s async/await looks like it's just going to be Tasks and some compiler magic under the hood, but there's that attached context which comes back to bite asses more often than a mosquito with a scat fetish. There are some times when things will do exactly what you expect and other times when they won't, just because you don't know enough about the underlying system.

That's the biggest problem with async/await, in my opinion: it looks like syntactic sugar to take away the pain of parallel computing from the unwashed masses but ends up just being trickier than having to learn how to do actual multi-threading. There's also the fickle task scheduler which may decide 8 cores doesn't mean 8 parallel tasks -- but that's OK: you can swap that out for your own task scheduler... as long as you understand enough of the underlying system, again (as this test code demonstrates).

Like many problems that arise out of async / parallel programming, tracking down the cause of sporadic issues in code is non-trivial. I had a bit of code which would always fail the first time, and then work like a charm -- until I figured out it was the context that was causing issues, so I forced a null context and the problem stopped. The developer has to start learning about task continuation options and start caring about how external libraries do things. And many developers aren't even aware of when it's appropriate to use async/await, opting to use it everywhere and just add overhead to something which really didn't need to be async, like most web api controllers. Async/await makes a lot of sense in GUI applications though.

Having async/await around IO-bound stuff in web api calls may be good for high-volume web requests because it allows the worker thread to be re-assigned to handle another request. I have yet to see an actual benchmark showing better web performance from simply switching to async/await for all calls.
The time you probably most want to use it is for concurrent IO to shorten the overall request time for a single request. Some thought has to go into this though -- handling too many requests concurrently may just end up with many requests timing out instead of a few requests being given a quick 503, indicating that the application needs some help with scaling. In other words, simply peppering your code with async/await could result in no net gain, especially if the code being awaited is hitting the same resource (eg, your database).
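As a sketch of that single-request concurrency case (the repository interfaces and the summary shape here are hypothetical, not from any real project):

using System.Threading.Tasks;
using System.Web.Http;

public interface ICustomerRepository { Task<string> FindNameAsync(int id); }
public interface IOrderRepository { Task<string[]> FindForCustomerAsync(int customerId); }

public class OrderSummaryController : ApiController
{
  private readonly ICustomerRepository _customers;
  private readonly IOrderRepository _orders;

  public OrderSummaryController(ICustomerRepository customers, IOrderRepository orders)
  {
    _customers = customers;
    _orders = orders;
  }

  public async Task<object> Get(int customerId)
  {
    // the two lookups are independent, so start both and await them together;
    // the request takes roughly as long as the slower call, not the sum of the two
    var nameTask = _customers.FindNameAsync(customerId);
    var ordersTask = _orders.FindForCustomerAsync(customerId);
    await Task.WhenAll(nameTask, ordersTask);
    return new { Customer = await nameTask, Orders = await ordersTask };
  }
}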

Which leads to the second part of C#'s async/await that I hate: the async zombie apocalypse. Because asking the result of a method marked async to just .Wait() is suicide (I hope you know it is; please, please, please don't do that), async/await patterns tend to propagate throughout code until everything is awaiting some async function. It's the Walking Dead, traipsing through your code, leaving little async/await turds wherever they go.




You can use ConfigureAwait() to get around deadlocking on the context selected for your async code -- but you must remember to apply it to all async results if you're trying to "un-async" a block of code. You can also set the synchronization context (I suggest the useful value of null). Like the former workaround, it's hardly elegant.
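A minimal sketch of what that looks like in practice:

using System.Net.Http;
using System.Threading.Tasks;

public static class PayloadFetcher
{
  // ConfigureAwait(false) tells each awaiter not to resume on the captured context;
  // it has to be applied to every await in the chain if the goal is to avoid
  // deadlocking a caller which blocks on the result.
  public static async Task<string> FetchAsync(HttpClient client, string url)
  {
    var response = await client.GetAsync(url).ConfigureAwait(false);
    return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
  }
}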

As much as I hate (fear?) async/await, there are places where it makes the code clearer and easier to deal with. Mostly in desktop applications, under event-handling code. It's not all bad, but the concept has been over-hyped and over-used (and poorly at that -- like EF's SaveChangesAsync(), which you might think, being async, is thread-safe, and you'd be DEAD WRONG).

Let's leave it here: use async/await when it provides obvious value. Question its use at every turn. For every person who loves async/await, there's a virtual alphabet soup of blogs explaining how to get around some esoteric failure brought on by the feature. As with multi-threading in C/C++: "use with caution".



C#6

Where C#5 brought a small number of feature changes (but one of the most damaging), C#6 brought a smorgasbord of incremental changes. There was something there for everyone:

Read-only auto properties made programming read-only properties just a little quicker, especially when the property values came from the constructor. So code like:

public class VideoGamePlumber
{
  public string Name { get { return _name; } }
  private string _name;

  public VideoGamePlumber(string name)
  {
    _name = name;
  }
}
became:

public class VideoGamePlumber
{
  public string Name { get; private set; }

  public VideoGamePlumber(string name)
  {
    Name = name;
  }
}
but that still leaves the Name property open for change within the VideoGamePlumber class, so better would be the C#6 variant:

public class VideoGamePlumber
{
  public string Name { get; }

  public VideoGamePlumber(string name)
  {
    Name = name;
  }
}
The Name property can only be set from within the constructor. Unless, of course, you resort to reflection, since the immutability of Name is enforced by the compiler, not the runtime. But I didn't tell you that.



Auto-property initializers seem quite convenient, but I'll admit that I rarely use them, I think primarily because the times that I want to initialize outside of the constructor, I generally want a private (or protected) field, and when I want to set a property at construction time, it's probably getting its value from a constructor parameter. I don't hate the feature (at all), just don't use it much. Still, if you wanted to:
public class Doggie
{
  public string Name { get; set; }

  public Doggie()
  {
    Name = "Rex"; // set default dog name
  }
}
becomes:
public class Doggie
{
  public string Name { get; set; } = "Rex";
}
You can combine this with the read-only property syntax if you like:
public class Doggie
{
  public string Name { get; } = "Rex";
}
but then all doggies are called Rex (which is quite presumptuous) and you really should have just used a constant, which you can't modify through reflection.



Expression-bodied function members can provide a succinct syntax for a read-only, calculated property. However, I use them sparingly because anything beyond a very simple "calculation", eg:

public class Person
{
  public string FirstName { get; }
  public string LastName { get; }
  public string FullName => $"{FirstName} {LastName}";

  public Person(string firstName, string lastName)
  {
    FirstName = firstName;
    LastName = lastName;
  }
}
starts to get long and more difficult to read; though that argument is simply countered by having a property like:

public class Business
{
  public string StreetAddress { get; }
  public string Suburb { get; }
  public string Town { get; }
  public string PostalCode { get; }
  public string PostalAddress => GetAddress();

  public Business(string streetAddress, string suburb, string town, string postalCode)
  {
    StreetAddress = streetAddress;
    Suburb = suburb;
    Town = town;
    PostalCode = postalCode;
  }
  
  private string GetAddress()
  {
    return string.Join("\n", new[]
    {
      StreetAddress,
      Suburb,
      Town,
      PostalCode
    });
  }
}
Where the logic for generating the full address from the many bits of involved data is tucked away in a more readable method and the property itself becomes syntactic sugar, looking less clunky than just exposing the GetAddress method.

Index initializers provide a neater syntax for initializing Dictionaries, for example:
var webErrors = new Dictionary<int, string>()
{
  { 404, "Page Not Found" },
  { 302, "Page moved, but left a forwarding address" },
  { 500, "The web server can't come out to play today" }
};

Can be written as:
var webErrors = new Dictionary<int, string>
{
  [404] = "Page not Found",
  [302] = "Page moved, but left a forwarding address.",
  [500] = "The web server can't come out to play today."
};

It's not ground-breaking, but I find it a little more pleasing on the eye.

Other stuff that I hardly use includes:

Extension Add methods for collection initializers allow your custom collections to be initialized like standard ones. Not a feature I've ever used because I haven't had the need to write a custom collection.

Improved overload resolution reduced the number of times I shook my fist at the complainer compiler. Ok, so I technically use this, but this is one of those features that, when it's working, you don't even realise it.

Exception filters made exception handling more expressive and easier to read.
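For instance (a minimal example of my own): a filter lets the catch fire only for the cases you can actually handle, while anything else propagates with its original stack trace.

using System.IO;

public static class ConfigReader
{
  // the catch block only runs when the `when` clause evaluates to true
  public static string ReadOrDefault(string path)
  {
    try
    {
      return File.ReadAllText(path);
    }
    catch (IOException ex) when (ex is FileNotFoundException || ex is DirectoryNotFoundException)
    {
      return string.Empty;
    }
  }
}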

await in catch and finally blocks allows the async/await zombies to stumble into your exception handling. Yay.



On to the good bits (that I regularly use) though:

using static made using static functions so much neater -- as if they were part of the class you were currently working in. I don't push static functions in general because using them means that testing anything which uses them has to test them too, but there are places where they make sense. One is in RandomValueGen from PeanutButter.RandomGenerators, a class which provides functions to generate random data for testing purposes. A static import means you no longer have to mention the RandomValueGen class throughout your test code:

using PeanutButter.RandomGenerators;

namespace Bovine.Tests
{
  [TestFixture]
  public class TestCows
  {
    [Test]
    public void Moo_ShouldBeAnnoying()
    {
      // Arrange
      var cow = new Cow()
      {
        Name = RandomValueGen.GetRandomString(),
        Gender = RandomValueGen.GetRandom(),
        Age = RandomValueGen.GetRandomInt()
      };
      // ...
    } 
  }
}

Can become:
using static PeanutButter.RandomGenerators.RandomValueGen;

namespace Bovine.Tests
{
  [TestFixture]
  public class TestCows
  {
    [Test]
    public void Moo_ShouldBeAnnoying()
    {
      // Arrange
      var cow = new Cow()
      {
        Name = GetRandomString(),
        Gender = GetRandom(),
        Age = GetRandomInt()
      };
      // ...
    } 
  }
}


Which is way more readable, simply because there's less unnecessary cruft in there. At the point of reading (and writing) the test, the source library and class for the random values are not only irrelevant and unnecessary -- they're just plain noisy and ugly.


Null conditional operators. Transforming fugly multi-step checks for null into neat code:

if (thing != null &&
    thing.OtherThing != null &&
    thing.OtherThing.FavoriteChild != null &&
    // ... and so on, and so forth, turtles all the way down
    //     until, eventually
    thing.OtherThing.FavoriteChild.Dog.Collar.Spike.Metal.Manufacturer.Name != null)
{
  return thing.OtherThing.FavoriteChild.Dog.Collar.Spike.Metal.Manufacturer.Name;
}
return "Unknown manufacturer";

becomes:
return thing
  ?.OtherThing
  ?.FavoriteChild
  ?.Dog
  ?.Collar
  ?.Spike
  ?.Metal
  ?.Manufacturer
  ?.Name
  ?? "Unknown manufacturer";

and kitties everywhere rejoiced:


String interpolation helps you to turn disaster-prone code like this:

public void PrintHello(string salutation, string firstName, string lastName)
{
  Console.WriteLine("Hello, " + salutation + " " + firstName + " of the house " + lastName);
}

or even the less disaster-prone, more efficient, but not that amazing to read:
public void PrintHello(string salutation, string firstName, string lastName)
{
  Console.WriteLine(String.Join(" ", new[]
  {
    "Hello,",
    salutation,
    firstName,
    "of the house",
    lastName
  }));
}

Into the safe, readable:
public void PrintHello(string salutation, string firstName, string lastName)
{
  Console.WriteLine($"Hello, {salutation} {firstName} of the house {lastName}");
}



nameof is also pretty cool, not just for making your constructor null-checks impervious to refactoring:

public class Person
{
  public Person(string name)
  {
    if (name == null) throw new ArgumentNullException(nameof(name));
  }
}


(if you're into constructor-time null-checks) but also for using test case sources in NUnit:
[Test, TestCaseSource(nameof(DivideCases))]
public void DivideTest(int n, int d, int q)
{
  Assert.AreEqual( q, n / d );
}

static object[] DivideCases =
{
  new object[] { 12, 3, 4 },
  new object[] { 12, 2, 6 },
  new object[] { 12, 4, 3 } 
};

C#7

This iteration of the language brings some neat features for making code more succinct and easier to grok at a glance.
Inline declaration of out variables makes using methods with out variables a little prettier. This is not a reason to start using out parameters: I still think that there's normally a better way to do whatever it is that you're trying to achieve with them and use of out and ref parameters is, for me, a warning signal in the code of a place where something unexpected could happen. In particular, using out parameters for methods can make the methods really clunky because you have to set them before returning, making quick returns less elegant. Part of me would have liked them to be set to the default value of the type instead, but I understand the rationale behind the compiler not permitting a return before setting an out parameter: it's far too easy to forget to set it and end up with strange behavior.

I think I can count on one hand the number of times I've written a method with an out or ref parameter and I couldn't even point you at any of them. I totally see the point of ref parameters for high-performance code where it makes sense (like manipulating sound or image data). I just really think that when you use out or ref, you should always ask yourself "is there another way to do this?". Anyway, my opinions on the subject aside, there are times when you have to interact with code not under your control and that code uses out params, for example:

public void PrintNumeric(string input)
{
  int result;
  if (int.TryParse(input, out result))
  {
    Console.WriteLine($"Your number is: {input}");
  }
  else
  {
    Console.WriteLine($"'{input}' is not a number )':");
  }
}

becomes:
public void PrintNumeric(string input)
{
  if (int.TryParse(input, out int result))
  {
    Console.WriteLine($"Your number is: {input}");
  }
  else
  {
    Console.WriteLine($"'{input}' is not a number )':");
  }
}

It's a subtle change, but if you have to use out parameters enough, it becomes a lot more convenient, less noisy.

Similarly ref locals and returns, where refs are appropriate, can make code much cleaner. The general use-case for these is to return a reference to a non-reference type for performance reasons, for example when you have a large set of ints, bytes, or structs and would like to pass off to another method to find the element you're interested in before modifying it. Instead of the finder returning an index and the outer call re-referencing into the array, the finder method can simply return a reference to the found element so the outer call can do whatever manipulations it needs to. I can see the use case for performance reasons in audio and image processing as well as large sets of structs of data. The example linked above is quite comprehensive and the usage is for more advanced code, so I'm not going to rehash it here.
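The linked example covers the realistic use case; purely to show the shape of the syntax, here's a toy sketch of my own:

using System;

public static class RefDemo
{
  // returns a reference to the first matching element rather than a copy or an index
  public static ref int FindFirstNegative(int[] values)
  {
    for (var i = 0; i < values.Length; i++)
    {
      if (values[i] < 0)
        return ref values[i];
    }
    throw new InvalidOperationException("no negative value found");
  }

  public static void Main()
  {
    var samples = new[] { 3, 7, -2, 9 };
    ref int found = ref FindFirstNegative(samples);
    found = 0;                        // writes straight back into the array
    Console.WriteLine(samples[2]);    // prints 0
  }
}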

Tuples have been available in .NET for a long time, but they've been unfortunately cumbersome. The new syntax in C#7 is changing that. Python tuples have always been elegant and now a similar elegance comes to C#:
public static class TippleTuple
{
  public static void Main()
  {
    // good
    (int first, int second) = MakeAnIntsTuple();
    Console.WriteLine($"first: {first}, second: {second}");
    // better
    var (flag, message) = MakeAMixedTuple();
    Console.WriteLine($"first: {flag}, second: {message}");
  }

  public static (int first, int second) MakeAnIntsTuple()
  {
    return (1, 2);
  }

  public static (bool flag, string name) MakeAMixedTuple()
  {
    return (true, "Moo");
  }
}

Some notes though: the release notes say that you should only need to install the System.ValueTuple nuget package to support this feature in VS2015 and lower, but I found that I needed to install the package on VS2017 too. Resharper still doesn't have a clue about the feature as of 2016.3.2, so usages within the interpolated strings above are highlighted as errors. Still, the program above compiles and is way more elegant than using Tuple<> generic types. It's very clever that language features can be delivered by nuget packages though.

Local functions provide a mechanism for keeping little bits of re-used code local within a method. In the past, I've used a Func variable where I've had a little piece of re-usable logic:
private NameValueCollection CreateRandomSettings()
{
  var result = new NameValueCollection();
  Func<string> randInt = () => RandomValueGen.GetRandomInt().ToString();
  result["MaxSendAttempts"] = randInt();
  result["BackoffIntervalInMinutes"] = randInt();
  result["BackoffMultiplier"] = randInt();
  result["PurgeMessageWithAgeInDays"] = randInt();
  return result;
}

which can become:
public static NameValueCollection GetSomeNamedNumericStrings()
{
  var result = new NameValueCollection();
  result["MaxSendAttempts"] = RandInt();
  result["BackoffIntervalInMinutes"] = RandInt();
  result["BackoffMultiplier"] = RandInt();
  result["PurgeMessageWithAgeInDays"] = RandInt();
  return result;

  string RandInt () => GetRandomInt().ToString();
}

Which, I think, is neater. There's also possibly a performance benefit: every time the first implementation (CreateRandomSettings) is called, the randInt Func is instantiated, whereas the second implementation has the local function compiled at compile-time, baked into the resultant assembly. Whilst I wouldn't put it past the compiler or JIT to do clever optimisations for the first implementation, I wouldn't expect it.

Throw expressions also offer a neatening to your code:

public int ThrowIfNullResult(Func<int?> source)
{
  var result = source();
  if (result == null)
    throw new InvalidOperationException("Null result is not allowed");
  return result.Value;
}
can become:
public int ThrowIfNullResult(Func<int?> source)
{
  return source() ??
    throw new InvalidOperationException("Null result is not allowed");
}

So now you can actually write a program where every method has one return statement and nothing else.



Time for the big one: Pattern Matching. This is a language feature native to F# and anyone who has done some C#-F# crossover has whined about how it's missing from C#, including me. Pattern matching not only elevates the rather bland C# switch statement from having constant-only cases to non-constant ones:

public static string InterpretInput(string input)
{
  switch (input)
  {
    case string now when now == DateTime.Now.ToString("yyyy/MM/dd"):
      return "Today's date";
    default:
      return "Unknown input";
  }
}

It allows type matching:

public static void Main()
{
  var animals = new List<Animal>()
  {
    new Dog() {Name = "Rover"},
    new Cat() {Name = "Grumplestiltskin"},
    new Lizard(),
    new MoleRat()
  };
  foreach (var animal in animals)
    PrintAnimalName(animal);
}

public static void PrintAnimalName(Animal animal)
{
  switch (animal)
  {
    case Dog dog:
      Console.WriteLine($"{dog.Name} is a {dog.GetType().Name}");
      break;
    case Cat cat:
      Console.WriteLine($"{cat.Name} is a {cat.GetType().Name}");
      break;
    case Lizard _:
      Console.WriteLine("Lizards have no name");
      break;
    default:
      Console.WriteLine($"{animal.GetType().Name} is mystery meat");
      break;
  }
}

Other new features include generalized async return types, numeric literal syntax improvements and more expression-bodied members.
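Two of those are quick to show (snippets of my own): digit separators and binary literals, plus the expanded set of expression-bodied members.

public class Thermostat
{
  // numeric literal improvements: binary literals and digit separators
  private const int SecondsPerDay = 86_400;
  private const byte StatusMask = 0b0000_1111;

  private double _target;

  // C#7 extends expression bodies to constructors and accessors, among others
  public Thermostat(double target) => _target = target;

  public double Target
  {
    get => _target;
    set => _target = value;
  }
}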

Updates in C# 7.1 and 7.2

The full changes are here (7.1) and here (7.2). Whilst the changes are perhaps not mind-blowing, there are some nice ones:
  • Async Main() method now supported
  • Default literals, eg: Func<int> foo = default;
  • Inferred tuple element names so you can construct tuples like you would with anonymous objects
  • Reference semantics with value types allow specifying modifiers (a small sketch follows after this list):
    • in which specifies the argument is passed in by reference but may not be modified by the called code
    • ref readonly which specifies the reverse: that the caller may not modify the result set by the callee
    • readonly struct makes the struct readonly and passed in by ref
    • ref struct which specifies that a struct type accesses managed memory directly and must always be allocated on the stack
  • More allowance for _ in numeric literals
  • The private protected modifier, which is like protected internal except that it restricts usage to derived types only
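As promised above, a small sketch of the readonly struct / in combination (my own example):

using System;

// a readonly struct passed with the `in` modifier: the callee gets it by
// reference but cannot modify it, and the compiler can avoid defensive copies
public readonly struct Point
{
  public double X { get; }
  public double Y { get; }

  public Point(double x, double y)
  {
    X = x;
    Y = y;
  }
}

public static class Geometry
{
  public static double Length(in Point p)
    => Math.Sqrt(p.X * p.X + p.Y * p.Y);
}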

Wrapping it up

I guess the point of this illustrated journey is that you really should keep up to date with language features as they emerge:
  • Many features save you time, replacing tedious code with shorter, yet more expressive code
  • Some features provide a level of safety (eg string interpolations)
  • Some features give you more power
So if you're writing C# like it's still .net 2.0, please take a moment to evolve as the language already has.

Friday 24 February 2017

Polymer: an approach for maximising testability

Recently I've been working with the Polymer framework at a client. Whilst I'm not by any means ready to evangelise for the framework, I can see some reasons why people might use and adopt it.

This post is not about why you should though. It's really just to share my learnings and process in the hopes that something in there might be useful.

Technologies which I've ended up using are:
  • Polymer
  • Typescript
  • Gulp
This is not because any of them are definitely the best. Polymer is what the client is already using. Typescript has some advantages which I'll outline below. I tried first with Webpack instead of Gulp for build but didn't have much joy with getting Vulcanize to work with that. I tried two plugins -- neither got me all the way there so I gave up. I'll try again another time, particularly because Webpack would allow me to easily use proper Typescript imports instead of the <reference /> method and partially because of the great dev server and on-the-fly transpilation of just the stuff that changed during dev. 
What we have still works, so it's Good Enough™ and we can start solving the actual problem we were trying to solve with code, now that we have a working strategy for build/test/distribute.

1. Polymer

What is it?
Polymer is Yet Another Javascript Framework for the front-end, offering a way to create re-usable components for your website ("webcomponents"). Proponents of Polymer will tell you that it's "the next standard" or that "this is how the web will be in the foreseeable future". You'll hear about how styles can't bleed out of one component into another or how simple it is to write self-contained components. While there's a lot of merit in those last two, the first part hinges on all browsers supporting the proposed feature of HTML imports, which work with varying degrees of success across different browsers and some vendors are even expressing doubt about the proposal (http://caniuse.com/#search=html%20imports).
Whether your browser supports it or not, you can gloss over that with the webcomponentsjs polyfill, which you can obviously get via your package manager of choice (though typically, that's bower). This polyfill also helps with browsers which don't do shadow DOM, another feature you'll be needing to Polymerize the world.

Why use it?
Just like with any of the other frameworks (Angular, CanJs, Vue, Knockout, Ember, etc), the point is to make it possible to make great front-ends with cleaner, better code. All of these frameworks offer pros and cons -- I'm not an evangelist for any of them, though I'll slip in a word here that I find Angular 1.x the most convenient -- and it's still being updated (1.6.1 at last look) -- though I'll not go into all of that now as it's a bit of a diversion.

My 2c:
So far, I don't hate it but I'm not enthralled either. Polymer tries to treat everything as a DOM element and the analogy falls a little flat for things that should have been the equivalent of Angular services. Don't get me wrong -- in the Angular world, you should be making directives (or in 2.x, components) -- it's the Right Thing To Do. But often you have little bits of logic which should be neatly boxed in their own space, and you would often make them available to consumers through some kind of reference/import (harder to refactor against) or some kind of dependency injection (as you might with Angular 1.x services; Angular 2 has a similar concept). But I still find that Angular 2 code requires relative-path imports, and that, combined with there being no (Java|Type)script equivalent of the power wielded by Visual Studio + Resharper for C#, means that re-organising code is more effort than it really needs to be. However, I need to revisit -- perhaps there is a way to bend Angular2 to my will. But I digress.

It does seem as if unit testing is a secondary concern in the Polymer world -- and that makes me less excited about it. Officially, you're recommended to use Web Component Tester for testing, but it's based on Selenium which brings some drawbacks to the party: it's slow to start (making the TDD cycle tedious), it's noisy in the console (making grokking output a mission) and the only way people have managed to get it to work in CI is through trickery that can't be achieved on all platforms (like xvfb, or headless chromium, which uses a similar trick) -- and I need this to work on the client's TFS build servers. So a reasonable amount of effort went into getting testing working in Chrome (for developer machines, because debugging is easier) and PhantomJS (for the build server, because it's properly headless).

2. Typescript

What is it?
Typescript is a language with a transpiler which produces Javascript as a build artefact. That Javascript is what runs in the browser. Typescript is heralded as the panacea to so many of the problems of the web (and Node) worlds for its typing system, allegedly making your code type-safe and saving you from the perils of a weakly typed (or untyped, depending on your perspective) language.

That argument is complete bull muffins.

Simply because I can do this:

interface SomeFancyClass {
  name: string;
}
// compiles just fine; name is just undefined -- like any other property
//   you might attempt to read.
const notFancyAtAllButPretends = {} as SomeFancyClass;

There are also a few oddities along the way, like when you define an interface with one optional property and one not-optional -- suddenly you have to define all properties which aren't optional, so the code above may not work (this was in TS 1.5, so I'm assuming it's still the case). And other bits.

But I'm not here to hate on Typescript. Initially, I didn't see the point of it, but the biggest wins you'll get out of Typescript (imo) are:
  1. Being able to tell what a function needs by looking at the declaration instead of guessing from parameter names (or worse: having to read all of the logic inside that and every subsequent function where the arguments are passed on or partially-passed on).
  2. Helping your dumb editor to be less dumb: intellisense in Javascript is a mess. Some editors / IDE's get it right a lot of the time, some get it right some of the time, some just don't bother at all and some (I'm looking at you, Visual Studio) basically suggest every keyword they've ever seen and then crash. With Typescript, the load of figuring out intellisense is offloaded onto the developer, which seems like a bit of a fail at the outset, but the time you spend defining interfaces will be paid back in dev time later. Promise.
  3. vNext Javascript features with the easiest setup: yes, you can go .jsx and Babel all the way, but I found Babel (initially) to be more effort to get working. I now use it as part of my build chain (more on that later), but Typescript is still at the front, so I can get the goodness above. Features I want include async/await (done properly, not to be confused with the .net abortion), generators, more features on built-in types, classes (which confuse the little ones less than prototypes :/ ) and other stuff. Again, I could get most of this from Babel (probably all, if I get hungry with presets), but Typescript has client acceptance and street cred. So, you know, whatever works.
Why use it?
Because of all the great points above. If you transpile to es5, you have the simplest setup to get newer features in your code, but you won't be able to use async/await. If you transpile to es6, you can have all the features, but dumb browsers will stumble. This is where I bring in Babel to do what it does well and transpile down to es5 with all the required shims so that everyone can have Promises and other goodies.
In addition, you will need to end up with es5 to satisfy the Polymer build tool, Vulcanize. Which is why my example repo transpiles eventually down to es5.
Basically, Typescript allows you to leverage new language features to write more succinct, easier-to-maintain (imo) code.

My thoughts:
I like Javascript, I really do. I hate stupid browsers and browsers which are "current-gen" but still don't support simple features like Promises. I also hate stupid IDE's and I absolutely loathe time wasted whilst you restart an editor that crashes often (so I don't use VS for js or ts, sorry) or whilst you're trying to (re-)load all code in the domain so you can call some of it. Typescript helps me to be more productive, so, after initially being quite open about not seeing the point, play-testing has made me like it.

3. Gulp

What is it?
There are more than enough Javascript build systems, but the ones people tend to make a noise about are Grunt, Gulp and Webpack.

I came into the game late, so Grunt was already being succeeded by Gulp. Gulp is analogous to Make in that you define tasks with their own logic and dependencies. It's very powerful because there are bajillions of modules built for it and it does practically everything with pipes, so it can be quite elegant. It (like Grunt) doesn't actually have a focused purpose: it's a task system, not actually a build system, but it suits builds very well, with some work.

Webpack is quite a focused build system which also has a development server with the ability to automatically rebuild and reload when changes are made, making the design-dev feedback cycle pleasantly tight.
Unfortunately, I haven't (as yet) managed to get Webpack to play nicely with Vulcanize (the tool used to compress and optimise Polymer components). I read that it's possible, so it's quite likely that my Webpack-fu is simply lacking. At any rate, the client is already using Gulp, so it's accepted there and easier for them to maintain. So Gulp it is.

Why use it?
You need some process to perform the build chain:
  • Transpile (and hopefully lint) Typescript
  • Run tests
  • Pack / optimize for distribution / release
You could use batch files for all I care, but having a tasking system allows you to define small tasks that are part of the whole and then get all of them to run in the correct order. Gulp is well supported and has more plugins than you can shake a stick at. There's a lot of documentation, blog material and StackOverflow questions/answers, so if you need a primer or hit a problem, finding information is easy.
On the flip side, it is just a tasking system, so you're going to write more code than with Webpack -- but you'll also have total control.

My 2c:
I've used Gulp a reasonable amount before. I even have a free, open-source collection of gulp tasks for the common tasks involved in building and testing .net projects (and projects with karma-based Javascript tests). It works well and I'm not afraid of the extra time to set up. If you break your tasks up into individual files and use the require-dir npm module to source them into your master gulpfile.js, you can get a lot of re-usability and easy-to-manage code.

Bootstrapping a Polymer project

Example code for the below can be found at: https://github.com/fluffynuts/polymer-ts-scratch, which is free to use and clone, if you find it useful.

Since testing is of primary importance, I need to know that I have the tooling available to write and run tests both on my dev machine and at the build server. Tests need to run reasonably quickly so that your TDD cadence isn't tedious. Web Component Tester fails both of these, so we're looking at using Karma to run the tests; we're going to need a DOM, so the pure Jasmine or Mocha runners on their own won't help.

The testing strategy is to have all Polymer components properly registered and actually create them in the DOM, after which they can have methods and properties stubbed/spied for testing and we can test as we would any Javascript logic.

Now running Jasmine tests via Karma isn't all that novel. But the existing team had been unable to run their tests at the build server because they couldn't get Polymer tests to run in PhantomJS, but that turned out to just be a timing issue: whilst Chrome has Polymer bootstrapped in time before the elements are loaded, Phantom appears to be doing some stuff either out-of-order or just plain asynchronously (more likely), so a first pass at testing Polymer components in PhantomJS will yield negative results: essentially your Polymer components aren't created -- you just get arbitrary "unknown" elements in the page -- and it becomes obvious when the {element}.$ (Polymer's way of getting Polymer stuff attached to your element) is undefined.

However, timing issues with test bootstrapping are something I've seen with Karma before. I remembered futzing about with window.__karma__.start, and the magick comes in from test-setup.js, where I hijack Karma's start function and kick it off manually after webcomponents-lite.js has run in all of its logic and emitted the WebComponentsReady event.

So now the tests are running in PhantomJS. Great.

But I also notice that Polymer does a lot in the DOM via the element's template. Since the template is rendered to the final result once the element has been registered, testing that the template has been set up correctly can be tricky. You could test behavior (so trigger an {enter} keypress on your search entry and check that your search handler was called) -- and that's not wrong, but perhaps a step higher than I want to be initially (but a test which should come eventually).
I'd like to break this down into 3 tests:
  1. Does the template have the on-search attribute defined and set to bind to my handler?
  2. Does my Polymer element actually implement a method with that name (ie, does the handler exist)?
  3. Can I trigger the handler from the expected user behavior in the element (ie, pressing {enter} in the search box)
I want to do this to make failures more obvious: if we change the template, two tests fail: the behavioral one and the one testing that our template is correctly defined -- so we know where the problem is. Likewise if we change the Polymer element code (ie, rename or remove the function), we get two failures which point us to fixing the script code, not the template.

The problem, as stated above, is that you can't easily see the template code once the element is registered: creating an instance of the element wipes out template code in favour of final, rendered code.

I've used jsdom before and it's really cool: an implementation of a browser DOM which can be used from within NodeJS, for when you want to do a lot of the work that a browser does without actually invoking a browser.
Being a Node module, it's all split out neatly into different functional scripts, require()'d in as necessary. This won't play nicely in the browser, but we can use browserify to deal with that (indeed, there's a great discussion about jsdom and browserify here, including a bit of a tongue-in-cheek discussion about not having a DOM in the browser...).

browserify -r jsdom -s jsdom -o jsdom.js
This works fine in Chrome because browserify only bundles up the code -- it doesn't do any ES transforms on it and the jsdom code contains keywords like const.
The first time I ran this in PhantomJS, it barfed and failed miserably -- but that's OK: we already have Babel in the project (see stuff about Typescript and es5/es6 above), so we can transform to es5 and include that and PhantomJS is happy.

The above has been captured in the npm script "jsdom" in the example repo.

So now the strategy for testing templates is:
  1. Use jQuery to get the template as raw text from the Karma server with $.get()
  2. Parse with jsdom
  3. Keep the useful template artifact in a variable that the tests can get to. I normally prefer not to store stuff like this over the lifetime of a test suite, but I'd rather not go through (1) and (2) too often.
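In code, those three steps look something like this -- a sketch: the URL, the dom-module id and the exact jsdom call are assumptions (older jsdom exposes jsdom.jsdom(), jsdom 10+ uses new JSDOM()), and Karma serves configured files under the /base prefix:

let template: Element; // (3) kept for the lifetime of the suite

beforeAll((done) => {
  // (1) fetch the raw element source from the Karma server
  $.get('/base/src/elements/sts-consumer/sts-consumer.html', (raw: string) => {
    // (2) parse it with jsdom -- nothing gets registered or rendered here
    const doc = jsdom.jsdom(raw); // or: new jsdom.JSDOM(raw).window.document
    // keep the interesting artifact around for the template tests
    template = doc.querySelector('dom-module#sts-consumer template');
    done();
  });
});
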
Karma configuration is required to load the following first:
  • jQuery
  • babel-polyfill

This is because I found that if they weren't loaded first, I'd get errors about something trying to extend an object which wasn't supposed to be extended. We need babel-polyfill to support the es5'd jsdom we made earlier. We also get Karma to serve up the jsdom.js we created above (by serving everything under src/specs/lib), ensuring that it's embedded in the Karma test page so we can use the global jsdom declared in our test-utils/interfaces.ts. To reiterate: load order is important.
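The relevant bit of karma.conf.js ends up looking roughly like this -- the paths are illustrative (they're not copied from the repo); the point is purely the ordering:

// karma.conf.js (excerpt): load order matters
files: [
  'node_modules/jquery/dist/jquery.js',           // jQuery first
  'node_modules/babel-polyfill/dist/polyfill.js', // then the polyfills for the es5'd jsdom
  'src/specs/lib/**/*.js',                        // the browserified + babel'd jsdom bundle
  'src/specs/**/*.spec.js'                        // then the specs themselves
],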

The highlights from here are:
  • src/specs/test-setup.ts shows how we can hijack the karma start function and call it when we're good and ready.
  • src/specs/sts-entry.ts shows some rudimentary testing of a Polymer element as loaded into the DOM, as it would be in live code
  • src/specs/sts-consumer.ts shows some rudimentary DOM testing of the template for sts-consumer (which is a contrived example: it simply wraps an sts-entry in a div)
Another goodie I found during this process is gulp-help, which makes your gulpfile even more discoverable for other team members with very little work. Not only can you annotate simple help for tasks, you can also have the help omit a task by providing false for the "help" argument. That way you end up with succinct help for your most interesting (typically top-level) tasks. Win!
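If you haven't seen it before, basic usage looks something like this -- the task names and bodies here are made up:

// gulpfile.js: wrap gulp with gulp-help to get a "help" task for free
var gulp = require('gulp-help')(require('gulp'));

// a task with help text shows up when you run `gulp help`
gulp.task('test', 'Runs the karma tests in PhantomJS', function () {
  // ... run karma here ...
});

// passing false as the "help" hides internal plumbing from the listing
gulp.task('build-jsdom', false, function () {
  // ... browserify + babel the jsdom bundle ...
});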

Wednesday 15 February 2017

Gentoo adventures: overlays are great -- but you can make them even better!

AKA "how I was going to code but got side-tracked with interesting Gentoo stuff instead"


One of the features I enjoy about Gentoo is overlays which are functionally equivalent to Ubuntu's PPA repositories: places where you can get packages which aren't officially maintained by the main channel of the distribution.

Of course, just like with PPAs, you need to pay attention to the source from which you're getting this software -- you are about to install software onto your Linux machine, which requires you to run as root and said software could do nefarious stuff during installation -- let alone after installation, when you're running that software.

So, naturally, just like with any software source (in the Windows world, read: ALL software, because basically none of it is vetted by people with your interests in mind), you need to check that (a) you're happy with the advertised functionality of said software and (b) you trust the source enough to believe that the software upholds the contract to provide that functionality -- and only that functionality -- rather than trashing your system, leaking your passwords, killing your hamster or drinking all your beer... But I digress.

Note that in the discussion below, I'll use the term "package" because that's logically what I'm used to; however the more correct term in Gentoo land is "atom" -- and I'll use that sometimes too (: I'm slowly evolving (:

Anyway, warnings about nasty coders aside, there's another concern: package clashing. Let's say, for example, you're looking for a source for your favourite editor (it may even be VSCode, which is a reasonable editor -- though if I really had to pick a favourite, it would be (g)vim -- but VSCode is high up on the list). So let's say you were looking for VSCode on Linux (which is possible) and you happened to find an overlay providing it. That overlay may also provide other packages with the same category/name as packages in the main tree. Or it may conflict with another overlay for a similar reason. So a good idea is to start by masking the entire overlay, adding the following to a file in your /etc/portage/package.mask (I have an overlays file in there, and add one line per overlay):

*/*::{overlay name}

where {overlay name} is obviously replaced by the name of the new overlay you added with layman (reminder: you add an overlay with layman -f -a {overlay name})

Next, you unmask the package you actually want from that overlay, in /etc/portage/package.unmask (again, I have an overlays file in there for this purpose):

### {overlay name}
{category}/{atom}

or, a concrete example, using visual studio code:

### jorgicio
app-editors/visual-studio-code

Now you can do an emerge -a {atom} and see if you have other requirements to meet (licenses or keywords for example: many overlays will require the testing keyword for your architecture, which, for me is: ~amd64)

The advantage of the above is that you have far less chance of overlays duking it out over who provides the packages you want, since a lot of overlay maintainers maintain many more than just one package in their overlays. You can still search for packages which exist in those overlays with emerge --search {atom} and unmask them as required. Also, don't forget you can find packages (and their overlays) on the great Zugaina site!

In addition to the above, I discovered today that, aligned with Gentoo's philosophy of requiring the user to actively select installations (instead of just foisting them on the user), an overlay added with layman will not auto-update by default. I had expected that emerge --sync would also sync added overlays -- by default, it won't. You can manually sync overlays with layman's --sync command (passing an overlay name or the magic ALL string to sync all of your added overlays), but I'd prefer to have this as part of my regular sync, analogous to apt-get update being applicable to all sources.
Indeed, I only figured this out after installing visual-studio-code and wondering why the editor kept reminding me to update when I couldn't see an update in the overlay I was referencing -- it was all because the "index" for that overlay wasn't being synced... So the next issue is auto-updating.

If, like me, you'd like to auto-update, a quick look at emerge's rather extensive man page, under the --sync section, shows the following:

--sync Updates  repositories,  for which auto-sync, sync-type and sync-uri attributes are set in repos.conf

Interesting.

It turns out that repos.conf, in my case at least (and I believe it should be for any modern Gentoo system), is a directory (/etc/portage/repos.conf) with two files -- the all-important gentoo.conf for the main source and layman.conf, which contains entries for each added overlay. And in layman.conf, we see sections like:

[steam-overlay]
priority = 50
location = /var/lib/layman/steam-overlay
layman-type = git
auto-sync = No


The priority is also interesting -- because you can use this to decide, when there is a package conflict between sources providing the same atom, where that atom should come from. But it's not the topic of discussion today. Today, I'm more interested in that last line. Changing:

auto-sync = No

to

auto-sync = Yes

for each overlay yielded the results I wanted: when I emerge --sync, my overlays are updated too and I can get updates from them. Of course, the emerge --sync command takes a little longer to run -- but I don't mind: I can get shiny new stuff! Combined with the postsync.d trick outlined here to update caches and local indexes, you can also have fast emerge searching. FTW.

And now you can too (:

Parting tidbit: if perhaps you wanted to apply the masking/unmasking strategy from above to existing overlays, you can use the following command to list installed atoms from an overlay:

equery has repository {overlay}

then apply the */*::{overlay} mask from earlier and unmask the installed packages.
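For example, to retrofit this onto the jorgicio overlay from earlier (using only the commands and file layout already covered above):

# see what's already installed from the overlay
equery has repository jorgicio

# /etc/portage/package.mask/overlays
*/*::jorgicio

# /etc/portage/package.unmask/overlays
### jorgicio
app-editors/visual-studio-code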

Friday 13 January 2017

Gentoo adventures: running a script after a package is updated / installed (emerge hook)

So here's a neat little thing that I learned tonight: how to run a script after a specific package is installed or updated.

Why?

Because I use the net-fs/cifs-utils package to provide mount.cifs and I'd like scripts run by an unprivileged user to be able to mount and dismount entries in /etc/fstab. I'd also like user-space tools like the Dolphin file manager to be able to mount these points.

At some point there was an suid USE flag for net-fs/cifs-utils, but it was decided that the flag should be removed due to security concerns. That's probably a good default, but it's a little annoying when my script to sync media to my media player fails because it can't (re-)mount the samba share after net-fs/cifs-utils has been updated.

I had a hunch that there would be a way to hook into the emerge process -- and Gentoo (or rather, Portage) didn't let me down. After finding https://dev.gentoo.org/~zmedico/portage/doc/portage.html#config-bashrc-ebuild-phase-hooks (which alluded to creating bash functions in a special file) and https://wiki.gentoo.org/wiki/Handbook:X86/Portage/Advanced#Using_.2Fetc.2Fportage.2Fenv (which hinted at where to do such magic), and after a little play-testing, I ended up adding the following function to the (new) file /etc/portage/bashrc:


function post_pkg_postinst() {
  if test "$CATEGORY/$PN" = "net-fs/cifs-utils"; then
    TARGET=/sbin/mount.cifs
    echo -e "\n\n\e[01;31m >>> Post-install hook: setting $TARGET SUID <<<\e[00m\n\n"
    chmod +s $TARGET
  fi
}


Which automatically sets the SUID bit and gives me a giant red warning that it did so.

https://wiki.gentoo.org/wiki/Handbook:X86/Portage/Advanced#Using_.2Fetc.2Fportage.2Fenv seemed to suggest that I could create the hook in /etc/portage/env/net-fs/cifs-utils/bashrc, but that didn't work out for me (EDIT: putting stuff under there is a very bad idea unless you're only setting environment variables -- I initially put my hooks under there and found that subsequent emerges installed under /etc/portage/env !!). The approach above did work and was (reasonably) simple to implement, after a few install attempts and dumping the environment that the script runs in with the env command.

Refactoring time: I'd like to be able to write individual scripts per package (if I ever do this again), like I do with USE flags. There, I store all flags pertinent to the installation of a higher-order package in /etc/portage/package.use/higher-order-package, so I know where the lower-level USE flags come from. For example, my /etc/portage/package.use/chromium file contains:

www-client/chromium -cups -hangouts
>=dev-libs/libxml2-2.9.4 icu
>=media-libs/libvpx-1.5.0 svc cpu_flags_x86_sse4_1 postproc


Wherein I can always reminisce about how I had to set the icu flag on libxml2 to successfully install chromium (https://wiki.gentoo.org/wiki/Qt/FAQ#qtwebkit_vs_chromium_block_caused_by_icu)

Similarly, it would be nice if I could just create code snippets per ${CATEGORY}/${PN}, perhaps even have category-wide scripts (which an individual package could override)... hm. OK, so first, we move the script above to /etc/portage/hooks/net-fs/cifs-utils/post-install (note: under a dedicated hooks directory, not /etc/portage/env -- see the EDIT above). Then we create a new /etc/portage/bashrc with these contents:

PORTAGE_HOOKS_PREFIX=/etc/portage/hooks
PACKAGE_HOOKS_DIR="$PORTAGE_HOOKS_PREFIX/$CATEGORY/$PN"
CATEGORY_HOOKS_DIR="$PORTAGE_HOOKS_PREFIX/$CATEGORY"
function source_all() {
  if test -d "$1"; then
    for f in $1/*; do
      if test -f "$f"; then
        source "$f"
      fi
    done
  fi
}
source_all "$CATEGORY_HOOKS_DIR"
source_all "$PACKAGE_HOOKS_DIR"


Which, as suggested above, loads all scripts from the category folder (if it exists) then overlays with scripts from the package folder (if that exists). So now it's trivial to add hooks for any of the emerge phases outlined in https://dev.gentoo.org/~zmedico/portage/doc/portage.html#config-bashrc-ebuild-phase-hooks

Side-note: since the scripts above can change any environment variable involved in the build, it's quite important to use nice, long, specific variable names. One of my first attempts used PREFIX instead of PORTAGE_HOOKS_PREFIX and that just ended up installing all new packages under the /etc/portage/hooks directory because, of course, $PREFIX is an environment variable respected by most build scripts... *facepalm*.

Once again, this is a testament to how Gentoo provides an operating system which is totally yours: you have control over all of it, and its state is a result of your actions and choices. I wish I'd switched years ago.

Monday 9 January 2017

EF-based testing, with PeanutButter: Shared databases

The PeanutButter.TestUtils.Entity Nuget package provides a few utilities for testing EntityFramework-based code, backed by TempDb instances so you can test that your EF code works as in production instead of relying on (in my experience) flaky substitutions.
One is the EntityPersistenceTester, which provides a fluent syntax around proving that your data can flow into and out of a database. I'm not about to discuss that in-depth here, but it does allow (neat, imo) code like the following to test POCO persistence:

// snarfed from EmailSpooler tests
[Test]
public void EmailAttachment_ShouldBeAbleToPersistAndRecall()
{
    EntityPersistenceTester.CreateFor<EmailAttachment>()
        .WithContext<EmailContext>()
        .WithDbMigrator(MigratorFactory)
        .WithSharedDatabase(_sharedTempDb)
        .WithAllowedDateTimePropertyDelta(_oneSecond)
        .ShouldPersistAndRecall();
}

which proves that an EmailAttachment POCO can be put into, and successfully retrieved from, a database, allowing DateTime properties to drift by a second. All very interesting, and Soon To Be Documented™, but not the focus of this entry.

I'd like to introduce a new feature, but to do so, I have to introduce where it can be used. A base class, TestFixtureWithTempDb<T>, exists within PeanutButter.TestUtils.Entity. It provides some of the scaffolding required for more complex testing than just "Can I put a POCO in there?". The generic argument is some implementation of DbContext, and it's most useful when testing a repository, as it provides a protected GetContext() method which hands back a spun-up context of type T with an underlying temporary database. By default this is a new, clean database for every test, but you can invoke the protected DisableDatabaseRegeneration() method in your test fixture's [OneTimeSetUp]-decorated method (or constructor, if you prefer) to make the database live for the lifetime of your test fixture. The base class takes care of disposing of the temporary database when appropriate, so you can focus on the interesting stuff: getting your tests (and then code) to work. A full (but simple) example of usage can be found here: https://github.com/fluffynuts/PeanutButter/blob/master/source/Utils/PeanutButter.FluentMigrator.Tests/TestDbMigrationsRunner.cs
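As a rough sketch of the shape of such a fixture -- PersonContext, Person and the test body are made up for illustration, and exact signatures may differ; only TestFixtureWithTempDb<T>, GetContext() and DisableDatabaseRegeneration() come from the base class, so see the linked test file for real usage:

[TestFixture]
public class TestPersonRepository
    : TestFixtureWithTempDb<PersonContext> // PersonContext : DbContext -- hypothetical
{
    [OneTimeSetUp]
    public void OneTimeSetup()
    {
        // keep one temp database for the whole fixture instead of one per test
        DisableDatabaseRegeneration();
    }

    [Test]
    public void Add_ShouldPersistPerson()
    {
        // GetContext() hands back a context over the underlying temp database
        using (var ctx = GetContext())
        {
            ctx.People.Add(new Person { Name = "Jane" });
            ctx.SaveChanges();
        }
        // a fresh context should see the persisted row
        using (var ctx = GetContext())
        {
            Assert.IsTrue(ctx.People.Any(p => p.Name == "Jane"));
        }
    }
}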

My focus today is on a feature which can help eliminate a pain-point I (and others) have experienced with EF testing backed by a TempDb instance: time to run tests. EF takes a second or two to generate internal metadata about a database the first time any activity (read/write) happens against that database via an EF DbContext. That's not too bad on application startup, but it's quite annoying if it happens in every test. DisableDatabaseRegeneration() helps, but still means that each test fixture has a spin-up delay -- and once there are a few test fixtures, other developers on your team become less likely to run the entire test suite, which is bad for everyone.

However, after some nudging in the right direction by co-worker Mark Whitfeld, I'd like to announce the availability of the UseSharedTempDb attribute in PeanutButter.TestUtils.Entity as of version 1.2.120, released today.
To use, decorate your test fixture:

[TestFixture]
[UseSharedTempDb("SomeSharedTempDbIdentifier")]
public class TestCrossFixtureTempDbLifetimeWithInheritence_Part1
    : TestFixtureWithTempDb
{
    // .. actual tests go here ..
}


And run your tests. And see an exception:

PeanutButter.TestUtils.Entity.SharedTempDbFeatureRequiresAssemblyAttributeException : 
The UseSharedTempDb class attribute on TestSomeStuff 
requires that assembly SomeProject.Tests have the attribute AllowSharedTempDbInstances.

Try adding the following to the top of a class file:
[assembly: PeanutButter.TestUtils.Entity.Attributes.AllowSharedTempDbInstances]


So, follow the instructions: add the assembly attribute to the top of your test fixture source file and re-run your tests. Congratulations -- you're now using a shared TempDb instance which will be cleaned up when NUnit is finished, provided, of course, that you don't interrupt the test run yourself (:
