2013-05-24

Ten Coding Philosophies

These are some of my coding philosophies, in no particular order. Some of them are likely to be controversial, but I’m not suggesting that you adopt them or even that they would be useful for you.

This post is really for me.

Comment your code

Comments on public methods and properties should say what the method does. A developer who wants to use a method shouldn’t be required to read the code to figure out if it does what’s needed.

Inline comments should say why it was implemented this way. A developer reading the code is probably trying to change it. It is difficult to change something without first understanding the reasoning behind its current state.

Code should be self-documenting. Choose appropriate names for everything and divide it up so it’s readable. However, self-documenting code isn’t an excuse to omit comments. Code tells you how, not why, no matter how descriptive you make your variable names.
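
As a rough illustration of the split between "what" and "why", here is a small Go sketch. The Slugify function is invented for this example and isn't taken from any real codebase. The comment above the function tells a caller what it does; the inline comment explains why the loop is written the way it is.

```go
package main

import (
	"fmt"
	"strings"
)

// Slugify converts a post title into a lowercase, hyphen-separated string
// that is safe to use in a URL. Runs of characters other than ASCII letters
// and digits collapse into a single hyphen, and the result never starts or
// ends with a hyphen.
func Slugify(title string) string {
	var b strings.Builder
	pendingHyphen := false

	for _, r := range strings.ToLower(title) {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9')
		if !isAlnum {
			// Defer the hyphen rather than writing it immediately: a title
			// that ends in punctuation would otherwise leave a trailing
			// hyphen that we'd have to trim afterwards.
			pendingHyphen = b.Len() > 0
			continue
		}
		if pendingHyphen {
			b.WriteByte('-')
			pendingHyphen = false
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(Slugify("Ten Coding Philosophies!"))
}
```

Someone calling Slugify only needs the first comment. Someone changing it needs the second.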

I’m not a fan of automated comment creation tools like GhostDoc. If a method signature is well-crafted enough that a tool like GhostDoc can generate an accurate comment, the comment must be obvious. What benefit does an obvious comment provide? Why include obvious comments in your code? Jeff Atwood calls this “undocumentation”.

Process is not a substitute for competent programmers

Joel Spolsky refers to this in his essay about the McDonald’s Methodology. You can hire someone with no experience, give him a 10-step guide to making a burger, and he’ll churn out endless amounts of near-food with very little effort. However, that same worker will never produce a 5-star meal, no matter how complex the instruction book.

In the world of software, the Process Guys believe that they can create better software not by hiring better programmers, but by carefully documenting, describing, categorising and monitoring every step of the development process.

Do they always produce 5-star software? Nope.
Do they still create software burgers? Yep.

Standup meetings, backlogs, retrospectives and all of the other pomp and ceremony in the Methodology du jour won’t enable a mediocre developer to produce a quality piece of software.

Quality isn’t an afterthought

This is one of my favourite Steve Jobs quotes:

A great carpenter isn’t going to use lousy wood for the back of a cabinet, even though nobody’s going to see it.

If the UI is polished and beautiful but the code is badly designed, unstable, spaghettified crap, it’s not a quality product. It’s a shiny turd.

Quality should pervade the product. The entire system should be well designed, even if no-one but the developers will ever see the code or appreciate the tidy architecture. Why? Because software maintenance represents a significant cost, and the only way to reduce that is by creating high-quality software in the first place.

Remain detached from the tools you use

Here’s a bug report for TeamCity:

Support for TFS branches
like for Git and Mercurial in version 7.1, but for TFS

Mercurial and Git both support lightweight branches, and TeamCity can now automatically build those branches without admins needing to create new project definitions. TFS does not support lightweight branching.

The bug report is like asking Ford to retrofit air conditioning and alloy wheels to a horse and carriage. If you need this feature, upgrade to a modern source control system. Microsoft are trying to make this easier by adding Git support to Visual Studio.

Abandon tools that no longer serve their purpose or that have been superseded by something better.

Learning a new programming language will make you a better programmer

Polyglots seem to be the most capable programmers. Do programmers that learn multiple languages become more highly skilled as a result? Or do highly skilled programmers inevitably learn multiple languages?

In either case, learning more languages can only be a good thing. I’d expect a well-rounded programmer to know at least one systems programming language, a scripting language and a language for web development (client or server side; preferably both). I should probably learn a functional language.

Would you hire a carpenter whose only tool was a hammer?

Choose the best tool for the job, even if that means learning a new tool

If you are handed a new platform to start developing for and your first instinct is to try and find a C# compiler for it, because C# is what you use now, you are doing it wrong. A new platform is an opportunity for you to learn something new: a new language, a new platform, a new IDE; you can learn new patterns, new approaches, and come out of the experience a better programmer.

I don’t know who originally said this, but it’s appropriate:

Some people work for 5 years and gain 5 years of experience. Some people work for 5 years and gain 1 year of experience 5 times.

Reaching for the same old toolset isn’t going to teach you anything new. Don’t complacently squander a rare opportunity for advancement.

This is particularly relevant in the era of mobile development. Do you learn to code for Android and iOS and create great apps for both, or do you go the “web app” route – as Facebook and LinkedIn tried to do – and give everyone the same second-rate experience?

Learn to use your tools as they were designed to be used

There’s nothing so irritating as opening up a codebase written in one language and seeing nothing but the conventions and constructs of another. If you are writing JavaScript, learn how to write JavaScript. If you are writing C#, learn how to write C#. Don’t pretend that the two languages are equivalent and write C# everywhere.

In a similar vein, don’t try to use tools for purposes for which they are not suited. You can bang a screw in with a hammer, and you can use SharePoint as a CMS. Neither was designed for the purpose and there are far more appropriate tools out there.

“That’s how we’ve always done it” is never an acceptable answer

What this answer really means is:

  • We’re lazy
  • We’re afraid
  • We don’t understand
  • We don’t care

Your API should be beautiful

Your API is the UI that other developers will see. Don’t produce garbage like libxml’s htmlEncodeEntities function, which for optimal usage requires you to precognitively know its output before you call it.

Create the best solution you can using the information you currently have

The Agile enthusiasts I’ve known have a tendency to believe that they can implement hacky, half-baked solutions for everything because, y’know, iterative. If it’s important it’ll get fixed later.

Iterative. Iterative.

The process they end up following is what Spolsky calls the “infinite defects methodology”:

The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote “return 12;” and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs. In the post-mortem, this was referred to as “infinite defects methodology”.

Iterative development is intended to solve two problems:

  • Users who incessantly change their minds about what they want;
  • Architecture astronauts who build vast systems just in case.

Short development cycles mean that users who change their minds have a less detrimental impact on the product. Similarly, tight deadlines mean that the guy who wants to build a towering behemoth of architectural indirection simply doesn’t have time to do so.

Unfortunately, it has a third effect:

  • Developers ignore the majority of information they have about the problem at hand in order to create a solution that meets the acceptance criteria – and nothing more – as quickly as possible.

Even though the developer may know that the solution to the problem at hand will need to be re-used throughout the entire system, he will implement a one-off fix that just addresses the acceptance criteria for the current development iteration.

In the best-case scenario, this minimal solution gets copied-and-pasted throughout the app – perhaps with minor usage-specific tweaks – until it becomes a major problem. At that point all of the work done so far gets thrown away, at considerable cost and effort, and more time is wasted creating a more appropriate solution. In the more likely scenario, the copy-and-paste solution is copied-and-pasted more and more, and the system ends up as a Big Ball of Mud. It’s very difficult to be taken seriously when you say, “Remember all that work we did in the last 4 iterations? We need to throw all of that away and start again even though the requirements haven’t changed.”

Consider this situation. A developer is asked to add a drop-down list of books to his web application. He knows right now that the drop-down list UI widget will be re-used dozens of times in the application. How does he approach the problem? He looks at the acceptance criteria for this iteration, which says “create a drop-down list of books”. That is precisely what he creates: a one-off, non-reusable drop-down list widget that contains a list of books. In the next iteration he has to create a drop-down list of authors, so he copies-and-pastes his book code, replaces the hard-coded book list with an author list, and he’s done. Acceptance criteria met, and the Big Ball of Mud is well on its way.

If the developer instead had looked beyond the acceptance criteria and used all of the knowledge he had available at the time – that the drop-down list UI widget would be re-used throughout the app – he would have made it a re-usable component.


2013-05-22

Fish Shell 2.0

Fish Shell 2.0 was released a few days ago. This is by far my favourite bash replacement, and as a result it’s one of the first things I install on a new machine. I was so determined to use it in Linux that I spent an afternoon building it from source, but it’s now available as a .deb file.

Give it a try!


2013-05-11

Gobbling Dogfood

Welcome to the new Simian Zombie! It looks almost the same as it used to, and it works in pretty much the same way, but the back-end is now completely different.

The old Simian Zombie used WordPress as its blogging engine and was hosted on BlueHost. This combination resulted in terrible response times. Fetching and rendering the site took an average of around 5 seconds (testing suggested a range from 2 to 9 seconds), which is abysmal. Worse than using WordPress’ combination of PHP and MySQL was sharing a database server with a site that was either appallingly written or hugely popular; in either case, it kept taking down the database.

The new Simian Zombie is hosted on a VPS provided by Digital Ocean. It uses Gobble as its blogging engine, which is written in Go, uses the filesystem for permanent storage, and serves posts from an in-memory cache. A full render of the same blog page takes around 2 seconds on average (with a range from 1 to 3 seconds). Most of that time is used up by the JavaScript syntax highlighter; fetching the rest of the page usually takes around half a second. It probably helps that there’s 50% less HTML in the Gobble layout.

It took longer to re-format the content from the old site into correct Markdown than it did to write Gobble. Around half of the posts lost their formatting during the automated HTML-to-Markdown conversion process, so I had to add it all back in. I’ve been reluctantly working on fixing the posts since I announced Gobble, which was two months ago. After that was done, I had to ensure that all of the various downloads and images still worked (I’m sure I’ve probably missed a few links).

I’m hoping that Gobble holds up to being dogfooded. I’m also hoping that I haven’t lost anything during the transition.


2013-04-13

Installing an Email Server on Ubuntu Part 2

My last post on this subject wasn’t overly informative, so here’s some more information about my attempt to install an email server on Linux. Note that this post will be of no use to you if you are trying to do this yourself as I entirely failed to get it working.

First of all, you’ll want to install the email server. There are a few of these out there, but if all you want to do is be able to receive email from a few different domains you’ll probably end up opting for Postfix. Postfix isn’t really a full email server. It can receive email but can’t send it, doesn’t offer any kind of client, doesn’t support POP3 or IMAP access or anything else. To get any of this you’ll need to install mailutils and sendmail and Dovecot and half a dozen other programs. I don’t want any of this.

Some command-line wizardry later and you’ll have Postfix installed. Tweak some config files, and…

It doesn’t work.

You need to change your MX records to point at your server. Tweak the DNS, check that the changes have propagated, and…

It doesn’t work.

Perhaps the firewall is blocking the ports? You need to look at the documentation for iptables. iptables seems to be a hideous mess of a program, so you look for something else and find UFW. UFW is very easy to use. Some simple configuration, and…

It doesn’t work.

I had a server that would send email to itself quite happily, but failed the MX test from mxtoolbox.com. Email from anywhere else failed to reach the server. There’s no obvious way to diagnose this because Postfix doesn’t complain about any configuration errors and doesn’t log anything. The DNS settings appear to be correct. Disabling the firewall doesn’t help. As usual when dealing with anything Linux-related, most of the available documentation is either woefully out of date or specific to a completely different distro. (Imagine a Linux in which all of the effort went into creating a handful of tools that worked properly, instead of a hundred crappy, semi-functional alternatives.)

I have no interest in server maintenance. None. I don’t want to run my own email server. It’s simply the downside of migrating to a VPS so that I can run some Go web apps. I’ve wasted an afternoon on something I actively dislike doing.

At this point I decided to look around for an alternative solution. I got nullmailer set up. It’s supposed to automatically redirect incoming messages to another email server, which would enable me to forward messages straight to my GMail account. Clever! Unfortunately, I couldn’t get it working. SSMTP does the same thing (though it’s no longer maintained) and this one did work, but Google detected its attempt at logging in and blocked it.

Now I need an alternative to my alternative. Is there some way I can point the MX records at a dedicated email service and have someone else handle all of this tedium?

I eventually settled on creating a Google Apps account. It took some setting up – account creation, MX records, domain verification, counter-intuitive web UI – but it was considerably less annoying than running my own server. On the downside, Google Apps accounts used to be free but now cost $5/month. On the upside, all of the email to my woopsi.org and simianzombie.com email addresses now appears in GMail.

In other news, woopsi.org is now hosted on my Digital Ocean VPS. I had to convert it from a PHP website back to static HTML pages, but as it wasn’t using PHP for much more than some simple templating this wasn’t a big problem.


2013-04-05

Installing an Email Server on Ubuntu

Good grief.


2013-03-26

Gobble

A year or so ago I wrote my own blogging engine in C#, and then a while later pointed out some of the worst flaws in the system. The major one for me was the choice of platform:

…using C# means I have to maintain a Windows virtual machine, an installation of VMWare Fusion and a copy of Visual Studio. That’s far too much overhead for something that could be written in a few hundred lines of Go and deployed on any cheap Linux server.

I’ve rectified that with a new blogging engine called Gobble, written in Go. Although it blew away my initial estimate and ended up consisting of 6000 lines of code, CSS and HTML templates (etc), it fulfils my goal of being a blogging engine that:

  • Consists of a single binary;
  • Uses the file system for post storage instead of a database;
  • Supports comments;
  • Uses Akismet and reCAPTCHA for spam prevention;
  • Uses Markdown for post/comment formatting;
  • Can be deployed easily to a cheap Linux server/VPS;
  • Has a search feature;
  • Supports tags.

It has a lot of features that I like in WordPress that didn’t make it into BitBlogger, including search, self-hosted comments, syntax highlighting and more. It even retains the WordPress post ID-based URLs (as a secondary way to link to a post) to ensure that it’s a drop-in replacement as far as external links are concerned. It omits WordPress features that I don’t care about, such as categories, plugins, a fancy admin section and others.

Gobble uses a similar in-memory data storage system to BitBlogger, but this time there’s no enforced shutdown that kills the cache. I’d considered using MongoDB, CouchDB or RethinkDB for data storage, but in the end I realised that the entire textual content for this blog fits in 2MB. A database wouldn’t add anything but unnecessary complexity for managing that tiny amount of data. Posts are stored on disk on the same server as the Gobble executable. I can still use Mercurial and BitBucket to store the posts; I just need to SSH to the VPS in order to pull and update. In future I might add a simple admin section that allows me to edit posts in a web browser and view new comments/spam.

I’ve got a complete copy of this blog running in Gobble on a VPS from Digital Ocean. It works well, but some of the formatting of the posts is a bit screwy (it got lost during translation from HTML to Markdown). Once I fix the formatting I’m intending to switch away from my WordPress installation to Gobble.

My only concern with migrating platforms is the fact that moving the simianzombie.com domain to the VPS means I’ll have to run my own mail server, which sounds thoroughly tedious.

More Thoughts on Go

In some ways I find Go a little awkward to write. Working with pointers is identical to working with values directly (unlike in C in which you have to manually dereference pointers) so it’s frequently hard to tell if you’ve got a pointer or a value. Sometimes it’s nice not to care; at other times it’s handy to know if you’re editing a copy of a struct or the original struct (via a pointer), and the language hides that information. Switching back to a more procedural rather than OO paradigm involves some extra thought in deciding how best to structure the app. I’m still of the opinion that Go’s naming conventions are ridiculously anachronistic, and some of the design decisions in the libraries are bloody stupid (sorting and comparison functions are defined on the container rather than the items being sorted and compared, for example).

At the same time, it has some very nice features. Compiling down to a single binary makes Go programs trivial to deploy. Defining structs with associated methods instead of objects makes a lot of sense, and hopefully it should prevent Java-esque program architectures. The concept of implicit interfaces is very reminiscent of Objective-C’s informal protocols and works well. I’ve found that I can write programs in Go extremely quickly, and I’ll get faster as I become more familiar with how to best structure programs and the available libraries.
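
Here is a small sketch of implicit interfaces, using invented types. Neither Post nor Comment declares that it implements Renderer; having a matching Render method is enough.

```go
package main

import (
	"fmt"
	"strings"
)

// Renderer is satisfied by any type that has a Render() string method.
type Renderer interface {
	Render() string
}

// Post and Comment are hypothetical types. Neither mentions Renderer.
type Post struct{ Title string }

func (p Post) Render() string { return "<h1>" + p.Title + "</h1>" }

type Comment struct{ Author, Body string }

func (c Comment) Render() string { return "<p>" + c.Author + ": " + c.Body + "</p>" }

// renderAll accepts anything that satisfies Renderer.
func renderAll(items ...Renderer) string {
	var b strings.Builder
	for _, item := range items {
		b.WriteString(item.Render())
	}
	return b.String()
}

func main() {
	fmt.Println(renderAll(
		Post{Title: "Gobble"},
		Comment{Author: "Ant", Body: "Nice"},
	))
}
```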


2013-02-21

Encoding HTML Entities with libXML

iOS doesn’t have a standard, easy-to-use way of encoding UTF-8 strings into HTML entities (i.e. from “>” into “&gt;”). One library that can be used to achieve this is libXML, but its API is particularly unpleasant:

int htmlEncodeEntities (unsigned char * out, 
                        int * outlen, 
                        const unsigned char * in, 
                        int * inlen, 
                        int quoteChar)

out: a pointer to an array of bytes to store the result
outlen: the length of @out
in: a pointer to an array of UTF-8 chars
inlen: the length of @in
quoteChar: the quote character to escape (' or ") or zero.

Returns: 0 if success, -2 if the transcoding fails, or -1 otherwise
The value of @inlen after return is the number of octets consumed as
the return value is positive, else unpredictable. The value of
@outlen after return is the number of octets consumed.

What’s wrong with that function signature, you ask? Other than the quality of the documentation, that is? Consider this situation. You need to encode this string:

<p>

That will encode into this string:

&lt;p&gt;

Now consider this braindead catch-22 situation: You must allocate the memory for the “out” parameter before you call the function, but until you call the function you won’t know how much memory you need to allocate.

The raw string in the example above is 4 bytes long; the encoded version is 10 bytes (don’t forget that this is C and all strings are NULL-terminated). If you follow the advice from Stack Overflow and double the initial size you’ll end up with 7 bytes (1 terminator + (3 characters * 2)), which is still too short. Another alternative is to figure out the maximum amount of memory that an encoded string could possibly consume (9 bytes per character) and use that, but then you’ll potentially be wasting massive amounts of memory.

If I were writing the function I’d probably return a pointer to a block of memory allocated inside the function itself. It could re-allocate the memory as needed and users of the function wouldn’t need to guesstimate the buffer size. That would raise my favourite C question, though: Who owns this memory?

I couldn’t find any examples of how to use the htmlEncodeEntities() function, so I came up with my own. This solution uses a loop and encodes the string in chunks. It uses an encoded buffer twice the size of the initial string, but will resize it if it finds that the buffer isn’t large enough for a single encoded character. It’s implemented as a category method on NSString.


#import <libxml/HTMLparser.h>

@implementation NSString (ZOHTMLEncoding)

- (NSString *)stringByEncodingHTMLEntities {

    if (self.length == 0) return self;
    
    NSData *data = [self dataUsingEncoding:NSUTF8StringEncoding];
    
    int remainingBytes = (int)data.length;
    int bufferSize = ((int)data.length * 2) + 1;
    const unsigned char *bytes = (const unsigned char *)[data bytes];
    
    // We have to add an extra byte on the end of the encoded string to enable
    // us to add a terminator character.
    unsigned char *buffer = malloc(bufferSize);
    buffer[bufferSize - 1] = '\0';
    
    NSMutableString *output = [NSMutableString stringWithCapacity:remainingBytes];
    
    do {
        
        int outLen = bufferSize - 1;
        int inLen = remainingBytes;
    
        int result = htmlEncodeEntities(buffer, &outLen, bytes, &inLen, '"');
        
        // libXML doesn't append a terminator to the string - presumably because
        // NSString doesn't include one - so we'll have to take care of that in
        // order to convert back to an NSString.  We only add this if we haven't
        // completely filled the buffer.  If we've filled it, we've already
        // added the terminator character.
        if ((NSUInteger)outLen < bufferSize - 1) {
            buffer[outLen] = '\0';
        }
        
        if (result == 0) {
            
            NSString *string = [NSString stringWithCString:(const char *)buffer encoding:NSUTF8StringEncoding];
            
            [output appendString:string];
            
            remainingBytes -= inLen;
            
            if (remainingBytes > 0 && inLen == 0) {
                
                // Oh no!  We've got characters left to encode but they aren't
                // encoding.  This happens if our buffer isn't big enough, so
                // we'll resize it.
                free(buffer);
                bufferSize = ((bufferSize - 1) * 2) + 1;
                buffer = malloc(bufferSize);
                buffer[bufferSize - 1] = '\0';
            }
        } else {
            
            // Something bad happened
            break;
        }

        bytes += inLen;
        
    } while(remainingBytes > 0);

    free(buffer);
        
    return output;
}

@end


2013-01-10

Hacker News

…would be more enjoyable if it automatically filtered out any comments with the words “sigh” or “yawn” in them.


2013-01-06

DL/ID Parser Library For Go

I’ve published an initial version of my US DL/ID barcode data parser library here:

This is a library for Go that can parse the data extracted from the PDF417 barcode on the back of a US driving licence into a Go struct. It doesn’t handle many of the encoded fields yet but it’s a start. I’ve grumbled about the multiple problems with the spec and its various implementations before.
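
To show the general shape of the problem, here is a heavily simplified sketch. It is not the library's API: the Licence struct and parseElements function are hypothetical, it assumes the input has already been reduced to one data element per line, and it ignores the header and subfile directory that real barcode data carries. The three-letter element IDs are the ones I understand the AAMVA spec to use.

```go
package main

import (
	"fmt"
	"strings"
)

// Licence holds a handful of decoded fields. Hypothetical type for this
// sketch only.
type Licence struct {
	Number      string
	FamilyName  string
	FirstName   string
	DateOfBirth string
}

// parseElements reads one data element per line. Each line starts with a
// three-letter element ID, followed by the value. Unknown IDs are skipped.
func parseElements(data string) Licence {
	var l Licence
	for _, line := range strings.Split(data, "\n") {
		line = strings.TrimSpace(line)
		if len(line) < 3 {
			continue
		}
		id, value := line[:3], line[3:]
		switch id {
		case "DAQ": // licence/ID number
			l.Number = value
		case "DCS": // family name
			l.FamilyName = value
		case "DAC": // first name
			l.FirstName = value
		case "DBB": // date of birth
			l.DateOfBirth = value
		}
	}
	return l
}

func main() {
	// Made-up sample data, not taken from a real licence.
	sample := "DAQ12345678\nDCSSMITH\nDACJOHN\nDBB01311980\nZZZignored"
	fmt.Printf("%+v\n", parseElements(sample))
}
```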


2013-01-05

Uninstall fishfish Shell on OSX

Now that fishfish is available via Homebrew, it’s time to clear out the installed package and let brew take care of it for you. But how do you delete it? Here’s what I’ve been using:

#!/bin/sh

rm -rf /usr/local/share/fish
rm -rf /usr/local/share/doc/fish
rm -rf /usr/local/etc/fish
rm /usr/local/bin/fish*
rm /usr/local/bin/set_color
rm /usr/local/bin/mimedb
rm /usr/local/share/man/man1/set_color.1
rm /usr/local/share/man/man1/mimedb.1
rm /usr/local/share/man/man1/fishd.1
rm /usr/local/share/man/man1/fish_pager.1
rm /usr/local/share/man/man1/fish_indent.1
rm /usr/local/share/man/man1/fish.1

Remember to use chsh -s /bin/bash to switch back to Bash first if you swapped your default shell.
