Temporary workarounds are not so temporary

In a perfect world, development schedules would be based on realistic estimates, with plenty of buffer time factored in for the unexpected contingencies that always arise. But the real world is never ideal. Shit happens. Development cycles end up being too short and inflexible. Even worse, deadlines are often determined by a completely arbitrary release date.

As a result of all this, everyone cuts corners to meet deadlines. From a development perspective, this manifests itself as hastily written code: sprawling thousand-line functions, “magic numbers” scattered everywhere*, far too much inline CSS, hard-coded English strings (this makes localization a pain to deal with later), code layers that bleed all over one another (the business logic layer outputting HTML and JavaScript code, and the UI layer performing business logic checks), and of course, lots of hidden bugs everywhere. It’s not a pretty picture.

A fun thing to do on an old code base is to search for the word “hack” and read all the hilarious comments that crop up. These comments typically start out with a warning along the lines of: “This is a horrible hack!”, “UGH hack hack bad bad”, or “DANGER! Hack!” This is followed up by an explanation of why the following code is, to put it nicely, less than optimal. I don’t think anybody who has ever checked in code such as that ever expected their changes to be permanent, but rather a temporary band-aid fix/implementation to meet a deadline. Unfortunately, many of the comments I see are dated many years back. Oops.

It’s not hard to see why. You check in the code, promising yourself you’ll come back and revisit it at some later point in time and do some badly needed cleanup, refactoring, or in a worst case scenario, a complete rewrite. But what inevitably happens is that there are new features to work on, bug fixes to make, meetings to attend, and pretty soon you are completely sidetracked. After all, if the current feature you’re cutting corners on is part of an unforgiving development cycle, why would future development cycles be planned out any differently?

Worse, each passing day makes a refactor that much riskier. Think about it – the easiest bug fixes to make are the ones you do a few days after you’ve completed a feature. This is because your brain doesn’t have to make a “context switch”; everything is still fresh in your memory. However, as time goes on, you grow less familiar with the code in question and it becomes more likely you’ll introduce new bugs into the system anytime you modify it. This problem is compounded over time by the fact that new functionality will invariably be built upon this previous code. In the worst case, a temporary hack becomes an integral cornerstone of the system architecture. You’ll then need to be cognizant of a whole set of complex dependencies that simply wouldn’t have existed back when the code was first written. After a certain point it becomes more expensive and risky to refactor/rewrite a broken piece of code than it is to simply leave it as is.

It’s also worth mentioning the political barrier to rewrites and refactors: upper management, sales, and marketing don’t really care about clean and elegant code, especially if it comes at the expense of more tangible things, such as a shiny new feature. It’s difficult to make a compelling PowerPoint slide to customers explaining that the new version of a product uses 30% more design patterns. This problem is mitigated if you work for a good tech company, but not everyone is so fortunate.

The bottom line here is that it is wishful thinking to hope that a temporary hack is going to be anything other than permanent. So what’s a developer to do? As the cliche goes, anything worth doing is worth doing right. Ok great, but even if you work weekends, the fact of the matter is that there are only so many hours in a day. Given an inflexible deadline, you’ve got to figure out what truly matters. Luckily, tough decisions such as these are why upper management exists. Transparency and honesty go a long way. All you can do is give them enough information so that they can make an informed decision. The important thing here is to emphasize the tradeoff between high quality code and being 100% feature complete by a given date.

Obviously, everyone would be happy if bug free code shipped on the feature complete date. But that’s not possible in most cases. Management may not be happy to hear that kind of news, but I think it’s a safe bet that they’d be far more upset to find out later on down the line that you were unable to deliver on what you promised. The choices would involve some combination of pushing back the release date, slipping certain features (or slipping/modifying certain requirements), and of course, scheduling time for the inevitable bug fix patches.

If shoddy code is knowingly shipped, the key thing is to stress the importance of a refactor. As I mentioned before, non techies typically don’t grasp the benefits of a well written piece of code, so it is your duty to make sure that they do understand. A well architected solution not only has fewer bugs, it will also be flexible enough to accommodate future requirements, making subsequent dev work that much easier. Of course, the downside is that it obviously takes much longer to come up with a good solution. However, this is only a one time price that you pay upfront. In contrast, bad code is bug ridden and inflexible. New functionality built on such a shoddy foundation will take longer to write and be buggier as well. This is a steep price that will be paid repeatedly in the future.

This is exactly why anytime code of dubious quality is shipped, it is imperative that you convince the powers that be to create some bug/task/feature in whatever bug tracking / project planning system your company uses to make sure this code is improved upon. Simply promising yourself that you will do it at some later unspecified date and time won’t be enough; chances are it’s not gonna happen.

FOOTNOTES:
*“Magic numbers” are numeric values that appear in code with no indication of their meaning. Ideally, these should be replaced with a descriptively named constant instead. For example, instead of:

weight = mass * 9.80665;

the following would be better:

weight = mass * EARTH_STANDARD_GRAVITY;

Manipulating raw bitmap data in .NET

The Bitmap class found in the .NET Framework provides a lot of useful functionality. Unfortunately, it doesn’t have any methods that let you easily manipulate the raw bitmap data. It provides a SetPixel method which takes x,y coordinates and a color value, and does exactly what you’d think it would. Unfortunately, the implementation is not the most efficient: it locks the entire bitmap before modifying the data. This is fine if you only need to manipulate a few pixels, but horrible if you need to manipulate the entire image. Needless to say, the overhead of locking and unlocking the data just to modify one measly little pixel is detrimental to performance.

I found a few articles on the best way to manipulate the raw data. The simplest and most intuitive way is to create a memory stream, save the bitmap to it, and then manipulate the bytes in the memory stream. Once you are done, you recreate the bitmap based on the modified memory stream. For example, this pedantic snippet sets every pixel in the bitmap to purple (based on this tutorial article):

MemoryStream ms = new MemoryStream();

bmp.Save(ms, ImageFormat.Bmp);
// Note: GetBuffer returns the stream's internal buffer, which may be
// larger than the data actually written, so bound the loop by ms.Length.
byte[] bitmapData = ms.GetBuffer();

const int BITMAP_HEADER_OFFSET = 54;
Color colorValue = Color.Purple;

// A 32bpp BMP stores each pixel as Blue, Green, Red, Alpha.
for (int i = 0; BITMAP_HEADER_OFFSET + i + 3 < ms.Length; i += 4)
{
    bitmapData[BITMAP_HEADER_OFFSET + i] = colorValue.B;
    bitmapData[BITMAP_HEADER_OFFSET + i + 1] = colorValue.G;
    bitmapData[BITMAP_HEADER_OFFSET + i + 2] = colorValue.R;
    bitmapData[BITMAP_HEADER_OFFSET + i + 3] = colorValue.A;
}

// Rewind the stream before reading it back in as a bitmap.
ms.Position = 0;
bmp = new Bitmap(ms);

Once you have the bytes themselves you can begin manipulating them. Each pixel is represented by four bytes, consisting of the Blue, Green, Red, and Alpha (this controls transparency) values, respectively. Divide the current byte index by four to get the pixel you are on. From there you can mod by the bitmap width to get the x coordinate, and divide by the bitmap width to get the row. One caveat: a BMP file stores its rows bottom-up, so row 0 in the buffer is the bottom row of the image.
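To make the index arithmetic concrete, here is a small sketch (in C, since the arithmetic itself is language-agnostic; the function name and dimensions are just for illustration) of mapping a byte offset within the pixel array back to image coordinates:

```c
#include <stddef.h>

typedef struct { int x; int y; } PixelPos;

/* Map a byte offset within a 32bpp pixel array back to pixel coordinates.
   Assumes 4 bytes (Blue, Green, Red, Alpha) per pixel and bottom-up row
   order, as in a BMP file. */
PixelPos pixel_from_byte_index(size_t byteIndex, int width, int height)
{
    size_t pixel = byteIndex / 4;                     /* which pixel this byte is in */
    PixelPos p;
    p.x = (int)(pixel % (size_t)width);               /* column */
    p.y = height - 1 - (int)(pixel / (size_t)width);  /* flip the bottom-up row */
    return p;
}
```

For example, in a 1000 x 1000 image, byte index 4 * (2 * 1000 + 7) lands on buffer row 2, column 7, which is image coordinate (7, 997) once the bottom-up ordering is accounted for.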

One thing to note is that the raw image data does not begin at byte 0. The first 54 bytes of the data consist of the bitmap header, which contains various metadata. The problem with this code, of course, is that it hard-codes this offset and assumes it will never change. This ties the code to the on-disk layout of the BMP file format rather than going through the Bitmap class’s public interface. Not the best practice.
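The hard-coded constant can at least be softened: per the BMP file format, bytes 10 through 13 of the file header hold the offset of the pixel array as a little-endian 32-bit integer, so the offset can be read out of the file itself. A sketch in C (the idea ports directly to C#; the function name is mine):

```c
#include <stdint.h>

/* Read the pixel-array offset from a BMP file header. Per the BMP file
   format, it is stored as a little-endian 32-bit integer at bytes 10..13.
   The caller must supply at least the 14-byte file header. */
uint32_t bmp_pixel_data_offset(const uint8_t *file)
{
    return (uint32_t)file[10]
         | ((uint32_t)file[11] << 8)
         | ((uint32_t)file[12] << 16)
         | ((uint32_t)file[13] << 24);
}
```

For the bitmaps produced above this returns the familiar 54, but it keeps working if the header ever has a different size.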

So another way of accomplishing the same task that does not require knowledge of the internal structure of a Windows bitmap is to lock the bitmap’s pixel data in memory and copy it over to a local buffer. The data can then be manipulated locally and copied back to the bitmap, which can then be unlocked. The following is based on the MSDN example.

// Lock the bitmap's bits.
Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
System.Drawing.Imaging.BitmapData bmpData =
    bmp.LockBits(rect, System.Drawing.Imaging.ImageLockMode.ReadWrite,
                 bmp.PixelFormat);

// Get the address of the first line.
IntPtr ptr = bmpData.Scan0;

// Declare an array to hold the bytes of the bitmap.
byte[] bitmapData = new byte[Math.Abs(bmpData.Stride) * bmp.Height];

// Copy the pixel values into the array.
System.Runtime.InteropServices.Marshal.Copy(ptr, bitmapData, 0, bitmapData.Length);

// A 32bpp ARGB bitmap lays its bytes out as Blue, Green, Red, Alpha.
Color colorValue = Color.Purple;
for (int i = 0; i < bitmapData.Length; i += 4)
{
    bitmapData[i] = colorValue.B;
    bitmapData[i + 1] = colorValue.G;
    bitmapData[i + 2] = colorValue.R;
    bitmapData[i + 3] = colorValue.A;
}

// Copy the pixel values back to the bitmap.
System.Runtime.InteropServices.Marshal.Copy(bitmapData, 0, ptr, bitmapData.Length);

// Unlock the bits.
bmp.UnlockBits(bmpData);

Note the use of the BitmapData class, particularly the use of Scan0 and Stride. Scan0 gets the memory location of the actual image data itself, completely bypassing the problem of figuring out where the bitmap header ends. Stride is the width of a single row of the bitmap in bytes, including any padding (it can be negative for bottom-up bitmaps, hence the Math.Abs in the code above).
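Going the other direction, the stride is what you use to turn an (x, y) coordinate into a byte index within the locked buffer. A quick sketch in C (assuming a 32bpp format and a positive, i.e. top-down, stride; the function name is mine):

```c
#include <stddef.h>

/* Byte index of pixel (x, y) within a locked 32bpp buffer. The stride is
   the width of one row in bytes, padding included, which is why it is used
   for the row step instead of width * 4. */
size_t byte_index_from_xy(int x, int y, int stride)
{
    return (size_t)y * (size_t)stride + (size_t)x * 4;
}
```

With a 1000-pixel-wide 32bpp bitmap the stride is 4000 bytes, so pixel (7, 2) starts at byte 8028.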

LockBits and UnlockBits are used to pin the bitmap’s pixel data in memory. What this means is that the data now has a fixed location in memory and cannot be moved. Typically, the garbage collector moves objects around during the lifetime of an application. This is done to avoid memory fragmentation. For example, the garbage collector has a compaction phase where objects that are still being referenced are packed together in the heap. This solves the problem of being unable to allocate a block of memory even though there is enough free memory (imagine, for example, that there is 3KB of free memory, but a 3KB block cannot be allocated because the 3KB is not contiguous and is spread out in small chunks throughout the heap). Pinning the bitmap data allows us to safely copy it back and forth without having to worry about whether or not that memory location is still the correct address.

To test the performance, I wrote a simple program to set all the pixels in a 1000 x 1000 bitmap to purple. The pinning method is slightly faster, but not by much. Here is the code:

using System;
using System.Drawing;
using System.IO;
using System.Drawing.Imaging;

namespace BitmapTestFramework
{
    class Program
    {
        static void Main(string[] args)
        {
            Bitmap testBitmap = new Bitmap(1000, 1000);
            DateTime start;
            DateTime end;
            TimeSpan timeSpan;

            Console.WriteLine("Method 1:   Calling Get/Set Pixel");
            start = DateTime.UtcNow;
            Method1(testBitmap);
            end = DateTime.UtcNow;
            timeSpan = end - start;
            Console.WriteLine(string.Format("Completed in {0} milliseconds", timeSpan.TotalMilliseconds));

            Console.WriteLine("Method 2:  Create a memory stream of the bitmap and manipulate the buffer");
            start = DateTime.UtcNow;
            Method2(testBitmap);
            end = DateTime.UtcNow;
            timeSpan = end - start;
            Console.WriteLine(string.Format("Completed in {0} milliseconds", timeSpan.TotalMilliseconds));

            Console.WriteLine("Method 3:   Pin the bitmap object in memory and manipulate a local copy of the data");
            start = DateTime.UtcNow;
            Method3(testBitmap);
            end = DateTime.UtcNow;
            timeSpan = end - start;
            Console.WriteLine(string.Format("Completed in {0} milliseconds", timeSpan.TotalMilliseconds));
        }

        static void Method1(Bitmap bmp)
        {
            for (int x = 0; x < bmp.Width; x++)
            {
                for (int y = 0; y < bmp.Height; y++)
                {
                    bmp.SetPixel(x, y, Color.Purple);
                }
            }
        }

        static void Method2(Bitmap bmp)
        {
            MemoryStream ms = new MemoryStream();
            bmp.Save(ms, ImageFormat.Bmp);
            byte[] bitmapData = ms.GetBuffer();

            const int BITMAP_HEADER_OFFSET = 54;

            // A 32bpp BMP stores each pixel as Blue, Green, Red, Alpha.
            Color colorValue = Color.Purple;
            for (int i = 0; BITMAP_HEADER_OFFSET + i + 3 < ms.Length; i += 4)
            {
                bitmapData[BITMAP_HEADER_OFFSET + i] = colorValue.B;
                bitmapData[BITMAP_HEADER_OFFSET + i + 1] = colorValue.G;
                bitmapData[BITMAP_HEADER_OFFSET + i + 2] = colorValue.R;
                bitmapData[BITMAP_HEADER_OFFSET + i + 3] = colorValue.A;
            }

            // Note: reassigning the parameter does not affect the caller's
            // reference; real code would return the new bitmap instead.
            ms.Position = 0;
            bmp = new Bitmap(ms);
        }

        private static void Method3(Bitmap bmp)
        {
            // Lock the bitmap's bits.
            Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
            System.Drawing.Imaging.BitmapData bmpData =
                bmp.LockBits(rect, System.Drawing.Imaging.ImageLockMode.ReadWrite, bmp.PixelFormat);

            // Get the address of the first line.
            IntPtr ptr = bmpData.Scan0;

            // Declare an array to hold the bytes of the bitmap.
            byte[] bitmapData = new byte[Math.Abs(bmpData.Stride) * bmp.Height];

            // Copy the pixel values into the array.
            System.Runtime.InteropServices.Marshal.Copy(ptr, bitmapData, 0, bitmapData.Length);

            // A 32bpp ARGB bitmap lays its bytes out as Blue, Green, Red, Alpha.
            Color colorValue = Color.Purple;
            for (int i = 0; i < bitmapData.Length; i += 4)
            {
                bitmapData[i] = colorValue.B;
                bitmapData[i + 1] = colorValue.G;
                bitmapData[i + 2] = colorValue.R;
                bitmapData[i + 3] = colorValue.A;
            }

            // Copy the pixel values back to the bitmap.
            System.Runtime.InteropServices.Marshal.Copy(bitmapData, 0, ptr, bitmapData.Length);

            // Unlock the bits.
            bmp.UnlockBits(bmpData);
        }
    }
}

Here is the output on my machine:

Method 1:   Calling Get/Set Pixel
Completed in 649.414 milliseconds
Method 2:  Create a memory stream of the bitmap and manipulate the buffer
Completed in 91.7969 milliseconds
Method 3:   Pin the bitmap object in memory and manipulate a local copy of the data
Completed in 82.0312 milliseconds

Simple jQuery tutorial

jQuery is a JavaScript development framework that makes life for the JavaScript developer that much easier. It provides a set of libraries and standardized UI widgets, all topped with a liberal dose of syntactic sugar, to make tedious tasks such as manipulating the DOM and AJAX support that much simpler. True, there are frameworks out there that provide a similar set of functionality, such as Prototype and Ext JS. However, jQuery currently has a commanding market share and is seeing widespread adoption. A few notable examples include Microsoft, which ships jQuery with ASP.NET MVC, and Google, which hosts its own copy of the jQuery library. Part of jQuery’s success can be attributed to the fact that cross-browser compatibility is one of its main design requirements. Frameworks such as Ext JS, which make extensions and modifications to the DOM, cannot make such a claim. In a way, jQuery’s success is analogous to why the C programming language became so popular: C provided a high-level abstraction over assembly that simplified coding by orders of magnitude (and as someone who had to write a quicksort algorithm in MIPS assembly as part of a CS homework assignment, I can assure the reader that writing assembly is indeed a nightmare), and it was an incredibly portable language as well (whereas any variant of assembly was completely machine dependent).

I haven't had the opportunity to do much front end development in the past, but luckily my new job will involve a lot of JavaScript coding. I decided to play around with some jQuery so I could start getting my hands dirty. I wanted to begin with something simple. One of the side projects I'm working on involved displaying/hiding a set of form fields based on a drop down selection, so I figured that was as good a place as any to start.

The code for this example ends up being much shorter than the plain JavaScript equivalent. After all, jQuery’s motto is “Write Less, Do More”. Here it is in all its glory:

<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
    $('#optional').hide();

    $('#dropdown').change(function(){
        if($('#dropdown').val() == 'Extra'){
            $('#optional').show();
        }
        else{
            $('#optional').hide();
        }
    });
});

</script>

Select an option:
<select id="dropdown">
    <option value="Standard">Show regular fields</option>
    <option value="Extra">Show extra fields</option>
</select>
<br />
<br />
Regular field 1: <br />
<input type="text"  /><br/>
Regular field 2:<br/>
<input type="text"  /><br /> 
<br />
<div id="optional">
Extra field 1:<br />
<input type="text" /><br />
Extra field 2:<br />
<input type="text" /><br />
</div>

The HTML side of things looks pretty much how you’d expect. The dropdown has an ID attribute of “dropdown”, and the div containing the extra text fields has an ID of “optional”. Obviously using such unimaginative and generic ID attributes in real life would be terrible. This is only a tutorial of course.

On the JavaScript side of things, a reference to the jQuery library is included. What follows afterward is the $(document).ready function. Here’s where things get a little interesting. This represents the entry point for jQuery, kind of like your typical main() function in C or Java. The $ is shorthand for the jQuery object. Here is where all the magic happens.

Inside the ready function, an event handler is defined for the drop down list: $(“#dropdown”).change. jQuery defines multiple events that you can hook into. In our case, we are interested in the change event.

The event handler uses an if conditional to evaluate the value of the selected drop down list item. Luckily, we no longer have to use awkward document.getElementById calls to access elements inside the DOM. Instead you can simply use the shorthand $(“#id”). Grabbing the value of an element can then be done by calling the val() method. So, in order to figure out the selected value of the drop down, we simply call $(‘#dropdown’).val().

The if conditional evaluates the value and shows/hides the div accordingly. Note that jQuery comes with show() and hide() functions (and quite a few others) which do exactly what you’d expect. No more need to set CSS visibility properties manually!

So there you have it. A simple tutorial (albeit a trite example) on jQuery!

A reading list for the stuff they didn’t teach you in school

One of the problems with computer science courses is that they are too academic: heavy on theory while lacking in practical application. That’s not to say that theory is bad. Algorithms, data structures, run time analysis, discrete mathematics, and so on are the bread and butter of a solid developer. However, actually knowing how to write code is a key skill as well, and surprisingly, it is something that is not adequately taught in schools.

I attended the University of Washington computer science program. Aside from a few introductory programming classes, none of the courses really did a deep dive on the art of coding. Best practices regarding things such as good design, performance and scalability, testing strategies, naming conventions, formatting, debugging, defensive coding, etc., were things students had to pick up on their own. My software engineering course, which should presumably have covered some of these topics, was woefully inadequate. Part of the problem was that it was only a quarter long. In the real world, devs work in large teams and even larger code bases consisting of millions of lines of code. One short course cannot possibly prepare students for that kind of complexity.

One way to mitigate this problem is of course, to do actual work in the real world.  However, that can be a crapshoot.  You’d have to get lucky and find a good mentor who can show you the ropes.  The other option is to teach yourself.  Luckily, there are many books out there on how to write software well.  Here is a simple list of books that I’ve read which I’ve found to be helpful.  Obviously, this list is by no means complete.

  • Code Complete – this is one of the most comprehensive books out there on how to actually write code well. It was written specifically to capture all the best practices regarding development. It is well written and well researched, citing sources from both the professional world and academia. The author is quite thorough as well. Because there is no “one size fits all” solution in software development, he presents multiple viewpoints for many of the topics, discussing the pros and cons of each. However, because the book is so thorough, it is also quite long (my copy is 914 pages). Some of the information may be a bit too basic (such as the sections on conditionals and data types); however, there is something in here for everyone. As an added bonus, every chapter has a recommended reading list.
  • Writing Solid Code – a lot of cynical people will point out that this book was published by Microsoft and make snide remarks about buggy Microsoft products and blue screens. However, this is a good defensive coding book. All the examples are in C and involve the joys of pointers and manual memory management, but a lot of the concepts still apply to today’s modern languages, such as the numerous examples of how to use assertions properly. One of the most insightful things I took away from the book is that solid code is the result of a good mindset. Anytime you discover one of your own bugs, ask yourself how and why the bug happened, and what you can do to automatically detect or prevent similar bugs in the future.
  • The Art of UNIX Programming – this book covers the design philosophy of Unix/Linux systems. Because it discusses design philosophy, the information in this book is relevant to all developers, not just those working on a Linux distro. The author discusses things such as transparency, modularity, simplicity, parsimony (writing programs as small as possible and no larger), composition (writing programs that work well with one another), file formats, minilanguages, and a diverse set of other topics. In addition, the book contains quotes and thoughts from many computing legends such as Dennis Ritchie, Doug McIlroy, and Brian Kernighan. There’s even a section at the beginning detailing the history of Unix computing. As someone too young to have been a part of the computing revolution, it’s always a treat to learn about its past. It’s difficult to sum up the book in a nutshell, but one of the key points is to write small, concise, simple programs that do one task well, and to write these programs in such a way that they can take input from other programs and output to other programs. By building upon this solid foundation of simplicity, one can elegantly combine these simple programs in complex ways to achieve impressive results. This philosophy is what allows users to do things such as deleting all the HTML files containing the word “foobar”: find /path/to/public_html -name '*.html' -print0 | xargs -0 grep -lZ 'foobar' | xargs -0 rm -f. The book does get a little too zealous at times, bashing Windows and preaching the merits of open source, but that’s forgivable (some might even agree), considering the wealth of knowledge and experience contained within.
  • The Mythical Man-Month and Other Essays – This is a collection of essays written by Fred Brooks. I actually had to read this book for my software engineering class at the UW, and didn’t like it. I only include this book in this list because it is considered a classic. My problem is that a lot of the information in these essays is just plain outdated and wrong. The author talks about microfiche, time sharing, and ancient mainframes, and treats memory like a rare and precious commodity. His writing style is also quite dry. That’s not to say that there isn’t wisdom to be found in this book. Ideas such as no silver bullets, the mythical man month (adding more people to a late project makes it later), and favoring data representation over algorithms (“Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”) still ring true today. Fred Brooks is constantly quoted in other books and articles though, so all the wisdom can be found elsewhere in material that is more current and relevant. In my opinion, you don’t really need to read this book unless you are curious about what programming was like back then (which other books do a better job of covering anyway).
  • Joel on Software 1 & 2 – Joel on Software is a collection of essays from Joel Spolsky’s blog. Joel is the CEO of Fog Creek Software (which makes FogBugz, among other things). Joel’s writing style makes for easy reading. He has a sharp wit, a good sense of humor, and like all good bloggers, is controversial, purposefully ruffling feathers with essays such as “exceptions are no better than gotos”. His essays cover a lot more than development best practices, and include topics such as high level articles on how to run your own business. As he explains, a good developer will understand the business and have a good grasp of microeconomics. And of course, it couldn’t hurt to have this knowledge if someday you want to start your own company. Granted, you can always read his blog instead of buying these books, but I enjoy the rustic feel of reading a good book on a lazy Sunday afternoon.
  • Programming Pearls – This book is also considered a classic, and rightfully so. It reads like a computer science textbook, and that’s because it’s a collection of columns published in the Communications of the ACM, written by Jon Bentley. It is deceptively short, but tackles lots of programming concepts and problems such as algorithms, testing, debugging, program verification, performance, searching, sorting, and string manipulation. I’m currently still working through this book. I say work, because this book provides an invaluable list of practice exercises at the end of every section. Anyone who’s ever interviewed for a job can surely recognize one variant or another of some of these difficult problems at the back of each chapter :)