Wednesday, July 22, 2015

Your Bug Reports Suck

Introduction

I've been a professional software developer for nearly a decade. In that time, I've read a lot of bug reports, and almost all of them have sucked. A lot. Hardcore suckitude. They really, really suck.

There's something you need to understand about us developers: We typically don't know how to use the software that well. That might come as a shock, but it's true. The people who know the application the best are the people who use it on a regular basis. They have adapted to it, developed their own processes, and repeated them so often that they've become second nature. The support staff and testers (the people typically writing the bug reports) are a close second to the end users in application knowledge, while developers are often a distant third. Developers are, unfortunately, not mind readers, nor do we possess a crystal ball, so our ability to diagnose and fix bugs will live and die with the quality of the bug reports we receive.

What to Include in Bug Reports

So how, you may wonder, does one write a good bug report? Well, for one, you need to know what content to include, and, for that, I would start with item #4 on the Joel Test. Here's what Joel has to say:

A minimal useful bug database must include the following data for every bug:

  • complete steps to reproduce the bug
  • expected behavior
  • observed (buggy) behavior
  • who it's assigned to
  • whether it has been fixed or not

Let's look at these a bit further...

The first bullet point is by far the most crucial. Good reproduction steps point the reader to the part of the application that isn't functioning properly and (hopefully) guide us right to the problem. Without complete reproduction steps, a bug report is basically useless.

A common problem that I see is omitting a reproduction step or two. This sometimes occurs due to laziness or by mistake, but it can also happen because the person writing the bug report assumed that the developer knew what to do at a certain point in the user's process. Never assume! As I mentioned in the beginning, developers don't know the software as well as you think we do.

The next two bullet points are obviously pretty important, too. We need to know what the application is supposed to be doing (the expected behavior), as well as what it is doing (the bug). The combination of these two items is what I typically call the problem statement.

The fourth bullet point (the assignee) is pretty self-explanatory. The last bullet point is often referred to as the bug's status, and this is usually pretty straightforward. However, I have seen people invent some rather nebulous terms for issue status. I recommend naming the statuses as clearly as possible (e.g. "Open", "In Progress", "Development Complete") and providing detailed definitions for each status within your documentation.

Joel's list is a pretty good jumping-off point, but I think we can take it a bit further. I also recommend including the following pieces of information whenever available:

  1. Who reported the issue? Some problems are isolated to a specific user, and we need to know what makes them different. What type of user are they? Do they have any specific permissions granted (or missing)? Are they on our network, or are they accessing the software externally? This information can be vital when trying to reproduce and/or diagnose issues. This seems like a lot of information to include in a bug report, and indeed it is, but getting (at a minimum) the user's name, contact info, and login ID is at least a starting point in case more information is needed later.
  2. When did the user first notice the problem? Knowing the date and time that an issue first manifested itself can be of vital importance. It often helps us determine if it was caused by something we did (a software change, hardware change, data modification, etc.) at a certain time.
  3. What is the frequency of occurrence? Does this issue happen every time the user performs a specific action, or does it occur intermittently? Does it only occur at certain points of the day? Are all users experiencing this issue, or are only certain users experiencing it (in which case the information in item #1 can be crucial)?
  4. Which environment was this issue observed in? A bug might be found in production, but it could also first be noticed in a different environment (such as QA). This environment could have a slightly different version of the application. It's important to indicate where the bug is observed so that the developer knows a) where they can reproduce the issue, and b) which version of the source code to look at.
  5. Which web browser was the issue observed in? Much of our software these days is web-based, and, unfortunately, applications that function fine in one web browser can be buggy in others. For web applications, always include the browser name and version number in the bug report.
  6. Is there a workaround? What are the workaround steps? This sort of information can help keep users productive while the issue is being resolved. Occasionally, it can also help developers diagnose and fix the bug being addressed.
  7. What is the severity of the issue? Indicating the severity of an issue is important in helping developers prioritize the issues that need to be addressed (please note that an issue's severity is related to, although not necessarily the same as, its priority). It's highly recommended that some sort of severity scale is developed and strictly adhered to. I recommend something like the following:
    • High = Critical functionality is impacted and no workaround exists.
    • Medium = Critical functionality is impacted, but a workaround exists.
    • Low = Non-critical functionality is impacted.

Obviously, not all of the information above is always necessary (severity and workarounds aren't as important for non-production issues) or even applicable (non-web-based applications don't require a browser). However, the point is that a bug can have a lot of variables, and it's important to know as much about those variables as we can so we can diagnose the issue quickly and efficiently.
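
One way to make the checklist above concrete is a small template check. The sketch below is only an illustration: the field names, the sample report, and the findMissingFields() and severityOf() helpers are all hypothetical and not part of any particular bug-tracking tool.

```javascript
//Hypothetical field names for a bug report template; adjust to your own process.
var REQUIRED_FIELDS = [
    "stepsToReproduce",
    "expectedBehavior",
    "observedBehavior",
    "assignee",
    "status"
];

//Returns the names of required fields that are missing or empty.
var findMissingFields = function(report) {
    return REQUIRED_FIELDS.filter(function(field) {
        var value = report[field];
        if (Array.isArray(value)) {
            return value.length === 0;
        }
        return value === undefined || value === null || String(value).trim() === "";
    });
};

//Maps the two questions behind the severity scale above to a label.
var severityOf = function(criticalFunctionality, workaroundExists) {
    if (!criticalFunctionality) {
        return "Low";
    }
    return workaroundExists ? "Medium" : "High";
};

//Made-up sample data for illustration only.
var report = {
    stepsToReproduce: [
        "Log in as a user with the Clerk role.",
        "Open the Orders screen.",
        "Click the Export button."
    ],
    expectedBehavior: "A CSV file of the current orders is downloaded.",
    observedBehavior: "An error page is displayed.",
    assignee: "",
    status: "Open",
    severity: severityOf(true, false)
};

console.log(findMissingFields(report).join(", ")); //assignee
console.log(report.severity); //High
```

A check like this could run before a report is submitted, so that an incomplete report is caught by the person who can still fill in the gaps.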

Including Quality Content

In addition to knowing what content to provide, you'll want to make sure that content is as clear and as useful as it can be. Here are just a few of the things you can do to improve the content of your bug reports:

  1. Use complete sentences with correct spelling, punctuation, and grammar. This seems like a no-brainer, but you would be amazed at some of the positively dreadful bug reports I've read. Don't assume the person reading your bug report will understand sentence fragments or shorthand. Before submitting the report, reread everything you've written to ensure clarity.
  2. Call things what they are called on the user interface. Users will often use industry jargon or come up with their own terminology for certain screens, UI elements or pieces of data. Developers won't know this terminology; they'll be relying on how things are labeled on the screen. Stay away from cryptic terms whenever possible.
  3. Use ordered (i.e. numbered) lists. This is a rather simple thing to do, but it makes communication between team members much clearer (e.g. "can you elaborate on reproduction step #4?").
  4. Take screenshots. This one is pretty self-explanatory. We've all heard the old saying that a picture is worth a thousand words. While the phrase "refer to screenshot" might not save you from typing a thousand words, it will probably make your life easier.
  5. Use screenshots to include examples of form input data. Here's a common scenario: a user fills out a lengthy form, clicks a submit button, and then receives some nasty error. As much as it might suck, any one of those pieces of data could have caused the issue, and you'll need to include them all. But instead of typing all of those form values into a note in your bug report, you can create a screenshot while you're reproducing the bug. Fill out the form, then take a screenshot before you click the submit button.
  6. Save your reproduction steps! You just typed a TON of words on how to reproduce a problem... do you want to have to retype all that every time this screen breaks? Of course not! Save reproduction steps for common tasks into text files or Word documents, and store them in a folder on your hard drive. Include the name of the UI at the beginning of the file name so you can locate them quickly.
  7. Consider developing (or improving) your own template for bug reports. If your bug reports are hand-written, consider developing a Word template. If you're using a commercial tool for bug tracking, look into customizing it. If you find that the developers are constantly coming back to you asking you for a specific piece of information, that's a strong indication that it should be included in all bug reports.

The Developer's Responsibility

Okay, okay... I've spent a lot of time berating support staff and testers over their bug reports. But, as it turns out, developers aren't perfect either. There are a few things we should always be doing with bugs that get assigned to us:

  1. If you can't find the problem, at least add notes for other developers. Investigating bugs can often feel like "doing science"... no matter how methodical you are, you won't make a discovery until you make a discovery. Diagnosing bugs can be time consuming, and it can be frustrating and costly to retread the same ground that others have already covered. Add as much detail as you can to the bug report about where you looked and what you tried. Try to be as specific as possible; include names of source code files, subroutines, or even line numbers. It sounds like a lot of extra effort, but it may save you or your teammates time later on.
  2. After fixing a bug, add instructions related to how to test changes. Usually the reproduction steps are the testing instructions. But sometimes a change can have a far-reaching impact and require re-testing a variety of screens or functionality that (seemingly) had nothing to do with the original bug. In these instances, add a note to the bug report calling out what areas should be tested.

Conclusion

Bugs suck, but bug reports don't have to suck. By following a few guidelines, we can make bug reports clearer and more informative, which makes them more useful to developers.

JavaScript: call(), apply(), & bind()

Introduction

JavaScript functions are actually objects themselves, and, like any other objects, contain properties and methods. Three of these methods, call(), apply(), and bind(), can be used to provide a calling context for a specific function, thereby treating ordinary functions like methods. This post aims to briefly describe and demonstrate how to use these methods (albeit with a contrived example).

Initial Setup

Our initial setup code is pretty straightforward: we have an object called person that has two properties (firstName and lastName) and one method (getName()). Running the code will log the person's full name to the console.

var person = {  
    firstName: "John",
    lastName: "Doe",
    getName: function() { 
        return this.firstName + " " + this.lastName; 
    }
};
console.log(person.getName()); //John Doe

The call() Method

At some point later in our application, we define a function to set the firstName and lastName properties at the same time.

var setName = function(newFirst, newLast) { 
    this.firstName = newFirst; 
    this.lastName = newLast;
};

The problem here is that we've declared this function outside of our person object. The value of this inside a function depends on how the function is invoked, and a plain invocation such as setName("Joe", "Schmoe") makes this refer to the global object (i.e. the Window object if our code is running in a web browser), or to undefined in strict mode, rather than to person. Invoking it as person.setName("Joe", "Schmoe") won't work either, because person has no setName property. So neither invocation will do what we want it to do. How can we invoke this function as if it were a method on the person object? Our answer lies in the call() method.

The call() method allows you to invoke a function in the context of a specific object, which you provide as the first argument. In other words, the call() method lets you treat any function like a method on an object of your choosing. Any additional arguments provided are passed to the function being called (so invoking a function with n arguments via call() requires n+1 arguments).

setName.call(person, "Joe", "Schmoe");
console.log(person.getName());  //Joe Schmoe

In the above example, the setName() function is being called as if it were a method on the person object. So the code works the same way as if we had just given person a setName() method in the first place and were now invoking it like person.setName("Joe", "Schmoe").
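
For contrast, here is a rough sketch of the two invocations that don't work, next to the call() version that does. The errorNameOf() helper is hypothetical and exists only to capture the errors. The "use strict" directive is an assumption added so that the bare call fails loudly; without it, the bare call would quietly set properties on the global object instead.

```javascript
var person = {
    firstName: "John",
    lastName: "Doe"
};

var setName = function(newFirst, newLast) {
    "use strict"; //assumption: strict mode, so a bare call throws
    this.firstName = newFirst;
    this.lastName = newLast;
};

//Hypothetical helper: returns the name of the error thrown by fn, or "no error".
var errorNameOf = function(fn) {
    try {
        fn();
        return "no error";
    } catch (e) {
        return e.name;
    }
};

//person has no setName property, so there is nothing to invoke.
console.log(errorNameOf(function() {
    person.setName("Joe", "Schmoe");
})); //TypeError

//In strict mode, this is undefined inside a bare call.
console.log(errorNameOf(function() {
    setName("Joe", "Schmoe");
})); //TypeError

//call() supplies the context explicitly.
setName.call(person, "Joe", "Schmoe");
console.log(person.firstName + " " + person.lastName); //Joe Schmoe
```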

The function being called doesn't have to be declared outside of an object; it can even be a method declared inside a different object. The example below utilizes the setFullName() method defined in customer to perform the same operation on person.

var customer = {
    ...
    setFullName: function(newFirst, newLast) {
        this.firstName = newFirst; 
        this.lastName = newLast;
    }
};
customer.setFullName.call(person, "Joe", "Schmoe");
console.log(person.getName());  //Joe Schmoe

The apply() Method

The apply() method functions almost identically to call(). The only real difference is that, while call() accepts a variable number of arguments, apply() only accepts two: the context object (i.e. the value of this inside the function) and an array of arguments to be passed in to the target function. So the equivalent code to our call() example utilizing apply() is written as follows:

setName.apply(person, ["Joe", "Schmoe"]);
console.log(person.getName()); //Joe Schmoe

The apply() method is a bit more flexible than call(), since it can be used in scenarios where the number of function arguments is not known until runtime. Remembering which one is which can be a little confusing at first; I find it easiest to just remember the mnemonic "A for arrays, C for commas".
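
As a small sketch of the "not known until runtime" case: the scores array and the name string below are made-up inputs standing in for data that only shows up while the program is running.

```javascript
//Hypothetical input: imagine this array arrives at runtime.
var scores = [72, 95, 88];

//Math.max() expects separate arguments, so apply() spreads the array out for us.
//Math.max() doesn't use this, so null is fine as the context object.
var highest = Math.max.apply(null, scores);
console.log(highest); //95

//The same idea with the setName() function from this post.
var person = { firstName: "John", lastName: "Doe" };
var setName = function(newFirst, newLast) {
    this.firstName = newFirst;
    this.lastName = newLast;
};
var nameParts = "Joe Schmoe".split(" ");
setName.apply(person, nameParts);
console.log(person.firstName + " " + person.lastName); //Joe Schmoe
```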

The bind() Method

The last method we will discuss is the bind() method. Much like the other methods we've discussed, the first argument supplied to bind() is the context object (i.e. the value of this inside the function). Unlike those other methods, bind() does not invoke a function when called. Instead, it returns a new function. Let's demonstrate this using another example:

var setPersonName = setName.bind(person);
setPersonName("Suzy", "Schmoe");
console.log(person.getName()); //Suzy Schmoe

The example above takes setName(), binds it to the person object, and stores it in a new function called setPersonName(). When this new function is invoked, it will modify the person object's properties. Notice that setPersonName() behaves like a method, but is not invoked like one. bind() is useful for creating shortcut functions (as in the above example) or in situations where you want to pass a method as a callback argument to another function without losing its context.
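
The callback case might look something like the sketch below. The describe() function is a hypothetical stand-in for any code that accepts a function and calls it later.

```javascript
var person = {
    firstName: "John",
    lastName: "Doe",
    getName: function() {
        return this.firstName + " " + this.lastName;
    }
};

//Hypothetical helper that only knows how to call the function it is given.
var describe = function(getLabel) {
    return "Name: " + getLabel();
};

//Passing person.getName directly would lose the person context,
//because describe() invokes it as a plain function.
var boundGetName = person.getName.bind(person);
console.log(describe(boundGetName)); //Name: John Doe
```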

Completed Example Code

var person = {  
    firstName: "John",
    lastName: "Doe",
    getName: function() { 
        return this.firstName + " " + this.lastName; 
    }
};
console.log(person.getName()); //John Doe

var setName = function(newFirst, newLast) { 
    this.firstName = newFirst; 
    this.lastName = newLast;
};

setName.call(person, "Joe", "Schmoe");
console.log(person.getName());  //Joe Schmoe

var customer = {
    firstName: "Sam",
    lastName: "Smith",
    setFullName: function(newFirst, newLast) {
        this.firstName = newFirst; 
        this.lastName = newLast;
    }
};

customer.setFullName.call(person, "Joe", "Schmoe");
console.log(person.getName());  //Joe Schmoe

setName.apply(person, ["Joe", "Schmoe"]);
console.log(person.getName()); //Joe Schmoe

var setPersonName = setName.bind(person);
setPersonName("Suzy", "Schmoe");
console.log(person.getName()); //Suzy Schmoe

Type Systems

Overview

Static vs. dynamic? Strong vs. weak? What does it all mean? And what's this I keep hearing about ducks? In this post, I will try to shed some light on these mysterious terms...

Static vs. Dynamic Type-Checking

Static type-checking is the process of verifying a program's type safety before it is run (i.e. at compile time). But what does it mean to "verify a program's type safety"? Well, it simply means to ensure that types are being used as they are supposed to (ints are used like ints, strings are used like strings, and so on). Therefore, we would expect a statement like int x = "one"; to fail to compile in a statically type-checked language. C, C++, Java, and C# (*see note below) all behave this way.

Dynamic type-checking, on the other hand, is the process of verifying a program's type safety at runtime. Here are a few JavaScript expressions that would fail to execute due to dynamic type-checking:

var n = 123;
console.log(n.toUpperCase()); //TypeError (n is not a String)
n(); //TypeError (n is not a Function)

It is pretty common for interpreted languages (such as JavaScript, Ruby, and Python) to have dynamic type-checking, but some compiled languages have this trait as well. It's also possible for a language to have both static and dynamic type-checking.

*NOTE: C# 4.0 and above allows a variable to be declared as dynamic. This behaves similarly to declaring a variable as an object, except that expressions involving these variables are not type-checked at compile time. This was introduced for interoperability with dynamically type-checked languages.

Strong vs. Weak Typing

The "strong" vs. "weak" type system comparison is not as cut-and-dried or as well-defined as the static vs. dynamic comparison. The difference between strong and weak type systems has to do with how the language handles values whose type does not match the type expected by an operation. A "strongly" typed language will typically throw an error, while a "weakly" typed language will attempt to convert the argument(s) to complete the operation.

Consider the expression 3 + "4". Attempting this in Ruby will throw a TypeError, whereas JavaScript will coerce the 3 into a string and produce the result "34". Therefore, it can be said that Ruby is strongly typed and JavaScript is weakly typed (note that both languages use dynamic type-checking).
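
A few more JavaScript coercions, as a quick sketch of what "attempt to convert the argument(s)" means in practice:

```javascript
console.log(3 + "4");          //34 (the number is converted to a string)
console.log(typeof (3 + "4")); //string
console.log("3" * "4");        //12 (both strings are converted to numbers)
console.log(3 - "1");          //2 (the string is converted to a number)
```

Note that the + operator prefers string concatenation when either operand is a string, while the other arithmetic operators convert their operands to numbers.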

Duck Typing

So what, then, is duck typing? The answer typically given to this question is the following:

When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

Yeah, I'm not wild about people who answer straightforward questions with riddles, either.

Consider instead the following JavaScript function and example usage.

var maxLength = function(a, b) {
    return Math.max(a.length, b.length);
};
console.log(maxLength([1, 2, 3], "hello"));

The two arguments could both be strings, could both be arrays, or could be one of each. Or they could be something else altogether. As long as they both have a length property, the function goes quietly about its business. This is duck typing.
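
To sketch the "something else altogether" case, the playlist object below is a made-up example of a value that is neither a string nor an array. The last line shows the flip side: nothing stops you from passing a value without a length property.

```javascript
var maxLength = function(a, b) {
    return Math.max(a.length, b.length);
};

//A hypothetical object that is neither a string nor an array, but has a length.
var playlist = { title: "Road Trip", length: 12 };
console.log(maxLength(playlist, "hello")); //12

//A number has no length property, so the comparison quietly yields NaN.
console.log(maxLength(42, "hello")); //NaN
```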

Duck typing is something that can really only manifest in languages with dynamic type-checking, although it is not limited to weakly typed languages. Duck typing differs from nominal typing, where variables are determined to be type-compatible by comparing specifically declared types. Structurally typed languages offer something of a middle ground between duck typing and nominal typing; two variables are type-compatible if their entire structures are the same (i.e. all of their properties and methods).

Further Reading

If you're looking for more detailed explanations, Wikipedia has a fairly extensive entry on type systems, which also links to entries on the majority of the topics discussed above. I personally find their comparison of type systems found in major languages to be a more useful jumping-off point.

Enforcing one-to-(zero or one) Relationships in an RDBMS

Overview

The following sections outline approaches one might take to enforce a one-to-(zero or one) relationship in a relational database.

Approach 1: Store the data in the same table.

This is pretty self-explanatory. If the data is truly 1:(0..1) there is no logical reason the data can't coexist in one table. Just add the additional columns that would go in the child table to your parent table and make them nullable.

However, you may actually want to separate your data into multiple tables. This could be because you want to conform to a specific model (the tables represent different entities), or because you have certain physical constraints to consider (limited space, the parent table is already as wide as you're willing to make it, etc.). There are certainly other ways to accommodate this scenario...

Approach 2a: Use a foreign key constraint.

When you create your child table (the (0..1) part of the relationship), add a foreign key column that references the primary key of the parent table. By marking the foreign key column NOT NULL and UNIQUE, you ensure that all records in the child table must reference one (and only one) record in the parent table. The resulting SQL will look something like this:

CREATE TABLE Parent (
    ParentID INT IDENTITY(1,1) NOT NULL,
    ...
    CONSTRAINT PK_Parent PRIMARY KEY (ParentID)
)

CREATE TABLE Child (
    ChildID INT IDENTITY(1,1) NOT NULL,
    ParentID INT NOT NULL UNIQUE,
    ...
    CONSTRAINT PK_Child PRIMARY KEY (ChildID),
    CONSTRAINT FK_Parent_Child_ParentID FOREIGN KEY (ParentID) REFERENCES Parent (ParentID)
)

Approach 2b: Use the same primary key.

This is sort of a twist on the previous approach. You can actually eliminate an extra column in the child table by folding the primary key and foreign key into one column. This can be accomplished with the following SQL:

CREATE TABLE Parent (
    ParentID INT IDENTITY(1,1) NOT NULL,
    ...
    CONSTRAINT PK_Parent PRIMARY KEY (ParentID)
)

CREATE TABLE Child (
    ParentID INT NOT NULL,
    ...
    CONSTRAINT PK_Child PRIMARY KEY (ParentID),
    CONSTRAINT FK_Parent_Child_ParentID FOREIGN KEY (ParentID) REFERENCES Parent (ParentID)
)

Approach 3: Use an intersection table.

The previous approaches work well for 1:(0..1) relationships, where the "parent" entity always exists, but the "child" may or may not exist. But sometimes we can have two independent lists of entities that are occasionally related, and you want to limit them so that an entity in table A cannot be related to more than one in table B, and vice-versa. In other words, it is a (0..1):(0..1) relationship.

In this scenario, the best approach is to use an intersection table. This intersection table will have two UNIQUE, NOT NULL columns that reference the primary keys of the two entity tables. The SQL to create this schema should look something like this:

CREATE TABLE TableA (
    ID INT IDENTITY(1,1) NOT NULL,
    ...
    CONSTRAINT PK_TableA PRIMARY KEY (ID)
)

CREATE TABLE TableB (
    ID INT IDENTITY(1,1) NOT NULL,
    ...
    CONSTRAINT PK_TableB PRIMARY KEY (ID)
)

CREATE TABLE Intersection (
    TableAID INT NOT NULL UNIQUE,
    TableBID INT NOT NULL UNIQUE,
    CONSTRAINT FK_Intersection_TableA_TableAID FOREIGN KEY (TableAID) REFERENCES TableA (ID),
    CONSTRAINT FK_Intersection_TableB_TableBID FOREIGN KEY (TableBID) REFERENCES TableB (ID)
)

What about true one-to-one relationships?

Utilizing Approach 1 from above seems to be the only real solution to this (although, in this instance, the "child" columns should be non-nullable). The other approaches won't work here, because there's no straightforward way to split the entities into two separate tables and relate them in a true 1:1 fashion. Each table would need a foreign key to the other, which creates a sort of chicken-and-egg situation where you can't insert data into either table first (unless your database lets you defer constraint checking until the end of a transaction). The easiest course of action is to use a single table and call it a day.

AngularJS: Communicating Between Controllers and Directives with Isolated Scope

Overview

The problem addressed in this post is a rather common one when developing AngularJS applications. This isn't a particularly difficult problem, but it was one that I struggled with when I started working on medium-sized AngularJS applications... mainly because I didn't have a strong understanding of directives and scope at the time.

Scenario

I have a controller which declares a function called doSomething() on $scope.

angular.module("myApp", [])
    .controller("myController", function ($scope) {
        $scope.doSomething = function() { 
            //something 
        };
    });

I have a custom directive being applied within my controller block.

<div ng-controller="myController">
    <div my-directive></div>
</div>

I want to call my doSomething() function when some event (such as an ng-click) transpires inside my directive. How can I define (and then apply) my directive to accomplish this?

Solution

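To do this, I create a directive with an isolated scope. Within the scope declaration, I use '&' to bind a scope property to an expression that is passed in as an attribute and evaluated against the parent scope. (A '=' binding would instead evaluate doSomething() during every digest and bind its return value rather than the function.) Note that angular.module("myApp") is called without the second argument here, so it retrieves the existing module rather than re-creating it.

angular.module("myApp")
    .directive("myDirective", function () {
        return {
            scope: {
                clickFn: '&'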
            },
            template: "<button ng-click='clickFn()'>Click Me</button>",
            ...
        };
    });

When I apply the directive to an HTML element, I provide an expression (in this case, a function call) to the click-fn attribute.

<div ng-controller="myController">
    <div my-directive click-fn="doSomething()"></div>
</div>

You can use this alternate syntax if you want your directive attribute and scope property to have different names:

scope: {
    onSomethingHappens: '&attrFunction'
},
template: "<button ng-click='onSomethingHappens()'>Click Me</button>",
...

<div ng-controller="myController">
    <div my-directive attr-function="doSomething()"></div>
</div>

Alternate Approach: Use a Service

An alternative to the approach outlined above is to move the function you want to execute into a service. AngularJS services are singletons which can be passed into controllers and directives through dependency injection. I'm not wild about this approach, but it will get the job done. I suppose it is possible to run into scenarios where you need to do this, perhaps because you don't want your directive to have an isolated scope. However, you should be defining an isolated scope if you want your directives to be truly reusable.

Cookies vs. Web Storage

Cookies

HTTP cookies are small pieces of data stored in a user's browser by a specific web site. On every subsequent request to that web site, the browser sends the matching cookies back to the server. Each cookie contains a name-value pair, an (optional) expiry date, and the domain and path of the server it should be sent to.

To access a page's cookies via JavaScript, use document.cookie. For more information on how to do this, refer to this page.
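
As a rough sketch of what working with document.cookie involves, the two helpers below build and parse cookie strings. Both buildCookie() and parseCookies() are hypothetical names, and the max-age and path attributes are just one reasonable choice. They operate on plain strings so the sketch runs anywhere.

```javascript
//Builds a string suitable for assigning to document.cookie.
var buildCookie = function(name, value, maxAgeSeconds) {
    return encodeURIComponent(name) + "=" + encodeURIComponent(value) +
        "; max-age=" + maxAgeSeconds + "; path=/";
};

//Parses a string in the format returned by document.cookie
//("name1=value1; name2=value2") into a plain object.
var parseCookies = function(cookieString) {
    var cookies = {};
    if (!cookieString) {
        return cookies;
    }
    cookieString.split("; ").forEach(function(pair) {
        var index = pair.indexOf("=");
        var name = decodeURIComponent(pair.substring(0, index));
        var value = decodeURIComponent(pair.substring(index + 1));
        cookies[name] = value;
    });
    return cookies;
};

console.log(buildCookie("username", "j smith", 3600));
//username=j%20smith; max-age=3600; path=/

var parsed = parseCookies("username=j%20smith; theme=dark");
console.log(parsed.username); //j smith
console.log(parsed.theme); //dark
```

In a browser, you would write a cookie with document.cookie = buildCookie("username", "j smith", 3600) and read them all back with parseCookies(document.cookie).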

In ASP, cookies can be created with Response.Cookies and retrieved with Request.Cookies.

Web Storage

Web Storage was introduced with the HTML5 standard. Web Storage is a means of persisting client-side-only data on web sites. Unlike cookies, this data is not automatically sent to the server. Web Storage provides far more capacity than cookies: 5 or 10 MB per origin (depending on the browser) versus cookies' 4 KB. Web Storage comes in two flavors: Local Storage and Session Storage.

Local Storage

Like cookies, Local Storage can retain data between sessions. Unlike cookies, this data does not have an expiration date. This data can be accessed using window.localStorage, as illustrated by the example below.

window.localStorage.setItem("username", "jsmith");
var user = window.localStorage.getItem("username");
window.localStorage.removeItem("username");
window.localStorage.clear(); //clear all items

Session Storage

Session Storage works just like Local Storage, with one key difference: all data expires at the end of the current session (i.e. when the browser window or tab is closed). The examples in the previous section apply unchanged... just replace window.localStorage with window.sessionStorage.
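
Because both flavors share the same methods, helper code can take the storage object as a parameter. The sketch below assumes you want to store objects rather than plain strings; Web Storage only holds strings, so the values are serialized as JSON. The saveObject(), loadObject(), and createMemoryStorage() names are hypothetical, and the in-memory stand-in exists only so the sketch runs outside a browser.

```javascript
//Web Storage only stores strings, so objects are serialized as JSON.
var saveObject = function(storage, key, value) {
    storage.setItem(key, JSON.stringify(value));
};

//getItem() returns null for keys that don't exist.
var loadObject = function(storage, key) {
    var raw = storage.getItem(key);
    return raw === null ? null : JSON.parse(raw);
};

//Hypothetical in-memory stand-in so the sketch runs outside a browser.
//In a page, pass window.localStorage or window.sessionStorage instead.
var createMemoryStorage = function() {
    var items = {};
    return {
        setItem: function(key, value) { items[key] = String(value); },
        getItem: function(key) {
            return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
        },
        removeItem: function(key) { delete items[key]; }
    };
};

var storage = createMemoryStorage();
saveObject(storage, "user", { username: "jsmith", visits: 3 });
console.log(loadObject(storage, "user").visits); //3
console.log(loadObject(storage, "missing")); //null
```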

Other Options

Here are some lesser-known alternatives to using Cookies or Web Storage.

Web SQL

Web SQL is an API that can be used to store data in a local relational database. Web SQL is largely considered deprecated, as the W3C ceased work on the specification in November 2010. Web SQL is not supported by IE or Firefox. It is supported by Chrome, Opera, Safari, and the native browsers in Android and iOS. Those browsers that do support it use SQLite as the storage engine.

IndexedDB

The Indexed Database API (or IndexedDB) is an API that can be used to store data in a Key-Value database. IndexedDB is currently supported by Chrome, Firefox, and IE10+. Apple has announced support for future versions of Safari.