I’ve come across some really bad unit tests in a number of projects recently. I’m assuming they were written only because someone once told the developers something like “Writing more unit tests improves code quality, so everyone write more unit tests!”. That seems to be all the direction they got (I’m assuming that from reading the code), and unfortunately neither the developers nor the lead who wanted the tests written was exactly sure why unit tests are important or how to write effective ones.
There are a few common goals that writing unit tests helps achieve.
- Unit tests provide an excellent facility for regression testing: they allow programmers to refactor a piece of code at a later date and make sure that each module still works correctly.
- Unit tests can help simplify integration, in many cases, if you are working from the bottom up. This goes well with really good planning around integration points. In a large project you may need to separate everything out into smaller units or modules, and all the pieces are probably going to be developed concurrently by different developers and teams. If you write unit tests around your integration interfaces, you can flush out issues beforehand and move on to develop other pieces of the project, which lessens the strain on your integration phases later.
- Unit tests can also serve as a form of living documentation: a set of working examples that shows others how the code works and how to use or call it. Since the unit tests are constantly updated as the code gets refactored, they are usually more reliable than static documentation, which can drift and get out of date.
Another overlooked question is when you should write unit tests. In a lot of projects that I’ve worked on, writing unit tests is usually relegated to the end, or to whenever developers have extra time. The unit tests usually end up the victim of procrastination, and the resulting lack of coverage usually causes more headaches later.
I believe unit tests should almost drive your development, especially when you are developing in an agile environment where you define features as you go and are constantly refactoring your code.
Unit tests are a great way of showing you right away where you have spaghetti code and complex environmental dependencies.
If you enforce unit tests on all business logic, you can pretty much enforce a clean separation of interface and implementation, and drive your code towards better coding standards and patterns out of the sheer necessity of being able to test your own code (like moving your code to a basic model/view/controller pattern). It’s a great practice to get into for writing maintainable code.
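To make that concrete, here is a minimal sketch in Python of a business rule pulled away from its environment. The function name `is_overdue` and the idea of passing the clock in as a parameter are my own illustration, not something from a particular project. Because the rule doesn't reach out to the system clock, a screen, or a database on its own, the test can pin down "today" and check the logic directly.

```python
from datetime import date
from typing import Callable


def is_overdue(due: date, today: Callable[[], date] = date.today) -> bool:
    """Business rule only: no UI, no database, and the clock is passed in."""
    return today() > due


def test_overdue_the_day_after_the_due_date() -> None:
    # The test supplies a fixed clock, so it does not depend on when it runs.
    assert is_overdue(date(2024, 3, 1), today=lambda: date(2024, 3, 2))


def test_not_overdue_on_the_due_date() -> None:
    assert not is_overdue(date(2024, 3, 1), today=lambda: date(2024, 3, 1))
```

Production code can call `is_overdue(due)` and get the real clock by default. Only the tests need to know the clock can be swapped.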
You have to be smart about your unit tests though. More is not always better. You have to be practical, but not walk away until you feel you have sufficiently covered everything. Anything that performs custom logic based on a specific input is a big candidate for a unit test. Sanity-check unit tests are usually not necessary (such as a unit test around a property or setter/getter method that simply reads or writes a field in a class).
One of my pet peeves is a test that knows the internals of a particular unit rather than the function that unit provides. You should pretend you don’t know how your code works internally, and write tests the way other code is going to interact with it. The exception to that rule is when you know of specific inputs that could easily break the unit while staying within the confines of what it provides. Do not write unit tests that simply reconfirm that your code works exactly the way you wrote it and that would probably break if someone refactored its internals.
For example, let’s say I have a function called “SaveCustomer(String id, Customer customer)” that I know saves to a database underneath, and another function called “LoadCustomer(String id)” that loads from that same database (the underlying database is not intended to be accessed directly, only through the customer abstraction). It would not be an effective test to call the SaveCustomer function and then connect to the database to see if the record was written, or similarly, to call the LoadCustomer function and check that the data returned is the same as in the database. A more effective test creates a customer, stores it using the SaveCustomer function, loads it back using the LoadCustomer function, and verifies that the data in the customer object survived the round trip.
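Here is roughly what that round-trip test could look like, sketched in Python. The `Customer` fields and the `CustomerStore` class are hypothetical stand-ins that I made up for the example, and a plain dictionary plays the part of the database so the sketch runs on its own. The point is that the tests only ever go through the save and load functions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Customer:
    name: str
    email: str


class CustomerStore:
    """Hypothetical customer abstraction; the dict stands in for the database."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[str, str]] = {}

    def save_customer(self, customer_id: str, customer: Customer) -> None:
        self._rows[customer_id] = (customer.name, customer.email)

    def load_customer(self, customer_id: str) -> Optional[Customer]:
        row = self._rows.get(customer_id)
        return Customer(*row) if row is not None else None


def test_customer_survives_round_trip() -> None:
    store = CustomerStore()
    original = Customer(name="Ada", email="ada@example.com")
    store.save_customer("c-1", original)
    # Only the public functions are used; the test never looks at the storage.
    assert store.load_customer("c-1") == original


def test_unknown_id_loads_nothing() -> None:
    assert CustomerStore().load_customer("missing") is None
```

If the storage behind `CustomerStore` were later swapped for a real database or a different table layout, neither test would need to change.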
Another example that I ran into was a function called “getOffset”, which was meant to return the internal position at which to start reading from an array returned by that same class. The programmer who wrote the function knew that it would always return 16 at that time, but made it a function so that he didn’t have magic numbers running through his code and could change the value someday (or possibly even compute the position dynamically at some point). However, the programmer also wrote a unit test to make sure it always returned 16, and sure enough, when I went in and changed the offset, the unit tests broke although none of the code that called the function did. Effectively, the test was broken and not the code.
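A sketch of how that test could have been written instead, again in Python. The `Packet` class, its header-plus-payload layout, and the method names are assumptions I invented to have something runnable; the original class was different. The test checks what the offset is for (finding the data in the array), not which number it happens to be.

```python
class Packet:
    """Hypothetical class: a fixed-size header followed by the payload."""

    _HEADER_SIZE = 16  # implementation detail, free to change later

    def __init__(self, payload: bytes) -> None:
        # Zero-filled header, then the payload, in one buffer.
        self._buffer = bytes(self._HEADER_SIZE) + payload

    def get_buffer(self) -> bytes:
        return self._buffer

    def get_offset(self) -> int:
        return self._HEADER_SIZE


def test_offset_points_at_payload() -> None:
    packet = Packet(b"hello")
    # Behavioral check: reading from the offset yields the payload.
    assert packet.get_buffer()[packet.get_offset():] == b"hello"


# The brittle version would be: assert Packet(b"hello").get_offset() == 16
# It fails as soon as the header size changes, even though no caller breaks.
```

Changing `_HEADER_SIZE` to any other value leaves `test_offset_points_at_payload` passing, which is exactly what you want from a test around this function.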
It comes down to being smart and practical about your unit tests. There is no perfect setup. I try to shoot for a test around each public function and protected function (if intended to be inherited by other classes), and if applicable how the class is used as a whole (thinking of it as almost writing examples and documentation for the code).
Some good candidates for unit tests are code that does logic in loops, contains switch statements (especially in languages that allow statements to fall through), any goto/jump statements, any regular expression usage, any “auto-magical” features that can possibly fail at runtime, non-generic collection usage, null pointers, hard-coded index access into arrays whose size is not guaranteed, primitives passed by reference, and code that calls the reflection features of your language.
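Taking regular expressions from that list as one example, here is a small Python sketch. The `parse_version` function and its three-number format are hypothetical, picked only because they are easy to follow. Most of the value is in the malformed inputs, since those are where a pattern tends to accept or reject something you didn't expect.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: exactly three dot-separated runs of digits.
_VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text: str) -> Optional[Tuple[int, int, int]]:
    # fullmatch requires the whole string to match, not just a prefix.
    match = _VERSION.fullmatch(text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def test_accepts_well_formed_versions() -> None:
    assert parse_version("1.2.3") == (1, 2, 3)
    assert parse_version("10.0.7") == (10, 0, 7)


def test_rejects_malformed_versions() -> None:
    # Too short, too long, non-digits, and stray whitespace.
    for bad in ["", "1.2", "1.2.3.4", "1.2.x", " 1.2.3", "1.2.3\n"]:
        assert parse_version(bad) is None
```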
In scripting, interpreted, or dynamically typed languages, lots of sanity-check tests become more useful, as a way to exercise every code path possible. I wish you luck if you are writing a large application in one of these languages without a strict coding standard and a whole bunch of tests.
Whatever you do, you can’t trust unit tests to be your only type of testing. Unit tests are not well suited to GUI testing, and they can almost never do functional testing. A unit test can only tell you whether an error occurs in the one particular case it covers, not in the many cases you couldn’t foresee. Depending on your project, other automated testing solutions are often really important as well.