Ben Fairbanks

Freelance Website Developer

Using the Scientific Method's Experimental Procedure to Refine Test Creation

Mar 19, 2017

One of the most valuable tools a programmer has in their toolkit is unit testing. Unit testing means taking specific, important, high-traffic portions of your program and running them in isolation. This allows you to crystallize exactly what each section of your program does. Ruby has a testing tool called RSpec, which is the one I am most familiar with, but the basic idea can be transferred between languages. RSpec tests look like this:

As with any new corner of coding, getting a handle on what this code does is a bit of a challenge. However, when notated properly, an RSpec test is at least chunked into labeled sections with a finite amount of jargon. Setup, Exercise, Verify and Teardown are reasonably understandable. Setup creates the environment that the method being tested needs to run. Exercise runs the method that is the crux of the test. Verify compares what was produced to a set value (basically what you expected to be produced). Teardown returns the environment to the state it was in before Setup.

This is very similar to the experimental procedure employed by almost all scientific fields. Experimental science is straightforward on paper: starting with a question, you explore the available evidence related to that question, form a hypothesis, run an experiment in an attempt to prove or disprove that hypothesis, analyze the data the experiment generated to see whether it supports or challenges the hypothesis, and finally communicate the results.

Applying this thinking to RSpec testing produces a clarifying and codifying effect. The question is almost always 'how does this section of code work?', which is a very good question to ask. Gathering data is relatively simple: reading the methods that are invoked gives a fairly accurate timeline of what happens within the function. Next comes forming the hypothesis, which will be something along the lines of "I think given this kind of input this program will produce this result." This part of the scientific method is perhaps the most important, as taking a more formal and deliberate approach to the questions you ask through RSpec testing produces a fuller understanding of the portion of the program you are testing.

Now comes the testing of that hypothesis, which is the actual writing of the RSpec test. To begin with, during the Setup phase you want to limit the number of variables that can influence your experiment over numerous repetitions. With RSpec this is relatively simple: you create the environment your method needs in order to run, and you keep it the same. If you feel that you should change something in this step, it most likely means that you need a separate test.

Exercise is where you will make the changes between repetitions. The Exercise step can be thought of as the moving parts of the test, wherein the input is declared and the method is run. Ensuring that your experiment works with multiple inputs is important: not necessarily multiple types of input (if your method needs a hash, it needs a hash), but multiple different inputs of the same type. Give each input its own test, because changing the input changes the value you expect in the Verify step.

The Verify step is where you compare your expected output with your actual output. In other words, you compare what your method produced with what you thought it should have produced back in the hypothesis phase. There is nothing wrong with the test failing; in fact, failure is the primary way to understand your method more fully. If your experiments always work, either you understand your method to perfection or you should be attempting grander experiments.

Finally, you should communicate your results. If you find an error in your program, or a way that the method does not work, communicate this to your teammates, if you are working with any. Sharing this kind of thing is the difference between mistakes being ironed out immediately and mistakes hanging around until rollout, which is a thing no one wants to even think about.

The Teardown step should delete anything that was created and undo anything that was changed, which makes testing repeatable. If your hypothesis is based on the state your program was in previously and it is no longer in that state, you have a massive flaw in your thinking. Also, it is always good to pick up your toys after playing with them.

That's about it! By applying the scientific method to RSpec testing, you should develop a more robust mindset about testing and write better, more accurate tests, both of which should lead you towards being a better programmer. Happy Testing!
