VerySillyMUD: Understanding a Function with Tests
Series [ VerySillyMUD ]
This post is part of my “VerySillyMUD” series, chronicling an attempt to refactor an old MUD into working condition [ 1 ].
In our
last episode,
we got a basic framework for unit tests in place. Just to
refresh our context (and which layer of yak we are shaving at the
moment): we need unit tests so we can (a) capture the current behavior
of a particular function, dsearch()
, so that we can (b)
refactor it safely, so that we can (c) fix a compiler warning about
it, so that we can (d) finish fixing all the compiler warnings in case
there are other latent bugs being hidden. Whew!
So let’s take a look again at the function in question:
What can we figure out about it? Well, first, it seems to take two
string arguments, helpfully named string
and
tmp
. It seems like both are potentially modified
by the function here; tmp
gets written on line 11 via
strcpy()
, and string
gets written on line 24 via
sprintf()
. The i
variable seems to be a flag
indicating whether we’re done, as the main while
loop tests
for it and the only modification to it is in line 10 where it is set
to 1. So let’s start with the simplest case, where we only go through
the loop once; in order for that to be true, we need the call to
strchr
in line 9 to find no instances of the tilde
('~'
) character in string
. In this case, it looks
like we just copy string
into tmp
and are done,
which suggests that our first unit test can be:
So let’s try to run that:
Now, what about a more complicated test? It seems like tmp
might be the output variable here, so let’s see what happens when we
feed it a string with a tilde in it, by writing an intentionally
failing test:
And then run the test:
Hmm, a little less than I was hoping for; the Criterion library
macro-interpolates the C source expression into the error message. In
the case of string comparison, I’d rather have seen the string
values
here instead. At this juncture I took a quick browse of
the Criterion source code to see if this was a quick fix, but it
wasn’t, so I decided to pass on this for now and just fix it by doing
a good ol’ printf
to see what the value was. In this case, it
looks like the behavior is to strip out the tilde and the following
space, because this test passes:
A quick perusal of the source shows we must be in the
default:
clause of the switch
statement. Now, like
any good tester, let’s see what happens if we have a tilde as the
last character:
Ah, cool! That one passes. So now let’s look at some of the other
cases
, starting with a ~N
. I’m going to
guess that this substitutes the string $n
in for
~N
:
Indeed, this passes. Similarly, it looks like ~H
will turn into $s
. (It does, although I won’t
show that test here just for brevity). Ok, what should we test next?
How about two tildes in a row? I think that this should just
get stripped out because the second tilde won’t be either 'N'
or 'H'
:
Yep. Ok, now we have a loop, so this suggests we should be able to do multiple substitutions:
Check. That would seem to cover most of the output behavior on the
second argument to dsearch()
, which is labeled
tmp
. As we noted, though, the first parameter,
string
sometimes also gets written, in the sprintf
on line 24. So we had better document its behavior too. However,
notice that sprintf(string,tmp)
essentially copies
tmp
into string
, as long as there are no format
expansions like %d
in tmp
, and if there are, then we
don’t have any arguments for them! So this is likely a bug, especially
if string
comes from an untrusted input source. Now, if you
remember
a couple episodes ago,
this was the exact compiler warning we ran into:
Ok, so this means that we should pretty much just add tests that show
that string
has the same output behavior as tmp
, and
then I think we can refactor it.
At this point I feel pretty confident I understand what this function
does. Let’s rewrite it more simply. What we’ll do is we’ll build up
the substituted string in tmp
and then copy it back into
string
at the end:
Sure enough, the tests pass, so we’ll call this a victory for now. In
the next episode,
we’ll turn the -Werror
flag back on and continue
fixing compiler warnings as we were trying to do
previously. Fortunately, we now have the beginnings of some unit tests
we can use if we encounter anything that doesn’t look like it has an
immediately obvious fix.
[ 1 ] SillyMUD was a derivative of DikuMUD, which was originally created by Sebastian Hammer, Michael Seifert, Hans Henrik Stærfeldt, Tom Madsen, and Katja Nyboe.