Go Tricks: Benchmark Testing

Last month I wrote a post about testing a simple function to generate random strings using the testing/quick package in Go. As I was working through the code changes, I noticed that I was creating a new random seed every time my function was called. This seemed like a place where we might be able to make a measurable performance optimization, and a good excuse to play with the benchmark testing capability built into Go.

As a bit of background, benchmark testing is basically just measuring execution time. While the Go testing tooling will run any other unit tests in the package by default, benchmark tests are not concerned with whether your code does what you want it to, but simply with how quickly it runs. The Go benchmark runner works by calling your code over and over with the same input, scaling the iteration count until the run takes about a second, and then reporting the results.

I had a hunch that we could improve the performance of the code, but first needed to write a benchmark so that we could tell as quickly as possible whether I was right. So, here's the functional code as we left it in the last post (check there for more details on why it looks this way or what it's supposed to do):

func generateString(numChars int) (string, error) {
	if numChars <= 0 {
		return "", fmt.Errorf("numChars must be greater than 0, received: %d", numChars)
	}
	if numChars > 1024 {
		return "", fmt.Errorf("numChars must not be greater than 1024, received: %d", numChars)
	}
	rand.Seed(time.Now().UnixNano())
	randomString := make([]byte, numChars)
	for i := range randomString {
		randomString[i] = characterSet[rand.Intn(len(characterSet))]
	}
	return "anonymized_" + string(randomString), nil
}

Step 1: Write a benchmark test

Fortunately, writing a benchmark test for this function is super simple. Here's the whole thing:

func Benchmark_generateString(b *testing.B) {
	for n := 0; n < b.N; n++ {
		generateString(10)
	}
}

This code must go into something that the Go toolset recognizes as a test, which means a file with a name ending in _test.go. By convention, the rest of the filename should match the name of the file where the code you're testing lives. So if this code lived in generator.go, then my benchmark would live in generator_test.go.

Another requirement is that the function name must begin with Benchmark (followed by a character that isn't a lowercase letter, which the underscore here satisfies). Again, the convention is that we append the name of the function under test, as I've done here. Ignoring this naming convention or the filename convention from the preceding paragraph won't prevent go test from running your benchmarks, but it may make them difficult for you or a teammate to find.

Your benchmark function must take a *testing.B as its only parameter, and inside it you write a simple for loop that executes the code under test. The value of b.N is determined dynamically by the testing tool, which keeps increasing it until the loop takes roughly one second to run. All we need inside the for loop is our function call; since I don't care about the results of the function here (that's what all those unit tests were for), I don't give Go anywhere to store them.

To run the test from the CLI, use go test -bench=. from the package where this test lives. When I ran this command on the original code, here's what I got:

Benchmark_generateString-4 103034 11535 ns/op

It gives me the function name first; the -4 suffix is the value of GOMAXPROCS (effectively, the number of CPUs) used for the run. The next value, here 103034, is the value of b.N for this particular run. The last value is the average time per operation, that is, per execution of whatever code you put inside your for loop. So here our function took 11,535 nanoseconds to run on average. It is important to remember that these are nanoseconds; it might be reasonable for me to look at this result, learn that my function takes on average about 0.012 milliseconds to run, and decide that there are more important places for me to spend my time and move on.

Step 2: Optimize?

How do I decide whether it's worth optimizing? Well, by themselves I'm not sure how valuable those metrics from go test are. What I really need is something, anything, to compare them against. I decided to make the quickest code change I could think of that would prove or disprove my hypothesis, which was that the line rand.Seed(time.Now().UnixNano()) might be slowing us down, since it involves three other function calls. Additionally, it seemed like we should be able to call it once for the life of the service, rather than on every call to the function, without affecting the behavior. So what's the quickest way to test that idea?

Strategy 1: Package level init()

NOTE: Package level init functions can be useful, but they can also be very magical in all the worst ways. I decided to use one here just to test my hypothesis; it's super simple to write an initializing function that will set that seed, and it will require no changes to the function signature in order to implement. So here's the new code:

func init() {
	rand.Seed(time.Now().UnixNano())
}

func generateString(numChars int) (string, error) {
	if numChars <= 0 {
		return "", fmt.Errorf("numChars must be greater than 0, received: %d", numChars)
	}
	if numChars > 1024 {
		return "", fmt.Errorf("numChars must not be greater than 1024, received: %d", numChars)
	}
	randomString := make([]byte, numChars)
	for i := range randomString {
		randomString[i] = characterSet[rand.Intn(len(characterSet))]
	}
	return "anonymized_" + string(randomString), nil
}

Really, all I did was move that one line from my function under test into a new init() function. This init() is called automatically, exactly once, when the package is initialized, which happens whenever a program that imports the package starts up. So that bit of code now executes once per test run, rather than 100K+ times as in the original function. Now I can quickly run my benchmark test and see whether or not there are significant gains to be made:

Benchmark_generateString-4 2945200 391 ns/op

The important number to focus on here is the time per operation. Since go test adjusts b.N (the first value) so that the run takes roughly one second, that count will fluctuate from run to run (though not to this degree). Comparing just the ns/op values, I've cut my function's run time down to about 3.4% of the original! I definitely want to optimize this function, but I hate this particular optimization, for a reason that will involve a quick aside.

An Aside: The three efficiencies

There are (at least) three separate things we could mean when we talk about efficiency:

  1. “Run” efficiency: how quickly a computer will run some code
  2. “Read” efficiency: how quickly a developer can understand some code
  3. “Write” efficiency: how quickly a developer can create some code

Usually when efficiency comes up, we are talking about the first kind. In fact, that's the only kind that these benchmark tests measure in any way. What I've found, though, is that in the context in which I write code, the second kind of efficiency is at least as important to my success. This is because I spend much more time reading code that was written by other people in order to support or extend it than I do creating new code (sometimes the other person who wrote this code is a younger version of myself).

What's troubling, though, is how often my decisions seem to be based on the third kind of efficiency: how quickly can I write this. Sometimes that's due to laziness on my part, but more often I think it's my obsession with delivery time that makes me code in a rush. I've started describing this as writing code “breathlessly”: so focused on getting from red to green and closing the story that I lose sight of the context of the code and what maintaining it will require.

All three of these may have some bearing on what maximum efficiency means. In this case, though, I am giving up a lot of “read” efficiency for a little bit of “write” efficiency by going with the package-level init(). In fact, I may even be giving up some “run” efficiency as well, if this package is ever imported by someone who never uses this function at all.

Based on these factors, I'll try some changes to improve the readability of this function, and hopefully not have to give up too much performance along the way.

Strategy 2: Passing in the randomizer

One way to make sure that the randomizer is not created on every call, and to make it clear that one is being used, is to have the caller pass the randomizer in when they call the function. This has the added benefit of making randomness reproducible in my unit tests: I can pass in a randomizer with a known seed and validate that I get the expected output. The only real change (aside from getting rid of init()) is that the function now expects a *rand.Rand; there's also a small change to how we get random values, since we're now calling a method on the Rand object instead of the package-level functions from rand:

func generateString(r *rand.Rand, numChars int) (string, error) {
	if numChars <= 0 {
		return "", fmt.Errorf("numChars must be greater than 0, received: %d", numChars)
	}
	if numChars > 1024 {
		return "", fmt.Errorf("numChars must not be greater than 1024, received: %d", numChars)
	}
	randomString := make([]byte, numChars)
	for i := range randomString {
		randomString[i] = characterSet[r.Intn(len(characterSet))]
	}
	return "anonymized_" + string(randomString), nil
}
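
To make that testability benefit concrete, here's a sketch of the sort of test this enables (the test name and the seed value are just illustrative, and it assumes the same package and imports as the other snippets):

func Test_generateString_deterministic(t *testing.T) {
	// Two randomizers built from the same seed should produce the same string.
	first, err := generateString(rand.New(rand.NewSource(42)), 10)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	second, err := generateString(rand.New(rand.NewSource(42)), 10)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if first != second {
		t.Errorf("expected identical output for the same seed, got %q and %q", first, second)
	}
}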

Unlike with the addition of the init() function, I've now changed the function signature, so I must also change the benchmark test:

func Benchmark_generateString(b *testing.B) {
	src := rand.NewSource(time.Now().UnixNano())
	r := rand.New(src)
	for n := 0; n < b.N; n++ {
		generateString(r, 10)
	}
}

The two new lines outside of my for loop just generate the new object required for my function. It's very important that the new object be created outside of my for loop, but before we dig into that too much I'll run the benchmark and see how it looks:

Benchmark_generateString-4 2847084 360 ns/op

Great news! The time performance is even better than with the package-level init(), though the difference might be within the margin of error. Either way, I do prefer this implementation, since it makes it clearer to the caller what's happening.
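
One quick note on that setup outside the loop: the benchmark timer runs for the whole benchmark function, so those two setup lines are technically included in the measurement. They're cheap enough not to matter here, but if the setup were expensive, the testing package's b.ResetTimer would let us exclude it. A minimal sketch, assuming the same Strategy 2 signature:

func Benchmark_generateString(b *testing.B) {
	// Setup we don't want included in the measured time.
	src := rand.NewSource(time.Now().UnixNano())
	r := rand.New(src)

	b.ResetTimer() // start timing from here
	for n := 0; n < b.N; n++ {
		generateString(r, 10)
	}
}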

Unfortunately, though, thinking about how this change impacts the caller gives me pause. We're really just pushing the problem of optimizing the run performance up to the caller. In a lot of ways, that's a much better policy; on the other hand, I'm worried that someone (like future-me) will make the same mistake of creating the rand.Rand object each time they call my function. Since the main improvement we are making relies on this object being created just once, I think there might be a better way to handle this.

Strategy 3: Creating a method receiver

We could try writing this function as a method on a type that stores the randomizer as a field. We could then also be smart about creating the randomizer only when it's needed.

type Generator struct {
	r *rand.Rand
}

func (g *Generator) String(numChars int) (string, error) {
	if numChars <= 0 {
		return "", fmt.Errorf("numChars must be greater than 0, received: %d", numChars)
	}
	if numChars > 1024 {
		return "", fmt.Errorf("numChars must not be greater than 1024, received: %d", numChars)
	}

	if g.r == nil {
		src := rand.NewSource(time.Now().UnixNano())
		g.r = rand.New(src)
	}
	randomString := make([]byte, numChars)
	for i := range randomString {
		randomString[i] = characterSet[g.r.Intn(len(characterSet))]
	}
	return "anonymized_" + string(randomString), nil
}

First, I defined this new type and gave it an unexported *rand.Rand field. The method signature is back to being close to the original, but I've renamed the function to String because that feels more idiomatic for a method. The other big change is that, once we've checked that numChars is valid, we check whether g.r is nil; if it is, we create a *rand.Rand before proceeding.
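
If that lazy nil check feels too implicit, another option would be a small constructor that seeds the randomizer up front (NewGenerator is a hypothetical name, not something from the earlier post); the nice thing about the lazy version above is that a zero-value Generator still works:

// NewGenerator is a sketch of a constructor that seeds the randomizer
// eagerly instead of waiting for the first call to String.
func NewGenerator() *Generator {
	return &Generator{
		r: rand.New(rand.NewSource(time.Now().UnixNano())),
	}
}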

Before I can run the benchmark, it also needs to be updated to reflect the new change:

func Benchmark_generateString(b *testing.B) {
	gen := &Generator{}
	for n := 0; n < b.N; n++ {
		gen.String(10)
	}
}

Because I have not set the value for r in my Generator, it will be nil the first time that it runs, and the function will create the randomizer. Running the benchmark here gives even better results:

Benchmark_generateString-4 4263811 251 ns/op

We still keep the advantage of full testability: test code in the same package can provide a seeded *rand.Rand to the Generator struct directly. By not exporting r in my Generator, though, I'm preventing users of this package (in particular, one stubborn and foolish developer who looks exactly like me) from making the mistake of creating this object multiple times. I'm pretty happy with this solution.
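
As a quick sketch of what that same-package test might look like (the test name and seed are illustrative, and it assumes the strings package is imported alongside the others):

func TestGenerator_String(t *testing.T) {
	// Test code in this package can seed the unexported field for reproducibility.
	gen := &Generator{r: rand.New(rand.NewSource(1))}
	got, err := gen.String(10)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if !strings.HasPrefix(got, "anonymized_") {
		t.Errorf("expected the anonymized_ prefix, got %q", got)
	}
	if len(got) != len("anonymized_")+10 {
		t.Errorf("expected 10 random characters after the prefix, got %q", got)
	}
}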

Final Thoughts

I hope this is helpful to someone (even if it's just future-me). While it was nice to see that my hypothesis was correct, it is always important to write these sorts of benchmark tests, both to confirm that a problem exists and to measure the success of new strategies. In this case, my expectation was that the init() solution would be the fastest and that we would have to compromise on speed to get a more readable solution. It was nice to be wrong about that, and to find that the solution I like best is also the fastest.
