So you’ve set up some benchmarks for a performance-critical piece of code in your application. Perhaps you’ve even followed along with my Benchmarking in .NET article to get set up. If you did, you might have noticed that I made a horrible mistake in that article. OK, maybe not horrible, but a mistake nonetheless. To be fair, GitHub Copilot made me do it, and I can’t be held responsible for it 😉. Can you spot it?
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace StringBenchmarks;

[MemoryDiagnoser]
public class Benchmarks
{
    [Params(5, 50, 500)]
    public int N { get; set; }

    [Benchmark(Baseline = true)]
    public string StringJoin()
    {
        return string.Join(", ", Enumerable.Range(0, N).Select(i => i.ToString()));
    }

    [Benchmark]
    public string StringBuilder()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < N; i++) // <-- hint, hint
        {
            sb.Append(i);
            sb.Append(", ");
        }
        return sb.ToString();
    }
}
In the above code, I’m benchmarking two methods that are supposed to do the same thing: concatenate a list of integers into a comma-separated string. The first method is a simple string.Join call, which never adds a separator after the final element. The second uses StringBuilder.Append… and it adds a comma at the end 🤦‍♂️.
When recording the video for that benchmarking demonstration, I didn’t notice the mistake. It wasn’t until Steve Smith reviewed the video that the mistake was brought to my attention. Thankfully, it is small enough that it doesn’t materially affect the benchmarking results. But it could have.
There are two lessons to be learned from this:
- GitHub Copilot will make a fool out of you on live TV.
- You should probably have some kind of automated testing in place to make sure that your benchmarks are working as expected.
Let’s focus on the second lesson. How can we validate that our benchmarks are returning the same results and that we are in fact comparing apples to apples and not apples to oranges, or worse, chimpanzees?
When I first started thinking about this, I figured I would probably have to write some unit tests using some kind of automated testing framework.
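Purely for illustration, such a test might look roughly like the sketch below — this assumes xUnit and a test project that references the benchmark project, neither of which is part of the original setup:

using StringBenchmarks;
using Xunit;

public class BenchmarkReturnValueTests
{
    [Theory]
    [InlineData(5)]
    [InlineData(50)]
    [InlineData(500)]
    public void BenchmarkMethodsReturnTheSameString(int n)
    {
        var benchmarks = new Benchmarks { N = n };

        // Apples to apples: both benchmarked methods must produce identical output.
        Assert.Equal(benchmarks.StringJoin(), benchmarks.StringBuilder());
    }
}

That would work, but it turns out that the package we are already using for benchmarking, BenchmarkDotNet, has this built in. All we have to do is add the [ReturnValueValidator] attribute to our benchmark class and we’re good to go: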
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace StringBenchmarks;

[MemoryDiagnoser]
[ReturnValueValidator(failOnError: true)] // <-- this is the magic sauce
public class Benchmarks
{
    [Params(5, 50, 500)]
    public int N { get; set; }

    [Benchmark(Baseline = true)]
    public string StringJoin()
    {
        return string.Join(", ", Enumerable.Range(0, N).Select(i => i.ToString()));
    }

    [Benchmark]
    public string StringBuilder()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < N; i++)
        {
            sb.Append(i);
            sb.Append(", ");
        }
        return sb.ToString();
    }
}
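By the way, nothing changes about how the benchmarks are launched. A typical entry point looks something like this sketch (assuming a plain console project with top-level statements):

using BenchmarkDotNet.Running;
using StringBenchmarks;

// Runs every [Benchmark] method on the Benchmarks class; validators,
// including ReturnValueValidator, run before any measurements are taken.
BenchmarkRunner.Run<Benchmarks>();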
Now, when running our benchmark using dotnet run -c Release, if our benchmarked methods are not returning the same thing, we will get an error like this:
// Validating benchmarks:
Inconsistent benchmark return values in Benchmarks: StringJoin: 0, 1, 2, 3, 4, StringBuilder: 0, 1, 2, 3, 4,
...
And the benchmarks won’t run. By the way, don’t be confused by the fact that both return values look the same here: the last comma after the output of StringJoin is not part of that output, it is just there to separate the output of StringJoin from the output of StringBuilder. The output of StringJoin is "0, 1, 2, 3, 4" and the output of StringBuilder is "0, 1, 2, 3, 4, ".
If we fix our code to make sure both methods return the same thing:
using System.Linq;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace StringBenchmarks;

[MemoryDiagnoser]
[ReturnValueValidator(failOnError: true)]
public class Benchmarks
{
    [Params(5, 50, 500)]
    public int N { get; set; }

    [Benchmark(Baseline = true)]
    public string StringJoin()
    {
        return string.Join(", ", Enumerable.Range(0, N).Select(i => i.ToString()));
    }

    [Benchmark]
    public string StringBuilder()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < N - 1; i++) // <-- this is the fix: stop one element early
        {
            sb.Append(i);
            sb.Append(", ");
        }
        sb.Append(N - 1); // <-- and append the last element without a trailing separator
        return sb.ToString();
    }
}
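As an aside, this is not the only way to get rid of the trailing separator. A variation of my own (not part of the original benchmark) is to prepend the separator for every element after the first:

[Benchmark]
public string StringBuilderNoTrailingSeparator() // hypothetical alternative, for illustration only
{
    var sb = new StringBuilder();
    for (int i = 0; i < N; i++)
    {
        if (i > 0)
        {
            sb.Append(", "); // separator goes *before* every element except the first
        }
        sb.Append(i);
    }
    return sb.ToString();
}

Either variant returns exactly what string.Join returns, so the validator stays happy.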
When we run our benchmark again, we’ll see the validation succeed and the benchmarks run as expected:
// Validating benchmarks:
// ***** BenchmarkRunner: Start *****
// ***** Found 6 benchmark(s) in total *****
// ***** Building 6 exe(s) in Parallel: Start *****
// ***** Done, took 00:00:03 (3.21 sec) *****
...
Hurray! We can now continue benchmarking our code, safe in the knowledge that we are protected from GitHub Copilot’s shenanigans.