How an exception can lead to code optimization

[Tomislav Gracin] - How an exception can lead to code optimization-1

In Mireo, we value the quality of code and performance more than releasing stuff fast. Services we write are supposed to run forever, and it's incredibly suspicious when an application crashes for the first time in several years.

Speaking of performance, whenever you dig deep enough, you end up tumbling some bytes and bits over. Unless flexibility is the most important factor, at least one type of data will be written in some kind of binary format. However you design such a format, after a while, you will find yourself doing mental gymnastics in process of "searching for a place where I can add just one more bit of information. Often there is a flags member which runs out of bits, and you're in problem. Questions such as these often arise:

  • Do we really need course (direction compared to North) to be precise to 1 degree? Maybe we can squeeze it to just one byte.
  • If the vehicle is moving, then it's not adding fuel, can we combine that information?
  • After we process this data, some of the raw information is useless - can we reuse this field for something completely unrelated?

One interesting type of data is time. Everyone who ever programmed surely had her share of frustrations dealing with time zones. Usually, it's best to treat time as an integer - number of seconds (or milliseconds if you need that precision) elapsed from some fixed moment in time, and calculate everything relative to that (converting to local time is done just when time needs to be displayed to the user). An industry standard is Unix epoch time - number of (milli)seconds elapsed since January 1st, 1970, midnight. Often you won't need that big of an offset though - if your system is designed in 2020, and you won't accept any prior data, it's perfectly acceptable to shred 50 years of elapsed seconds. The biggest reason being something called Year 2038 problem.

We too, in Mireo, represent time as an integer relative to Unix epoch. We, luckily, do not need to represent any date before 1970, so we know that our integers will always be positive, which brings us back to the initial problem - we have an extra bit of information at our disposal wherever we use signed integers (yes, for representing dates after 2038 we'd need to convert this to unsigned or use more bytes). In our particular case, after we apply a filter to the data point, we can negate the time component to signal that this instance of data point was processed. A simplified pseudo-code follows:

struct data_point {
struct data_point {
  int time;
  ....
}

...

while (!all_data_is_filtered) {
  foreach (dp in data_points) {
    if (dp.time > 0 && filter_needs_to_be_applied(dp)) {
      apply_filter(dp);
      dp.time = -dp.time; // mark it as done
    }
  }
}

This way, if we pass through the same data multiple times, we won't apply filter more than once. Later on, we can just take the absolute value of the time component and resume as we normally would:

do_something_with_time(Math.Abs(dp.time));

Clever, right? Well, after your application says kaboom and crashes with an exception a few years later, you have some head-scratching to do. As it always is, Devil's in the details. Data is often not perfect, and as we are processing real-life data, we should be expecting all kinds of nonsense. Some devices that provide the data to our system sometimes have problems of their own. In this particular case, a device sent a timestamp value of 0x80000000 which is the smallest possible 4-byte integer. As a reminder, for example, in C (and for all practical purposes - everywhere else):

Value of int.MaxValue is +2147483647 (0x7FFFFFFF). 
Value of int.MinValue is -2147483648 (0x80000000).

At this point, you might want to reconsider the above code.

The application in question was written in C#. As it's written here, this method throws an exception in this very specific case - the input is int.MinValue. And there we have it, an innocent-looking code worked perfectly for years until some erroneous data arrived.

The interesting part is how you should handle this case. Java advocates are grinning at this point, as they tend to try...catch virtually every line of code and "this can never happen to them". There are benefits to writing such heavy-exception-handing code (stability), and there are penalties as well (performance and readability). The main question you should be asking is: "what do I want to happen?". Sometimes you'd rather the application to crash if an exception happens instead of trying to handle it. Sometimes you do not care. Lines such as these are very common:

try {
  do_something(); // may throw exception
}
catch {} // do nothing

And that's perfectly fine. In the case when you don't care, you'd rather if the method do_something didn't even throw the exception in the first place, so you would not need to write the try...catch block anyway. In the case of an elementary method such as abs, you probably didn't even expect that it can throw an exception. And even if you do, you probably didn't catch it. C++'s way of handling it is that abs of int.MinValue is undefined behavior. The caller is expected to handle the case when the value passed is this specific case. For us, we didn't really care what abs would return in the code above, as the data would be ignored in the next step anyway. So, what did we do? No, we didn't wrap the Math.Abs in the try...catch block, we simply wrote our version of Abs. You'd probably expect something like this:

static int MyAbs1(int value) {
  if (value == int.MinValue)
    return int.MaxValue; // close enough
  return (value < 0) ? -value : value;
}

But, as we said initially, we are focusing on performance. This method has two control blocks just to make it correct. But we don't really care about int.MinValue, so we can shred that one out:

static int MyAbs2(int a) {
  return (value < 0) ? -value : value;
}

What will this code return in case of int.MinValue? Well, unluckily, it will return int.MinValue. If you truly don't care, this is not a problem.

If we look into the abs function, it's common implementation, and the idea behind it, we can notice that:

  • abs function which would take unsigned value as an input makes no sense (it would simply return the same value as there is no possibility of negative numbers)
  • abs returns signed values, even though it should (must?) never return a negative number

In our specific example from the start of this blog, we might want to get any non-negative result out of it, so we don't re-run the filter. For this reason, it might be useful to rewrite the function as follows:

static uint MyAbs3(int value) {
  return (uint)((value < 0) ? -value : value);
}

This variation really returns the expected value, and as long as your application can handle the unsigned return value, the problem is solved.

Or is it? Being as we are, we're trying to improve everywhere we can. Knowing that negative numbers are written as Two's Complement, we can improve on the code by replacing a control statement with some bit manipulation logic:

static uint MireoAbs(int value) {
  return (uint)((value + (value >> 31)) ^ (value >> 31));
}

Explanation: Let's consider (value >> 31) - the initial value is shifted to the right 31 times ("copying" the leftmost bit on the way). If the left-most bit of value is 0 (positive number), this will evaluate to 0. But, if the left-most bit of value is 1, this will evaluate to a number consisting of thirty-two ones in binary notation (in hex it's 0xFFFFFFFF). This number is actually -1.

Let's consider both cases now.

If the initial number is positive, we will return the (value + 0) ^ 0 which evaluates to value.

If the initial number is negative, it becomes a little bit more fun. For any xx ^ (-1) means XORing the value with nothing but ones, essentially taking a complement of a number, which is equivalent to taking two's complement and subtracting one: x ^ (-1) = - x - 1. Substituting x = value - 1 results in (value - 1) ^ (-1) = - (value - 1) - 1 = - value + 1 - 1 = - value. Exactly what we wanted.

There are two more things to do:

  1. Test if the method is correct.
  2. Measure the performance

How does one do the correctness test? In this case, it's actually very simple - there are about 2 billion possible integers. A commodity CPU of 2GHz performs 2 billion operations per second. Meaning, we should be able to iterate through EVERY integer and test if the value is correct.

for (int val = int.MinValue; val <= int.MaxValue; ++val)
  if (MyAbs(val) != Math.Abs(val))
    Console.WriteLine("Oops!\n");

Easy, right? We hope you didn't fall into the trap, there are two bugs in the code:

  • You'll get the exception on the very first iteration for value == int.MinValue, just where this blog started.
  • Even if you caught the exception, the loop would run infinitely

A proper test:

for (int val = int.MinValue; val < int.MaxValue; ++val)
  if (MyAbs(val + 1) != Math.Abs(val + 1))
    Console.WriteLine("Oops!\n");

After running the test and verifying the results are correct, we can measure the performance. On a 3.3GHz machine, running the Abs function for all integers except int.MinValue results in this table:

Implementation Run Time
Math.Abs 7.2 s
MyAbs1 7.8 s
MyAbs & MyAbs3 5.7 s
MireoAbs 2.2 s

 

And that's the story of how we ended up writing our own implementation of the Abs method. Although it's almost impossible to consider all corner cases when designing an application and writing code, we hope we gave you an idea of how we in Mireo tackle a problem, however simple it might be on the surface.

Edit:

As correctly identified by a reader (Goran Mitrović), this blog had several mistakes. The first one was a typo (which was repeated twice), and that's easy to fix. The second one was of technical nature, and although it's never fun to be wrong, I as an author have to admit that I made a mistake in testing. The value of 2.2 seconds that was measured as a performance is incorrect. Compiler optimized the code, noticing that the value was not used. The reader suggested that returned value should be assigned or accumulated to prevent optimization, which makes perfect sense. Doing that resulted in a table what looks more sensible:

Implementation Run Time
Math.Abs 7.6 s
MyAbs1 7.0 s
MyAbs2 & MyAbs3 4.7 s
MireoAbs 4.6 s

 

Obviously, the improvement is insignificant (and probably depends on luck). It would be silly to defend the obfuscated version for a doubtful improvement just for sheer showing of bit manipulation skills. “Premature optimization is the root of all evil” is a common saying, and it applies here as well - code should be efficient, but compromising readability for no efficiency is dangerous and harmful. In this particular case, where complete test is possible (iteration over all possible values), it's acceptable, but for sure nothing I'd recommend.

That being said, the said reader inspired me to give it another test. Going back to the root of the problem, an exception was thrown, and this whole code was done just to avoid throwing an exception where we can ignore it. Measuring with the same principle, but adding `try .. catch` block around the MireoAbs call resulted in runtime being 12.6s - around 2.7 times slower. Avoiding need to handle an exception is always a positive trait.

To sum it up - do your best. Sometimes you'll make a mistake - own it up and do better next time. That's how we do it in Mireo, and so should you.