Fast portable arithmetic shifting with C/C++ (to multiply with power-of-two)

There are certain situations when one wants to shift integers either left or right based on runtime parameters, for example to calculate a×2b.

Simply calculating a << b can result in errors:

  • if a is negative, the operation is undefined behavior.
  • if b is negative, the operation is undefined behavior.
  • if a is signed and a 1-bit get shifted in the sign bit, then until C++14 this is undefined behavior and since C++14 it results in the sign-bit being set, which may or may not be desired.
  • with a >> b, if a is negative and b isn’t then the result is implementation defined in C++, it might not be a×2b.

A solution can be the following:

  • Depending on the sign of b, simply do << b or >> (-b).
  • If a is negative and b is positive:
    •   -a       = [1] (-a is positive)
        -a << b  = [2] (-a 2b in a well-defined way)
      -(-a << b) = [3] (-(-a 2b) = a 2b in a well-defined way)
    • In [1], we negate a. This is only undefined if a is the minimum value, for example -128 and can not be converted to +128 in 8 bits. This is the only case which results in UB, but the result would be unrepresentable anyway.
    • In [2], we get –a×2b in a well-defined way and finally in [3] we get a×2b.
  • If a is negative and b is negative:
    •    a        = 1jk..rs..yz  [1] (j..z are the bits)
        ~a        = 0JK..RS..YZ  [2] (N = ~n)
        ~a >> -b  = 00..0JK..RS  [3]
      ~(~a >> -b) = 11..1jk..rs  [4]
    • In [1], we have the original number, a, which is negative.
    • In [2], we get the ones-complement which is abs(a)-1.
    • In [3], we get (abs(a)-1)×2b = abs(a)×2b-1×2b = abs(a)×2b rounded down
    • In [4], we get a×2b-1 where a×2b is rounded up.
    • As it can be seen, this sequence of operations is equivalent to an arithmetic right shift.

The code can be viewed here.

Most platforms implement arithmetic shifting for signed values anyway, meaning they use the arithmetic shift instructions so the above exercise becomes pointless. Even worse, look at the assembly generated here: https://godbolt.org/g/YN1qr4

All the branches are there in the generic shifting function even though they are optimized away to single instructions (SAL, SAR, SHR on x86 and MOV ASL, MOV ASR, MOV LSR on ARM) when the shift amount is known at compile time.

It is best if we rely on builtin arithmetic shifting when possible. To test for this at compile-time, we can verify that shifting -1 = 0xFFFF right by one results in -1. The results can be seen here: https://godbolt.org/g/puIcAF The generic shift function shift_by_int avoids branching and uses less instructions on all platforms.

The code:

#include <type_traits>

template <typename T>
constexpr auto builtin_shr(T value, int amount) noexcept
    -> decltype(value >> amount)
{
    return value >> amount;
}

template <typename T>
struct uses_arithmetic_shift : std::integral_constant<bool, builtin_shr(T(-1), 1) == -1> {};

template <typename T = int>
constexpr T shift_by_portable(T value, int amount) noexcept
{
    return value < 0 ?
      amount < 0 ? ~(~value >> -amount) : -(-value << amount) :
      amount < 0 ? value >> -amount : value << amount;
}

template <typename T = int>
constexpr T shift_by_arithmetic(T value, int amount) noexcept
{
    // Only use with negative T values when the compiler translates right shift to arithmetic shift instructions.
    return amount < 0 ? value >> -amount : value << amount;
}


template <typename T = int>
constexpr T shift_by(T value, int amount) noexcept
{
    return uses_arithmetic_shift<T>::value ? shift_by_arithmetic(value, amount) : shift_by_portable(value, amount);
}

Recreating this with C macros is left as an exercise to the reader.

Leave a Comment

Your email address will not be published. Required fields are marked *