Saturday, July 14, 2018

DRY multicomparisons

Now and then, I find myself writing something like if (x == a || x == b || x == c) ..., and every time the repetition of x == annoys me.

A number of people, me included, have reduced the repetition by writing code like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
template <typename T>
class is
{
public:
    constexpr is(T t_) : t(std::move(t_)) {}
    template <typename ... U>
    constexpr auto any_of(U const& ... u) const { return ((t == u) || ...); }
private:
  T t;
};

If you're not familiar with the C++17 constructs, I'll walk you through it.

To begin with, C++17 has a class template argument type deduction that works for constructors. This works automatically when the type used in the constructor is enough to deduce the template types, which is the case with the constructor on line 5, so is{3} is automatically deduced to be is<int>{3}, since the literal 3 is an int.

The member function template any_of(), on line 7, uses a variadic template parameter pack ...u, to accept any number of arguments of any type. This is then passed to a fold expression which will be expanded such that the pattern (t == u) is repeated, where u takes on each and every of the parameters called with, in the ....

Here's an example use of the construction:

1
if (is{x}.any_of(a,b,c)) ...

It will construct an is<int>{x}, and the call to any_of(a,b,c) will in effect return the result of ((x == a) || (x == b) || (x == c)).

This takes care of much of the repetition, but it doesn't read very nicely. It's awkward.

Enter ranges. Ranges have been in the works for C++ for some time now, and is available as open source libraries if you want to try them out. Chances are they will be a part of C++20, but that decision is not made yet.

With ranges, the algorithms from <algorithm> and <numeric> can work on whole ranges of values at once, so you don't have to fiddle with iterator pairs. Collections counts as ranges, including std::initializer_list<>. In my examples I'm using the range-v3 library, available from github.com. In the range-v3 library, the algorithms are in namespace ranges::v3.

So we can instead write:


1
if (ranges::v3::any_of({a,b,c}, [x](auto y){ return x == y; }) ...

It takes care of repetition, but at the cost of boiler plate to the degree it smells of obfuscation. It's not easy to read.


A trivial higher order function improves it:

1
2
3
4
5
template <typename T>
inline auto equals(T t)
{
    return [=](auto&& x) { return x == t;};
}

The call can now be made as:

1
if (ranges::v3::any_of({a,b,c}, equals(x)) ...

The call to equals(x) gives us the lambda that makes the test for equality. I guess that if you read it out, except for the namespace and punctuations, it kind of reads like normal English.

But there is a difference here. A subtle one, and one I didn't think of for a very long time. What if a, b and c, are not all of the same type?

Just a couple of days ago, I had a different idea. What if I make any_of to be a class template, that holds the values I want to compare, and implements an operator==? This would be sort of like the first example solution, but inside out.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
template <typename ... T>
class any_of : std::tuple<T...>
{
public:
    using std::tuple<T...>::tuple;
    template <typename U>
    constexpr bool operator==(const U& u) const {
        return std::apply([&](const auto& ... a) { return ((a == u) || ...);},
                          get());
    }
    template <typename U>
    friend constexpr bool operator==(const U& u, const any_of& a)
    {
        return a == u;
    }
private:
    constexpr const std::tuple<T...> get() const { return *this;}
};

template <typename ... T>
any_of(T&& ...) -> any_of<T...>;

I'll walk you through the code.

First, any_of<T...> inherits privately from std::tuple<T...>, and on line 5 we say that we have all the constructors that tuple does.

Lines 6-10 implements operator==, that can be called with any type. std::apply, on line 8, calls a function (first parameter) with the individual values from a tuple (2nd parameter). The function is a lambda that can take any number of arguments, and compare them with the reference captured u, as explained above for the is<T>::any_of() function template, and the 2nd parameter is the tuple that any_of inherits from (as given by the private get() member function on line 17.)

The one thing missing now is that it is not obvious for the compiler what the types T are. It cannot deduce them automatically, so we need to help it. The deduction guide on lines 20-21 takes care of that. It say if we call any_of(a,b,c), it will deduce the Ts from the types of a, b and c. l-values will be deduced to l-value references, and r-values will be moved into the tuple.

This is enough to make it work. A call can now be made as:

1
if (any_of{a,b,c} == x)...

This is nice! A little bit Yoda-like grammar, but the friend operator== on lines 11-15 allows comparison with the arguments in the reversed order, like:

1
if (x == any_of{a,b,c})...

This is all and well. But since this can be used with any types for which the comparisons make sense, why not expand it to other operators too, and implement all_of and none_of? Why not allow an assertion like the following?:

1
assert(all_of(a,b,c,d) > " ");

This of course requires that a, b, c and d are all greater-than comparable with a c-string.

I think this is neat. It reads nicely, and is not very difficult to understand.

If you want to have a look at code generation, please toy with this link to gcc.godbolt.org. Spoiler - the compilers see through it all and generates amazingly efficient code.

If you want to try it in your own code, clone it from github.com.

No comments:

Post a Comment