Subscripting in generic scalar & vector code

Submitted by Matthias on Tue, 11/22/2016 - 16:50

Several users of the Vc library want to write their code so that it compiles both with and without Vc. Consequently, all the application logic must be stated in generic terms that work equally well with T and with Vc::Vector<T>. For most code this works well, since Vc::Vector<T> implements the same operators with the same semantics as T. Masked assignments and mask reductions need extra care, though. For mask reductions, the Vc library provides any_of, all_of, none_of, and some_of, which are also overloaded in the Vc namespace for the builtin bool type. You can provide the same overloads in your own namespace and thus compile reductions without Vc. For masked assignment, Vc provides the Vc::where function, which is likewise overloaded for bool and builtin T. Feel free to copy this implementation into your namespace as well to make Vc optional.
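As a rough illustration of what such scalar fallbacks could look like without Vc, here is a minimal sketch. The namespace myapp, the helper class ScalarWhere, and the function clamp_below_zero are hypothetical names chosen for this example; this is not a copy of Vc's implementation, which may differ in detail.

```cpp
#include <utility>

namespace myapp {  // hypothetical namespace for the Vc-free build

// Reductions over a single bool: the "mask" has exactly one entry.
constexpr bool any_of(bool b) { return b; }
constexpr bool all_of(bool b) { return b; }
constexpr bool none_of(bool b) { return !b; }
// "some but not all" can never hold for a single entry.
constexpr bool some_of(bool) { return false; }

// Hypothetical helper: applies an assignment only if the mask is true.
template <class T> class ScalarWhere {
  bool mask_;
  T &value_;

public:
  ScalarWhere(bool mask, T &value) : mask_(mask), value_(value) {}

  template <class U> void operator=(U &&rhs) {
    if (mask_) value_ = std::forward<U>(rhs);
  }
  template <class U> void operator+=(U &&rhs) {
    if (mask_) value_ += std::forward<U>(rhs);
  }
};

// Enables the spelling `where(mask, x) = y;` for builtin types.
template <class T> ScalarWhere<T> where(bool mask, T &value) {
  return {mask, value};
}

}  // namespace myapp

// Example of generic-looking code that compiles without Vc.
template <class T> T clamp_below_zero(T x) {
  using namespace myapp;
  where(x < T(0), x) = T(0);
  return x;
}
```

With builtin types, clamp_below_zero(-2.0) writes through the masked assignment and returns 0, while clamp_below_zero(3.0) leaves the value untouched.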

The problem

But what is the solution for a case where you need to subscript into a Vc::Vector? There's no way the subscript operator can work on builtin arithmetic types. You could make it work by wrapping T in a class, similar to what Vc::Scalar::Vector<T> does. Consider a case of swapping rows in a 3x3 matrix:

#include <array>    // std::array
#include <utility>  // std::swap

template <class T> using Col = std::array<T, 3>;
template <class T> using Mat = std::array<Col<T>, 3>;
using Matrix = Mat<double>;

// Swap rows row0 and row1 element by element.
void swap_rows(Matrix &m, int row0, int row1) {
  using std::swap;
  for (int i = 0; i < 3; ++i) {
    swap(m[row0][i], m[row1][i]);
  }
}

And the vector case:

using Matrix = Mat<Vc::double_v>;
// One int row index per vector lane.
using intv = Vc::SimdArray<int, Vc::double_v::size()>;

void swap_rows(Matrix &m, intv row0, intv row1) {
  // Each lane i may swap a different pair of rows.
  for (std::size_t i = 0; i < intv::size(); ++i) {
    for (int j = 0; j < 3; ++j) {
      swap(m[row0[i]][j][i], m[row1[i]][j][i]);
    }
  }
}

A solution?

Now, how can we write a single generic swap_rows function? The need for the extra loop and subscripting of vectors makes the task non-trivial. Here's an idea:

template <class T, class U>
void swap_rows(Mat<T> &mv, U row0v, U row1v) {
  using std::swap;
  as_scalar(mv, row0v, row1v, [](auto &m, int row0, int row1) {
    for (int i = 0; i < 3; ++i) {
      swap(m[row0][i], m[row1][i]);
    }
  });
}

The as_scalar function is easy to implement for arguments that are used directly as Vc::Vector<T> or Vc::SimdArray<T, N>, as is the case for the row variables in the example. The type of m, however, must be a smart wrapper type that redirects the swap to the correct element in mv. One idea is to require such arguments to be specializations of a class template (here Mat<T>), so that as_scalar can instantiate the same template with a generic element reference wrapper in place of T. Such a smart reference could even encode the element index in its type to help the compiler generate more efficient memory accesses or vector insert/extract instructions. A completely generic implementation of as_scalar is not trivial, but it seems to be easier than Vc::simdize<T>. Implementing as_scalar is material for another post.
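To make the idea a little more concrete, here is a deliberately non-generic sketch that only handles the Mat case from this post. Pack<T, N> is a hypothetical array-backed stand-in for Vc::Vector<T> so that the example compiles without Vc, and LaneRow and LaneMat are hypothetical view types that fix the lane index at run time. With real Vc types the subscript may yield a proxy reference rather than a plain T&, so this should be read as an assumption-laden illustration, not as the eventual implementation.

```cpp
#include <array>
#include <cstddef>
#include <utility>

template <class T> using Col = std::array<T, 3>;
template <class T> using Mat = std::array<Col<T>, 3>;

// Hypothetical stand-in for Vc::Vector<T> / Vc::SimdArray<T, N>.
template <class T, std::size_t N> struct Pack {
  std::array<T, N> lanes;
  T &operator[](std::size_t i) { return lanes[i]; }
  const T &operator[](std::size_t i) const { return lanes[i]; }
  static constexpr std::size_t size() { return N; }
};

// View of one matrix row restricted to a single lane.
template <class T, std::size_t N> struct LaneRow {
  Col<Pack<T, N>> &row;
  std::size_t lane;
  T &operator[](std::size_t c) const { return row[c][lane]; }
};

// View of the whole matrix restricted to a single lane.
template <class T, std::size_t N> struct LaneMat {
  Mat<Pack<T, N>> &mat;
  std::size_t lane;
  LaneRow<T, N> operator[](std::size_t r) const { return {mat[r], lane}; }
};

// Scalar case: nothing to unpack, call f once.
template <class T, class F>
void as_scalar(Mat<T> &m, int row0, int row1, F &&f) {
  f(m, row0, row1);
}

// Vector case: call f once per lane with a scalar view of that lane.
template <class T, std::size_t N, class F>
void as_scalar(Mat<Pack<T, N>> &mv, const Pack<int, N> &row0v,
               const Pack<int, N> &row1v, F &&f) {
  for (std::size_t lane = 0; lane < N; ++lane) {
    LaneMat<T, N> m{mv, lane};
    f(m, row0v[lane], row1v[lane]);
  }
}

// The generic function from above, unchanged.
template <class T, class U>
void swap_rows(Mat<T> &mv, U row0v, U row1v) {
  using std::swap;
  as_scalar(mv, row0v, row1v, [](auto &m, int row0, int row1) {
    for (int i = 0; i < 3; ++i) {
      swap(m[row0][i], m[row1][i]);
    }
  });
}
```

The same swap_rows now accepts a Mat<double> with int rows as well as a Mat<Pack<double, N>> with Pack<int, N> rows. The sketch sidesteps the hard part, though: as_scalar knows about Mat explicitly instead of re-instantiating an arbitrary class template with an element reference wrapper.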
