Applying an operation on each element in a tuple

I ran into this when implementing the C++14 version of zip for cppitertools.

Given a tuple of iterators, increment each of those iterators using ++. Assuming I already have a std::index_sequence with the indices as Is..., I initially had something like this:

void increment_all() {
    ++std::get<Is>(this->tup)...;
}

However, this doesn’t work because a pack expansion can’t stand alone as a statement; it has to appear somewhere like a function call’s argument list or an initializer list. This can be worked around with a function that absorbs everything. My original implementation was to make it variadic like so:

template <typename... Ts>
void absorb(Ts&&...) { }

This isn’t really necessary though, since the older C-style variadic parameter list (...) works just as well. The below meets the requirements:

//void absorb(...) {} //UPDATE DON'T USE THIS READ BELOW
 
void increment_all() {
    absorb(++std::get<Is>(this->tup)...);
}

And this works for my purposes. Note that this is a function call, so the order in which the iterators will be incremented is unspecified. If it’s important to you for it to happen in a 0, 1, …, N order, then you can use an initializer list if all of the operations will return the same type, or use other solutions if they don’t.
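For the ordered case, here is a minimal sketch of the initializer-list approach. It assumes the tuple is passed in as a parameter rather than stored in this->tup, and the names increment_all_in_order and increment_all_in_order_impl are made up for this example. Pairing each increment with the comma operator gives every element the type int, so this variant should also cope with operations that return different types.

```cpp
#include <cstddef>
#include <initializer_list>
#include <tuple>
#include <utility>

// Hypothetical helper (not taken from cppitertools): increments every
// element of tup in index order 0, 1, ..., N-1.
template <typename Tup, std::size_t... Is>
void increment_all_in_order_impl(Tup& tup, std::index_sequence<Is...>) {
  // The elements of a braced-init-list are evaluated left to right.
  // (void) discards the result of ++ (and sidesteps any overloaded
  // comma operator); the trailing 0 makes every element an int.
  (void)std::initializer_list<int>{((void)++std::get<Is>(tup), 0)...};
}

template <typename... Ts>
void increment_all_in_order(std::tuple<Ts...>& tup) {
  increment_all_in_order_impl(tup, std::index_sequence_for<Ts...>{});
}
```

Calling increment_all_in_order(t) on a std::tuple<int, long, char> bumps each element once, first to last.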

UPDATE: I reverted to using the templated absorb because the version with just ... (C-style varargs) will attempt to make copies of its input arguments. This is never what I want to happen. Even though the copies would be optimized out, it still prevents the function from working with non-copyable (in some cases, only non-movable) types.
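To make the difference concrete, here is a small sketch of the templated absorb used with a type that can be neither copied nor moved. Counter, increment_each, and increment_each_impl are hypothetical names for this example, and the tuple is passed as a parameter instead of living in this->tup.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// The templated absorb from above: every argument binds by reference,
// so nothing is copied or moved.
template <typename... Ts>
void absorb(Ts&&...) {}

// Hypothetical stand-in for an iterator that cannot be copied or moved.
struct Counter {
  int value = 0;
  Counter() = default;
  Counter(const Counter&) = delete;
  Counter& operator=(const Counter&) = delete;
  Counter& operator++() {
    ++value;
    return *this;
  }
};

template <typename Tup, std::size_t... Is>
void increment_each_impl(Tup& tup, std::index_sequence<Is...>) {
  // Each ++ yields a Counter&, which absorb accepts as a reference.
  absorb(++std::get<Is>(tup)...);
}

template <typename... Ts>
void increment_each(std::tuple<Ts...>& tup) {
  increment_each_impl(tup, std::index_sequence_for<Ts...>{});
}
```

As noted above, the order of the increments is unspecified here, since they are function arguments.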

WordPress unescaping code blocks

It seems that if I create a code block like the following,

template <typename T>
void f(T a, T b) {
  std::string s = "result: ";
  std::cout << s << a + b << '\n';
}

then publish, edit, and publish again, it escapes the HTML in the block after the edit. This is quite hideous.

template &lt;typename T&gt;
void f(T a, T b) {
  std::string s = &quot;result: &quot;;
  std::cout &lt;&lt; s &lt;&lt; a + b &lt;&lt; '\n';
}

And it stacks: each time I edit, it gets worse.

template &amp;lt;typename T&amp;gt;
void f(T a, T b) {
  std::string s = &amp;quot;result: &amp;quot;;
  std::cout &amp;lt;&amp;lt; s &amp;lt;&amp;lt; a + b &amp;lt;&amp;lt; '\n';
}

Pretty rough. I’ll have to find a way around this.

Exploding Tuple in C++14

So as it turns out, the C++14 standard makes expanding a tuple, pair, or array, in a function call, very simple.

template <typename Func, typename TupleType, 
          std::size_t... Is>
decltype(auto) call_with_tuple_impl(
    Func&& f, TupleType&& tup, 
    std::index_sequence<Is...>) {
  return f(std::get<Is>(tup)...);
}
 
template <typename Func, typename TupleType>
decltype(auto) call_with_tuple(
    Func&& f, TupleType&& tup) {
  constexpr auto TUP_SIZE = std::tuple_size<
    std::decay_t<TupleType>>::value;
  return call_with_tuple_impl(
      std::forward<Func>(f),
      std::forward<TupleType>(tup),
      std::make_index_sequence<TUP_SIZE>{});
}

Much less painful than its C++11 equivalent. std::index_sequence (an alias template for std::integer_sequence of std::size_t) is responsible for most of the ease, combined with decltype(auto), which lets a function deduce its own return type.

call_with_tuple first determines the tuple_size of the tuple-like object passed in. std::make_index_sequence<TUP_SIZE> then produces the index sequence 0, 1, ..., TUP_SIZE-1. The last argument of call_with_tuple_impl exists only to deduce the Is... pack. Finally, f(std::get<Is>(tup)...) is the equivalent of f(std::get<0>(tup), std::get<1>(tup), ..., std::get<TUP_SIZE-1>(tup)).

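The std::decay_t is necessary because TupleType is a universal reference: when an lvalue is passed in, it deduces to a reference type, and std::tuple_size is not defined for references.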
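A short usage sketch, with the two function templates repeated so that it compiles on its own. sum3 and scale are hypothetical functions invented for the example; since std::tuple_size and std::get also cover std::pair and std::array, the same call should work for all three.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template <typename Func, typename TupleType, std::size_t... Is>
decltype(auto) call_with_tuple_impl(Func&& f, TupleType&& tup,
                                    std::index_sequence<Is...>) {
  // Expands to f(std::get<0>(tup), std::get<1>(tup), ...)
  return f(std::get<Is>(tup)...);
}

template <typename Func, typename TupleType>
decltype(auto) call_with_tuple(Func&& f, TupleType&& tup) {
  constexpr auto TUP_SIZE =
      std::tuple_size<std::decay_t<TupleType>>::value;
  return call_with_tuple_impl(std::forward<Func>(f),
                              std::forward<TupleType>(tup),
                              std::make_index_sequence<TUP_SIZE>{});
}

// Hypothetical functions to call.
int sum3(int a, int b, int c) { return a + b + c; }
double scale(int n, double factor) { return n * factor; }
```

With these, call_with_tuple(sum3, std::make_tuple(1, 2, 3)) calls sum3(1, 2, 3), and call_with_tuple(scale, std::make_pair(4, 0.5)) calls scale(4, 0.5).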

I’m using these integer sequences more and more. They’re proving to be extremely useful in flattening complex recursive data structures and logic. C++14 rules.

How to malloc() the Right Way

Having been in the game for a while now, I’ve seen different styles of malloc()ing data. With new students, I almost universally see:

int *array = (int *)malloc(sizeof(int) * N);

There are two problems that need addressing here:

  1. The cast to (int *)
  2. Using sizeof(int)

If you like either of these, bear with me, I have reasons for changing both to the below:

int *array = malloc(N * sizeof *array);

If you aren’t already aware, the sizeof operator can be applied to an expression. The expression is not evaluated (array isn’t actually dereferenced in the above), only the type of the expression is examined. The type of *array is int, so sizeof *array is equal to sizeof(int), and is also computed at compile-time.
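The same rule holds in C++, so here is a small sketch written in C++, chosen only because static_assert lets the compiler check the claim. bytes_requested is a hypothetical helper, not something from the post.

```cpp
#include <cstddef>

// sizeof inspects only the type of its operand; the null pointer below
// is never actually dereferenced.
static_assert(sizeof *static_cast<long *>(nullptr) == sizeof(long),
              "sizeof *p is the size of the pointed-to type");

// Hypothetical helper: the byte count that malloc(n * sizeof *array)
// would request for an array of n ints.
std::size_t bytes_requested(std::size_t n) {
  int *array = nullptr;      // deliberately null: only named, never read
  return n * sizeof *array;  // same as n * sizeof(int)
}
```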

Why you shouldn’t cast

First of all, you don’t need to. All data pointer types in C are implicitly convertible to and from void *, as long as const qualifiers aren’t dropped. This should be enough of a reason not to use it, but keep reading if you’re unconvinced.

Second, and this is vital for students, but especially teachers: code should have as few casts as possible. A cast is a sign that something dangerous or strange is happening. A cast indicates that the normal rules of the type system can’t do what the programmer needs to do. I haven’t seen many good reasons for type-casting outside of low-level systems code. Teachers, we should not teach students to use casts without a second thought. The problem becomes more pronounced when they use a cast somewhere to shut up the compiler, their code breaks, and no one knows why. malloc() is not unusual, strange, or dangerous in C. It shouldn’t be cast; it’s a natural part of the flow of any significant C program.

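Third, the cast can actually prevent the compiler from issuing a warning if stdlib.h isn’t included. C89 doesn’t require a declaration for all functions, as long as it can find them at link-time. Simply put, this means you can call malloc without including stdlib.h. The compiler will produce an implicit declaration, int malloc(), which returns int rather than a pointer. Without the cast, assigning that int to a pointer draws a diagnostic that points straight at the missing include; with the cast, the conversion is explicit and the compiler stays quiet, even on platforms where a pointer doesn’t fit in an int. Implicit declarations are invalid in C99, but most compilers still allow them.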

Why you shouldn’t use sizeof(type) either

If the type changes, the malloc line shouldn’t need to change with it. Let’s examine the first example, where one allocates an array of int. Now imagine the programmer later realizes the type needs to be long, not int.

int *array = (int *)malloc(sizeof(int) * N); /* before */
long *array = (long *)malloc(sizeof(long) * N); /* after */

Three changes. With sizeof *array and no cast, we have

int *array = malloc(N * sizeof *array); /* before */
long *array = malloc(N * sizeof *array); /* after */

One change. No changes to the right side of the =, which is important because often the call to malloc() isn’t right next to the variable’s declaration. This makes it clearer that using sizeof(int) and casting keep your code from being very DRY. For all that is said about DRY code, the easiest way to think about it, in my opinion, is that you want to design your program so that if you need to make a change, you only need to change one thing. Repeating the type in three places makes your code harder to modify and maintain.

A more convincing example

Let’s consider a typical kind of struct: a size and a pointer to the contained data. Along with that struct I’ll create a function to allocate it and a function to free it.

myarray.h

#ifndef MY_ARRAY_H_
#define MY_ARRAY_H_

#include <stddef.h>
/* Array type, stores its length alongside the data */
struct my_array {
  size_t size;
  int *data;
};

struct my_array *new_my_array(size_t sz);
void free_my_array(struct my_array *);

#endif

myarray.c

#include "myarray.h"
#include <stdlib.h>
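struct my_array *new_my_array(size_t sz) {
  /* space for the struct itself */
  struct my_array *arr = malloc(sizeof *arr);
  if (arr == NULL) {
    return NULL;
  }
  /* space for the contained data */
  arr->data = malloc(sz * sizeof *arr->data);
  if (arr->data == NULL && sz != 0) {
    /* don't leak the struct if the data allocation fails */
    free(arr);
    return NULL;
  }
  arr->size = sz;
  return arr;
}

void free_my_array(struct my_array *arr) {
  if (arr == NULL) {
    return;
  }
  free(arr->data);
  free(arr);
}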

Note: for the more advanced C programmers out there, I know this could be done with a flexible array member, but it’s an example; there could be two arrays, it could be C89, whatever.

The usage of this should be pretty obvious: struct my_array *arr = new_my_array(N);. Now consider the same problem: the programmer decides int should be long. As I’ve written this, the only change needs to be in myarray.h. int *data becomes long *data, and the allocation in myarray.c is still correct. There is no cast that needs changing and since sizeof uses the variable, it’s transitively modified by the change in the header.

If, on the other hand, the malloc line in new_my_array used a cast and a sizeof(int), the code would still compile, but would be wrong. In this example the change would be required across two files, which is problematic enough, but it could be worse. What if the array is growable? Now there’s another malloc or realloc somewhere that needs to be found. What if the array can grow in more than one place? What if the array can shrink too?

Objections

But what if I want it to work in C++?

C++ doesn’t allow implicitly converting from void* to another data pointer type; it requires a cast. This is a valid question, but there are issues with it, mostly arising from the fact that C and C++ are two different languages.

Right away, if you have a C++ project that has malloc in it, malloc is a strange thing to use there, and it should require a cast. C++ has new for dynamic allocation. If you’re allocating an array, you’re better off using std::vector, std::string, or one of the smart pointers if you really think that’s best.
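As a rough sketch of what those alternatives look like for an array of int, restricted to C++11 library features: make_int_vector and make_int_array are hypothetical helper names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helpers showing two C++ replacements for
// `int *array = malloc(N * sizeof *array);`.

// Usual choice: std::vector owns the storage, knows its size, and
// zero-initializes the n ints.
std::vector<int> make_int_vector(std::size_t n) {
  return std::vector<int>(n);
}

// If a bare array really is wanted: unique_ptr<int[]> calls delete[]
// automatically. The trailing () value-initializes the ints to zero.
std::unique_ptr<int[]> make_int_array(std::size_t n) {
  return std::unique_ptr<int[]>(new int[n]());
}
```

Neither version mentions a size in bytes or needs a cast, and both release the memory without an explicit free.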

If you really do need malloc in C++ (which you probably don’t), then a C-style cast isn’t the right way to do it. It’s a sledgehammer, whereas C++ has a lot of tools to say what you actually mean. In the malloc case, what you really want is:

int *arr = static_cast<int *>(std::malloc(N * sizeof *arr));

If you’re thinking “that’s clunky and ugly,” I’d agree, but I’m also not someone who uses malloc in C++ (unless I really really need it). If you’re a C++ programmer and you don’t understand the difference between static_cast, dynamic_cast, reinterpret_cast, and const_cast, then seriously, start reading up. A student once asked me: “what’s the right time to use a C-style cast?” and as I told her, “when you’re programming in C.”

That’s not what I mean! I want my library to work in both C and C++

For that you should be using extern "C".

Note: What follows is only tangentially related to the original point of this post.

You have written a C library, and you want to link it with C++ code. This is quite normal, and there are two rules of C++ that let us do so. C++ allows one to prefix extern "C" on a function declaration, meaning the name will not be mangled and the linker knows to look for a C-style named function, rather than a C++ style name (which would be mangled). The declaration then appears as extern "C" int f(int);. However, we can’t just throw around extern "C" in C code, because the C language knows nothing about it. The other rule we have says that a conforming C++ compiler must define the preprocessor symbol __cplusplus. Combining these two, one can create a declaration that works in C++ and C:

#ifdef __cplusplus
extern "C"
#endif
void myfunc(int);

In the myarray example, the whole thing can be wrapped in an extern "C" block.

#ifndef MY_ARRAY_H_
#define MY_ARRAY_H_
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Array type, stores its length alongside the data */
struct my_array {
  size_t size;
  int *data;
};

struct my_array *new_my_array(size_t sz);
void free_my_array(struct my_array *);

#ifdef __cplusplus
} /* close the extern "C" block */
#endif

#endif

Voila, now your C and C++ code can share a header, be compiled as their own language, and still be linked.

$ gcc -std=c89 -Wall -c myarray.c
$ g++ -std=c++11 -Wall -c main.cpp
$ g++ myarray.o main.o -o main
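main.cpp itself isn’t shown here, so the following is only a guess at what the C++ side might look like. To keep the sketch compilable as a single file, the header is inlined and the two library functions get stand-in definitions in C++; in the real setup they would come from myarray.o. MyArrayDeleter, MyArrayPtr, and make_my_array are hypothetical names, and the stand-in uses calloc instead of malloc so that the elements start at zero.

```cpp
#include <stddef.h>
#include <stdlib.h>
#include <memory>

/* Contents of myarray.h, inlined so the sketch compiles on its own. */
extern "C" {
struct my_array {
  size_t size;
  int *data;
};
struct my_array *new_my_array(size_t sz);
void free_my_array(struct my_array *);
}

/* Stand-in definitions (assumption: normally provided by myarray.o). */
extern "C" struct my_array *new_my_array(size_t sz) {
  my_array *arr = static_cast<my_array *>(malloc(sizeof *arr));
  if (arr == nullptr) return nullptr;
  arr->data = static_cast<int *>(calloc(sz, sizeof *arr->data));
  if (arr->data == nullptr && sz != 0) {
    free(arr);
    return nullptr;
  }
  arr->size = sz;
  return arr;
}

extern "C" void free_my_array(struct my_array *arr) {
  if (arr != nullptr) {
    free(arr->data);
    free(arr);
  }
}

// Hypothetical C++ wrapper: unique_ptr calls free_my_array for us.
struct MyArrayDeleter {
  void operator()(my_array *arr) const { free_my_array(arr); }
};
using MyArrayPtr = std::unique_ptr<my_array, MyArrayDeleter>;

MyArrayPtr make_my_array(size_t sz) { return MyArrayPtr(new_my_array(sz)); }
```

The C++ caller never casts or frees anything itself; it only sees the C declarations and lets the smart pointer do the cleanup.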

This whole concept may merit more explaining in another post.

I don’t really understand what you just said, but I meant that I want to be able to run my code through a C++ compiler as well as a C compiler

Well, in that case you’ll find yourself restricted to using a subset of C89 and C99. You’ll also have more problems to deal with than you realize: there are a lot of things that are perfectly valid C but crash and burn when put through a C++ compiler.
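Two well-known differences can be checked by a C++ compiler directly. This is only a small sample, not a complete list, and the assertions describe C++ behavior; the comments note what C does instead.

```cpp
#include <type_traits>

// C converts void * to any data pointer type implicitly; C++ does not,
// which is why an uncast malloc() result is rejected by a C++ compiler.
static_assert(!std::is_convertible<void *, int *>::value,
              "void * -> int * needs a cast in C++");

// The other direction is implicit in both languages.
static_assert(std::is_convertible<int *, void *>::value,
              "int * -> void * is implicit");

// In C the literal 'a' has type int; in C++ it has type char.
static_assert(sizeof('a') == sizeof(char),
              "character literals are char in C++");

// Identifiers such as `new`, `class`, or `template` are fine in C but
// are keywords in C++, so a C declaration like `int new;` cannot be
// compiled as C++.
```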

If you still prefer to cast malloc()’s return in C, or use sizeof(type), I’m interested as to why, so please, do tell.

Another Exploding Tuple

After watching Andrei Alexandrescu’s talk at GoingNative 2013, I wanted to take a crack at it myself. The presentation covers how to expand a tuple into individual arguments in a function call. Being a Python programmer, I’m a little spoiled by func(*args), so the ability to do this in C++11 is something I’m eager to use. What I came up with wound up being quite similar, but more flexible. I wanted to make it more generic, to work with std::pair and std::array. The version presented in that video is incredibly powerful, but it can go a bit further.

The limitations start at the top level, the explode free function.

template <class F, class... Ts>
auto explode(F&& f, const tuple<Ts...>& t)
    -> typename result_of<F(Ts...)>::type
{
    return Expander<sizeof...(Ts),
      typename result_of<F(Ts...)>::type,
      F,
      const tuple<Ts...>&>::expand(f, t);
}

Taking the argument as a tuple<Ts...>& makes Ts... available, which allows result_of to figure out the return type and sizeof... to determine the size of the tuple. Both can be accomplished by other means. decltype can figure out the return type; it needs more typing, but removes the need for result_of. As for sizeof..., std::tuple_size reaches the same end, and using it makes explode non-variadic. Taking a universal reference, rather than capturing the parameter pack, means different versions for lvalue and rvalue refs aren’t needed.

My initial function (called expand instead) is:

template <typename Functor, typename Tup>
auto expand(Functor&& f, Tup&& tup)
  -> decltype(Expander<std::tuple_size<typename std::remove_reference<Tup>::type>::value, Functor, Tup>::call(
        std::forward<Functor>(f),
        std::forward<Tup>(tup)))
{
    return Expander<
        std::tuple_size<typename std::remove_reference<Tup>::type>::value, 
        Functor, 
        Tup>::call(
          std::forward<Functor>(f),
          std::forward<Tup>(tup));
}

Some things to note:

  1. std::tuple_size works on std::pair (yielding 2) and on std::array (yielding the size of the array).
  2. std::get also supports std::pair and std::array, meaning that now tuple, pair, and array can all work in this context.
  3. std::remove_reference is needed for calling std::tuple_size because tup is a universal reference, and Tup may deduce to an lvalue reference type
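The three notes above can be checked in isolation. In this sketch, tuple_like_size is a hypothetical helper that mirrors the std::tuple_size expression used in expand.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// std::tuple_size is specialized for all three tuple-like types.
static_assert(std::tuple_size<std::tuple<int, double, char>>::value == 3,
              "tuple");
static_assert(std::tuple_size<std::pair<const char *, int>>::value == 2,
              "pair");
static_assert(std::tuple_size<std::array<int, 5>>::value == 5, "array");

// Hypothetical helper. When an lvalue is passed, Tup deduces to a
// reference type, so the reference has to be stripped before
// std::tuple_size can be applied.
template <typename Tup>
constexpr std::size_t tuple_like_size(Tup&&) {
  return std::tuple_size<
      typename std::remove_reference<Tup>::type>::value;
}
```

std::get<1>(p) on a pair and std::get<2>(arr) on an array work the same way as on a tuple, which is what lets a single Expander handle all three.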

The decltype goes through each level of the expansion, until much like the original, it hits a base case and does the call.

#include <cstddef>
#include <tuple>
#include <utility>
#include <type_traits>
#include <array>

template <std::size_t Index, typename Functor, typename Tup>
struct Expander {
  template <typename... Ts>
  static auto call(Functor&& f, Tup&& tup, Ts&&... args)
    -> decltype(Expander<Index-1, Functor, Tup>::call(
        std::forward<Functor>(f),
        std::forward<Tup>(tup),
        std::get<Index-1>(tup),
        std::forward<Ts>(args)...))
  {
    return Expander<Index-1, Functor, Tup>::call(
        std::forward<Functor>(f),
        std::forward<Tup>(tup),
        std::get<Index-1>(tup),
        std::forward<Ts>(args)...);
  }
};

template <typename Functor, typename Tup>
struct Expander<0, Functor, Tup> {
  template <typename... Ts>
  static auto call(Functor&& f, Tup&&, Ts&&... args)
    -> decltype(f(std::forward<Ts>(args)...))
  {
    static_assert(
      std::tuple_size<
          typename std::remove_reference<Tup>::type>::value
        == sizeof...(Ts),
      "tuple has not been fully expanded");
    // actually call the function
    return f(std::forward<Ts>(args)...);
  }
};

template <typename Functor, typename Tup>
auto expand(Functor&& f, Tup&& tup)
  -> decltype(Expander<std::tuple_size<
      typename std::remove_reference<Tup>::type>::value, 
      Functor,
      Tup>::call(
        std::forward<Functor>(f),
        std::forward<Tup>(tup)))
{
  return Expander<std::tuple_size<
      typename std::remove_reference<Tup>::type>::value, 
      Functor,
      Tup>::call(
        std::forward<Functor>(f),
        std::forward<Tup>(tup));
}

A few examples showing the flexibility.

// trivial bodies so the example links
int f(int, double, char) { return 0; }
int g(const char *, int) { return 0; }
int h(int, int, int) { return 0; }

int main() {
    expand(f, std::make_tuple(2, 2.0, '2'));

    // works with pairs
    auto p = std::make_pair("hey", 1);
    expand(g, p); 

    // works with std::arrays
    std::array<int, 3> arr = {{1,2,3}};
    expand(h, arr);
}

Each level of the call takes one argument at a time off the back of the tuple using std::get and the template Index parameter, decrements the index, and recurses. This is a bit hard to imagine, so I’ll illustrate. This sequence is not meant to be taken too literally.

Let’s say I have a tuple of string, int, char, and double. I’ll denote this example tuple as tuple("hello", 3, 'c', 2.0). The expansion would happen something like the following:

expand(f, tuple("hello", 3, 'c', 2.0)) 
-> call<4>(f, tuple("hello", 3, 'c', 2.0))
-> call<3>(f, tuple("hello", 3, 'c', 2.0), 2.0)
-> call<2>(f, tuple("hello", 3, 'c', 2.0), 'c', 2.0)
-> call<1>(f, tuple("hello", 3, 'c', 2.0), 3, 'c', 2.0)
-> call<0>(f, tuple("hello", 3, 'c', 2.0), "hello", 3, 'c', 2.0)
-> f("hello", 3, 'c', 2.0)

Of course std::integer_sequence in C++14 turns all of this on its head. Maybe I should’ve implemented that instead…