Universal print syntax for Python 2 and 3

Often I see python programmers in python2 writing code like

print ('Hello World')

claiming “this will work in both python 2 and 3!” This is a noble pursuit, but the approach is misguided. In python3 the above is a call to the print() function with a single argument; in python2 it is a print statement followed by a parenthesized expression. Once there are two arguments, though, it fails:

print ('Hello', 'World')

Under python3 this works as expected, but under python2 there is only one object following the print, and that object is a tuple. Python2 prints the tuple as-is:

('Hello', 'World')

So this piece of common misinformation is easily dismissed: the above is by no means universal, since we can’t have multiple arguments. A real attempt at a universal print function means there can only be one object that is printed. The next step is to use join.

print (' '.join(('Hello', 'World')))

Great! Call join with a tuple of ‘Hello’ and ‘World’; the space separator does what print normally does. Then wrap that in parentheses to get back to step one (something that works with the print function). We also get the ability to use a custom separator, where python2’s print statement only allows a space. It’s starting to seem like a lot of different things are happening, and that leaves room for human error. But there is a bigger problem. What happens when something other than a string is passed?

print (' '.join(('Hello', 'World', 2)))

join only works with strings, so now we have the error (this is python2’s wording; python3 says “expected str instance” instead)

TypeError: sequence item 2: expected string, int found

but that’s easily fixed by tossing a generator expression inside the join call to convert each item in the tuple to a string.

print (' '.join(str(e) for e in ('Hello', 'World', 2)))

Voila! A cross-python print function in just 5 easy steps:

  1. put all of your arguments inside a tuple
  2. wrap that in a generator expression to convert them all to strings
  3. put that inside a call to ' '.join()
  4. wrap that with parentheses
  5. insert a print in front

Though there are two problems that remain: printing without a newline and printing to a file. Printing to a file can be achieved by changing stdout during the print call. Assuming we have a file object f

import sys

out, sys.stdout = sys.stdout, f
print (' '.join(str(e) for e in ('Hello', 'World', 2)))
sys.stdout = out

Change stdout to be f during the print call, then change it back after. I would strongly recommend putting that inside of a context manager (contextlib.redirect_stdout exists in the stdlib as of python3.4) in case printing raises an exception.

If you can figure out a way to equate end='' with a trailing comma let me know.

Though at this level of complexity you might as well be using sys.stdout.write since getting equality means throwing out most of the convenience that print offers.

What you should take away from this is that there isn’t a simple way to write print syntax that works with both python2 and python3. If it’s available, in python2 I generally recommend

from __future__ import print_function

so you can just use the real python3 print syntax in python2. If it’s not, then write your own print function. If none of this is an option, just use one or the other! 2to3 can convert it someday if you ever need to.

If you’re considering using what I’ve demonstrated here, please realize that this post is meant to show that trying to accomplish a universal print syntax is a mess. Don’t do this in real code.

Is the proposed fold syntax for c++17 too limited?

After reading this paper and this cppreference page, and then posting a Stack Overflow question about it, I went ahead and built the latest clang from source to start messing around with fold expressions. Writing a sum() function is straightforward.

template <typename... Ts>
auto sum(Ts&&... args) {
  return (args + ...);
}
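
One wrinkle with the version above: a unary fold over + is ill-formed when the pack is empty, so sum() cannot be called with no arguments. A minimal sketch of the binary-fold form, which supplies an initial value. The name sum_or_zero is hypothetical, and this assumes a compiler with fold-expression support.

```cpp
// Hypothetical helper (not from the proposal text): a binary left fold.
// (0 + ... + args) expands to ((0 + a) + b) + c, and to just 0 for an
// empty pack, where the unary form (args + ...) would not compile.
template <typename... Ts>
constexpr auto sum_or_zero(const Ts&... args) {
    return (0 + ... + args);
}

static_assert(sum_or_zero() == 0, "empty pack yields the initial value");
static_assert(sum_or_zero(1, 2, 3) == 6, "ints are summed left to right");
```

The initial value also takes part in type deduction, so an int 0 is only a reasonable choice when the arguments are arithmetic types.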

The syntax is nice, but imo a bit limited. The next thing I wanted was a variadic min() function. This is where the proposed syntax starts to seem weak. Obviously the outline of the function should be easy:

template <typename... Ts>
auto min(Ts&&... args) {
  return // something
}

but there’s no builtin operator in C++ that takes two things and returns the lesser of the two. It’s a pretty easy function to write, but it doesn’t work with the fold syntax. It surprises me that folds would be proposed without a way to have them work with functions. They’re functional in origin, after all.

Here is my workaround

#include <type_traits>  // std::common_type_t

template <typename T>
struct MinWrapper {
  const T& obj;
};

template <typename T, typename U, typename V=std::common_type_t<T,U>>
MinWrapper<V> operator%(const MinWrapper<T>& lhs, const MinWrapper<U>& rhs) {
  return {lhs.obj < rhs.obj ? lhs.obj : rhs.obj};
}  

template <typename... Ts>
auto min(Ts&&... args) {
  return (MinWrapper<Ts>{args} % ...).obj;
}

Okay, the MinWrapper is just a struct that binds a reference. This just gives me a type to define an operator on. After that I define operator% just because modulus seems to be the thing people use when they’re doing something terrible. The mod operator compares the items in each wrapper, and returns a wrapper with their common type. This actually does work in clang’s current head believe it or not.

My question is: if I thought of (nearly) this monstrosity before I even had a compiler in front of me to run it with, is this going to be a pattern that emerges once folds make it into the language for real? Probably.

The solution would be to extend the syntax to allow not only operators, but binary functions as well. This would let me write

template <typename... Ts>
auto min(Ts&&... args) {
  auto sel_min =
    [](auto&& lhs, auto&& rhs) -> auto&& 
      {return lhs < rhs ? lhs : rhs;};
  return (args sel_min ...);
}

not the prettiest lambda either, but imo, much better than what I’m doing with the current syntax.
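
For comparison, here is a sketch of a variadic min that stays within the operator-only fold syntax, without a wrapper type, by folding over the comma operator. The name min_of is hypothetical. Unlike the lambda above it takes and returns values rather than references, which is an assumption made to keep the example short and to avoid dangling references when the argument types differ.

```cpp
#include <type_traits>

// Hypothetical helper: keeps a running minimum and updates it once per
// argument. The comma fold evaluates its operands left to right.
template <typename T, typename... Ts>
constexpr std::common_type_t<T, Ts...> min_of(T first, Ts... rest) {
    std::common_type_t<T, Ts...> result = first;
    ((result = (rest < result ? rest : result)), ...);
    return result;
}

static_assert(min_of(3, 1, 2) == 1, "smallest of three ints");
static_assert(min_of(7) == 7, "a single argument is its own minimum");
```

It works, but the binary function is still smuggled in through an assignment expression, which arguably supports the point that folds over named functions would read better.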

super generic functions in c++14

Since C++14 lambdas now allow auto parameters, function objects can be pretty damn generic.

[](auto&& arg1, auto&& arg2) { return arg1 < arg2; }

Which got me thinking: what’s to stop me from defining all of my functions as lambdas? For example, this function, which takes an iterable and prints it to stdout.

#include <iostream>
#include <vector>

auto print_iterable = [](auto&& iterable) {
    for (auto&& e : iterable) {
        std::cout << e << '\n';
    }
};

int main() {
    std::vector<int> v{1,7,2,0};
    print_iterable(v);
}

This function can deduce its argument types and return type. There’s no return here of course, but there’s nothing stopping me from making this function more complex.
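
As a sketch of a slightly more complex case, here is a generic lambda whose return type is deduced from its return statement. The names sum_iterable and sum_iterable_demo are hypothetical, and the lambda assumes the element type is default-constructible and supports +=.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical example: no template header and no named return type.
auto sum_iterable = [](auto&& iterable) {
    // Element type with references and const stripped off.
    using value_type = std::decay_t<decltype(*std::begin(iterable))>;
    value_type total{};
    for (auto&& e : iterable) {
        total += e;
    }
    return total;  // return type deduced as value_type
};

inline void sum_iterable_demo() {
    std::vector<int> v{1, 7, 2, 0};
    assert(sum_iterable(v) == 10);
}
```

The one place a type still has to be spelled out is the accumulator, and even that is derived from the argument with decltype rather than written by hand.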

I haven’t thought too much about the implications of this, but it seems like I could get away without naming variable types or using templates. I’m not saying this is a good idea, but it’s an interesting one.

C++ Antipatterns: The Java constructor (and final vs const)

It seems Java (or C#) constructors are becoming increasingly pervasive in novice C++ code. A demo of what I mean:

class Point {
  public:
    Point(int a, int b) {
      this->x_ = a;
      this->y_ = b;
    }
    //...
  private:
    int x_;
    int y_;
};

This looks normal for programmers coming from a java background, but for those with C++ experience, it’s awkward that the variables x_ and y_ are initialized inside the constructor body. C++ constructors using initialization lists are more correct. They go between the ) and { of the constructor, begin with a colon, and have each part separated by a comma. The style I lean towards is as follows:

Point(int a, int b)
    : x_{a},
    y_{b}
{ }

The difference is that rather than having x_ and y_ be default initialized and then assigned, they are initialized directly with the values of a and b. (For ints, default initialization leaves the value indeterminate, but it is still a separate step.) Every member must already be initialized by the time the constructor body is entered, because the this object must exist at that point.

Performance

Okay, so dealing with a couple of ints, no big deal. Let’s look at a class with a string

class Person {
  public:
    Person(const std::string& n) {
      this->name_ = n;
    }
  private:
    std::string name_;
};

The semantics of this are different from what novices often think. What happens is that name_ is default constructed before the constructor body runs, and is then assigned to, and overwritten, inside the body. This means that before being assigned n, name_ already has a value! It’s the empty string! Constructing and assigning is slower than just copy constructing, so there’s a slight performance hit.

The correct constructor is, as expected

Person(const std::string& n)
    : name_{n}
{ }
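
To make the difference observable, here is a minimal sketch with a member type that counts which of its special member functions run. Tracer, AssignInBody, InitList and trace_demo are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical member type that records how it was constructed or assigned.
struct Tracer {
    static int default_ctors;
    static int copy_ctors;
    static int copy_assigns;

    Tracer() { ++default_ctors; }
    Tracer(const Tracer&) { ++copy_ctors; }
    Tracer& operator=(const Tracer&) { ++copy_assigns; return *this; }

    static void reset() { default_ctors = copy_ctors = copy_assigns = 0; }
};
int Tracer::default_ctors = 0;
int Tracer::copy_ctors = 0;
int Tracer::copy_assigns = 0;

// Java style: member_ is default constructed, then assigned in the body.
class AssignInBody {
  public:
    AssignInBody(const Tracer& t) { this->member_ = t; }
  private:
    Tracer member_;
};

// C++ style: member_ is copy constructed directly from t.
class InitList {
  public:
    InitList(const Tracer& t) : member_{t} { }
  private:
    Tracer member_;
};

inline void trace_demo() {
    Tracer source;

    Tracer::reset();
    AssignInBody a{source};
    (void)a;
    // Two operations: one default construction plus one assignment.
    assert(Tracer::default_ctors == 1 && Tracer::copy_assigns == 1);

    Tracer::reset();
    InitList b{source};
    (void)b;
    // One operation: a single copy construction.
    assert(Tracer::copy_ctors == 1 && Tracer::default_ctors == 0);
}
```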

Objects without default constructors

Novices don’t care too much up to this point because strings and ints both can be default constructed, and strings behave very much like POD objects. However, there are many objects in C++ which don’t have default constructors at all! An example? The Person class we just made!

How about a class which has a Person in it?

class Couple {
  public:
    // fails before entering { }
    Couple(const Person& p1, const Person& p2) {
      this->person1_ = p1;
      this->person2_ = p2;
    }
  private:
    Person person1_;
    Person person2_;
};

This can’t work, because the Person class doesn’t have a default constructor. The compiler error will resemble error: no matching function for call to Person::Person(). The assignments are valid in this case, but the construction just doesn’t work. We have to use C++ style:

Couple(const Person& p1, const Person& p2)
    : person1_{p1}, // works, uses implicit copy ctor
    person2_{p2}
{ }

Beautiful!
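
Putting the pieces together, here is a sketch of both classes in a form that compiles on its own. The static_asserts are an addition for illustration: they check at compile time that Person really has no default constructor, and that Couple can still be built from two Person objects.

```cpp
#include <string>
#include <type_traits>

class Person {
  public:
    Person(const std::string& n)
        : name_{n}
    { }
  private:
    std::string name_;
};

// Declaring any constructor suppresses the implicit default constructor.
static_assert(!std::is_default_constructible<Person>::value,
              "Person cannot be default constructed");

class Couple {
  public:
    Couple(const Person& p1, const Person& p2)
        : person1_{p1},  // implicit copy constructor of Person
        person2_{p2}
    { }
  private:
    Person person1_;
    Person person2_;
};

static_assert(std::is_constructible<Couple, const Person&, const Person&>::value,
              "Couple can be built from two Person objects");
```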

const, and why it’s different from final

Another case to consider: in java one may have a final instance variable, signalling that once the reference has been assigned (by the constructor) it cannot be reassigned. Some incorrectly draw a parallel between const and final, but they aren’t the same.

Let’s consider a Person class where we also have a string for their social security number. This number can’t change, so we want it to be const. The java-style constructor fails here.

class Person {
  public:
   Person(const std::string& n, const std::string& s) {
     this->name_ = n;
     this->ssn_ = s; //fails, can't assign to const var
   }
  private:
   std::string name_;
   const std::string ssn_;
};

Remember, the semantics differ: this code default constructs ssn_ and then assigns to it. However, we can’t assign to const variables; we can only construct them (or if you prefer, initialize them). Instead we must use the correct C++ constructor style.

Person(const std::string& n, const std::string& s)
    : name_{n},
    ssn_{s} // works, construction not assignment!
{ }

Since C++ novices often don’t care about constness, it’s hard to convince them to take this last one seriously.

Other cases

There are other types of objects that cause problems: objects without an assignment operator (such as the Person with a const ssn), objects where default construction followed by assignment does get expensive (std::array for example), and objects that can be moved but not copied, which need to be handled differently.
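
The first of those cases can be checked at compile time. A minimal sketch, assuming the Person with a const ssn_ member from the previous section: the const member makes the implicitly generated assignment operators deleted, while copy construction keeps working.

```cpp
#include <string>
#include <type_traits>

class Person {
  public:
    Person(const std::string& n, const std::string& s)
        : name_{n},
        ssn_{s}
    { }
  private:
    std::string name_;
    const std::string ssn_;
};

// Construction from another Person is fine: ssn_ is initialized, not assigned.
static_assert(std::is_copy_constructible<Person>::value,
              "copying constructs a fresh ssn_");

// Assignment would have to overwrite the const member, so it is unavailable.
static_assert(!std::is_copy_assignable<Person>::value,
              "const member deletes copy assignment");
static_assert(!std::is_move_assignable<Person>::value,
              "const member deletes move assignment too");
```

Any class holding such a Person as a member therefore has no way to use the assign-in-the-body style for it, even if a default constructor were added.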

This is not an academic exercise; the problems with java-style constructors are very real.

Applying an operation on each element in a tuple

I ran into this when implementing the C++14 version of zip for cppitertools.

Given a tuple of iterators, increment each of those iterators using ++. Assuming I already have a std::index_sequence with the indices as Is..., I initially had something like this:

void increment_all() {
    ++std::get<Is>(this->tup)...;
}

However, this doesn’t work because the expansion cannot appear outside of a function call or initializer list. This can be worked around with a function that absorbs everything. My original implementation was to make it variadic like so:

template <typename... Ts>
void absorb(Ts&&...) { }

This isn’t really necessary though, since we have the older C-style ... (an ellipsis parameter) that works just as well. The below meets the requirements:

//void absorb(...) {} //UPDATE DON'T USE THIS READ BELOW
 
void increment_all() {
    absorb(++std::get<Is>(this->tup)...);
}

And this works for my purposes. Note that this is a function call, so the order in which the iterators will be incremented is unspecified. If it’s important to you for it to happen in a 0, 1, …, N order, then you can use an initializer list if all of the operations will return the same type, or use other solutions if they don’t.
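
A sketch of the initializer list option, for the case where a fixed 0, 1, ..., N order matters. IterTuple is a hypothetical stand-in for the real zip iterator, not the cppitertools code. Pairing each increment with a 0 through the comma operator gives every element the same type (int), so the same-type restriction is sidestepped.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a zip iterator: it only holds a tuple of iterators.
template <typename... Iters>
class IterTuple {
  public:
    explicit IterTuple(Iters... its) : tup{its...} { }

    void increment_all() {
        increment_impl(std::index_sequence_for<Iters...>{});
    }

    template <std::size_t I>
    auto& get() { return std::get<I>(tup); }

  private:
    template <std::size_t... Is>
    void increment_impl(std::index_sequence<Is...>) {
        // Elements of a braced initializer list are evaluated left to right,
        // so the increments run in index order. The void cast guards against
        // an iterator type with an overloaded comma operator.
        (void)std::initializer_list<int>{
            (void(++std::get<Is>(this->tup)), 0)...};
    }

    std::tuple<Iters...> tup;
};

inline void iter_tuple_demo() {
    std::vector<int> a{1, 2, 3};
    std::vector<char> b{'x', 'y', 'z'};
    IterTuple<std::vector<int>::iterator, std::vector<char>::iterator>
        it{a.begin(), b.begin()};
    it.increment_all();
    assert(*it.get<0>() == 2);
    assert(*it.get<1>() == 'y');
}
```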

UPDATE: I reverted to using the templated absorb because the version with just ... will attempt to make copies of its input arguments. This is not what I ever want to happen. Even though the copies would be optimized out, it still prevents the function from working with non-copyable (in some cases, only non-movable) types.
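
A small sketch of why the templated version is the safer choice: its forwarding-reference parameters bind to whatever ++ returns, so nothing needs to be copyable or movable. Counter and absorb_demo are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Swallows any arguments by forwarding reference; nothing is copied or moved.
template <typename... Ts>
void absorb(Ts&&...) { }

// Hypothetical type that can be neither copied nor moved.
struct Counter {
    int value = 0;
    Counter() = default;
    Counter(const Counter&) = delete;
    Counter& operator=(const Counter&) = delete;
    Counter& operator++() { ++value; return *this; }
};

inline void absorb_demo() {
    Counter a, b;
    // Each ++ returns Counter&, which binds directly to a Ts&& parameter.
    absorb(++a, ++b);
    assert(a.value == 1);
    assert(b.value == 1);
}
```

With a C-style ellipsis parameter the same call would need to pass each Counter by value, which the deleted copy constructor rules out.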