Lecture 11: Functions as objects (function pointers), a.k.a. functional programming

\( \newcommand\bigO{\mathrm{O}} \)

These notes are incomplete: they don't cover captures inside lambdas. The "Lambda Capture" section of this page is a very detailed description of captures inside lambdas. As a quick summary, a capture lets the lambda use variables that are defined outside of it, and there are two kinds of captures: by value (the lambda gets its own copy of the variable) and by reference. We can also specify a default capture mode (= for capture by value, and & for capture by reference). The capture specifiers go inside the square brackets in the lambda expression. For example, the following lambda captures foo by reference, and everything else by value:

[=, &foo](...) {}
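
To see the difference between the two capture modes, here is a minimal sketch. The names CaptureDemo, step, and add_step are made up for this illustration; only the capture list [=, &foo] comes from the example above.

```cpp
#include <cassert>

// Hypothetical demo: returns the final value of foo.
int CaptureDemo() {
  int foo = 0;   // captured by reference: the lambda modifies this variable
  int step = 5;  // captured by value: the lambda stores its own copy (5)

  auto add_step = [=, &foo]() { foo += step; };

  add_step();
  assert(foo == 5);

  step = 100;    // changes only the outer variable, not the lambda's copy
  add_step();
  assert(foo == 10);  // the lambda still added 5, not 100

  return foo;
}
```

The copy of step is made when the lambda is created, not when it is called, which is why the second call still adds 5.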

A lot of programming is noticing repetition or low-level details, and abstracting over them. This is what allows us to build complex programs like a web browser or a compiler without worrying about details like how the variables are shared between the registers and the stack. We have been climbing the ladder of abstraction with the programming constructs we have used so far:

Classes/structs allowed us to treat a conglomeration of objects (the fields) and associated data as a single object. This is how we built the abstract data types to write code that is agnostic to implementation details of an object.
With inheritance and dynamic dispatch, we were abstracting over the actual type of an object, and picking the appropriate methods to call at runtime automatically. Again, we are able to write code that is agnostic to the specifics of an object.

In this lecture, we will continue the same thread and abstract over functions. Although I have "functional programming" in the title, this is just an introduction. While it is useful, it is only the tip of the iceberg. If you like the programming style here, I urge you to check a language where functional programming is a first-class citizen.

1. An example with some repetition

Suppose we are extending our awesome math library from the error handling lecture and decide to add functions for computing squares. This is the code we have now:

struct SqrtError {};

// Compute square root using Newton's method, the implementation does not matter
// (that's the point of this lecture!).
double Sqrt(double a) {
  if (a < 0) {
    throw SqrtError{};
  }
  if (a == 0) {
    return 0;
  }
  double x = 1;
  double b = a;
  // pick a good estimate, this is not the fastest way to do it
  if (a > 1) {
    while (b > 1) {
      b /= 4;
      x *= 2;
    }
  } else if (a < 1) {
    while (b < 1) {
      b *= 4;
      x /= 2;
    }
  }
  for (int i = 0; i < 7; ++i) {
    x = (x + (a / x) ) / 2;
  }
  return x;
}

// other math functions
double Abs(double x) {
  if (x == 0) return +0.0;
  if (x < 0) return -x;
  return x;
}

double Square(double x) {
  return x * x;
}

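// note: despite the name, the cast truncates toward zero (2.718 becomes 2,
// -2.718 becomes -2) rather than rounding to the nearest integer
int Round(double x) {
  return (int)x;
}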

Let's also add a couple of helpers to make our lives easier:

template<class It>
void Print(It begin, It end) {
  for (; begin != end; ++begin) {
    cout << *begin << '\t';
  }
  cout << '\n';
}

template<class Vector>
void PrintCollection(const string & name, Vector v) {
  cout << name << ":\t";
  Print(v.begin(), v.end());
}

Now, here is a program that uses the functions above:

// the goal of this program is to apply each function to some inputs
int main(int argc, char * argv[]) {
  vector<double> v{-2.718, -1, 0, 1, 2, 3, 4, 3.14};
  // create a second vector without negatives for square root
  vector<double> v2{0, 1, 2, 3, 4, 3.14, 36, 64, 1e32};

  vector<double> squares;
  for (double x : v) {
    squares.push_back(Square(x));
  }
  vector<double> absolutes;
  for (double x : v) {
    absolutes.push_back(Abs(x));
  }
  vector<int> converted_to_ints;
  for (double x : v) {
    converted_to_ints.push_back(Round(x));
  }
  vector<double> square_roots;
  for (double x : v2) { // note that this uses v2
    square_roots.push_back(Sqrt(x));
  }

  // print the output
  PrintCollection("squares", squares);
  PrintCollection("vector 1", v);
  PrintCollection("absolutes", absolutes);
  PrintCollection("converted ints", converted_to_ints);
  PrintCollection("vector 2", v2);
  PrintCollection("square roots", square_roots);


  return 0;
}

The code above has a bunch of repeated loops. How can we merge them so the code is easier to manage? If all of them were over v, we could perhaps merge the loop bodies, but the last loop is over v2. The simplest way to abstract repeated code that is parametric over some values is to create a function. For example, if we had a bunch of expressions like x * x - x and y * y - y, we could create a function int f(int a) { return a * a - a; } to abstract over them. So, what are the parameters that change between the loops? The input vector and the function we apply. So, we need a function wrapping the following code snippet:

vector<double> output;
for (double x : INPUT) {
  output.push_back(FUNCTION_TO_APPLY(x));
}
return output;

1.1. Function types

So, we need types for INPUT and FUNCTION_TO_APPLY to turn this snippet to a true function. INPUT will be a vector<double> (that is the type of v and v2), and FUNCTION_TO_APPLY should have the same type as, say, Square. What is the type of Square? Deductively, its type needs to contain two bits of information:

The input types, so that the compiler can notice that Square("foo") is a type error.
The output type, so that the compiler can, again, notice that string name = Square(42) is a type error.

So, we need a type that contains both of them. In C++, the type of a pointer to a function is written like so:

return_type (*)(param_type1, param_type_2, ...)
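
To make the notation concrete, here is a small sketch that stores the address of a function in a variable and calls the function through it. Triple, Negate, and ApplyTwice are hypothetical names made up for the illustration. Note that when we declare a variable or a parameter of this type, its name goes inside the parentheses, right after the *.

```cpp
#include <cassert>

int Triple(int x) { return 3 * x; }
int Negate(int x) { return -x; }

// `f` is a pointer to a function that takes an int and returns an int.
int ApplyTwice(int (*f)(int), int x) {
  return f(f(x));
}

void FunctionPointerDemo() {
  int (*op)(int) = Triple;  // a function name converts to the function's address
  assert(op(7) == 21);

  op = &Negate;             // writing the & explicitly is also allowed
  assert(op(7) == -7);

  assert(ApplyTwice(Triple, 2) == 18);  // Triple(Triple(2))
}
```

The same variable op can point to different functions over its lifetime, as long as their parameter and return types match.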

So, the type we use for Square is double (*)(double). Notice that this looks a lot like its signature, double Square(double), except that the function name is replaced with (*) to denote that this is a pointer to the address of the function's code, i.e. a function pointer. So, what we are actually going to pass around is the address of the function, and the C++ compiler will generate the code to jump to that address to call the function. With all of this at hand, here is a first version of our function. It is conventionally named Map because it maps each element to the result:

// an alternative signature is
//  vector<double> Map(const vector<double> & input,
//                     double function_to_apply(double))

// in a parameter declaration, the name goes inside the parentheses, after the *
vector<double> Map(const vector<double> & input,
                   double (*function_to_apply)(double)) {
  vector<double> output;
  for (double x : input) {
    output.push_back(function_to_apply(x));
  }
  return output;
}

The function signature above looks pretty awful. We can prettify it a bit using a type alias:

// this statement creates a new alias for the type `double(*)(double)`
using double_to_double = double(*)(double);
// we could also use a typedef like below, but it is harder to read IMO.
// typedef double (*double_to_double)(double);

vector<double> Map(const vector<double> & input,
                   double_to_double function_to_apply) {
  vector<double> output;
  for (double x : input) {
    output.push_back(function_to_apply(x));
  }
  return output;
}

Map is an example of a higher-order function: it takes a function as an input and calls it, so it operates at a higher level than the function it receives. All the functions we wrote before this point were first-order functions (they did not take any functions as input, so they sit at the bottom of this hierarchy). Now, we can clean up our main function a bit:

// ...
vector<double> squares = Map(v, Square);
vector<double> absolutes = Map(v, Abs);
vector<int> converted_to_ints;
for (double x : v) {
  converted_to_ints.push_back(Round(x));
}
vector<double> square_roots = Map(v2, Sqrt);
// ...

Much nicer! Note that we haven't rewritten the code that computes converted_to_ints because this version of Map works only on functions that convert a double to another double, but here we need to return a vector<int>. Time for more abstraction!

2. `std::function` and mixing templates with function pointers

We can generalize our Map function to work on arbitrary inputs and outputs using templates:

// In is the parameter type of the function we are expecting, Out is the return type
template<class In, class Out>
vector<Out> Map(const vector<In> & input,
                Out (*function_to_apply)(In)) {
  vector<Out> output;
  for (const In & x : input) { // the element type is In, not necessarily double
    output.push_back(function_to_apply(x));
  }
  return output;
}

The code above is good enough for us. While at it, we may want to enable some conversions for the function types (for example, if our function returns a short and a short can be automatically converted to an int, we may want to apply it to generate an int as well). The standard library (but not the language!) ships with a type to represent functions that has these conveniences built-in: std::function. Let's rewrite Map using it:

#include <functional>

template<class In, class Out>
vector<Out> Map(const vector<In> & input,
                function<Out(In)> function_to_apply) {
  vector<Out> output;
  for (const In & x : input) { // the element type is In, not necessarily double
    output.push_back(function_to_apply(x));
  }
  return output;
}
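
As a rough illustration of the conversions mentioned above, the sketch below stores a function that works on short inside a std::function<int(int)>, and then stores a lambda with a capture in the same type. Halve and StdFunctionDemo are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <functional>

// Hypothetical function that takes and returns a short.
short Halve(short x) { return static_cast<short>(x / 2); }

void StdFunctionDemo() {
  // A plain function pointer must match exactly, so
  //   int (*p)(int) = Halve;
  // does not compile. std::function only needs the argument and the result
  // to be convertible.
  std::function<int(int)> f = Halve;
  assert(f(10) == 5);

  // It can also hold a lambda that captures a variable, which a plain
  // function pointer cannot do.
  int offset = 3;
  std::function<int(int)> g = [offset](int x) { return x + offset; };
  assert(g(4) == 7);
}
```

This flexibility is not free: calling through a std::function usually involves an extra indirection compared to calling the function directly.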

Now, we can rewrite our main function to have no loops:

vector<double> squares = Map<double, double>(v, Square);
vector<double> absolutes = Map<double, double>(v, Abs);
vector<int> converted_to_ints = Map<double, int>(v, Round);
vector<double> square_roots = Map<double, double>(v2, Sqrt);

We need to give Map the template arguments explicitly because the compiler cannot infer them in this case: Square is a plain function, not a function<Out(In)>, and template argument deduction does not take that conversion into account. Now that we have our Map function all well-rounded, we can also improve its implementation: we can allocate the space for the output all at once to reduce the number of allocations.

#include <functional>

template<class In, class Out>
vector<Out> Map(const vector<In> & input,
                function<Out(In)> function_to_apply) {
  vector<Out> output;
  output.reserve(input.size()); // the line we just added to do only 1 allocation
  for (const In & x : input) { // the element type is In, not necessarily double
    output.push_back(function_to_apply(x));
  }
  return output;
}
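
If spelling out <double, int> at every call gets tedious, one possible alternative is to make the type of the function itself a template parameter and let the compiler work out the output type. The sketch below is only one way to do this, under the assumption that C++14 or later is available. MapDeduced and Truncate are hypothetical names, not part of the lecture code.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical variant of Map: F is the type of the callable, and the output
// element type is whatever calling f on an In produces.
template<class In, class F>
auto MapDeduced(const std::vector<In> & input, F f) {
  // decltype does not evaluate its operand, so this is safe for empty input
  using Out = std::decay_t<decltype(f(input.front()))>;
  std::vector<Out> output;
  output.reserve(input.size());
  for (const In & x : input) {
    output.push_back(f(x));
  }
  return output;
}

int Truncate(double x) { return static_cast<int>(x); }

void MapDeducedDemo() {
  std::vector<double> v{1.5, 2.5, -3.5};
  std::vector<int> ints = MapDeduced(v, Truncate);  // no <double, int> needed
  assert((ints == std::vector<int>{1, 2, -3}));

  auto doubled = MapDeduced(v, [](double x) { return 2 * x; });
  assert((doubled == std::vector<double>{3.0, 5.0, -7.0}));
}
```

The trade-off is that the signature no longer documents what kind of function is expected; a mismatched callable shows up as an error inside the template body instead.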

3. Higher-order functions in the standard library

Note: all functions here are parts of the <algorithm> library or <numeric> library. So, you need to include those headers.

Map is a very generic function, and it would be useful whenever we need to apply an unknown transformation over a bunch of things. The standard library of course has its version of it called std::transform with some differences:

it uses iterators, so it works on any type of collection (linked lists, vectors, sets, …) that has an iterator interface.
it takes the function as a template parameter, so the compiler can deduce the template arguments and we don't need to specify them most of the time.
rather than creating its own output, it requires an output iterator that points to pre-allocated space (so, we create the vector first, then call std::transform).

Without further ado, here is how we would compute converted_to_ints using std::transform (compare it to how we called Map earlier):

// pre-allocate the space by creating a vector of the same size as the input
vector<int> converted_to_ints(v.size()); 
std::transform(v.begin(), // beginning of input
               v.end(), // end of input
               converted_to_ints.begin(), // beginning of output
               Round); // function to apply

Also note that transform is pretty versatile. We could choose to pass around only a small range of the input by giving it the appropriate iterators. Now, let's look into other higher-order functions we can use:
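
As a small sketch of working on a sub-range, the function below doubles only the elements at positions first up to (but not including) last, and writes the results over the same positions. Twice and DoubleRange are hypothetical names for this illustration, and the function assumes 0 <= first <= last <= v.size().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

int Twice(int x) { return 2 * x; }  // hypothetical function to apply

// Doubles the elements of v at positions [first, last), in place.
void DoubleRange(std::vector<int> & v, int first, int last) {
  std::transform(v.begin() + first,  // beginning of the sub-range
                 v.begin() + last,   // end of the sub-range
                 v.begin() + first,  // output starts at the same position
                 Twice);
}
```

std::transform allows the output to start at the beginning of the input range, so transforming a range in place like this is fine.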

3.1. `std::reduce` and `std::transform_reduce`

Sometimes we want to sum or multiply all elements of a collection to reduce it to a single value. For example, one way of computing the factorial is to have a collection of the first N positive integers and to multiply them. We can write a loop like this to do it:

vector<int> firstN{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int result = 1;
for (int x : firstN) {
  result = result * x;
}

Since eliminating loops is the theme of this lecture, we will abstract over this loop. We can abstract over 3 things:

the initial value 1
the input vector firstN
the multiplication operation result * x

std::reduce (also std::accumulate, check out cppreference for the difference) does exactly this abstraction. So, we can rewrite the code above as:

int times(int a, int b) { return a * b; }

int result = std::reduce(firstN.begin(), firstN.end(), 1, times);

The first two parameters are the input iterators (like transform), the third one is the initial value 1, and the last one is the function we are using to reduce the result. So, the call to std::reduce above is roughly the same as the following (subject to reordering):

times(...times(times(times(times(1, 1), 2), 3) ...), 10)
// which is same as 1 * 1 * 2 * 3 * ... * 10
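
The reordering matters when the operation is not associative or commutative. As a minimal sketch, the code below uses std::accumulate, which always folds from left to right, with subtraction. std::reduce is allowed to regroup and reorder the operations, so it should not be used with an operation like this one. Subtract and SubtractAll are hypothetical names for the illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

int Subtract(int a, int b) { return a - b; }

// Computes Subtract(...Subtract(Subtract(init, v[0]), v[1])..., v[n-1]).
// std::accumulate guarantees this left-to-right order.
int SubtractAll(int init, const std::vector<int> & v) {
  return std::accumulate(v.begin(), v.end(), init, Subtract);
}

void AccumulateOrderDemo() {
  // ((10 - 1) - 2) - 3
  assert(SubtractAll(10, {1, 2, 3}) == 4);
}
```

For multiplication and addition of integers the grouping does not change the result, so either function works for the factorial example.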

3.1.1. Anonymous functions

It is kind of silly to create a function with a name just for multiplication. We can also create an anonymous function like so (here, you can think of [] as taking the place of the function name; we could actually put some other useful things inside the brackets, but that's beyond CS 32):

// this is an anonymous function that computes a * b
[](int a, int b) -> int { return a * b; }

The -> int part above denotes the return type. We can omit it most of the time, and the compiler can infer it for anonymous functions. Let's use our anonymous function in the call to reduce:

int result = std::reduce(firstN.begin(), firstN.end(), 1,
                         [](int a, int b) { return a * b; });

This idea of creating functions on the fly and passing them around is the core of functional programming.

For a * b, there is already a function in the standard library because it is so common: std::multiplies() (it needs the parentheses because std::multiplies is a class, and the parentheses create a function object from it). There is also std::plus() for addition. Both live in the <functional> header.
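
Here is a short sketch of these function objects in use, assuming a C++17 compiler (std::reduce was added in C++17). Product and Sum are hypothetical helper names. The sketch spells out the element type as std::multiplies<int>() and std::plus<int>(), which also works on older standards for the function objects themselves.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Product of all elements; 1 is the neutral starting value for multiplication.
int Product(const std::vector<int> & v) {
  return std::reduce(v.begin(), v.end(), 1, std::multiplies<int>());
}

// Sum of all elements; 0 is the neutral starting value for addition.
int Sum(const std::vector<int> & v) {
  return std::reduce(v.begin(), v.end(), 0, std::plus<int>());
}

void FunctionObjectDemo() {
  assert(Product({1, 2, 3, 4, 5, 6, 7, 8, 9, 10}) == 3628800);  // 10!
  assert(Sum({1, 2, 3, 4}) == 10);
}
```

For an empty input, both functions return their initial value.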

3.1.2. Back to `reduce` and `transform_reduce`

We have seen the two basic operations: transforming a collection, and reducing it. We can also combine these, but we would be creating an unnecessary intermediate collection (output of transform below):

input --[transform]--> intermediate output --[reduce]--> single value

For example, consider this small programming problem: we have a bunch of strings we want to put to a buffer, but we want to allocate the buffer only once, so we need the total size. We can break it into two problems:

Compute the size of each string (transform)
Add the sizes (reduce)

We can write the following snippet to solve this using transform and reduce:

vector<string> names{"Mehmet", "Alice", "John"};
vector<size_t> sizes(names.size()); // we are wasting O(|names|) space here, because we won't need each size individually
transform(names.begin(), names.end(), sizes.begin(),
          // a function that computes the size of a string
          [](const string& name) { return name.size(); });
size_t total_size = reduce(sizes.begin(), sizes.end(),
                           size_t{0}, // initial value; its type is also the type of the result
                           std::plus()); // to add each size

The issue here is that we are wasting so much space while reduce only needs 1 element at a time to apply std::plus(). So, the standard library has a fused version of transform and reduce imaginatively called transform_reduce to perform this common operation without wasted space. We can rewrite the code above as:

vector<string> names{"Mehmet", "Alice", "John"};
size_t total_size =
  transform_reduce(names.begin(), names.end(), // input
                   size_t{0}, // initial value of the result for reduce (a plain 0 would make the result an int)
                   std::plus(), // reducing function
                   [](const string& name) { return name.size(); }); // transform function
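
As another small sketch of the same pattern, the function below computes a sum of squares: the transform step squares each element and the reduce step adds the squares up, without an intermediate vector. It assumes a C++17 compiler, and SumOfSquares is a hypothetical name.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

double SumOfSquares(const std::vector<double> & v) {
  return std::transform_reduce(
      v.begin(), v.end(),               // input
      0.0,                              // initial value; a double, so the result is a double
      std::plus<double>(),              // reducing function
      [](double x) { return x * x; });  // transform function
}
```

Note the argument order: the reducing function comes before the transform function, even though the transform is applied first.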

3.2. `std::sort`, `std::find_if` and `std::remove_if`

Suppose we are implementing an in-memory student roster for something like GauchoSpace, and we have these classes at hand:

#ifndef student_h
#define student_h

#include <string>

struct Person {
  std::string first_name;
  std::string last_name;
};

struct Student : public Person {
  std::string major;

  Student(std::string first_name,
          std::string last_name,
          std::string major) :
    Person{first_name, last_name},
    major(major) {}
};

#endif

And, in our main function, we have the following example roster:

vector<Student> roster{
  Student("Mehmet", "Emre", "CS"),
  Student("Arthur", "Murray", "Dance"),
  Student("Charles", "Darwin", "Biology"),
  Student("Alan", "Turing", "CS"),
};

Normally, we would use a proper database with indices for better complexity, but our roster is small enough so we will implement the functionalities using linear search etc.

3.2.1. Finding a specific element

We may be interested in finding a specific student. std::find is similar to map.find: it finds an element equal to the value we are looking for. We may also be interested in finding an element that satisfies some criteria. For example, we may be interested in finding the first student in the roster whose last name is Emre. To do so, we can use std::find_if:

auto iterator = std::find_if(roster.begin(), roster.end(), //input
                             // a "predicate", find_if returns the first value that makes this function's return value true
                             // if it can't find such a value, it returns roster.end()
                             [](const Student& s) {
                               return s.last_name == "Emre";
                             });
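
Before using the returned iterator, we need to check it against the end of the range. The sketch below wraps that check in a hypothetical helper, FirstNameOf. It uses a simplified stand-in for the Student class so that it compiles on its own, and it captures the last name by reference so that we can search for any name, not just "Emre".

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the Student class above.
struct Student {
  std::string first_name;
  std::string last_name;
  std::string major;
};

// Returns the first name of the first student with the given last name,
// or an empty string if there is no such student.
std::string FirstNameOf(const std::vector<Student> & roster,
                        const std::string & last_name) {
  auto it = std::find_if(roster.begin(), roster.end(),
                         [&last_name](const Student & s) {
                           return s.last_name == last_name;
                         });
  if (it == roster.end()) {
    return "";  // not found; dereferencing `it` here would be undefined behavior
  }
  return it->first_name;
}
```

Returning an empty string for "not found" is just a choice made for this sketch; returning the iterator itself, as find_if does, keeps more information.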

We can combine functions like these to build a search functionality for our roster. Generally, processing data by generalizing different operations for each element is where this style of functional programming shines. That's why it is popular in my area of programming languages/compilers.

3.2.2. Sorting the whole list

GauchoSpace has this functionality where you can sort the roster by different keys. We can implement it by having different versions of operator < for Student perhaps? Then, we need to copy over our roster and re-build each Student object in different versions of the class. Another alternative is to have a sorting algorithm that accepts a comparison function so we can swap out operator <. This is what std::sort does. Normally, it takes only the begin/end iterators, but we can also give it a third parameter to specify what to use instead of operator <. For example, we can sort the students by first name using the following:

std::sort(roster.begin(), roster.end(),
          // the comparison only reads its arguments, so take them as const references
          [](const Student& a, const Student& b) {
            return a.first_name < b.first_name;
          });
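
Sorting by a different key only requires a different comparison function. As a sketch, the hypothetical helper below sorts by major and breaks ties by last name. It again uses a simplified stand-in for the Student class so that it compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the Student class above.
struct Student {
  std::string first_name;
  std::string last_name;
  std::string major;
};

// Sorts the roster by major; students with the same major are ordered by
// last name.
void SortByMajorThenLastName(std::vector<Student> & roster) {
  std::sort(roster.begin(), roster.end(),
            [](const Student & a, const Student & b) {
              if (a.major != b.major) {
                return a.major < b.major;
              }
              return a.last_name < b.last_name;
            });
}
```

The comparison function has to behave like <: it must return false when the two students compare equal, otherwise the behavior of std::sort is undefined.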

3.2.3. Removing some elements

We may also need to filter (keep) some elements. C++'s std::remove_if does the opposite: it removes the elements that satisfy a condition. It works on iterators rather than on the container itself, so it has no way to shrink the actual container; we need to do that bit by hand. remove_if shifts the elements we keep toward the front of the range in an efficient ( \( \bigO(n) \) ) manner, and returns an iterator to the new end of the range. We can then erase everything from that iterator to the old end to finish the clean-up. (Exercise: try the code below with and without the call to vector::erase)

auto new_end =
  std::remove_if(roster.begin(),
                 roster.end(),
                 // suppose this is a CS-only class, so we drop everyone who is
                 // not a CS major
                 [](const Student& s) {
                   return s.major != "CS";
                 });
roster.erase(new_end, roster.end());
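
If we would rather keep the original roster intact and build a filtered copy, std::copy_if is one way to express "keep the elements that satisfy a condition" directly. The sketch below uses a hypothetical helper, KeepMajor, and a simplified stand-in for the Student class so that it compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Simplified stand-in for the Student class above.
struct Student {
  std::string first_name;
  std::string last_name;
  std::string major;
};

// Returns a new vector with only the students in the given major, in their
// original order. The input roster is not modified.
std::vector<Student> KeepMajor(const std::vector<Student> & roster,
                               const std::string & major) {
  std::vector<Student> kept;
  std::copy_if(roster.begin(), roster.end(),
               std::back_inserter(kept),  // appends each kept element to `kept`
               [&major](const Student & s) { return s.major == major; });
  return kept;
}
```

Unlike the remove_if version, this costs extra space for the copy, so it fits best when the original collection is still needed.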
