Functions

This is the eleventh part of the Tour of Ceylon. In the previous leg we looked at packages and modules. This leg covers first class and higher-order functions.

First class and higher order functions

Ceylon isn't a pure functional language: it has variable value references and so functions can have side effects. But one thing Ceylon has in common with functional programming languages is that it lets you treat functions as values, which in some people's eyes makes the language a kind of hybrid. In truth, there's nothing remotely new about having functions-as-values in an object oriented language—for example, Smalltalk, one of the first and still one of the cleanest object oriented languages, was built around this idea. Anyway, Ceylon, like Smalltalk and a number of other object oriented languages, lets you treat a function as an object and pass it to another function.

In this installment, we're going to discuss Ceylon's support for first class and higher order functions. A little bit of PL jargon:

  • First class functions means the ability to treat a function as a value, assigning it to a variable, or passing it as an argument to another function.
  • A higher order function is a function that accepts other functions as arguments, or returns another function.

It's clear that these two ideas go hand-in-hand, so we'll just use the term "higher order functions" from now on.

Representing the type of a function

Ceylon is a (very) statically typed language. So if we're going to treat a function as a value, the very first question that arises is: what is the type of the function? We need a way to encode the return type and parameter types of a function into the type system. Remember that Ceylon doesn't have "primitive" types. A strong design principle is that every type should be representable within the type system as a class or interface declaration.

In Ceylon, a single interface Callable abstracts all functions. Its declaration is the following:

shared interface Callable<out Return, in Arguments> 
        given Arguments satisfies Anything[] {}

The type parameter Return represents the return type of the function. The sequenced type parameter Arguments, which must be a sequence type, represents the parameter types of the function. We can encode any parameter list as a tuple type. For example, the parameter list (String s, Float x) is encoded as the tuple type [String,Float].

So, take the following function:

function sum(Integer x, Integer y) => x+y;

The type of sum() is:

Callable<Integer,[Integer,Integer]>

What about void functions? Well, the return type of a void function is considered to be Anything. So the type of a function like print() is:

Callable<Anything,[Anything]>

Folks who have a background in languages like ML might have expected that void would be identified with some "unit" type, for example, Null, or perhaps []. But this approach would mean that a non-void method would not be able to refine a void method, and that a non-void function would not be able to be assigned to a void functional parameter. Therefore, perfectly reasonable code would be rejected by the compiler.

Note that a void function with a concrete implementation implicitly returns the value null. This is different to a function declared to return the type Anything, which may return any value at all, but must do it explicitly, via a return statement. The following functions have the same type, Anything(), but don't do exactly the same thing:

Anything hello() { 
    print("hello");
    return "hello";
}

void hello() { 
    print("hello");
    //implicitly returns null
}

You shouldn't rely upon a function that is declared void returning null, because it might be a method that is refined by a non-void method, or a reference to a non-void function.

We can abbreviate Callable types with a little syntax sugar:

  • Integer(Integer,Integer) means Callable<Integer,[Integer,Integer]>, and, likewise,
  • Anything(String) means Callable<Anything,[String]>.

Defining higher order functions

We now have enough machinery to be able to write higher order functions. For example, we could create a repeat() function that repeatedly executes a function.

void repeat(Integer times, 
        Anything(Integer) perform) {
    for (i in 1..times) {
        perform(i);
    }
}

Let's try it:

void printNum(Integer n) => print(n);
repeat(10, printNum);

Which would print the numbers 1 to 10 to the console.

There's one problem with this. In Ceylon, as we'll see later, we often call functions using named arguments, but the Callable type does not encode the names of the function parameters. So Ceylon has an alternative, more elegant, syntax for declaring a parameter of type Callable:

void repeat(Integer times, 
        void perform(Integer n)) {
    for (i in 1..times) {
        perform { n=i; };
    }
}

This version is also slightly more readable, so it's the preferred syntax.

Function references

When a name of a function appears without any arguments, like printNum does above, it's called a function reference. A function reference is the thing that really has the type Callable. In this case, printNum has the type Callable<Anything,[Integer]>, or simply Anything(Integer).

Now, remember how we said that Anything is both the return type of a void function, and also the logical root of the type hierarchy? Well that's useful here, since it means that we can assign a function with any return type to any parameter which expects a void function, as long as the parameter lists match:

Boolean attemptPrint(Integer n) {
    try {
        print(n);
        return true;
    }
    catch (Exception e) {
        return false;
    }
}

And call it like this

repeat(10, attemptPrint);

Another way we can produce a function reference is by partially applying a method to a receiver expression. For example, we could write the following:

class Hello(String name) {
    shared void say(Integer n) {
        print("Hello, ``name``, for the ``n``th time!");
    }
}

And call it like this:

repeat(10, Hello("Gavin").say);

Here the expression Hello("Gavin").say has the same type as print above. It is of type Anything(Integer).

Function references and equality

The interface Callable doesn't satisfy Identifiable, so function references can't be compared for identity.

On the other hand, value equality of functions is a notion that is mathematically well-defined—two functions are equal if they produce the same value for every combination or arguments—but, unfortunately, is in general computationally undecidable.

Therefore, comparison of function references using equals() or the == operator always evaluates to false.

Gotcha!

This has some undesirable consequences, for example:

  • Repeatedly add()ing the same function reference to a MutableSet results in a set with multiple elements.
  • It's impossible to have a Map with function references as keys.

This is, however, the best we can reasonably do.

Static method and attribute references

A static method reference is a reference to a method, qualified by the type which declares the method.

value say = Hello.say;

In a static method reference, there's no receiving instance of the type! We have not provided any instance of Hello in this snippet. What we just wrote is quite different to this:

value say = Hello("Gavin").say;

So for a static method reference, to invoke the method, we must provide two items of information:

  • first, an instance of the type, and
  • then, the arguments of the function.

We provide these two items in distinct argument lists:

value say = Hello.say;
value sayHello = say(Hello("World"));
sayHello(3);

Or, simply:

value say = Hello.say;
say(Hello("World"))(3);

Let's fill in the types, to see what's really going on here:

Anything(Integer)(Hello) say = Hello.say;
Anything(Integer) sayHello = say(Hello("World"));
sayHello(3);

That is, a static method reference is a function that returns a function.

A static attribute reference is a reference to an attribute, qualified by the type which declares the attribute.

Polar coord = .... ;
value angle = Polar.angle;
value radius = Polar.angle;
Float a = angle(coord);
Float radius = radius(coord);

Just to be sure, let's fill in the types:

Polar coord = .... ;
Float(Polar) angle = Polar.angle;
Float(Polar) radius = Polar.angle;
Float a = angle(coord);
Float radius = radius(coord);

Static attribute references work especially well with the map() method of Iterable:

{String*} names = people.map(Person.name); 

Curried functions

A method or function may be declared in curried form, allowing the method or function to be partially applied to its arguments. A curried function has multiple lists of parameters:

Float adder(Integer n)(Float x) => x+n;

The adder() function has type Float(Float)(Integer). We can invoke it with a single integer argument to get a reference to a function of type Float(Float), and store this reference as a function, like this:

Float addOne(Float x);
addOne = adder(1);

Or as a value, like this:

Float(Float) addOne = adder(1);

(There only real difference between these two approaches is that in the first case we get to assign a name to the parameter of addOne().)

When we subsequently invoke addOne(), the actual body of adder() is finally executed, producing a Float:

Float three = addOne(2.0);

Gotcha!

Did you notice that order of parameter lists is reversed in a type expression compared to a function declaration? Look again:

void printSum(String desc)(Float x, Float y) => print(desc + (x+y).string);
Anything(Float,Float)(String) printSumRef = printSum;

The order of parameter lists in the function declaration reflects the order in which we supply arguments when we invoke the function. But in a function type expression, the return type always comes before the parameter types, therefore the parameters which must be supplied first come at the last in the function type.

Anonymous functions

The most famous higher-order functions are a trio of functions for transforming, filtering, and summarizing sequences of values. In Ceylon, these three functions, map(), filter(), and fold() are methods of the interface Iterable. (They even have a fourth, slightly less glamorous friend called find(), also a method of Iterable.)

As you've probably noticed, all the functions we've defined so far have been declared with a name, using a traditional C-like syntax. There's nothing wrong with passing a named function to map(), filter(), or fold(), and indeed that is often useful:

Float max = measurements.fold(0.0)(largest<Float>);

However, quite commonly, it's inconvenient to have to declare a whole named function just to pass it to map(), filter(), fold() or find(). Instead, we can declare an anonymous function inline, as part of the argument list:

Float max 
    = measurements.fold(0.0)
        ((Float max, Float num) 
                => num>max then num else max);

An anonymous function has:

  • optionally, the keyword function or void, and then
  • a parameter list, enclosed in parentheses, followed by
  • a fat arrow, =>, with an expression, or
  • a block.

So we could rewrite the above using a block

Float max 
    = measurements.fold(0.0) 
        ((Float max, Float num) {
            value tooBig = num > max;
            if (tooBig) {
                return num;
            }
            else {
                return max;
            }
        });

Note that it can sometimes be a bit tricky to come up with a good way to format anonymous functions with blocks in a readable way, so it's often better to just give the function a name and use it by reference.

Tip: use a comprehension instead

You're probably thinking that, since map(), filter() are so powerful (and so popular in other languages), that we must use map() and filter() together with anonymous functions all the time in Ceylon. Actually, we use them a little less often than you might expect! The reason is that we very often use comprehensions instead.

Anonymous function parameter type inference

When an anonymous function occurs in an argument list, it's usually, but not always, possible to infer the parameter types. We can write the previous example as follows:

Float max 
    = measurements.fold(0.0)
        ((max, num) => num>max then num else max);

Here, max and num have inferred type Float.

A similar sort of type inference applies when a reference to a generic function occurs in an argument list:

Float max = measurements.fold(0.0)(largest);

Here, the type argument of the function largest() is inferred to be Float.

Gotcha!

You might have noticed that fold() is defined in curried form, with two argument lists. The reason it's defined like this is to allow inference of the parameter types of its function argument. If it weren't defined in curried form, we would have to explicitly specify the anonymous function parameter types, since the type of the first argument of fold() would not be taken into account when inferring parameter types.

Destructuring anonymous function parameters

Tuple or entry destructuring may be used in the parameter list of an anonymous function.

value magnitude 
    = ([Float x, Float y, Float z]) 
            => (x^2 + y^2 + z^2) ^ 0.5;

print(magnitude([1.0, 2.0, -1.0]));

This is most useful when the anonymous function occurs as a function argument, in which case it may be used together with parameter type inference.

"hello world".indexed.each((index -> char) {
    print("``index`` : ``char``");
});

Destructuring in the parameter list of a named function is not yet allowed.

More about higher-order functions

Let's see a more practical example, which mixes both ways of representing a function type.

Suppose we have some kind of user interface component which can be observed by other objects in the system. We could use something like Java's Observer/Observable pattern:

interface Event { }

interface Observer {
    shared formal void observe(Event event);
}

abstract class Component() {

    variable {Observer*} observers = {};

    shared void addObserver(Observer observer) {
        observers = observers.follow(observer);
    }

    shared void fire(Event event) {
        for (o in observers) {
            o.observe(event);
        }
    }
}

But now all event observers have to implement the interface Observer, which has just one method. Why don't we cut out the interface, and let event observers just register a function object as their event listener? In the following code, we define the addObserver() method to accept a function as a parameter.

interface Event { }

abstract class Component() {

    variable {Anything(Event)*} observers = {};

    shared void addObserver(void observe(Event event)) {
        observers = observers.follow(observe);
    }

    shared void fire(Event event) {
        for (observe in observers) {
            observe(event);
        }
    }
}

Here we see the difference between the two ways of specifying a function type:

  • void observe(Event event) is more readable in parameter lists, where observe is the name of the parameter, but
  • Anything(Event) is useful in container types such as iterables.

Now, any event observer can just pass a reference to one of its own methods to addObserver():

class Listener(Component component) {

    void onEvent(Event e) {
        // respond to the event
        // ...
    }

    component.addObserver(onEvent);

    // ...

}

When the name of a method appears in an expression without a list of arguments after it, it is a reference to the method, not an invocation of the method. Here, the expression onEvent is an expression of type Anything(Event) that refers to the method onEvent().

If onEvent() were shared, we could even wire together the Component and Listener from some other code, to eliminate the dependency of Listener on Component:

class Listener() {

    shared void onEvent(Event e) {
        // respond to the event
        // ...
    }

    // ...

}
void listen(Component component, Listener listener) {
    component.addObserver(listener.onEvent);
}

Here, the syntax listener.onEvent is a kind of partial application of the method onEvent(). It doesn't cause the onEvent() method to be executed (because we haven't supplied all the parameters yet). Rather, it results in a function that packages together the method reference onEvent and the method receiver listener.

It's also possible to declare a method that returns a function. Let's consider adding the ability to remove observers from a Component. We could use a Subscription interface:

interface Event {}

interface Subscription {
    shared formal void cancel();
    shared formal void handle(Event event);
}

class Component() {

    variable {Subscription*} subscriptions = {};

    shared Subscription addObserver(void observe(Event event)) {
        object subscription satisfies Subscription {
            handle = observe;
            cancel() => subscriptions 
                    = subscriptions.filter((s) => s!=this);
        }
        subscriptions = subscriptions.follow(subscription);
        return subscription;
    }

    shared void fire(Event event) {
        for (s in subscriptions) {
            s.handle(event);
        }
    }

}

But a simpler solution might be to just eliminate the interface and return the cancel() method directly:

interface Event {}

class Component() {

    variable {Subscription*} subscriptions = {};

    class Subscription(shared void handle(Event event)) {
        shared void cancel() => subscriptions 
                    = subscriptions.filter((s) => s!=this);
    }

    shared Anything() addObserver(void observe(Event event)) {
        value subscription = Subscription(observe);
        subscriptions = subscriptions.follow(subscription);
        return subscription.cancel;
    }

    shared void fire(Event event) {
        for (s in subscriptions) {
            s.handle(event);
        }
    }

}

Here, addObserver() returns a reference to the method cancel() of the private nested class Subscription. The reference to the method can be called by any code that obtains the reference.

In case you're wondering, the type of the function addObserver() is Anything()(Anything(Event)).

We could invoke our method like this:

addObserver(onEvent)();

But if we were planning to use the method in this way, there would be no good reason for giving it two parameter lists. It's much more likely that we're planning to store or pass the reference to the inner method somewhere before invoking it.

Anything() cancel = addObserver(onEvent);
// ...
cancel();

The first line demonstrates how a function reference can be stored. The second line of code simply invokes the returned reference to cancel().

Gotcha!

Notice that even after eliminating Subscription from the external API of Component class, we still needed to store Subscription objects internally. The reason for this is that function references don't have any well-defined notion of equality (not even identity equality), and so x==y always evaluates to false for any two function references x and y. Thus, in order to be able to remove a function reference from the subscriptions using filter(), we need a wrapping object that defines an identity for the subscription.

Composition and curry

The function compose() performs function composition. For example, given the functions print() and plus() in ceylon.language, with the following signatures:

shared void print(Anything line) { ... }

shared Value plus<Value>(Value x, Value y)
    given Value satisfies Summable<Value> { ... }

We can see that the type of the function reference print is Anything(Anything), and that the type of the function reference plus<Float> is Float(Float,Float). Then we can write the following:

Anything(Float,Float) printSum = compose(print,plus<Float>);
printSum(2.0,2.0); //prints 4.0

The function curry() produces a function with multiple parameter lists, given a function with multiple parameters:

Anything(Float)(Float) printSumCurried = curry(printSum);
Anything(Float) printPlus2 = printSumCurried(2.0);
printPlus2(2.0); //prints 4.0

The function uncurry() does the opposite, giving us back our original uncurried signature:

Anything(Float,Float) printSumUncurried = uncurry(printSumCurried);

Note that compose(), curry(), and uncurry() are ordinary functions, written in Ceylon.

The spread operator

We've already seen a few examples of the spread operator. We've seen how to use it to instantiate an iterable:

{ "hello", *names }

Or a tuple:

[x, y, *labels]

We can also use it when calling a function. Consider the following function:

String formatDate(String format, 
                  Integer day, 
                  Integer|String month, 
                  Integer year) {
    ...
}

And suppose we have a tuple representing a date:

value date = [15, "January", 2010];

Then we can pass the date to our function like this:

formatDate("dd MMMMM yyyy", *date)

Notice that the type of the tuple ["dd MMMMM yyyy", *date] is:

[String,Integer,String,Integer] 

Now consider type of the function formatDate. It is:

String(String,Integer,Integer|String,Integer)

Or rather:

Callable<String,[String,Integer,Integer|String,Integer]>

Since the tuple type [String,Integer,String,Integer] is a subtype of [String,Integer,Integer|String,Integer], the invocation is well-typed. This demonstrates the relationship between tuples and function argument!

There's more...

You'll find a more detailed discussion of how Ceylon represents function types using tuples here, including an in-depth discussion of compose() and curry().

Now we're going to talk about Ceylon's syntax for named argument lists and for defining user interfaces and structured data.