Functions
This is the eleventh part of the Tour of Ceylon. In the previous leg we looked at packages and modules. This leg covers first class and higher-order functions.
First class and higher order functions
Ceylon isn't a pure functional language: it has variable
value
references and so functions can have side effects. But one thing Ceylon
has in common with functional programming languages is that it lets you
treat functions as values, which in some people's eyes makes the language
a kind of hybrid. In truth, there's nothing remotely new about having
functions-as-values in an object oriented language—for example,
Smalltalk, one of the first and still one of the cleanest object oriented
languages, was built around this idea. Anyway, Ceylon, like Smalltalk and
a number of other object oriented languages, lets you treat a function as
an object and pass it to another function.
In this installment, we're going to discuss Ceylon's support for first class and higher order functions. A little bit of PL jargon:
- First class functions means the ability to treat a function as a value, assigning it to a variable, or passing it as an argument to another function.
- A higher order function is a function that accepts other functions as arguments, or returns another function.
It's clear that these two ideas go hand-in-hand, so we'll just use the term "higher order functions" from now on.
Representing the type of a function
Ceylon is a (very) statically typed language. So if we're going to treat a function as a value, the very first question that arises is: what is the type of the function? We need a way to encode the return type and parameter types of a function into the type system. Remember that Ceylon doesn't have "primitive" types. A strong design principle is that every type should be representable within the type system as a class or interface declaration.
In Ceylon, a single interface
Callable
abstracts all functions. Its declaration is the following:
shared interface Callable<out Return, in Arguments>
given Arguments satisfies Anything[] {}
The type parameter Return
represents the return type of the function. The
sequenced type parameter Arguments
, which must be a sequence type,
represents the parameter types of the function. We can encode any parameter
list as a tuple type. For example, the parameter list (String s, Float x)
is encoded as the tuple type [String,Float]
.
So, take the following function:
function sum(Integer x, Integer y) => x+y;
The type of sum()
is:
Callable<Integer,[Integer,Integer]>
What about void
functions? Well, the return type of a void
function is
considered to be Anything
. So the type of a function like print()
is:
Callable<Anything,[Anything]>
Folks who have a background in languages like ML might have expected that
void
would be identified with some "unit" type, for example, Null
, or
perhaps []
. But this approach would mean that a non-void
method would
not be able to refine a void
method, and that a non-void
function would
not be able to be assigned to a void
functional parameter. Therefore,
perfectly reasonable code would be rejected by the compiler.
Note that a void
function with a concrete implementation implicitly returns
the value null
. This is different to a function declared to return the type
Anything
, which may return any value at all, but must do it explicitly, via
a return
statement. The following functions have the same type, Anything()
,
but don't do exactly the same thing:
Anything hello() {
print("hello");
return "hello";
}
void hello() {
print("hello");
//implicitly returns null
}
You shouldn't rely upon a function that is declared void
returning null
,
because it might be a method that is refined by a non-void
method, or a
reference to a non-void
function.
We can abbreviate Callable
types with a little syntax sugar:
-
Integer(Integer,Integer)
meansCallable<Integer,[Integer,Integer]>
, and, likewise, -
Anything(String)
meansCallable<Anything,[String]>
.
Defining higher order functions
We now have enough machinery to be able to write higher order functions.
For example, we could create a repeat()
function that repeatedly executes
a function.
void repeat(Integer times,
Anything(Integer) perform) {
for (i in 1..times) {
perform(i);
}
}
Let's try it:
void printNum(Integer n) => print(n);
repeat(10, printNum);
Which would print the numbers 1 to 10 to the console.
There's one problem with this. In Ceylon, as we'll see later, we often call
functions using named arguments, but the Callable type does not encode the
names of the function parameters. So Ceylon has an alternative, more elegant,
syntax for declaring a parameter of type Callable
:
void repeat(Integer times,
void perform(Integer n)) {
for (i in 1..times) {
perform { n=i; };
}
}
This version is also slightly more readable, so it's the preferred syntax.
Function references
When a name of a function appears without any arguments, like printNum
does
above, it's called a function reference. A function reference is the thing
that really has the type Callable
. In this case, printNum
has the type
Callable<Anything,[Integer]>
, or simply Anything(Integer)
.
Now, remember how we said that Anything
is both the return type of a void
function, and also the logical root of the type hierarchy? Well that's useful
here, since it means that we can assign a function with any return type to any
parameter which expects a void
function, as long as the parameter lists
match:
Boolean attemptPrint(Integer n) {
try {
print(n);
return true;
}
catch (Exception e) {
return false;
}
}
And call it like this
repeat(10, attemptPrint);
Another way we can produce a function reference is by partially applying a method to a receiver expression. For example, we could write the following:
class Hello(String name) {
shared void say(Integer n) {
print("Hello, ``name``, for the ``n``th time!");
}
}
And call it like this:
repeat(10, Hello("Gavin").say);
Here the expression Hello("Gavin").say
has the same type as print
above.
It is of type Anything(Integer)
.
Function references and equality
The interface Callable
doesn't satisfy Identifiable
, so function
references can't be compared for identity.
On the other hand, value equality of functions is a notion that is mathematically well-defined—two functions are equal if they produce the same value for every combination or arguments—but, unfortunately, is in general computationally undecidable.
Therefore, comparison of function references using equals()
or the ==
operator always evaluates to false
.
Gotcha!
This has some undesirable consequences, for example:
- Repeatedly
add()
ing the same function reference to aMutableSet
results in a set with multiple elements. - It's impossible to have a
Map
with function references as keys.
This is, however, the best we can reasonably do.
Static method and attribute references
A static method reference is a reference to a method, qualified by the type which declares the method.
value say = Hello.say;
In a static method reference, there's no receiving instance of the type! We
have not provided any instance of Hello
in this snippet. What we just wrote
is quite different to this:
value say = Hello("Gavin").say;
So for a static method reference, to invoke the method, we must provide two items of information:
- first, an instance of the type, and
- then, the arguments of the function.
We provide these two items in distinct argument lists:
value say = Hello.say;
value sayHello = say(Hello("World"));
sayHello(3);
Or, simply:
value say = Hello.say;
say(Hello("World"))(3);
Let's fill in the types, to see what's really going on here:
Anything(Integer)(Hello) say = Hello.say;
Anything(Integer) sayHello = say(Hello("World"));
sayHello(3);
That is, a static method reference is a function that returns a function.
A static attribute reference is a reference to an attribute, qualified by the type which declares the attribute.
Polar coord = .... ;
value angle = Polar.angle;
value radius = Polar.angle;
Float a = angle(coord);
Float radius = radius(coord);
Just to be sure, let's fill in the types:
Polar coord = .... ;
Float(Polar) angle = Polar.angle;
Float(Polar) radius = Polar.angle;
Float a = angle(coord);
Float radius = radius(coord);
Static attribute references work especially well with the
map()
method of Iterable
:
{String*} names = people.map(Person.name);
Curried functions
A method or function may be declared in curried form, allowing the method or function to be partially applied to its arguments. A curried function has multiple lists of parameters:
Float adder(Integer n)(Float x) => x+n;
The adder()
function has type Float(Float)(Integer)
. We can invoke it
with a single integer argument to get a reference to a function of type
Float(Float)
, and store this reference as a function, like this:
Float addOne(Float x);
addOne = adder(1);
Or as a value, like this:
Float(Float) addOne = adder(1);
(There only real difference between these two approaches is that in the
first case we get to assign a name to the parameter of addOne()
.)
When we subsequently invoke addOne()
, the actual body of adder()
is
finally executed, producing a Float
:
Float three = addOne(2.0);
Gotcha!
Did you notice that order of parameter lists is reversed in a type expression compared to a function declaration? Look again:
void printSum(String desc)(Float x, Float y) => print(desc + (x+y).string);
Anything(Float,Float)(String) printSumRef = printSum;
The order of parameter lists in the function declaration reflects the order in which we supply arguments when we invoke the function. But in a function type expression, the return type always comes before the parameter types, therefore the parameters which must be supplied first come at the last in the function type.
Anonymous functions
The most famous higher-order functions are a trio of functions for transforming,
filtering, and summarizing sequences of values. In Ceylon, these three functions,
map()
, filter()
, and fold()
are methods of the interface
Iterable
.
(They even have a fourth, slightly less glamorous friend called find()
, also a
method of Iterable
.)
As you've probably noticed, all the functions we've defined so far have been
declared with a name, using a traditional C-like syntax. There's nothing wrong
with passing a named function to map()
, filter()
, or fold()
, and indeed
that is often useful:
Float max = measurements.fold(0.0)(largest<Float>);
However, quite commonly, it's inconvenient to have to declare a whole named
function just to pass it to map()
, filter()
, fold()
or find()
. Instead,
we can declare an anonymous function inline, as part of the argument list:
Float max
= measurements.fold(0.0)
((Float max, Float num)
=> num>max then num else max);
An anonymous function has:
- optionally, the keyword
function
orvoid
, and then - a parameter list, enclosed in parentheses, followed by
- a fat arrow,
=>
, with an expression, or - a block.
So we could rewrite the above using a block
Float max
= measurements.fold(0.0)
((Float max, Float num) {
value tooBig = num > max;
if (tooBig) {
return num;
}
else {
return max;
}
});
Note that it can sometimes be a bit tricky to come up with a good way to format anonymous functions with blocks in a readable way, so it's often better to just give the function a name and use it by reference.
Tip: use a comprehension instead
You're probably thinking that, since map()
, filter()
are so powerful
(and so popular in other languages), that we must use map()
and filter()
together with anonymous functions all the time in Ceylon. Actually, we
use them a little less often than you might expect! The reason is that we
very often use comprehensions instead.
Anonymous function parameter type inference
When an anonymous function occurs in an argument list, it's usually, but not always, possible to infer the parameter types. We can write the previous example as follows:
Float max
= measurements.fold(0.0)
((max, num) => num>max then num else max);
Here, max
and num
have inferred type Float
.
A similar sort of type inference applies when a reference to a generic function occurs in an argument list:
Float max = measurements.fold(0.0)(largest);
Here, the type argument of the function largest()
is inferred to be Float
.
Gotcha!
You might have noticed that fold()
is defined in curried form, with two
argument lists. The reason it's defined like this is to allow inference of
the parameter types of its function argument. If it weren't defined in
curried form, we would have to explicitly specify the anonymous function
parameter types, since the type of the first argument of fold()
would
not be taken into account when inferring parameter types.
Destructuring anonymous function parameters
Tuple or entry destructuring may be used in the parameter list of an anonymous function.
value magnitude
= ([Float x, Float y, Float z])
=> (x^2 + y^2 + z^2) ^ 0.5;
print(magnitude([1.0, 2.0, -1.0]));
This is most useful when the anonymous function occurs as a function argument, in which case it may be used together with parameter type inference.
"hello world".indexed.each((index -> char) {
print("``index`` : ``char``");
});
Destructuring in the parameter list of a named function is not yet allowed.
More about higher-order functions
Let's see a more practical example, which mixes both ways of representing a function type.
Suppose we have some kind of user interface component which can be observed by
other objects in the system. We could use something like Java's
Observer
/Observable
pattern:
interface Event { }
interface Observer {
shared formal void observe(Event event);
}
abstract class Component() {
variable {Observer*} observers = {};
shared void addObserver(Observer observer) {
observers = observers.follow(observer);
}
shared void fire(Event event) {
for (o in observers) {
o.observe(event);
}
}
}
But now all event observers have to implement the interface Observer
, which
has just one method. Why don't we cut out the interface, and let event
observers just register a function object as their event listener? In the
following code, we define the addObserver()
method to accept a function as
a parameter.
interface Event { }
abstract class Component() {
variable {Anything(Event)*} observers = {};
shared void addObserver(void observe(Event event)) {
observers = observers.follow(observe);
}
shared void fire(Event event) {
for (observe in observers) {
observe(event);
}
}
}
Here we see the difference between the two ways of specifying a function type:
-
void observe(Event event)
is more readable in parameter lists, whereobserve
is the name of the parameter, but -
Anything(Event)
is useful in container types such as iterables.
Now, any event observer can just pass a reference to one of its own methods to
addObserver()
:
class Listener(Component component) {
void onEvent(Event e) {
// respond to the event
// ...
}
component.addObserver(onEvent);
// ...
}
When the name of a method appears in an expression without a list of arguments
after it, it is a reference to the method, not an invocation of the method.
Here, the expression onEvent
is an expression of type Anything(Event)
that
refers to the method onEvent()
.
class Listener() {
shared void onEvent(Event e) {
// respond to the event
// ...
}
// ...
}
void listen(Component component, Listener listener) {
component.addObserver(listener.onEvent);
}
Here, the syntax listener.onEvent
is a kind of partial application of the
method onEvent()
. It doesn't cause the onEvent()
method to be
executed (because we haven't supplied all the parameters yet). Rather, it
results in a function that packages together the method reference onEvent
and the method receiver listener
.
It's also possible to declare a method that returns a function. Let's
consider adding the ability to remove observers from a Component
. We could
use a Subscription
interface:
interface Event {}
interface Subscription {
shared formal void cancel();
shared formal void handle(Event event);
}
class Component() {
variable {Subscription*} subscriptions = {};
shared Subscription addObserver(void observe(Event event)) {
object subscription satisfies Subscription {
handle = observe;
cancel() => subscriptions
= subscriptions.filter((s) => s!=this);
}
subscriptions = subscriptions.follow(subscription);
return subscription;
}
shared void fire(Event event) {
for (s in subscriptions) {
s.handle(event);
}
}
}
But a simpler solution might be to just eliminate the interface and return the
cancel()
method directly:
interface Event {}
class Component() {
variable {Subscription*} subscriptions = {};
class Subscription(shared void handle(Event event)) {
shared void cancel() => subscriptions
= subscriptions.filter((s) => s!=this);
}
shared Anything() addObserver(void observe(Event event)) {
value subscription = Subscription(observe);
subscriptions = subscriptions.follow(subscription);
return subscription.cancel;
}
shared void fire(Event event) {
for (s in subscriptions) {
s.handle(event);
}
}
}
Here, addObserver()
returns a reference to the method cancel()
of the
private nested class Subscription
. The reference to the method can be called
by any code that obtains the reference.
In case you're wondering, the type of the function addObserver()
is
Anything()(Anything(Event))
.
We could invoke our method like this:
addObserver(onEvent)();
But if we were planning to use the method in this way, there would be no good reason for giving it two parameter lists. It's much more likely that we're planning to store or pass the reference to the inner method somewhere before invoking it.
Anything() cancel = addObserver(onEvent);
// ...
cancel();
The first line demonstrates how a function reference can be stored. The second
line of code simply invokes the returned reference to cancel()
.
Gotcha!
Notice that even after eliminating Subscription
from the external API of
Component
class, we still needed to store Subscription
objects internally.
The reason for this is that function references don't have any well-defined
notion of equality (not even identity equality), and so x==y
always evaluates
to false
for any two function references x
and y
. Thus, in order to be
able to remove a function reference from the subscriptions
using filter()
,
we need a wrapping object that defines an identity for the subscription.
Composition and curry
The function compose()
performs function composition. For example, given
the functions print()
and plus()
in ceylon.language
, with the following
signatures:
shared void print(Anything line) { ... }
shared Value plus<Value>(Value x, Value y)
given Value satisfies Summable<Value> { ... }
We can see that the type of the function reference print
is Anything(Anything)
,
and that the type of the function reference plus<Float>
is Float(Float,Float)
.
Then we can write the following:
Anything(Float,Float) printSum = compose(print,plus<Float>);
printSum(2.0,2.0); //prints 4.0
The function curry()
produces a function with multiple parameter lists, given
a function with multiple parameters:
Anything(Float)(Float) printSumCurried = curry(printSum);
Anything(Float) printPlus2 = printSumCurried(2.0);
printPlus2(2.0); //prints 4.0
The function uncurry()
does the opposite, giving us back our original uncurried
signature:
Anything(Float,Float) printSumUncurried = uncurry(printSumCurried);
Note that compose()
, curry()
, and uncurry()
are ordinary functions, written
in Ceylon.
The spread operator
We've already seen a few examples of the spread operator. We've seen how to use it to instantiate an iterable:
{ "hello", *names }
Or a tuple:
[x, y, *labels]
We can also use it when calling a function. Consider the following function:
String formatDate(String format,
Integer day,
Integer|String month,
Integer year) {
...
}
And suppose we have a tuple representing a date:
value date = [15, "January", 2010];
Then we can pass the date to our function like this:
formatDate("dd MMMMM yyyy", *date)
Notice that the type of the tuple ["dd MMMMM yyyy", *date]
is:
[String,Integer,String,Integer]
Now consider type of the function formatDate
. It is:
String(String,Integer,Integer|String,Integer)
Or rather:
Callable<String,[String,Integer,Integer|String,Integer]>
Since the tuple type [String,Integer,String,Integer]
is a subtype of
[String,Integer,Integer|String,Integer]
, the invocation is well-typed.
This demonstrates the relationship between tuples and function argument!
There's more...
You'll find a more detailed discussion of how Ceylon represents function
types using tuples here,
including an in-depth discussion of compose()
and curry()
.
Now we're going to talk about Ceylon's syntax for named argument lists and for defining user interfaces and structured data.