Type aliases and type inference

This is the seventh step in the Tour of Ceylon. The previous installment introduced various kinds of iterable objects. Now it's time to explore Ceylon's type system in more detail.

In this chapter, we're going to discuss type aliases and local type inference, two features of the language that help reduce the verbosity of statically typed code.

Type aliases

It's often useful to give a shorter or more meaningful name to an existing class or interface type, especially if the class or interface is a parameterized type. For this, we use a type alias.

To define an alias for a class or interface, we use a fat arrow, for example:

interface People => Set<Person>;

A class alias must declare its formal parameters:

class People({Person*} people) => ArrayList<Person>(people);

If you need to create an alias for a union or intersection type you have to use the alias keyword:

alias Num => Float|Integer;

Note: the reason for the difference in syntax is that we can extend or satisfy a class or interface alias, but we can't inherit from a type alias declared using the alias keyword.
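As a minimal sketch of how such an alias might be used, the function below accepts a Num and widens it to a Float. The name toFloat is hypothetical, chosen only for illustration; the alias itself is the one declared above.

```ceylon
alias Num => Float|Integer;

// Hypothetical helper: handles each case of the union hidden behind Num.
Float toFloat(Num num) {
    switch (num)
    case (is Float) {
        // already a Float, so return it unchanged
        return num;
    }
    case (is Integer) {
        // widen the Integer using its float attribute
        return num.float;
    }
}
```

Because Num is just another name for Float|Integer, the switch has to cover exactly the two cases of the union.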

A type alias may be parameterized, and have type constraints, which we'll learn about later:

class Named<Value>(String name, Value val) 
        given Value satisfies Object
        => Entry<String,Value>(name,val);

Type aliases help us reduce verbosity, because instead of repeatedly writing out the same generic type, for example Set<Person>, we can use a snappier alias, such as People. But in some cases, as we're about to see, Ceylon lets us omit the type altogether.

A toplevel type alias or a type alias belonging to a class or interface may be shared.

shared interface People => Set<Person>;

Member class aliases and class alias refinement

When it comes to class aliases, Ceylon has one more trick up its sleeve. Cast your mind back to what we learned about member classes in the fifth leg of the tour. What we saw there with ordinary classes also applies to class aliases.

A type alias may be nested inside a class or interface. In the case of a class alias, it is considered a member of the class or interface:

class BufferedReader(Reader reader)
        satisfies Reader {
    shared default class Buffer()
        => MutableList<Character>();
    ...
}

Now, if the class alias is annotated default, it may be refined, either by an inner alias of a subclass of the original aliased class:

class BufferedFileReader(File file)
        extends BufferedReader(FileReader(file)) {
    shared actual class Buffer()
        => MutableLinkedList<Character>();
    ...
}

Or by an inner subclass of the original aliased class:

class BufferedFileReader(File file)
        extends BufferedReader(FileReader(file)) {
    shared actual class Buffer()
            extends super.Buffer() {  
        ...
    }
    ...
}

(Alternatively, we could have written extends MutableList<Character>() instead of extends super.Buffer(), since both expressions refer to the same class type.)

Type inference

So far, we've always been explicitly specifying the type of every declaration. This generally makes code, especially example code, much easier to read and understand.

However, Ceylon does have the ability to infer the type of a local variable or the return type of a local method. Just place the keyword value (in the case of a local value) or function (in the case of a local function) in place of the type declaration.

value polar = Polar(pi, 2.0);
value operators = { "+", "-", "*", "/" };
function add(Integer x, Integer y) => x+y;

There are some restrictions applying to this feature. You can't use value or function:

  • for declarations annotated shared,
  • for declarations annotated formal,
  • to declare a parameter, or
  • for a forward declaration, where the value is specified later in the block of statements.

(Note, this last restriction will be removed in a future version of the language!)
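The following sketch contrasts a shared declaration, which needs an explicit type, with local declarations, where the type may be inferred. The names twice, half, and runInference are hypothetical and exist only for this illustration.

```ceylon
// shared, so the return type must be written out explicitly
shared Float twice(Float x) => x * 2.0;

void runInference() {
    // local value: the type Float is inferred from the right-hand side
    value doubled = twice(1.5);
    // local function: the return type Float is inferred from the expression
    function half(Float x) => x / 2.0;
    print(half(doubled));
}
```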

"Right-to-left" type inference

These restrictions mean that Ceylon's type inference rules are quite simple. Type inference is purely "right-to-left" and "top-to-bottom". The type of any expression is already known without needing to look to any types declared to the left of the = specifier, or further down the block of statements.

  • The inferred type of a reference declared value is just the type of the expression assigned to it using =.
  • The inferred type of a getter declared value is just the union of the returned expression types appearing in the getter's return statements (or Nothing if the getter has no return statement).
  • The inferred type of a method declared function is just the union of the returned expression types appearing in the method's return statements (or Nothing if the method has no return statement).

For example:

function parseIntegerOrFloat(String text) {
    if ('.' in text) {
        return parseFloat(text);
    }
    else {
        return parseInteger(text);
    }
}

The inferred return type of parseIntegerOrFloat() is Integer|Float?, that is, Null|Integer|Float, since parseInteger() returns Integer? and parseFloat() returns Float?.

Type inference for streams

What about iterable enumeration expressions like this, which we first met in the last chapter:

value coords = { Polar(0.0, 0.0), Cartesian(1.0, 2.0) };

What type is inferred for coords? You might answer:

{X+} where X is the common superclass or super-interface of all the element types.

But that can't be right, since there might be more than one common supertype, and supertypes have no well-defined linearization (order).

The correct answer is that the inferred type is {X+} where X is the union of all the element expression types. In this case, the type is {Polar|Cartesian+}.

Now, since Iterable<T> is covariant in T, and since a union type is the most precise common supertype of any two types, it turns out that coords is of type {X+} for every X in the set of common supertypes of the element types. That this even works out is due to a property of Ceylon's type system called principal typing.

Thus, the following code is well-typed:

value coords =
        { Polar(0.0, 0.0), 
          Cartesian(1.0, 2.0) }; //type {Polar|Cartesian+}
{Point+} points = coords;

As is the following code:

value nums = { 12.0, 1, -3, "0" }; //type {Float|Integer|String+}
{Object+} objects = nums;

What about iterables that produce nulls? Well, do you remember that the type of null is Null?

{String?*} strings = { null, "Hello", "World" };
String? str = strings.first;

The type of the attribute first of Iterable<Element> is Element?. Here, we have an Iterable<String?>. Substituting String? for Element, we get the type String??, that is, Null|Null|String, which is simply Null|String, written String?. Of course, since the compiler can figure out that kind of thing for us, we could have simply written:

value strings = { null, "Hello", "World" }; //type {Null|String|String+} i.e. {String?+}
value str = strings.first; //type String?

The same thing works out for sequences:

value strings = [null, "Hello", "World"]; //type [Null,String,String]
value str = strings[0]; //type String?

It's interesting just how useful union types turn out to be. Even if you only rarely write code with explicit union type declarations, they're still there, under the covers, helping the compiler solve some hairy, otherwise-ambiguous, typing problems.

Note that what we've just seen is really just a special case of the algorithm Ceylon uses for generic type argument inference, and all of the above works just as well for user-written generic types as it does for Iterable.
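To illustrate with a user-written generic declaration, here is a small sketch. The function pairOf and the name runPairOf are hypothetical, introduced only for this example; they are not part of the language module.

```ceylon
// Hypothetical generic function that returns its two arguments as a stream.
{Element+} pairOf<Element>(Element first, Element second)
        => { first, second };

void runPairOf() {
    // Element is inferred as the union Integer|String,
    // so mixed has the type {Integer|String+}
    value mixed = pairOf(1, "one");
    // well-typed, because Iterable is covariant in its element type
    {Object+} objects = mixed;
    print(objects);
}
```

As with the enumeration expressions above, the inferred type argument is the union of the argument types, so the result is assignable to a stream of any common supertype.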

Gotcha!

Very occasionally, this "collapsing" behavior of unions of Nulls—that Null|Null|T is just Null|T—is inconvenient. Imagine that we wanted to have a Map that distinguished between:

  • a key for which the map has no entry, and
  • a key for which the map has an entry with no item.

We might try to use a Map<String,Item?> for this. But then map.get(key) would simply return null in both of the above cases, so how could we distinguish between them? One way would be to call map.defines(key), but that would result in an additional lookup.

In practice, what we actually do is call Map.getOrDefault(), but, for the sake of argument, let's pretend that getOrDefault() isn't there, and ask how else we could solve the problem.

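There are two idioms that we could use to handle this situation. The first uses a "wrapper" object for each entry:

import ceylon.collection { HashMap, MutableMap }

//Maybe is generic, matching its use as Maybe<Item> below
class Maybe<Item>(shared Item? item) {}
//a MutableMap, since the immutable Map interface has no put()
MutableMap<String,Maybe<Item>> map 
        = HashMap<String,Maybe<Item>>();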

void put(String key, Item? item) 
        => map.put(key, Maybe(item));

Item? get(String key, Item? defaultItem) {
    if (exists maybe = map[key]) {
        return maybe.item;
    }
    else {
        //no entry
        return defaultItem;
    }
}

The second idiom is more efficient. It uses the unit type pattern:

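import ceylon.collection { HashMap, MutableMap }

abstract class Nil() of nil {}
object nil extends Nil() {}
//a MutableMap, since the immutable Map interface has no put()
MutableMap<String,Item|Nil> map 
        = HashMap<String,Item|Nil>();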

void put(String key, Item? item) 
        => map.put(key, item else nil);

Item? get(String key, Item? defaultItem) {
    if (!is Nil item = map[key]) {
        return item else defaultItem;
    }
    else {
        //entry with no item
        return null;
    }
}

These idioms sometimes arise in problems like caching.

Anonymous classes and type inference

Since anonymous classes aren't supposed to be referred to by name, Ceylon replaces anonymous classes with the intersection of their supertypes when performing type inference:

interface Foo {}
interface Bar {}
object foobar satisfies Foo & Bar {}
value fb = foobar; //inferred type Basic&Foo&Bar
value fbs = { foobar, foobar }; //inferred type {Basic&Foo&Bar+}

If we want to avoid this behavior, we can't use type inference.

Tip: explicitly specifying an anonymous class type

If you really need the exact type of the anonymous class, you'll need to specify the type explicitly.

\Ifoobar fb = foobar; //explicit type \Ifoobar
{\Ifoobar+} fbs = { foobar, foobar }; //explicit type {\Ifoobar+}

It's extremely rare that this is useful.

There's more...

Next we'll explore some more details of the type system, starting with union types, intersection types, enumerated types, and type switching. Then, after that, we'll be ready to discuss generic types.