Blog tagged design

Tuples and function types

Tuples

Ceylon is getting tuples in the next version. For those who don't know what a tuple is, it's a bit like a struct in C: a finite sequence of heterogeneously typed values, with no methods (unlike a class) and no element names (unlike a C struct). But let's start with an example.

In Ceylon, [1, 2.0, "Hello"] is a tuple value of size 3 with type [Integer, Float, String]. Now let's see a more useful example:

// convert a sequence of strings into a sequence of tuples with size, string
// and number of uppercase letters
Iterable<[Integer,String,Integer]> results = 
    {"Merry Christmas", "Happy Cheesy Holidays"}
        .map((String s) [s.size, s, s.count((Character c) c.uppercase)]);
for([Integer,String,Integer] result in results){
    Integer size = result[0];
    String s = result[1];
    Integer uppercaseLetters = result[2];
    print("Size: " size ", for: '" s "', uppercase letters: " uppercaseLetters "");
}

Which outputs:

Size: 15, for: 'Merry Christmas', uppercase letters: 2
Size: 21, for: 'Happy Cheesy Holidays', uppercase letters: 3

As you can see, you can access each tuple element using result[i], the syntax sugar for Correspondence.item. That works because our Tuple type satisfies Sequence underneath.

You may ask: what, then, is the sequence type of an [Integer, String, Integer] tuple? Easy: it's Sequential<Integer|String>.

Then how do we possibly know that result[2] is an Integer and not an Integer|String? Well, that's again syntactic sugar.

You see, our Tuple type is defined like this:

shared class Tuple<out Element, out First, out Rest>(first, rest)
        extends Object()
        satisfies Sequence<Element>
        given First satisfies Element
        given Rest satisfies Element[] {

    shared actual First first;
    shared actual Rest rest;

    // ...
}

So a tuple is just a sort of elaborate cons-list and result[2] is syntactic sugar for result.rest.rest.first, which has type Integer.
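To make that desugaring concrete, here is a small sketch in the same pre-release syntax used throughout this post. The variable names are purely illustrative, and it assumes that first and rest can be chained on a tuple exactly as the class definition above suggests:

```ceylon
// a tuple of type [Integer, String, Integer]
[Integer, String, Integer] result = [15, "Merry Christmas", 2];

// indexed access, as in the loop above
Integer viaIndex = result[2];

// the same element, reached by walking the cons-list by hand
Integer viaRest = result.rest.rest.first;

// each step along the way keeps its precise type
String second = result.rest.first;
[String, Integer] tail = result.rest;
```

Both viaIndex and viaRest refer to the third element, and neither needs a cast, because the static type of each first is tracked separately from the union element type.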

Similarly, the [Integer, String, Integer] type literal is syntactic sugar for:

Tuple<Integer|String, Integer, Tuple<Integer|String, String, Tuple<Integer, Integer, Empty>>>

And last, the [0, "foo", 2] value literal is syntactic sugar for:

Tuple(0, Tuple("foo", Tuple(2, empty)))

As you can see, our type inference rules are smart enough to infer the Integer|String element types on their own.

Function types

So tuples are nice, for example to return multiple values from a method, but that's not all you can do with them.
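As a quick illustration of the multiple-return-value case before moving on, here is a minimal sketch. The divide function is a made-up helper, not something from the language module:

```ceylon
// hypothetical helper: returns quotient and remainder together
[Integer, Integer] divide(Integer dividend, Integer divisor) {
    return [dividend / divisor, dividend % divisor];
}

[Integer, Integer] result = divide(17, 5);
Integer quotient = result[0];   // 3
Integer remainder = result[1];  // 2
```

The caller gets both values back with their individual types, without having to declare a one-off class just to carry them.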

Here's how our Callable type (the type of Ceylon functions and methods) is defined:

shared interface Callable<out Return, in Arguments> 
    given Arguments satisfies Void[] {}

As you can see, its Arguments type parameter accepts a sequence of anything (Void is the top of the object hierarchy in Ceylon, above Object and the type of null), which means we can use tuple types to describe the parameter lists:

void threeParameters(Integer a, String b, Float c){
}

Callable<Void, [Integer, String, Float]> threeParametersReference = threeParameters;

So, Callable<Void, [Integer, String, Float]> describes the type of a function which returns Void and takes three parameters of types Integer, String and Float.

But you may ask: what is the type of a function with defaulted parameters? Defaulted parameters are optional parameters that get a default value if you don't specify them:

void oneOrTwo(Integer a, Integer b = 2){
    print("a = " a ", b = " b ".");
}
oneOrTwo(1);
oneOrTwo(1, 1);

That would print:

a = 1, b = 2.
a = 1, b = 1.

So let's see what the type of oneOrTwo is:

Callable<Void, [Integer, Integer=]> oneOrTwoReference = oneOrTwo;
// magic: we can still invoke it with one or two arguments!
oneOrTwoReference(1);
oneOrTwoReference(1, 1);

So its type is Callable<Void, [Integer, Integer=]>. We're not sure yet about using the = sign in Integer= to mark a parameter as optional, so that may change. This is syntactic sugar for Callable<Void, [Integer] | [Integer, Integer]>, meaning: a function that takes one or two parameters.

Similarly, variadic functions also have a denotable type:

void zeroOrPlenty(Integer... args){
    for(i in args){
        print(i);
    }
}
Callable<Void, [Integer...]> zeroOrPlentyReference = zeroOrPlenty;
// magic: we can still invoke it with zero or more arguments!
zeroOrPlentyReference();
zeroOrPlentyReference(1);
zeroOrPlentyReference(5, 6);

Here, [Integer...] means that it accepts a sequence of zero or more Integer elements.

Now if you're not impressed yet, here's the thing that's really cool: thanks to our subtyping rules (nothing special here, just our normal subtyping rules related to union types, plus the fact that Callable is contravariant in its Arguments), we're able to figure out that a function that takes one or two parameters is a subtype of both functions that take one parameter and functions that take two parameters, so it can be used wherever either is expected:

Callable<Void, [Integer, Integer=]> oneOrTwoReference = oneOrTwo;

// we can call it with one parameter
Callable<Void, [Integer]> one = oneOrTwoReference;
// or two
Callable<Void, [Integer, Integer]> two = oneOrTwoReference;

And similarly for variadic functions: they are subtypes of functions that take any fixed number of parameters:

Callable<Void, [Integer...]> zeroOrPlentyReference = zeroOrPlenty;

// we can call it with no parameters
Callable<Void, []> zero = zeroOrPlentyReference;
// or one
Callable<Void, [Integer]> oneAgain = zeroOrPlentyReference;
// or two
Callable<Void, [Integer, Integer]> twoAgain = zeroOrPlentyReference;
// and even one OR two parameters!
Callable<Void, [Integer, Integer=]> oneOrTwoAgain = zeroOrPlentyReference;

Although this is something that dynamic languages have been able to do for decades, it is pretty impressive in a statically typed language. It gives you flexibility in how you use, declare and pass functions around, in a way that follows the rules of logic (and is checked!) and doesn't make you fight the type system.
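Here is a sketch of what that flexibility looks like when passing functions around, reusing oneOrTwo and zeroOrPlenty from the examples above. The callWithOne helper is hypothetical and only exists for this illustration:

```ceylon
// hypothetical helper: calls the given function with a single argument
void callWithOne(Callable<Void, [Integer]> f) {
    f(42);
}

// both are accepted, because each can be called with exactly one Integer
callWithOne(oneOrTwo);      // prints: a = 42, b = 2.
callWithOne(zeroOrPlenty);  // prints: 42
```

Neither call needs a wrapper function or a cast: the compiler checks that each argument can be invoked with a single Integer, and that is all callWithOne asks for.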

By the way

I've shown here that we support method references in Ceylon, but the same is true of constructors, because after all a constructor is nothing more than a function that takes parameters and returns a new instance:

class Foo(Integer x){
    shared actual String string = "Foo " x "";
}
print(Foo(2));

Callable<Foo, [Integer]> makeFoo = Foo;
print(makeFoo(3));

This will print:

Foo 2
Foo 3
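Since a class reference is just a function value, it can also be handed to anything that expects a function. The following is a small sketch, assuming map accepts a class reference the same way it accepted the anonymous function in the tuple example:

```ceylon
// Foo is passed where a function from Integer to Foo is expected
for (Foo foo in {1, 2, 3}.map(Foo)) {
    print(foo);
}
```

Under that assumption, this would print Foo 1, Foo 2 and Foo 3 on separate lines.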

Modules in Ceylon

Built-in support for modularity is a major goal of the Ceylon project, but what am I really talking about when I use this word? Well, I suppose there are multiple layers to this:

  1. Language-level support for a unit of visibility that is bigger than a package, but smaller than "all packages".
  2. A module descriptor format that expresses dependencies between specific versions of modules.
  3. A built-in module archive format and module repository layout that is understood by all tools written for the language, from the compiler, to the IDE, to the runtime.
  4. A runtime that features peer-to-peer classloading (one classloader per module) and the ability to manage multiple versions of the same module.
  5. An ecosystem of remote module repositories where folks can share code with others.

I'm not going to get into a whole lot of fine detail of this, partly because what I have written down in the language spec today will probably change by the time you actually get to use any of this stuff, but let me give you a taste of the overall architecture proposed.

Module-level visibility

A package in Ceylon may be shared or unshared. An unshared package (the default) is visible only to the module which contains the package. We can make the package shared by providing a package descriptor:

Package package { 
    name = 'org.hibernate.query'; 
    shared = true; 
    doc = "The typesafe query API."; 
}

(Note: The package descriptor syntax has since changed.)

(Alert readers will notice that this is just a snippet of Ceylon code, using the "declarative" object builder syntax.)

A shared package defines part of the "public" API of the module. Other modules can directly access shared declarations in a shared package.
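As a rough sketch of what that means for individual declarations, assume the org.hibernate.query package from the descriptor above contains the two classes below. Query and QueryPlanner are made-up names for illustration, not part of any real API:

```ceylon
// both declarations live in the shared package org.hibernate.query

// shared: part of the public API, accessible from other modules
shared class Query() {}

// no shared annotation: stays internal, invisible to other modules
class QueryPlanner() {}
```

So sharing the package is necessary but not sufficient: a declaration only becomes part of the module's public API when it is itself annotated shared.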

Module descriptors

A module must explicitly specify the other modules on which it depends. This is accomplished via a module descriptor:

Module module { 
    name = 'org.hibernate'; 
    version = '3.0.0.beta'; 
    doc = "The best-ever ORM solution!"; 
    license = 'http://www.gnu.org/licenses/lgpl.html'; 
    Import {
        name = 'ceylon.language'; 
        version = '1.0.1'; 
        export = true;
    }, 
    Import {
        name = 'java.sql'; 
        version = '4.0';
    }
}

(Note: The module descriptor syntax has since changed.)

A module may be runnable. A runnable module must specify a run() method in the module descriptor:

Module module { 
    name = 'org.hibernate.test'; 
    version = '3.0.0.beta'; 
    doc = "The test suite for Hibernate";
    license = 'http://www.gnu.org/licenses/lgpl.html'; 
    void run() {
        TestSuite().run();
    } 
    Import {
        name = 'org.hibernate'; version = '3.0.0.beta';
    }
}

(Note: The module descriptor syntax has since changed.)

Module archives and module repositories

A module archive packages together compiled .class files, package descriptors, and module descriptors into a Java-style jar archive with the extension .car. The Ceylon compiler doesn't usually produce individual .class files in a directory. Instead, it directly produces module archives.

Module archives live in module repositories. A module repository is a well-defined directory structure with a well-defined location for each module. A module repository may be either local (on the filesystem) or remote (on the Internet). Given a list of module repositories, the Ceylon compiler can automatically locate dependencies mentioned in the module descriptor of the module it is compiling. And when it finishes compiling the module, it puts the resulting module archive in the right place in a local module repository.

(The architecture also includes support for source directories, source archives, and module documentation directories, but I'm not going to cover all that today.)

Module runtime

Ceylon's module runtime is based on JBoss Modules, a technology that also exists at the very core of JBoss 7. Given a list of module repositories, the runtime automatically locates a module archive and its versioned dependencies in the repositories, even downloading module archives from remote repositories if necessary.

Normally, the Ceylon runtime is invoked by specifying the name of a runnable module at the command line.

Module repository ecosystem

One of the nice advantages of this architecture is that it's possible to run a module "straight off the internet", just by typing, for example:

ceylon run org.jboss.ceylon.demo -rep http://jboss.org/ceylon/modules

(Note: The command line tools have since changed, and the above command would now be ceylon run org.jboss.ceylon.demo --rep=http://jboss.org/ceylon/modules)

And all required dependencies get automatically downloaded as needed.

Red Hat will maintain a central public module repository where the community can contribute reusable modules. Of course, the module repository format will be an open standard, so any organization can maintain its own public module repository.