Blog of Gavin King

Destructuring considered harmful

A language is said to feature destructuring if it provides a syntax for quickly declaring multiple local variables and assigning their values from the attributes of some complex object. For example, in Ceylon, we let you write:

for (k->v in map) { ... }

This is a simple kind of destructuring where the key and item attributes of the map Entry are assigned to the locals k and v.

Let's see a couple more examples of destructuring, written in a hypothetical Ceylon-like language, before we get to the main point of this post.

A number of languages support a kind of parallel assignment syntax for destructuring tuples. In our hypothetical language, it might look like this:

String name, Value val = namedValues[i];

Some languages support a kind of destructuring that is so powerful that it's referred to as pattern matching. In our language we might support pattern matching in switch statements, using a syntax something like this:

Person|Org identity = getIdentityFromSomewhere();
switch (identity)
case (Person(name, age, ...)) {
    print("Person");
    print("Name: " + name);
    print("Age: " + age);
}
case (Org(legalName, ...)) {
    print("Organization");
    print("Name: " + legalName);
}

Now, I've always had a bit of a soft spot for destructuring—it's a minor convenience, but there are certainly cases (like iterating the entries of a map) where I think it improves the code. A future version of Ceylon might feature a lot more support for destructuring, but there are several reasons why I'm not especially enthusiastic about the idea. I'm going to describe just one of them.

Let's start with the "pattern matching" example above. And let's stipulate that I—perhaps more than most developers—rely almost completely on my IDE to write my code for me. I use Extract Value, Extract Function, Assign To Local, Rename, ⌘1, etc, in Ceylon IDE like it's a nervous tic. So of course the first thing I want to do when I see code like the above is to run Extract Function on the two branches, resulting in:

Person|Org identity = getIdentityFromSomewhere();
switch (identity)
case (Person(name, age, ...)) {
    printPerson(name, age);
}
case (Org(legalName, ...)) {
    printOrg(legalName);
}

...

void printPerson(String name, Integer age) {
    print("Person");
    print("Name: " + name);
    print("Age: " + age);
}

void printOrg(String legalName) {
    print("Organization");
    print("Name: " + legalName);
}

Oops. Immediately we have a problem. The schema of Person and Org is smeared out over the signatures of printPerson() and printOrg(). This makes the code much more vulnerable to changes to the schema of Person or Org, makes the code more vulnerable to changes to the internal implementation of these methods (if we want to also print the Person's address, we need to add a parameter), and it even makes the code less typesafe. The problem gets worse and worse as I recursively run Extract Value and Extract Function on the implementation of printPerson() and printOrg().

Now consider what we would get without the use of destructuring, as we would do in Ceylon today. We would have started with:

Person|Org identity = getIdentityFromSomewhere();
switch (identity)
case (is Person) {
    print("Person");
    print("Name: " + identity.name);
    print("Age: " + identity.age);
}
case (is Org) {
    print("Organization");
    print("Name: " + identity.legalName);
}

Whether this is better or worse than the code using pattern matching is somewhat in the eye of the beholder, but clearly it's not much worse and is arguably even a little cleaner. Now let's run Extract Function on it. We get:

Person|Org identity = getIdentityFromSomewhere();
switch (identity)
case (is Person) {
    printPerson(identity);
}
case (is Org) {
    printOrg(identity);
}

...

void printPerson(Person identity) {
    print("Person");
    print("Name: " + identity.name);
    print("Age: " + identity.age);
}

void printOrg(Org identity) {
    print("Organization");
    print("Name: " + identity.legalName);
}

I think it's very clear that this is a much better end result. And I hope it's also clear that this is in no way a contrived example. The arguments I'm making here scale to most uses of pattern matching. The problem here is that introducing local variables too "early" screws things up for refactoring tools.
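For readers without a Ceylon compiler at hand, here is a rough Python analogue of the two end results. It is only a sketch: the dataclasses, the function names, and the choice to return strings instead of printing are assumptions made for illustration, not code from Ceylon or its IDE.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Person:
    name: str
    age: int


@dataclass
class Org:
    legal_name: str


# Destructured style: the extracted helper's signature repeats Person's fields.
def describe_person_fields(name: str, age: int) -> str:
    return f"Person\nName: {name}\nAge: {age}"


# Whole-object style: the helper depends only on the type Person.
def describe_person(person: Person) -> str:
    return f"Person\nName: {person.name}\nAge: {person.age}"


def describe_org(org: Org) -> str:
    return f"Organization\nName: {org.legal_name}"


def describe(identity: Union[Person, Org]) -> str:
    # isinstance() plays the role of `case (is Person)` in the Ceylon code.
    if isinstance(identity, Person):
        return describe_person(identity)
    return describe_org(identity)
```

If Person later gains an address, only the body of describe_person() has to change. The field-by-field helper would need a new parameter, and so would every call site.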

Essentially the same argument applies to tuples: a tuple seems like a convenient thing to use when you "just" have a quick helper function that returns two values. But after a few iterations of Extract Function/Extract Value, you wind up with five functions with the tuple type (String, Value) smeared out all over the place, resulting in code that is significantly more brittle than it would have been with a NamedValue class.

I've repeatedly heard the complaint that "oh but sometimes it's just not worth writing a whole class to represent the return value of one function". I think this overlooks the effect of code growing and evolving and being refactored. And it also presupposes that writing a class is a pain, as it is in Java. But in Ceylon writing a class is easy. Indeed, it looks just like a function! Instead of this:

(String, Value) getNamedValue(String name) {
    return (name, findValueForName(name));
}

we can just write this:

class NamedValue(name) {
    shared String name;
    shared Value val = findValueForName(name);
}

No constructor, no getters/setters, and if this is a member of another class, you can just annotate it shared default, and it's even polymorphic, meaning that there is not even a need to write a factory method. And this solution comes with the huge advantage that the schema of a NamedValue is localized in just one place, and won't start to "smear out" as your codebase grows and evolves.
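The same trade-off can be sketched in Python, where a dataclass is nearly as cheap to write as a tuple-returning function. Everything here is illustrative: find_value_for_name() is a hypothetical stand-in for findValueForName(), and the lookup() class method is an assumption of this sketch (the Ceylon class above needs no factory).

```python
from dataclasses import dataclass


def find_value_for_name(name: str) -> int:
    # Hypothetical lookup standing in for findValueForName(); not from the post.
    return len(name)


# Tuple style: every caller must know that position 0 is the name and 1 the value.
def get_named_value(name: str) -> tuple:
    return (name, find_value_for_name(name))


# Class style: the schema lives in one place, and callers use named attributes.
@dataclass(frozen=True)
class NamedValue:
    name: str
    val: int

    @classmethod
    def lookup(cls, name: str) -> "NamedValue":
        return cls(name, find_value_for_name(name))
```

Code that receives a NamedValue refers to .name and .val, so adding a third attribute later does not disturb existing callers the way widening the tuple would.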

Tricks with iterable objects

Now that Ceylon has full support for mixin inheritance, we've been able to fill in all the interesting operations of the Iterable interface defined in ceylon.language. You can see the definition of Iterable here. (But be aware that, like the rest of the language module, the actual implementation is in Java and JavaScript. The Ceylon definition is never actually compiled.)

Mapping and filtering an Iterable object

Of course, Iterable has the famous functions map() and filter(). You can call them like this:

value filtered = "Hello World".filter((Character c) c.uppercase);
value mapped = "Hello World".map((Character c) c.uppercased);

(This works because a String is an Iterable<Character>.)

These operations each return Iterable—in this case, Iterable<Character>—so they don't actually allocate memory (except for a single object instantiation). If you want to actually get a new String, you need to call a function to do that:

print(string(filtered...)); //prints "HW"
print(string(mapped...)); //prints "HELLO WORLD"

(As an aside, we think this is the right approach. I understand that some folks think it's better that calling filter() on a String results in a String, or that filter()ing a Set results in a Set, but I think it's quite common that the resulting container type should not be the same as the original container type. For example, it's unclear that calling map() on a Set should result in a Set, and it's certainly not correct that map()ing a String results in another String, or that map()ing a Map always results in another Map.)

Now, map() and filter() have their uses, I suppose, but in fact they're not the usual way to do mapping and filtering in Ceylon. We would really write the above code like this:

print(string(for (c in "Hello World") if (c.uppercase) c)); //prints "HW"
print(string(for (c in "Hello World") c.uppercased)); //prints "HELLO WORLD"

Likewise, Iterable has the methods any() and every(), but it's still usually more convenient and idiomatic to use comprehensions:

value allLowercase = every(for (c in "Hello World") c.lowercase); //false
value someUppercase = any(for (c in "Hello World") c.uppercase); //true
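The lazy-stream-then-materialize pattern described above has a close analogue in Python, which may help if you want to experiment with the idea. This is a Python sketch, not Ceylon; the variable names are made up for illustration.

```python
text = "Hello World"

# filter() and map() return lazy iterators, so no new string is built yet.
filtered = filter(str.isupper, text)
mapped = map(str.upper, text)

# Building an actual string is a separate, explicit step.
filtered_string = "".join(filtered)  # "HW"
mapped_string = "".join(mapped)      # "HELLO WORLD"

# The comprehension (generator expression) forms of the same operations.
filtered_by_comprehension = "".join(c for c in text if c.isupper())
mapped_by_comprehension = "".join(c.upper() for c in text)

# every() and any() correspond to all() and any() over a generator.
all_lowercase = all(c.islower() for c in text)   # False
some_uppercase = any(c.isupper() for c in text)  # True
```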

More operations of Iterable

However, there are some really useful methods of Iterable. First, find() and findLast():

value char = "Hello World".find((Character c) c>`l`); //`o`

We can write this using a comprehension, but to be honest in this case it's slightly less ergonomic:

value char = first(for (c in "Hello World") if (c>`l`) c);

Next, sorted():

value sorted = "Hello World".sorted(byIncreasing((Character c) c.uppercased)); 
        //" deHllloorW"

Finally, fold():

value string = "Hello World".fold("",
        (String s, Character c)
            s.empty then c.string
                    else s + " " + c.string);
        //"H e l l o   W o r l d"

value list = "Hello World".fold({}, 
        (Character[] seq, Character c) 
            c.letter then append(seq,c) 
                     else seq); 
        //{ H, e, l, l, o, W, o, r, l, d }
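To make the semantics of find(), sorted(), and fold() concrete, here is a sketch of roughly equivalent Python. It is an analogue rather than a translation: next(), sorted(), and functools.reduce() are Python's own library functions, and the variable names are assumptions of this sketch.

```python
from functools import reduce

text = "Hello World"

# find(): the first element matching a predicate, or None if there is none.
first_after_l = next((c for c in text if c > "l"), None)  # "o"

# sorted(): order the characters by their uppercase form (the sort is stable).
by_uppercase = "".join(sorted(text, key=str.upper))  # " deHllloorW"

# fold(): thread an accumulator through the elements, left to right.
# Here the accumulator is a string that gains a separator before each new character.
spaced = reduce(lambda s, c: c if not s else s + " " + c, text, "")

# Here the accumulator is a list that only grows when the character is a letter.
letters = reduce(lambda seq, c: seq + [c] if c.isalpha() else seq, text, [])
```

The first fold gives the same result as " ".join(text), and the second yields the ten letters of the string in order.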

There are also two very useful attributes declared by Iterable. coalesced produces an iterable object containing the non-null elements:

value letters = { "Hello World".map((Character c) c.letter 
        then c.uppercased).coalesced... };
        //{ H, E, L, L, O, W, O, R, L, D }

The indexed attribute produces an iterable object containing the elements indexed by their position in the stream:

value entries = { "Hello World".indexed... }; 
        //{ 0->H, 1->e, 2->l, 3->l, 4->o, 5-> , 6->W, 7->o, 8->r, 9->l, 10->d }
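As a point of comparison, the two attributes can be approximated in Python as follows. This is only a sketch under the assumption that None plays the role of Ceylon's null and that (index, element) tuples stand in for Entry objects.

```python
text = "Hello World"

# coalesced: map non-letters to None, then drop the None elements.
maybe_letters = (c.upper() if c.isalpha() else None for c in text)
letters = [c for c in maybe_letters if c is not None]

# indexed: pair each element with its position in the stream.
entries = list(enumerate(text))
```

Here letters holds the ten uppercase letters, and entries starts with (0, "H") and ends with (10, "d").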

It's quite interesting to see the declaration of these operations. For example:

shared default Iterable<Element&Object> coalesced {
    return elements(for (e in this) if (exists e) e);
}

Notice how the use of the intersection type Element&Object eliminates the need for a type parameter with a lower bound type constraint, which would be required to write down the signature of this operation in most other languages. Indeed, we use the same trick for the operations union() and intersection() of Set, as you can see here.

Set union and intersection

We let you write the union of two Sets as s|t in Ceylon, and the intersection of two Sets as s&t. Now check this out:

Set<String> strings = ... ;
Set<Integer> ints = ... ;
value stringsOrInts = strings|ints; //type Set<String|Integer>
value stringsAndInts = strings&ints; //type Set<Bottom>, since Integer&String is an empty type

That is, the type of the union of a Set<X> with a Set<Y> is Set<X|Y> and the type of the intersection of a Set<X> with a Set<Y> is Set<X&Y>. Cool, huh?
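For contrast, here is how the same two operations look in Python with type hints. It is a sketch with made-up sample values. The union annotation is written out by hand, and the intersection cannot be given a precise static type at all, because Python's typing module has no intersection type.

```python
from typing import Set, Union

strings: Set[str] = {"a", "b"}
ints: Set[int] = {1, 2}

# The element type of a union is the union of the element types.
# The annotation is spelled out by hand here; Ceylon infers Set<String|Integer>.
strings_or_ints: Set[Union[str, int]] = strings | ints

# No value is both a str and an int, so this intersection is always empty.
# Ceylon records that fact in the type; in this sketch it is only known at runtime.
strings_and_ints = strings & ints
```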

By the way, we just added similar methods withLeading() and withTrailing() to List:

value floatsAndInts = { 1, 2, 3.0 }; //type Sequence<Integer|Float>
value stuff = floatsAndInts.withTrailing("hello", "world"); //type Sequence<Integer|Float|String>

(These operations didn't make it into M3.1.)

min() and max()

Well, here's something else that's cool. Ceylon distinguishes between empty and nonempty sequences within the type system. An empty sequence is of type Empty, a List<Bottom>. A non-empty sequence is a Sequence<Element>, a List<Element>. (There's even a convenient if (nonempty) construct for narrowing a sequence to a nonempty sequence without too much ceremony.)

This lets us do something pretty cool with the signature of min() and max().

value nothingToMax = max({}); //type Nothing
value somethingToMax = max({0.0, 1.0, -1.0}); //type Float
List<Character> chars = "hello";
value maybeSomethingToMax = max(chars); //type Character?

That is, according to its signature, the max() function returns:

  • null when passed an empty sequence,
  • a non-null value when passed a nonempty sequence, or
  • a possibly-null value when passed an iterable object that we don't know is empty or nonempty.

It's important to point out that this is not some special feature built in as a special case in the type system. Nor is it in any way related to overloading. Indeed, until a couple of weeks ago I didn't even realize that this was possible. It's rather a demonstration of how expressive our type system is: in this case, the combination of intersection types, union types, and principal instantiation inheritance of generic types lets us express something within the type system that might seem almost magical. The real magic is in the declaration of the language module type ContainerWithFirstElement.
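To see why this is notable, consider what a single signature looks like when emptiness is not tracked by the type system. The Python sketch below is a contrast case, not a model of Ceylon's max(): max_or_none() is a hypothetical helper, and its one signature has to promise a possibly-null result for every input, including sequences that are obviously nonempty.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def max_or_none(values: Iterable[T]) -> Optional[T]:
    # default=None makes an empty input yield None instead of raising ValueError.
    return max(values, default=None)
```

A caller of max_or_none([0.0, 1.0, -1.0]) still has to handle None as far as the declared type is concerned, which is exactly the ceremony the Ceylon signature avoids.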

Ceylon and Ceylon IDE updated to M3.1

Ceylon M3.1 is now available for download, along with a simultaneous compatible release of Ceylon IDE. This release fixes several bugs in the recent M3 release of Ceylon, and introduces several enhancements.

To the Ceylon language module:

  • a whole suite of new operations for working with iterable objects and collections.

And to Ceylon IDE:

  • major improvements to the builder in Ceylon IDE, incorporating performance enhancements and better support for multi-project builds and for intercompilation of Ceylon with Java, and
  • cross-project refactoring.

Ceylon M3.1 is otherwise compatible with M3.

You can download the Ceylon command line distribution here:

http://ceylon-lang.org/download

Or you can install Ceylon IDE from Eclipse Marketplace or from our Eclipse update site.

Ceylon M3.1 and Ceylon IDE M3.1 require Java 7.

Source code

The source code for Ceylon, its specification, and its website, is freely available from GitHub:

https://github.com/ceylon

Issues

Bugs and suggestions may be reported in GitHub's issue tracker.

Community

The Ceylon community site includes documentation, the current draft of the language specification, the roadmap, and information about getting involved.

http://ceylon-lang.org

Acknowledgement

We're deeply indebted to the community volunteers who contributed a substantial part of the current Ceylon codebase, working in their own spare time. The following people have contributed to this release:

Gavin King, Stéphane Épardaud, Tako Schotanus, Emmanuel Bernard, Tom Bentley, Aleš Justin, David Festal, Flavio Oliveri, Max Rydahl Andersen, Mladen Turk, James Cobb, Tomáš Hradec, Michael Brackx, Ross Tate, Ivo Kasiuk, Enrique Zamudio, Julien Ponge, Julien Viet, Pete Muir, Nicolas Leroux, Brett Cannon, Geoffrey De Smet, Guillaume Lours, Gunnar Morling, Jeff Parsons, Jesse Sightler, Oleg Kulikov, Raimund Klein, Sergej Koščejev, Sjur Bakka, Chris Marshall, Simon Thum, Maia Kozheva, Shelby.

Ceylon M3 and Ceylon IDE M3 released!

Ceylon M3 "V2000" is now available for download, along with a simultaneous compatible release of Ceylon IDE. The compiler now implements almost all of the language specification, and Ceylon now fully supports both Java and JavaScript virtual machines as execution environments. The first three Ceylon platform modules are available in Ceylon Herd, the community module repository.

You can download the Ceylon command line distribution here:

http://ceylon-lang.org/download

Or you can install Ceylon IDE from Eclipse Marketplace or from our Eclipse update site.

Ceylon M3 and Ceylon IDE M3 require Java 7.

The Ceylon team hopes to release Ceylon 1.0 beta in September or October.

Language features

M3 is an almost-complete implementation of the Ceylon language, including the following new features compared to M2:

The following language features are not yet supported in M3:

  • member class refinement and type families
  • type aliases
  • reified generics
  • user-defined annotations, interceptors, and the type safe metamodel
  • serialization

This page provides a quick introduction to the language. The draft language specification is the complete definition.

Ceylon IDE

Ceylon IDE is a complete development environment for Ceylon based on the Eclipse platform. This release of Ceylon IDE introduces:

  • support for interoperation with Java,
  • integration with Ceylon Herd,
  • many new Quick Fixes and Quick Assists, and
  • many, many bugfixes.

Ceylon IDE now automatically fetches module archives from Ceylon Herd to satisfy dependencies declared in the module descriptor.

It's now possible to write Ceylon code that calls a Java binary, navigate to its attached source code, autocomplete its declarations, hover to view its JavaDoc, etc. It's even possible to have a project that mixes Ceylon code with Java code.

Ceylon IDE M3 requires Java 7. Users of Ceylon IDE on Mac OS should install Eclipse Juno. Users on other platforms may run Ceylon IDE in either Eclipse Indigo or Eclipse Juno on Java 7. Ceylon IDE will not work if Eclipse is run on Java 6.

Support for JavaScript virtual machines and Node.js

The Ceylon command line compiler now integrates support for compilation to JavaScript. All supported language features, and all features of the Ceylon language module, are supported on JavaScript.

Ceylon programs compiled to JavaScript execute on standard JavaScript virtual machines. The Ceylon command line distribution includes a launcher for running Ceylon programs on Node.js.

Interoperation with Java

Interoperation with Java code is now robust and well-tested. This release fixes a number of bugs and corner cases that affected Java interoperation in the previous release. Ceylon now requires Java 7.

Platform modules

The following platform modules are now available in Ceylon Herd:

  • ceylon.math provides arbitrary precision numeric types and numeric functions
  • ceylon.file defines an API for interacting with hierarchical filesystems
  • ceylon.process defines an API for starting native child processes.

The language module, ceylon.language, is included in the distribution.

Modularity and runtime

The toolset and runtime for Ceylon are based around .car module archives and module repositories. The runtime supports a modular, peer-to-peer class loading architecture, with full support for module versioning and multiple repositories, including support for local and remote module repositories, using the local file system, HTTP, WebDAV, or even Maven repositories for interoperation with Java.

The shared community repository, Ceylon Herd, is now online:

https://herd.ceylon-lang.org

Acknowledgement

We're deeply indebted to the community volunteers who contributed a substantial part of the current Ceylon codebase, working in their own spare time. The following people have contributed to this release:

Gavin King, Stéphane Épardaud, Tako Schotanus, Emmanuel Bernard, Tom Bentley, Aleš Justin, David Festal, Flavio Oliveri, Max Rydahl Andersen, Mladen Turk, James Cobb, Tomáš Hradec, Michael Brackx, Ross Tate, Ivo Kasiuk, Enrique Zamudio, Julien Ponge, Julien Viet, Pete Muir, Nicolas Leroux, Brett Cannon, Geoffrey De Smet, Guillaume Lours, Gunnar Morling, Jeff Parsons, Jesse Sightler, Oleg Kulikov, Raimund Klein, Sergej Koščejev, Chris Marshall, Simon Thum, Maia Kozheva, Shelby.

M3 status update

So it's now just over a year since my first presentation about Ceylon in Beijing. Since then, we've put together a great team, a website, the community module repository, and a full-featured IDE. And it looks like we've now implemented pretty much all the language features that I talked about in Beijing, and much more besides, for both the JVM-based backend and the JavaScript backend. Of course, while all this was happening, the language itself was evolving and the specification maturing. Phew, that sounds like a lot in just a year!

I've spent the last three days with Stef and Emmanuel in Paris, discussing a bunch of technical problems, and planning out the release of Ceylon M3. The major new features of this release are:

  • integration of the JavaScript backend into the distribution,
  • completion of support for higher-order functions including curried functions, anonymous functions, inline functions in named argument lists, and indirect invocations,
  • concrete interface members, and
  • comprehensions.

The M3 release is now planned for the second week of June.

We're also now turning our thoughts to the Ceylon SDK. By the time M3 is ready, or soon after, we'll have preview releases of several SDK modules available in Ceylon Herd, including ceylon.math, ceylon.net, and ceylon.fs. Of course, the SDK will go through a lot of growth and evolution over the coming year.

Meanwhile, we've started work on integrating Ceylon with Red Hat's open source cloud platform.

The bad news is we've decided to cancel the promised M2 release of Ceylon IDE. Sorry. The good news? We plan to release an M3-compatible IDE in June. The focus of this release will be Java interop and integration with Ceylon Herd, including:

  • automatic fetching of modules from Ceylon Herd in order to satisfy dependencies declared in the module descriptor,
  • the ability to call Java binaries from Ceylon, navigate to their attached source code, autocomplete their declarations, etc, and
  • the ability to inter-compile Ceylon with Java, even in the same Eclipse project!

There are also some new quickfixes and autocompletions, a Create Subtype wizard, and my favorite trick, the Move to New Unit refactoring.

It would be nice to have some support for compiling for and launching on Node.js in the M3 release of the IDE, but I can't promise that one.

Now that we've got so much to demo and talk about, we're trying to do more events. There are several talks coming up in June.