Blog of Stéphane Épardaud

Interview of Stéphane Épardaud on Ceylon at Devoxx France

Last week Emmanuel and I gave a presentation at Devoxx France in Paris, and I had the opportunity to talk about Ceylon in an interview that runs for almost 13 minutes.

The Ceylon Herd has been unleashed

If there's one thing the whole Ceylon team has in common it's that we're pragmatists. We have a ton of ideas of how we can make the programmer's life better with Ceylon, and we try to do it whenever we can. In fact, we have so many ideas that it's hard to bring them all to fruition. With people such as Gavin and me on the team, being very involved in Web Frameworks, I can tell you that among the many things we want to do with Ceylon, the only thing holding us back from writing an excellent Web Framework for Ceylon (or adapting one) is that we simply have more urgent things to fix right now.

Why we needed a new module repository

One of those urgent things to fix, we quickly realised, was to have a better, leaner, nicer, friendlier module repository for Ceylon modules. You probably know by now that all the Ceylon tools can already talk to module repositories, local or remote. By talk I mean, use and publish to.

We added support for WebDAV from the start, and then we tried it and found that it was actually hard for people to set up WebDAV on their server to set up their own repo. Because that's the thing with open-source: we're not trying to solve the repo problem just for us, but also for all our users. We want them to be setting up repos as easily as we do.

We went and looked at the existing module repositories like Perl's CPAN, Ruby's Gems, Maven's Nexus, or Play's modules. We liked what we saw in some of them, but what we noticed was that none of them had all the features we wanted for a great repository experience. Probably the sum of them all would be the Best Repo In The World, but truth is, it'd probably be a spaghetti monster too.

We wanted something that looks nice for consumers, with a friendly, clear and intuitive interface, but at the same time we wanted to make it just as easy for producers, with a clear path to publish modules, and a collaborative aspect, such as the one found in GitHub that we like so much.

Ceylon Herd

So we spent a couple of days banging code together and came up with the following quite impressive list of features for consumers:

  • Browse repository as file system
  • Browse module list
  • Search for modules
  • Browse module documentation
  • View module information, such as dependencies, docs, source
  • Links from modules to file system view and back
  • Feeds to catch up with new modules, new module versions or modules published by user
  • View user activity (list of published modules)

And for producers:

  • Public registration
  • Integrated interactive project claim (more on that later)
  • Upload area: to create private staging repos where you upload your modules
  • Uploaded modules are checked by our system to help you:
    • Module paths
    • Module ownership
    • Duplicate module detection
    • SHA1 Signatures verification
    • API docs presence verification
    • Source code presence verification
    • Dependencies check
    • Many other checks
  • You can upload multiple modules at once
  • Upload using the Ceylon tools or the website, individual artifacts or with a zipped repo
  • Publish your modules once all verifications pass
  • Edit your module information
  • Grant publishing permissions to your colleagues
  • Transfer module ownership

You're not dreaming, we did all of this, and a lot more under the hood as well as in the UI, to make sure newcomers are not overwhelmed or abandoned, to make sure everyone feels right at home in Ceylon Herd, our open-source Ceylon module repository.

You heard that right: today we unveil our official module repository, called Ceylon Herd, which is where we are (we as in me, you, everyone) going to publish our Ceylon modules for all to use. And because we're open-source guys, and we don't want to lock people into a fenced-wall service, we're making Ceylon Herd available as Free Software, so that everyone can not only contribute to make it better, but use it privately or publicly.

Our version of Ceylon Herd, running at modules.ceylon-lang.org, will be the official place to get official Ceylon modules, as well as the central place to get third-party modules, as long as they are open-source. We're still working out the details of the hosting policy, so we've disabled registration for now, which means that for you guys it is purely read-only until we open it up completely, but rest assured that will happen as soon as we can, so you can start sharing too.

Meanwhile you can start consuming the few modules we put there to get you started.

Who can publish what?

Because we plan to use Ceylon Herd as the official Ceylon module repository, we need to make sure that the modules published there are legit and functional. If John Doe can come in and start publishing modules he doesn't own, or participate in, or represent, then that's just bad for our repository. John Doe is free to use another Ceylon Herd instance, but we're going to be careful who we let publish in modules.ceylon-lang.org.

So our solution is called project claims: when you have registered on Ceylon Herd, you can claim a project (essentially a module name), and you explain its license, point to its home page, who you are and why we need it in Ceylon Herd. We're immediately notified and start checking up on the project, verify that you are who you say you are, and that you should indeed be allowed to publish on behalf of that project.

This verification is an interactive process, we may ask you questions, via comments on the claim, which you get notified about and can answer. As soon as we've made up our mind, we'll either confirm your claim, or reject it. If it is rejected, don't be afraid to re-claim it if you feel we were wrong, we can discuss it again, especially if you have new and good evidence. We're only trying to help authenticate module publishers, not control what goes in or not (though it has to be open-source).

Once your claim is verified, you are free to publish as many versions of your module as you want, as well as delegate publishing of that module to other Ceylon Herd users (your colleagues, project buddies, spouse or kids for the luckier). At any moment you can transfer project ownership to another Ceylon Herd user.

If you think this is not good enough, please let us know.

OK, enough with the pep talk, how do I get started?

Ceylon Herd is available here, we can help you get started, and you can even download its source code to run it where you want and improve it. Use, follow, share, contribute, have fun!

Second official release of Ceylon

Today, we're proud to announce the release of Ceylon M2 "Minitel". This is the second official release of the Ceylon command line compiler, documentation compiler, language module, and runtime, and a major step down the roadmap toward Ceylon 1.0, with most of the Java interoperability fully specified and implemented.

You can get it here:

http://ceylon-lang.org/download

We plan a compatible M2 release of Ceylon IDE later this week.

Language features

In terms of the language itself, M2 has essentially all the features of Java except enumerated types, user-defined annotations, and reflection. It even incorporates a number of improvements over Java, including:

  • JVM-level primitive types are ordinary classes in Ceylon
  • type inference and type argument inference based on analysis of principal types
  • streamlined class definitions via elimination of getters, setters, and constructors
  • optional parameters with default values
  • named arguments and the "object builder" syntax
  • intersection types, union types, and the bottom type
  • static typing of the null value and empty sequences
  • declaration-site covariance and contravariance instead of wildcard types
  • more elegant syntax for type constraints
  • top-level function and value declarations instead of static members
  • nested functions
  • richer set of operators
  • more elegant syntax for annotations
  • immutability by default
  • first-class and higher-order functions except anonymous functions
  • method and attribute specifiers
  • algebraic types, enumerated types, and switch/case

Support for the following language features is not yet available:

  • anonymous functions
  • multiple parameter lists
  • comprehensions
  • mixin inheritance
  • member class refinement
  • reified generics
  • user-defined annotations and the type safe metamodel

This page provides a quick introduction to the language. The draft language specification is the complete definition.

Java interoperability

There were many improvements to Java interoperability in this release, which makes it very easy to call Java from Ceylon.

Most of the interoperability issues with Java have been fixed, and there are very few remaining issues that we will fix for the next release, though they only concern corner-cases that we don't expect users to meet.

Performance

We spent a lot of time improving performance for this release, in particular arithmetic operator performance, but we still have a lot of areas we expect to improve for the next release, such as the speed of interoperability with Java arrays and improvements on boxing.

Modularity and runtime

Ceylon modules may be executed on any standard Java 6 compatible JVM. The toolset and runtime for Ceylon is based around .car module archives and module repositories. The runtime supports a modular, peer-to-peer class loading architecture, with full support for module versioning and multiple repositories.

This release of Ceylon includes support for local and remote module repositories, using the local file system, HTTP or WebDAV. In order to make it easy to use Java libraries from Ceylon you can even use Maven repositories.

Support for the shared community repository modules.ceylon-lang.org will be available in the next release.

Chapter 7 of the language specification contains much more information about the Ceylon module system and command line tools.

SDK

At this time, the only module available is the language module ceylon.language, included in the distribution.

Source code

The source code for Ceylon, its specification, and its website, is freely available from GitHub:

https://github.com/ceylon

Issues

Bugs and suggestions may be reported in GitHub's issue tracker.

Community

The Ceylon community site includes documentation, the current draft of the language specification, the roadmap and information about getting involved.

http://ceylon-lang.org

Acknowledgement

We're deeply indebted to the community volunteers who contributed a substantial part of the current Ceylon codebase, working in their own spare time. The following people have contributed to this release:

Gavin King, Stéphane Épardaud, Tako Schotanus, Emmanuel Bernard, Tom Bentley, Aleš Justin, David Festal, Flavio Oliveri, Max Rydahl Andersen, Mladen Turk, James Cobb, Tomáš Hradec, Michael Brackx, Ross Tate, Ivo Kasiuk, Enrique Zamudio, Julien Ponge, Julien Viet, Pete Muir, Nicolas Leroux.

How we test Ceylon

Writing a new language is a lot about the ideas that go into the language, and then a lot about implementation.

In the case of Ceylon, a lot of effort went into thinking, discussions, negotiations, explanations, and eventually experimentation and documentation, in the form of a specification. Then there's work to be done on the ANTLR grammar, the Abstract Syntax Tree (AST) that the parser/lexer gives us, and then the typechecking phase until finally we get to the backend (backends, in our case: Java bytecode and JavaScript).

But we're not stopping at the language, we're also developing a whole SDK, with an API, tools (ceylond, ceylonc, ceylon), a module system, and an IDE.

Naturally, while we're in the process of implementing all this stuff, we spend a fair amount of time discovering new problems, new solutions, and refactoring a lot of code to make it fitter. The number of interdependent pieces in all this — and the expectation of high-quality — is such that we have to have a proper test suite for these things and, rest assured: we do. If you've ever wondered how new languages are tested, read on to see how we test the Ceylon implementation.

Type-checker and parser tests

The first tests in the tool chain concern the very early phase of compiling Ceylon: checking that good code compiles, bad code does not compile, and that the type system reasons about types the way we expect it to. We currently have 1251 such checks, quite smartly done using compiler annotations such as @error to mark an AST node where we expect an error, and @type["Foo"] to make sure the AST node is of type Foo.

Here is a brief example where we make sure we can't return a value from a void method:

void cannotReturnValue() {
    if ("Hello"=="Goodbye") {
        @error return var;
    }
}

And another example where we make sure that the type inference is correct:

@type["Sequence<String>|Sequence<Integer>"] value ut = f({ "aaa" },{ 1 });

The tests will walk the AST and when they see one of these annotations, they check that an error is reported on the node, or that the node type is as we expected.
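As a rough illustration of that walk (not the actual Ceylon test harness), here is a minimal Java sketch. The Node and AnnotationChecker classes and all their fields are hypothetical stand-ins for the type-checker's AST; only the idea of checking @error and @type expectations comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker: walks a tree and verifies @error / @type expectations.
class AnnotationChecker {
    // Returns one message per expectation that was not met.
    static List<String> check(Node root) {
        List<String> failures = new ArrayList<>();
        visit(root, failures);
        return failures;
    }

    private static void visit(Node node, List<String> failures) {
        // @error: an error must have been reported on this node
        if (node.expectError && !node.hasError) {
            failures.add("expected an error at: " + node.text);
        }
        // @type["..."]: the inferred type must match the expected one
        if (node.expectedType != null && !node.expectedType.equals(node.inferredType)) {
            failures.add("expected type " + node.expectedType
                    + " but found " + node.inferredType + " at: " + node.text);
        }
        for (Node child : node.children) {
            visit(child, failures);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("void cannotReturnValue()", null, false, false, null)
                .add(new Node("return var", null, true, true, null))
                .add(new Node("value ut", "Sequence<String>", false, false, "Sequence<Integer>"));
        for (String failure : check(root)) {
            System.out.println(failure);
        }
    }
}

// Simplified stand-in for an AST node; the real Ceylon AST is much richer.
class Node {
    final String text;          // source snippet, used in messages only
    final String inferredType;  // type computed by the type-checker, or null
    final boolean hasError;     // true if the type-checker reported an error here
    final boolean expectError;  // true if the node carries @error
    final String expectedType;  // value of @type["..."], or null
    final List<Node> children = new ArrayList<>();

    Node(String text, String inferredType, boolean hasError,
         boolean expectError, String expectedType) {
        this.text = text;
        this.inferredType = inferredType;
        this.hasError = hasError;
        this.expectError = expectError;
        this.expectedType = expectedType;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }
}
```

In this sketch the first child satisfies its @error expectation, while the second is reported because its inferred type differs from the expected one.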

Java bytecode generation tests

In the Java bytecode compiler backend, we generate Java bytecode, but as previously described we are plugged into the javac compiler, and we feed it a (pseudo-) Java AST. This turns out to be very useful again, since it's much easier (for humans) to compare Java source code than bytecode. So we have tests that compile a small Ceylon file, typically one per feature, and then compare the generated Java code with the Java code we expect. For example, to make sure we can create a class in Ceylon, we write this test:

class Klass() {}

And we see if it generates the following Java code:

package com.redhat.ceylon.compiler.java.test.structure.klass;

class Klass {
    Klass() {
    }
}

We currently have around 250 tests like these. This might seem a small number, but it's the number of files we're comparing, so for example when we test numeric operation optimisation, we have one test with 350 lines of Ceylon numeric operator tests. We have 100% coverage of each feature promised on the roadmap.
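The comparison step can be pictured with a small Java sketch. This is only an assumption about how such a check could look: the SourceComparison class and its whitespace normalisation are hypothetical, and the real test suite may compare the sources differently.

```java
// Hypothetical helper: compares generated Java source against an expected source.
class SourceComparison {
    // Collapses runs of whitespace so that indentation and blank lines do not matter.
    // Note: this naive approach would also alter whitespace inside string literals.
    static String normalize(String source) {
        return source.trim().replaceAll("\\s+", " ");
    }

    static boolean sameSource(String expected, String generated) {
        return normalize(expected).equals(normalize(generated));
    }

    public static void main(String[] args) {
        String expected = "class Klass {\n    Klass() {\n    }\n}\n";
        String generated = "class Klass {\n  Klass() {\n  }\n}";
        System.out.println(sameSource(expected, generated));
    }
}
```

Comparing source text rather than bytecode keeps test failures readable: a diff of two Java files shows exactly which construct changed.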

Of course, occasionally we have bugs for things that are corner cases, so for each bug reported we have one such test as well.

Model loader tests

The compiler is not only generating bytecode, though. It's also loading bytecode. By the truck-load. Since it's an incremental compiler, it needs to be able to load compiled Ceylon bytecode and map it into a model that the type-checker can work with. The piece that loads bytecode (using three different reflection libraries: Javac, Java Reflection and Eclipse JDT) is called the model loader.

Those get their own tests. We write two files. The first includes code we will compile:

shared class Klass() {
}

And the second will reference declarations from the first file:

Klass klass = Klass();

But because we compile the second one on its own, it will load the model for the first file from the compiled bytecode (incremental compilation). The test will then compare the model representation for Klass that we got during the first compilation (when we were compiling Klass) with the model representation we loaded for Klass during the second compilation. The first model will come directly from the typechecker after parsing the AST, while the second will come from the model loader. By walking both recursively and checking that they are completely equivalent, we ensure that we produce the right model when loading it from bytecode.
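The recursive equivalence check can be sketched in Java as follows. The Declaration class here is a hypothetical, heavily reduced model object (kind, name, type, shared flag, members), and comparing members in order is an assumption made to keep the sketch short; the real model has many more properties.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: checks that two declaration models are completely equivalent.
class ModelComparison {
    static boolean equivalent(Declaration a, Declaration b) {
        // Compare the properties of the declaration itself
        if (!a.kind.equals(b.kind) || !a.name.equals(b.name)
                || !a.type.equals(b.type) || a.shared != b.shared) {
            return false;
        }
        // Then recurse into the members (assumed to be in the same order)
        if (a.members.size() != b.members.size()) {
            return false;
        }
        for (int i = 0; i < a.members.size(); i++) {
            if (!equivalent(a.members.get(i), b.members.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Model as produced by the typechecker during the first compilation
        Declaration fromTypechecker = new Declaration("class", "Klass", "Klass", true)
                .add(new Declaration("method", "m", "String", true));
        // Model as loaded back from bytecode during the second compilation
        Declaration fromBytecode = new Declaration("class", "Klass", "Klass", true)
                .add(new Declaration("method", "m", "String", true));
        System.out.println(equivalent(fromTypechecker, fromBytecode));
    }
}

// Reduced stand-in for a model object.
class Declaration {
    final String kind;    // "class", "interface", "method" or "attribute"
    final String name;
    final String type;    // declared type, as text
    final boolean shared;
    final List<Declaration> members = new ArrayList<>();

    Declaration(String kind, String name, String type, boolean shared) {
        this.kind = kind;
        this.name = name;
        this.type = type;
        this.shared = shared;
    }

    Declaration add(Declaration member) {
        members.add(member);
        return this;
    }
}
```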

We currently have 11 such tests, but once again, do not be fooled by the low number: there's only a certain number of things we can test here, namely class, interface, method and attribute declarations and their signatures. We're not testing things like statements or expressions, only signatures of declarations. And for each declaration we have recursive tests that check every property of the model object (of which there are many).

Incidentally, we don't just load Ceylon bytecode, we also load Java bytecode, for interoperability with Java classes, and since the ceylon.language module is currently hand-written in Java, none of the Ceylon code could be compiled if the model loader wasn't able to load a model from its bytecode. So it gets a lot of testing everywhere.

Interoperability tests

As I mentioned previously, Ceylon is fully interoperable with Java, so we have tests that make sure that we can import Java modules, packages, types, call their methods, read their fields, implement Java interfaces, etc…

We currently have 10 tests for this, each including all the variations on a theme (static methods, fields, constructors…).

Recovery tests

We have a few tests to make sure the compiler backend doesn't crash on unparseable input, or on Ceylon code that is improperly typed.

Module system

We have 34 tests that make sure that the compiler produces .car files, source archives and MD5 checksums in the right place, that it can save and find them back locally, or via HTTP, that we can load .jar modules, and that they contain what is expected. We also test that we can resolve dependency graphs, cache HTTP files and check the MD5 checksums.
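For readers unfamiliar with checksum verification, here is a minimal Java sketch using the standard java.security.MessageDigest API. The Md5Check class is hypothetical, and it assumes the checksum is stored as a hexadecimal string; the exact format of the checksum files written by the Ceylon tools is not described here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: verifies an artifact against a hex-encoded MD5 checksum.
class Md5Check {
    // Computes the MD5 digest of the data as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Assumes the checksum text is a hex string, possibly surrounded by whitespace.
    static boolean matches(byte[] artifact, String checksumText) {
        return md5Hex(artifact).equalsIgnoreCase(checksumText.trim());
    }

    public static void main(String[] args) {
        byte[] artifact = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(artifact, "900150983cd24fb0d6963f7d28e17f72\n"));
    }
}
```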

Runtime behaviour tests

However beautiful the strategy, you should occasionally look at the results — Sir Winston Churchill

We have about 20 tests that make sure that we can invoke the Ceylon compiler, on any number of files, including incrementally. We test that we can run Ceylon programs. We also test that the runtime behaviour of statements is correct (it's not enough to check that the for loop is compiled to a certain Java bytecode, we want to make sure it runs as we think it should).

Tool tests

We have a few tests that check that ceylond (our documentation generator) is able to produce documentation, and that it produces it correctly, while using the model loader (using Java Reflection, unlike the compiler which uses Javac Reflection).

We have a few manual tests (we need to automate that) to check that our ant tasks run as expected.

ceylon.language tests

Last but not least, we have 633 runtime tests written in Ceylon that check that the ceylon.language API behaves as expected, which is the ultimate test, since it effectively requires all the pieces previously described to work in order to do anything.

We even have one test which attempts to compile the Ceylon reference implementation of ceylon.language, which the typechecker handles, but the backend doesn't yet. Once that one passes, we'll be ready for Ceylon 1.0.

About speed

I've recently had a discussion about the speed of test execution with the Ceylon team, and was shocked to discover that while I was complaining that the Java backend compiler was taking 40 seconds to run, some of my colleagues had to wait more than 2 minutes for the same tests!

We're now looking at running some of those tests in parallel to speed things up on multi-core systems, but unfortunately it doesn't look like we'll be able to run them in parallel using Eclipse. We should be able to do it using Ant, though.

On breaking tests

We break tests all the time. Most of the time when we fix a bug or add a feature we break a given number of tests. Sometimes we fix a bigger bug and break most of the tests. This is great: we can find the cause of the breakage straight away thanks to all these tests. Sometimes though, it takes a little bit of work to fix the tests we broke :)

Conclusion

We've an awful lot of tests, and we test (almost) everything. Those that are missing will get automated as soon as possible so we can forget about them, because that's what tests are for: when we fix a bug or add a feature, we know it works, and we know we didn't break anything. This is both priceless and fundamental for a new language.

Note

Since this post was originally written:

  • ceylonc has become ceylon compile.
  • ceylond has become ceylon doc.

Let it work

Hi, my name is Stéphane Épardaud and I'll be your technical writer today :)

I want to talk a bit about some of the challenges we faced in the Ceylon compiler, and the solutions we found. As is described in the compiler architecture page, the backend of the Ceylon compiler extends OpenJDK's Javac compiler by translating Ceylon source code into Javac AST, which is then compiled into bytecode by Javac. Some of the reasons why we went this route of extending Javac rather than creating our own compiler from scratch are that:

  • We are guaranteed to generate valid bytecode, because it has to be valid Java code, since it's checked by Javac.
  • We can compile Java and Ceylon code at the same time, without needing to write a Java parser and compiler. (Well this is not technically true in M1, but it will definitely be possible).

But there are things we can't do properly in Java, and here I'm going to give you an example where we scratched our heads in trying to find a proper mapping.

Attributes instead of fields

In Ceylon, we don't have Java fields, we have attributes, which are similar to JavaBean's properties. This means that Ceylon attributes are translated to JavaBean getters and setters. And for interoperability we map JavaBean properties to Ceylon attributes. Now the biggest challenge with using JavaBean getter and setter methods in place of fields is that we want attributes to support the same operations you can do on Java fields, such as the ++ operation. How do we map this:

class Counter() {
    Natural n = 0;
}
Counter c = Counter();
Natural n = c.n++;

Into working Java code which looks like this (optimised for long because otherwise ++ is polymorphic):

class Counter{
    long n;
    long getN(){
        return n;
    }
    void setN(long n){
        this.n = n;
    }
}
Counter c = new Counter();
long n = c.getN()++;

Wait a minute: this is not valid!

So the problem is that there are a lot of operations you can do on l-values, that is, variables which can be assigned. To summarize the difference between l-values and r-values, the following mnemonic helps: an l-value is something which can be assigned and read, it can appear as the left side of an assignment, while an r-value is an expression that can only be read and not assigned. In our example, c.n is an l-value while 2 + 2 would be an r-value.

So we expect to be able to do every assignment operation on l-values, such as :=, += and ++. The problem we face is that in Java, c.n is an l-value but when using getters, c.getN() is not: it's an r-value, you can't assign to it, you can't do ++ on it. For that you need to use the setter. Now the thing is that setters in JavaBean return void, so they're not expressions, or even an l-value: they're statements. And we can't put statements inside expressions. For instance we can't do:

Counter c = new Counter();
long n = c.setN(c.getN()+1);

We cannot do that because setN() is a statement: it returns void. Plus that would actually be an incorrect way to define ++, since we need to return the old value of n prior to the increment, so we'd need a temporary variable. The only way to have statements inside expressions in Java is to create an anonymous class:

Counter c = new Counter();
long n = new Object(){
    long postIncrement(Counter c){
        long previousValue = c.getN();
        c.setN(previousValue+1);
        return previousValue;
    }
}.postIncrement(c);

And the solutions for all the other assignment operations are similar: anonymous classes for things as trivial as ++. Surely this is crazy? If only there were some other way, short of generating bytecode ourselves (in which case we could do whatever we want without needing to make it translatable into Java).

Let it be…

So one day we're looking inside OpenJDK's Javac to try to find something, and we stumble upon a mention of a comma operator. For those who don't know C, the comma operator (,) allows you to evaluate several expressions and return the value of the right-most one.

We look at this and we think: “this can't be right, Java doesn't have the comma operator, we'd know”. So why is it there? Looking a bit more we discover that it's there to support ++ on boxed Integer values. Because this isn't a primitive operation, you need the same sort of workaround we have:

Integer i = new Integer(0);
Integer j = new Object(){
    Integer postIncrement(Integer previousValue){
        // assuming you could assign a captured variable:
        i = new Integer(previousValue.intValue() + 1);
        return previousValue;
    }
}.postIncrement(i);

So they use this operator in order to save a temporary value in an expression context, where you normally can't. And upon further examination it turns out that they (the OpenJDK Javac authors) implemented the comma operator using an even more generic expression: a Let expression!

I'm very familiar with let expressions, such as they are in Scheme or in ML, but I'm sure many of you are not, so in short:

A let expression allows you to declare and bind new variables in a local scope, run statements and return an expression from this scope, all in the context of an expression.

So let's rewrite our previous example in pseudo-Java with let:

Integer i = new Integer(0);
Integer j = (let
              // store the previous value in a temporary variable
              Integer previousValue = i;
             in 
              // assign the new value
              i = new Integer(previousValue.intValue() + 1);
              // return the previous value
              return previousValue;);

Now, obviously this is not valid Java, because let expressions are not part of the Java language, but the OpenJDK Javac compiler uses this construct behind the scenes to rewrite parts of the Java AST into pseudo-code that can be translated into efficient bytecode in the end. All they needed was an AST node to represent this, and support from the bytecode generator to support this AST type.

And guess what: since we feed Java AST to Javac we can use this construct :)

In fact this is precisely how we solved most of our issues, such as the ++ operator:

Counter c = new Counter();
long n = (let
           long previousValue = c.getN();
          in
           c.setN(previousValue+1);
           return previousValue;);

This solution allows us to define every assignment operator, such as :=, ++ or +=, on attributes that are mapped to JavaBean getter/setter methods, using efficient code.
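Since let expressions only exist inside Javac's AST, they cannot be written in a Java source file. To see what the generated code computes, here is a sketch in plain Java in which small static helper methods play the role of the let scope. The LetDemo class and its method names are hypothetical, and the shape shown for += (evaluating to the new value) is an assumption modelled on Java's own compound assignment.

```java
// Hypothetical demo: static helpers stand in for Javac's internal let expressions.
class LetDemo {
    // Mirrors: (let long previousValue = c.getN(); in c.setN(previousValue+1); return previousValue;)
    static long postIncrement(Counter c) {
        long previousValue = c.getN();
        c.setN(previousValue + 1);
        return previousValue;
    }

    // Assumed shape for "c.n += amount": store the new value, then evaluate to it.
    static long addAssign(Counter c, long amount) {
        long newValue = c.getN() + amount;
        c.setN(newValue);
        return newValue;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        long n = postIncrement(c);
        System.out.println(n + " " + c.getN());
        long m = addAssign(c, 5);
        System.out.println(m + " " + c.getN());
    }
}

// The Counter class from the example above, with its getter and setter.
class Counter {
    private long n;

    long getN() {
        return n;
    }

    void setN(long n) {
        this.n = n;
    }
}
```

The difference with a real let expression is that no extra method or class is needed: the temporary variable lives directly inside the expression.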

All we needed to do was add a little support for let expressions inside Javac. Javac itself never needed them so early in the AST, so support was missing in one or two phases of the compiler, but that was peanuts really.

Conclusion

When we set out to extend the Javac compiler we didn't really know what to expect, but over time we've found it has a really solid API and is very well done and documented. We were able to extend it in ways it was never imagined to be extended, and it followed along nicely. Not only that but we found out that the OpenJDK developers, when faced with the issue of ++ on boxed Integers, didn't just hack along some quick and dirty way to fix it: they went ahead and implemented a much more powerful and generic way to solve every similar issue with the let expression. Congratulations guys, you did good and it was worth it, because thanks to you we can implement really crazy stuff.

We're now using this let expression for implementing many operators and features, such as:

  • named parameter invocation, to keep source-file evaluation order before reordering the parameters for the callee,
  • the ?., ? and ?[] null-safe operators, to store the temporary variable before we test it for null.
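To illustrate why the null-safe operators need a temporary variable, here is a small Java sketch. The NullSafeDemo class is hypothetical and only approximates what a let expression gives the compiler: the receiver expression is evaluated once, stored, tested for null, and then used.

```java
// Hypothetical demo of why a null-safe access needs a temporary variable.
class NullSafeDemo {
    static int evaluations = 0;

    // Stand-in for an arbitrary receiver expression with a side effect.
    static String name() {
        evaluations++;
        return "ceylon";
    }

    // Naive translation: the receiver expression is evaluated twice.
    static Integer naiveLength() {
        return name() != null ? Integer.valueOf(name().length()) : null;
    }

    // Translation with a temporary, as a let expression allows:
    // the receiver is evaluated exactly once, then tested for null.
    static Integer safeLength() {
        String tmp = name();
        return tmp != null ? Integer.valueOf(tmp.length()) : null;
    }

    public static void main(String[] args) {
        naiveLength();
        System.out.println("naive evaluations: " + evaluations);
        evaluations = 0;
        safeLength();
        System.out.println("evaluations with temporary: " + evaluations);
    }
}
```

The same reasoning applies to named parameter invocation: storing each argument in a temporary, in source order, lets the call pass them in whatever order the callee declares without changing when their side effects happen.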

So thanks, OpenJDK authors, thanks to you we'll have efficient compilation of Ceylon code :)