Interview of Stéphane Épardaud on Ceylon at Devoxx France
Last week we did a presentation at Devoxx France in Paris, with Emmanuel, and I had the opportunity to talk about Ceylon in an interview that lasts almost 13 minutes.
If there's one thing the whole Ceylon team has in common it's that we're pragmatists. We have a ton of ideas for how we can make the programmer's life better with Ceylon, and we try to do it whenever we can. In fact, we have so many ideas that it's hard to bring them all to fruition. With people such as Gavin and me on the team, being very involved in Web Frameworks, I can tell you that among the many things we want to do with Ceylon, the only thing holding us back from writing an excellent Web Framework for Ceylon (or adapting one) is that we simply have more urgent things to fix right now.
One of those urgent things to fix, we quickly realised, was to have a better, leaner, nicer, friendlier module repository for Ceylon modules. You probably know by now that all the Ceylon tools can already talk to module repositories, local or remote. By talk I mean, use and publish to.
We added support for WebDAV from the start, and then we tried it and found that it was actually hard for people to set up WebDAV on their server to set up their own repo. Because that's the thing with open-source: we're not trying to solve the repo problem just for us, but also for all our users. We want them to be setting up repos as easily as we do.
We went and looked at the existing module repositories like Perl's CPAN, Ruby's Gems, Maven's Nexus, or Play's modules. We liked what we saw in some of them, but what we noticed was that none of them had all the features we wanted for a great repository experience. Probably the sum of them all would be the Best Repo In The World, but truth is, it'd probably be a spaghetti monster too.
We wanted something that looks nice for consumers, with a friendly, clear and intuitive interface, but at the same time we wanted to make it just as easy for producers, with a clear path to publish modules, and at the same time a collaborative aspect, such as found in Github that we like so much.
So we spent a couple of days banging code together and came up with the following quite impressive list of features for consumers:
And for producers:
You're not dreaming, we did all of this, and a lot more under the hood as well as in the UI, to make sure newcomers are not overwhelmed or abandoned, to make sure everyone feels right at home in Ceylon Herd, our open-source Ceylon module repository.
You heard that right: today we unveil our official module repository, called Ceylon Herd, which is where we are (we as in me, you, everyone) going to publish our Ceylon modules for all to use. And because we're open-source guys, and we don't want to lock people into a fenced-wall service, we're making Ceylon Herd available as Free Software, so that everyone can not only contribute to make it better, but use it privately or publicly.
Our version of Ceylon Herd, running at modules.ceylon-lang.org, will be the official place to get official Ceylon modules, as well as the central place to get third-party modules, as long as they are open-source. We're still working out the details of the hosting policy, so we've disabled registration for now, which means that for you guys it is purely read-only until we open it up completely, but rest assured that will happen as soon as we can, so you can start sharing too.
Meanwhile you can start consuming the few modules we put there to get you started.
Because we plan to use Ceylon Herd as the official Ceylon module repository, we need to make sure that the modules published there are legit and functional. If John Doe can come in and start publishing modules he doesn't own, or participate in, or represent, then that's just bad for our repository. John Doe is free to use another Ceylon Herd instance, but we're going to be careful who we let publish in modules.ceylon-lang.org.
So our solution is called project claims: when you have registered on Ceylon Herd, you can claim a project (essentially a module name), and you explain its license, point to its home page, who you are and why we need it in Ceylon Herd. We're immediately notified and start checking up on the project, verify that you are who you say you are, and that you should indeed be allowed to publish on behalf of that project.
This verification is an interactive process: we may ask you questions via comments on the claim, which you get notified about and can answer. As soon as we've made up our mind, we'll either confirm your claim, or reject it. If it is rejected, don't be afraid to re-claim it if you feel we were wrong: we can discuss it again, especially if you have new and good evidence. We're only trying to help authenticate module publishers, not control what goes in or not (though it has to be open-source).
Once your claim is verified, you are free to publish as many versions of your module as you want, as well as delegate publishing of that module to other Ceylon Herd users (your colleagues, project buddies, spouse or kids for the luckier). At any moment you can transfer project ownership to another Ceylon Herd user.
If you think this is not good enough, please let us know.
Ceylon Herd is available here, we can help you get started, and you can even download its source code to run it where you want and improve it. Use, follow, share, contribute, have fun!
Today, we're proud to announce the release of Ceylon M2 "Minitel". This is the second official release of the Ceylon command line compiler, documentation compiler, language module, and runtime, and a major step down the roadmap toward Ceylon 1.0, with most of the Java interoperability fully specified and implemented.
You can get it here:
http://ceylon-lang.org/download
We plan a compatible M2 release of Ceylon IDE later this week.
In terms of the language itself, M2 has essentially all the features of Java except enumerated types, user-defined annotations, and reflection. It even incorporates a number of improvements over Java, including:
null value and empty sequences
static members
switch/case
Support for the following language features is not yet available:
This page provides a quick introduction to the language. The draft language specification is the complete definition.
There were many improvements to Java interoperability in this release, which make it very easy to call Java from Ceylon.
Most of the interoperability issues with Java have been fixed, and there are very few remaining issues that we will fix for the next release, though they only concern corner-cases that we don't expect users to meet.
We spent a lot of time improving performance for this release, in particular arithmetic operator performance, but we still have a lot of areas we expect to improve for the next release, such as the speed of interoperability with Java arrays and improvements on boxing.
Ceylon modules may be executed on any standard Java 6 compatible JVM. The toolset and runtime for Ceylon are based around .car module archives and module repositories. The runtime supports a modular, peer-to-peer class loading architecture, with full support for module versioning and multiple repositories.
This release of Ceylon includes support for local and remote module repositories, using the local file system, HTTP or WebDAV. In order to make it easy to use Java libraries from Ceylon you can even use Maven repositories.
Chapter 7 of the language specification contains much more information about the Ceylon module system and command line tools.
At this time, the only module available is the language module ceylon.language, included in the distribution.
The source code for Ceylon, its specification, and its website, is freely available from GitHub:
Bugs and suggestions may be reported in GitHub's issue tracker.
The Ceylon community site includes documentation, the current draft of the language specification, the roadmap and information about getting involved.
We're deeply indebted to the community volunteers who contributed a substantial part of the current Ceylon codebase, working in their own spare time. The following people have contributed to this release:
Gavin King, Stéphane Épardaud, Tako Schotanus, Emmanuel Bernard, Tom Bentley, Aleš Justin, David Festal, Flavio Oliveri, Max Rydahl Andersen, Mladen Turk, James Cobb, Tomáš Hradec, Michael Brackx, Ross Tate, Ivo Kasiuk, Enrique Zamudio, Julien Ponge, Julien Viet, Pete Muir, Nicolas Leroux.
Writing a new language is a lot about the ideas that go into the language, and then a lot about implementation.
In the case of Ceylon, a lot of effort went into thinking, discussions, negotiations, explanations, and eventually experimenting and documenting, in the form of a specification. Then there's work to be done on the ANTLR grammar, the Abstract Syntax Tree (AST) that the parser/lexer gives us, and then the typechecking phase until finally we get to the backend (backends, in our case: Java bytecode and JavaScript).
But we're not stopping at the language, we're also developing a whole SDK, with an API, tools (ceylond, ceylonc, ceylon), a module system, and an IDE.
Naturally, while we're in the process of implementing all this stuff, we spend a fair amount of time discovering new problems, new solutions, and refactoring a lot of code to make it fitter. The number of interdependent pieces in all this — and the expectation of high-quality — is such that we have to have a proper test suite for these things and, rest assured: we do. If you've ever wondered how new languages are tested, read on to see how we test the Ceylon implementation.
The first tests in the tool chain concern the very early phase of compiling Ceylon: checking that good code compiles, bad code does not compile, and that the type system reasons about types the way we expect it to. We currently have 1251 such checks, quite smartly done using compiler annotations such as @error to mark an AST node where we expect an error, and @type["Foo"] to make sure the AST node is of type Foo.
Here is a brief example where we make sure we can't return a value from a void method:
void cannotReturnValue() {
    if ("Hello"=="Goodbye") {
        @error return var;
    }
}
And another example where we make sure that the type inference is correct:
@type["Sequence<String>|Sequence<Integer>"] value ut = f({ "aaa" },{ 1 });
The tests will walk the AST and when they see one of these annotations, they check that an error is reported on the node, or that the node type is as we expected.
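To make the walk concrete, here is a minimal sketch in Java. The AstNode and AnnotationChecker classes are hypothetical stand-ins invented for this illustration, not the real typechecker classes: each node simply records what the annotations expect and what the typechecker reported.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified AST node: NOT the real Ceylon typechecker classes.
class AstNode {
    final String text;          // source snippet, used in failure messages
    final boolean expectError;  // the node carries an @error annotation
    final String expectedType;  // taken from @type["..."], or null when absent
    final boolean hasError;     // what the typechecker reported for this node
    final String actualType;    // the type the typechecker computed
    final List<AstNode> children = new ArrayList<>();

    AstNode(String text, boolean expectError, String expectedType,
            boolean hasError, String actualType) {
        this.text = text;
        this.expectError = expectError;
        this.expectedType = expectedType;
        this.hasError = hasError;
        this.actualType = actualType;
    }

    AstNode add(AstNode child) {
        children.add(child);
        return this;
    }
}

class AnnotationChecker {
    // Walks the tree depth-first and returns one message per mismatch.
    static List<String> check(AstNode root) {
        List<String> failures = new ArrayList<>();
        visit(root, failures);
        return failures;
    }

    private static void visit(AstNode node, List<String> failures) {
        // An error must be reported exactly where @error says so.
        if (node.expectError != node.hasError) {
            failures.add((node.expectError ? "missing" : "unexpected")
                    + " error at: " + node.text);
        }
        // When @type is present, the computed type must match it.
        if (node.expectedType != null
                && !node.expectedType.equals(node.actualType)) {
            failures.add("expected type " + node.expectedType + " but got "
                    + node.actualType + " at: " + node.text);
        }
        for (AstNode child : node.children) {
            visit(child, failures);
        }
    }
}
```

An empty list from AnnotationChecker.check means every annotated node behaved as expected. Treating an error on a node without @error as a failure is an assumption of this sketch.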
In the Java bytecode compiler backend, we generate Java bytecode, but as previously described we are plugged into the javac compiler, and we feed it (pseudo-) Java AST. This turns out to be again very useful since it's much easier (for humans) to compare Java source code than bytecode. So we have tests that try to compile a small Ceylon file, typically one per feature, and then we compare the Java code generated with the Java code we expect. For example, to make sure we can create a class in Ceylon, we write this test:
class Klass() {}
And we see if it generates the following Java code:
package com.redhat.ceylon.compiler.java.test.structure.klass;
class Klass {
    Klass() {
    }
}
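The comparison step can be sketched in a few lines of Java. This is only an illustration: the SourceComparison class is a hypothetical name, and collapsing whitespace before comparing is an assumption of this sketch, not a description of how the real tests do it.

```java
// Hypothetical helper: compares generated Java source with the expected source.
class SourceComparison {
    // Collapses every run of whitespace into one space, so that indentation
    // and line breaks alone cannot make the comparison fail.
    static String normalize(String source) {
        return source.trim().replaceAll("\\s+", " ");
    }

    // True when both sources are identical once whitespace is normalized.
    static boolean sameSource(String expected, String generated) {
        return normalize(expected).equals(normalize(generated));
    }
}
```

With this helper, the expected Klass source and a generated version that only differs in layout compare as equal, while any difference in names or structure is reported as a mismatch.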
We currently have around 250 tests like these. This might seem a small number, but it's the number of files we're comparing, so for example when we test numeric operation optimisation, we have one test with 350 lines of Ceylon numeric operator tests to be tested. We have 100% coverage of each feature promised on the roadmap.
Of course, occasionally we have bugs for things that are corner cases, so for each bug reported we have one such test as well.
The compiler is not only generating bytecode, though. It's also loading bytecode. By the truck-load. Since it's an incremental compiler, it needs to be able to load compiled Ceylon bytecode and map it into a model that the type-checker can work with. The piece that loads bytecode (using three different reflection libraries: Javac, Java Reflection and Eclipse JDT) is called the model loader.
Those get their own tests. We write two files. The first includes code we will compile:
shared class Klass() {
}
And the second will reference declarations from the first file:
Klass klass = Klass();
But because we compile the second one on its own, it will load the model for the first file from the compiled bytecode (incremental compilation). The test will then compare the model representation for Klass that we got during the first compilation (when we were compiling Klass) with the model representation we loaded for Klass during the second compilation. The first model will come directly from the typechecker after parsing the AST, while the second will come from the model loader. By walking both recursively and checking that they are completely equivalent, we ensure that we produce the right model when loading it from bytecode.
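As a rough illustration of that recursive walk, here is a sketch in Java. The Decl class is a hypothetical, heavily reduced model object (the real model has many more properties), and the sketch assumes both models list their members in the same order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model object: the real Ceylon model has many more properties.
class Decl {
    final String kind;  // e.g. "class", "method", "attribute"
    final String name;
    final String type;  // declared or return type, as text
    final List<Decl> members = new ArrayList<>();

    Decl(String kind, String name, String type) {
        this.kind = kind;
        this.name = name;
        this.type = type;
    }

    Decl add(Decl member) {
        members.add(member);
        return this;
    }
}

class ModelComparison {
    // Compares the model built by the typechecker with the one rebuilt from
    // bytecode: every property must match, then every member, recursively.
    static boolean equivalent(Decl fromTypechecker, Decl fromLoader) {
        if (!fromTypechecker.kind.equals(fromLoader.kind)
                || !fromTypechecker.name.equals(fromLoader.name)
                || !fromTypechecker.type.equals(fromLoader.type)) {
            return false;
        }
        if (fromTypechecker.members.size() != fromLoader.members.size()) {
            return false;
        }
        for (int i = 0; i < fromTypechecker.members.size(); i++) {
            if (!equivalent(fromTypechecker.members.get(i),
                            fromLoader.members.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

A real test would report which property differs rather than return a bare boolean, but the shape of the recursion is the same.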
We currently have 11 such tests, but once again, do not be fooled by the low number: there's only a certain number of things we can test here, namely class, interface, method and attribute declarations and their signatures. We're not testing things like statements or expressions, only signatures of declarations. And for each declaration we have recursive tests that check every property of the model object (of which there are many).
Incidentally, we don't just load Ceylon bytecode, we also load Java bytecode, for interoperability with Java classes, and since the ceylon.language module is currently hand-written in Java, none of the Ceylon code could be compiled if the model loader wasn't able to load a model from its bytecode. So it gets a lot of testing everywhere.
As I mentioned previously, Ceylon is fully interoperable with Java, so we have tests that make sure that we can import Java modules, packages, types, call their methods, read their fields, implement Java interfaces, etc…
We currently have 10 tests for this, each including all the variations on a theme (static methods, fields, constructors…).
We have a few tests to make sure the compiler backend doesn't crash on unparseable input, or on Ceylon code that is improperly typed.
We have 34 tests that make sure that the compiler produces .car files, source archives and MD5 checksums in the right place, that it can save them and find them again locally or via HTTP, that we can load .jar modules, and that they contain what is expected. We also test that we can resolve dependency graphs, cache HTTP files and check the MD5 checksums.
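For readers who have never verified a checksum by hand, here is a minimal Java sketch of the check using the standard MessageDigest API. The Md5Check class and its method names are hypothetical, and the assumption that the checksum is stored as a hexadecimal string is mine, not something stated about the Ceylon tools.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the Ceylon tools.
class Md5Check {
    // Returns the MD5 digest of the content as a lowercase hex string.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                // Mask to 0..255 so negative bytes still print as two digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Compares the archive content against the published checksum
    // (assumed here to be a hex string, possibly with trailing whitespace).
    static boolean matches(byte[] content, String expectedHex) {
        return md5Hex(content).equalsIgnoreCase(expectedHex.trim());
    }
}
```

A test can then download an archive and its checksum, and simply assert that matches returns true.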
However beautiful the strategy, you should occasionally look at the results — Sir Winston Churchill
We have about 20 tests that make sure that we can invoke the Ceylon compiler, on any number of files, including incrementally. We test that we can run Ceylon programs. We also test that the runtime behaviour of statements is correct (it's not enough to check that the for loop is compiled to a certain Java bytecode, we want to make sure it runs as we think it should).
We have a few tests that check that ceylond (our documentation generator) is able to produce documentation, and that it produces it correctly, while using the model loader (using Java Reflection, unlike the compiler which uses Javac Reflection).
We have a few manual tests (we need to automate that) to check that our ant tasks run as expected.
ceylon.language tests
Last but not least, we have 633 runtime tests written in Ceylon that check that the ceylon.language API behaves as expected, which is the ultimate test, since it effectively requires all the pieces previously described to work in order to do anything.
We even have one test which attempts to compile the Ceylon reference implementation of ceylon.language, which the typechecker handles, but the backend doesn't yet. Once that one passes, we'll be ready for Ceylon 1.0.
I've recently had a discussion about the speed of test execution with the Ceylon team, and was shocked to discover that while I was complaining that the Java backend compiler was taking 40 seconds to run, some of my colleagues had to wait more than 2 minutes for the same tests!
We're now looking at running some of those tests in parallel to speed things up on multi-core systems, but unfortunately it doesn't look like we'll be able to run them in parallel using Eclipse. We should be able to do it using Ant, though.
We break tests all the time. Most of the time when we fix a bug or add a feature we break a given number of tests. Sometimes we fix a bigger bug and break most of the tests. This is great: we can find the cause of the breakage straight away thanks to all these tests. Sometimes though, it takes a little bit of work to fix the tests we broke :)
We've an awful lot of tests, and we test (almost) everything. Those that are missing will get automated as soon as possible so we can forget about them, because that's what tests are for: when we fix a bug or add a feature, we know it works, and we know we didn't break anything. This is both priceless and fundamental for a new language.
Since this post was originally written:
ceylonc has become ceylon compile.
ceylond has become ceylon doc.
Hi, my name is Stéphane Épardaud and I'll be your technical writer today :)
I want to talk a bit about some of the challenges we faced in the Ceylon compiler, and the solutions we found. As is described in the compiler architecture page, the backend of the Ceylon compiler extends OpenJDK's Javac compiler by translating Ceylon source code into Javac AST, which is then compiled into bytecode by Javac. Some of the reasons why we went this route of extending Javac rather than create our own compiler from scratch are that:
But there are things we can't do properly in Java, and here I'm going to give you an example where we scratched our heads in trying to find a proper mapping.
In Ceylon, we don't have Java fields, we have attributes, which are similar to JavaBean properties. This means that Ceylon attributes are translated to JavaBean getters and setters. And for interoperability we map JavaBean properties to Ceylon attributes. Now the biggest challenge with using JavaBean getter and setter methods in place of fields is that we want attributes to support the same operations you can do on Java fields, such as the ++ operation. How do we map this:
class Counter() {
    Natural n = 0;
}
Counter c = Counter();
Natural n = c.n++;
Into working Java code which looks like this (optimised for long because otherwise ++ is polymorphic):
class Counter{
    long n;
    long getN(){
        return n;
    }
    void setN(long n){
        this.n = n;
    }
}
Counter c = new Counter();
long n = c.getN()++;
Wait a minute: this is not valid!
So the problem is that there are a lot of operations you can do on l-values, that is, variables which can be assigned. To summarize the difference between l-values and r-values, the following mnemonic helps: an l-value is something which can be assigned and read, so it can appear on the left side of an assignment, while an r-value is an expression that can only be read and not assigned. In our example, c.n is an l-value while 2 + 2 would be an r-value.
So we expect to be able to do every assignment operation on l-values, such as :=, += and ++. The problem we face is that in Java, c.n is an l-value but when using getters, c.getN() is not: it's an r-value, you can't assign to it, you can't do ++ on it. For that you need to use the setter. Now the thing is that setters in JavaBean return void, so a call to one produces no value and cannot be an l-value: it can only be used as a statement. And we can't put statements inside expressions. For instance we can't do:
Counter c = new Counter();
long n = c.setN(c.getN()+1);
We cannot do that because setN() is a statement: it returns void. Plus that would actually be an incorrect way to define ++, since we need to return the old value of n prior to the increment, so we'd need a temporary variable. The only way to have statements inside expressions in Java is to create an anonymous class:
Counter c = new Counter();
long n = new Object(){
    long postIncrement(Counter c){
        long previousValue = c.getN();
        c.setN(previousValue+1);
        return previousValue;
    }
}.postIncrement(c);
And the solutions to all the other assignment operations are similar: anonymous classes for things as trivial as ++, surely this is crazy? If only there were some other way, short of generating bytecode ourselves (in which case we can do whatever we want without needing to make it translatable into Java).
So one day we're looking inside OpenJDK's Javac to try to find something, and we stumble upon mention of a comma operator. For those who don't know C, the comma operator (,) allows you to execute several expressions and return the right-most expression value.
We look at this and we think: “this can't be right, Java doesn't have the comma operator, we'd know”. So why is it there? Looking a bit more we discover that it's there to support ++ on boxed Integer values. Because this isn't a primitive operation, you need the same sort of workaround we have:
Integer i = new Integer(0);
Integer j = new Object(){
    Integer postIncrement(Integer previousValue){
        // assuming you could assign a captured variable:
        i = new Integer(previousValue.intValue() + 1);
        return previousValue;
    }
}.postIncrement(i);
So they use this operator in order to save a temporary value in an expression context, where you normally can't. And upon further examination it turns out that they (the OpenJDK Javac authors) implemented the comma operator using an even more generic expression: a Let expression!
I'm very familiar with let expressions, such as they are in Scheme or in ML, but I'm sure many of you are not, so in short:
A let expression allows you to declare and bind new variables in a local scope, run statements and return an expression from this scope, all in the context of an expression.
So let's rewrite our previous example in pseudo-Java with let:
Integer i = new Integer(0);
Integer j = (let
        // store the previous value in a temporary variable
        Integer previousValue = i;
    in
        // assign the new value
        i = new Integer(previousValue.intValue() + 1);
        // return the previous value
        return previousValue;);
Now, obviously this is not valid Java, because let expressions are not part of the Java language, but the OpenJDK Javac compiler uses this construct behind the scenes to rewrite parts of the Java AST into pseudo-code that can be translated into efficient bytecode in the end. All they needed was an AST node to represent this, and support for this AST type in the bytecode generator.
And guess what: since we feed Java AST to Javac we can use this construct :)
In fact this is precisely how we solved most of our issues, such as the ++ operator:
Counter c = new Counter();
long n = (let
        long previousValue = c.getN();
    in
        c.setN(previousValue+1);
        return previousValue;);
This solution allows us to define every assignment operator such as :=, ++ or += on attributes that are mapped into JavaBean getter/setter methods, using efficient code.
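Since let expressions only exist inside Javac's AST, they can't be written in a Java source file. The closest thing in plain Java is a helper method holding the temporary variable, which at least makes it easy to see which value each operator must yield. The following is a sketch only: BeanCounter and AttributeOps are hypothetical names, and this is not the code the Ceylon compiler generates.

```java
// Hypothetical JavaBean-style class standing in for a compiled Ceylon class.
class BeanCounter {
    private long n;

    long getN() {
        return n;
    }

    void setN(long n) {
        this.n = n;
    }
}

class AttributeOps {
    // c.n++ : read into a temporary, write the incremented value back,
    // and yield the value from BEFORE the increment.
    static long postIncrement(BeanCounter c) {
        long previousValue = c.getN();
        c.setN(previousValue + 1);
        return previousValue;
    }

    // c.n += delta : compute the new value once, write it back,
    // and yield the NEW value.
    static long addAssign(BeanCounter c, long delta) {
        long newValue = c.getN() + delta;
        c.setN(newValue);
        return newValue;
    }
}
```

Starting from a fresh BeanCounter, postIncrement yields 0 and leaves the attribute at 1, and a following addAssign with 5 yields 6. The let expression gives the same behaviour inline, without the cost of a method call or an extra class.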
All we needed to do was to add some bits of support for let expressions inside Javac: because Javac never needed them so early in the AST, support was missing in one or two phases of the compiler, but that was peanuts really.
When we set out to extend the Javac compiler we didn't really know what to expect, but over time we've found it has a really solid API and is very well done and documented. We were able to extend it in ways it was never imagined to be extended, and it followed along nicely. Not only that but we found out that the OpenJDK developers, when faced with the issue of ++ on boxed Integers, didn't just hack along some quick and dirty way to fix it: they went ahead and implemented a much more powerful and generic way to solve every similar issue with the let expression. Congratulations guys, you did good and it was worth it, because thanks to you we can implement really crazy stuff.
We're now using this let expression for implementing many operators and features, such as:
?., ? and ?[] null-safe operators, to store the temporary variable before we test it for null.
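To show why the temporary variable matters for the null-safe operators, here is a small sketch in plain Java (Java 8 or later, for Supplier). The NullSafe class and its methods are hypothetical, and the sketch assumes that a?.b yields null when a is null and that a ? b yields b when a is null.

```java
import java.util.function.Supplier;

// Hypothetical helpers illustrating the "store, then test for null" pattern.
class NullSafe {
    // Sketch of `expr?.trim()`: the receiver expression is evaluated exactly
    // once, stored in a temporary, and only used if it is not null.
    static String safeTrim(Supplier<String> receiver) {
        String tmp = receiver.get();
        return tmp == null ? null : tmp.trim();
    }

    // Sketch of `expr ? fallback`: same pattern, with a default value.
    static String orDefault(Supplier<String> value, String fallback) {
        String tmp = value.get();
        return tmp != null ? tmp : fallback;
    }
}
```

Without the temporary, the receiver expression would have to appear twice, once in the null test and once in the call, so any side effect it has would run twice. The let expression lets the compiler keep the temporary inline.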