Chapter 1. Introduction

This document defines the syntax and semantics of the Ceylon language. The intended audience includes compiler implementors, interested parties who wish to contribute to the evolution of the language, and experienced developers seeking a precise definition of language constructs. However, in light of the newness of the language, we will begin with an overview of the main features of the language and SDK. A brief introduction to programming in the language may be found at the following address:

http://ceylon-lang.org/documentation/tour/

1.1. Language overview

Ceylon is a general-purpose programming language featuring a syntax similar to Java and C#. It is imperative, statically-typed, block-structured, object-oriented, and higher-order. By statically-typed, we mean that the compiler performs extensive type checking, with the help of type annotations that appear in the code. By object-oriented, we mean that the language supports user-defined types and features a nominative type system where a type is a set of named attributes and operations, and that it supports inheritance and subtype polymorphism. By higher-order, we mean that every referenceable program element (every attribute, every operation, and every type) is also a value. By block-structured, we mean to say that the language features lexical scoping and an extremely regular recursive syntax for declarations and statements.

Ceylon improves upon the Java language and type system to reduce verbosity and increase typesafety compared to Java and C#. Ceylon encourages a more functional, somewhat less imperative style of programming, resulting in code which is easier to reason about, and easier to refactor.

1.1.1. Runtime and platform

Ceylon programs execute in any standard Java Virtual Machine or on any JavaScript virtual machine, and take advantage of the memory management and concurrency features of the virtual machine in which they execute. Ceylon programs are packaged into modules with well-defined inter-module dependencies, and always execute inside a runtime environment with module isolation.

The Ceylon compiler is able to compile Ceylon code that calls Java classes or interfaces, and Java code that calls Ceylon classes or interfaces. JavaScript code is able to interact with Ceylon classes and functions compiled to JavaScript. Via a special dynamic mode, code written in Ceylon may call functions defined natively in JavaScript.

Moreover, Ceylon provides its own native SDK as a replacement for the Java platform class libraries. Certain SDK modules depend upon services available only on the Java platform. Other SDK modules, including the core language module, are cross-platform and may also be used in a JavaScript virtual machine.

1.2. Type system

Ceylon, like Java and C#, features a hybrid type system with both subtype polymorphism and parameteric polymorphism. A type is either a stateless interface, a stateful class, a type parameter, or a union or intersection of other types. A class, interface, or type parameter may be defined as a subtype of another type. A class or interface may declare type parameters, which abstract the definition of the class or interface over all types which may be substituted for the type parameters.

Like C#, and unlike Java, Ceylon's type system is fully reified. In particular, generic type arguments are reified, eliminating many problems that result from type erasure in Java.

There are no primitive types or arrays in Ceylon—every Ceylon type can be represented within the language itself. So all values are instances of the type hierarchy root Anything, which is a class. However, the Ceylon compiler is permitted to optimize certain code to take advantage of the optimized performance of primitive types on the Java or JavaScript VM.

Furthermore, all types inferred or even computed internally by the Ceylon compiler are expressible within the language itself. Within the type system, non-denoteable types simply do not arise. The type system is based upon computation of principal types. There is no way to construct an expression which does not have a unique principal type expressible within the language. The principal type of an expression is a subtype of all other types to which the expression could be soundly assigned.

1.2.1. Mixin inheritance

Ceylon supports a restricted form of multiple inheritance, often called mixin inheritance. A class must extend exactly one other class. But a class or interface may satisfy (extend or implement) an arbitrary number of interfaces.

Classes hold state and define logic to initialize that state when the class is instantiated. A concrete class is a class that contains only concrete member definitions. Concrete classes may be directly instantiated. An abstract class may contain formal member declarations. Abstract classes may not be instantiated.

Interfaces may define concrete members, but may not hold state (references to other objects) or initialization logic. This restriction helps eliminate the problems traditionally associated with multiple inheritance. Ceylon never performs any kind of "linearization" of the supertypes of a type. Interfaces may not be directly instantiated.

1.2.2. Algebraic types, self types, and type families

Ceylon does not feature Java-style enumerated types as a first-class construct. Instead, any abstract type may specify its cases—an enumerated list of instances and/or subtypes. This facility is used to simulate both enumerated types and functional-style "algebraic" (sum) types.

interface Identity of Person | Organization { ... }

A closely related feature is support for self types and type families. A self type is a type parameter of an abstract type (like Comparable) which represents the type of a concrete instantiation (like String) of the abstract type, within the definition of the abstract type itself. In a type family, the self type of a type is declared not by the type itself, but by a containing type which groups together a set of related types.

1.2.3. Simplified generics

Ceylon does not support Java-style wildcard type parameters, raw types, or any other kind of existential type. And the Ceylon compiler never even uses any kind of "non-denotable" type to reason about the type system. So generics-related error messages are understandable to humans.

Instead of wildcard types, Ceylon features declaration-site variance. A type parameter may be marked as covariant or contravariant by the class or interface that declares the parameter.

Ceylon has a somewhat more expressive system of generic type constraints with a cleaner, more regular syntax. The syntax for declaring constraints on a type parameter looks very similar to a class or interface declaration. Ceylon supports upper bound type constraints and also enumerated bounds.

interface Producer<out Value, in Rate> 
        given Value satisfies Object 
        given Rate of Float|Decimal { ... }

1.2.4. Union and intersection types

A union type, for example String|Number, or intersection type, for example Identifiable&List<String>, may be formed from two or more types defined elsewhere.

Union types make it possible to write code that operates polymorphically over types defined in disparate branches of the type heirarchy without the need for intermediate adaptor classes. Intersection types make it possible to operate polymorphically over all subtypes of a list of types. Union and intersection types provide some of the benefits of structural ("duck") typing, within the confines of a nominative type system, and therefore certain Ceylon idioms are reminiscent of code written in dynamically-typed languages.

Union and intersection types play a central role in generic type argument inference and therefore underly the whole system of principal typing. For example, the following expression has type HashMap<String,Integer|Float>:

HashMap { "float"->0.0, "integer"->0 }

1.2.5. Type aliases and type inference

Type aliases and type inference help reduce the verbosity of code which uses generic types, eliminating the need to repeatedly specify generic type arguments.

A type alias is similar to a C-style typedef.

interface Strings => Sequence<String>;
alias Number => Integer|Float|Whole|Decimal;

Local type inference allows a type annotation to be eliminated altogether. The type of a block-local value or function is inferred from its definition if the keyword value or function occurs in place of the type declaration.

value name = person.name;
function sqrt(Float x) => x^0.5;

The type of a control-structure variable also may be inferred.

for (n in 0..max) { ... }

Ceylon features an especially elegant approach to generic type argument inference, making it possible to instantiate container types, even inhomogeneous container types, without the need to explicitly mention any types at all.

value numbers = { -1, 0, -1, -1.0, 0.0, 1.0 };

By limiting type inference to local declarations, Ceylon ensures that all types may be inferred by the compiler in a single pass of the source code. Type inference works in the "downward" and "outward" directions. The compiler is able to determine the type of an expression without considering the rest of the statement or declaration in which it appears.

1.2.6. Metaprogramming

In other statically typed languages, runtime metaprogramming, or reflection, is a messy business involving untypesafe strings and typecasting. Even worse, in Java, generic type arguments are erased at runtime, and unavailable via reflection. Ceylon, uniquely, features a typesafe metamodel and typed metamodel expressions. Since generic type arguments are reified at runtime, the metamodel fully captures generic types at both compile time and execution time.

Ceylon's support for program element annotations is based around this metamodel. Annotations are more flexible than in Java or C#, and have a much cleaner syntax.

Ceylon does not support macros or any other kind of compile-time metaprogramming.

1.3. Object-oriented programming

The primary unit of organization of an object-oriented program is the class. But Ceylon, unlike Java, doesn't require that every function or value belong to a class. It's perfectly normal to program with a mix of classes and toplevel functions. Contrary to popular belief, this does not make the program less object-oriented. A function is, after all, an object.

1.3.1. Class initialization and instantiation

Ceylon does not feature any Java-like constructor declaration and so each Ceylon class has a parameter list, and exactly one initializer—the body of the class. This helps reduce verbosity and results in a more regular block structure.

class Point(Float x, Float y) { ... }

The Ceylon compiler guarantees that the value of any attribute of a class is initialized before it is used in an expression.

A class may be a member of an outer class. Such a member class may be refined (overridden) by a subclass of the outer class. Instantiation is therefore a polymorphic operation in Ceylon, eliminating the need for a factory method in some circumstances.

Ceylon provides a streamlined syntax for defining an anonymous class which is only instantiated in exactly the place it is defined. Among other uses, the object declaration is useful for creating singleton objects or method-local interface implementations.

object origin extends Point(0.0, 0.0) {}

1.3.2. Functions, methods, values, and attributes

Functions and values are the bread and butter of programming. Ceylon functions are similar to Java methods, except that they don't need to belong to a class. Ceylon values are polymorphic, and abstract their internal representation, similar to C# properties.

String name => firstName + " " + lastName;

The Ceylon compiler guarantees that any value is initialized before it is used in an expression.

A function belonging to a type is called a method. A value belonging to a type is called an attribute. There are no static members. Instead, toplevel functions and values are declared as direct members of a package. This, along with certain other features, gives the language a more regular block structure.

By default, an attribute or value may not be assigned a new value after its initial value has been specified. Mutable attributes and variable values must be explicitly declared using the variable annotation.

Ceylon does not support function overloading. Each method of a type has a distinct name.

1.3.3. Defaulted parameters and variadic parameters

Instead of method and constructor overloading, Ceylon supports parameters with default values and variadic parameters.

void addItem(Product product, Integer quantity=1) { ... }
String join(String* strings) { ... }

Furthermore, a generic method may be used to emulate parameter type overloading.

Number sum<Number>(Number* numbers) given Number of Integer | Float { ... }

Therefore, a single method in Ceylon may emulate the signatures of several overloaded methods in Java.

1.3.4. First-class functions and higher-order programming

Ceylon supports first-class function types and higher-order functions, with minimal extensions to the traditional C syntax. A function declaration may specify a callable parameter that accepts references to other functions with a certain signature.

String find(Boolean where(String string)) { ... }

The argument of such a callable parameter may be either a reference to a named function declared elsewhere, or a new function defined inline as part of the method invocation. A function may even return an invocable reference to another function.

value result = { "C", "Java", "Ceylon" }.find((String s) => s.size>1);

The type of a function is expressed within the type system as an instantiation of the interface Callable. The parameter types are expressed as a tuple type. So the type of the function (String s) => s.size>1 is Callable<Boolean,[String]>, which may be abbreviated to Boolean(String).

Methods and attributes may also be used as functions.

value names = people.map(Person.name);
value values = keys.map(keyedValues.get);

1.3.5. Naming conventions, annotations, and inline documentation

The Ceylon compiler enforces the traditional Smalltalk naming convention: type names begin with an initial uppercase letter—for example, Liberty or RedWine—member names and local names with an initial lowercase letter or underscore—for example, blonde, immanentize() or boldlyGo().

This restrictions allows a much cleaner syntax for program element annotations than the syntax found in either Java or C#. Declaration "modifiers" like shared, abstract, and variable aren't keywords in Ceylon, they're ordinary annotations.

"Base type for higher-order abstract stuff."
shared abstract class AbstractMetaThingy() { ... }

The documentation compiler reads inline documentation specified using the doc annotation.

1.3.6. Named arguments and tree-like structures

Ceylon's named argument lists provide an elegant means of initializing objects and collections. The goal of this facility is to replace the use of XML for expressing hierarchical structures such as documents, user interfaces, configuration and serialized data.

Html page = Html {
    Head { 
        title="Hello"; 
    };
    Body { 
        P {
            css = "greeting"; 
            "Hello, World!" 
        }; 
    };
}

An especially important application of this facility is Ceylon's built-in support for program element annotations.

1.3.7. Modularity

Toplevel declarations are organized into packages and modules. Ceylon features language-level access control via the shared annotation which can be used to express block-local, package-private, module-private, and public visibility for a program element. There's no equivalent to Java's protected.

A module corresponds to a versioned packaged archive. Its module descriptor expresses its dependencies to other modules. The tooling and execution model for the language is based around modularity and module archives.

1.4. Language module

The Ceylon language module defines a set of built-in types which form the basis for several powerful features of the language. The following functionality is defined as syntactic "sugar" that makes it easier and more convenient to interact with the language module.

1.4.1. Operators and operator polymorphism

Ceylon features a rich set of operators, including most of the operators supported by C and Java. True operator overloading is not supported. However, each operator is defined to act upon a certain class or interface type, allowing application of the operator to any class which extends or satisfies that type. For example, the + operater may be applied to any class that satisfies the interface Summable. This approach is called operator polymorphism.

1.4.2. Numeric and character types

Ceylon's numeric type system is much simpler than C, C# or Java, with exactly two built-in numeric types (compared to six in Java and eleven in C#). The built-in types are classes representing integers and floating point numbers. Integer and Float values are 64 bit by default, and may be optimized for 32 bit architectures via use of the small annotation.

The module ceylon.math provides two additonal numeric types representing arbitrary precision integers and arbitrary precision decimals.

Ceylon has Character and String classes, and, unlike Java or C#, every character is a full 32-bit Unicode codepoint. Conveniently, a String is a List<Character>.

1.4.3. Compile-time safety for optional values and type narrowing

There is no primitive null in Ceylon. The null value is an instance of the class Null that is not assignable to user-defined class or interface types. An optional type is a union type like Null|String, which may be abbreviated to String?. An optional type is not assignable to a non-optional type except via use of the special-purpose if (exists ... ) construct. Thus, the Ceylon compiler is able to detect illegal use of a null value at compile time. Therefore, there is no equivalent to Java's NullPointerException in Ceylon.

Similarly, there are no C-style typecasts in Ceylon. Instead, the if (is ... ) and case (is ... ) constructs may be used to narrow the type of an object reference without risk of a ClassCastException. The combination of case (is ... ) with algebraic types amounts to a kind of language-level support for the visitor pattern.

Alternatively, type assertions, written assert (is ... ) or assert (exists ... ) may be used to narrow the type of a reference.

1.4.4. Iterable objects and comprehensions

The interface Iterable represents a stream of values, which might be evaluated lazily. This interface is of central importance in the language module, and so the language provides a syntactic abbreviation for the type of an iterable object. The abbreviation {String*} means Iterable<String>. There is a convenient syntax for instantiating an iterable object, given a list of values:

{String*} words = {"hello", "world", "goodbye"};

A nonempty iterable is an iterable object which always produces at least one value. A nonempty iterabe type is written {String+}. Distingushing nonempty streams of values lets us correctly express the type of functions like max():

{Float+} oneOrMore = .... ;
{Float*} zeroOrMore = .... ;
Float maxOfOneOrMore = max(oneOrMore); //never null
Float? maxOfZeroOrMore = max(zeroOrMore); //might be null

Comprehensions are an expressive syntax for filtering and transforming streams of values. For example, they may be used when instantiating an iterable object or collection:

value adults = { for (p in people) if (p.age>18) p.name };
value peopleByName = HashMap { for (p in people) p.name->p };

Comprehensions are evaluated lazily.

1.4.5. Sequences and tuples

Sequences are Ceylon's version of arrays. However, the Sequential interface does not provide operations for mutating the elements of the sequence—sequences are considered immutable. Because this interface is so useful, a type like Sequential<String> may be abbreviated to [String*], or, for the sake of tradition, to String[].

A nonempty sequence is a kind of sequence which always has at least one element. A nonempty sequence type is written [String+]. The special-purpose if (nonempty ... ) construct narrows a sequence type to a nonempty sequence type.

Tuples are a kind of sequence where the type of each element is encoded into the static type of the tuple. Tuple is just an ordinary class in Ceylon, but syntactic abbreviations let us write down tuple types in using a streamlined syntax. For example, [Float,Float] is a pair of Floats. There is also a convenient syntax for instantiating tuples and accessing their elements.

[Float,Float] origin = [0.0, 0.0];
Float x = origin[0];
Float y = origin[1];
Null z = origin[2]; //only two elements!