Blog of Stéphane Épardaud

Java Reflection oddities with inner and enum class constructor parameters

Note: edited on 16/5/2013 to add info about enum constructors as well.

About Java inner classes

Java allows member classes (classes that are defined inside other classes), local classes (classes that are defined inside statement blocks) and anonymous classes (classes with no names):

class Outer {
    Object anonymous = new Object(){}; // this is an anonymous class

    // anonymous initialisation block
    {
        // this is a local class
        class Local{}
        Local l = new Local();
    }

    Outer() {
        // this is a local named class in a constructor
        class Local{}
        Local l = new Local();
    }

    void method() {
        // this is a local named class in a method
        class Local{}
        Local l = new Local();
    }

    // this is a member class
    class Inner{}
    Inner i = new Inner();
}

The Java Language Specification classifies member, local and anonymous classes as inner classes.

Implementation “details”

What the Java Language and Virtual Machine specifications do not tell you is how inner classes are implemented. Some of it is already explained in other articles, such as how the Java compiler generates synthetic methods to give these member classes access to private fields, which the JVM would otherwise not allow.

Another implementation detail of inner classes that is handy to know is that inner class constructors take extra synthetic parameters. It is relatively well-known that the first synthetic parameter of an inner class constructor will be its enclosing instance, which it will store in a this$X synthetic field. This is valid for all three kinds of inner classes: member, local and anonymous.

But it is less well known that local classes that capture non-constant final variables require all these variables to be passed as extra synthetic constructor parameters (captured constant final variables are inlined and do not generate extra synthetic constructor parameters):

class Outer {
    void method() {
        final String constant = "foo";
        final String nonConstant = "foo".toUpperCase();
        class Local{
            /* synthetic fields and constructor: 

            Outer this$0;
            String nonConstant;

            Local(Outer this$0, String nonConstant){
                this.this$0 = this$0;
                this.nonConstant = nonConstant;
            }
            */
        }
        Local l = new Local();
    }
}
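To observe this yourself, here is a minimal sketch that asks reflection for the raw constructor parameter types of two local classes: one that only reads a constant, and one that reads a non-constant final variable. The class and method names (CaptureDemo, constantOnly, capturing, constructorParameters) are made up for this illustration, and the exact lists printed depend on your compiler, so none are shown here.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

class CaptureDemo {
    // Local class that only reads a compile-time constant (expected to be inlined).
    Class<?> constantOnly() {
        final String constant = "foo";
        class Local {
            String read() { return constant; }
        }
        return Local.class;
    }

    // Local class that reads a non-constant final variable (expected to be captured).
    Class<?> capturing() {
        final String nonConstant = "foo".toUpperCase();
        class Local {
            String read() { return nonConstant; }
        }
        return Local.class;
    }

    // Raw constructor parameter types, synthetic parameters included.
    static Class<?>[] constructorParameters(Class<?> localClass) {
        Constructor<?> constructor = localClass.getDeclaredConstructors()[0];
        return constructor.getParameterTypes();
    }

    public static void main(String[] args) {
        CaptureDemo demo = new CaptureDemo();
        System.out.println(Arrays.toString(constructorParameters(demo.constantOnly())));
        System.out.println(Arrays.toString(constructorParameters(demo.capturing())));
    }
}
```

If the description above holds for your compiler, the second list should be one entry longer than the first, the extra entry being the captured String.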

Another example: Java enum classes

Java allows you to create enumeration classes, which are essentially little more than syntactic sugar to help you define a list of singleton values of a given type.

The following Java code:

enum Colours {
    RED, BLUE;
}

Is essentially equivalent to:

final class Colours extends java.lang.Enum {
    public final static Colours RED = new Colours("RED", 0);
    public final static Colours BLUE = new Colours("BLUE", 1);

    private final static Colours[] values = new Colours[]{ RED, BLUE };

    private Colours(String name, int sequence){
        super(name, sequence);
    }

    public static Colours[] values(){
        return values;
    }

    public static Colours valueOf(String name){
        return (Colours)java.lang.Enum.valueOf(Colours.class, name);
    }
}

As you can see, it saves quite some code, but also adds synthetic fields, methods and constructor parameters. If you had defined your own constructor, with its own set of parameters, like this:

enum Colours {
    RED("rouge"), BLUE("bleu");

    public final String french;

    Colours(String french){
        this.french = french;
    }
}

You would have gotten the following Java code generated:

final class Colours extends java.lang.Enum {
    public final static Colours RED = new Colours("RED", 0, "rouge");
    public final static Colours BLUE = new Colours("BLUE", 1, "bleu");

    private final static Colours[] values = new Colours[]{ RED, BLUE };

    public final String french;

    private Colours(String name, int sequence, String french){
        super(name, sequence);
        this.french = french;
    }

    public static Colours[] values(){
        return values;
    }

    public static Colours valueOf(String name){
        return (Colours)java.lang.Enum.valueOf(Colours.class, name);
    }
}

Luckily, enums can’t be inner classes, so they will not have an extra synthetic parameter inserted for the container instance to add to those two.
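The extra name and ordinal parameters can be seen through reflection. The following is a minimal sketch, assuming a compiler that generates the constructor shown above; EnumConstructorDemo and rawParameterTypes are names invented for this example.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

class EnumConstructorDemo {
    enum Colours {
        RED("rouge"), BLUE("bleu");

        final String french;

        Colours(String french) {
            this.french = french;
        }
    }

    // Raw parameter types of the only declared constructor, synthetic ones included.
    static Class<?>[] rawParameterTypes() {
        Constructor<?> constructor = Colours.class.getDeclaredConstructors()[0];
        return constructor.getParameterTypes();
    }

    public static void main(String[] args) {
        // One declared parameter in the source, but more entries are expected here.
        System.out.println(Arrays.toString(rawParameterTypes()));
    }
}
```

With javac I would expect three entries (String, int, String), matching the generated constructor above, even though the source declares only one parameter.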

OK, but why should I care?

In most cases you don’t, other than out of curiosity. But if you’re doing Java reflection on inner or enum classes, there are a few things you should know. I haven’t found them listed or specified anywhere online, so I thought it would be useful to write them down, because different compilers produce different results in the Java reflection API.

The question is: what happens when you use Java reflection to get a java.lang.reflect.Constructor instance for an inner or enum class constructor? In particular, what do the methods that give you access to the parameter types (pre-generics: getParameterTypes()), the generic parameter types (post-generics: getGenericParameterTypes()) and the parameter annotations (getParameterAnnotations()) return? The answer is: it depends.

Suppose the following Java classes:

class Outer {
    class Inner {
        Inner(){}
        Inner(String param){}
        Inner(@Deprecated Integer param){}
    }
}
enum Enum {
    ;// yes this is required
    Enum(){}
    Enum(String param){}
    Enum(@Deprecated Integer param){}
}

Here are the sizes of the arrays returned by these three reflection methods for each of our constructors, and how they differ depending on the Java compiler used:

Outer.Inner.class                                           .getDeclaredConstructor()   .getDeclaredConstructor(String.class)   .getDeclaredConstructor(Integer.class)
getParameterTypes().length                                  1                           2                                       2
getGenericParameterTypes().length (compiled with Eclipse)   1                           2                                       2
getGenericParameterTypes().length (compiled with Javac)     0                           1                                       1
getParameterAnnotations().length                            1                           2                                       1

And the results are consistent for our enum class:

Enum.class                                                  .getDeclaredConstructor()   .getDeclaredConstructor(String.class)   .getDeclaredConstructor(Integer.class)
getParameterTypes().length                                  2                           3                                       3
getGenericParameterTypes().length (compiled with Eclipse)   2                           3                                       3
getGenericParameterTypes().length (compiled with Javac)     0                           1                                       1
getParameterAnnotations().length                            2                           3                                       1

As you can see, the synthetic parameters are always included in getParameterTypes(), but are only included in getGenericParameterTypes() when compiled with Eclipse.

getParameterAnnotations(), on the other hand, always includes synthetic parameters except when at least one of your constructor parameters is annotated.

With this info, you now understand the differences between the results of these methods. So far, though, I still haven’t found a way to determine which parameters are synthetic: although you can make a good guess for the this$X synthetic parameter, which is required by every inner class, you have no way of knowing how many non-constant captured variables will end up as synthetic parameters to local class constructors.
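If you want to reproduce these measurements with your own compiler and JDK, a small harness along these lines should do. It is only a sketch: ReflectionCounts and counts are hypothetical names, the inner class mirrors the Outer.Inner example above, and no output is shown because the numbers are exactly what varies between compilers and versions.

```java
import java.lang.reflect.Constructor;

class ReflectionCounts {
    // Same shape as the Outer.Inner example: three constructors, one annotated.
    class Inner {
        Inner() {}
        Inner(String param) {}
        Inner(@Deprecated Integer param) {}
    }

    // Formats the lengths of the three arrays for one constructor.
    static String counts(Constructor<?> constructor) {
        return "types=" + constructor.getParameterTypes().length
            + " generic=" + constructor.getGenericParameterTypes().length
            + " annotations=" + constructor.getParameterAnnotations().length;
    }

    public static void main(String[] args) {
        // Iterate over all declared constructors rather than looking them up by
        // signature, since the lookup signature itself includes the outer instance.
        for (Constructor<?> constructor : Inner.class.getDeclaredConstructors()) {
            System.out.println(constructor + " -> " + counts(constructor));
        }
    }
}
```

Compiling the same file with javac and with Eclipse, then comparing the printed lines, is the simplest way to see where the two disagree.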

JUDCon Boston 2013

Gavin, Emmanuel and I will be at this year's Boston JUDCon for a 4h Ceylon hands-on, split in two 2h parts on Monday, June 10th.

During this session, we will help you learn the Ceylon programming language, hand in hand, from downloading the tools, using the IDE, getting to know the various tools, the language SDK, all the way to running your own module repository and publishing your first Ceylon modules to it and to the official Ceylon module repository.

No knowledge of Ceylon required, though the audience should be familiar with the Java programming language.

The first part of the workshop (10:00am - 12:00pm) will get you familiar with the Ceylon IDE and its command-line tools, as well as the basics of the language: the new type system, how to work with lists, functional programming and how to write classes and interfaces.

During the second part (1:00pm - 3:00pm), we will examine more advanced topics, such as using the Ceylon SDK to write a REST client, and even as ambitious as writing the ceylon.html SDK module from scratch and publishing it on Herd, our module repository.

Please take the time to download, install and check the following things to make the hands-on as smooth as possible, since it's not fun to spend too much time on installation:

  • Install the Java 7 JDK, and configure it as the default JDK
  • Install the command-line distribution of Ceylon (M5 Version)
  • If you did not manage to configure Java 7 as the default JDK on Windows, define the JAVA_HOME environment variable in your system properties so that it points to your JDK 7 installation
  • Check that the Ceylon command-line tools run: ceylon --version must print the correct version of Ceylon (M5)
  • Install Eclipse Juno and the Ceylon IDE (or use our update site)
  • Check that you can create a new Ceylon project from within the Ceylon IDE and that you can launch it as a Ceylon Application (make a demo project with Hello World! to test)
  • Finally check that the Ceylon command-line tools work in your project as well: in your demo Ceylon project's folder, the ceylon compile demo command should compile your project (provided you named your Ceylon module demo), and ceylon run demo/1 (provided you named your version 1) should print Hello World!

Make sure you register, because this is going to be another great JUDCon!

About modules

Modules, ah, modules. The albatross of Java. I frequently joke that modules are scheduled for Java N+1 where N moves forward with each release. I remember perfectly the first time I heard of Java getting modules at Devoxx, back when they were still planned for Java 7. I remember I heard the announcement and what I saw made a lot of sense, I couldn't wait to get them. I agreed completely, and still do, with the assertion that modules belong in the language, just like packages, and that they should not be delegated to third-party external systems which will never be able to have the level of integration that you can achieve by being an integral part of the language. And then, they pushed it to Java 8. And then to Java 9. It's a shame because it looks like a really well thought-out module system.

Meanwhile, Ceylon already supports modules, in a generally awesome way. But why do we need modules? Here are some easy answers:

  • To serve as a larger container than packages. Everyone ships Java code into jars which are generally accepted to be modules: they represent a group of packages that implement a library or program and have a version.
  • To express and resolve dependencies: your module (jar) is going to depend on other modules (jars), so we should know which they are and how to find them.
  • To distribute modules: Linux distributions have been doing this forever. With a modular system where modules have names and versions, you can organise them in a standard hierarchy, which can then be used by tools to obtain those modules in a standard way. With dependencies, the tools can then download/upload the dependencies too, which means distribution becomes a lot simpler.
  • To isolate modules at runtime. This is part of escaping the infamous classpath hell: if you isolate your modules at runtime, then you can have multiple programs that depend on different versions of the same module loaded at the same time.

So why is it better if we have modules in the language, rather than outside of it?

  • It's standard and tools have to support it. You could view this as a downside, but really, what part of the Java language would you rather have left unspecified? Would you be ready to delegate the implementation of packages to third-party tools? Having a single way to deal with modules helps a lot, both users and implementors.
  • You get rid of external tools, because suddenly javac, java and friends know how to publish, fetch and execute modules and their dependencies. No more plumbing and sub-par fittings.
  • It gets integrated with reflection. Modules are visible at runtime, fully reified and hopefully dynamic too, just like classes and packages are.
  • Dependencies and modules are separated from the build system. There's absolutely no good reason why the two should be confused with one another.

These are the reasons why I can't wait for Java N+1 to have modules, I think they'll be great. But if you need modules now, then you can use Ceylon :)

Ceylon supports modules in the language, from the start, with:

  • a super small ceylon.language base module, which is all you need to start using Ceylon
  • a modular SDK
  • a great module repository: Herd
  • support for module repositories in all the tools, be it the command-line or the IDE. They know how to deal with dependencies, fetch or publish modules to/from local or remote repositories
  • superb support for Herd repositories from the IDE, with the ability to search for modules, have auto-completion and all
  • support for a modular JDK, using the same module map as the Jigsaw project (Java's planned modular SDK)
  • and even interoperability with Maven repositories, so that you can use Maven modules as if they were Ceylon modules

Other existing third-party module systems

As I mentioned, we support using Maven repositories from Ceylon, and we will probably support OSGi too, in time. Those are the two main third-party module systems for Java. OSGi is used a lot in application servers or IDEs, but rarely by most Java programmers, who prefer Maven. Maven was the first system that provided both modularity and a build system for the JVM. Prior to that we had Ant, which only provided the build system. Maven essentially prevailed over Ant because it supported modularity, which meant that people no longer had to worry about how to get their dependencies. Even applying the modularity solution to itself, it became easier to distribute Maven modules than Ant modules.

Maven supports modularity and dependencies, in order to resolve and download them, but once they are downloaded, the dependencies are crammed into the classpath, so as far as the Java compiler or runner are concerned, modularity has been erased. There is no support for multiple versions of the same module at runtime, or any kind of validation of dependencies.

The undeclared transitive dependency problem

We recently had bug reports for Ceylon where our users had trouble using Maven modules from Ceylon, due to missing dependencies. Since we do use the dependencies provided by Maven, we found that a bit weird. After checking, it appears that we've been hit by the modularity erasure of Maven. Here's a simple example of something you can do with Maven modules:

  • You write a module A, which uses module B and C, but you only declare a dependency on module B
  • Module B depends on module C
  • Module C does not depend on anything

In Ceylon, if you tried to compile module A, the compiler would not let you because you failed to depend on C. With Maven, it just works, because Maven fetches modules B and C and puts them in the classpath, which means all implicit transitive dependencies end up visible to your module. Modularity is erased. This may seem convenient, but it really means that dependencies are not checked and cannot be trusted.

Due to that, we can't rely on Maven modules to properly declare their dependencies, so we cannot run Maven modules in isolation.
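The difference between the two behaviours can be modelled in a few lines of Java. This is only a toy sketch of the A/B/C scenario above, not how Maven or the Ceylon compiler are implemented; DependencyCheck, undeclared and flatClasspath are names invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DependencyCheck {
    // Modules that are used but missing from the set of visible modules.
    static Set<String> undeclared(Set<String> visible, Set<String> used) {
        Set<String> missing = new TreeSet<>(used);
        missing.removeAll(visible);
        return missing;
    }

    // Flat classpath: every module reachable through declared dependencies is visible.
    static Set<String> flatClasspath(String root, Map<String, Set<String>> declared) {
        Set<String> visible = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>(declared.getOrDefault(root, Set.<String>of()));
        while (!pending.isEmpty()) {
            String module = pending.pop();
            if (visible.add(module)) {
                pending.addAll(declared.getOrDefault(module, Set.<String>of()));
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> declared = Map.of(
            "A", Set.of("B"),
            "B", Set.of("C"),
            "C", Set.<String>of());
        Set<String> usedByA = Set.of("B", "C");

        // Strict check: only A's own declarations count, so C is reported.
        System.out.println(undeclared(declared.get("A"), usedByA));              // prints [C]
        // Flat classpath: C is reachable through B, so nothing is reported.
        System.out.println(undeclared(flatClasspath("A", declared), usedByA));   // prints []
    }
}
```

The strict variant catches the missing declaration at build time, while the flat variant hides it until someone tries to load A in isolation.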

The undeclared implicit dependency problem

There's something more subtly wrong with unchecked module systems: implicit dependencies. This is something you're allowed to do with Maven:

  • Module A uses module B
  • Module B depends on module C and declares it
  • Module B uses types from module C in its public API, such as parameter types, method or field types, or even super classes

This is a variant of the first kind of problem, except that in this case nobody can use module B without also importing module C directly, because it's not possible to use the types of module B without seeing the types of module C.

In Ceylon, if you tried to compile module B, the compiler would not let you unless you export your module C dependency. This way, when you depend on module B, you automatically also depend on module C, because you really need both to be visible to be able to use module B's API.
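As a rough model of that rule, the sketch below checks whether a module's public API leaks a dependency it does not export, and computes what an importer ends up seeing once exports are followed. It is a simplification written for this post, not the Ceylon compiler's algorithm; SharedImportSketch, leakedButNotShared and visibleTo are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SharedImportSketch {
    // Modules whose types appear in a public API without being exported.
    static Set<String> leakedButNotShared(Set<String> usedInPublicApi, Set<String> exported) {
        Set<String> leaked = new TreeSet<>(usedInPublicApi);
        leaked.removeAll(exported);
        return leaked;
    }

    // Modules visible to 'module': its direct imports plus whatever those export.
    static Set<String> visibleTo(String module,
                                 Map<String, Set<String>> imports,
                                 Map<String, Set<String>> exports) {
        Set<String> visible = new TreeSet<>();
        for (String direct : imports.getOrDefault(module, Set.<String>of())) {
            collect(direct, exports, visible);
        }
        return visible;
    }

    private static void collect(String module, Map<String, Set<String>> exports, Set<String> visible) {
        if (!visible.add(module)) {
            return; // already visited
        }
        for (String reExported : exports.getOrDefault(module, Set.<String>of())) {
            collect(reExported, exports, visible);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> imports = Map.of("A", Set.of("B"), "B", Set.of("C"));
        Map<String, Set<String>> noExports = Map.of();
        Map<String, Set<String>> bExportsC = Map.of("B", Set.of("C"));

        // B's API mentions C but B does not export it: this is the rejected case.
        System.out.println(leakedButNotShared(Set.of("C"), Set.<String>of()));  // prints [C]
        // Without the export, A only sees B.
        System.out.println(visibleTo("A", imports, noExports));                 // prints [B]
        // With the export, depending on B brings C along.
        System.out.println(visibleTo("A", imports, bExportsC));                 // prints [B, C]
    }
}
```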

Another point in favour of integrated module systems

If we had an integrated module system from the start, in Java, with proper module isolation, we would not have the issues I just described with missing dependencies that are so widespread in Maven, because there are no tools to prevent you from making these mistakes. Compilers do not let you use packages unless you import them, there's no reason to expect that the same would not hold for modules.

I still think the modules project for Java will be a great leap forward, but since Java N+1 is still not here, and there's a huge library of Maven modules that we want to be able to use, we have to find a way to bypass the limitations of Maven's dependency declarations to let you use them in Ceylon. We have various ideas of how to do that, from automatic detection of dependencies through bytecode analysis, to storing Maven modules in a "flat classpath" container, or even via dependency overrides where users could "fix" Maven dependencies. We're still in the process of evaluating each of these solutions, but if you have other suggestions, feel free to pitch in :)

Google Summer of Code 2013

This year we are going to participate in the Google Summer of Code under the JBoss Community organisation.

We've put together a page with our proposals, so check it out, and if you're a student and would like to participate in the Ceylon project in a way that will really matter for our users, go apply now.

Tuples and function types

Tuples

Ceylon is getting tuples in the next version. For those who don't know what a tuple is, it's a bit like a struct in C: a finite sequence of heterogeneously typed values, with no methods (unlike a class) and no names (unlike C structures). But let's start with an example.

In Ceylon, [1, 2.0, "Hello"] is a tuple value of size 3 with type [Integer, Float, String]. Now let's see a more useful example:

// convert a sequence of strings into a sequence of tuples with size, string
// and number of uppercase letters
Iterable<[Integer,String,Integer]> results = 
    {"Merry Christmas", "Happy Cheesy Holidays"}
        .map((String s) [s.size, s, s.count((Character c) c.uppercase)]);
for([Integer,String,Integer] result in results){
    Integer size = result[0];
    String s = result[1];
    Integer uppercaseLetters = result[2];
    print("Size: " size ", for: '" s "', uppercase letters: " uppercaseLetters "");
}

Which outputs:

Size: 15, for: 'Merry Christmas', uppercase letters: 2
Size: 21, for: 'Happy Cheesy Holidays', uppercase letters: 3

As you can see, you can access each tuple element using the Correspondence.item syntax sugar result[i], and that's because our Tuple type satisfies Sequence, underneath.

You may ask: what is the sequence type of an [Integer, String, Integer] tuple, then? Easy: it's Sequential<Integer|String>.

Then how do we possibly know that result[2] is an Integer and not an Integer|String? Well, that's again syntactic sugar.

You see, our Tuple type is defined like this:

shared class Tuple<out Element, out First, out Rest>(first, rest)
        extends Object()
        satisfies Sequence<Element>
        given First satisfies Element
        given Rest satisfies Element[] {

    shared actual First first;
    shared actual Rest rest;

    // ...
}

So a tuple is just a sort of elaborate cons-list and result[2] is syntactic sugar for result.rest.rest.first, which has type Integer.

Similarly, the [Integer, String, Integer] type literal is syntactic sugar for:

Tuple<Integer|String, Integer, Tuple<Integer|String, String, Tuple<Integer, Integer, Empty>>>

And last, the [0, "foo", 2] value literal is syntactic sugar for:

Tuple(0, Tuple("foo", Tuple(2, empty)))

As you can see, our type inference rules are smart enough to infer the Integer|String sequential types by themselves.
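For readers more at home in Java, the cons-list idea can be approximated with nested generics. This is a loose sketch only: Java has no union types, so the Element type parameter is dropped, and TupleSketch, Tuple, Empty and cons are names made up for the example.

```java
class TupleSketch {
    // Marker for the end of the list, standing in for Ceylon's Empty.
    static final class Empty {
        static final Empty INSTANCE = new Empty();
        private Empty() {}
    }

    // One cell: a typed first element and a typed rest.
    static final class Tuple<First, Rest> {
        final First first;
        final Rest rest;

        Tuple(First first, Rest rest) {
            this.first = first;
            this.rest = rest;
        }
    }

    static <F, R> Tuple<F, R> cons(F first, R rest) {
        return new Tuple<>(first, rest);
    }

    public static void main(String[] args) {
        // Roughly the [Integer, String, Integer] tuple from the example above.
        Tuple<Integer, Tuple<String, Tuple<Integer, Empty>>> result =
            cons(15, cons("Merry Christmas", cons(2, Empty.INSTANCE)));

        Integer size = result.first;                  // like result[0]
        String s = result.rest.first;                 // like result[1]
        Integer uppercaseLetters = result.rest.rest.first; // like result[2]
        System.out.println("Size: " + size + ", for: '" + s
            + "', uppercase letters: " + uppercaseLetters);
    }
}
```

Each element keeps its own static type because the position is encoded in the nesting, which is the same trick result.rest.rest.first relies on.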

Function types

So tuples are nice, for example to return multiple values from a method, but that's not all you can do with them.

Here's how our Callable type (the type of Ceylon functions and methods) is defined:

shared interface Callable<out Return, in Arguments> 
    given Arguments satisfies Void[] {}

As you can see, its Arguments type parameter accepts a sequence of anything (Void is the top of the object hierarchy in Ceylon, above Object and the type of null), which means we can use tuple types to describe the parameter lists:

void threeParameters(Integer a, String b, Float c){
}

Callable<Void, [Integer, String, Float]> threeParametersReference = threeParameters;

So, Callable<Void, [Integer, String, Float]> describes the type of a function which returns Void and takes three parameters of types Integer, String and Float.

But you may ask: what is the type of a function with defaulted parameters? Defaulted parameters are optional parameters that get a default value if you don't specify them:

void oneOrTwo(Integer a, Integer b = 2){
    print("a = " a ", b = " b ".");
}
oneOrTwo(1);
oneOrTwo(1, 1);

That would print:

a = 1, b = 2.
a = 1, b = 1.

So let's see what the type of oneOrTwo is:

Callable<Void, [Integer, Integer=]> oneOrTwoReference = oneOrTwo;
// magic: we can still invoke it with one or two arguments!
oneOrTwoReference(1);
oneOrTwoReference(1, 1);

So its type is Callable<Void, [Integer, Integer=]>, which means that it takes one or two parameters. We're not sure yet about the = sign in Integer= to denote that it is optional, so that may change. This is syntactic sugar for Callable<Void, [Integer] | [Integer, Integer]>, meaning: a function that takes one or two parameters.

Similarly, variadic functions also have a denotable type:

void zeroOrPlenty(Integer... args){
    for(i in args){
        print(i);
    }
}
Callable<Void, [Integer...]> zeroOrPlentyReference = zeroOrPlenty;
// magic: we can still invoke it with zero or more arguments!
zeroOrPlentyReference();
zeroOrPlentyReference(1);
zeroOrPlentyReference(5, 6);

Here, [Integer...] means that it accepts a sequence of zero or more Integer elements.

Now if you're not impressed yet, here's the thing that's really cool: thanks to our subtyping rules (nothing special here, just our normal subtyping rules related to union types, plus the fact that Callable is contravariant in its Arguments parameter), we're able to figure out that a function that takes one or two parameters is a subtype of both functions that take one parameter and functions that take two parameters, so it can be assigned to either:

Callable<Void, [Integer, Integer=]> oneOrTwoReference = oneOrTwo;

// we can call it with one parameter
Callable<Void, [Integer]> one = oneOrTwoReference;
// or two
Callable<Void, [Integer, Integer]> two = oneOrTwoReference;

And similarly for variadic functions, they are subtypes of functions that take any number of parameters:

Callable<Void, [Integer...]> zeroOrPlentyReference = zeroOrPlenty;

// we can call it with no parameters
Callable<Void, []> zero = zeroOrPlentyReference;
// or one
Callable<Void, [Integer]> oneAgain = zeroOrPlentyReference;
// or two
Callable<Void, [Integer, Integer]> twoAgain = zeroOrPlentyReference;
// and even one OR two parameters!
Callable<Void, [Integer, Integer=]> oneOrTwoAgain = zeroOrPlentyReference;

Although this is something that dynamic languages have been able to do for decades, it is pretty impressive in a statically typed language. It allows flexibility in how you declare, use and pass functions around, in a way that follows the rules of logic (and is checked!) and doesn't make you fight the type system.
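Java cannot express this with union types, but the shape of the relation can be imitated with interfaces: something callable with one or two arguments can be used wherever a one-argument or a two-argument function is expected. The sketch below is an analogy only, and ArityDemo, One, Two and OneOrTwo are invented names.

```java
class ArityDemo {
    interface One { String call(int a); }
    interface Two { String call(int a, int b); }

    // Callable with one or two arguments, hence usable as either of the above.
    interface OneOrTwo extends One, Two {}

    static final OneOrTwo ONE_OR_TWO = new OneOrTwo() {
        @Override
        public String call(int a) {
            return call(a, 2); // the default value for b
        }

        @Override
        public String call(int a, int b) {
            return "a = " + a + ", b = " + b + ".";
        }
    };

    public static void main(String[] args) {
        One one = ONE_OR_TWO; // accepted: OneOrTwo is a subtype of One
        Two two = ONE_OR_TWO; // accepted: OneOrTwo is a subtype of Two
        System.out.println(one.call(1));    // prints a = 1, b = 2.
        System.out.println(two.call(1, 1)); // prints a = 1, b = 1.
    }
}
```

The difference is that in Java the relation has to be declared by hand for every combination, whereas in Ceylon it falls out of the ordinary subtyping rules.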

By the way

I've shown here that we support method references in Ceylon, but the same is true with constructors, because after all a constructor is nothing more than a function that takes parameters and returns a new instance:

class Foo(Integer x){
    shared actual String string = "Foo " x "";
}
print(Foo(2));

Callable<Foo, [Integer]> makeFoo = Foo;
print(makeFoo(3));

Will print:

Foo 2
Foo 3
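For comparison, Java 8 and later offer something similar through constructor references. A minimal sketch of the same example, with ConstructorReference as a made-up wrapper class name:

```java
import java.util.function.Function;

class ConstructorReference {
    static final class Foo {
        private final int x;

        Foo(int x) {
            this.x = x;
        }

        @Override
        public String toString() {
            return "Foo " + x;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Foo(2));             // prints Foo 2

        // The constructor used as a plain function from Integer to Foo.
        Function<Integer, Foo> makeFoo = Foo::new;
        System.out.println(makeFoo.apply(3));       // prints Foo 3
    }
}
```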