Lambda a peek under the hood Notes

(Note: This note is from watching Brian Goetz‘s Lambda: A peek under the hood)

What bytecode java complier will emit for Java lambda?

The talk outlines the technical details of how lambda expressions are implemented in Java SE 8 at the bytecode level.

A Lambda expression is like an “anonymous method”

Why lambdas for java?

  • Provide libraries a path to multicore
    • Parallel-friendly APIs need internal iteration
    • Internal iteration needs a concise code-as-data mechanism
  • Empower library developers
    • Enable a higher degree of cooperation between libraries and client code
  • It’s about time!
  • Java already has inner classes, isn’t that enough?
    • Inner classes give use some of these benefits, but are too clunky
    • Complex naming lookup rules
  • How to represent lambda expressions at runtime is not a trivial question!
    • Need a clean story at both the language and VM levels
    • That’s what this talk is about

Question 1: Typing

What is the type of a lambda expression?

  • Early proposals included adding function types to the type system
  • Why not?
    • Proposals rarely went deeper than syntax
      • Without impedance mismatches

Why not just add function types?

  • Inconvinent questions
    • How could we represent functions in VM method/field/type signatures?
      • Need a way to write void method(String -> int f) in bytecode
      • Method signatures are (currently) entirely nominal
    • How would we represent invocation in bytecode?
    • How would we create instances of function-typed variables?
    • How would we support variance?
  • Gaps between language and VM representation are always pain points

Alternative: functional interfaces (Use what you know)

  • Historically we modeled functions using single-method interfaces
    • Runnable, Comparator
  • Rather than complicate the type system, let’s just formalize that

    • Give them a name
      • Always convert lambda expressions to instance of a functional interface
  • “Just add function types” was obvious… and wrong

    • Would have introduced complexity and corner cases
    • Would have bifurcated libraries into “old” and “new” styles
    • Would have created interoperability challenges
  • People already understand one-method interfaces

  • Bonus: existing libraries are now forward-compatible to lambdas
  • Lessons learned: a stodgy old approach is sometimes better tha a shinny new one!

Question 2: Representation

How does lambda instance get created?

  • Need to covert a function into an instance of a functional interface
  • Need to handle captured variables ((person p) -> p.age < minAge;)
  • The obvious choice is … inner classes

Why not “just” use inner classes?

We could say that a lambda is “just” an inner class intance:

1
2
3
4
5
6
7
class Foo$1 implements Predicate<Person> {
private final int $minAge;
Foo$1(int v0) { this.$minAge = v0; }
public boolean test(Person p) {
return p.age < $minAge;
}
}

Then lambda capture becomes constructor invocation

list.removeIf(p -> p.age < minAge); => list.removeIf(new Foo$1(minAge));

and it will be translated to

1
2
3
4
5
6
aload_1 // list
new #2 // class Foo$1
dup
iload_2 // minAge
invokespecial #3 // Method Foo$1."<init>":(I)V
invokeinterface #4 // java/util/List.removeIf:(Ljava/util/functions/Predicate:)Z

But translating to inner classes means we inherit most of their problems

  • Performance issues
    • One class per lambda expression
    • Type profile pollution
    • Always allocates a new instance
  • Complicated and error-prone “comb” lookup for names
  • Whatever we do becomes a binary representation for lambdas in Java

New bytecode tool: MethodHandle

  • Java SE7 adds VM-level method handles
    • Can store references to methoes in the constant pool, load with LDC
    • Can obtain a method handle for any method (or filed access)
    • VM will happily inline through MH calls
    • API-based combinators for manipulating method handles
      • Add, remove, reorder arguments
      • Adapt (box, unbox, cast) arguments and return types
      • Compose methods

Why not “just” use MethodHandle?

  • Translating lambdas to MethodHandle seems obvious
    • Lambda is language-level method object
    • MethodHandle is VM-level method object
  • We could
    • Desugar lambdas expression to methods
    • Represent lambdas using MethodHandle in bytecode signatures

If we represented lambdas as MethodHandle, we’d translate list.removeIf(p -> p.age < minAge); to:

1
2
3
private static boolean lambda$1(int minAge, Person p) {
return p.age < minAge;
}

which is:

1
2
3
MethodHandle mh = LDC(lambda$1);
mh = MethodHandles.insertArguments(mh, 0, minAge);
list.removeIf(mh);

But if we did this, the signagure of List.removeIf becomes void removeIf(MethodHandle predicate)! Which is erasure on steroids!

  • Confusing: Can’t overload methods that take differently “Shaped” lambdas:
    • void foo(String -> int)
    • void foo(int -> int)
  • Plus, still would need to encode the erased type information somewhere
  • (Also: is invocation performance competitive with bytecode invocation?)

Stepping back

  • We would like a binary interface not tied to a specific implementation
    • Inner classes have too much baggage
    • MethodHandle is too low-level, is erased
    • Can’t force users to recompile, ever, so have to pick now
  • What we need is … another level of indirection
    • Let the static compiler emit a declarative recipe, rather than imperative code, for creating a lambda
    • Let the runtime execute that recipe however it deems best
    • And make it darned fast
    • Which sounds like a job for invokedynamic!

Bytecode invocation modes

  • Prior to Java SE 7, the JVM had four bytecodes for method invocation:
    • invokestatic: for static methods
    • invokevirtual: for class methods
    • invokeinterface: for interface methods
    • invokespecial: for constructors, private methods, and super-calls
  • Each specifies a class name, method name, and method signature

New bytecode tool: invokedynamic

  • Java SE 7 adds a 5th invocation mode: invokedynamic (indy)
  • Behavior of invoke {virtual, static, interface} are fixed and Java-like
    • But other langauges need custom method linkage
  • Basic idea: let some “language logic” determine call target

    • But then get out of the way
    • Language and VM become partners in flexible and efficient method dispatch
  • Invokedynamic started out as a tool for dynamic languages

    • A typical application would be invoking a method like def add(a, b) { a + b }
  • Here, the types of a and b are not known at compile time

    • And can change from call to call… but probably don’t
      • Good chance that, if called with two ints, next call will still be with two ints
    • We win by not having to re-link the call site for every invocation
  • The first time t he JVM executes an invokedynamic

    • Consults the bootstrap method for call site
    • Bootstrap returns a linked call site
    • Call site embed conditions under which it needs relinking (if any)
      • Such as the argument types changing
      • Otherwise, JVM does not have to consult bootstrap again
  • After linkage, JVM can treat the call site as fully linked

    • Can inline through linked indy callsites like any other
  • An indy callsite has three groups of operands

    • A bootstrap mthod
      • Called by the VM for linking the callsite on first invocation
      • Not called again after that
    • A static argument list, embedded in the constant pool
      • Available to the bootstrap method
    • A dynamic argument list, like any other method invocation
      • Not available to the bootstrap, but their static types and arity are
      • Passed to whatever target the callsite is linked to
  • If indy is for dynamic languages, why is the Java compiler using it?

    • All the types involved are static
    • What dynamic here is the code generation strategy
      • Generate inner classes?
      • Use method handles?
      • Use dynamic proxies?
      • Use VM-private APIs for constructing objects?
  • Indy lets us turn this choice into a pure implementation detail

    • Separate from the binary representation
  • We use indy to embeded a recipe for constructing a lambda, including:

    • The desugared implementation method (static)
    • The functional interface we are converting to (static)
    • Additional meta data, such as serialization information (static)
    • Values captured from the lexical scope (dynamic)
  • The capture site is called the lambda factory
    • Invoked with indy, returns an instance of the desired functional interface
    • Subsequent captures bypass the (slow) linkage path

Desugaring lambdas to methods

  • First, we desugar the lambda to a method, as before
    • Signarture matches functional interface method
    • Plus captured arguments prepended
    • Simplest lambdas desugar to static methods
      • But some need access to receiver, and so are intance methods
1
Predicate<Person> pred = p -> p.age < minAge;

becomes

1
2
3
private static boolean lambda$1(int minAge, Person p) {
return p.age < minAge;
}
  • Then we generate a indy call site which, when called, return the lambda:
1
list.removeIf(p -> p.age < minAge);

to

1
2
3
4
5
6
7
8
9
10
Predicate $p = indy(bootstrap=LambdaFactory,
staticargs=[Predicate, lambda$1],
dynaargs=[minAge]);
list.removeIf($p);
...
private static boolean lambda$1(int minAge, Person p) {
return p.age < minAge;
}

By deferring the code generation choice to runtime, it becomes a pure implementation detail.

For stateless (non-capturing) lambdas, we can create one single instance of the lambda object and always return that

Indy functions as a lazily intialized cache:

  • Defers initialization cost to first use
  • No heap overhead if lambda is never used
  • No extra field or static initializer
  • All stateless lambdas get lazy intialization and caching for free

Performance costs

  • Any translation scheme imposes costs at several levels:
    • Linkage cost: one-time cost of setting up lambda capture
    • Capture cost: cost of creating a lambda instance
    • Invocation cost: cost of invoking the lambda method
  • For inner class instances, these correspond to:
    • Linkage: loading the class
    • Capture: invoking the constructor
    • Invocation: invokeinterface
  • For linkage cost, lambda is 8%-24% faster than inner class.
  • For capture cost, worst-case lambda is equal to inner class, but for non-capturing lambda, it is 20 times more scalable.

Future VM suport

  • With VM help, we can optimize even further
  • VM could intrinsify lambda capture sites
  • Lambda capture is essentially a “boxing” operation

Serialization

  • We can’t just serialize the lambda object
    • Implementing class won’t exist at deserialization time
    • Deserializing VM may use a differenct translation stragegy
    • Need a dynamic serialization strategy too!

Summary

  • The evolutionary path is often full of obvious-but-wrong ideas
  • We use invokedynamic to capture lambda expressions
    • Gives us flexibility and performance
    • Free to change translation strategy at runtime
  • Even using the “dumb” translation strategy
    • No worse than inner classes in the worst case
    • 5 - 20x better than inner classes in a common case