Pattern Matching Quartet - JDK News #1

A summary of four recent discussions about pattern matching on the Project Amber mailing lists

Always embed videos

(and give me a cookie to remember - privacy policy)

Watch on YouTube

What follows is the script I used for the video.

Intro

Hi everyone,

I'm nipafx (but you can call me Nicolai) and today it's gonna be you, me, and the JDK news of the last few weeks up to today, January 26th 2021.

As you know, the JDK is developed out in the open and lots of interesting ideas and considerations are discussed on its many mailing lists. So many, actually, that I got stuck on the very first one I went through, which was Project Amber, so today we're just gonna look at that, in particular at these four items, all related to pattern matching:

The JDK is developed out in the open
  • Pattern Matching Docs
  • Array Patterns
  • Destructuring Patterns in for-each Loops
  • Diamond in Type Patterns?

My goal is not to explain all the ins and outs, but to give you an overview of what's being discussed with lots of links in the video description if you want to take a closer look.

For the next few minutes, keep in mind that we're peeking into an ongoing conversation, so none of this is decided and there's a lot of speculation in there, some of which is mine. Don't get distracted by the syntax - it's often just, you know, one of those - and focus on language capabilities instead. And please take a look at the relevant links before making up your mind.

With that out of the way, let's dive right in!

Pattern Matching Docs

As I mentioned, all of today's topics are related to pattern matching, so it's good to get the basics down first. Fortunately for us, Brian Goetz and Gavin Bierman recently published Pattern Matching in the Java Object Model and I highly recommend to give it a thorough read. It covers all the bases, dives deep, and explains well - here's a table of contents.

Why pattern matching?

  • Recap -- what is pattern matching?
  • Aggregation and destructuring
  • Object creation in Java
  • Composition
  • Isn't this just multiple return?
  • Patterns as API points
  • Data-driven polymorphism
If you're interested in pattern matching, it's a must-read

Patterns as class members

  • Anatomy of a pattern
  • Deconstruction patterns
  • Method patterns
  • Additional degrees of freedom

Combining patterns

  • A possible approach for parsing APIs
  • Down the road: structured patterns?
  • Flatter APIs

If you're interested in pattern matching in Java, it's a must-read. If you're not, you might as well stop the video now, because next up is more pattern matching.

Array Patterns

In a mail from January 5th, Brian Goetz starts the discussion of array patterns.

Object objects = new String[] { "f", "x" }

if (objects instanceof String[] { var a, var b })
	// prints "f / x"
	System.out.println(a + " / " + b);

These are patterns that you can use to match and destructure arrays. There are three steps hiding in this line:

  1. is objects a String array with length 2?
  2. if so, cast it to that type and extract the two elements
  3. declare the variables a and b and assign the two elements to them

After processing this, one of my first thoughts was "What if there are more elements? Can we extract the rest of the array into a variable, too?". And you can tell that Brian has been doing this for a while because here's what he writes next - and I quote:

People are immediately going to ask "can I bind something to the remainder"; I think this is mostly an "attractive distraction", and would prefer to not have this dominate the discussion.

Ok, then, sorry I asked.

As Patterns

In a follow-up mail Suminda Sirinath asks whether it will be possible to optionally extract the array itself.

Object objects = new String[] { "f", "x" }

if (objects instanceof String[] { var a, var b } letters)
	// use String a, b and String[] letters

This is apparently called an ass pattern, no not like that and I recommend not to image search that while at work. An as pattern and Gavin Bierman says it's on the list of things to consider. Neat.

Of Heads And Tails

After a little more back and forth, the conversation indeed lands on matching the array's head and tail, and Brian gives a good insight into how - quote - "idioms from language X are very much connected to the rest of language X and ignoring this rarely works out well":

Destructuring a list into head and tail is common in functional programming languages like Lisp and Haskell, where they work well because (a) lists are essentially linked lists, so creating the tail is cheap and (b) such languages have tail call elimination, so that using recursion to process lists is "natural, efficient, and doesn't lead to StackOverflowExceptions".

(Quick aside on tail call elimination - what's that? A recursive funtion calls itself, right?

What's tail call elimination?
public long length(Iterator<?> elements) {
	return length(elements, 0)
}

private long length(Iterator<?> elements, int length) {
	if (!elements.hasNext())
		return length;
	iterator.next();
	return length(elements, length + 1);
}

So when it does that for the first time, you have the original call with its arguments on the JVM's call stack and then the first recursive call with its arguments. Keep doing this and every time you put another method call plus arguments on the stack. Unfortunately, the stack is finite, so you can run out and get a StackOverflowException. You can expect the JVM to arrive at that point after a few tens of thousands of recursive calls.

Functional programming languages rely on recursion, though, so to prevent these problems, their compilers use a trick: If the recursive call is at the tail end of the method, meaning it's the last thing the method does before returning, like in this example, the entire recursive solution can automatically be transformed into a loop. So the compilers do that. They eliminate the tail call to create a loop instead.)

Back to Brian's points, Java's arrays aren't linked lists (so creating a tail is expensive) and it doesn't eliminate tail calls (so using recursion is risky). Giving developers a syntactically easy way to split an array into head and tail will invite them to create solutions that perform poorly and are unreliable.

So I guess that's out the window.

Destructuring Patterns in For-Each Loops

record Rectangle(double xLength, double yLength) { }
Rectangle[] rectangles = ...

// print all areas
for (Rectangle rect : rectangles) {
	double area = rect.xLength() * rect.yLength();
	System.out.println(area);
}

for (Rectangle(double xLength, double yLength) : rectangles) {
	double area = xLength * yLength;
	System.out.println(area);
}

As August Nagro points out, another good place to use patterns are for-each loops and much to my delight, Brian says "absolutely this is on the radar". His reply goes further, though, and starts with an interesting observation.

But before we get there, we need some terminology.

//         |--------- pattern --------|
//  target |----- test ------| variable
    animal instanceof Elephant elephant

In this example, you can see the parts that make up a pattern:

  • the target is a variable or expression that we try to match
  • the test is a run-time check of some property the target may or may not have
  • the variable is what the target gets assigned to if it passes the test
  • test and variable together make up the pattern

Brian's observation is that, "if you squint", any run-of-the-mill declaration can be seen as a pattern match:

//  |---- pattern ----|   |---- target ----|
//  |- test -|variable|
    Rectangle rectangle = computeRectangle();
  • the right-hand side is the target
  • the left-hand side is the pattern with the type as test and the variable as, well, the variable

If we see it like that, deconstruction can be applied to the left-hand side.

//  |--------------- pattern ---------------|
//  |- test -|--------- variables ----------|
    Rectangle(double xLength, double yLength) =
//		|---- target ----|
		computeRectangle();
// use xLength and yLength

And this works everywhere where you declare variables: for-each loops, try-with-resources, method declarations,... Lambda expressions, too, I think?

Neat, huh? I'm really looking forward to see where exactly this goes.

Type Patterns and Generics

Today's last topic deals with type patterns and generics. As the feature is currently finalized in Java 16, using a raw type as a pattern's test means the variable will also be of that raw type. If you need the variable to have a generic type, you need to use the generic type in the test. That's not too bad in this example, but add more or nested generics, wildcards, and enterprise-grade class names and you can see that this gets out of hand quickly.

Collection<String> words = /*...*/;
if (words instanceof List wordList)
	// wordList is of raw type List
if (words instanceof List<String> wordList)
	// wordList is of type List<String>

In a mail on January 4th, Brian Goetz points this out and wonders aloud whether it would've been better to have the compiler infer the generic type for instanceof List. Since type patterns are finalized in Java 16, this ship has sailed, though, and Brian proposes to allow the diamond operator to request generic type inference from the compiler.

Collection<String> words = /*...*/;
if (words instanceof List wordList)
	// wordList is of raw type List
if (words instanceof List<String> wordList)
	// wordList is of type List<String>
if (words instanceof List<> wordList)
	// wordList is of type List<String>

There are two topics to be discussed here.

To Diamond Or Not To Diamond?

The obvious one is whether it would've been better to have the raw-appearing type trigger generic type inference or use the diamond for that. Remi Forax says that "the mix between parenthesis and angle brackets rapidly becomes unreadable, so for a type pattern inside a switch, I think that even the diamond syntax is too much."

I don't have enough insight into that aspect, yet, to have an opinion on it, but there's another one that came to my mind: Having raw types, diamond operator, and full generics behave the same way in type patterns as elsewhere in the language keeps our mental model of Java simpler, so even though nobody wants to type out the diamond operator in every generic-related pattern, it has some value.

Big Features - Small Releases?

The less obvious but arguably more interesting discussion is what Brian's discovery tells us about the development and release process of these larger language features. If you remember, in the past they would drop in one big chunk in some large release. Nowadays, we get smaller releases with self-contained and functional featurettes, but some of them are also parts of something larger, not the whole thing.

Remy argues that, by releasing patterns bit by bit, more discrepancies like this will be discovered when it's too late and that all patterns should be released at the same time. He also proposes to revert the finalization of type patterns in Java 16, so they stay preview features and can thus still be changed in future releases.

What do you think? Do you enjoy getting features and giving feedback earlier or would you prefer keeping more of them in preview until the larger picture is filled in?

Outro

And that was it for the JDK news, or rather the Project Amber news, for today. I hope you enjoyed it. If you did, leave a like or a comment, and there'll be more videos like this in the future.

Until then, have a great time. So long...