Java Lambdas and Streams

Introduction to Functional Programming in Java using Lambdas and Streams

PREREQUISITES — You should already…
  • have some Java programming experience;
  • be familiar with object-oriented concepts.

Java 8 Lambdas and Streams

Java 8 is probably the most significant upgrade to the Java programming language since Java 5 introduced generics and annotations. The most important features of Java 8 are, without a doubt, lambda expressions and streams. There are a host of new features that support lambdas and streams: default (defender) methods, constructor and method references, and classes such as Optional, Spliterator and StringJoiner. Lambdas and streams are more than just the inclusion of a few new language constructs and libraries: by adding a functional programming style to Java, they can fundamentally change the way we code and think about Java programming.

Java 8 Goals

In general, the high-level goals of Java 8 are to make code more flexible and reusable, to more easily process large data sets, and to better utilise multiple CPU cores.

There are a number of reasons why it’s important to learn how to use lambdas and streams in Java 8:

The final version of Java 8 was released in March 2014, nearly ten years after Java 5 introduced the generic programming model to Java.

Traditionally, functional languages like Haskell and LISP have libraries of algorithms for data processing. JVM-based languages such as Scala and Kotlin also have them, and now Java 8 has its own library, the new Streams API. The streams syntax is new, and can look strange at first. There are also a large number of new methods to learn and apply, particularly within the existing collection classes.

Lambdas don’t turn Java into Haskell or LISP, and streams don’t turn Java into Hadoop. Nevertheless, they provide significant new capabilities to Java, and result in the biggest change in Java programming style since generics and annotations were added to the language.

This course will explain the syntax and use of lambda expressions, introduce streams, and give examples of the types of applications to which they are well suited.

What is Functional Programming?

If we take a highly simplistic view of programming, it relies on two very different building blocks: data and code. Data is usually manipulated by code, and gets passed around between various pieces of code. Thus we can say that data is a first-class citizen of the programming world, while code is a second-class citizen. Think of the old Roman Empire: the second-class citizens usually did all the work, while the first-class citizens led a life of leisure…

Functional programming aims to promote functions to first-class citizens, so that we can pass code to other pieces of code, manipulate code, and pass algorithms around just as we pass objects around.

To support a functional programming style, a programming language must support functions as first-class citizens. Prior to Java 8, the only way we could write code in a functional style was with lots of anonymous inner class boilerplate code. With the introduction of lambda expressions, functions have become first-class citizens and can be passed around just like any other value.

Lambda expressions make it practical for a programming language to support higher-order functions. Higher-order functions are functions that either accept other functions as arguments or return a function as a result. With lambdas, Java 8 now supports higher-order functions.
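
To make the definition concrete, here is a minimal sketch of both kinds of higher-order function, using Function from java.util.function as the function type. The class and method names (HigherOrderDemo, applyTwice, adder) are made up for this illustration.

```java
import java.util.function.Function;

// Hypothetical demo class; the names are illustrative only.
class HigherOrderDemo {

    // Accepts a function as an argument and applies it twice.
    static int applyTwice(Function<Integer, Integer> f, int value) {
        return f.apply(f.apply(value));
    }

    // Returns a function as its result.
    static Function<Integer, Integer> adder(int amount) {
        return n -> n + amount;
    }

    public static void main(String[] args) {
        System.out.println(applyTwice(n -> n * 3, 2));   // 18
        System.out.println(adder(10).apply(5));          // 15
    }
}
```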

Programming Constructs

The three programming constructs at the core of functional programming in Java are lambdas, method references, and constructor references.

Lambdas will be introduced in this section, while method and constructor references will be covered in the next sections.

Origins of the Term Lambda

Lambda expressions have their roots in the lambda calculus. Lambda calculus (also written as λ-calculus) is a formal system in mathematical logic for expressing computations based on function abstraction using variable binding and substitution. It is a universal computational model that can be used to simulate any Turing machine. It was first introduced by mathematician and logician Alonzo Church in the 1930s as part of his research into the foundations of mathematics, and formally published in a paper in 1936.

Alonzo Church wanted to formalize what it means for a mathematical function to be effectively computable. He used the Greek letter lambda (λ) to mark parameters. Since then, an expression with parameter variables has been called a lambda expression. The lambda calculus influenced the design of the LISP programming language and functional programming languages in general.

Review of Anonymous Classes

Let’s consider one of the major uses of functional programming. A common problem in Java (as with other languages) is that we often need to create and/or reference code that must be executed by another piece of code, such as:

If you are familiar with the concept of design patterns, this is a classic problem that can easily be solved using the Strategy design pattern, which is one of the most widely used OO design patterns. The following quote is from the Gang of Four (GoF) book, “Design Patterns: Elements of Reusable Object-Oriented Software”:

Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from the clients that use it.

The Strategy design pattern (also called Algorithm or Policy) is where we define a set of steps for solving a particular problem, and pass that algorithm to an object, instead of that object implementing its own algorithm. It’s a way of changing the behaviour of an object on the fly at runtime. The usual (pre-functional) Java programming mechanisms for implementing the Strategy design pattern include:

Now with Java 8 we can also use lambdas, which give us a simple and concise way to define an algorithm on the fly as needed.

Let’s see some code using the simplest possible example of a Runnable thread.

Main class implementing the Runnable interface
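
The original listing is not reproduced here. A minimal sketch of this approach, with a made-up class name and message, might look like this:

```java
// Hypothetical example: the main class itself implements Runnable.
class MainRunnable implements Runnable {

    @Override
    public void run() {
        System.out.println("Hello from the main class");
    }

    public static void main(String[] args) {
        // The class passes an instance of itself to the Thread constructor.
        Thread thread = new Thread(new MainRunnable());
        thread.start();
    }
}
```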

Separate class implementing the Runnable interface
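
Again as a sketch rather than the original listing, the algorithm can live in its own top-level class. PrintTask and SeparateClassDemo are hypothetical names.

```java
// Hypothetical task class, separate from the class that starts the thread.
class PrintTask implements Runnable {

    @Override
    public void run() {
        System.out.println("Hello from a separate class");
    }
}

class SeparateClassDemo {

    public static void main(String[] args) {
        new Thread(new PrintTask()).start();
    }
}
```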

Named inner class implementing the Runnable interface
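
A sketch of the same task written as a named inner class; the names InnerClassDemo and Task are assumptions made for illustration.

```java
class InnerClassDemo {

    // Named inner class: declared inside the class that uses it.
    class Task implements Runnable {
        @Override
        public void run() {
            System.out.println("Hello from a named inner class");
        }
    }

    void start() {
        new Thread(new Task()).start();
    }

    public static void main(String[] args) {
        new InnerClassDemo().start();
    }
}
```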

Anonymous inner class implementing the Runnable interface
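
A sketch of the anonymous inner class version, in which the Runnable is declared and instantiated right where it is passed to the Thread constructor. The class name and the startThread() helper are hypothetical.

```java
class AnonymousClassDemo {

    // Starts a thread whose task is an anonymous inner class.
    static Thread startThread() {
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Hello from an anonymous inner class");
            }
        });
        thread.start();
        return thread;
    }

    public static void main(String[] args) {
        startThread();
    }
}
```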

Lambda Expressions

When we look at the code of the previous anonymous inner class, we can see that the compiler knows that the object passed to the Thread constructor must be a Runnable object, and as such, it must implement the run() method. An intelligent compiler can infer a fair amount of information from the existing code, so much of that code is redundant. The only things the compiler cannot infer are the parameter names (if any) and the code inside the method body.

This code can be replaced by a lambda expression which represents a block of code, i.e. a function, that can be executed at a later time. There are no actual function types in Java, so instead, functions are implemented as instances of classes that implement a particular interface. We can think of lambda expressions as being very similar to anonymous inner classes. Lambda expressions give us a convenient syntax for creating such instances.

A lambda expression is a concise description of an anonymous function that can be passed around. The function doesn’t have a name, but it does have a list of parameters, a body, a return type, and possibly also a list of potentially thrown exceptions. We think of lambdas as functions, as opposed to methods, because lambdas aren’t associated with any particular classes in the same way as methods. The syntax has the following form:

The three components of the lambda expression are the parameter list, the arrow symbol and the body.
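
As a rough sketch, the two general shapes are (parameters) -> expression and (parameters) -> { statements; }. The snippet below shows one of each. BinaryOperator from java.util.function is used here only as a convenient target type, and the class and constant names are made up.

```java
import java.util.function.BinaryOperator;

class LambdaSyntaxDemo {

    // Expression body: (parameters) -> expression
    static final BinaryOperator<Integer> SUM = (a, b) -> a + b;

    // Block body: (parameters) -> { statements; }
    static final BinaryOperator<Integer> MAX = (a, b) -> {
        if (a >= b) {
            return a;
        }
        return b;
    };

    public static void main(String[] args) {
        System.out.println(SUM.apply(2, 3));   // 5
        System.out.println(MAX.apply(2, 3));   // 3
    }
}
```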

A normal method has four elements: a name, a return type, a parameter list, and a body.

The major difference between a lambda and a method is that a lambda expression only has the last two elements: the parameter list and the body.

Simple Examples

Lambda                                     Explanation
n -> n % 2 == 0                            Given a number n, returns true if it is even.
(char c) -> c == 'A'                       Given a character c, returns true if it is equal to 'A'.
(x, y) -> x + y                            Given two numbers x and y, returns the sum.
(int a, int b) -> a * a + b * b            Given two ints a and b, returns the sum of their squares.
() -> 42                                   Given no parameters, returns 42.
() -> { return 3.14159; }                  Given no parameters, returns 3.14159.
(String s) -> { System.out.println(s); }   Given a String s, prints s and returns void.
() -> { System.out.println("Hello!"); }    Given no parameters, prints Hello! and returns void.
Arrays.sort(nums, (n1, n2) -> n2 - n1);    Passes a lambda as the comparator to sort an object array (e.g. Integer[]) in descending order.
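
None of the lambdas in the table can stand alone: each needs a target type. As a sketch, a few of them are assigned here to interfaces from java.util.function. The choice of interfaces and all names are assumptions, since the table does not say which types were intended.

```java
import java.util.Arrays;
import java.util.function.BiFunction;
import java.util.function.IntBinaryOperator;
import java.util.function.IntPredicate;
import java.util.function.Supplier;

class SimpleLambdaExamples {

    static final IntPredicate IS_EVEN = n -> n % 2 == 0;
    static final BiFunction<Integer, Integer, Integer> SUM = (x, y) -> x + y;
    static final IntBinaryOperator SUM_OF_SQUARES = (int a, int b) -> a * a + b * b;
    static final Supplier<Integer> ANSWER = () -> 42;

    // Sorts a copy in descending order. The subtraction trick is fine for
    // small values but can overflow for values near the int limits.
    static Integer[] sortedDescending(Integer... nums) {
        Integer[] copy = nums.clone();
        Arrays.sort(copy, (n1, n2) -> n2 - n1);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(IS_EVEN.test(4));                              // true
        System.out.println(SUM.apply(2, 3));                              // 5
        System.out.println(SUM_OF_SQUARES.applyAsInt(3, 4));              // 25
        System.out.println(ANSWER.get());                                 // 42
        System.out.println(Arrays.toString(sortedDescending(1, 3, 2)));   // [3, 2, 1]
    }
}
```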

Lambdas versus Anonymous Inner Classes

Despite the fact that a lambda expression is very similar to an anonymous inner class, the most important difference between the two constructs is scope.

Inner Class / Anonymous Inner Class               Lambda Expression
Creates new scope.                                Uses enclosing scope.
Can shadow local variables from enclosing scope.  Cannot shadow local variables.
The this keyword refers to its own instance.      The this keyword refers to the enclosing instance.
Type is explicitly specified on instantiation.    Type of lambda determined by context.

In both cases, local variables captured from the enclosing scope must be final or effectively final. Before Java 8, an anonymous inner class could only capture local variables that were explicitly declared final.
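
The difference in the meaning of this is easy to observe. In the following sketch (the class and method names are hypothetical), the same expression this.toString() gives different results depending on whether it appears in a lambda or in an anonymous inner class.

```java
import java.util.function.Supplier;

class ScopeDemo {

    @Override
    public String toString() {
        return "ScopeDemo";
    }

    Supplier<String> fromLambda() {
        // In a lambda, this is the enclosing ScopeDemo instance.
        return () -> this.toString();
    }

    Supplier<String> fromAnonymousClass() {
        return new Supplier<String>() {
            @Override
            public String get() {
                // In an anonymous inner class, this is the anonymous object itself.
                return this.toString();
            }

            @Override
            public String toString() {
                return "anonymous Supplier";
            }
        };
    }

    public static void main(String[] args) {
        ScopeDemo demo = new ScopeDemo();
        System.out.println(demo.fromLambda().get());           // ScopeDemo
        System.out.println(demo.fromAnonymousClass().get());   // anonymous Supplier
    }
}
```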

Functional Interfaces

A functional interface is an interface with a single abstract method (previously called a SAM interface). Before Java 8 there were already a large number of SAM interfaces:

Java 8 has formalised this concept with a new, optional @FunctionalInterface annotation. Because interfaces in Java 8 can also contain static and default method implementations, their code can become quite large, so it is useful to add the annotation: the compiler then checks that the interface declares exactly one abstract method.
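
As an illustration, here is a sketch of an annotated interface that mixes one abstract method with a default and a static method. Transformer and its methods are invented for this example and are not part of the JDK.

```java
// Hypothetical interface, used only for illustration.
@FunctionalInterface
interface Transformer {

    // The single abstract method.
    String transform(String input);

    // Default methods do not count towards the single abstract method.
    default Transformer andThen(Transformer next) {
        return s -> next.transform(transform(s));
    }

    // Static methods do not count either.
    static Transformer identity() {
        return s -> s;
    }

    // Declaring a second abstract method here would now be a compile-time error.
}

class FunctionalInterfaceDemo {

    public static void main(String[] args) {
        Transformer upper = s -> s.toUpperCase();
        Transformer exclaim = s -> s + "!";
        System.out.println(upper.andThen(exclaim).transform("hello"));   // HELLO!
    }
}
```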

In Java, lambda expressions are represented as objects and must be bound to a particular object type, known as a functional interface. This is called the target type. Since a functional interface can only have a single abstract method, the parameter types of the lambda expression must correspond to the parameters of that method, and the type of the lambda body must correspond to the return type of the same method. In addition, any exceptions that are thrown in the body must be allowed by the throws clause of the functional interface method.
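
The following sketch shows target typing and the exception rule in action. The same lambda text becomes a Supplier or a Callable depending on context, and only the Callable version may throw a checked exception, because Callable.call() declares throws Exception while Supplier.get() has no throws clause. The class and constant names are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Supplier;

class TargetTypeDemo {

    // The same lambda text takes its type from the context it appears in.
    static final Supplier<String> SUPPLIER = () -> "ready";
    static final Callable<String> CALLABLE = () -> "ready";

    // Allowed: Callable.call() is declared with "throws Exception".
    // The same body would not compile as a Supplier<String>.
    static final Callable<String> FAILING = () -> {
        throw new IOException("not available");
    };

    public static void main(String[] args) throws Exception {
        System.out.println(SUPPLIER.get());    // ready
        System.out.println(CALLABLE.call());   // ready
        try {
            FAILING.call();
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());   // caught: not available
        }
    }
}
```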

As we’ve just seen, the Runnable interface has only a single abstract method, therefore it can be referred to as a functional interface. This concept is now formalised by annotating the interface with the @FunctionalInterface annotation.

Functional interface types can be used as targets for lambda expressions. To continue the example, we can create a reference to a Runnable object and assign a lambda expression to it:

We could use the reference as follows:
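
A minimal sketch of both steps, assigning a lambda to a Runnable reference and then using it. The variable name and message are made up.

```java
class RunnableReferenceDemo {

    public static void main(String[] args) {
        // The target type Runnable tells the compiler what the lambda implements.
        Runnable task = () -> System.out.println("Hello from a lambda");

        task.run();                  // call it directly on the current thread
        new Thread(task).start();    // or hand it to a new thread
    }
}
```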

Additional Examples

There are a number of commonly used examples to introduce the syntax and style of lambda programming. These include code implementations of the Runnable, ActionListener and Comparator interfaces.

Runnable

We’ve already seen examples using the Runnable interface where we replaced:

with:
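
As a before-and-after sketch (the class and constant names are hypothetical), the anonymous inner class and the lambda below are interchangeable wherever a Runnable is expected.

```java
class RunnableBeforeAfterDemo {

    // Before: anonymous inner class.
    static final Runnable BEFORE = new Runnable() {
        @Override
        public void run() {
            System.out.println("Running");
        }
    };

    // After: the equivalent lambda expression.
    static final Runnable AFTER = () -> System.out.println("Running");

    public static void main(String[] args) {
        new Thread(BEFORE).start();
        new Thread(AFTER).start();
    }
}
```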

ActionListener

To avoid repeating all the different code implementations from earlier (separate class, main class, inner class, anonymous inner class), let’s just focus on replacing an anonymous inner class with a lambda expression. We can implement event handling code using an anonymous inner class as follows:

We can replace the previous anonymous inner class with a lambda expression similar to the following:

Note the parameter within the parentheses. Because we wish to use the parameter within the block of code, we need to define it beforehand. It would have been equally correct to define the lambda as:

or even just:
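
The variants discussed above can be sketched side by side. The order in which the original listings presented them is not known, so treat this as one plausible arrangement. In a real application the listener would typically be passed to something like a button's addActionListener() method; here it is stored in constants (hypothetical names) so that the sketch runs without a GUI.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

class ActionListenerDemo {

    // Pre-Java 8: anonymous inner class.
    static final ActionListener ANONYMOUS = new ActionListener() {
        @Override
        public void actionPerformed(ActionEvent e) {
            System.out.println("Clicked: " + e.getActionCommand());
        }
    };

    // Lambda with an explicitly typed parameter.
    static final ActionListener TYPED =
            (ActionEvent e) -> System.out.println("Clicked: " + e.getActionCommand());

    // Lambda with the parameter type inferred from ActionListener.
    static final ActionListener INFERRED =
            (e) -> System.out.println("Clicked: " + e.getActionCommand());

    // A single inferred parameter does not need parentheses.
    static final ActionListener BARE =
            e -> System.out.println("Clicked: " + e.getActionCommand());

    public static void main(String[] args) {
        // Simulate a click; the source object and command string are arbitrary.
        ActionEvent event = new ActionEvent("demo source", ActionEvent.ACTION_PERFORMED, "OK");
        ANONYMOUS.actionPerformed(event);
        TYPED.actionPerformed(event);
        INFERRED.actionPerformed(event);
        BARE.actionPerformed(event);
    }
}
```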

Comparator

We’ve been discussing commonly-used implementations of the Strategy pattern. Another common use is with the Collections.sort() method. There are two overloaded sort() methods: one takes a List object whose elements must implement the Comparable interface; the other takes a List object and a Comparator object which then does the comparison operation. This is a good example of the Strategy pattern — instead of implementing a fixed algorithm for comparison, we can define a desired algorithm on the fly.

The usual (pre-functional) Java programming mechanisms include:

A comparator is an instance of a class that implements the Comparator interface:

For example, to compare strings by length, we can define a class that implements Comparator<String>:

To compare two strings by length, we would need to instantiate an object of this type and then call its compare() method:
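
A sketch of such a comparator class and of a direct call to its compare() method. The name LengthComparator comes from the text; the demo class and the sample strings are assumptions.

```java
import java.util.Comparator;

// Compares strings by length rather than lexicographically.
class LengthComparator implements Comparator<String> {

    @Override
    public int compare(String first, String second) {
        return Integer.compare(first.length(), second.length());
    }
}

class LengthComparatorDemo {

    public static void main(String[] args) {
        Comparator<String> comparator = new LengthComparator();

        // Negative result: "kiwi" is shorter than "banana".
        System.out.println(comparator.compare("kiwi", "banana") < 0);   // true

        // Positive result: "kiwi" comes after "banana" lexicographically.
        System.out.println("kiwi".compareTo("banana") > 0);             // true
    }
}
```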

The difference between defining a Comparator class and using the Comparable behaviour of the String class itself is that the compare() method is called on the comparator object, not the string itself. If we used the compareTo() method that the String class already defines, the code would have been string1.compareTo(string2), which compares two strings lexicographically.

When using this LengthComparator with the Collections.sort() method, we can either create an object beforehand:

or create an anonymous inner class on the fly:
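
Both options can be sketched as follows. The comparator is nested inside the demo class only so that the example is self-contained; the method names and sample data are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortWithComparatorDemo {

    // Same idea as the LengthComparator described in the text.
    static class LengthComparator implements Comparator<String> {
        @Override
        public int compare(String first, String second) {
            return Integer.compare(first.length(), second.length());
        }
    }

    // Option 1: create the comparator object beforehand.
    static List<String> sortWithObject(List<String> words) {
        List<String> list = new ArrayList<>(words);
        Comparator<String> comparator = new LengthComparator();
        Collections.sort(list, comparator);
        return list;
    }

    // Option 2: create an anonymous inner class on the fly.
    static List<String> sortWithAnonymousClass(List<String> words) {
        List<String> list = new ArrayList<>(words);
        Collections.sort(list, new Comparator<String>() {
            @Override
            public int compare(String first, String second) {
                return Integer.compare(first.length(), second.length());
            }
        });
        return list;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "kiwi", "fig");
        System.out.println(sortWithObject(words));           // [fig, kiwi, banana]
        System.out.println(sortWithAnonymousClass(words));   // [fig, kiwi, banana]
    }
}
```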

How can we make this code simpler? In the same way as we saw earlier, the compiler can infer a lot of information from the Collections.sort(list, comparator) method. It knows that the second parameter must be an object that implements the Comparator interface, which then also implies that it must have a compare() method that takes two objects of the defined parameterized type.

We’ve already learned that, instead of creating an anonymous inner class, we can just pass a lambda to the Collections.sort(list, comparator) method. We need to provide the parameters (with their types if they can’t be inferred), and the body that implements the actual algorithm.

This can be simplified further because the body consists of a single statement returning a value. We can replace it with an expression of the same value, which will then be implicitly returned:

An even further simplification can be applied. The List is parameterized with String, therefore the compiler is able to infer that the two parameters of the comparator are also of type String:
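
The three steps just described can be sketched as three equivalent calls. Each one is wrapped in its own hypothetical method here purely so that the results can be compared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortWithLambdaDemo {

    // Step 1: block body with explicit parameter types and a return statement.
    static List<String> blockBody(List<String> words) {
        List<String> list = new ArrayList<>(words);
        Collections.sort(list, (String first, String second) -> {
            return Integer.compare(first.length(), second.length());
        });
        return list;
    }

    // Step 2: expression body; the value is returned implicitly.
    static List<String> expressionBody(List<String> words) {
        List<String> list = new ArrayList<>(words);
        Collections.sort(list,
                (String first, String second) -> Integer.compare(first.length(), second.length()));
        return list;
    }

    // Step 3: parameter types inferred from the List<String> argument.
    static List<String> inferredTypes(List<String> words) {
        List<String> list = new ArrayList<>(words);
        Collections.sort(list,
                (first, second) -> Integer.compare(first.length(), second.length()));
        return list;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "kiwi", "fig");
        System.out.println(blockBody(words));        // [fig, kiwi, banana]
        System.out.println(expressionBody(words));   // [fig, kiwi, banana]
        System.out.println(inferredTypes(words));    // [fig, kiwi, banana]
    }
}
```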

It’s very obvious here that all we’re doing is passing a short piece of in-line code to a method — the ideal use of a lambda expression! The body is just one line with no semicolons and no need for the return keyword. This is an ideal lambda expression with high signal-to-noise ratio.

If the body of the lambda expression is longer than a few lines of code, then the signal-to-noise ratio goes down, because there’s more additional syntax (braces, semicolons, return statements, etc.).

Remember that a functional interface can be used as a target for a lambda expression. To make the code a bit easier to read, it’s possible to do the following:
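
One way to do this, sketched with made-up names, is to give the lambda a descriptive name by assigning it to a Comparator variable before passing it to Collections.sort().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class NamedComparatorDemo {

    static List<String> sortedByLength(List<String> words) {
        List<String> list = new ArrayList<>(words);

        // The variable name documents what the lambda does.
        Comparator<String> byLength =
                (first, second) -> Integer.compare(first.length(), second.length());

        Collections.sort(list, byLength);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("banana", "kiwi", "fig")));   // [fig, kiwi, banana]
    }
}
```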

The Arrays.sort() method works in exactly the same way. There are a number of overloaded sort() methods taking a single parameter, which is a primitive array. However, there are two sort() methods taking a generic object array (T[]). The last parameter of both methods is a Comparator. So we can simply supply a lambda as we did earlier to sort a String array:
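
A minimal sketch, with a hypothetical helper method and sample data:

```java
import java.util.Arrays;

class ArraySortDemo {

    // Returns a copy of the array sorted by string length.
    static String[] sortedByLength(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, (first, second) -> Integer.compare(first.length(), second.length()));
        return copy;
    }

    public static void main(String[] args) {
        String[] sorted = sortedByLength("banana", "kiwi", "fig");
        System.out.println(Arrays.toString(sorted));   // [fig, kiwi, banana]
    }
}
```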

Behind the scenes, the second parameter variable of the method receives an instance of a class that implements Comparator<String>.

Lambda Best Practices

Short, concise lambda expressions support code readability and reusability, which are key benefits of the functional programming style. Multi-line lambdas make code noisy and hard to read, test and reuse, which leads to poor code quality and duplication. Fortunately, it’s easy to avoid these issues by moving the body of a multi-line lambda to a named method, then invoking that method from within the lambda. Wherever possible, we should also replace lambda expressions with method references.
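
As a sketch of this refactoring, the multi-line lambda below is moved into a named method, which can then be called from a one-line lambda or, better still, used as a method reference. The class, method and constant names are all invented for illustration.

```java
import java.util.function.Function;

class LambdaRefactoringDemo {

    // Before: a multi-line lambda that is hard to test or reuse on its own.
    static final Function<String, String> INLINE = name -> {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            return "(unknown)";
        }
        return trimmed.substring(0, 1).toUpperCase() + trimmed.substring(1);
    };

    // After: the body lives in a named method that can be tested directly.
    static String normalise(String name) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            return "(unknown)";
        }
        return trimmed.substring(0, 1).toUpperCase() + trimmed.substring(1);
    }

    // The lambda shrinks to a single expression...
    static final Function<String, String> VIA_LAMBDA = name -> normalise(name);

    // ...or is replaced by a method reference.
    static final Function<String, String> VIA_REFERENCE = LambdaRefactoringDemo::normalise;

    public static void main(String[] args) {
        System.out.println(INLINE.apply(" ada "));          // Ada
        System.out.println(VIA_LAMBDA.apply(" ada "));      // Ada
        System.out.println(VIA_REFERENCE.apply("   "));     // (unknown)
    }
}
```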

If possible, use one-line lambdas instead of a large block of code enclosed in braces. Remember that a lambda should be an expression, not a full function. Despite their concise syntax, lambdas should precisely express the functionality they provide.

As a first important step, we should simply avoid using braces in lambdas — this will force us to think more carefully and clearly about what we want to achieve with the lambda. But we mustn’t use this “one-line lambda” rule as gospel. If we have two or three lines in the definition of a lambda and it is clear and concise, then it probably won’t be worthwhile refactoring that code into a separate method.

The rationale behind lambdas is to allow us to write quick throwaway functions without giving them names. We don’t have to bother with naming and declaring a function that we’re only going to use once: we can just write the expression where we need it, essentially being in-line code.

Summary

2018-05-22: Edited. [jjc]
2018-03-29: Edited. [lsc]
2018-03-23: Edited. [jjc]
2018-03-12: Created. [lsc]