Java Lambdas and Streams
Introduction to Functional Programming in Java using Lambdas and Streams
PREREQUISITES — You should already…
- have some Java programming experience;
- be familiar with object-oriented concepts.
Java 8 Lambdas and Streams
Java 8 is probably the most significant upgrade to the Java programming language since Java 5 introduced generics and annotations. The most important features of Java 8 are, without a doubt, lambda expressions and streams. There are a host of new features that support lambdas and streams: default (defender) methods, constructor and method references, and classes such as Optional, Spliterator and StringJoiner. Lambdas and streams are more than just the inclusion of a few new language constructs and libraries: by adding a functional programming style to Java, they can fundamentally change the way we code and think about Java programming.
Java 8 Goals
In general, the high-level goals of Java 8 are to make code more flexible and reusable, to more easily process large data sets, and to better utilise multiple CPU cores.
There are a number of reasons why it’s important to learn how to use lambdas and streams in Java 8:
- The new functional programming features of Java 8 address the modern requirement of processing large data sets on multi-core machines.
- Lambda expressions are a way of representing the functions of functional programming, and can make code significantly more flexible and reusable, as well as easier to write and maintain.
- Streams make it simple to process large data sets. Streams are sequences of data elements that wrap around collections and other data sources. Streams use lambda expressions extensively, and support high-level operations such as the map-filter-reduce paradigm. This substantially simplifies data processing, compared to the low-level, iterative approach of the Collection APIs.
- Streams can be faster and more memory-efficient thanks to lazy evaluation, and they have built-in support for parallel execution, so programmers do not need to write explicit multithreading code.
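To make the contrast with the iterative approach concrete, here is a minimal sketch that computes the same result twice: once with an explicit loop, and once with a filter-map-reduce stream pipeline. The class and method names (StreamSumDemo, sumOfEvenSquaresLoop, sumOfEvenSquaresStream) are hypothetical and chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class StreamSumDemo
{
    // Iterative approach: explicit loop with a mutable accumulator.
    static int sumOfEvenSquaresLoop(List<Integer> numbers)
    {
        int sum = 0;
        for (int n : numbers)
        {
            if (n % 2 == 0)
            {
                sum += n * n;
            }
        }
        return sum;
    }

    // Stream approach: filter the evens, map to squares, reduce to a sum.
    static int sumOfEvenSquaresStream(List<Integer> numbers)
    {
        return numbers.stream()
                      .filter(n -> n % 2 == 0)
                      .map(n -> n * n)
                      .reduce(0, (a, b) -> a + b);
    }

    public static void main(String[] args)
    {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquaresLoop(numbers));   // 4 + 16 + 36 = 56
        System.out.println(sumOfEvenSquaresStream(numbers)); // 56
    }
}
```

Both methods return the same value; the stream version states what is computed (filter, map, reduce) rather than how to iterate.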
The final version of Java 8 was released in March 2014, nearly ten years after Java 5 introduced the generic programming model to Java.
Traditionally, functional languages like Haskell and Lisp have libraries of algorithms for data processing. JVM-based languages such as Scala and Kotlin also have them, and now Java 8 has its own library, the new Streams API. The streams syntax is new, and can look strange at first. There are also a large number of new methods to learn and apply, particularly within the existing collection classes.
Lambdas don’t turn Java into Haskell or Lisp, and streams don’t turn Java into Hadoop. Nevertheless, they provide significant new capabilities to Java, and result in the biggest change in Java programming style since generics and annotations were added to the language.
This course will explain the syntax and use of lambda expressions, introduce streams, and give examples of the types of applications to which they are well suited.
What is Functional Programming?
If we take a highly simplistic view of programming, it relies on two very different building blocks: data and code. Data is usually manipulated by code, and gets passed around between various pieces of code. Thus we can say that data is a first-class citizen of the programming world, while code is a second-class citizen. Think of the old Roman Empire — the second-class citizens usually did all the work, while the first-class citizens led a life of leisure…
Functional programming aims to promote functions to first-class citizens: we can pass code to other pieces of code, we can manipulate code, and we can pass algorithms around just like we pass objects around.
To support a functional programming style, a programming language must support functions as first-class citizens. Prior to Java 8, the only way we could write code in a functional style was with lots of anonymous inner class boilerplate code. With the introduction of lambda expressions, functions have become first-class citizens and can be passed around just like any other variables.
Lambda expressions make it practical for a programming language to support higher-order functions. Higher-order functions are functions that either accept other functions as arguments or return a function as a result. Java 8 now supports higher-order functions.
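As a small illustration of both kinds of higher-order function, the sketch below uses the standard java.util.function.Function interface. The class and method names (HigherOrderDemo, applyTwice, adder) are hypothetical and exist only for this example.

```java
import java.util.function.Function;

class HigherOrderDemo
{
    // Accepts a function as an argument and applies it twice.
    static int applyTwice(Function<Integer, Integer> f, int value)
    {
        return f.apply(f.apply(value));
    }

    // Returns a function as its result.
    static Function<Integer, Integer> adder(int amount)
    {
        return n -> n + amount;
    }

    public static void main(String[] args)
    {
        Function<Integer, Integer> addThree = adder(3);
        System.out.println(applyTwice(addThree, 10));  // (10 + 3) + 3 = 16
        System.out.println(applyTwice(n -> n * 2, 5)); // (5 * 2) * 2 = 20
    }
}
```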
Programming Constructs
The three programming constructs at the core of functional programming in Java are lambdas, method references, and constructor references.
- Lambdas are perfect for on-the-fly functions such as event handlers, comparisons, sorting, threading and the like. They are typically short, concise, in-line pieces of code, which are not necessarily reusable. Very often they are one-time only functions.
- Method references are reusable functions, encapsulated in a class to process data in some class-appropriate way. They might occur in exactly the same place as a lambda:
// a lambda expression
Arrays.asList(nums).stream().forEach(n -> System.out.println(n.getValue()));
// a method reference
Arrays.asList(nums).stream().forEach(Num::dump);
- Constructor references are basically object factories that can be used to generate instances.
@FunctionalInterface
interface NumberFactory
{
    abstract Num[] makeNumArray(final int n);
}
...
NumberFactory numsFactory = Num[]::new; // constructor reference
Num[] nums = numsFactory.makeNumArray(size); // use the number factory
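The fragments above rely on a Num class that is not shown. The following is a self-contained sketch under the assumption that Num simply wraps an int and offers getValue() and dump(); that Num definition, the NumArrayFactory interface name and the ReferenceDemo class are all hypothetical stand-ins used only to make the three constructs runnable side by side.

```java
import java.util.Arrays;

class ReferenceDemo
{
    // Hypothetical stand-in for the Num class used in the fragments above.
    static class Num
    {
        private final int value;

        Num(int value) { this.value = value; }

        int getValue() { return value; }

        void dump() { System.out.println(value); }
    }

    @FunctionalInterface
    interface NumArrayFactory
    {
        Num[] makeNumArray(int n);
    }

    // Builds an array through an array constructor reference, then fills it.
    static Num[] makeNums(int size)
    {
        NumArrayFactory factory = Num[]::new;       // constructor reference
        Num[] nums = factory.makeNumArray(size);
        for (int i = 0; i < size; i++)
        {
            nums[i] = new Num(i * 10);
        }
        return nums;
    }

    public static void main(String[] args)
    {
        Num[] nums = makeNums(3);
        // a lambda expression
        Arrays.stream(nums).forEach(n -> System.out.println(n.getValue()));
        // a method reference doing the same job
        Arrays.stream(nums).forEach(Num::dump);
    }
}
```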
Lambdas will be introduced in this section, while method and constructor references will be covered in the next sections.
Origins of the Term Lambda
Lambda expressions have their roots in the Lambda calculus. Lambda calculus (also written as λ-calculus) is a formal system in mathematical logic for expressing computations based on function abstraction using variable binding and substitution. It is a universal computational model that can be used to simulate any Turing machine. It was first introduced by mathematician and logician Alonzo Church in the 1930s as part of his research into the foundations of mathematics, and formally published in a paper in 1936.
Alonzo Church wanted to formalize what it means for a mathematical function to be effectively computable. He used the Greek letter lambda (λ) to mark parameters. Since then, an expression with parameter variables has been called a lambda expression. The lambda calculus influenced the design of the LISP programming language and functional programming languages in general.
Review of Anonymous Classes
Let’s consider one of the major uses of functional programming. A common problem in Java (as with other languages) is that we often need to create and/or reference code that must be executed by another piece of code, such as:
- Code to respond to an event such as a button or mouse click in a graphical program.
- Code to run as a background process (in a thread).
- Code to sort and compare objects.
If you are familiar with the concept of design patterns, this is a classic problem that can easily be solved using the Strategy design pattern, which is one of the most widely used OO design patterns. The following quote is from the Gang of Four (GoF) book, “Design Patterns: Elements of Reusable Object-Oriented Software”:
“Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from the clients that use it.”
The Strategy design pattern (also called Algorithm or Policy) is where we define a set of steps for solving a particular problem, and pass that algorithm to an object, instead of that object implementing its own algorithm. It’s a way of changing the behaviour of an object on the fly at runtime. The usual (pre-functional) Java programming mechanisms for implementing the Strategy design pattern include:
- The main class implementing the code.
- A separate top-level class implementing the code.
- A named inner class implementing the code.
- An anonymous inner class implementing the code.
Now with Java 8 we can also use lambdas, which give us a simple and concise way to define an algorithm on the fly as needed.
Let’s see some code using the simplest possible example of a Runnable thread.
Main class implementing the Runnable interface
// main class
public class MainClass implements Runnable
{
    @Override
    public void run()
    {
        // code here
    }
    public void startThread()
    {
        Thread t = new Thread (this);
        t.start();
        ...
    }
}
Separate class implementing the Runnable interface
// separate class
class MyRunnableClass implements Runnable
{
    @Override
    public void run()
    {
        // code here
    }
}
public class MainClass
{
    public void startThread()
    {
        Runnable runner = new MyRunnableClass();
        Thread t = new Thread (runner);
        t.start();
        ...
    }
}
Named inner class implementing the Runnable interface
public class MainClass
{
    // named inner class
    class MyRunnableInnerClass implements Runnable
    {
        @Override
        public void run()
        {
            // code here
        }
    }
    public void startThread()
    {
        Runnable runner = new MyRunnableInnerClass();
        Thread t = new Thread (runner);
        t.start();
        ...
    }
}
Anonymous inner class implementing the Runnable interface
public class MainClass
{
    public void startThread()
    {
        Thread t = new Thread (
            // anonymous inner class
            new Runnable()
            {
                @Override
                public void run()
                {
                    // code here
                }
            }
        );
        t.start();
        ...
    }
}
Lambda Expressions
When we look at the code of the previous anonymous inner class, we can see that the compiler knows that the object passed to the Thread constructor must be a Runnable object, and as such, it must implement the run() method. An intelligent compiler would be able to infer a fair amount of information from the existing code, and therefore much of it becomes redundant. The only things the compiler would not be able to infer are the parameter names (if any) and the code that should be inside the method body.
This code can be replaced by a lambda expression which represents a block of code, i.e. a function, that can be executed at a later time. There are no actual function types in Java, so instead, functions are implemented as instances of classes that implement a particular interface. We can think of lambda expressions as being very similar to anonymous inner classes. Lambda expressions give us a convenient syntax for creating such instances.
A lambda expression is a concise description of an anonymous function that can be passed around. The function doesn’t have a name, but it does have a list of parameters, a body, a return type (inferred rather than declared), and possibly also a list of potentially thrown exceptions. We think of lambdas as functions, as opposed to methods, because lambdas aren’t associated with any particular class in the same way as methods. The syntax has the following form: (parameter list) -> body
The three components of the lambda expression are the parameter list, the arrow symbol and the body.
- The parameter list is normally enclosed in parentheses. If there are no parameters, then the empty pair of parentheses must still be included. If there is a single parameter whose type is inferred, the parentheses are optional. The types of the parameters are also optional if they can be otherwise inferred.
- The arrow symbol consists of a hyphen sign and a greater-than sign. This can be pronounced as “arrow” (the simplest), or “results in”, “evaluates to” or “maps to”.
- The body can contain either a single expression or any number of Java statements, each terminated by a semi-colon. If there is only a single expression, the curly braces are optional; if there are a number of statements (even just a single return statement), the braces are required.
// no curly braces for a single expression
(parameter list) -> expression
// required curly braces for statements
(parameter list) -> {statements;}
A normal method has four elements:
- name;
- return type;
- parameter list; and
- body.
The major difference between a lambda and a method is that a lambda expression only has the last two elements: the parameter list and the body.
Simple Examples
Lambda | Explanation |
---|---|
n -> n % 2 == 0 | Given a number n, returns true if it is even and false otherwise. |
(char c) -> c == 'A' | Given a character c, returns true if it is equal to A. |
(x, y) -> x + y | Given two numbers x and y, returns their sum. |
(int a, int b) -> a * a + b * b | Given two ints a and b, returns the sum of their squares. |
() -> 42 | Given no parameters, returns 42. |
() -> { return 3.14159; } | Given no parameters, returns 3.14159. |
(String s) -> { System.out.println(s); } | Given a String s, prints s and returns void. |
() -> { System.out.println("Hello!"); } | Given no parameters, prints Hello! and returns void. |
Arrays.sort(nums, (n1, n2) -> n2 - n1); | Passes a lambda to sort an array in descending order. |
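A lambda only compiles once it has a target type, so the sketch below assigns several of the table's lambdas to standard interfaces from java.util.function. The choice of target types (IntPredicate, IntBinaryOperator, Supplier, Consumer) is an assumption made for this illustration, as are the class and constant names; other functional interfaces with matching signatures would work equally well.

```java
import java.util.Arrays;
import java.util.function.Consumer;
import java.util.function.IntBinaryOperator;
import java.util.function.IntPredicate;
import java.util.function.Supplier;

class LambdaSyntaxDemo
{
    static final IntPredicate IS_EVEN = n -> n % 2 == 0;
    static final IntBinaryOperator SUM_OF_SQUARES = (int a, int b) -> a * a + b * b;
    static final Supplier<Integer> ANSWER = () -> 42;
    static final Supplier<Double> PI = () -> { return 3.14159; };
    static final Consumer<String> PRINTER = (String s) -> { System.out.println(s); };

    // Sorts a copy of the array in descending order using a comparator lambda.
    static Integer[] sortedDescending(Integer[] nums)
    {
        Integer[] copy = nums.clone();
        Arrays.sort(copy, (n1, n2) -> n2 - n1);
        return copy;
    }

    public static void main(String[] args)
    {
        System.out.println(IS_EVEN.test(4));                  // true
        System.out.println(SUM_OF_SQUARES.applyAsInt(3, 4));  // 25
        System.out.println(ANSWER.get());                     // 42
        System.out.println(PI.get());                         // 3.14159
        PRINTER.accept("Hello!");                             // Hello!
        System.out.println(Arrays.toString(
            sortedDescending(new Integer[] { 1, 5, 3 })));    // [5, 3, 1]
    }
}
```

Note that the subtraction-based comparator is fine for small values but can overflow for very large ones; Integer.compare(n2, n1) avoids that.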
Lambdas versus Anonymous Inner Classes
Despite the fact that a lambda expression is very similar to an anonymous inner class, the most important difference between the two constructs is scope.
When we use an inner or anonymous inner class, it creates a new scope. We can hide local variables from the enclosing scope by creating new local variables with the same names within the body of the inner class. The this keyword, when used inside an inner class, refers to its own instance, i.e. to the anonymous class itself.
Lambda expressions work with the enclosing scope. We can’t hide variables from the enclosing scope inside the lambda’s body; attempting to do so gives a compile-time error. In this case, the this keyword is a reference to the enclosing instance, i.e. to the instance of the class enclosing the lambda expression.
Lambdas can work with local variables belonging to the enclosing scope; however, these must be either final or effectively final variables. This means that these variables can only be assigned a value once. A variable or parameter declared with the final keyword is final. A variable or parameter that isn’t declared with final, but whose value never changes after it has been initialized, is effectively final.
The type of the lambda expression is determined from the context, whereas the type of the anonymous class is specified explicitly when we create the instance of the anonymous class.
Inner Class / Anonymous Inner Class | Lambda Expression |
---|---|
Creates new scope. | Uses enclosing scope. |
Can shadow local variables from enclosing scope. | Cannot shadow local variables. |
The this keyword refers to its own instance. | The this keyword refers to the enclosing instance. |
Captured local variables must be final or effectively final (explicitly final before Java 8). | Captured local variables must be final or effectively final. |
Type is explicitly specified on instantiation. | Type of lambda determined by context. |
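The difference in what this refers to, and the capture of an effectively final local variable, can be observed directly. The sketch below is illustrative only: the ScopeDemo class and its method names are hypothetical.

```java
import java.util.function.Supplier;

class ScopeDemo
{
    private final String name = "outer";

    @Override
    public String toString()
    {
        return name;
    }

    // In an anonymous inner class, "this" is the anonymous instance itself.
    String thisFromAnonymousClass()
    {
        Supplier<String> s = new Supplier<String>()
        {
            @Override
            public String get() { return this.toString(); }

            @Override
            public String toString() { return "anonymous"; }
        };
        return s.get();
    }

    // In a lambda, "this" is the enclosing ScopeDemo instance.
    String thisFromLambda()
    {
        Supplier<String> s = () -> this.toString();
        return s.get();
    }

    // A lambda may read a local variable that is effectively final.
    String captureLocal()
    {
        String greeting = "hello";  // never reassigned, so effectively final
        Supplier<String> s = () -> greeting + " " + name;
        // greeting = "bye";        // uncommenting this line breaks compilation
        return s.get();
    }

    public static void main(String[] args)
    {
        ScopeDemo demo = new ScopeDemo();
        System.out.println(demo.thisFromAnonymousClass()); // anonymous
        System.out.println(demo.thisFromLambda());         // outer
        System.out.println(demo.captureLocal());           // hello outer
    }
}
```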
Functional Interfaces
A functional interface is an interface with a single abstract method (previously called a SAM interface). Before Java 8 there were already a large number of SAM interfaces:
public interface ActionListener extends EventListener
{
    public void actionPerformed(ActionEvent e);
}
public interface Comparator<T>
{
    int compare(T o1, T o2);
}
public interface Runnable
{
    public void run();
}
public interface Callable<V>
{
    V call() throws Exception;
}
public interface AutoCloseable
{
    void close() throws Exception;
}
Java 8 has formalised this concept with a new optional @FunctionalInterface annotation. Because Java 8 now supports static and default method implementations in an interface, the code in interfaces can become quite large, so it’s useful to use the @FunctionalInterface annotation to let the compiler check that there is exactly one abstract method.
In Java, lambda expressions are represented as objects and must be bound to a particular object type, known as a functional interface. This is called the target type. Since a functional interface can only have a single abstract method, the parameter types of the lambda expression must correspond to the parameters of that method, and the type of the lambda body must correspond to the return type of the same method. In addition, any exceptions that are thrown in the body must be allowed by the throws clause of the functional interface method.
As we’ve just seen, the Runnable interface has only a single abstract method, therefore it can be referred to as a functional interface. This concept is now formalised by annotating the interface with the @FunctionalInterface annotation.
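To show the annotation on a user-defined interface, here is a minimal sketch. The StringTransformer interface, its andThen default method and the FunctionalInterfaceDemo class are hypothetical names invented for this example; the point is only that default methods do not count towards the single abstract method.

```java
class FunctionalInterfaceDemo
{
    // Combines two lambdas through the interface's default method.
    static String shout(String text)
    {
        StringTransformer upper = s -> s.toUpperCase();
        StringTransformer exclaim = s -> s + "!";
        return upper.andThen(exclaim).transform(text);
    }

    public static void main(String[] args)
    {
        System.out.println(shout("hello")); // HELLO!
    }
}

@FunctionalInterface
interface StringTransformer
{
    // The single abstract method: this is what a lambda implements.
    String transform(String input);

    // A default method is allowed alongside the abstract method.
    default StringTransformer andThen(StringTransformer next)
    {
        return s -> next.transform(transform(s));
    }
}
```

If a second abstract method were added to StringTransformer, the annotation would cause the compiler to reject the interface.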
Functional interface types can be used as targets for lambda expressions. To continue the example, we can create a reference to a Runnable object, assign a lambda expression to it, and then use the reference as follows:
public class MainClass
{
    public void startThread()
    {
        Runnable runner = () -> {...};
        Thread t = new Thread (runner);
        t.start();
        ...
    }
}
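The listing above elides the lambda body. As a complete, runnable variant, the sketch below fills the body with a hypothetical task that records the name of the thread it ran on; the LambdaThreadDemo class, the runInThread method and the "worker" thread name are assumptions made purely for illustration.

```java
class LambdaThreadDemo
{
    // Runs a lambda on a separate thread and reports what it recorded.
    static String runInThread()
    {
        StringBuilder log = new StringBuilder();
        Runnable runner = () ->
            log.append("ran in ").append(Thread.currentThread().getName());

        Thread t = new Thread(runner, "worker");
        t.start();
        try
        {
            t.join();   // wait for the worker so its writes are visible here
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(runInThread()); // ran in worker
    }
}
```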
Additional Examples
There are a number of commonly used examples to introduce the syntax and style of lambda programming. These include code implementations of the Runnable, ActionListener and Comparator interfaces.
Runnable
We’ve already seen examples using the Runnable interface where we replaced:
Thread t = new Thread (
    // anonymous inner class
    new Runnable()
    {
        @Override
        public void run()
        {
            // code here
        }
    }
);
with:
Thread t = new Thread (
    // lambda expression
    () ->
    {
        // code here
    }
);
ActionListener
To avoid repeating all the different code implementations from earlier (separate class, main class, inner class, anonymous inner class), let’s just focus on replacing an anonymous inner class with a lambda expression. We can implement event handling code using an anonymous inner class as follows:
JButton button = new JButton ("Press me!");
button.addActionListener (
    // anonymous inner class
    new ActionListener()
    {
        @Override
        public void actionPerformed(ActionEvent e)
        {
            System.out.println(e);
        }
    }
);
We can replace the previous anonymous inner class with a lambda expression similar to the following:
JButton button = new JButton ("Press me!");
button.addActionListener ( (e) -> System.out.println(e) );
Note the parameter within the parentheses. Because we wish to use the parameter within the block of code, we need to define it beforehand. It would have been equally correct to define the lambda as:
(ActionEvent e) -> System.out.println(e)
or even just:
e -> System.out.println(e)
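The three ways of writing the listener's parameter list (explicit type, parentheses only, and bare name) can be checked without building a user interface by invoking the listeners directly. In the sketch below, the ListenerFormsDemo class, the fireAll method and the "pressed" command string are hypothetical; ActionListener and ActionEvent are the standard java.awt.event types.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.util.ArrayList;
import java.util.List;

class ListenerFormsDemo
{
    // Fires one event at three listeners that differ only in lambda syntax.
    static List<String> fireAll()
    {
        List<String> received = new ArrayList<>();

        ActionListener explicitType = (ActionEvent e) -> received.add(e.getActionCommand());
        ActionListener withParentheses = (e) -> received.add(e.getActionCommand());
        ActionListener bareParameter = e -> received.add(e.getActionCommand());

        ActionEvent event =
            new ActionEvent(new Object(), ActionEvent.ACTION_PERFORMED, "pressed");
        explicitType.actionPerformed(event);
        withParentheses.actionPerformed(event);
        bareParameter.actionPerformed(event);
        return received;
    }

    public static void main(String[] args)
    {
        System.out.println(fireAll()); // [pressed, pressed, pressed]
    }
}
```

All three listeners behave identically; the choice between the forms is a matter of readability.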
Comparator
We’ve been discussing commonly-used implementations of the Strategy pattern. Another common use is with the Collections.sort() method. There are two overloaded sort() methods: one takes a List object whose elements must implement the Comparable interface; the other takes a List object and a Comparator object which then does the comparison operation. This is a good example of the Strategy pattern: instead of implementing a fixed algorithm for comparison, we can define a desired algorithm on the fly.
The usual (pre-functional) Java programming mechanisms include:
- A separate top-level class implementing the code, e.g.
Collections.sort(aList, new SeparateClass(...));
- The main class implementing the code, e.g.
Collections.sort(aList, this);
- A named inner class, e.g.
Collections.sort(aList, new InnerClass(...));
- An anonymous inner class, e.g.
Collections.sort(aList, new Comparator<String>() { ... });
A comparator is an instance of a class that implements the Comparator interface. For example, to compare strings by length, we can define a class that implements Comparator<String>:
class LengthComparator implements Comparator<String>
{
    public int compare(String first, String second)
    {
        return first.length() - second.length();
    }
}
To compare two strings by length, we would need to instantiate an object of this type and then call its compare() method:
Comparator<String> comparator = new LengthComparator();
if (comparator.compare(string1, string2) > 0)
{
    ...
}
The difference between defining a Comparator class and using the Comparable behaviour of the String class itself is that the compare() method is called on the comparator object, not on the string itself. If we used the pre-defined default compareTo() method of the String class, the code would have been string1.compareTo(string2), which compares two strings lexicographically.
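The two orderings can disagree, which the following sketch demonstrates. It reuses the LengthComparator idea from above as a nested class; the CompareDemo wrapper, the helper method names and the sample strings are hypothetical.

```java
import java.util.Comparator;

class CompareDemo
{
    static class LengthComparator implements Comparator<String>
    {
        @Override
        public int compare(String first, String second)
        {
            return first.length() - second.length();
        }
    }

    // Strategy supplied by a separate comparator object.
    static int byLength(String a, String b)
    {
        return new LengthComparator().compare(a, b);
    }

    // Natural (lexicographic) ordering built into String via Comparable.
    static int lexicographic(String a, String b)
    {
        return a.compareTo(b);
    }

    public static void main(String[] args)
    {
        // "pear" is shorter than "banana", so it sorts first by length...
        System.out.println(byLength("pear", "banana") < 0);      // true
        // ...but 'p' comes after 'b', so it sorts later lexicographically.
        System.out.println(lexicographic("pear", "banana") > 0); // true
    }
}
```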
When using this LengthComparator with the Collections.sort() method, we can either create an object beforehand, as in Collections.sort(list, new LengthComparator()), or create an anonymous inner class on the fly:
List<String> list = new ArrayList<>();
Collections.sort(list,
    new Comparator<String>()
    {
        public int compare(String first, String second)
        {
            return first.length() - second.length();
        }
    }
);
How can we make this code simpler? In the same way as we saw earlier, the compiler can infer a lot of information from the Collections.sort(list, comparator) method. It knows that the second parameter must be an object that implements the Comparator interface, which then also implies that it must have a compare() method that takes two objects of the defined parameterized type.
We’ve already learned that, instead of creating an anonymous inner class, we can just pass a lambda to the Collections.sort(list, comparator) method. We need to provide the parameters (with their types if they can’t be inferred), and the body that implements the actual algorithm.
List<String> list = new ArrayList<>();
Collections.sort(list,
(String first, String second)
->
{ return first.length() - second.length(); }
);
This can be simplified further because the body consists of a single statement returning a value. We can replace it with an expression of the same value, which will then be implicitly returned:
List<String> list = new ArrayList<>();
Collections.sort(list,
(String first, String second)
->
first.length() - second.length()
);
An even further simplification can be applied. The List is parameterized with String, therefore the compiler is able to infer that the two parameters of the comparator must also be of type String:
List<String> list = new ArrayList<>();
Collections.sort(list, (first, second) -> first.length() - second.length() );
It’s very obvious here that all we’re doing is passing a short piece of in-line code to a method, which is the ideal use of a lambda expression! The body is just one line with no semicolons and no need for the return keyword. This is an ideal lambda expression with a high signal-to-noise ratio.
If the body of the lambda expression is longer than a few lines of code, then the signal-to-noise ratio goes down, because there’s more additional syntax (braces, semi-colons, return statements, etc.).
Remember that a functional interface can be used as a target for a lambda expression. To make the code a bit easier to read, it’s possible to do the following:
Comparator<String> comp1 = (first, second) -> first.length() - second.length();
Comparator<String> comp2 = (first, second) -> second.length() - first.length();
List<String> list = new ArrayList<>();
Collections.sort(list, comp1);
...
Collections.sort(list, comp2);
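As a runnable version of this idea, the sketch below sorts the same list with two named comparator lambdas. The SortByLengthDemo class, the sortedBy helper and the sample words are hypothetical; note that the variables are declared as Comparator<String> so that the compiler can infer the lambda parameter types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortByLengthDemo
{
    // Returns a sorted copy so that the caller's list is left untouched.
    static List<String> sortedBy(List<String> words, Comparator<String> order)
    {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, order);
        return copy;
    }

    public static void main(String[] args)
    {
        List<String> words = Arrays.asList("banana", "fig", "cherry", "kiwi");

        Comparator<String> shortestFirst = (first, second) -> first.length() - second.length();
        Comparator<String> longestFirst = (first, second) -> second.length() - first.length();

        System.out.println(sortedBy(words, shortestFirst)); // [fig, kiwi, banana, cherry]
        System.out.println(sortedBy(words, longestFirst));  // [banana, cherry, kiwi, fig]
    }
}
```

Collections.sort() is stable, so the two six-letter words keep their original relative order in both results.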
The Arrays.sort() method works in exactly the same way. There are a number of overloaded sort() methods taking a single parameter, which is a primitive array. However, there are two sort() methods taking a parameterized object array. The last parameter of both methods is a Comparator. So we can simply supply a lambda as we did earlier to sort a String array (here called strings):
Arrays.sort(strings, (first, second) -> first.length() - second.length());
Behind the scenes, the second parameter variable of the method receives an instance of a class that implements Comparator<String>.
Lambda Best Practices
Short, concise lambda expressions support code readability and reusability, which are key benefits of the functional programming style. Multi-line lambdas make code noisy and hard to read, test and reuse, which leads to poor code quality and duplication. Fortunately, it’s easy to avoid these issues by moving the body of a multi-line lambda to a named function, then invoking the function from within the lambda. Wherever possible, we should also replace lambda expressions with method references.
If possible, use one-line lambdas instead of a large block of code enclosed in braces. Remember that a lambda should be an expression, not a full function. Despite their concise syntax, lambdas should precisely express the functionality they provide.
As a first important step, we should simply avoid using braces in lambdas — this will force us to think more carefully and clearly about what we want to achieve with the lambda. But we mustn’t use this “one-line lambda” rule as gospel. If we have two or three lines in the definition of a lambda and it is clear and concise, then it probably won’t be worthwhile refactoring that code into a separate method.
The rationale behind lambdas is to allow us to write quick throwaway functions without giving them names. We don’t have to bother with naming and declaring a function that we’re only going to use once: we can just write the expression where we need it, essentially being in-line code.
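The refactoring recommended above, moving a multi-line lambda body into a named method and then referring to it with a method reference, might look like the following sketch. The LambdaRefactorDemo class, its method names and the name-capitalising task are hypothetical examples, not part of any library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LambdaRefactorDemo
{
    // Before: a multi-line lambda buried inside the pipeline.
    static List<String> normaliseInline(List<String> names)
    {
        return names.stream()
                    .map(name -> {
                        String trimmed = name.trim();
                        return trimmed.isEmpty()
                            ? trimmed
                            : Character.toUpperCase(trimmed.charAt(0))
                                + trimmed.substring(1).toLowerCase();
                    })
                    .collect(Collectors.toList());
    }

    // After: the body lives in a named, separately testable method...
    static String capitalise(String name)
    {
        String trimmed = name.trim();
        return trimmed.isEmpty()
            ? trimmed
            : Character.toUpperCase(trimmed.charAt(0))
                + trimmed.substring(1).toLowerCase();
    }

    // ...and the pipeline refers to it with a method reference.
    static List<String> normalise(List<String> names)
    {
        return names.stream()
                    .map(LambdaRefactorDemo::capitalise)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        List<String> names = Arrays.asList(" aLICE ", "BOB");
        System.out.println(normaliseInline(names)); // [Alice, Bob]
        System.out.println(normalise(names));       // [Alice, Bob]
    }
}
```

The behaviour is unchanged, but capitalise() can now be reused and unit-tested on its own.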
Summary
Lambdas are perfect for on-the-fly functions such as event handlers, comparisons, sorting, threading and the like.
Lambdas should be written as short, concise, in-line pieces of code.
Lambdas are written as (parameter list) -> {body}
We can assign lambda expressions to variables of functional interface types.
We can pass functions around just like other objects.
We can pass functions into higher-order functions. Higher-order functions are functions that accept other functions as parameters.
2018-05-22: Edited. [jjc]
2018-03-29: Edited. [lsc]
2018-03-23: Edited. [jjc]
2018-03-12: Created. [lsc]