Java Lambdas and Streams

Review of Generics

Generics were introduced in Java 5 and are used extensively for type-safe collections. With the inclusion of functional programming in Java 8, generic usage has expanded into lambdas and higher-order functions, which are almost impossible to use without a good knowledge of generics.

Why Use Generics?

Generics allow us to use types (interfaces and classes) as parameters when defining classes, interfaces and methods. Much like formal parameters in method declarations, type parameters allow us to reuse the same code with different inputs. The difference is that the inputs (arguments) to formal parameters are values, while the inputs to type parameters are types.

Code that uses generics has many benefits over non-generic code, chiefly stronger type checking at compile time and the elimination of casts.

When rewritten using generics, no casting is required:
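As a minimal sketch of the difference (the class and method names are illustrative assumptions, not taken from the original listing), the first method below uses a raw List and needs a cast, while the second uses List<String> and does not:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class CastDemo {

    // Without generics: the list holds Objects, so a cast is needed on retrieval.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstWithoutGenerics() {
        List list = new ArrayList();
        list.add("hello");
        return (String) list.get(0);   // cast required; a wrong cast fails only at runtime
    }

    // With generics: the compiler knows the element type, so no cast is needed.
    static String firstWithGenerics() {
        List<String> list = new ArrayList<>();
        list.add("hello");
        return list.get(0);            // no cast
    }
}
```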

Knowing how to use and apply generics is extremely important, especially when using Java 8 and above. Even novice Java programmers need to know how to use classes that support generics. For example, we can’t get the benefit of type checking and type safety when using collections such as List, Map, and Set without knowing how to use generics.

Intermediate Java developers should be able to define classes or methods that support generics. In Java 7 and earlier, this was mostly reserved for advanced developers. It is much more common in Java 8, because lambda expressions and stream processing rely on generics. The goal of both generics and lambda expressions is to make code safer and more reusable, which is a goal that all programmers share.

Syntax for Generic Classes and Methods

When declaring a class or method that supports generics, we need to define the type parameter section, which is delimited by angle brackets (<>). For a generic class it follows the class name; for a generic method it precedes the method's return type. It specifies the type parameters (also called type variables) T1, T2, …, Tn. These identifiers stand for types, not for variables.

The following are some example code snippets that show type parameter usage:
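The declarations below are a small sketch of where the type parameter section goes. Box, Mapper, and SyntaxDemo are hypothetical names chosen for illustration; they are not from the original listing.

```java
import java.util.List;

// Generic class: the type parameter section <T> follows the class name.
class Box<T> {
    private T value;

    void set(T value) { this.value = value; }

    T get() { return value; }
}

// Generic interface with two type parameters.
interface Mapper<T, R> {
    R apply(T input);
}

// Non-generic class with a generic method: <T> precedes the return type.
class SyntaxDemo {
    static <T> T lastOf(List<T> items) {
        return items.get(items.size() - 1);
    }
}
```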

The widely used naming convention for type parameters is a single uppercase letter, usually E (element), K (key), N (number), T (type), or V (value), with S and U for additional types.

This is just a convention, so any valid and relevant identifiers can be used.

Generic Method Examples

The following class is not a generic class, but its methods are generic, which means that the type parameter <T> is not at the class declaration level, but only on the method declarations. The firstMatch() method takes a List of T objects and returns a T object. The <T> at the beginning of the method declaration means T is not a real type, but a type parameter that the Java compiler will determine from the context in which it is used, either as the types of the parameters of a method call, or from the instantiation of an object (if it is a generic class).

We could use the generic firstMatch() method as follows:
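A minimal sketch of such a class and a call to it follows. The class name ElementUtils, the Predicate parameter, and the choice to return null when nothing matches are assumptions made for illustration; only the method name firstMatch() and its List-of-T-in, T-out shape come from the text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical utility class: the class is not generic, but its method is.
class ElementUtils {

    // <T> declares the method's type parameter; the compiler infers T at each call.
    // Returns the first element that passes the test, or null if none does.
    static <T> T firstMatch(List<T> candidates, Predicate<T> test) {
        for (T candidate : candidates) {
            if (test.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}

class FirstMatchDemo {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("hi", "hello", "hola");
        List<Integer> nums = Arrays.asList(1, 10, 100);

        String longWord = ElementUtils.firstMatch(words, s -> s.length() > 2); // T is String
        Integer bigNum = ElementUtils.firstMatch(nums, n -> n > 5);            // T is Integer

        System.out.println(longWord + " " + bigNum);   // prints "hello 10"
    }
}
```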

The following additional example is again for a non-parameterized class, but with a generic method. The method returns a random element from a generic array that is passed to it.

The T in the randomElement() method declaration refers to the type which Java will infer from examining the parameters of the method call. Even if there were an existing class called T, it would be irrelevant here, because T is a placeholder for a type to be passed in as a parameter later. The method takes in an array of T objects and returns a T object. For example, if we pass in an Integer array, an Integer object will be returned; if we pass in a Person array, a Person object will be returned. No typecasts are necessary.

We could use the RandomUtils class as follows:
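The sketch below shows one way the class and its use might look. The method body, the use of ThreadLocalRandom, and the demo class are assumptions; only String and Integer arrays are shown, but the same call works for any array of reference type.

```java
import java.util.concurrent.ThreadLocalRandom;

// The class is not generic; only the method declares a type parameter.
class RandomUtils {

    // Returns a randomly chosen element of the given non-empty array.
    static <T> T randomElement(T[] array) {
        int index = ThreadLocalRandom.current().nextInt(array.length);
        return array[index];
    }
}

class RandomUtilsDemo {
    public static void main(String[] args) {
        String[] names = {"Joe", "John", "Jane"};
        Integer[] nums = {1, 2, 3, 4};

        String name = RandomUtils.randomElement(names);  // T is String, no cast
        int num = RandomUtils.randomElement(nums);       // T is Integer, unboxed to int

        System.out.println(name + " " + num);
    }
}
```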

Note again that typecasting is not required to convert to String, Color, Person, or Integer. Auto-unboxing allows us to assign an element from the Integer[] array to an int, but the array passed to randomElement() must be Integer[], not int[], since generics work only with reference types, not primitive data types.

Generic Class Example

The following example is for a very simple generic (parameterized) stack class with push() and pop() methods. The class is generic, and its methods use the class's type parameter. For comparison, there is a full Stack class in the java.util package.

Methods in the class can now refer to E both for arguments and for return values, where E doesn’t refer to an existing type. Instead, it refers to whatever type was defined when a stack was created. In the following code, E would refer to a String, the push() method would accept a String parameter, and the pop() method would return a String object:
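A minimal sketch of such a class and its use with String follows. The class is called SimpleStack here only to avoid confusion with java.util.Stack; the ArrayList backing store, the isEmpty() method, and the exception thrown on an empty stack are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// A very small generic stack; E is fixed when a stack is created.
class SimpleStack<E> {
    private final List<E> items = new ArrayList<>();

    // Accepts only values of the element type E.
    void push(E item) {
        items.add(item);
    }

    // Removes and returns the most recently pushed item.
    E pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return items.remove(items.size() - 1);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }
}

class SimpleStackDemo {
    public static void main(String[] args) {
        SimpleStack<String> names = new SimpleStack<String>();  // E is String
        names.push("Ada");
        names.push("Grace");
        String top = names.pop();    // returns a String, no cast
        System.out.println(top);     // prints "Grace"
    }
}
```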

In the same way, if we created Stack<Person>, the push() method would accept a Person and the pop() method would return a Person object. No typecasts would be required when using push() and pop().

Type Inference (Diamond) Operator

From Java 7, we can replace the type arguments when invoking a constructor of a generic class with an empty set of type arguments (<>) as long as the compiler can determine, or infer, the type arguments from the context. This pair of angle brackets is informally called the diamond operator.

Using the previous code:

We can use the <> operator with the constructor, because the compiler will be able to infer the type from the usage context:
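As a self-contained sketch, the two methods below build the same map with and without the diamond. The standard HashMap class and the name DiamondDemo are used here as stand-ins; they are assumptions, not the original listing.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiamondDemo {

    // Explicit form: the type arguments are repeated on the constructor call.
    static Map<String, List<Integer>> explicit() {
        Map<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
        return scores;
    }

    // Diamond form (Java 7 and later): the compiler infers <String, List<Integer>>
    // from the declared type on the left-hand side.
    static Map<String, List<Integer>> inferred() {
        Map<String, List<Integer>> scores = new HashMap<>();
        return scores;
    }
}
```

The saving grows with the length of the type arguments, which is why the diamond is most noticeable with nested generic types like the one above.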

The Java compiler uses a type inference algorithm to look at each method invocation and the corresponding declaration, to determine the type argument(s) that can apply to the invocation and, if available, the type of the returned result. The compiler takes advantage of target typing to infer the type parameters of a generic method invocation. The target type of an expression is the data type that the compiler expects, based on the context. Finally, the inference algorithm tries to find the most specific type that works with all of the arguments.
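Both ideas can be sketched briefly. In the code below, pick() and the class InferenceDemo are hypothetical names; Collections.emptyList() is the standard library method.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InferenceDemo {

    // A generic method whose two arguments must share a common type T.
    static <T> T pick(T first, T second) {
        return second;
    }

    // Target typing: emptyList() takes no arguments, so T is inferred as
    // String from the declared type of the variable being assigned.
    static List<String> emptyNames() {
        List<String> names = Collections.emptyList();
        return names;
    }

    // Common type: a String and an ArrayList<String> are both Serializable,
    // so the call type-checks with Serializable as the result type.
    static Serializable pickExample() {
        Serializable s = pick("d", new ArrayList<String>());
        return s;
    }
}
```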

Multiple Type Parameters

A generic class can have multiple type parameters. For example, if we wanted to model a key-value pair, where both the key and the value could be of any type, we might create a generic Pair class:

The following statements instantiate a few objects of the Pair class:

From Java 7, we can use the diamond operator to reduce the amount of typing:
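The class, the explicit instantiations, and the diamond form can be sketched together as follows. The field names, accessor names, and sample values are assumptions made for illustration.

```java
// A key-value pair in which the key type K and value type V are independent.
class Pair<K, V> {
    private final K key;
    private final V value;

    Pair(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey() { return key; }

    V getValue() { return value; }
}

class PairDemo {
    public static void main(String[] args) {
        // Explicit type arguments on both sides.
        Pair<String, Integer> age = new Pair<String, Integer>("Alice", 30);
        Pair<Integer, String> status = new Pair<Integer, String>(404, "Not Found");

        // Diamond: the type arguments are inferred from the declaration.
        Pair<String, Integer> age2 = new Pair<>("Bob", 25);

        String name = age.getKey();        // no cast
        int years = age2.getValue();       // Integer unboxed to int
        System.out.println(name + " " + years + " " + status.getValue());
    }
}
```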

Type Erasure

Adding generics to Java created a problem for backwards compatibility, which has always been an important issue when adding new features to the language. The problem was how to allow older, non-generic collection classes to be used alongside newer generic collections.

The designers decided to do this with typecasts:
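A sketch of what such a cast looks like is shown below. The method getSomeThings() is a hypothetical stand-in for legacy, pre-generics code; the variable name myStringList follows the discussion in the text.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureCastDemo {

    // Stands in for older, non-generic code that returns a raw List (hypothetical).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List getSomeThings() {
        List things = new ArrayList();
        things.add("one");
        things.add("two");
        return things;
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> asStrings() {
        List someThings = getSomeThings();
        // Unchecked cast: the compiler accepts it (with a warning), but we must
        // know for ourselves that the elements really are Strings.
        List<String> myStringList = (List<String>) someThings;
        return myStringList;
    }
}
```

The cast is not verified at runtime. If the raw list held something other than Strings, the failure would appear later as a ClassCastException when an element is read as a String.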

This means that, on some level, List and List<String> are compatible as types. Java achieves this compatibility by type erasure, which means that generic types are only visible at compile time and are stripped out by the compiler. All that is left after type erasure is the raw type of the container — in this case myStringList has the type of List.

Non-generic types such as List are referred to as raw types. It is still perfectly legal to work with raw types, but we lose the strict type checking that the compiler gives us, and it’s generally a sign of poor quality code.

Compile and Runtime Typing

Consider the following statement:

We might be surprised to learn that the type of list differs between compile time and runtime.
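Assuming the statement in question declares a List<String> backed by an ArrayList (an assumption, since the statement itself is not shown), the difference can be observed directly. The compiler treats the variable as List<String>, while at runtime the object is simply an ArrayList with no record of its type argument. RuntimeTypeDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class RuntimeTypeDemo {

    // The compile-time type of the variable is List<String>;
    // the runtime class of the object is java.util.ArrayList.
    static String runtimeClassName() {
        List<String> list = new ArrayList<>();
        return list.getClass().getName();
    }

    // After erasure, lists with different type arguments share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }
}
```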

Wildcards and Bounds

In generic code, the question mark symbol ? is called a wildcard, and represents an unknown type. The wildcard can be used in a variety of situations: as the type of a parameter, field, or local variable; and occasionally as a return type. The wildcard is never used as a type argument for a generic class instance creation, generic method invocation, or a supertype.

There are three main kinds of wildcard: unbounded (?), upper bounded (? extends T), and lower bounded (? super T).

Bounded wildcards are used as type arguments of parameterized types. They are useful where only partial knowledge about the type argument of a parameterized type is needed, but where unbounded wildcards carry too little type information. A bounded wildcard carries more information than an unbounded wildcard because it stands for a family of related types. The supertype of such a family is called the upper bound; the subtype of such a family is called the lower bound.

We can specify an upper bound for a wildcard, or we can specify a lower bound, but we cannot specify both at the same time.

Unbounded Wildcards

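The unbounded wildcard type is specified using the wildcard character ?, for example, List<?>. This is called a list of unknown type. An unbounded wildcard is useful in two scenarios: when writing a method that can be implemented using only functionality provided by the Object class, and when the code uses methods of the generic class that do not depend on the type parameter, such as List.size() or List.clear().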

Consider the following printList() method:

The obvious goal of printList() is to print a list of any type, but unfortunately it can only print a list of Object instances; it can’t print List<Integer>, List<String>, List<Person>, etc., because they are not subtypes of List<Object>.

To write a generic printList() method, we must use the wildcard syntax List<?> as follows:

This works because List<T> is a subtype of List<?> for any concrete type T. That means we can use printList() to print a list of any type:
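A minimal sketch of the wildcard version and two calls to it follows; the class name PrintListDemo and the sample lists are assumptions.

```java
import java.util.Arrays;
import java.util.List;

class PrintListDemo {

    // Had the parameter been List<Object>, only a List<Object> could be passed.
    // List<?> accepts a list of any element type; elements are read as Object.
    static void printList(List<?> list) {
        for (Object elem : list) {
            System.out.print(elem + " ");
        }
        System.out.println();
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<String> strs = Arrays.asList("one", "two", "three");
        printList(ints);   // List<Integer> is a subtype of List<?>
        printList(strs);   // so is List<String>
    }
}
```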

It’s important to remember that List<Object> and List<?> are not the same. We can add an Object, or any subtype of Object, into a List<Object>. But we can only add null into a List<?>.
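The difference can be sketched as follows; UnknownListDemo and its method names are hypothetical. The commented-out lines are the ones the compiler rejects.

```java
import java.util.List;

class UnknownListDemo {

    // The element type is unknown, so the compiler cannot prove that any
    // particular value is safe to add. Only null is accepted.
    static void addNullTo(List<?> list) {
        // list.add("text");        // does not compile
        // list.add(new Object());  // does not compile
        list.add(null);
    }

    // A List<Object> accepts any object.
    static void addAnything(List<Object> list) {
        list.add("text");
        list.add(42);
    }
}
```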

Upper Bounded Wildcards

We can use an upper bounded wildcard to relax the restrictions on a variable. For example, let’s suppose we want to write a method that works on List<Integer>, List<Double>, and List<Number>. We can do this by using an upper bounded wildcard.

An upper bounded wildcard restricts the unknown type to be a specific type or a subtype of that type and is written as: <? extends T> where T is the upper bound. In this context, extends is used in a general sense to mean either implements (as in interfaces) or extends (as in classes).

To write a method that works on lists of Number and its subtypes, such as Integer, Double, etc., we would specify List<? extends Number>. The term List<Number> is more restrictive than List<? extends Number> because List<Number> matches a list of type Number only, whereas List<? extends Number> matches a list of type Number or any of its subclasses.
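As a sketch, a method that sums any list of numbers could be written as follows. The names SumDemo and sumOfList are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class SumDemo {

    // Accepts List<Number>, List<Integer>, List<Double>, and so on.
    // Every element is at least a Number, so doubleValue() is always available.
    static double sumOfList(List<? extends Number> list) {
        double sum = 0.0;
        for (Number n : list) {
            sum += n.doubleValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(1.5, 2.5);
        System.out.println(sumOfList(ints));      // prints 6.0
        System.out.println(sumOfList(doubles));   // prints 4.0
    }
}
```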

Lower Bounded Wildcards

A lower bounded wildcard restricts the unknown type to be a specific type or a supertype of that type and is written as: <? super T> where T is the lower bound.

Suppose we would like to write a method that puts Integer objects into a list. For flexibility, we’d like the method to work with List<Integer>, List<Number>, and List<Object>, i.e. anything that can hold Integer objects.

To write a method that works on lists of Integer and its supertypes, we specify List<? super Integer>. The term List<Integer> is more restrictive than List<? super Integer> because List<Integer> matches a list of type Integer only, whereas List<? super Integer> matches a list of any type that is a supertype of Integer.

The following code adds the numbers 1 through 16 to the end of a list:
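A minimal sketch of such a method is shown below; the names AddNumbersDemo and addNumbers are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class AddNumbersDemo {

    // Accepts List<Integer>, List<Number>, or List<Object>:
    // each of them can safely hold an Integer.
    static void addNumbers(List<? super Integer> list) {
        for (int i = 1; i <= 16; i++) {
            list.add(i);   // int is autoboxed to Integer
        }
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        addNumbers(numbers);
        addNumbers(objects);
        System.out.println(numbers.size() + " " + objects.size());   // prints "16 16"
    }
}
```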

Wildcard Guidelines and PECS

One of the more confusing aspects when learning to program with wildcards is determining when to use an unbounded wildcard, an upper bounded wildcard or a lower bounded wildcard.

Joshua Bloch coined the acronym PECS in his book Effective Java: Producer Extends, Consumer Super.

Here is a simple example of copying a source list to a destination list. Note how the source list src (the producing list) uses extends, and the destination list dest (the consuming list) uses super:
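One way to sketch such a copy method is shown below. The class name PecsDemo and the parameter order (source first, then destination) are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PecsDemo {

    // src produces T values, so it uses extends;
    // dest consumes T values, so it uses super.
    static <T> void copy(List<? extends T> src, List<? super T> dest) {
        for (T item : src) {
            dest.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> src = Arrays.asList(1, 2, 3);
        List<Number> dest = new ArrayList<>();
        copy(src, dest);              // Integers copied into a list of Numbers
        System.out.println(dest);     // prints [1, 2, 3]
    }
}
```

With plain List<T> on both parameters, the call in main() would not compile, because List<Integer> and List<Number> are unrelated types. The two wildcards are what make the method flexible.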

Here is another way to remember when to use super or extends, if we think in terms of an object X:

This all boils down to:

Summary

2018-05-19: Edited [jjc]
2018-03-28: Revised [lsc]
2018-03-24: Edited [jjc]
2018-03-24: Created [lsc]