Programming languages evolve continuously to simplify the development of complex solutions. Following this path, Oracle introduced several features in Java 8, such as functional programming support, lambda expressions, and functional interfaces, along with the Streams API, which builds on them to simplify programming.
Streams API brings functional programming concepts into Java programming to help programmers use available computational power and resources to build an efficient solution.
What is ‘stream’, and how does it work?
A stream is not a data structure; instead, it can be viewed as a monad that represents computational operations as a sequence of steps in a pipeline that can be chained over the underlying data elements. The Java 8 Stream API supports building such a pipeline of operations on an underlying data source.
Stream operations are either intermediate or terminal. Intermediate operations return a stream, which allows multiple intermediate operations to be chained. Terminal operations return either void or a non-stream result, so they are placed at the end of the pipeline. For example, in a pipeline that calls filter, sorted, and then forEach, filter and sorted are intermediate operations while forEach is a terminal operation. Almost all stream operations accept a lambda expression or a functional interface instance, and these functions follow functional principles: they should not modify the underlying data source. For example, if we print the original collection after running such a pipeline, it still prints all the elements, while the stream prints only the filtered elements in sorted order.
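The following is a minimal sketch of such a pipeline, not the article's original listing. The class name StreamPipelineSketch, the helper filterAndSort, and the sample names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamPipelineSketch {

    // Returns the names that start with the given prefix, in sorted order.
    static List<String> filterAndSort(List<String> names, String prefix) {
        return names.stream()
                .filter(name -> name.startsWith(prefix)) // intermediate operation
                .sorted()                                // intermediate operation
                .collect(Collectors.toList());           // terminal operation
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        List<String> names = Arrays.asList("Sam", "Alex", "Sara", "Bob");

        names.stream()
                .filter(name -> name.startsWith("S"))
                .sorted()
                .forEach(System.out::println); // terminal: prints Sam, then Sara

        // The source collection is untouched by the pipeline.
        System.out.println(names); // prints [Sam, Alex, Sara, Bob]
    }
}
```

The collect call is used in the helper only so that the result can be inspected; forEach in main mirrors the filter, sorted, forEach chain described above.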
How to create streams on different types of data sources?
Java 8 provides various ways to obtain a stream from an underlying data source. The Collection interface has been updated with the default methods stream() and parallelStream(), which return a stream over the collection.
Parallel streams can perform operations on the data source using multiple threads to reduce execution time, and this is covered later in this article.
java.util.Arrays provides the stream() method to create a stream over the elements of an array.
The java.util.stream.Stream interface also provides the static method of() to create a stream from multiple object references.
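As a rough illustration of these three entry points, the sketch below builds the same small pipeline from a collection, an array, and a list of values. The class name StreamSourcesSketch and its helper methods are hypothetical; only stream(), Arrays.stream(), and Stream.of() come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamSourcesSketch {

    // Stream obtained from a Collection via the default method stream().
    static List<String> fromCollection(List<String> items) {
        return items.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    // Stream obtained from an array via Arrays.stream().
    static List<String> fromArray(String[] items) {
        return Arrays.stream(items).map(String::toUpperCase).collect(Collectors.toList());
    }

    // Stream obtained from individual object references via Stream.of().
    static List<String> fromValues() {
        return Stream.of("a", "b", "c").map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fromCollection(Arrays.asList("a", "b", "c"))); // prints [A, B, C]
        System.out.println(fromArray(new String[] {"a", "b", "c"}));      // prints [A, B, C]
        System.out.println(fromValues());                                 // prints [A, B, C]
    }
}
```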
Java 8 also supports stream pipelines over primitive data types through IntStream, LongStream, and DoubleStream. In addition to the usual stream operations, these primitive streams provide extra terminal operations for aggregation, such as average() and sum().
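A small sketch of the primitive aggregations, assuming a hypothetical class PrimitiveStreamSketch. Note that average() returns an OptionalDouble because an empty stream has no average; the sketch falls back to NaN in that case, which is a choice made here and not something the API prescribes.

```java
import java.util.stream.IntStream;

class PrimitiveStreamSketch {

    // Sums the integers in [from, toExclusive) using the IntStream terminal operation sum().
    static int sumOfRange(int from, int toExclusive) {
        return IntStream.range(from, toExclusive).sum();
    }

    // Averages the given values; an empty input yields NaN (an assumption of this sketch).
    static double averageOf(int... values) {
        return IntStream.of(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println(sumOfRange(1, 5));   // 1 + 2 + 3 + 4, prints 10
        System.out.println(averageOf(2, 4, 6)); // prints 4.0
    }
}
```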
Java 8 can also create streams from file I/O and other sources. For example:
- The lines of a file can be obtained from BufferedReader.lines() or Files.lines().
- Streams of file paths can be obtained from methods in Files such as find().
- Streams of random numbers can be obtained from Random.ints().
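The sources listed above can be sketched as follows. The class IoStreamSketch, its method names, and the temporary file used in the demo are hypothetical. A stream returned by Files.lines() holds an open file, so the sketch closes it with try-with-resources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IoStreamSketch {

    // Counts the non-blank lines of a file using a lazily read stream of lines.
    static long countNonBlankLines(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: writes the content to a temporary file, counts, then deletes the file.
    static long countNonBlankLinesInTempFile(List<String> content) {
        try {
            Path file = Files.createTempFile("stream-sketch", ".txt");
            try {
                Files.write(file, content);
                return countNonBlankLines(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Draws 'count' pseudo-random integers in [0, bound) from a seeded generator.
    static List<Integer> randomInts(long seed, int count, int bound) {
        return new Random(seed)
                .ints(count, 0, bound)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countNonBlankLinesInTempFile(Arrays.asList("first", "", "second"))); // prints 2
        System.out.println(randomInts(42L, 5, 10));
    }
}
```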
How to transform one type of stream to another type?
Sometimes we receive a data source of one type but want to perform the stream operations on values of a different type derived from its elements, such as a single attribute of each object. The Java 8 Stream API handles these cases through the map operation. map is an intermediate operation that converts each element into another object using the provided function.
The streams also support the special mapping operations such as mapToInt, mapToLong, mapToDouble, mapToObj to transform the object stream to a primitive stream, or vice versa.
For instance, if we only need to operate on a person's name or age, we can use map to obtain a stream of just the names or ages and then chain further operations on it. A stream does not modify its underlying data source, and the later stages of the chain handle only the extracted values rather than the complete person objects, which keeps the pipeline simple and can reduce the work done downstream.
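A minimal sketch of these mapping operations follows. The Person class, its fields, and the helper names are assumptions introduced for illustration, since the article's own listing is not available.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class MappingSketch {

    // Hypothetical domain class used only for this example.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // map: Stream<Person> -> Stream<String>
    static List<String> names(List<Person> people) {
        return people.stream().map(person -> person.name).collect(Collectors.toList());
    }

    // mapToInt: Stream<Person> -> IntStream, which offers average()
    static double averageAge(List<Person> people) {
        return people.stream().mapToInt(person -> person.age).average().orElse(0.0);
    }

    // mapToObj: IntStream -> Stream<String>
    static List<String> labels(int count) {
        return IntStream.rangeClosed(1, count)
                .mapToObj(i -> "person-" + i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(new Person("Ann", 30), new Person("Bob", 40));
        System.out.println(names(people));      // prints [Ann, Bob]
        System.out.println(averageAge(people)); // prints 35.0
        System.out.println(labels(2));          // prints [person-1, person-2]
    }
}
```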
How are operations in the stream chain processed?
Operations in the Stream API follow the functional programming paradigm. An important characteristic of intermediate operations is laziness: intermediate operations are executed only when a terminal operation is present.
If a chain contains a filter operation but no terminal operation, the filter is never invoked. The intermediate operation runs only once a terminal operation such as forEach is added to the chain.
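Laziness can be observed by counting how often the filter predicate runs. The sketch below is an illustration under assumed names (LazinessSketch, filterCalls); it builds the same chain with and without a terminal forEach.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazinessSketch {

    // Returns how many times the filter predicate was invoked.
    static int filterCalls(List<String> names, boolean addTerminalOperation) {
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline alone does not run the predicate.
        Stream<String> pipeline = names.stream().filter(name -> {
            calls.incrementAndGet();
            return name.length() > 3;
        });

        if (addTerminalOperation) {
            pipeline.forEach(name -> { }); // the terminal operation triggers execution
        }
        return calls.get();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Bob", "Carla"); // hypothetical data
        System.out.println(filterCalls(names, false)); // prints 0
        System.out.println(filterCalls(names, true));  // prints 3
    }
}
```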
In a stream, each element passes through the complete chain of operations vertically, unlike the traditional approach where each operation is performed in turn on the complete list.
With a filter followed by forEach, the first name passes through the filter operation and then forEach; only then is the second name processed through the same chain of operations.
Let’s look at another example of processing order behavior in the stream.
In such a chain, the stream operations are again executed vertically on each element. Because anyMatch returns true for the first element, it short-circuits and the remaining elements are not processed.
This behavior of stream chain processing order helps in reducing the number of operations and iterations on data source elements and hence reducing required memory and time.
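The processing order described in this section can be made visible by recording each call in a trace. This is a sketch with hypothetical names (ProcessingOrderSketch and its two helpers) and sample data, run on a sequential stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProcessingOrderSketch {

    // Records the order in which filter and forEach see each element.
    static List<String> traceFilterForEach(List<String> names) {
        List<String> trace = new ArrayList<>();
        names.stream()
                .filter(name -> {
                    trace.add("filter:" + name);
                    return true;
                })
                .forEach(name -> trace.add("forEach:" + name));
        return trace;
    }

    // Records the calls made until anyMatch finds a name with the given prefix.
    static List<String> traceAnyMatch(List<String> names, String prefix) {
        List<String> trace = new ArrayList<>();
        names.stream()
                .map(name -> {
                    trace.add("map:" + name);
                    return name.toUpperCase();
                })
                .anyMatch(name -> {
                    trace.add("anyMatch:" + name);
                    return name.startsWith(prefix);
                });
        return trace;
    }

    public static void main(String[] args) {
        // prints [filter:Ann, forEach:Ann, filter:Bob, forEach:Bob]
        System.out.println(traceFilterForEach(Arrays.asList("Ann", "Bob")));
        // prints [map:Ann, anyMatch:ANN] because the first element already matches
        System.out.println(traceAnyMatch(Arrays.asList("Ann", "Bob", "Cid"), "A"));
    }
}
```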
How to reuse streams on the same data source?
Streams in Java 8 cannot be reused: a stream is closed once a terminal operation has been invoked on it. Trying to reuse the stream for a second terminal operation throws an IllegalStateException, so we need to create a new stream for every desired terminal operation. The Supplier functional interface offers a convenient way to obtain multiple streams on the same data source. For example, a supplier named 'personNameSupplier' can provide a stream of names with its intermediate operations already applied. We can then call the supplier's get() method for each desired terminal operation.
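Both behaviours can be sketched as follows. The name personNameSupplier is taken from the text; the class StreamReuseSketch, the filter condition, and the sample names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuseSketch {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseThrows(List<String> names) {
        Stream<String> stream = names.stream().filter(name -> name.startsWith("A"));
        stream.anyMatch(name -> name.length() > 3); // first terminal operation
        try {
            stream.noneMatch(name -> name.length() > 3); // second terminal operation
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Uses a Supplier so each terminal operation receives a fresh stream.
    static boolean[] matchTwice(List<String> names) {
        Supplier<Stream<String>> personNameSupplier =
                () -> names.stream().filter(name -> name.startsWith("A"));

        boolean any = personNameSupplier.get().anyMatch(name -> name.length() > 3);
        boolean none = personNameSupplier.get().noneMatch(name -> name.length() > 3);
        return new boolean[] {any, none};
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Al", "Bob"); // hypothetical data
        System.out.println(reuseThrows(names));                 // prints true
        System.out.println(Arrays.toString(matchTwice(names))); // prints [true, false]
    }
}
```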
How to build an operations pipeline with parallel execution?
The Streams API can run the operations pipeline in sequential or parallel mode. In parallel mode, multiple threads on a multicore processor process the underlying operations, so the computation resources can be used effectively to reduce execution time.
Parallel streams follow the fork-join principle: a larger task is divided into smaller sub-tasks, the sub-tasks are executed in parallel on multiple threads, and their results are joined together. Internally, parallel streams use java.util.concurrent.ForkJoinPool and obtain the shared pool of worker threads through the ForkJoinPool.commonPool() method. The size of the thread pool may differ based on the available CPU cores.
For example, let's run a stream chain on a collection of approximately 5,000,000 persons in both sequential and parallel mode.
The Collection interface provides the default method parallelStream() to obtain a stream in parallel mode, while the stream() method returns a stream in sequential mode.
If we run this program, the sequential stream prints the statements from the filter and map operations on the main thread. In the second part, the parallel stream prints the statements from the filter and map operations on multiple threads. The number of worker threads depends on the machine: by default, the common ForkJoinPool uses a parallelism of one less than the number of available processors, which gives three worker threads on a four-core machine.
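The comparison can be sketched on a smaller scale. Instead of printing from inside filter and map, this version collects the names of the threads that executed the filter. The class ParallelStreamSketch, its helpers, and the use of integers in place of person objects are assumptions made to keep the example short.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelStreamSketch {

    // Hypothetical stand-in for the large collection: the integers 1..size.
    static List<Integer> sampleData(int size) {
        return IntStream.rangeClosed(1, size).boxed().collect(Collectors.toList());
    }

    // Same filter and map chain in either mode; the result must not depend on the mode.
    static long sumOfDoubledEvens(List<Integer> data, boolean parallel) {
        Stream<Integer> stream = parallel ? data.parallelStream() : data.stream();
        return stream.filter(n -> n % 2 == 0).mapToLong(n -> n * 2L).sum();
    }

    // Collects the names of the threads that executed the filter step.
    static Set<String> threadsUsed(List<Integer> data, boolean parallel) {
        Set<String> threads = ConcurrentHashMap.newKeySet(); // safe for concurrent adds
        Stream<Integer> stream = parallel ? data.parallelStream() : data.stream();
        stream.filter(n -> {
            threads.add(Thread.currentThread().getName());
            return n % 2 == 0;
        }).forEach(n -> { });
        return threads;
    }

    public static void main(String[] args) {
        List<Integer> data = sampleData(100_000);
        System.out.println("common pool parallelism: " + ForkJoinPool.commonPool().getParallelism());
        System.out.println("sequential threads: " + threadsUsed(data, false));
        System.out.println("parallel threads:   " + threadsUsed(data, true));
        System.out.println(sumOfDoubledEvens(data, false) == sumOfDoubledEvens(data, true)); // prints true
    }
}
```

The set of thread names in parallel mode varies between machines and runs, so the sketch prints it without assuming a particular size.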
We are in an era where computational power is growing exponentially, and we have the capability to train machines to learn by themselves and perform more complex computations. Streams API is one of the great efforts that has been introduced in Java 8 to meet the current programming needs and to help programmers build more efficient and complex algorithms.