Streams in Java 8

Introduction

Almost all Java developers use Collections in their day-to-day programming, and we all know how much code, and how many temporary variables and loops, go into processing the data in a collection. Java 8 comes to the rescue with the concept of Streams. Streams are an easy way to process the data in a collection in a declarative form, much like SQL.

1. So what are Streams after all?

In plain words, the Stream API is an addition to the Java API in Java 8 that can be used to process collection data in a declarative form (just like SQL). An added benefit is that you can use parallel streams to exploit multi-core architectures without writing the threading code yourself.
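As a rough illustration of the parallel-stream point, the sketch below computes the same total once with stream() and once with parallelStream(). The class name ParallelStreamSketch and the list of bill amounts are made up for this example. Whether the parallel version is actually faster depends on the data size and the hardware, so treat this as a demonstration of the API rather than a performance claim.

```java
import java.util.Arrays;
import java.util.List;

class ParallelStreamSketch {

    // Hypothetical data: a handful of bill amounts.
    static final List<Integer> BILLS = Arrays.asList(100, 150, 400, 300, 400);

    // Ordinary sequential stream: elements are processed on the calling thread.
    static int sequentialTotal() {
        return BILLS.stream().mapToInt(Integer::intValue).sum();
    }

    // Parallel stream: the runtime may split the work across several threads.
    static int parallelTotal() {
        return BILLS.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(sequentialTotal()); // 1350
        System.out.println(parallelTotal());   // 1350, same result either way
    }
}
```

The only change between the two methods is the factory method used to obtain the stream; the rest of the pipeline stays the same.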

1.1 Basic Stream Pipeline skeleton

Streams let you perform multiple operations on collection data as a pipeline. Without streams, we would generally write separate loops and create a number of temporary variables to achieve this. By using the pipeline mechanism, a lot of that code can often be reduced to a single statement. A basic Stream pipeline conventionally consists of these operations:

  1. Filter
  2. Sort
  3. Map
  4. Collect
[Figure: stream flow]
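As a rough sketch of the four stages listed above (filter, sort, map, collect), the snippet below chains them in a single statement. The class name, the word list, and the length threshold are made up for illustration and are not part of the article's dataset.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineSkeletonSketch {

    // Hypothetical input: a few words of varying length.
    static final List<String> WORDS = Arrays.asList("pear", "fig", "banana", "kiwi", "apple");

    // filter -> sort -> map -> collect, written as one statement.
    static List<String> longWordsUpperCased() {
        return WORDS.stream()
                .filter(w -> w.length() > 3)     // 1. Filter: keep words longer than 3 characters
                .sorted()                        // 2. Sort: natural (alphabetical) order
                .map(String::toUpperCase)        // 3. Map: transform each remaining element
                .collect(Collectors.toList());   // 4. Collect: gather the results into a List
    }

    public static void main(String[] args) {
        System.out.println(longWordsUpperCased()); // [APPLE, BANANA, KIWI, PEAR]
    }
}
```

Written with loops, the same logic would need a temporary list for the filtered words, a separate sort call, and another loop for the conversion.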

1.2 Define a Dataset

Since Streams are a utility to ease operations on collections, let's take a dataset that is a list of objects. Here we use a list of restaurants:

List<Restaurant> restaurants = Arrays.asList(new Restaurant[] {
      new Restaurant("Mikes Pizzaa", 3, false, 100),
      new Restaurant("Veg Delight", 2, true, 150),
      new Restaurant("Le Quello", 3, false, 400),
      new Restaurant("Subway", 2, false, 300),
      new Restaurant("Italion", 4, true, 400),
    });

where the class Restaurant is defined as:

public class Restaurant {

  private String name;
  private int rating;
  private boolean vegeterian;
  private int averageBill;

  public Restaurant() {
    // No-argument constructor; fields keep their default values
  }

  public Restaurant(String name, int rating, boolean vegeterian, int averageBill) {
    super();
    this.name = name;
    this.rating = rating;
    this.vegeterian = vegeterian;
    this.averageBill = averageBill;
  }

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }

  public int getRating() {
    return rating;
  }

  public void setRating(int rating) {
    this.rating = rating;
  }

  public boolean isVegeterian() {
    return vegeterian;
  }

  public void setVegeterian(boolean vegeterian) {
    this.vegeterian = vegeterian;
  }

  public int getAverageBill() {
    return averageBill;
  }

  public void setAverageBill(int averageBill) {
    this.averageBill = averageBill;
  }

  @Override
  public String toString() {
    
    return name;
  }
}

2. Getting started with Stream

Now let's see an example where we can easily perform the following operations:

  1. Filter
  2. Map

on the elements of the list of restaurants, using Streams and without many temporary variables and loops. To do this, we shall retrieve only the names of the restaurants whose average bill is 200 or less (yes, we are looking for cheap restaurants :) ). Here is one possible way to do this:

restaurants.stream()
           .filter((restaurant)->{
                             return (restaurant.getAverageBill()<=200);
                                 })
           .map(Restaurant::getName)
           .forEach((restString)->{System.out.println(restString);});

Let's break this code down and see what is happening here.

restaurants.stream()

Java 8 added a method called stream() to the Collection interface, which will:

  1. Return a java.util.stream.Stream object.
  2. This stream object is the stream we are talking about: it provides an on-demand sequence of the collection's elements for data processing. On-demand means that, unlike a collection, a stream does not hold all of its elements as a data structure; elements are pulled from the source one at a time, and only when a terminal operation asks for them. (Compare playing a music track stored on your local machine with playing it online in buffered mode.)
  3. Provide:
    1. Pipelining: operations are chained one after the other, and
    2. Internal iteration: the stream iterates over the elements itself and applies the task you supply to each one.
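To make the on-demand behaviour more concrete, here is a minimal sketch that records which elements a filter actually inspects. The class LazyStreamSketch, its fields, and the number list are hypothetical, and writing to a shared list from inside a lambda is done purely for demonstration; it is not a recommended practice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamSketch {

    // Records which elements the filter looked at (a side effect, for demonstration only).
    static final List<Integer> INSPECTED = new ArrayList<>();

    // Number of elements inspected after building the pipeline but before running it.
    static int inspectedBeforeTerminalOp;

    static List<Integer> firstTwoEven() {
        INSPECTED.clear();
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        // filter and limit are intermediate operations: building the pipeline runs nothing.
        Stream<Integer> pipeline = numbers.stream()
                .filter(n -> { INSPECTED.add(n); return n % 2 == 0; })
                .limit(2);
        inspectedBeforeTerminalOp = INSPECTED.size();

        // The terminal operation pulls elements through and stops once limit(2) is satisfied.
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstTwoEven());            // [2, 4]
        System.out.println(inspectedBeforeTerminalOp); // 0
        System.out.println(INSPECTED);                 // [1, 2, 3, 4]
    }
}
```

With this sequential stream, the elements 5 to 8 are never touched, because the pipeline already has the two results it needs.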

Once we have the stream object, as in the snippet above, we filter it on some criterion. The filter method takes a Predicate as its argument.

collection.stream().filter(Predicate<? super T> predicate);

Predicate is a functional interface used to test a condition (as discussed in an earlier section). All elements of the stream for which the condition returns true are passed on to the resulting stream.
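Because a Predicate is an ordinary object, it can be stored in a variable, reused, and combined. The sketch below is one way to do that; the class name, the constant names, and the list of bill amounts are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateSketch {

    // Hypothetical data: a handful of bill amounts.
    static final List<Integer> BILLS = Arrays.asList(100, 150, 400, 300, 400);

    // A predicate held in a variable: true when the bill is 200 or less.
    static final Predicate<Integer> CHEAP = bill -> bill <= 200;

    // negate() is a default method on Predicate that inverts the test.
    static final Predicate<Integer> EXPENSIVE = CHEAP.negate();

    static List<Integer> cheapBills() {
        return BILLS.stream().filter(CHEAP).collect(Collectors.toList());
    }

    static List<Integer> expensiveBills() {
        return BILLS.stream().filter(EXPENSIVE).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(cheapBills());     // [100, 150]
        System.out.println(expensiveBills()); // [400, 300, 400]
    }
}
```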

The next phase is the map step. Mapping here means converting data from one form into another. Since we want only the names of the restaurants and not the whole Restaurant objects, we map each restaurant to its name by using:

Restaurant::getName

This is a method reference, as mentioned in an earlier post.

forEach gives us internal iteration. It takes the result of the map operation and executes the given action for each element of the mapped stream. Here we simply print each element for the sake of the example, but the action could be something more complex. Generally, Streams are used for lightweight, pipeline-style processing of Java collections.

Let's explore some other stream operations as well.

2.1 Limit Operation

Remember LIMIT in MySQL? You can restrict the number of results in the same way using the limit operation in streams. Let's see an example below:

// let's keep only the first two non-veg restaurants

List<Restaurant> nonVegRestaurants =
              restaurants.stream()
                         .filter((r)->{return !r.isVegeterian();})
                         .limit(2).collect(Collectors.toList());

for (Restaurant restaurant : nonVegRestaurants) {
      System.out.println(restaurant.getName());
}

Using the same restaurant list as in the earlier example, the code above gives us the first two non-veg restaurants in the data set. It is easy to relate this to the LIMIT clause in SQL, which is why streams are said to operate on collection data in a declarative way.
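In SQL, LIMIT is often paired with OFFSET to page through results. Streams offer skip for the same purpose. The following is a small sketch of that idea; the page helper and the class name are hypothetical, and the list simply reuses the restaurant names from the dataset.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LimitSketch {

    // Restaurant names from the article's dataset, in their original order.
    static final List<String> NAMES = Arrays.asList(
            "Mikes Pizzaa", "Veg Delight", "Le Quello", "Subway", "Italion");

    // Hypothetical helper: returns one "page" of names, like LIMIT pageSize OFFSET pageIndex * pageSize.
    static List<String> page(int pageIndex, int pageSize) {
        return NAMES.stream()
                .skip((long) pageIndex * pageSize) // discard the elements of earlier pages
                .limit(pageSize)                   // keep at most pageSize elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(page(0, 2)); // [Mikes Pizzaa, Veg Delight]
        System.out.println(page(1, 2)); // [Le Quello, Subway]
        System.out.println(page(2, 2)); // [Italion]
    }
}
```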

2.2 Map Operation

Let's have one more look at map. Using the same restaurant collection as in the earlier examples, let us say you want to fetch only the ratings of the vegetarian restaurants. In other words, we want just the "rating" member variable of each Restaurant rather than the whole Restaurant object. We can achieve this with the map operation in the following way:

List<Restaurant> restaurants = Arrays.asList(new Restaurant[] {
        new Restaurant("Mikes Pizzaa", 3, false, 100),
        new Restaurant("Veg Delight", 2, true, 150),
        new Restaurant("Le Quello", 3, false, 400),
        new Restaurant("Subway", 2, false, 300),
        new Restaurant("Italion", 4, true, 400),
      });
    
// get ratings of vegetarian restaurants.
    
List<Integer> vegRestaurantsRatings = 
                  restaurants.stream()
                             .filter((r)->{return r.isVegeterian();})
                             .map(Restaurant::getRating)
                             .collect(Collectors.toList());

for (Integer rating: vegRestaurantsRatings) {
      System.out.println(rating);
}

In the above code snippet, we first filter the data, using a Predicate implementation to check whether the restaurant is vegetarian. After that, we use a method reference as the Function that maps each Restaurant object to its rating member variable (through the getRating getter). The map operation takes an object implementing the Function functional interface as its argument:

map(Function<? super T,? extends R> mapper)

Finally, we use the list collector to store the resulting elements of the stream as a list of integers holding the ratings of the restaurants.
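To show what "an object implementing Function" can look like, the sketch below supplies the same mapping in three interchangeable ways. The Restaurant class here is a trimmed-down stand-in for the one defined earlier, and the class and constant names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class MapFunctionSketch {

    // Trimmed-down stand-in for the article's Restaurant class.
    static class Restaurant {
        private final String name;
        private final int rating;

        Restaurant(String name, int rating) {
            this.name = name;
            this.rating = rating;
        }

        String getName() { return name; }
        int getRating() { return rating; }
    }

    // Three equivalent ways to supply the Function that map expects.
    static final Function<Restaurant, Integer> METHOD_REF = Restaurant::getRating;
    static final Function<Restaurant, Integer> LAMBDA = r -> r.getRating();
    static final Function<Restaurant, Integer> ANONYMOUS = new Function<Restaurant, Integer>() {
        @Override
        public Integer apply(Restaurant r) {
            return r.getRating();
        }
    };

    // Maps the two vegetarian restaurants of the dataset to their ratings.
    static List<Integer> ratings(Function<Restaurant, Integer> mapper) {
        List<Restaurant> vegRestaurants = Arrays.asList(
                new Restaurant("Veg Delight", 2),
                new Restaurant("Italion", 4));
        return vegRestaurants.stream().map(mapper).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(ratings(METHOD_REF)); // [2, 4]
        System.out.println(ratings(LAMBDA));     // [2, 4]
        System.out.println(ratings(ANONYMOUS));  // [2, 4]
    }
}
```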

3. Match Operation

This operation checks whether a given predicate (condition) holds for the elements of the data set. For example, let's check whether the given collection has at least one vegetarian restaurant.

if (restaurants.stream().anyMatch(Restaurant::isVegeterian)) {
  System.out.println("Yes, the list contains a vegetarian restaurant");
}
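Besides anyMatch, the Stream interface also offers allMatch and noneMatch. A minimal sketch of all three on the same data follows; the Restaurant class is a trimmed-down stand-in for the one defined earlier, and the helper method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

class MatchSketch {

    // Trimmed-down stand-in for the article's Restaurant class.
    static class Restaurant {
        private final String name;
        private final int rating;
        private final boolean vegeterian;

        Restaurant(String name, int rating, boolean vegeterian) {
            this.name = name;
            this.rating = rating;
            this.vegeterian = vegeterian;
        }

        int getRating() { return rating; }
        boolean isVegeterian() { return vegeterian; }
    }

    static List<Restaurant> sample() {
        return Arrays.asList(
                new Restaurant("Mikes Pizzaa", 3, false),
                new Restaurant("Veg Delight", 2, true),
                new Restaurant("Le Quello", 3, false),
                new Restaurant("Subway", 2, false),
                new Restaurant("Italion", 4, true));
    }

    // True if at least one element satisfies the predicate.
    static boolean anyVegetarian() {
        return sample().stream().anyMatch(Restaurant::isVegeterian);
    }

    // True only if every element satisfies the predicate.
    static boolean allVegetarian() {
        return sample().stream().allMatch(Restaurant::isVegeterian);
    }

    // True only if no element satisfies the predicate.
    static boolean noneRatedAbove(int rating) {
        return sample().stream().noneMatch(r -> r.getRating() > rating);
    }

    public static void main(String[] args) {
        System.out.println(anyVegetarian());   // true
        System.out.println(allVegetarian());   // false
        System.out.println(noneRatedAbove(4)); // true
    }
}
```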

 

4. Find Operation

We can use a find operation to retrieve an element that satisfies a given condition, if one exists.

Optional<Restaurant> anyFound = restaurants.stream().filter((r)->{return r.isVegeterian();}).findAny();
    
if (anyFound.isPresent()){
  System.out.println("We found veg restaurants");
}

One may argue that this is similar to the match operation: instead of passing the condition to anyMatch, we first put the predicate in filter and then call findAny() at the end of the pipeline.

However, a find operation returns an Optional, so one can also use its ifPresent method to access the matching element, if there is one.

anyFound.ifPresent((r)->{System.out.println(r.getName());});
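The Optional returned by a find operation can also be transformed and given a fallback value instead of being checked with isPresent. The sketch below uses findFirst, which returns the first match in encounter order, whereas findAny is free to return any match. The class, the helper method, and the "none" fallback string are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FindSketch {

    // Trimmed-down stand-in for the article's Restaurant class.
    static class Restaurant {
        private final String name;
        private final boolean vegeterian;

        Restaurant(String name, boolean vegeterian) {
            this.name = name;
            this.vegeterian = vegeterian;
        }

        String getName() { return name; }
        boolean isVegeterian() { return vegeterian; }
    }

    static List<Restaurant> sample() {
        return Arrays.asList(
                new Restaurant("Mikes Pizzaa", false),
                new Restaurant("Veg Delight", true),
                new Restaurant("Le Quello", false),
                new Restaurant("Subway", false),
                new Restaurant("Italion", true));
    }

    // Name of the first vegetarian restaurant, or "none" if the Optional is empty.
    static String firstVegNameOrDefault(List<Restaurant> restaurants) {
        return restaurants.stream()
                .filter(Restaurant::isVegeterian)
                .findFirst()               // Optional<Restaurant>
                .map(Restaurant::getName)  // Optional<String>, still empty if nothing matched
                .orElse("none");           // fallback value for the empty case
    }

    public static void main(String[] args) {
        System.out.println(firstVegNameOrDefault(sample()));                              // Veg Delight
        System.out.println(firstVegNameOrDefault(Collections.<Restaurant>emptyList()));   // none
    }
}
```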

5. Collectors

Collectors are used in the terminal collect operation of a stream pipeline and define how the result of the stream operations is gathered. For example, in one of the earlier code snippets we saw how to collect the result of a stream pipeline into a list of integers. Many more collectors are available in the Java API. Some of them can be grouped into three categories:

  1. Summarizing Collectors
  2. Grouping Collectors
  3. Partitioning Collectors

5.1 Summarizing collectors

These collectors perform a statistical or mathematical operation on the data collected at the end of the stream pipeline.

[Figure: Summarizing collectors in streams]

Let's see an example code snippet:

List<Restaurant> restaurants = Arrays.asList(new Restaurant[] {
        new Restaurant("Mikes Pizzaa", 3, false, 100),
        new Restaurant("Veg Delight", 2, true, 150),
        new Restaurant("Le Quello", 3, false, 400),
        new Restaurant("Subway", 2, false, 300),
        new Restaurant("Italion", 4, true, 400),
      });
    
// sum of restaurant ratings
Integer ratingSum = restaurants.stream().collect(Collectors.summingInt(Restaurant::getRating));

// average of restaurant ratings
Double ratingAverage = restaurants.stream().collect(Collectors.averagingInt(Restaurant::getRating));

In the above code snippet, we compute the sum and the average of the ratings and store each as a single value in its own variable.
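When several statistics are needed at once, Collectors.summarizingInt gathers count, sum, minimum, maximum and average in a single pass. The sketch below applies it to the five ratings from the dataset; the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummarizingSketch {

    // The ratings of the five restaurants in the article's dataset.
    static final List<Integer> RATINGS = Arrays.asList(3, 2, 3, 2, 4);

    // One pass over the data yields all the summary values together.
    static IntSummaryStatistics ratingStats() {
        return RATINGS.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = ratingStats();
        System.out.println(stats.getCount());   // 5
        System.out.println(stats.getSum());     // 14
        System.out.println(stats.getMin());     // 2
        System.out.println(stats.getMax());     // 4
        System.out.println(stats.getAverage()); // 2.8
    }
}
```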

5.2 Grouping Collectors

Here the elements are grouped by a certain value into a Map instance. We can form clusters of stream elements based on the value of some attribute of the objects.

[Figure: Grouping collectors in streams]

Let's see an example:

// let's group restaurants based on ratings.
Map<Integer, List<Restaurant>> ratingsRestaurants = restaurants.stream().collect(Collectors.groupingBy(Restaurant::getRating));
    
for (Integer rating : ratingsRestaurants.keySet()){
  System.out.println(rating+":"+ratingsRestaurants.get(rating).toString());
}

Here we have grouped the elements based on the ratings of the restaurants.
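groupingBy also accepts a second, "downstream" collector that decides what is stored for each group. The sketch below counts the restaurants per rating and, separately, keeps only their names. The Restaurant class is a trimmed-down stand-in for the one defined earlier, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSketch {

    // Trimmed-down stand-in for the article's Restaurant class.
    static class Restaurant {
        private final String name;
        private final int rating;

        Restaurant(String name, int rating) {
            this.name = name;
            this.rating = rating;
        }

        String getName() { return name; }
        int getRating() { return rating; }
    }

    static List<Restaurant> sample() {
        return Arrays.asList(
                new Restaurant("Mikes Pizzaa", 3),
                new Restaurant("Veg Delight", 2),
                new Restaurant("Le Quello", 3),
                new Restaurant("Subway", 2),
                new Restaurant("Italion", 4));
    }

    // rating -> number of restaurants with that rating
    static Map<Integer, Long> countByRating() {
        return sample().stream()
                .collect(Collectors.groupingBy(Restaurant::getRating, Collectors.counting()));
    }

    // rating -> names of the restaurants with that rating
    static Map<Integer, List<String>> namesByRating() {
        return sample().stream()
                .collect(Collectors.groupingBy(
                        Restaurant::getRating,
                        Collectors.mapping(Restaurant::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByRating());
        System.out.println(namesByRating());
    }
}
```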

5.3 Partitioning Collectors

This is similar to grouping collectors, except that there are only two groups: true and false.

[Figure: Partitioning collectors in streams]

Let's see an example code snippet for this as well.

// 3. Partitioning collectors
Map<Boolean, List<Restaurant>> vegRestaurants = restaurants.stream().collect(Collectors.partitioningBy(Restaurant::isVegeterian));
System.out.println("Veg Restaurants");
System.out.println(vegRestaurants.get(true).toString());
    
System.out.println("Non - Veg Restaurants");
System.out.println(vegRestaurants.get(false).toString());

Here we have two partitions. The "true" partition holds the vegetarian restaurants, and the "false" partition holds the non-vegetarian restaurants.
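Like groupingBy, partitioningBy can take a downstream collector. The sketch below counts the restaurants in each partition and also runs the collector on an empty list, where both keys still appear in the resulting map. The Restaurant class is a trimmed-down stand-in for the one defined earlier, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningSketch {

    // Trimmed-down stand-in for the article's Restaurant class.
    static class Restaurant {
        private final String name;
        private final boolean vegeterian;

        Restaurant(String name, boolean vegeterian) {
            this.name = name;
            this.vegeterian = vegeterian;
        }

        String getName() { return name; }
        boolean isVegeterian() { return vegeterian; }
    }

    static List<Restaurant> sample() {
        return Arrays.asList(
                new Restaurant("Mikes Pizzaa", false),
                new Restaurant("Veg Delight", true),
                new Restaurant("Le Quello", false),
                new Restaurant("Subway", false),
                new Restaurant("Italion", true));
    }

    // true -> number of vegetarian restaurants, false -> number of non-vegetarian ones
    static Map<Boolean, Long> vegCounts(List<Restaurant> restaurants) {
        return restaurants.stream()
                .collect(Collectors.partitioningBy(Restaurant::isVegeterian, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Boolean, Long> counts = vegCounts(sample());
        System.out.println(counts.get(true));  // 2
        System.out.println(counts.get(false)); // 3

        // Both partitions exist even when there is no input at all.
        Map<Boolean, Long> empty = vegCounts(Collections.<Restaurant>emptyList());
        System.out.println(empty.get(true));   // 0
        System.out.println(empty.get(false));  // 0
    }
}
```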