Lambdas and Streams

Rule 5: Use caution when making streams parallel


Evolution of concurrency support in Java:
1996: when Java was released, it came with synchronization and wait/notify.
Java 5 introduced the java.util.concurrent library, with concurrent collections and the executor framework.
Java 7 introduced the fork-join package.
Java 8 introduced streams.
Writing concurrent programs in Java keeps getting easier, but writing concurrent programs that are CORRECT AND FAST is as difficult as it ever was.
Not only CAN parallelizing a stream lead to poor performance, including liveness failures; it CAN lead to incorrect results and unpredictable behavior (safety failures). Safety failures may result from parallelizing a pipeline that uses mappers, filters, and other programmer-supplied function objects that fail to adhere to their specifications.
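As a small illustration of a safety failure, the sketch below passes a function object that breaks the specification of reduce (the accumulator must be associative). The class and method names are hypothetical, chosen only for this example. Subtraction is not associative, so the parallel result depends on how the range happens to be split and should not be relied on.

```java
import java.util.stream.IntStream;

// Hypothetical demo class: names are illustrative, not from the original notes.
class ReduceContractDemo {

    // Integer::sum is associative, so it satisfies the contract of reduce
    // and gives the same answer sequentially and in parallel.
    static int parallelSum(int n) {
        return IntStream.rangeClosed(1, n).parallel().reduce(0, Integer::sum);
    }

    // Subtraction is NOT associative. Run sequentially, the stream folds
    // left to right in practice: ((0 - 1) - 2) - ... - n.
    static int sequentialDifference(int n) {
        return IntStream.rangeClosed(1, n).reduce(0, (a, b) -> a - b);
    }

    // Same non-associative function, but parallel: partial results from the
    // subranges are combined with subtraction too, so the value is unspecified.
    static int parallelDifference(int n) {
        return IntStream.rangeClosed(1, n).parallel().reduce(0, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        System.out.println("parallel sum:          " + parallelSum(100));
        System.out.println("sequential difference: " + sequentialDifference(100));
        // May differ from the sequential value and from run to run.
        System.out.println("parallel difference:   " + parallelDifference(100));
    }
}
```

The fix is not to tune the parallel version but to supply a function that meets the specification, or to keep the stream sequential.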
Parallelizing a pipeline is unlikely to increase its performance if the source is Stream.iterate, or if the intermediate operation limit is used. Worse, the default parallelization strategy deals with the unpredictability of limit by assuming there’s no harm in processing a few extra elements and discarding any unneeded results. As a rule, performance gains from parallelism are best on streams over ArrayList, HashMap, HashSet, and ConcurrentHashMap instances; arrays; int ranges; and long ranges.
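The difference between the two kinds of source can be sketched as follows. Both methods compute the same sum of squares; the names are hypothetical. The first builds its elements with Stream.iterate and limit, where each element depends on the previous one, so it is left sequential. The second uses a long range, which the library can split cheaply, so parallel() at least has a chance of paying off. Whether it actually does has to be measured.

```java
import java.util.stream.LongStream;
import java.util.stream.Stream;

// Hypothetical demo class: names are illustrative, not from the original notes.
class SourceChoiceDemo {

    // Source: Stream.iterate + limit. Each element is derived from the one
    // before it, so the source cannot be split into independent subranges.
    // Adding parallel() here is unlikely to help and may hurt.
    static long sumOfSquaresIterate(long n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .mapToLong(i -> i * i)
                .sum();
    }

    // Source: a long range. It can be split accurately into subranges of any
    // size, and sum() is a simple reduction, so this is a parallel candidate.
    static long sumOfSquaresRange(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresIterate(10));
        System.out.println(sumOfSquaresRange(10));
    }
}
```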
What these data structures have in common is that they can all be accurately and cheaply split into SUBRANGES of any desired size, which makes it easy to divide work among parallel threads. The abstraction used by the streams library to perform this task is the spliterator, which is returned by the spliterator method on Stream and Iterable.
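To see the splitting directly, here is a minimal sketch that asks an ArrayList for its spliterator and splits it once with trySplit. The class and method names are hypothetical. trySplit is allowed to return null when a source cannot be split, so the code checks for that. How evenly the elements are divided is an implementation detail of the collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

// Hypothetical demo class: names are illustrative, not from the original notes.
class SpliteratorDemo {

    // Returns {size of the split-off prefix, size of what remains}.
    static long[] splitSizes(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }

        Spliterator<Integer> rest = list.spliterator();
        // trySplit hands off a leading part of the elements to a new
        // spliterator; "rest" keeps covering the remainder.
        Spliterator<Integer> prefix = rest.trySplit();

        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        long[] sizes = splitSizes(8);
        System.out.println("prefix: " + sizes[0] + ", remaining: " + sizes[1]);
    }
}
```

The streams library repeats this kind of split recursively to hand subranges to the threads of the fork-join pool.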
Another property these data structures share is good LOCALITY-OF-REFERENCE when processed sequentially. Locality of reference turns out to be critically important for parallelizing bulk operations: without it, threads spend much of their time idle, waiting for data to be transferred from memory into the processor’s cache.
The data structures with the best locality of reference are primitive arrays because the data itself is stored contiguously in memory.
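A rough sketch of the two layouts, with hypothetical names: the first method sums an int[] directly, where the values sit next to each other in memory; the second sums the same values boxed into a List<Integer>, where the list holds references to Integer objects that may be scattered across the heap. Both return the same result. Any speed difference is an assumption to verify with a benchmark on real data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: names are illustrative, not from the original notes.
class PrimitiveArrayDemo {

    // Primitive array source: values are stored contiguously, and the
    // pipeline works on int/long values without boxing.
    static long sumPrimitive(int[] values) {
        return Arrays.stream(values).parallel().asLongStream().sum();
    }

    // Boxed source: each element is a reference to an Integer object,
    // so reading a value means following a pointer.
    static long sumBoxed(int[] values) {
        List<Integer> boxed = Arrays.stream(values).boxed().collect(Collectors.toList());
        return boxed.parallelStream().mapToLong(Integer::longValue).sum();
    }

    public static void main(String[] args) {
        int[] values = { 1, 2, 3, 4 };
        System.out.println(sumPrimitive(values));
        System.out.println(sumBoxed(values));
    }
}
```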
TODO: Implement something in parallel (a good use case is ML). Until the parallel version is shown to perform better than the sequential one, do not put it in production; use a sequential stream instead.

The nature of the terminal operation also affects the effectiveness of parallel execution. If most of the computation is done in the terminal operation, and that operation is time-consuming and inherently sequential, parallelizing the pipeline will have limited effect, because the parallel parts of the stream have to wait for the terminal operation to finish. The best terminal operations for parallelism are reductions.
Reduction here does NOT mean collect(). In fact, the operations performed by Stream’s collect method, which are known as mutable reductions, are not good candidates for parallelism because the overhead of combining collections is costly.
The best ones are those where all of the elements emerging from the pipeline are combined using one of Stream’s reduce methods, or prepackaged reductions such as min, max, count, and sum. The short-circuiting operations anyMatch, allMatch, and noneMatch are also amenable to parallelism.
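The terminal operations named above look like this on a parallel range. This is only a sketch with hypothetical class and method names; each method ends in reduce, a prepackaged reduction, or a short-circuiting match, and none of them builds a collection.

```java
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Hypothetical demo class: names are illustrative, not from the original notes.
class ParallelReductionDemo {

    // reduce with an associative function (multiplication) and its identity.
    static long factorial(int n) {
        return LongStream.rangeClosed(1, n).parallel().reduce(1L, (a, b) -> a * b);
    }

    // Prepackaged reductions: sum, max, count.
    static int sum(int n) {
        return IntStream.rangeClosed(1, n).parallel().sum();
    }

    static int max(int n) {
        // getAsInt assumes n >= 1, so the range is not empty.
        return IntStream.rangeClosed(1, n).parallel().max().getAsInt();
    }

    static long countEven(int n) {
        return IntStream.rangeClosed(1, n).parallel().filter(i -> i % 2 == 0).count();
    }

    // Short-circuiting operation: may stop as soon as one match is found.
    static boolean anyMultipleOf(int n, int divisor) {
        return IntStream.rangeClosed(1, n).parallel().anyMatch(i -> i % divisor == 0);
    }

    public static void main(String[] args) {
        System.out.println(factorial(10));
        System.out.println(sum(100));
        System.out.println(max(100));
        System.out.println(countEven(100));
        System.out.println(anyMultipleOf(100, 7));
    }
}
```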

Published by

sevanand yadav

software engineer working as web developer having specialization in spring MVC with mysql,hibernate