Learning Java 8: Stream Internals, Part 3 (Study Notes)

Date: 2020-01-08  Author: dawa大娃bigbaby

A Deeper Look at Stream (Part 3)

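Reflections: my earlier study of streams went down into their internals. Almost none of that is something you use directly in day-to-day development; its value is that it helps me understand the machinery underneath streams, so that I can use them with more confidence and be sure I am using them correctly.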

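Focus on technology: an environment that is purely about the technology.

Rather than: five years focused on business development with no technical growth.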

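The way the instructor, Zhang Long (张龙), teaches this course is itself the process of learning a new technology. If you find the approach effective, you can use it to learn any other new technology.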

Lambda expressions and anonymous inner classes are completely different

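Although I have studied streams before, that is not quite enough: I have not yet followed a single stream through its execution from start to finish.

The rest of this article does exactly that.

Let's start with a program.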

public class LambdaTest {
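    // What exactly is the relationship between an inner class and a lambda expression?
    Runnable r1 = () -> System.out.println(this);

    // Anonymous inner class: this creates an instance of Runnable, and it is a real class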
    Runnable r2 = new Runnable() {
        @Override
        public void run() {
            System.out.println(this);
        }
    };

    public static void main(String[] args) {
        LambdaTest lambdaTest = new LambdaTest();
        Thread t1 = new Thread(lambdaTest.r1);
        t1.start();

        System.out.println("-------------");
        Thread t2 = new Thread(lambdaTest.r2);
        t2.start();
        // Question: do the two threads print the same thing?
    }
}

Output:

-------------
com.dawa.jdk8.LambdaTest@59a30351   (printed by the lambda expression)
com.dawa.jdk8.LambdaTest$1@2831008d  (printed by the anonymous inner class)

Process finished with exit code 0

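In LambdaTest$1, the $1 suffix is the compiler-generated class name of the anonymous inner class.

The comparison shows that although a lambda can look like a shorthand for an anonymous inner class, the two are not the same thing. They work on different principles.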

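Conclusion:

An anonymous inner class opens a new scope, so this refers to the anonymous class instance.
A lambda does not open a new scope, so this refers to the enclosing instance.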
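The conclusion above can be checked without threads. The following is a minimal sketch; the class name ThisScopeDemo and its two fields are made up for illustration and are not part of the original example.

```java
import java.util.function.Supplier;

class ThisScopeDemo {
    // A lambda does not introduce a new scope for `this`:
    // here it refers to the enclosing ThisScopeDemo instance.
    final Supplier<Object> fromLambda = () -> this;

    // An anonymous inner class is a real class with its own `this`.
    final Supplier<Object> fromAnonymous = new Supplier<Object>() {
        @Override
        public Object get() {
            return this;
        }
    };

    public static void main(String[] args) {
        ThisScopeDemo demo = new ThisScopeDemo();
        System.out.println(demo.fromLambda.get() == demo);     // prints "true"
        System.out.println(demo.fromAnonymous.get() == demo);  // prints "false"
        // The anonymous class gets a compiler-generated name ending in $1.
        System.out.println(demo.fromAnonymous.get().getClass().getName());
    }
}
```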

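This point is worth knowing because, when debugging later, you will notice that anonymous inner classes and lambda expressions show up under different class names.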

Walking through the execution flow of a stream systematically

import java.util.Arrays;
import java.util.List;

public class StreamTest3 {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "welcome");
        list.stream().map(item->item+"_abc").forEach(System.out::println);
    }
}

Implementation of map()

It returns a StatelessOp.

@Override
    @SuppressWarnings("unchecked")
    public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) {
        Objects.requireNonNull(mapper);
        return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE,
                                     StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) {
            @Override
            Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
                return new Sink.ChainedReference<P_OUT, R>(sink) {
                    @Override
                    public void accept(P_OUT u) {
                        downstream.accept(mapper.apply(u));
                    }
                };
            }
        };
    }

// Definition and constructor of the StatelessOp class
    /**
     * Base class for a stateless intermediate stage of a Stream.
     *
     * @param <E_IN> type of elements in the upstream source
     * @param <E_OUT> type of elements in produced by this stage
     * @since 1.8
     */
abstract static class StatelessOp<E_IN, E_OUT>
            extends ReferencePipeline<E_IN, E_OUT> {
        /**
         * Construct a new Stream by appending a stateless intermediate
         * operation to an existing stream.
         *
         * @param upstream The upstream pipeline stage
         * @param inputShape The stream shape for the upstream pipeline stage
         * @param opFlags Operation flags for the new stage
         */
        StatelessOp(AbstractPipeline<?, E_IN, ?> upstream,
                    StreamShape inputShape,
                    int opFlags) {
            super(upstream, opFlags);
            assert upstream.getOutputShape() == inputShape;
        }

        @Override
        final boolean opIsStateful() {
            return false;
        }
    }

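StatelessOp extends ReferencePipeline, and ReferencePipeline implements Stream.

So when map() returns new StatelessOp<P_OUT, R>, it is returning a Stream.

What is returned is an instance of an anonymous subclass of StatelessOp, and it is this object that links the upstream stage to the downstream stage.

The chain of ReferencePipeline stages is essentially a doubly linked list: each stage holds a reference to its previous stage and its next stage.

Operation wrapping: inside map(), opWrapSink() returns a ChainedReference, and this is what performs the wrapping. It wraps this operation around the remaining downstream operations.

As a result, each element passes through all of the remaining operations in a single pass.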

@Override
            Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
                return new Sink.ChainedReference<P_OUT, R>(sink) {
                    @Override
                    public void accept(P_OUT u) {
                        downstream.accept(mapper.apply(u));
                    }
                };
            }
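To make the wrapping idea concrete, here is a stripped-down sketch. MiniSink and mapSink are hypothetical stand-ins invented for this illustration; they are not the JDK's Sink or ChainedReference, and they leave out begin(), end() and cancellationRequested().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class MiniSinkDemo {
    // Hypothetical, simplified stand-in for java.util.stream.Sink.
    interface MiniSink<T> {
        void accept(T t);
    }

    // Wraps a downstream sink: apply the mapper, then hand the result on.
    // This mirrors downstream.accept(mapper.apply(u)) in the excerpt above.
    static <T, R> MiniSink<T> mapSink(Function<T, R> mapper, MiniSink<R> downstream) {
        return t -> downstream.accept(mapper.apply(t));
    }

    static List<String> run(List<String> source) {
        List<String> out = new ArrayList<>();
        MiniSink<String> terminal = out::add;            // plays the role of the terminal sink
        Function<String, String> mapper = s -> s + "_abc";
        MiniSink<String> wrapped = mapSink(mapper, terminal);
        for (String s : source) {
            wrapped.accept(s);                           // each element flows through the chain
        }
        return out;
    }

    public static void main(String[] args) {
        // prints "[hello_abc, world_abc, welcome_abc]"
        System.out.println(run(Arrays.asList("hello", "world", "welcome")));
    }
}
```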

The Sink class
 * <p>A sink may be in one of two states: an initial state and an active state.
 * It starts out in the initial state; the {@code begin()} method transitions
 * it to the active state, and the {@code end()} method transitions it back into
 * the initial state, where it can be re-used.  Data-accepting methods (such as
 * {@code accept()} are only valid in the active state.
 *

ChainedReference: chaining sinks together

    /**
     * Abstract {@code Sink} implementation for creating chains of
     * sinks.  The {@code begin}, {@code end}, and
     * {@code cancellationRequested} methods are wired to chain to the
     * downstream {@code Sink}.  This implementation takes a downstream
     * {@code Sink} of unknown input shape and produces a {@code Sink<T>}.  The
     * implementation of the {@code accept()} method must call the correct
     * {@code accept()} method on the downstream {@code Sink}.
     */
    static abstract class ChainedReference<T, E_OUT> implements Sink<T> {
        protected final Sink<? super E_OUT> downstream;

        public ChainedReference(Sink<? super E_OUT> downstream) {
            this.downstream = Objects.requireNonNull(downstream);
        }

        @Override
        public void begin(long size) {
            downstream.begin(size);
        }

        @Override
        public void end() {
            downstream.end();
        }

        @Override
        public boolean cancellationRequested() {
            return downstream.cancellationRequested();
        }
    }

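The begin() and end() methods of Sink switch it between its two states: 1. the initial state, 2. the active state.

Before the first accept() call, begin() must be called to put the sink into the active state; once all elements have been pushed, end() is called to return it to the initial state.

Design pattern involved: Template Method.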
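The begin / accept / end protocol can be sketched as follows. CollectingSink is a hypothetical class written for this note. As far as I can tell, the real Sink implementations do not check the state like this; the checks are added here only to make the contract visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SinkLifecycleDemo {
    // Hypothetical sink that enforces the two-state contract described above.
    static class CollectingSink {
        private boolean active = false;
        final List<String> collected = new ArrayList<>();

        void begin(long size) {
            if (active) throw new IllegalStateException("already active");
            active = true;                     // initial -> active
        }

        void accept(String s) {
            if (!active) throw new IllegalStateException("accept() outside begin()/end()");
            collected.add(s);
        }

        void end() {
            if (!active) throw new IllegalStateException("not active");
            active = false;                    // active -> initial, sink can be re-used
        }

        boolean isActive() {
            return active;
        }
    }

    // begin() once, accept() per element, end() once.
    static CollectingSink pushAll(List<String> source) {
        CollectingSink sink = new CollectingSink();
        sink.begin(source.size());
        source.forEach(sink::accept);
        sink.end();
        return sink;
    }

    public static void main(String[] args) {
        CollectingSink sink = pushAll(Arrays.asList("hello", "world"));
        System.out.println(sink.collected);    // prints "[hello, world]"
        System.out.println(sink.isActive());   // prints "false"
    }
}
```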

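The parent declaration of opWrapSink():

It accepts a Sink that will receive the results of this operation, and returns a new Sink that accepts elements of the operation's input type, performs the operation, and passes the results to the provided Sink. (Note: the sink passed in as the argument is the one that receives the results.)
It is precisely this arrangement that lets sinks be wrapped around one another.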

    /**
     * Accepts a {@code Sink} which will receive the results of this operation,
     * and return a {@code Sink} which accepts elements of the input type of
     * this operation and which performs the operation, passing the results to
     * the provided {@code Sink}.
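     It accepts a Sink that will receive the results of this operation, and returns a new Sink that accepts elements of the operation's input type, performs the operation, and passes the results to the provided Sink. (Note: the sink passed in as the argument is the one that receives the results.)
     It is precisely this arrangement that lets sinks be wrapped around one another.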
     *
     * @apiNote
     * The implementation may use the {@code flags} parameter to optimize the
     * sink wrapping.  For example, if the input is already {@code DISTINCT},
     * the implementation for the {@code Stream#distinct()} method could just
     * return the sink it was passed.
     *
     * @param flags The combined stream and operation flags up to, but not
     *        including, this operation
     
     * @param sink sink to which elements should be sent after processing
     * @return a sink which accepts elements, perform the operation upon
     *         each element, and passes the results (if any) to the provided
     *         {@code Sink}.
     The parameter itself is what receives the results; the results are not handed back through the return value.
     */
    abstract Sink<E_IN> opWrapSink(int flags, Sink<E_OUT> sink);

        @Override
        final Sink<E_IN> opWrapSink(int flags, Sink<E_OUT> sink) {
            throw new UnsupportedOperationException();
        }

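A key property of streams: lazy (deferred) evaluation.

map(), and the other stateless intermediate operations such as peek() and filter(), do nothing more than return a StatelessOp object.

So an intermediate operation only returns a StatelessOp that a terminal operation may later execute. Without a terminal operation, the stream is never processed.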
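Laziness is easy to observe from the outside. The sketch below counts how often the mapper runs; the class and method names are invented for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Returns {mapper calls seen after map(), mapper calls seen after forEach()}.
    static int[] countMapperCalls() {
        AtomicInteger calls = new AtomicInteger();
        List<String> list = Arrays.asList("hello", "world", "welcome");

        // Only builds a new pipeline stage; the mapper is not invoked yet.
        Stream<String> mapped = list.stream().map(item -> {
            calls.incrementAndGet();
            return item + "_abc";
        });
        int afterMap = calls.get();

        // The terminal operation is what actually pulls elements through.
        mapped.forEach(item -> { });
        int afterForEach = calls.get();

        return new int[] {afterMap, afterForEach};
    }

    public static void main(String[] args) {
        int[] counts = countMapperCalls();
        System.out.println("after map(): " + counts[0]);      // prints "after map(): 0"
        System.out.println("after forEach(): " + counts[1]);  // prints "after forEach(): 3"
    }
}
```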

So now we need to trace a terminal operation.

We start from the forEach() call used in the example code.

    // Terminal operations from Stream

    @Override
    public void forEach(Consumer<? super P_OUT> action) {
        evaluate(ForEachOps.makeRef(action, false));
    }

This calls makeRef(), which is defined in the ForEachOps class.

First, the Javadoc of ForEachOps:

/**
 * Factory for creating instances of {@code TerminalOp} that perform an
 * action for every element of a stream.  Supported variants include unordered
 * traversal (elements are provided to the {@code Consumer} as soon as they are
 * available), and ordered traversal (elements are provided to the
 * {@code Consumer} in encounter order.)
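 This is a factory for creating TerminalOp objects (terminal operations) that perform an action on every element.
 The supported variants are unordered traversal and ordered traversal (elements are visited in encounter order).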
 *
 * <p>Elements are provided to the {@code Consumer} on whatever thread and
 * whatever order they become available.  For ordered traversals, it is
 * guaranteed that processing an element <em>happens-before</em> processing
 * subsequent elements in the encounter order.
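 Elements are handed to the Consumer on whatever thread and in whatever order they become available.
 For ordered traversals, processing one element is guaranteed to happen-before processing the elements that follow it.
 In other words, elements encountered first are processed first, and elements encountered later are processed later.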
 
 *
 * <p>Exceptions occurring as a result of sending an element to the
 * {@code Consumer} will be relayed to the caller and traversal will be
 * prematurely terminated.
 The class provides a number of static factory methods.
 *
 * @since 1.8
 */
final class ForEachOps {
    
}

For example, makeRef():

   /**
     * Constructs a {@code TerminalOp} that perform an action for every element
     * of a stream.
     *
     * @param action the {@code Consumer} that receives all elements of a
     *        stream
     * @param ordered whether an ordered traversal is requested
     * @param <T> the type of the stream elements
     * @return the {@code TerminalOp} instance
     */
    public static <T> TerminalOp<T, Void> makeRef(Consumer<? super T> action,
                                                  boolean ordered) {
        Objects.requireNonNull(action);
        return new ForEachOp.OfRef<>(action, ordered);
    }
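The ordered flag is what separates forEach() from forEachOrdered(): forEach() passes false, as shown above, and as far as I can tell from the JDK 8 source forEachOrdered() passes true to the same factory. The difference only becomes visible on a parallel stream. A small sketch, with invented class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrderedForEachDemo {
    // Ordered traversal: elements reach the consumer in encounter order,
    // even though the stream is parallel.
    static List<Integer> viaForEachOrdered(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    // Unordered traversal: same elements, but the order is not guaranteed.
    static List<Integer> viaForEach(List<Integer> source) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(viaForEachOrdered(source)); // always 1..8 in order
        System.out.println(viaForEach(source));        // 1..8 in some order
    }
}
```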

About TerminalOp

By default, evaluation is sequential.

/**
 * An operation in a stream pipeline that takes a stream as input and produces
 * a result or side-effect.  A {@code TerminalOp} has an input type and stream
 * shape, and a result type.  A {@code TerminalOp} also has a set of
 * <em>operation flags</em> that describes how the operation processes elements
 * of the stream (such as short-circuiting or respecting encounter order; see
 * {@link StreamOpFlag}).
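 An operation in a stream pipeline that takes a stream as input and produces a result or a side effect (a side effect meaning, for example, that you pass in a reference and the operation modifies what it refers to).
 A TerminalOp has an input type, a stream shape, and a result type.
 A TerminalOp also has a set of flags describing how it processes the elements of the stream.
 A TerminalOp must provide both a sequential and a parallel implementation.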
 
 *
 * <p>A {@code TerminalOp} must provide a sequential and parallel implementation
 * of the operation relative to a given stream source and set of intermediate
 * operations.
 *
 * @param <E_IN> the type of input elements
 * @param <R>    the type of the result
 * @since 1.8
 */
interface TerminalOp<E_IN, R> {
        /**
     * Gets the shape of the input type of this operation.
     *
     * @implSpec The default returns {@code StreamShape.REFERENCE}.
     *
     * @return StreamShape of the input type of this operation
     */
    default StreamShape inputShape() { return StreamShape.REFERENCE; }

    /**
     * Gets the stream flags of the operation.  Terminal operations may set a
     * limited subset of the stream flags defined in {@link StreamOpFlag}, and
     * these flags are combined with the previously combined stream and
     * intermediate operation flags for the pipeline.
     *
     * @implSpec The default implementation returns zero.
     *
     * @return the stream flags for this operation
     * @see StreamOpFlag
     */
    default int getOpFlags() { return 0; }

    /**
     * Performs a parallel evaluation of the operation using the specified
     * {@code PipelineHelper}, which describes the upstream intermediate
     * operations.
     *
     * @implSpec The default performs a sequential evaluation of the operation
     * using the specified {@code PipelineHelper}.
     *
     * @param helper the pipeline helper
     * @param spliterator the source spliterator
     * @return the result of the evaluation
     */
    default <P_IN> R evaluateParallel(PipelineHelper<E_IN> helper,
                                      Spliterator<P_IN> spliterator) {
        if (Tripwire.ENABLED)
            Tripwire.trip(getClass(), "{0} triggering TerminalOp.evaluateParallel serial default");
        return evaluateSequential(helper, spliterator);
    }

    /**
     * Performs a sequential evaluation of the operation using the specified
     * {@code PipelineHelper}, which describes the upstream intermediate
     * operations.
     *
     * @param helper the pipeline helper
     * @param spliterator the source spliterator
     * @return the result of the evaluation
     */
    <P_IN> R evaluateSequential(PipelineHelper<E_IN> helper,
                                Spliterator<P_IN> spliterator);
}

Terminal operations are implemented by just four kinds of ops:

1. find

2. match

3. forEach (traversal)

4. reduce
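From the caller's side, the four families look like this. This is only a usage sketch with made-up method names; it shows one public Stream method per family and says nothing about how each one is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class TerminalOpsDemo {
    static final List<String> WORDS = Arrays.asList("hello", "world", "welcome");

    // 1. find: first element starting with "w"
    static Optional<String> find() {
        return WORDS.stream().filter(s -> s.startsWith("w")).findFirst();
    }

    // 2. match: is any word longer than five characters?
    static boolean match() {
        return WORDS.stream().anyMatch(s -> s.length() > 5);
    }

    // 3. forEach: run an action for every element
    static List<String> forEach() {
        List<String> visited = new ArrayList<>();
        WORDS.stream().forEach(visited::add);
        return visited;
    }

    // 4. reduce: fold all lengths into one number
    static int reduce() {
        return WORDS.stream().map(String::length).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(find().get());  // prints "world"
        System.out.println(match());       // prints "true"
        System.out.println(forEach());     // prints "[hello, world, welcome]"
        System.out.println(reduce());      // prints "17"
    }
}
```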

Going back: inside forEach(), ForEachOps.makeRef() creates a terminal operation object.

Then the evaluate() method executes that terminal operation object.


    // Terminal evaluation methods

    /**
     * Evaluate the pipeline with a terminal operation to produce a result.
     *
     * @param <R> the type of result
     * @param terminalOp the terminal operation to be applied to the pipeline.
     * @return the result
     */
final <R> R evaluate(TerminalOp<E_OUT, R> terminalOp) {
        assert getOutputShape() == terminalOp.inputShape();
        if (linkedOrConsumed)
            throw new IllegalStateException(MSG_STREAM_LINKED);
        linkedOrConsumed = true;

        return isParallel()
               ? terminalOp.evaluateParallel(this, sourceSpliterator(terminalOp.getOpFlags()))
               : terminalOp.evaluateSequential(this, sourceSpliterator(terminalOp.getOpFlags()));
    }
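The linkedOrConsumed check at the top of evaluate() is what makes a stream single-use. A small sketch of the observable behaviour, with an invented class name:

```java
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOpFails() {
        Stream<String> stream = Stream.of("hello", "world");
        stream.forEach(s -> { });          // first evaluation marks the stream as consumed
        try {
            stream.forEach(s -> { });      // second evaluation hits the linkedOrConsumed check
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOpFails()); // prints "true"
    }
}
```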

The PipelineHelper class

It describes all the information about a stream pipeline.

/**
 * Helper class for executing <a href="package-summary.html#StreamOps">
 * stream pipelines</a>, capturing all of the information about a stream
 * pipeline (output shape, intermediate operations, stream flags, parallelism,
 * etc) in one place.
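 PipelineHelper is a helper class for executing stream pipelines.
 It captures all the information about a stream pipeline in one place: source data, output shape, operations, stream flags, parallelism, and so on.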
 
 *
 * <p>
 * A {@code PipelineHelper} describes the initial segment of a stream pipeline,
 * including its source, intermediate operations, and may additionally
 * incorporate information about the terminal (or stateful) operation which
 * follows the last intermediate operation described by this
 * {@code PipelineHelper}. The {@code PipelineHelper} is passed to the
 * {@link TerminalOp#evaluateParallel(PipelineHelper, java.util.Spliterator)},
 * {@link TerminalOp#evaluateSequential(PipelineHelper, java.util.Spliterator)},
 * and {@link AbstractPipeline#opEvaluateParallel(PipelineHelper, java.util.Spliterator,
 * java.util.function.IntFunction)}, methods, which can use the
 * {@code PipelineHelper} to access information about the pipeline such as
 * head shape, stream flags, and size, and use the helper methods
 * such as {@link #wrapAndCopyInto(Sink, Spliterator)},
 * {@link #copyInto(Sink, Spliterator)}, and {@link #wrapSink(Sink)} to execute
 * pipeline operations..
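 It describes the initial segment of a stream pipeline, including the source, the intermediate operations, and possibly the operation that follows them.
 
 The PipelineHelper is passed to the methods listed above, which can then use it to access information about the pipeline.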
 
 *
 * @param <P_OUT> type of output elements from the pipeline
 * @since 1.8
 */
abstract class PipelineHelper<P_OUT> {
    ...
}

A method in PipelineHelper: wrapAndCopyInto()

   /**
     * Applies the pipeline stages described by this {@code PipelineHelper} to
     * the provided {@code Spliterator} and send the results to the provided
     * {@code Sink}.
     Applies the pipeline stages described by this PipelineHelper to the given Spliterator and sends the results to the given Sink.
     *
     * @implSpec
     * The implementation behaves as if:
     * <pre>{@code
     *     intoWrapped(wrapSink(sink), spliterator);
     * }</pre>
     *
     * @param sink the {@code Sink} to receive the results
     * @param spliterator the spliterator describing the source input to process
     */
    abstract<P_IN, S extends Sink<P_OUT>> S wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator);

The concrete implementation of wrapAndCopyInto():

    @Override
    final <P_IN, S extends Sink<E_OUT>> S wrapAndCopyInto(S sink, Spliterator<P_IN> spliterator) {
        copyInto(wrapSink(Objects.requireNonNull(sink)), spliterator);
        return sink;
    }

The wrapSink() method, declared in PipelineHelper

    /**
     * Takes a {@code Sink} that accepts elements of the output type of the
     * {@code PipelineHelper}, and wrap it with a {@code Sink} that accepts
     * elements of the input type and implements all the intermediate operations
     * described by this {@code PipelineHelper}, delivering the result into the
     * provided {@code Sink}.
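     Takes a Sink that accepts elements of this PipelineHelper's output type and wraps it in a Sink that accepts the input type and performs all the intermediate operations.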
     
     *
     * @param sink the {@code Sink} to receive the results
     * @return a {@code Sink} that implements the pipeline stages and sends
     *         results to the provided {@code Sink}
     */
    abstract<P_IN> Sink<P_IN> wrapSink(Sink<P_OUT> sink);

The concrete implementation of wrapSink() (this is what chains the multiple stream operations together):

    @Override
    @SuppressWarnings("unchecked")
    final <P_IN> Sink<P_IN> wrapSink(Sink<E_OUT> sink) {
        Objects.requireNonNull(sink);

        // depth > 0 means this stage is an intermediate operation; walk the stages from last to first.
        for ( @SuppressWarnings("rawtypes") AbstractPipeline p=AbstractPipeline.this; p.depth > 0; p=p.previousStage) {
            sink = p.opWrapSink(p.previousStage.combinedFlags, sink);
        }
        return (Sink<P_IN>) sink;
    }
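The backward loop is easier to follow on a toy model. In the sketch below, Stage is a hypothetical class invented for this note, with a plain Consumer standing in for Sink. It is not how AbstractPipeline is written, but it shows why wrapping from the last stage to the first yields a sink that runs the operations in forward order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

class WrapSinkDemo {
    // Hypothetical pipeline stage: knows its predecessor and how to wrap a downstream sink.
    static class Stage {
        final Stage previous;            // null for the head (source) stage
        final int depth;                 // 0 for the head, 1 for the first operation, ...
        final UnaryOperator<String> op;  // unused for the head

        Stage(Stage previous, UnaryOperator<String> op) {
            this.previous = previous;
            this.depth = (previous == null) ? 0 : previous.depth + 1;
            this.op = op;
        }

        Consumer<String> wrap(Consumer<String> downstream) {
            return s -> downstream.accept(op.apply(s));
        }
    }

    // Same shape as the loop above: start at the last stage, stop at the head.
    static Consumer<String> wrapSink(Stage last, Consumer<String> sink) {
        for (Stage p = last; p.depth > 0; p = p.previous) {
            sink = p.wrap(sink);
        }
        return sink;
    }

    static List<String> run(String input) {
        Stage head = new Stage(null, null);
        Stage append = new Stage(head, s -> s + "_abc");       // first operation
        Stage upper = new Stage(append, String::toUpperCase);  // second operation
        List<String> out = new ArrayList<>();
        wrapSink(upper, out::add).accept(input);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("hello")); // prints "[HELLO_ABC]"
    }
}
```

The output shows that the append step ran before the upper-casing step, even though the upper-casing stage was wrapped first.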


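My own summary: the execution flow of a Stream.

Source data - intermediate operation - intermediate operation - terminal operation

1. Chain all the operations together (the intermediate operations and the terminal operation).

2. Let the elements of the stream go through all of those operations, one element at a time.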
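Step 2 can be observed by logging inside each operation. The following is a small sketch with an invented class name; it uses a sequential stream with stateless operations only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ElementOrderDemo {
    // Records the order in which the operations see the elements.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Arrays.asList("hello", "world").stream()
                .map(s -> { log.add("map " + s); return s + "_abc"; })
                .filter(s -> { log.add("filter " + s); return true; })
                .forEach(s -> log.add("forEach " + s));
        return log;
    }

    public static void main(String[] args) {
        // Each element goes through map, filter and forEach before the next one starts:
        // map hello, filter hello_abc, forEach hello_abc,
        // map world, filter world_abc, forEach world_abc
        trace().forEach(System.out::println);
    }
}
```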

The core method is copyInto(), and in particular this call: spliterator.forEachRemaining(wrappedSink); // the single most important step

 @Override
    final <P_IN> void copyInto(Sink<P_IN> wrappedSink, Spliterator<P_IN> spliterator) {
        Objects.requireNonNull(wrappedSink);

        if (!StreamOpFlag.SHORT_CIRCUIT.isKnown(getStreamAndOpFlags())) {
            wrappedSink.begin(spliterator.getExactSizeIfKnown());
            spliterator.forEachRemaining(wrappedSink); // the single most important step
            wrappedSink.end();
        }
        else {
            copyIntoWithCancel(wrappedSink, spliterator);
        }
    }

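wrappedSink: all the intermediate operations are wrapped into this one sink object.

spliterator: the source data. forEachRemaining() traverses it and, for each element, runs the operations wrapped in the sink.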
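The same driving step can be reproduced by hand with the public Spliterator API. In this sketch a plain Consumer stands in for wrappedSink, and the begin() / end() calls are omitted; the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;

class ForEachRemainingDemo {
    static List<String> drive(List<String> source) {
        List<String> out = new ArrayList<>();
        // Stand-in for wrappedSink: the "map" step and the terminal action in one consumer.
        Consumer<String> wrapped = s -> out.add(s + "_abc");

        Spliterator<String> spliterator = source.spliterator();
        // The source pushes every remaining element into the consumer.
        spliterator.forEachRemaining(wrapped);
        return out;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "welcome");
        System.out.println(list.spliterator().getExactSizeIfKnown()); // prints "3"
        System.out.println(drive(list)); // prints "[hello_abc, world_abc, welcome_abc]"
    }
}
```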

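The above is static analysis (reading the source code).

You can follow it up with dynamic analysis (stepping through the program in a debugger).

Step through the code once with the debugger.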

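import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class StreamTest3 {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("hello", "world", "welcome");
//        list.stream().map(item->item+"_abc").forEach(System.out::println);

        // Same pipeline as above, split into steps so that each one can be inspected.
        Stream<String> stream = list.stream();
        System.out.println("1"); // set a breakpoint here
        Stream<String> stream1 = stream.map(item -> item + "_abc");
        System.out.println("2"); // set a breakpoint here
        stream1.forEach(System.out::println);
    }
}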
