/**
 * A {@code WindowAssigner} assigns zero or more {@link Window Windows} to an element.
 *
 * <p>In a window operation, elements are grouped by their key (if available) and by the windows to
 * which it was assigned. The set of elements with the same key and window is called a pane.
 * When a {@link Trigger} decides that a certain pane should fire the
 * {@link org.apache.flink.streaming.api.functions.windowing.WindowFunction} is applied
 * to produce output elements for that pane.
 *
 * @param <T> The type of elements that this WindowAssigner can assign windows to.
 * @param <W> The type of {@code Window} that this assigner assigns.
 */
@PublicEvolving
public abstract class WindowAssigner<T, W extends Window> implements Serializable {
    private static final long serialVersionUID = 1L;

    /**
     * Returns a {@code Collection} of windows that should be assigned to the element.
     *
     * @param element The element to which windows should be assigned.
     * @param timestamp The timestamp of the element.
     * @param context The {@link WindowAssignerContext} in which the assigner operates.
     */
    public abstract Collection<W> assignWindows(T element, long timestamp, WindowAssignerContext context);

    /**
     * Returns the default trigger associated with this {@code WindowAssigner}.
     */
    public abstract Trigger<T, W> getDefaultTrigger(StreamExecutionEnvironment env);

    /**
     * Returns a {@link TypeSerializer} for serializing windows that are assigned by
     * this {@code WindowAssigner}.
     */
    public abstract TypeSerializer<W> getWindowSerializer(ExecutionConfig executionConfig);

    /**
     * Returns {@code true} if elements are assigned to windows based on event time,
     * {@code false} otherwise.
     */
    public abstract boolean isEventTime();

    /**
     * A context provided to the {@link WindowAssigner} that allows it to query the
     * current processing time.
     *
     * <p>This is provided to the assigner by its containing
     * {@link org.apache.flink.streaming.runtime.operators.windowing.WindowOperator},
     * which, in turn, gets it from the containing
     * {@link org.apache.flink.streaming.runtime.tasks.StreamTask}.
     */
    public abstract static class WindowAssignerContext {

        /**
         * Returns the current processing time.
         */
        public abstract long getCurrentProcessingTime();
    }
}
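Next, let's look at the source of two of the most commonly used assigners, TumblingEventTimeWindows and SlidingEventTimeWindows (the processing-time implementations are similar), to see how elements are actually assigned to windows.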
TumblingEventTimeWindows source
@Override
public Collection<TimeWindow> assignWindows(Object element, long timestamp, WindowAssignerContext context) {
    if (timestamp > Long.MIN_VALUE) {
        if (staggerOffset == null) {
            staggerOffset = windowStagger.getStaggerOffset(context.getCurrentProcessingTime(), size);
        }
        // Long.MIN_VALUE is currently assigned when no timestamp is present
        long start = TimeWindow.getWindowStartWithOffset(
                timestamp, (globalOffset + staggerOffset) % size, size);
        return Collections.singletonList(new TimeWindow(start, start + size));
    } else {
        throw new RuntimeException("Record has Long.MIN_VALUE timestamp (= no timestamp marker). "
                + "Is the time characteristic set to 'ProcessingTime', or did you forget to call "
                + "'DataStream.assignTimestampsAndWatermarks(...)'?");
    }
}
SlidingEventTimeWindows source

@Override
public Collection<TimeWindow> assignWindows(Object element, long timestamp, WindowAssignerContext context) {
    if (timestamp > Long.MIN_VALUE) {
        List<TimeWindow> windows = new ArrayList<>((int) (size / slide));
        long lastStart = TimeWindow.getWindowStartWithOffset(timestamp, offset, slide);
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            windows.add(new TimeWindow(start, start + size));
        }
        return windows;
    } else {
        throw new RuntimeException("Record has Long.MIN_VALUE timestamp (= no timestamp marker). "
                + "Is the time characteristic set to 'ProcessingTime', or did you forget to call "
                + "'DataStream.assignTimestampsAndWatermarks(...)'?");
    }
}
/**
 * Method to get the window start for a timestamp.
 *
 * @param timestamp epoch millisecond to get the window start.
 * @param offset The offset which window start would be shifted by.
 * @param windowSize The size of the generated windows.
 * @return window start
 */
public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
    return timestamp - (timestamp - offset + windowSize) % windowSize;
}
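To make the arithmetic concrete, here is a minimal standalone sketch that copies the one-line formula from getWindowStartWithOffset and evaluates it for a 10-second tumbling window. The class name WindowStartDemo and the sample timestamps are made up for illustration and are not part of Flink; only non-negative timestamps are considered.

```java
// Standalone sketch; WindowStartDemo is a hypothetical name, not a Flink class.
class WindowStartDemo {

    // Same arithmetic as the getWindowStartWithOffset method quoted above.
    static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long size = 10_000L; // 10-second windows, in milliseconds

        // No offset: window boundaries are aligned to multiples of the size.
        long start = windowStart(23_000L, 0L, size);
        System.out.println("[" + start + ", " + (start + size) + ")"); // [20000, 30000)

        // Offset of 3 seconds: boundaries fall at 3000, 13000, 23000, ...
        long shifted = windowStart(22_000L, 3_000L, size);
        System.out.println("[" + shifted + ", " + (shifted + size) + ")"); // [13000, 23000)
    }
}
```

In other words, the formula subtracts from the timestamp its distance past the most recent window boundary, so a tumbling window only ever needs this single start value plus the size.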
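The sliding assigner first uses the element's timestamp, the offset, and the slide to compute the start timestamp of the latest window that contains the element. It then loops backwards in steps of slide, creating a window object for every window of the given size that still covers the timestamp, and finally returns them as a List.

The implementations of session windows and global windows are relatively simple, so they are not analyzed here; interested readers can look at the source themselves.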
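As a rough illustration of the sliding-window loop, the following standalone sketch reproduces the same logic with plain long pairs instead of Flink's TimeWindow. The class SlidingAssignDemo, the method assign, and the sample values are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch; SlidingAssignDemo is a hypothetical name, not a Flink class.
class SlidingAssignDemo {

    // Same arithmetic as TimeWindow.getWindowStartWithOffset in the quoted source.
    static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    // Returns every {start, end} pair (end exclusive) whose window contains the timestamp.
    static List<long[]> assign(long timestamp, long size, long slide, long offset) {
        List<long[]> windows = new ArrayList<>((int) (size / slide));
        // Start of the latest window containing the timestamp, aligned to the slide.
        long lastStart = windowStart(timestamp, offset, slide);
        // Walk backwards one slide at a time while the window still covers the timestamp.
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            windows.add(new long[] {start, start + size});
        }
        return windows;
    }

    public static void main(String[] args) {
        // size = 10 s, slide = 5 s, element timestamp = 12 s
        for (long[] w : assign(12_000L, 10_000L, 5_000L, 0L)) {
            System.out.println("[" + w[0] + ", " + w[1] + ")");
        }
        // prints [10000, 20000) and then [5000, 15000)
    }
}
```

With a size of 10 seconds and a slide of 5 seconds, each element lands in size / slide = 2 windows, which is also the initial capacity the quoted code gives to its ArrayList.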