流處理中時間本質上就是一個普通的遞增欄位(long型,自1970年算起的微秒數),不一定真的表示時間。 watermark只是應對亂序的辦法之一,大多是啟髮式的,在延遲和完整性之間抉擇。(如果沒有延遲,就不夠完整;如果有延遲,極端情況就是批處理,當然完整性足夠高) org.apache.flink. ...
流處理中時間本質上就是一個普通的遞增欄位(long型,自1970年算起的微秒數),不一定真的表示時間。
watermark只是應對亂序的辦法之一,大多是啟髮式的,在延遲和完整性之間抉擇。(如果沒有延遲,就不夠完整;如果有延遲,極端情況就是批處理,當然完整性足夠高)
org.apache.flink.streaming.api.watermark
Class Watermark
java.lang.Object
org.apache.flink.streaming.runtime.streamrecord.StreamElement
org.apache.flink.streaming.api.watermark.Watermark
@PublicEvolving
public final class Watermark extends StreamElement
A Watermark tells operators that no elements with a timestamp older or equal to the watermark timestamp should arrive at the operator. Watermarks are emitted at the sources and propagate through the operators of the topology. Operators must themselves emit watermarks to downstream operators using Output.emitWatermark(Watermark). Operators that do not internally buffer elements can always forward the watermark that they receive. Operators that buffer elements, such as window operators, must forward a watermark after emission of elements that is triggered by the arriving watermark.
In some cases a watermark is only a heuristic and operators should be able to deal with late elements. They can either discard those or update the result and emit updates/retractions to downstream operations.
When a source closes it will emit a final watermark with timestamp Long.MAX_VALUE. When an operator receives this it will know that no more input will be arriving in the future.
Modifier and Type Field and Description
static Watermark MAX_WATERMARK
The watermark that signifies end-of-event-time.
reference:
https://www.bilibili.com/video/av53193640/
https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/