Can Kafka Streams be configured to wait for KTable to load?

Processing is time synchronized in Kafka Streams. Hence, the table input topic and stream input topic are processed based on record timestamp order. This is semantically sound, because on a stream-table join, you don't want to join a stream record with an older version nor with a newer version of the KTable, but with the right version based on the stream record timestamp.

If your data is not properly timestamped, you can try to specify a custom timestamp extractor for via builder.table(..., Consumed.with(...)) to return timestamps that ensure proper behavior (ie, maybe smaller than timestamp of the first stream record?)

  • https://docs.confluent.io/current/streams/developer-guide/config-streams.html#streams-developer-guide-timestamp-extractor

Note, that a proper timestamp synchronization requires Kafka Streams 2.1. Older version synchronize time in best effort manner only and may not provide the behavior you want. For more details, see KIP-353.

  • https://cwiki.apache.org/confluence/display/KAFKA/KIP-353%3A+Improve+Kafka+Streams+Timestamp+Synchronization