Now that we understand how exactly-once state consistency works, you might wonder about side effects, such as sending an email or writing to a database. That is a valid concern, because Flink's recovery mechanism is not sufficient to provide end-to-end exactly-once guarantees, even though the application state is exactly-once consistent. For example, if messages x and y from above each carry the information and an action to send an email, then during failure recovery these messages will be processed more than once, which results in duplicate emails being sent. Therefore other techniques must be used, such as idempotent writes or transactional writes. To avoid the duplicate email issue, we could save the key of each message that has been processed to a key-value store and check incoming messages against it. However, since stream processing deals with unbounded messages, it is tricky to manage this key-value check yourself under high throughput or over an unbounded time window. Here I will explain one approach to guaranteeing end-to-end exactly-once behavior, such as sending each email only once, with Apache Beam.
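To make the key-value check idea concrete, here is a minimal sketch in plain Python. It is not Flink or Beam code, and the names (`IdempotentEmailSink`, `process`, the in-memory set) are hypothetical, chosen only for illustration. The set stands in for a durable key-value store, and the `send` callback stands in for a real email client.

```python
from typing import Callable, List, Set, Tuple


class IdempotentEmailSink:
    """Sends at most one email per message key, even when messages are replayed."""

    def __init__(self, send: Callable[[str, str], None]) -> None:
        self._send = send
        # Assumption: an in-memory set stands in for a durable key-value store.
        self._processed_keys: Set[str] = set()

    def process(self, key: str, recipient: str, body: str) -> bool:
        """Return True if the email was sent, False if the key was already seen."""
        if key in self._processed_keys:
            return False  # duplicate delivery after recovery: skip the side effect
        self._send(recipient, body)
        self._processed_keys.add(key)
        return True


# Record "sent" emails in a list instead of calling a real mail service.
sent: List[Tuple[str, str]] = []
sink = IdempotentEmailSink(lambda recipient, body: sent.append((recipient, body)))

messages = [
    ("x", "a@example.com", "hello"),
    ("y", "b@example.com", "world"),
]

# First pass: both messages are processed normally.
first_pass = [sink.process(*m) for m in messages]

# Simulated failure recovery: the same messages are delivered again.
replayed = [sink.process(*m) for m in messages]
```

Even in this toy form the weaknesses are visible: the set of keys grows without bound on an unbounded stream, and a crash between sending the email and recording the key would still produce a duplicate on replay. That is why doing this by hand is tricky.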