Session

Speed Up (3x) Stateful Streaming Using EFS

Real-time Spark streaming applications, especially stateful streaming applications running in the cloud, can spend significant time checkpointing state to S3. Here at IMG Arena, we used an EFS-backed PVC (ReadWriteMany) mounted on the driver and executors as checkpoint storage, which reduced our latencies by up to 2 seconds. We use RocksDBStateStoreProvider and saw the rocksdbCommitFileSyncLatencyMs metric drop from 2.9 s to 0.76 s, a gain of about 2 seconds. The EFS state store works seamlessly during autoscaling, spot executor termination, and state loading on new or different executors. The EFS-backed PVC also speeds up metadata checkpointing for commit and offset tracking.
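
As a rough illustration of the setup described above, the following Python sketch assembles the Spark configuration keys that mount a PVC on the driver and executors, select the RocksDB state store provider, and point the checkpoint location at the mounted path. It is not the configuration used at IMG Arena: the claim name `efs-checkpoint-pvc`, the volume name `checkpoint`, the mount path `/mnt/efs/checkpoints`, and the application name `my-stream` are all hypothetical placeholders, and the PVC itself (EFS-backed, ReadWriteMany) is assumed to exist already.

```python
# Minimal sketch: build Spark-on-Kubernetes settings for checkpointing to a
# shared PVC. All names passed in below are hypothetical placeholders.

ROCKSDB_PROVIDER = (
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"
)


def efs_checkpoint_conf(claim_name, mount_path,
                        volume_name="checkpoint", app_name="my-stream"):
    """Return a dict of Spark conf entries for a PVC-backed checkpoint dir."""
    conf = {
        # Use RocksDB as the state store for stateful streaming operators.
        "spark.sql.streaming.stateStore.providerClass": ROCKSDB_PROVIDER,
        # Checkpoint to the mounted file system instead of S3.
        "spark.sql.streaming.checkpointLocation":
            f"file://{mount_path}/{app_name}",
    }
    # Mount the same claim at the same path on the driver and every executor,
    # so any pod can read state written by another.
    for role in ("driver", "executor"):
        prefix = (
            f"spark.kubernetes.{role}.volumes."
            f"persistentVolumeClaim.{volume_name}"
        )
        conf[f"{prefix}.options.claimName"] = claim_name
        conf[f"{prefix}.mount.path"] = mount_path
        conf[f"{prefix}.mount.readOnly"] = "false"
    return conf


conf = efs_checkpoint_conf("efs-checkpoint-pvc", "/mnt/efs/checkpoints")

# Print the entries in the form accepted by spark-submit.
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

The same dictionary could be passed entry by entry to `SparkSession.builder.config(key, value)` instead of `spark-submit`; either way, the mount path must be identical on the driver and the executors for the shared checkpoint directory to resolve.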

Aravind Brahmadevara

Principal Data Engineer, IMG Arena

London, United Kingdom
