Speaker

Aravind Brahmadevara

Principal Data Engineer, IMG Arena

London, United Kingdom

18 years of industry experience as an Architect, Data Scientist, and Data Engineer in the Health Insurance, Sports Betting Technology, and CRM domains.

Area of Expertise

  • Information & Communications Technology

Speed up (3X) Stateful Streaming using EFS

Real-time streaming Spark apps, especially stateful streaming apps in the cloud, can spend significant time checkpointing state to S3. At IMG Arena, we used an EFS-backed PVC (ReadWriteMany), mounted on the driver and executors, as checkpoint storage, which reduced our latencies by up to 2 seconds. We use RocksDBStateStoreProvider and saw the rocksdbCommitFileSyncLatencyMs metric improve from 2.9 sec to 0.76 sec, a gain of about 2 sec. The EFS state store works seamlessly across autoscaling, spot executor termination, and state loading on new or different executors. The EFS-backed PVC also speeds up metadata checkpointing for commit and offset tracking.
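As a rough illustration of the setup described above, the following Java sketch assembles the Spark settings that select the RocksDB state store and point checkpointing at a shared mount. The two configuration keys and the provider class name are standard Spark settings; the mount path `/mnt/efs`, the app name, and the helper names are hypothetical and are not taken from the talk. The last helper only redoes the arithmetic on the reported metric values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class EfsCheckpointConf {

    // ASSUMPTION: hypothetical path where the EFS-backed PVC is mounted
    // in both the driver and the executor pods.
    static final String EFS_MOUNT = "/mnt/efs";

    /** Builds Spark settings for a RocksDB state store checkpointing to the shared mount. */
    static Map<String, String> build(String appName) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.sql.streaming.stateStore.providerClass",
                 "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider");
        // A file:// URI, so checkpoints go to the mounted file system instead of S3.
        conf.put("spark.sql.streaming.checkpointLocation",
                 "file://" + EFS_MOUNT + "/checkpoints/" + appName);
        return conf;
    }

    /** Renders the settings as spark-submit style "--conf key=value" arguments. */
    static String toSubmitArgs(Map<String, String> conf) {
        StringJoiner joiner = new StringJoiner(" ");
        conf.forEach((key, value) -> joiner.add("--conf").add(key + "=" + value));
        return joiner.toString();
    }

    /** Difference between two latency readings, in milliseconds. */
    static long savedMillis(long beforeMs, long afterMs) {
        return beforeMs - afterMs;
    }

    public static void main(String[] args) {
        System.out.println(toSubmitArgs(build("my-streaming-app")));
        // Reported rocksdbCommitFileSyncLatencyMs values: 2.9 sec before, 0.76 sec after.
        System.out.println("saved ms: " + savedMillis(2900, 760));
    }
}
```

The checkpoint location only works as a shared store if every pod sees the same mount, which is the role of the ReadWriteMany PVC in the abstract.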

Advanced debugging to run Spark Operator (Helm) streaming apps on EKS

Amazon EMR releases 6.10.0 and higher support the Kubernetes operator for Apache Spark on EKS. However, deploying our stateful Structured Streaming apps with Spark Operator was challenging. We hit a chain of classpath and/or class version errors that kept recurring despite multiple attempts: including additional jars via the sbt build tool, adding jars to the Docker image, adding a few libraries to the Spark extra classpath, setting a few JVM options, adding the Spark Operator app's dependencies, and adjusting permission settings. At IMG Arena, we troubleshot this problem systematically by adding debugging Scala code to print out the classloaders and the files loaded. This gave us a good understanding of why the above options were failing, and after the changes we were able to migrate all our stateful and real-time apps to Spark Operator on EKS in production.
Here is a Medium blog post on the same: https://aravind-deva.medium.com/spark-operator-on-aws-eks-systematic-troubleshooting-120bad35de74
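
The kind of classloader debugging mentioned in the abstract can be sketched as follows. This is a minimal, dependency-free illustration written in Java (the same JVM calls are available from Scala) and is not the code used in the talk or the blog post; the class and method names are hypothetical. It prints the classloader chain for a class and the jar or directory the class was loaded from.

```java
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderDebug {

    /** Returns the classloader chain for a class, from its own loader up to the root. */
    static List<String> loaderChain(Class<?> cls) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = cls.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // A null loader means the JVM's built-in bootstrap loader.
        chain.add("bootstrap");
        return chain;
    }

    /** Returns the jar or directory a class was loaded from, when the JVM exposes it. */
    static String loadedFrom(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "unknown (typically a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name to inspect; defaults to this class.
        Class<?> cls = args.length > 0 ? Class.forName(args[0]) : ClassLoaderDebug.class;
        System.out.println("class:       " + cls.getName());
        System.out.println("loaded from: " + loadedFrom(cls));
        System.out.println("loaders:     " + String.join(" -> ", loaderChain(cls)));
    }
}
```

Running this inside the driver or an executor for a class that fails to load or link shows which jar actually supplied it, which helps explain why a jar added to the image or the extra classpath is not the one being picked up.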
