Tejas Chopra
Senior Software Engineer, Netflix
San Jose, California, United States
Tejas Chopra is a Senior Software Engineer on the Data Storage Platform team at Netflix, where he is responsible for architecting storage solutions to support Netflix Studios and the Netflix streaming platform. Prior to Netflix, Tejas designed and implemented the storage infrastructure at Box, Inc. to support a cloud content management platform that scales to petabytes of storage and millions of users. Over his career, he has worked on distributed file systems and backend architectures, both on-premise and in the cloud, at several startups. Tejas is an international keynote speaker and periodically conducts seminars on microservices, NFTs, software development, and cloud computing. He holds a master's degree in Electrical & Computer Engineering from Carnegie Mellon University, with a specialization in Computer Systems.
Topics
Memory Optimizations for Machine Learning
As Machine Learning continues to forge its way into diverse industries and applications, optimizing computational resources, particularly memory, has become a critical aspect of effective model deployment. This session, "Memory Optimizations for Machine Learning," aims to offer an exhaustive look into the specific memory requirements in Machine Learning tasks, including Large Language Models (LLMs), and the cutting-edge strategies to minimize memory consumption efficiently.
We'll begin by demystifying the memory footprint of typical Machine Learning data structures and algorithms, elucidating the nuances of memory allocation and deallocation during model training phases. The talk will then focus on memory-saving techniques such as data quantization, model pruning, and efficient mini-batch selection. These techniques offer the advantage of conserving memory resources without significant degradation in model performance.
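To make one of these techniques concrete, here is a minimal sketch of post-training dynamic quantization in PyTorch; the model is a stand-in, and the exact savings depend on the architecture:

import io
import torch
import torch.nn as nn

# A small illustrative model; any module with nn.Linear layers works similarly.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Post-training dynamic quantization: nn.Linear weights are stored as int8
# and dequantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_size(m):
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

print("fp32 size (bytes):", serialized_size(model))
print("int8 size (bytes):", serialized_size(quantized))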
A special emphasis will be placed on the memory footprint of LLMs during inferencing. LLMs, known for their immense size and complexity, pose unique challenges in terms of memory consumption during deployment. We will explore the factors contributing to the memory footprint of LLMs, such as model architecture, input sequence length, and vocabulary size. Additionally, we will discuss practical strategies to optimize memory usage during LLM inferencing, including techniques like model distillation, dynamic memory allocation, and efficient caching mechanisms.
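To ground the inferencing discussion, the back-of-the-envelope sketch below estimates the size of the attention key/value cache, one of the dominant memory consumers during LLM inference (the configuration shown is illustrative, roughly that of a 7B-parameter decoder):

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    # 2x for keys and values; bytes_per_elem=2 assumes fp16/bf16 activations.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Illustrative 7B-class configuration: 32 layers, 32 KV heads of dimension 128.
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096, batch=4)
print(f"KV cache: {size / 2**30:.1f} GiB")  # 8.0 GiB for this configuration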
By the end of this session, attendees will have a comprehensive understanding of memory optimization techniques for Machine Learning, with a particular focus on the challenges and solutions related to LLM inferencing.
Designing media optimized byte transfer and storage at Netflix
Netflix is a media streaming company and a movie studio with data at exabyte scale. Most of the data generated, transferred, and stored at Netflix is media-specific: for example, raw camera footage, or data generated as a result of encoding and rendering for different screen types.
In this session, I will shed light on how we design a media-aware, optimized transfer, storage, and presentation layer for this data.
By leveraging this architecture at Netflix scale, we provide a scalable, reliable, and optimized backend layer for media data.
Major takeaways from this session:
- Learn about the challenges of designing a scalable object storage layer while adhering to the POSIX file-system semantics that media applications expect
- Learn about the optimizations applied to reduce cloud storage footprint, such as chunking and deduplication (see the sketch after this list)
- Learn about how different applications expect data to be presented at different locations and in different formats
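As a rough illustration of the second takeaway, the sketch below shows fixed-size chunking with SHA-256 fingerprints; the in-memory dictionary stands in for an object store and is not Netflix's actual API:

import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB fixed-size chunks (illustrative)
chunk_store = {}               # fingerprint -> chunk bytes; stands in for an object store

def upload_dedup(path):
    """Split a file into chunks and upload only the chunks not already stored."""
    manifest = []
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in chunk_store:   # dedup: skip chunks we already have
                chunk_store[digest] = chunk
            manifest.append(digest)
    return manifest  # the file is reconstructed by fetching chunks in manifest order

def download(manifest):
    return b"".join(chunk_store[d] for d in manifest)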
DevOps at Netflix
Netflix is a global leader in video streaming and has long been known in Silicon Valley for its culture document, a seminal work in setting the context around culture.
In this session, I will shed light on how Netflix thinks about DevOps and how our culture shapes our approach to Agile practices, DevOps, and development.
It will be a great way to get a glimpse of how loosely coupled and highly aligned Netflix is, and participants can apply some parts of our culture to their organizations.
Demystifying Privacy Preserving Computing
When it comes to privacy, encryption of data in transit (e.g., HTTPS) and data at rest (e.g., encrypted hard disks) provides sufficient cryptographic guarantees. The unresolved problem is encrypting data in use. Currently, in order to process data, we need to decrypt it, process it, and re-encrypt it. Computation over unencrypted data can compromise confidentiality and exposes the data to various security attacks.
Privacy-Preserving Computing (PPC) has emerged in recent years to enable secure computation over data without revealing its content. These techniques look at how to represent data in a form that can be shared, analyzed, and operated on without exposing the raw information.
We will discuss the current state-of-the-art PPC techniques and the distinct threat models and business use cases they address. The techniques we will cover are secure multiparty computation (SMPC), fully homomorphic encryption (FHE), and differential privacy (DP).
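As a flavor of the simplest of these mechanisms, the sketch below adds Laplace noise to a count query, the textbook differential-privacy mechanism (the epsilon value and data are illustrative):

import numpy as np

def dp_count(values, predicate, epsilon=0.5):
    """Differentially private count: a counting query has sensitivity 1,
    so Laplace noise with scale 1/epsilon satisfies epsilon-DP."""
    true_count = sum(1 for v in values if predicate(v))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [23, 35, 47, 52, 61, 29, 44]
print(dp_count(ages, lambda a: a > 40))  # noisy count of people over 40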
Upskilling for a multi-cloud world
In this session, I would like to focus on the basic concepts of Cloud computing that apply to different cloud environments.
Azure, AWS, GCP, and others share a lot of commonalities in how they are implemented on the backend and in the services and SLAs they expose to applications built on top of them.
Different organizations use different cloud offerings; for example, Box is built on top of GCP, while Netflix heavily uses AWS. In such a scenario, it is best for a software developer to be aware of the different clouds and to learn their nuances. In this session, I will cover basic ideas around Storage, Compute, Networking, and Serverless, and give a flavor of how software development in the cloud varies between organizations such as Box and Netflix.
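As a small taste of how the same concept maps across providers, the sketch below uploads an object to AWS S3 and to Google Cloud Storage; the SDK calls differ, but the bucket/key/upload concepts line up one-to-one (bucket and file names are placeholders):

import boto3
from google.cloud import storage

def upload_to_s3(bucket, key, path):
    # AWS: boto3 S3 client, upload_file(local_path, bucket, key)
    boto3.client("s3").upload_file(path, bucket, key)

def upload_to_gcs(bucket, key, path):
    # GCP: storage client, bucket -> blob -> upload_from_filename
    storage.Client().bucket(bucket).blob(key).upload_from_filename(path)

upload_to_s3("my-bucket", "assets/report.pdf", "report.pdf")
upload_to_gcs("my-bucket", "assets/report.pdf", "report.pdf")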
Using Kafka to solve food wastage for millions
India wastes roughly 70 million tonnes of food per year, and there is mismanagement at several layers. Approximately 20-30% of this wastage happens in the last mile, between wholesale traders and retail mom-and-pop stores.
Is there something we can do about it?
This was the problem statement I attempted to solve as the first engineering hire at a startup. Our customers were 12.8 million retail owners dealing in FMCG (fast-moving consumer goods, such as food grains, toothpaste, etc.). The goal was to develop a platform for retail traders (mom-and-pop shop owners and small and medium business owners) to buy FMCG products from wholesale traders using an Android app.
We used the cloud extensively to develop microservices for a section of the population that is not very well versed with smartphones and technology. In this domain, data is highly unstructured, and we had to find creative ways to represent it to deliver a slick browse-and-search experience.
We used Kafka as the event streaming platform, in conjunction with DynamoDB, Elasticsearch, and Redshift, to build pipelines and systems on the cloud.
Data ingested by sellers into DynamoDB would be transformed, published to Elasticsearch, and inserted into Redshift tables for analysis. We explored several alternatives, such as Kinesis Streams, but settled on AWS-managed Kafka for cost and performance. I'd like to delve into these details as part of this session.
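To give a flavor of one leg of that pipeline, here is a simplified sketch using kafka-python and the Elasticsearch Python client; the topic, index, and endpoint names are illustrative rather than the production ones:

import json
from kafka import KafkaConsumer
from elasticsearch import Elasticsearch

# Consume catalog-update events emitted when sellers write items to DynamoDB,
# and index each item into Elasticsearch for the browse/search experience.
consumer = KafkaConsumer(
    "catalog-updates",
    bootstrap_servers=["kafka:9092"],
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    group_id="search-indexer",
)
es = Elasticsearch("http://localhost:9200")

for message in consumer:
    item = message.value
    es.index(index="products", id=item["sku"], document=item)
    # A sibling consumer group loads the same events into Redshift for analytics.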
Going from three nines to four nines using Kafka
Many organizations have chosen to go with a hybrid cloud architecture to give them the best of both worlds: the scalability and ease of deployment of cloud, and the security, latency & egress benefits of local storage.
Persistence of data on such an architecture can follow a write-back mode, where data is first written to local storage and then uploaded to the cloud asynchronously. However, this means applications cannot take advantage of the availability and durability guarantees of the cloud: the overall availability is bounded by the SLA of the on-premise storage, which is almost always lower than the availability SLA of the cloud.
By switching the order, i.e. performing uploads to cloud, and then hydrating on-premise storage, applications get the benefit of availability SLAs of cloud. In our case, this allowed us to move from three 9’s of availability (99.9%) of local storage to four 9’s (99.99%).
Instead of uploading in write-back mode, we duplicated the incoming stream to upload to both cloud and on-premise. For on-premise uploads that failed, we leveraged Kafka’s event processing to queue up objects that need to be egressed out of Cloud into the local storage.
This architecture allowed us to hydrate the local storage with objects uploaded to Cloud. Furthermore, since local storage space is limited, we periodically purged data out of local storage and created a secondary copy of the data on cloud by leveraging Kafka event processing.
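A minimal sketch of that write path, assuming boto3 for the cloud upload and kafka-python for the hydration queue (bucket, topic, and path names are placeholders):

import json
import shutil
import boto3
from kafka import KafkaProducer

s3 = boto3.client("s3")
producer = KafkaProducer(bootstrap_servers=["kafka:9092"],
                         value_serializer=lambda v: json.dumps(v).encode("utf-8"))

def write(key, path):
    # Duplicate the incoming write: the cloud upload is authoritative for the SLA,
    # while the on-premise copy is best effort.
    s3.upload_file(path, "primary-bucket", key)
    try:
        shutil.copyfile(path, f"/mnt/local-cache/{key}")
    except OSError:
        # On-premise write failed: queue the object so a background consumer
        # can later egress it from the cloud back into local storage.
        producer.send("hydration-events", {"key": key})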
Blockchain Storage Landscape
Blockchain has revolutionized decentralized finance, and with smart contracts it has enabled the world of non-fungible tokens, which are set to revolutionize industries such as art, collectibles, and gaming.
At their core, blockchains are distributed ledgers of cryptographically chained blocks. They can be leveraged to store information in a decentralized, secure, encrypted, durable, and available format. However, some of the challenges in blockchain stem from storage bloat: since each participating node keeps a copy of the entire chain, the same data gets replicated on every node, and even a 5 MB file stored on the chain can overwhelm the system.
In this session, Tejas will compare and contrast different blockchain implementations such as Storj, IPFS, YottaChain, and ILCOIN that allow data storage on the chain. Some techniques leveraged by these projects include storing the data off-chain while keeping only transaction metadata on the blockchain, enabling multi-level blockchains, and applying deduplication after encryption to reduce the footprint.
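The common off-chain pattern these projects share can be sketched in a few lines: keep the heavy bytes in a content-addressed store and anchor only a hash and minimal metadata on the chain (the in-memory stores below are stand-ins, not any specific project's API):

import hashlib
import time

off_chain_store = {}   # content-addressed blob store (stand-in for IPFS, S3, etc.)
chain = []             # stand-in for on-chain transaction metadata

def store_file(data: bytes, owner: str) -> str:
    cid = hashlib.sha256(data).hexdigest()      # content identifier
    off_chain_store[cid] = data                 # heavy bytes stay off-chain
    chain.append({"owner": owner, "cid": cid,   # only small metadata goes on-chain
                  "size": len(data), "ts": time.time()})
    return cid

def fetch(cid: str) -> bytes:
    data = off_chain_store[cid]
    assert hashlib.sha256(data).hexdigest() == cid   # integrity check against the chain
    return data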
Compared to traditional cloud storage, these alternatives can offer decentralization, security, and cost benefits to organizations such as Netflix, Box, and others.
Carbon Footprint Aware Software Development
Liking someone's status on Facebook or quickly searching Yelp for breakfast options nearby are some of the things we start our days with. Did you know that a simple Google search generates about 0.2 g of carbon dioxide once you consider the carbon footprint of all the components in the software stack: the browser, the backend servers, the network calls, the algorithms that run, and so on?
In this talk, I would like to make the case for carbon-footprint-aware software development.
Software development means writing massive amounts of code that runs in parallel, scales to millions of devices, and handles petabytes to exabytes of data. As blockchain technologies such as Bitcoin gain notoriety for their electricity usage, and therefore their carbon footprint, the software community has become interested in revisiting non-blockchain code and checking its impact on the environment.
For example, as software developers, writing clean and efficient code helps with debuggability and readability, but removing dead code is one of the most effective ways to reduce the CPU cycles needed to process that code and the memory it occupies; this in turn reduces the electricity it consumes. In this talk, I will cover some of the techniques software developers can leverage to write code with less impact on the environment. Furthermore, just as we write unit tests to check functionality and catch regressions, I propose adding environmental regression analysis to the CI/CD pipeline for each code check-in. We will explore the tools available to do so; these help us quantify the impact of a code change and point to "hot" spots in our code. I will briefly cover a sample use case of uploading a file to Google Drive and the amount of energy expended by the software stack.
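As one possible shape for such a pipeline step, the sketch below uses the open-source codecarbon package to estimate the emissions of a workload and fail the build when a change exceeds a budget; the budget value and the workload are placeholders:

from codecarbon import EmissionsTracker

EMISSIONS_BUDGET_KG = 0.002   # per-check-in budget; an illustrative number

def workload():
    # Stand-in for the code path under test, e.g. an integration test suite.
    sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker()
tracker.start()
workload()
emissions_kg = tracker.stop()   # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
if emissions_kg > EMISSIONS_BUDGET_KG:
    raise SystemExit("Energy regression: change exceeds the emissions budget")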
Code uses energy; bad code uses more energy. I simply want to put a thought out there for software developers: be aware of what you write, and check its impact beyond your project alone.
Demystifying NFTs
Non-fungible tokens have taken the world of digital art by storm. NFT activity has grown by a staggering 1,500%+ this year, and NFTs are set to revolutionize art and collectibles. In this session, I present the basics of NFTs and delve into industries disrupted by them: art, collectibles, and gaming. I will discuss how to dip your toes into the NFT world by minting your own unique collectible as an NFT on platforms such as OpenSea.
NFTs started off on the Ethereum blockchain, but Ethereum soon ran into scalability issues since it only supports 10-15 transactions per second. This sparked a surge of blockchains designed specifically for NFTs, such as WAX and FLOW. We will discuss how they open up new types of dApps and marketplaces, such as NBA Top Shot.
Finally, we will touch upon some of the perceived environmental impacts of NFTs for users to make informed decisions about platforms they use for trading.
Learning about Ethereum Smart Contracts
In this session, I will present the basics of Ethereum technology and how it enables embedding code via smart contracts.
We will then delve into how smart contracts can be developed using the Solidity language, and cover the basics of Solidity with examples.
Finally, we will look at some apps that are built using smart contracts and discuss ideas for the future.
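The Solidity walkthrough itself is best done live, but as a taste of the application side, the sketch below calls a read-only function on a deployed contract from Python using web3.py; the RPC endpoint, contract address, and ABI are placeholders for whatever contract the session builds:

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://example-rpc.invalid"))  # placeholder RPC endpoint

# Minimal ABI for a contract exposing a single read-only greet() function.
abi = [{"name": "greet", "type": "function", "stateMutability": "view",
        "inputs": [], "outputs": [{"name": "", "type": "string"}]}]

contract = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000",  # placeholder address
    abi=abi,
)

# A view call executes locally on the node and costs no gas.
print(contract.functions.greet().call())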
Building a Key-Value Store using LSM Trees
Key-value stores are extensively used in storage systems, both on-premise and in the cloud, to store metadata. Several key-value stores, such as RocksDB, BigTable, and Cassandra, are built on top of LSM trees, a widely used data structure for write-heavy workloads.
In this talk, I'd like to present a brief background on LSM trees, the advantages they provide by performing only sequential writes to make writes faster and avoid random I/O, and how we can build a key-value store using LSM tree levels that is both memory- and disk-aware.
We will discuss the choice of data structures such as B-trees, tries, skip lists, and Bloom filters for supporting lookups and reads; the implementation of put, get, get-next, iterate, and delete operations in our key-value store; and consistency, crash recovery, and rollback of data stored in the key-value store.
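A toy sketch of the core idea, with an in-memory memtable that flushes to immutable, sorted on-disk segments; Bloom filters, compaction, and crash recovery are deliberately left out to keep it short:

import json
import os

class TinyLSM:
    """Toy LSM-style key-value store: writes are buffered in an in-memory memtable
    and flushed to immutable, sorted on-disk segments; reads check newest data first."""

    def __init__(self, data_dir, memtable_limit=1000):
        self.data_dir, self.memtable_limit = data_dir, memtable_limit
        os.makedirs(data_dir, exist_ok=True)
        self.memtable, self.segments = {}, []

    def put(self, key, value):
        self.memtable[key] = value        # buffered in memory; disk I/O happens only at flush
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def delete(self, key):
        self.put(key, None)               # tombstone marker

    def get(self, key):
        if key in self.memtable:          # newest data wins
            return self.memtable[key]
        for path in reversed(self.segments):   # then newest segment to oldest
            with open(path) as f:
                segment = json.load(f)
            if key in segment:
                return segment[key]
        return None

    def _flush(self):
        # One sequential write of a sorted, SSTable-like segment file.
        path = os.path.join(self.data_dir, f"segment-{len(self.segments):06d}.json")
        with open(path, "w") as f:
            json.dump(dict(sorted(self.memtable.items())), f)
        self.segments.append(path)
        self.memtable = {}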
Compression, Dedupe, Encryption Conundrums in Cloud
Cloud storage footprints are measured in exabytes and growing exponentially, and companies pay billions of dollars to store and retrieve data. In this talk, we will cover some of the space and time optimizations that have historically been applied to on-premise file storage and how they can be applied to objects stored in the cloud.
Deduplication and compression are techniques that have traditionally been used to reduce the amount of storage used by applications. Data encryption is table stakes for any remote storage offering, and today cloud providers support both client-side and server-side encryption.
Combining compression, encryption, and deduplication for object stores in the cloud is challenging due to the nature of overwrites and versioning, but the right strategy can save an organization millions. We will cover strategies for employing these techniques depending on whether an organization prefers client-side or server-side encryption, and discuss online and offline deduplication of objects.
Companies such as Box and Netflix employ a subset of these techniques to reduce their cloud footprint and provide agility in their cloud operations.
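One client-side strategy can be sketched compactly: compress each chunk and derive the encryption key from the plaintext itself (convergent encryption), so identical chunks still produce identical ciphertext and remain deduplicable even though the provider never sees the data. This is a sketch under those assumptions, using the cryptography package, with chunking and key management omitted:

import hashlib
import zlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def seal_chunk(chunk: bytes):
    """Compress, then convergently encrypt: the key is derived from the plaintext,
    so identical chunks yield identical ciphertext and can be deduplicated server-side."""
    key = hashlib.sha256(chunk).digest()                 # convergent (content-derived) key
    compressed = zlib.compress(chunk)
    nonce = hashlib.sha256(compressed).digest()[:12]     # deterministic: same input, same output
    ciphertext = AESGCM(key).encrypt(nonce, compressed, None)
    object_id = hashlib.sha256(ciphertext).hexdigest()   # dedup handle visible to the cloud
    return object_id, nonce, ciphertext, key

def open_chunk(nonce: bytes, ciphertext: bytes, key: bytes) -> bytes:
    return zlib.decompress(AESGCM(key).decrypt(nonce, ciphertext, None))

Convergent encryption trades a little confidentiality (someone who already holds a chunk can confirm you store it) for server-side deduplication, which is exactly the kind of trade-off the talk explores.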
How to crack software engineering interviews at FAANG
Having gone through interviews and successfully received job offers from Amazon, Apple, Netflix, LinkedIn, Square, etc., I'd like to share ways in which software developers can prepare for interviewing.
This session would cover technical knowledge, system design, and soft skills, such as leveraging previous work experiences and life experiences to demonstrate qualities that most of these companies seek.
I'll share examples from my own interviewing experience at these companies, as well as at several Bay Area startups.
Object Compaction in Cloud for High Yield
In file systems, large sequential writes are more beneficial than small random writes, which is why many storage systems implement a log-structured file system. In the same way, the cloud favors large objects over small objects. Cloud providers place throttling limits on PUTs and GETs, so uploading a batch of small objects takes significantly longer than uploading a single large object of the same aggregate size. Moreover, there are per-PUT request charges associated with uploading smaller objects.
At Netflix, a lot of media assets and their relevant metadata are generated and pushed to the cloud. Most of these files are between tens of bytes and tens of kilobytes in size and are saved as small objects in the cloud.
In this talk, we would like to propose a strategy to compact these small objects into larger blobs before uploading them to the cloud. We will discuss the policies for selecting relevant smaller objects and how to manage the indexing of these objects within the blob. We will also discuss how different cloud storage operations, such as reads and deletes, would be implemented for such objects, including recycling blobs that contain dead small objects (due to overwrites, for example).
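A minimal illustration of the packing and indexing idea; the blob format is illustrative, and in practice the index would be persisted and reads served with ranged GETs:

import io

def pack_blob(small_objects):
    """Concatenate many small objects into one blob and record (offset, length)
    for each so individual objects can later be served via ranged GETs."""
    blob, index, offset = io.BytesIO(), {}, 0
    for key, data in small_objects.items():
        blob.write(data)
        index[key] = (offset, len(data))
        offset += len(data)
    return blob.getvalue(), index

def read_object(blob, index, key):
    offset, length = index[key]          # in the cloud this becomes a ranged GET
    return blob[offset:offset + length]

def live_ratio(index, deleted_keys):
    """Compaction policy input: recycle a blob once too much of it is dead."""
    live = sum(length for k, (_, length) in index.items() if k not in deleted_keys)
    total = sum(length for _, length in index.values())
    return live / total if total else 0.0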
Finally, we would showcase the potential impact of such a strategy on Netflix assets in terms of cost and performance.
Micro-services for a Billion people
India wastes roughly 70 million tonnes of food per year, and there is mismanagement at several layers. Approximately 20-30% of this wastage happens in the last mile, between wholesale traders and retail mom-and-pop stores. Is there something we can do about food wastage?
This was the problem statement I attempted to solve as the first engineering hire at a startup. Our customers were 12.8 million retail owners dealing in FMCG (fast-moving consumer goods, such as food grains, toothpaste, etc.). The goal was to develop a platform for retail traders (mom-and-pop shop owners and small and medium business owners) to buy FMCG products from wholesale traders using an Android app.
We were attacking a deeply entrenched business practice to help solve a societal goal. For a section of the population that is not very well versed with smartphones and technology, the user experience had to be designed from the ground up to be multi-lingual, flexible, unstructured, and relevant. In this talk, I cover how we iterated the solution from a simple SMS-based system to a full-fledged app backed by microservices. The microservice architecture gave us the agility to experiment and iterate quickly, so we could push out changes much faster and help solve wastage problems even sooner.
I will discuss the problems we faced in this segment with regard to unstructured data and how our data models had to adapt. We used cloud services extensively, so I will also cover how the different pieces came together in a cogent form to build a better experience for our customers.
Having worked at bigger companies on software projects that scale to millions of devices, I found this a unique challenge, and it is something I am very proud of. I would like to share my experience of building empathetic software for the masses.
Events
CloudBrew 2024 - A two-day Microsoft Azure event (upcoming)
Update Conference Prague 2024
Build Stuff 2024 Lithuania
MLOps + Generative AI World 2024
DevOps Vision 2023
Cloud With Chris
DevOps Days Buffalo 2023
swampUP 2023
KCDC 2023
CloudConnect 2023
swampUP 2022 City Tour - New York City
Strange Loop 2022
JCON 2022 ONLINE (virtual)
KCDC 2022
DevOpsDays Seattle 2022
DataGrillen 2022
Zero Gravity
swampUP 2022
Stir Trek 2022
NDC Porto 2022
CodeStock 2022
InnoVAte Virginia 2022
Web Day 2022
DeveloperWeek 2022
DevFest Port Harcourt 2021
JVM-Con '21
Git Commit Show - Season 03
Porto Tech Hub Conference 2021
Build Stuff 2021 Lithuania
Automation + DevOps Summit
PASS Data Community Summit 2021
Azure Community Conference 2021
2021 All Day DevOps
API World 2021
Cloud Day 2021
Tech Con '21
TechBash 2021
iSAQB Software Architecture Gathering 2021
EventSourcing Live 2021
P99 CONF 2021
JCON 2021
DataEngBytes 2021
KCDC 2021
Kafka Summit Americas 2021
SPA Conference 2021
Azure Summit
Data Saturday Oslo - Virtual
WeAreDevelopers Live
WorldFestival 2021
Subsurface LIVE Summer 2021
Agi'Lille 2021