Samir Sengupta
AI/ML ENGINEER, BUILDING AGI
New City, New York, United States
I'm an AI/ML Engineer passionate about building intelligent systems that solve real-world problems using Large Language Models, RAG architectures, and production-grade MLOps.
SyGenticAI (an AI agent start-up idea that I plan to pitch to YC)
It focuses on building agentic AI systems: imagine AI creating MicroAgents, and those agents building entire software products.
WHAT I DO:
• Design and deploy production-ready LLM applications using RAG, embeddings, and semantic search
• Build scalable ML pipelines with distributed training and model optimization (quantization, LoRA/QLoRA)
• Implement cloud-native AI solutions on AWS Bedrock, Azure OpenAI, and GCP Vertex AI
• Develop MLOps workflows with Kubernetes, Docker, MLflow, and CI/CD automation
TECHNICAL EXPERTISE:
Languages & Frameworks: Python, PyTorch, TensorFlow, LangChain, HuggingFace
LLM Technologies: RAG systems, prompt engineering, fine-tuning, quantization (GGUF/GPTQ), vLLM
Cloud & MLOps: AWS (Bedrock, SageMaker), Azure OpenAI, Kubernetes, Docker, Terraform
Data Engineering: Apache Spark, Kafka, Airflow, Databricks, Snowflake
Vector Databases: Pinecone, Weaviate, Chroma, FAISS
RECENT ACHIEVEMENTS:
→ Improved LLM reasoning accuracy by 35% on GSM8K and HumanEval benchmarks
→ Reduced ML inference costs by 50% through quantization and GPU optimization
→ Built microservices handling 10M+ daily events with 99.9% uptime
→ Designed enterprise RAG systems for large-scale semantic search
FEATURED PROJECTS:
• PrometheusAI: Privacy-first offline LLM assistant for Android (LLaMA, Granite, GGUF)
• NeuralScale: Distributed training framework with multi-GPU acceleration
• OmniGen: Multi-modal generation engine using LoRA/LCM
I'm actively seeking full-time AI/ML Engineering opportunities where I can leverage my expertise in LLMs, RAG systems, and production AI to drive business impact.
samir.s.sengupta@gmail.com
https://www.github.com/SamirSengupta
https://www.samcodeman.com
https://www.linkedin.com/in/samirsengupta/
Let's connect if you're working on cutting-edge AI problems or hiring for ML/AI roles!
Topics
Real-Time AI Pipelines at Scale: Embedding LLMs into Apache Beam for Live Inference
As AI moves from experimentation to production, the hardest challenge isn't building a model. It's getting it to run reliably on live data at scale. In this talk, I'll walk through how I architected production-grade pipelines that embed LLMs and RAG systems directly into Apache Beam, enabling real-time inference on high-velocity data streams.
We'll cover:
• How to integrate HuggingFace and vLLM models into Beam transforms for low-latency inference
• Designing a RAG pipeline inside Beam using vector databases (Pinecone, FAISS) for semantic search on streaming data
• Handling the cost and throughput challenges of running LLMs in a pipeline (quantization, batching, GPU optimization)
• Deploying the full stack on AWS Bedrock + SageMaker with Kubernetes orchestration
• Real benchmark results: how we cut inference costs by 50% while improving reasoning accuracy by 35%
This isn't a toy demo. It's a battle-tested architecture handling 10M+ daily events with 99.9% uptime. Attendees will leave with concrete patterns they can apply to fraud detection, anomaly detection, semantic search, and personalized recommendation systems.
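As a rough illustration of the batching point in this abstract, here is a minimal pure-Python sketch of micro-batching a stream before each model call. It is not the pipeline described in the talk: `micro_batches`, `batched_inference`, and `fake_model` are hypothetical names, and `fake_model` stands in for a real batched LLM call. In an actual Beam pipeline, the built-in `BatchElements` transform fills the grouping role.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def micro_batches(stream: Iterable[str], max_batch_size: int) -> Iterator[List[str]]:
    """Group a stream into lists of at most max_batch_size elements."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be >= 1")
    batch: List[str] = []
    for item in stream:
        batch.append(item)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial, batch
        yield batch


def batched_inference(
    stream: Iterable[str],
    model_fn: Callable[[List[str]], List[str]],
    max_batch_size: int,
) -> Iterator[Tuple[str, str]]:
    """Call model_fn once per batch and pair each input with its output."""
    for batch in micro_batches(stream, max_batch_size):
        outputs = model_fn(batch)
        if len(outputs) != len(batch):
            raise ValueError("model_fn must return one output per input")
        yield from zip(batch, outputs)


def fake_model(prompts: List[str]) -> List[str]:
    """Hypothetical stand-in for a batched model call (assumption, not a real API)."""
    return [p.upper() for p in prompts]


# Five inputs with a batch size of 2 lead to three model calls instead of five.
results = list(batched_inference(["a", "b", "c", "d", "e"], fake_model, 2))
```

Fewer, larger calls generally use a GPU more efficiently, at the price of some added latency while a batch fills, so the batch size is a tuning knob rather than a fixed value.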
Scaling Production RAG Systems with Kubernetes
Deploying LLM applications at scale requires reliable cloud-native infrastructure. This talk explores how to run production RAG pipelines using Kubernetes, covering vector search services, scalable inference with vLLM, and distributed embedding pipelines. We will discuss observability, cost optimization, and autoscaling strategies for real-world AI workloads running in containerized environments.
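To make the autoscaling topic concrete, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default). The function name, the replica bounds, and the choice of metric (for example, queued requests per inference pod) are illustrative assumptions, not details taken from the talk.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Replica count following the documented HPA scaling formula."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (hypothetical defaults above).
    return max(min_replicas, min(max_replicas, desired))


# Example: 4 pods each seeing 30 queued requests against a target of 10.
scaled_up = desired_replicas(4, 30.0, 10.0, max_replicas=20)
```

The real autoscaler also applies stabilization windows and handles missing metrics and not-yet-ready pods, all of which this sketch leaves out.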
Beam Summit 2026 Sessionize Event Upcoming
KCD New York 2026 Sessionize Event