Samir Sengupta

AI/ML ENGINEER, BUILDING AGI

New City, New York, United States

I'm an AI/ML Engineer passionate about building intelligent systems that solve real-world problems using Large Language Models, RAG architectures, and production-grade MLOps.

SyGenticAI (an AI agent start-up idea that I am planning to pitch to YC)
It focuses on building agentic AI systems: imagine AI creating MicroAgents, and those agents building entire software systems.

WHAT I DO:
• Design and deploy production-ready LLM applications using RAG, embeddings, and semantic search
• Build scalable ML pipelines with distributed training and model optimization (quantization, LoRA/QLoRA)
• Implement cloud-native AI solutions on AWS Bedrock, Azure OpenAI, and GCP Vertex AI
• Develop MLOps workflows with Kubernetes, Docker, MLflow, and CI/CD automation

TECHNICAL EXPERTISE:
Languages & Frameworks: Python, PyTorch, TensorFlow, LangChain, HuggingFace
LLM Technologies: RAG systems, prompt engineering, fine-tuning, quantization (GGUF/GPTQ), vLLM
Cloud & MLOps: AWS (Bedrock, SageMaker), Azure OpenAI, Kubernetes, Docker, Terraform
Data Engineering: Apache Spark, Kafka, Airflow, Databricks, Snowflake
Vector Databases: Pinecone, Weaviate, Chroma, FAISS

RECENT ACHIEVEMENTS:
→ Improved LLM reasoning accuracy by 35% on GSM8K and HumanEval benchmarks
→ Reduced ML inference costs by 50% through quantization and GPU optimization
→ Built microservices handling 10M+ daily events with 99.9% uptime
→ Designed enterprise RAG systems for large-scale semantic search

FEATURED PROJECTS:
• PrometheusAI: Privacy-first offline LLM assistant for Android (LLaMA, Granite, GGUF)
• NeuralScale: Distributed training framework with multi-GPU acceleration
• OmniGen: Multi-modal generation engine using LoRA/LCM

I'm actively seeking full-time AI/ML Engineering opportunities where I can leverage my expertise in LLMs, RAG systems, and production AI to drive business impact.

samir.s.sengupta@gmail.com
https://www.github.com/SamirSengupta
https://www.samcodeman.com
https://www.linkedin.com/in/samirsengupta/

Let's connect if you're working on cutting-edge AI problems or hiring for ML/AI roles!

Area of Expertise

Business & Management
Finance & Banking
Health & Medical
Information & Communications Technology
Media & Information

Topics

Artificial Intelligence (AI)
Machine Learning and Artificial Intelligence
Artificial Intelligence (AI) and Machine Learning
Generative AI

Real-Time AI Pipelines at Scale: Embedding LLMs into Apache Beam for Live Inference

As AI moves from experimentation to production, the hardest challenge isn't building a model. It's getting it to run reliably on live data at scale. In this talk, I'll walk through how I architected production-grade pipelines that embed LLMs and RAG systems directly into Apache Beam, enabling real-time inference on high-velocity data streams.

We'll cover:
• How to integrate HuggingFace and vLLM models into Beam transforms for low-latency inference
• Designing a RAG pipeline inside Beam using vector databases (Pinecone, FAISS) for semantic search on streaming data
• Handling the cost and throughput challenges of running LLMs in a pipeline (quantization, batching, GPU optimization)
• Deploying the full stack on AWS Bedrock + SageMaker with Kubernetes orchestration
• Real benchmark results: how we cut inference costs by 50% while improving reasoning accuracy by 35%

This isn't a toy demo. It's a battle-tested architecture handling 10M+ daily events with 99.9% uptime. Attendees will leave with concrete patterns they can apply to fraud detection, anomaly detection, semantic search, and personalized recommendation systems.
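
To make the batching point in the abstract concrete, here is a minimal sketch of micro-batching a stream before calling a model. It is an illustration only, not the pipeline from the talk: `batched`, `run_batched_inference`, and `fake_model` are hypothetical names, and the model is a stand-in function. Apache Beam provides its own `BatchElements` and `RunInference` transforms for this job; plain Python is used here so the idea runs without Beam installed.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size items from the stream, in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_batched_inference(
    stream: Iterable[str],
    model_fn: Callable[[List[str]], List[str]],
    batch_size: int,
) -> Iterator[Tuple[str, str]]:
    """Call model_fn once per batch and pair each input with its output."""
    for batch in batched(stream, batch_size):
        outputs = model_fn(batch)
        if len(outputs) != len(batch):
            raise ValueError("model_fn must return one output per input")
        yield from zip(batch, outputs)


# Hypothetical stand-in for a real model call (assumption: the real step
# would be a vLLM or HuggingFace generation call that accepts a batch).
def fake_model(prompts: List[str]) -> List[str]:
    return [p.upper() for p in prompts]


events = ["login", "purchase", "refund", "logout", "login"]
results = list(run_batched_inference(events, fake_model, batch_size=2))
```

With a batch size of 2, the five events above trigger three model calls instead of five. Larger batches tend to improve GPU utilization at the cost of added per-element latency, so the batch size is a tuning parameter rather than a fixed choice.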

Scaling Production RAG Systems with Kubernetes

Deploying LLM applications at scale requires reliable cloud-native infrastructure. This talk explores how to run production RAG pipelines using Kubernetes, covering vector search services, scalable inference with vLLM, and distributed embedding pipelines. We will discuss observability, cost optimization, and autoscaling strategies for real-world AI workloads running in containerized environments.
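
As a rough illustration of the autoscaling topic, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1 (0.1 by default). The function name, the choice of queued requests per pod as the metric, and the min/max bounds are assumptions made for the example, not details from the talk.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Return the replica count the HPA formula would request.

    current_metric and target_metric are per-pod averages, for example
    queued requests per inference pod (a hypothetical metric choice).
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90 queued requests against a target of 60 per pod.
scaled_up = desired_replicas(4, current_metric=90, target_metric=60)
```

Here `scaled_up` is 6, because 4 × 90 / 60 = 6. For GPU-backed inference pods, the upper bound usually matters as much as the formula, since each extra replica carries a significant cost.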

Beam Summit 2026 (Sessionize event, upcoming)
June 2026, New York City, New York, United States

KCD New York 2026 (Sessionize event)
June 2026, New York City, New York, United States
