Session

Engineering Agentic Workflows: Lessons from Frontier Model Evaluation and Production CI/CD

While AI code assistants have become standard, the real engineering leap lies in the transition to fully agentic workflows. Drawing on hands-on experience conducting advanced evaluation and supervised training of frontier models, including Anthropic architectures, this session bridges the gap between raw LLM capabilities and reliable software engineering lifecycles.

We will explore how to move beyond basic autocomplete and design structured scenarios that test agentic AI tools across distributed codebases and APIs. Attendees will learn how to develop robust evaluation rubrics in JSON and YAML formats to ensure models produce secure, production-ready code. The talk will also demonstrate practical applications, such as championing GitHub Copilot within daily engineering workflows to accelerate the creation of backend services, and using generative AI for faster root-cause analysis of CI/CD pipeline failures. We will also touch on implementing these techniques in end-to-end architectures, drawing on practical paradigms such as computer vision integrations for real-world systems like BioVerify AI. By understanding the reasoning limitations of frontier models, teams can apply advanced prompt engineering to build resilient, AI-assisted development pipelines.
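To make the rubric idea concrete, here is a minimal sketch of what a serializable evaluation rubric and scorer might look like. The criterion names, weights, and scoring scheme are illustrative assumptions, not material from the talk itself.

```python
# Illustrative evaluation rubric for agentic code generation, expressed as
# plain data so it can be serialized to JSON (or YAML with a YAML library).
# All criterion ids and weights below are hypothetical examples.
import json

rubric = {
    "task": "generate-backend-service",
    "criteria": [
        {"id": "compiles", "weight": 0.3, "description": "Code builds without errors"},
        {"id": "secure", "weight": 0.4, "description": "No hardcoded secrets or injection risks"},
        {"id": "tested", "weight": 0.3, "description": "Includes passing unit tests"},
    ],
}

# Portable JSON form of the rubric, suitable for version control.
rubric_json = json.dumps(rubric, indent=2)

def score(results: dict) -> float:
    """Weighted pass/fail score: results maps criterion id -> bool."""
    return sum(c["weight"] for c in rubric["criteria"] if results.get(c["id"], False))

# Example: a model output that compiles and is secure, but lacks tests.
print(round(score({"compiles": True, "secure": True}), 2))  # 0.7
```

Keeping the rubric as data rather than code is what lets the same criteria be shared across evaluation runs and reviewed like any other configuration file.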

Mukul Gupta

Mukul Gupta | Software Engineer/AI Consultant

Kanpur, India

