Session
LLMs as a Judge: Using LLMs and Evaluation Frameworks for Model Improvement
In this 1.5-hour hands-on workshop, you will learn how to design and implement an evaluation-driven LLMOps pipeline, and how to feed signals from live monitoring back into the evaluation pipeline.
We will work through a real-world example of a RAG pipeline and integrate open-source evaluation frameworks such as RAGAS, Evidently, LangSmith, and Opik.
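To give a flavour of the integration step, the snippet below is a minimal sketch of scoring a single RAG sample with RAGAS. It assumes the classic `ragas.evaluate` API over a Hugging Face `Dataset` and an OpenAI API key for the underlying judge model; the sample data, column names, and chosen metrics are illustrative and may differ between RAGAS versions and from the material used in the session.

```python
# Illustrative only: evaluate one RAG sample with RAGAS (faithfulness + answer relevancy).
# Assumes `pip install ragas datasets` and OPENAI_API_KEY set for the judge LLM.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

sample = {
    "question": ["What is the capital of France?"],
    "contexts": [["Paris is the capital and largest city of France."]],
    "answer": ["The capital of France is Paris."],
}

dataset = Dataset.from_dict(sample)
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)  # per-metric scores for the sample
```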
We will also demonstrate how to rigorously evaluate model outputs, monitor their behavior, and implement human-in-the-loop assessment for continuous model improvement.
In addition, we will cover:
1. Why LLM as a Judge works and when (and when not) to use it.
2. How to write evaluation prompts for binary scoring, chain-of-thought reasoning, and structured output (see the sketch after this list).
3. How to manage judge bias, verify scores against human ground truth, and avoid common pitfalls when scaling and implementing evaluation methods in an LLMOps pipeline.
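As an illustration of point 2, here is one possible judge prompt combining chain-of-thought reasoning, binary scoring, and structured JSON output, using the OpenAI chat completions API. The prompt wording, judge model name, and JSON schema are assumptions made for this sketch, not the exact material from the session.

```python
# Illustrative LLM-as-a-Judge sketch: binary faithfulness scoring with JSON output.
# Assumes `pip install openai` and OPENAI_API_KEY set; the model name is a placeholder.
import json
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are an impartial evaluator.
Question: {question}
Retrieved context: {context}
Model answer: {answer}

Reason step by step about whether the answer is fully supported by the context.
Then respond in JSON: {{"reasoning": "<short justification>", "faithful": 0 or 1}}."""

def judge(question: str, context: str, answer: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder judge model
        response_format={"type": "json_object"},  # structured output
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, context=context, answer=answer)}],
    )
    return json.loads(response.choices[0].message.content)

verdict = judge(
    "What is the capital of France?",
    "Paris is the capital and largest city of France.",
    "The capital of France is Paris.",
)
print(verdict["faithful"], verdict["reasoning"])
```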