Emanuele Fabbiani
Head of AI at xtream, Professor at Catholic University of Milan
Milan, Italy
Emanuele is an engineer, researcher, and entrepreneur with a passion for artificial intelligence.
He earned his PhD by exploring time series forecasting in the energy sector and spent time as a guest researcher at EPFL in Lausanne. Today, he is co-founder and Head of AI at xtream, a boutique company that applies cutting-edge technology to solve complex business challenges.
Emanuele is also a contract professor in AI at the Catholic University of Milan. He has published eight papers in international journals and contributed to over 30 international conferences. His engagements include AMLD Lausanne, ODSC London, WeAreDevelopers Berlin, PyData Berlin, PyData Paris, PyCon Florence, the Swiss Python Summit in Zurich, and Codemotion Milan.
Emanuele has been a guest lecturer at Italian, Swiss, and Polish universities.
Topics
Inside the Mind of an LLM
Recent studies in 2024 have revolutionised our understanding of large language models (LLMs).
This talk explores three key discoveries.
First, research shows Llama 2 models use English as their internal representation regardless of input/output language, explaining certain biases.
Second, breakthroughs by Anthropic and OpenAI have revealed monosemantic features in Claude 3 and GPT-4, enabling better understanding and adjustment of topic-specific behaviours.
Third, studies demonstrate why LLMs memorise outlier data, particularly unique strings and personal information, explaining instances of privacy breaches. We'll discuss implications for LLM privacy and security.
Attendees will walk away with a deeper understanding of the inner workings of LLMs, and with hints to mitigate their intrinsic limitations.
First delivered at Codemotion Conference 2024, Milan, Italy
Also held at WeAreDevelopers 2025, Berlin, Germany
Also held at PapersWeLove Milan 2024, Milan, Italy
Recorded session at https://youtu.be/m5qY4GNFEsA?si=a1SvJQYFVeQIcKZo
From SHAP to EBM: Explain your Gradient Boosting Models in Python
Imagine you’ve just developed a credit rating model for a bank. The backtesting metrics look fantastic, and you get the green light to go live. But then, the first loan applications start rolling in, and suddenly your model’s decisions are under scrutiny.
Why was this application rejected? Why was that one approved? You must have answers—and fast.
This session has you covered.
We’ll dive into SHAP (SHapley Additive exPlanations) and EBM (Explainable Boosting Machine), two widely used methods for interpreting tree-based ensemble models like XGBoost. You’ll learn about their theory, strengths, and limitations, and see them in action with Python examples using the `shap` and `interpret` (InterpretML) libraries.
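To make the idea behind SHAP concrete, here is a minimal sketch that computes exact Shapley values by brute force for a single prediction. It is not the algorithm used by the `shap` library, which relies on much faster model-specific methods. The toy scoring function, the feature names, and the baseline applicant are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction (exponential cost, toy sizes only).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition with `size` members
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                with_i = [x[j] if (j in coalition or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_credit_score(features):
    # Hypothetical model: linear terms plus one interaction
    income, debt, years_employed = features
    return (0.5 * income - 0.8 * debt + 0.2 * years_employed
            + 0.1 * income * years_employed)


applicant = [4.0, 2.0, 3.0]   # hypothetical applicant
baseline = [3.0, 1.0, 5.0]    # hypothetical "average" applicant
phi = shapley_values(toy_credit_score, applicant, baseline)
```

The contributions in `phi` add up to the difference between the applicant's score and the baseline score, which is the additivity property that gives SHAP its name.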
From understanding feature contributions to hands-on coding, this talk will equip you with practical tools to make complex models transparent, understandable, and ready for critical applications.
First delivered at Swiss Python Summit 2024, Zurich, Switzerland
Also held at Kaggle Days Milan 2024, Italy
Recorded session at: https://youtu.be/hnZjw77-1rE?si=9iz2KXMBoIDPQ-9a
How to Behave in a Team? Heuristics from Game Theory
Everyone knows about the prisoner's dilemma, where two prisoners choose to either work together or betray each other. Betrayal is the dominant strategy for each prisoner, yet working together would leave both better off.
But what if they have to repeat the choice many times? A famous 1984 study by Axelrod found that the best strategies were nice, retaliatory, forgiving, and non-envious.
Using this study, we'll explore ways to work well in a team. We'll see why it's better to work together than compete, and why solving conflicts immediately beats waiting for scheduled meetings, such as 1-on-1s or performance reviews.
We'll discuss the challenge of poor communication, its theoretical and practical effects, and offer strategies for managers and team members to mitigate the issue.
There's no magic formula to create team harmony, but we'll provide practical, math-backed tips to help improve team dynamics.
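The repeated prisoner's dilemma is easy to simulate. The sketch below pits tit-for-tat, a strategy that cooperates first and then mirrors the opponent's last move, against itself and against a player who always defects. The payoff values (3, 5, 1, 0) and the 200-round length follow the setup commonly reported for Axelrod's tournaments; the function names are illustrative.

```python
# Payoffs as (points for A, points for B); "C" = cooperate, "D" = defect
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opponent_history):
    # Cooperate first, then copy the opponent's previous move
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    return "D"


def play(strategy_a, strategy_b, rounds=200):
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        points_a, points_b = PAYOFFS[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


cooperative_pair = play(tit_for_tat, tit_for_tat)
exploited_pair = play(tit_for_tat, always_defect)
```

Two tit-for-tat players earn far more each than either player in the match against the defector, where retaliation kicks in after the first round.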
First delivered at Futuru 2024, Iglesias, Italy
Recorded session at: https://youtu.be/ua6Pcpf0fAQ?si=_LYvJ5Ts2_9HRyf4
System Design for the GenAI Era
We all learnt the basic modules of system design. Relational database, Blob storage, API Gateway, and Distributed Logging are familiar concepts. However, the rise of Generative AI applications brings a new set of challenges and tools to solve them.
In this talk, we will explore the new modules essential for developing Generative AI (GenAI) applications, addressing their unique challenges.
We will begin by talking about why using Large Language Models (LLMs) alone is insufficient. Next, we'll introduce guardrails, which are crucial for sanitizing both the input and output of LLMs and diffusion models. We'll then cover prompt compression techniques, designed to reduce the cost of generation.
Following this, we'll present prompt registries, which enable over-the-air A/B testing and updates of prompts, and the role of observability, highlighting tools such as OpenLLMetry, a new standard based on OpenTelemetry.
We'll also examine how to enhance LLM capabilities with function calls, allowing LLMs to interact as agents and communicate with other external systems.
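As a rough illustration of function calling, the sketch below assumes the model emits a JSON object naming a tool and its arguments, and the application dispatches it to a registered Python function. The JSON shape, the `get_weather` tool, and the `dispatch` helper are hypothetical and do not reflect any specific vendor API.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical tool: a real one would call an external service
    return f"Sunny in {city}"


# Registry of the functions the model is allowed to call
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Refuse anything outside the registry instead of guessing
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Assumed model output, hard-coded here in place of a real LLM response
model_output = '{"name": "get_weather", "arguments": {"city": "Turin"}}'
result = dispatch(model_output)
```

Keeping an explicit registry means the model can only trigger functions the application has chosen to expose.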
During the discussion, we will show code examples in Python.
By the end of this talk, attendees will have a clear understanding of the main modules used in building GenAI applications and gain valuable insights into designing their own.
First delivered at AI Heroes 2024, Turin, Italy
Beyond ChatGPT: RAG and Fine-Tuning
In most real-world applications, ChatGPT alone is insufficient. Businesses seek to utilize their own private documents to obtain factually accurate answers. Over the past year, two techniques have emerged to address this issue.
Retrieval Augmented Generation (RAG) employs text embedding to identify relevant snippets and incorporate them into a prompt for a Large Language Model (LLM) to expand upon. Conversely, fine-tuning involves updating the weights of the LLM with training episodes based on specific documents. Since training LLMs is notoriously costly, fine-tuning often incorporates advanced methods such as low-rank adaptation and quantization.
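The retrieval step of RAG can be sketched in a few lines. The example below uses a bag-of-words count vector as a toy stand-in for a real text embedding model, ranks snippets by cosine similarity, and assembles a prompt. The documents, the prompt wording, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for an embedding model: word counts
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, snippets, top_k=1):
    """Return the top_k snippets most similar to the question."""
    query = embed(question)
    ranked = sorted(snippets,
                    key=lambda s: cosine_similarity(query, embed(s)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question, snippets):
    context = "\n".join(f"- {s}" for s in retrieve(question, snippets))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "Refunds are processed within 14 days of the return request.",
    "Our offices are closed on public holidays.",
    "Shipping to Switzerland takes three working days.",
]
prompt = build_prompt("How long do refunds take?", documents)
```

In a real system the count vectors would be replaced by dense embeddings and the sorted list by a vector index, but the flow of embed, rank, and prompt stays the same.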
This lecture will delve into both RAG and fine-tuning, discussing the latest techniques for achieving optimal results. We will examine the pros and cons of each technique and discuss real-world applications for both.
Attendees will leave with a thorough understanding of the primary techniques used to enhance and ground LLM knowledge, along with insights into their main industry applications.
First held at UniPV guest lecture 2024, Pavia, Italy
From Embeddings to Transformers: Build Your Own GPT
ChatGPT has garnered immense success and is already transforming the daily routines of numerous professionals across various industries. But with great power comes great responsibility, which can only be achieved by knowing the tools we use.
We invite you to embark on a journey with us, where we demystify one such tool – the Generative Pre-trained Transformer, or GPT.
You might have heard of its capabilities, but what exactly is it? More importantly, what is it not? Our workshop will first lay the foundation by providing a clear, concise explanation of what language models essentially are: we'll delve into the intricacies of their inner workings, separating myths from facts. We will cover all the bases: from tokenisation and word embeddings to the training and fine-tuning of state-of-the-art language models, such as OpenAI’s latest GPT-4.
While theory is a powerful first step, we believe that true understanding comes from hands-on experience. What if we told you that by the end of our session, you would be able to code your very own GPT? Our expert-guided workshop will ultimately have you build your own generative model. While your model will be simplified, small-scale, and not nearly as capable as the state of the art, this experience will equip you with a deep understanding of the potential and limitations of this technology.
Workshop held at Applied Machine Learning Days 2024, Lausanne, Switzerland
Academic and Business Research in AI for Energy
Extensive academic literature covers the topic of power and natural gas demand forecasting. Years of study and hundreds of students have proposed many forecasting approaches, providing a solid foundation for practitioners.
Unfortunately, constraints imposed by real-world scenarios often prevent a straightforward application of existing contributions. For instance, reliable demand data may be published with a delay of weeks or even months, thus compromising the application of many short-term models, and forecasts must be used for weather features, impacting the overall performance.
Using case studies from real projects, we point out the most important gaps between requirements from the industry and academic works and propose solutions to mitigate the issues. We report here three examples.
Concerning gas demand forecasting, we show how temperature forecasting errors impact the accuracy of different models, with both theoretical derivation and practical experiments. We propose an algorithm for the computation of a "similar day", to compensate for the lack of recent data using seasonality. Finally, we show how different models are better at capturing different properties of the series, and how ensembling approaches may improve the overall accuracy.
Besides the proposed examples, the talk is intended to point out that the literature survey is a critical yet preliminary phase of a Machine Learning project aiming at creating business value. Existing work must often be modified and extended to comply with additional constraints and more complex datasets.
Delivered at Applied Machine Learning Days, 25-29 January 2020, Lausanne, Switzerland
Recorded session at https://www.youtube.com/watch?v=FZwOrGD3agA
Slides at https://www.slideshare.net/EmanueleFabbiani/academic-and-business-research-in-ai-for-energy
tsviz: a Data-Scientist-Friendly Addin for RStudio
In recent years, charting libraries have evolved following two main directions. First, they provided users with as many features as possible and second, they added high-level APIs to easily create the most frequent visualizations. RStudio, with its addins, offers the opportunity to further ease the creation of common plots.
Born as an internal project in xtream, tsviz is an open-source Shiny-based addin which contains powerful tools to perform explorative analysis of multivariate time series.
Its usage is dead simple. Once launched, it scans the global environment for suitable variables. You choose one, and several plots of the time series are shown. Line charts, scatter plots, autocorrelograms, and periodograms are only a few examples. Interactivity is achieved by the miniUI framework and the adoption of Plotly charts.
Its wide adoption among our customers and the overall positive feedback we received demonstrate how addins, usually thought of as shortcuts for developers, may provide effective support to data scientists in performing their routine tasks.
Delivered at eRum 2020, 12-20 June 2020, Milan, Italy
Recorded session: https://youtu.be/t8PZbP5b8EM
Reference post: https://towardsdatascience.com/introducing-tsviz-interactive-time-series-visualization-in-r-studio-a96cde507a14?gi=ac1810e1f6df
Reference repository: https://github.com/xtreamsrl/tsviz
Forecasting Gas Demand: a Machine Learning Approach
Forecasting gas demand is critical to pipe reservation and price forecasting.
After presenting a statistical characterization of residential, industrial, and thermoelectric gas demand, we apply and compare several statistical learning models for day-ahead forecasting. Different ensemble models are also considered.
A considerable improvement over the forecasts performed by SNAM, the Italian transmission system operator, is achieved.
Delivered at P-Value Meetup, Pavia, Italy, 29 June 2020
Summary and slides at https://www.meetup.com/it-IT/Data-Science-Meetup-Pavia/events/262410495/
Machine Learning methods for the Identification of Power Grids
The increasing integration of intermittent renewable generation in power networks calls for novel planning and control methodologies, which hinge on detailed knowledge of the grid.
However, reliable information concerning the system topology and parameters may be missing or outdated for temporally varying AC networks. The talk proposes an online learning procedure to estimate the admittance matrix of an AC network capturing topological information and line parameters.
We start off by providing a recursive identification algorithm that exploits phasor measurements of voltages and currents. With the goal of accelerating convergence, we subsequently complement our base algorithm with a design-of-experiment procedure, which maximizes the information content of data at each step by computing optimal voltage excitations.
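To give a flavour of recursive identification from phasor data, the sketch below estimates the admittance of a single line with scalar recursive least squares, assuming the model I = Y V. It is a deliberately reduced illustration, not the algorithm presented in the talk: it ignores the full admittance matrix, measurement noise, and the design-of-experiment step. All numbers are made up.

```python
def rls_admittance(voltages, currents, initial_covariance=1e6):
    """Recursively estimate a complex admittance Y from phasor pairs (V, I).

    Each new measurement refines the previous estimate without
    reprocessing the earlier data.
    """
    estimate = 0j
    covariance = initial_covariance  # large value = little trust in the initial guess
    for v, i in zip(voltages, currents):
        # Covariance shrinks as more measurement energy |V|^2 is accumulated
        covariance = covariance / (1.0 + covariance * abs(v) ** 2)
        # Correct the estimate in proportion to the prediction error
        estimate += covariance * v.conjugate() * (i - v * estimate)
    return estimate


true_admittance = 0.5 - 2.0j  # hypothetical line admittance
voltages = [1.0 + 0.0j, 0.98 + 0.05j, 1.02 - 0.03j, 0.97 + 0.01j]
currents = [true_admittance * v for v in voltages]  # noise-free measurements
estimate = rls_admittance(voltages, currents)
```

With noise-free data the estimate lands very close to the true value after a handful of samples; with noisy data, more informative voltage excitations would speed up convergence, which is the motivation for the design-of-experiment step mentioned above.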
Our approach improves on existing techniques and its effectiveness is substantiated by numerical studies.
First delivered at End of Semester Talks, 2020, Lucerne University of Applied Sciences and Arts, Lucerne, Switzerland
AI & Psychology: can they nudge towards wellbeing?
Documentaries like ‘The Social Dilemma’ (Netflix, 2020) raise an important question: can machine-mediated interaction lead to manipulation? The risk is there, but the mix of social psychology and artificial intelligence can also result in meaningful benefits for individuals and communities. Inclusivity, transparency and explainability are key to mitigating the risks while maximising the positive effects.
How can researchers and practitioners follow this path? Psychologists, AI researchers, and engineers from both the academy and the industry discuss the topic, starting from concrete cases of applied research in the fields of nutrition, medicine, and financial services.
Delivered at Milan Digital Week, 17-21 March 2021, Milan, Italy
AI & Sustainable Energy
Sustainable energy is going to be a critical challenge in the next few years. There’s no denying that. Machine Learning and AI have already shown their potential in helping the transition towards new energy paradigms, but many problems are still unsolved and innovation needs to progress at a fast pace.
This is why the collaboration between researchers and practitioners is crucial and this is why we organized a dedicated track at the Applied Machine Learning Days 2021.
At one of the leading conferences in Europe for Machine Learning, I was in charge of the track: choosing and inviting keynote speakers; writing, publishing, and spreading the public call for speakers; and managing the sessions during the conference.
Held at Applied Machine Learning Days, on 26th April 2021 and 29th March 2022, Lausanne, Switzerland
Recording available at https://www.youtube.com/c/AppliedMachineLearningDays/videos (track AI & Sustainable Energy)
MLOps on AWS: a Hands-On Tutorial
Applying machine learning in the real world is hard: reproducibility gets lost, datasets are dirty, data flows break down, and the context where models operate keeps evolving. In the last 2-3 years, the emerging MLOps paradigm provided a strong push towards more structured and resilient workflows.
MLOps is about supporting and automating the assessment of model performance, model deployment and the following monitoring. Valuable tools for an effective MLOps process are data version trackers, model registries, feature stores, and experiment trackers.
During the workshop, we will showcase the challenges of “applied” machine learning and the value of MLOps with a practical case study. We will develop an ML model following MLOps best practices, from raw data to production deployment. Then, we will simulate a further iteration of development, resulting in better performance, and we will appreciate how MLOps allows for easy comparison and evolution of models.
AWS will provide the tools to effectively implement MLOps: the workshop is also intended to offer an overview of the main resources of the cloud platform and to show how they can support model development and operation.
Held at Applied Machine Learning Days, 26th March 2022, Lausanne, Switzerland
Repo: https://github.com/xtreamsrl/amld22-mlops-on-aws
The Hitchhiker's Guide to asyncio
asyncio is the de-facto standard for asynchronous programming in Python and enables concurrent operations without using threads or processes.
In this talk, we will delve into the technical details of asyncio and show how it can be used to improve the performance of Python applications. We will start by discussing the difference between threading, multiprocessing and async programming. Then, we will introduce the basic building blocks of asyncio: Event loops and Coroutines. We will dive deep into the way Coroutines work, discussing their origins and how they are linked to Generators.
Next, we will look at Tasks, which are a higher-level abstraction built on top of Coroutines. Tasks make it easy to schedule and manage the execution of Coroutines. We will cover how to create and manage Tasks and how they can be used to write concurrent code.
Finally, we will also cover some more advanced topics such as Async Loops and Context Managers, and how to handle errors and cancellations in asyncio.
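A minimal sketch of these building blocks: two coroutines are wrapped in Tasks so they run concurrently on the event loop, the fast one is awaited, and the slow one is cancelled, with the resulting `CancelledError` handled explicitly. The coroutine name and the delays are illustrative.

```python
import asyncio


async def fetch(name, delay):
    # A coroutine: it suspends at `await` and lets the event loop run other work
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # create_task schedules both coroutines on the event loop right away
    fast = asyncio.create_task(fetch("fast", 0.01))
    slow = asyncio.create_task(fetch("slow", 10))

    result = await fast

    # Cancel the slow task instead of waiting ten seconds for it
    slow.cancel()
    try:
        await slow
    except asyncio.CancelledError:
        slow_status = "slow cancelled"
    return result, slow_status


# asyncio.run creates the event loop, runs main() to completion, and closes the loop
outcome = asyncio.run(main())
```

Awaiting a cancelled Task raises `asyncio.CancelledError` in the awaiting code, so the caller decides whether the cancellation is expected or should propagate.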
First delivered at Python Milan, 8 March 2023, Milan, Italy
Also held at PyCon Italy, 24-28 May 2023, Florence, Italy
Also held at the Swiss Python Summit 2024, Zurich, Switzerland
Recorded session at https://youtu.be/q3nTbrLp4Mc?si=U9N7pWBPQ3vwJbJl
Time Series Forecasting on AWS
Applying machine learning in the real world is hard: reproducibility gets lost, datasets are dirty, data flows break down, and the context where models operate keeps evolving. In the last 2-3 years, the emerging MLOps paradigm provided a strong push towards more structured and resilient workflows.
In this talk, we show how to build custom models using SageMaker and its MLOps facilities. We will explore modules such as the experiment tracker, model registry, feature store, and model deployment. As a case study, we will use the forecasting of the Italian power load.
First delivered at AWS User Group Meetup, 18 April 2023, Milan, Italy
Serverless Computing for Mathematical Optimization
Solving large optimization problems is challenging, yet when parallelization is possible, serverless computing can drastically reduce the elapsed time.
In this talk, we consider a real case study: creating power market scenarios by finding the optimal way of using power storage facilities.
This results in a convex optimization problem with hundreds of thousands of variables. Despite parallelization, a conventional solution requires around two hours on a 16-core virtual machine.
We discuss how different serverless approaches can be adopted to reduce the elapsed time and demonstrate how cloud functions help cut it by 96% while keeping costs as low as 3€ per run. Additionally, we show how to connect the optimization module to the rest of the data processing pipeline, implemented in Airflow.
By the end of this talk, you will have a better understanding of the capabilities of serverless for high-performance computing.
First delivered at PyCon Italy, 24-28 May 2023, Florence, Italy
Also held at Cloud Day 2024, Milan, Italy
Also held at Serverless Days Rome 2024, Rome, Italy
Recorded session at https://youtu.be/JAOK9Zut_R4
Should You Trust Your Copilot? Limitations and Merits of AI Coding Assistants
More and more developers are using AI coding assistants in their daily work. GitHub Copilot got 400,000 subscribers in the first month and was praised by influential engineers, including Guido van Rossum.
Several studies confirm that AI assistants increase productivity, but other works raise concerns about their adoption.
The Free Software Foundation complained about the training process of Copilot, which allegedly violated the licensing of open-source code. The ACM warned that adopting coding assistants in education may lead to over-dependence and a decrease in learners' understanding. Finally, in a recent preprint, researchers from Stanford describe a controlled experiment suggesting that coding assistants may increase security-related bugs.
By providing an unbiased and independent review of the literature, this talk aims to inform the debate about the trustworthiness of AI coding assistants and to provide insights into their future evolution.
First delivered at Pisa.dev, 19th June 2023, Pisa, Italy
Also held at ODSC Europe, 14-15 June 2023, London, UK
Also held at WeAreDevelopers World Congress, 26-28 July 2023, Berlin, Germany
Recorded session at https://www.youtube.com/live/JHAqIgBo-q8?feature=share
Embeddings, Transformers, RLHF: Three Key Ideas to Understand ChatGPT
ChatGPT has achieved tremendous success and is already transforming the daily routines of many professionals across various industries.
While countless articles highlight the "30 must-know commands," few delve into the actual workings of the technology behind ChatGPT. To understand it, it's essential to grasp three key concepts:
- Embeddings: These represent words and phrases numerically, allowing large language models like GPT to process natural language.
- Transformers: The core component of large language models. Using the attention mechanism, they can focus on semantically related words even when they appear distant from one another.
- RLHF (Reinforcement Learning from Human Feedback): This technique is employed to train models on extensive datasets of questions and answers with minimal human supervision.
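The first two ideas can be combined in a small sketch: words are mapped to hand-made two-dimensional embeddings, and scaled dot-product attention computes how much each word attends to the others. The vectors are invented for illustration, and the learned query, key, and value projections of a real transformer are omitted.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: "cat" and "kitten" point in similar directions
embeddings = {
    "cat": [1.0, 0.0],
    "sat": [0.0, 1.0],
    "kitten": [0.9, 0.1],
}
tokens = ["cat", "sat", "kitten"]
vectors = [embeddings[t] for t in tokens]
outputs, weights = attention(vectors, vectors, vectors)
```

In `weights[0]`, the row for "cat", the weight on "kitten" is larger than the weight on the adjacent "sat": attention follows semantic similarity rather than position.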
In this talk, Emanuele Fabbiani from xtream will provide a concise yet thorough introduction to embeddings, transformers, and RLHF. He'll describe the technology powering ChatGPT, enabling the audience to harness the tool to its fullest potential.
First delivered at Talks at Buildo, 12 July 2023, Milan, Italy
Also held at BI Digital, 7 October 2023, Biella, Italy
Also held at Talks at BitRocket, 10 November 2023, Palermo, Italy
Also held at Boolean Masterclass, 21 November 2023, Milan, Italy
Also held at SIIAM Congress, 7 December 2023, Rome, Italy
Also held at Futuru 2024, 11 May 2024, Iglesias, Sardinia
Recorded session at https://youtu.be/m5qY4GNFEsA?si=a1SvJQYFVeQIcKZo
Tool, author, or danger? The role of generative AI in academic research
In the 2023 study titled "Can Linguists Distinguish Between ChatGPT/AI and Human Writing? A Study of Research Ethics and Academic Publishing," a survey revealed that 22% of academic editors are opposed to incorporating Generative AI in scholarly research. However, the same study also indicated that distinguishing AI-generated text from human-written text remains a challenge for both individuals and machines.
Responding to instances where ChatGPT was credited as an author, the prestigious journal 'Nature' revised its authorship guidelines in mid-2023. The new policy states that "Large Language Models do not currently meet our criteria for authorship," effectively disallowing non-human authors.
This talk delves into the controversial role of generative AI in academic research. Is it merely a tool, a potential author, or a threat to be avoided? We will examine recent studies and present diverse viewpoints to shed light on this issue. Our discussion will argue that while generative AI cannot be regarded as an author, at least in the foreseeable future, its value as a research tool is undeniable and cannot be ignored. The industry's use of AI underscores its significant utility, emphasizing the need for its strategic integration into research methodologies.
Participants of this talk will gain insights into incorporating generative AI tools in their research, along with an understanding of the strengths and limitations of current AI technologies.
First delivered at SIIAM Congress, 7 December 2023, Rome, Italy
Recorded session at: https://youtu.be/7iDjcGvc5_Y?si=ufSKJISxtVfiR7nb
Ordo Ab Chao: the magic of DALL-E and Midjourney
Have you ever laughed at a picture of the Pope drinking beer or holding a gun? Then you've witnessed the capabilities of Diffusion Models such as Midjourney, Stable Diffusion, and DALL-E.
This talk will discuss why Diffusion Models were developed, how they work, and what makes them so effective at creating images based on a prompt in natural language.
We will delve into the processes of forward and reverse diffusion, explaining how the AI models build realistic pictures from nothing but noise. We will also explain some of their weird features: for instance, why it is so hard for such models to reproduce text and create realistic hands.
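Forward diffusion can be sketched in closed form. The example assumes a DDPM-style formulation with a linear noise schedule, which is one common choice and not necessarily what Midjourney, Stable Diffusion, or DALL-E use. The four-pixel "image" and the function names are purely illustrative.

```python
import math
import random


def linear_betas(steps, start=1e-4, end=0.02):
    # Amount of noise injected at each step, growing linearly
    return [start + (end - start) * t / (steps - 1) for t in range(steps)]


def signal_fraction(betas):
    # Cumulative product of (1 - beta): how much of the original signal survives
    kept, result = 1.0, []
    for beta in betas:
        kept *= 1.0 - beta
        result.append(kept)
    return result


def forward_diffuse(pixels, t, alpha_bars, rng):
    """Jump directly to step t: scale the image down and add Gaussian noise."""
    keep = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [keep * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


alpha_bars = signal_fraction(linear_betas(1000))
rng = random.Random(0)
image = [0.8, -0.3, 0.5, 0.1]  # hypothetical pixel values
slightly_noisy = forward_diffuse(image, 10, alpha_bars, rng)
almost_pure_noise = forward_diffuse(image, 999, alpha_bars, rng)
```

By the last step almost none of the original signal is left. Reverse diffusion trains a network to undo these steps one at a time, which is how a picture can emerge from nothing but noise.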
At the end of this talk, you will have a clear idea of the main concept powering diffusion models and understand their strengths and limitations. You will also learn how to use Diffusion Models for practical applications.
First delivered at Codemotion Meetup, 13 February 2024, Milan, Italy
Also delivered at Py4AI Pavia 2024, 16 March 2024, Pavia, Italy
Also delivered at Papers We Love Meetup, 11 April 2024, Milan, Italy
Recorded session at https://youtu.be/Svxb36_FaUk?si=27SVZS1gkVycLkK2
Is your marketing effective? Let Bayes decide!
Understanding the effectiveness of various marketing channels is crucial to maximise the return on investment (ROI). However, the limitation of third-party cookies and an ever-growing focus on privacy make it difficult to rely on basic analytics. This talk discusses a pioneering project where a Bayesian model was employed to assess the marketing media mix effectiveness of WeRoad, an innovative Italian tour operator.
The Bayesian approach allows for the incorporation of prior knowledge, seamlessly updating it with new data to provide robust, actionable insights. This project leveraged a Bayesian model to unravel the complex interactions between marketing channels such as online ads, social media, and promotions. We'll dive deep into how the Bayesian model was designed, discussing how we provided the AI system with expert knowledge, and presenting how delays and saturation were modelled.
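Delays and saturation are often modelled with a geometric adstock and a diminishing-returns curve. The sketch below shows one such pair of transformations. It is an assumption about the general technique, not a description of the model built for WeRoad, and the spend figures and parameter values are invented.

```python
def geometric_adstock(spend, decay):
    """Carry over a fraction `decay` of past advertising effect to the next period."""
    carried, result = 0.0, []
    for amount in spend:
        carried = amount + decay * carried
        result.append(carried)
    return result


def saturation(x, half_saturation):
    # Diminishing returns: the response approaches 1 as x grows
    # and equals 0.5 when x == half_saturation
    return x / (x + half_saturation)


weekly_spend = [100.0, 0.0, 0.0, 50.0]  # hypothetical channel spend
adstocked = geometric_adstock(weekly_spend, decay=0.5)
response = [saturation(x, half_saturation=100.0) for x in adstocked]
```

In a Bayesian setting, `decay` and `half_saturation` would not be fixed numbers: they would receive prior distributions encoding expert knowledge and be updated with the observed data.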
Attendees will walk away with:
- A simple understanding of the Bayesian approach and why it matters.
- Concrete examples of the transformative impact on WeRoad's marketing strategy.
- A blueprint to harness predictive models in their own business strategies.
First delivered at PyData Milan 2024, Italy
Also delivered at PyData Berlin 2024, PyData Paris 2024, WeMakeFuture Bologna 2024
Recorded session at https://youtu.be/k2LDWMLZQ8k?si=ViXHKVje9mOC0IL5
WeAreDevelopers World Congress 2025 (upcoming)
Cloud Day 2024
Kaggle Days Meetup 2024
Codemotion Milan 2024
AI Conf 2024
Futuru 2024
WeAreDevelopers World Congress 2023