Emanuele Fabbiani
Head of AI at xtream
Milan, Italy
Engineer, researcher, entrepreneur. Emanuele earned his PhD in AI by researching time series forecasting in the energy field. He was a guest researcher at EPFL Lausanne, and he's now the Head of AI at xtream, where he solves business problems with AI. He has published 8 papers in international journals, presented and organized tracks and workshops at 20+ international conferences, including AMLD Lausanne, ODSC London, WeAreDevelopers Berlin, PyData Berlin, PyData Paris, and PyCon Florence, and lectured in Italy, Switzerland, and Poland.
Topics
Academic and Business Research in AI for Energy
Extensive academic literature covers the topic of power and natural gas demand forecasting. Years of study and hundreds of students proposed many forecasting approaches, providing a solid foundation for practitioners.
Unfortunately, constraints imposed by real-world scenarios often prevent a straightforward application of existing contributions. For instance, reliable demand data may be published with a delay of weeks or even months, thus compromising the application of many short-term models, and forecasts must be used for weather features, impacting the overall performance.
Using case studies from real projects, we point out the most important gaps between requirements from the industry and academic works and propose solutions to mitigate the issues. We report here three examples.
Concerning gas demand forecasting, we show how temperature forecasting errors impact the accuracy of different models, with both theoretical derivation and practical experiments. We propose an algorithm for the computation of a "similar day", to compensate for the lack of recent data using seasonality. Finally, we show how different models are better at capturing different properties of the series, and how ensembling approaches may improve the overall accuracy.
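As a rough illustration of the "similar day" idea, the sketch below picks the day roughly one year earlier that falls on the same weekday, so that both yearly and weekly seasonality are preserved. This is only one plausible reading: the function name `similar_day` and the whole-weeks rule are assumptions made for this example, not the algorithm presented in the talk, and holidays are ignored.

```python
from datetime import date, timedelta


def similar_day(target: date, years_back: int = 1) -> date:
    """Return the day about `years_back` years before `target` with the same weekday.

    Hypothetical helper: it shifts by a whole number of weeks, so the weekday
    is preserved while the position in the year stays approximately the same.
    """
    weeks = round(365.25 * years_back / 7)  # 52 weeks for one year
    return target - timedelta(weeks=weeks)


reference = similar_day(date(2020, 1, 29))
print(reference, reference.weekday() == date(2020, 1, 29).weekday())
```

When recent demand data is not yet published, the demand observed on such a reference day could serve as a feature in place of the missing lagged values.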
Besides the proposed examples, the talk is intended to point out that the literature survey is a critical yet preliminary phase of a Machine Learning project aiming at creating business value. Existing work often needs to be modified and extended to comply with additional constraints and more complex datasets.
Delivered at Applied Machine Learning Days, 25-29 January 2020, Lausanne, Switzerland
Recorded session at https://www.youtube.com/watch?v=FZwOrGD3agA
Slides at https://www.slideshare.net/EmanueleFabbiani/academic-and-business-research-in-ai-for-energy
tsviz: a Data-Scientist-Friendly Addin for RStudio
In recent years, charting libraries have evolved in two main directions. First, they have provided users with as many features as possible; second, they have added high-level APIs to easily create the most frequent visualizations. RStudio, with its addins, offers the opportunity to further ease the creation of common plots.
Born as an internal project in xtream, tsviz is an open-source Shiny-based addin which contains powerful tools to perform explorative analysis of multivariate time series.
Its usage is dead simple. Once launched, it scans the global environment for suitable variables. You choose one, and several plots of the time series are shown: line charts, scatter plots, autocorrelograms, and periodograms are only a few examples. Interactivity is achieved through the miniUI framework and the adoption of Plotly charts.
Its wide adoption among our customers and the overall positive feedback we received demonstrate how addins, usually thought of as shortcuts for developers, may provide effective support to data scientists in performing their routine tasks.
Delivered at eRum 2020, 12-20 June 2020, Milan, Italy
Recorded session: https://youtu.be/t8PZbP5b8EM
Reference post: https://towardsdatascience.com/introducing-tsviz-interactive-time-series-visualization-in-r-studio-a96cde507a14?gi=ac1810e1f6df
Reference repository: https://github.com/xtreamsrl/tsviz
Forecasting Gas Demand: a Machine Learning Approach
Forecasting gas demand is critical to pipe reservation and price forecasting.
After presenting a statistical characterization of residential, industrial and thermoelectric gas demand, several statistical learning models are applied and compared to perform day-ahead forecasting. Different ensemble models are also considered.
A considerable improvement over the forecasts performed by SNAM, the Italian transmission system operator, is achieved.
Delivered at P-Value Meetup, Pavia, Italy, 29 June 2020
Summary and slides at https://www.meetup.com/it-IT/Data-Science-Meetup-Pavia/events/262410495/
Machine Learning methods for the Identification of Power Grids
The increasing integration of intermittent renewable generation in power networks calls for novel planning and control methodologies, which hinge on detailed knowledge of the grid.
However, reliable information concerning the system topology and parameters may be missing or outdated for temporally varying AC networks. The talk proposes an online learning procedure to estimate the admittance matrix of an AC network capturing topological information and line parameters.
We start off by providing a recursive identification algorithm that exploits phasor measurements of voltages and currents. With the goal of accelerating convergence, we subsequently complement our base algorithm with a design-of-experiment procedure, which maximizes the information content of data at each step by computing optimal voltage excitations.
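To give a flavour of recursive identification, here is a minimal sketch for the simplest possible case: a single line whose complex admittance y links one voltage phasor to one current phasor through i = y * v. It uses a textbook scalar recursive least squares update. The function name, the initial covariance `p0`, and the noise-free data are assumptions for illustration only; the method in the talk estimates a full admittance matrix and adds a design-of-experiment step that is not reproduced here.

```python
def rls_admittance(voltages, currents, p0=1e9):
    """Estimate a scalar complex admittance from phasor pairs, one sample at a time."""
    y_hat = 0j   # current estimate of the admittance
    p = p0       # scalar "covariance": large means low confidence in y_hat
    for v, i in zip(voltages, currents):
        denom = 1.0 + p * abs(v) ** 2
        gain = p * v.conjugate() / denom
        y_hat += gain * (i - y_hat * v)   # correct using the prediction error
        p = p / denom                     # confidence grows with each sample
    return y_hat


# Synthetic, noise-free measurements generated from a made-up admittance.
y_true = 0.5 - 2.0j
voltages = [1.0 + 0.1j, 0.98 - 0.05j, 1.02 + 0.02j]
currents = [y_true * v for v in voltages]
print(rls_admittance(voltages, currents))
```

Each new measurement refines the estimate without reprocessing past data, which is what makes the approach suitable for online use.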
Our approach improves on existing techniques and its effectiveness is substantiated by numerical studies.
First delivered at End of Semester Talks, 2020, Lucerne University of Applied Sciences and Arts, Lucerne, Switzerland
AI & Psychology: can they nudge towards wellbeing?
Documentaries like ‘The Social Dilemma’ (Netflix, 2020) raise an important question: can machine-mediated interaction lead to manipulation? The risk is there, but the mix of social psychology and artificial intelligence can also result in meaningful benefits for individuals and communities. Inclusivity, transparency and explainability are key to mitigating the risks while maximising the positive effects.
How can researchers and practitioners follow this path? Psychologists, AI researchers, and engineers from both the academy and the industry discuss the topic, starting from concrete cases of applied research in the fields of nutrition, medicine, and financial services.
Delivered at Milan Digital Week, 17-21 March 2021, Milan, Italy
AI & Sustainable Energy
Sustainable energy is going to be a critical challenge in the next few years. There’s no denying that. Machine Learning and AI have already shown their potential in helping the transition towards new energy paradigms, but many problems are still unsolved and innovation needs to progress at a fast pace.
This is why the collaboration between researchers and practitioners is crucial, and this is why we organized a dedicated track at the Applied Machine Learning Days 2021.
At one of the leading Machine Learning conferences in Europe, I was in charge of the track: choosing and inviting keynote speakers, writing, publishing and spreading the public call for speakers, and managing the sessions during the conference.
Held at Applied Machine Learning Days, on 26th April 2021 and 29th March 2022, Lausanne, Switzerland
Recording available at https://www.youtube.com/c/AppliedMachineLearningDays/videos (track AI & Sustainable Energy)
MLOps on AWS: a Hands-On Tutorial
Applying machine learning in the real world is hard: reproducibility gets lost, datasets are dirty, data flows break down, and the context where models operate keeps evolving. In the last 2-3 years, the emerging MLOps paradigm provided a strong push towards more structured and resilient workflows.
MLOps is about supporting and automating the assessment of model performance, model deployment and the following monitoring. Valuable tools for an effective MLOps process are data version trackers, model registries, feature stores, and experiment trackers.
During the workshop, we will showcase the challenges of “applied” machine learning and the value of MLOps with a practical case study. We will develop an ML model following MLOps best practices, from raw data to production deployment. Then, we will simulate a further iteration of development, resulting in better performance, and we will appreciate how MLOps allows for easy comparison and evolution of models.
AWS will provide the tools to effectively implement MLOps: the workshop is also intended to offer an overview of the main resources of the cloud platform and to show how they can support model development and operation.
Held at Applied Machine Learning Days, 26th March 2022, Lausanne, Switzerland
Repo: https://github.com/xtreamsrl/amld22-mlops-on-aws
European Power Price Scenarios: a Cloud-Native Approach
Imagine you are the CEO of a major utility. You need to decide whether to invest in a new solar plant. What data do you need?
The most important factor influencing the decision is the future price of power. However, this is notoriously hard to predict. So, analysts create scenarios, where they project demand and generation up to 40 years into the future and intersect the curves to compute the price.
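The idea of intersecting the curves can be sketched with a simplified merit-order model: generation offers are sorted by marginal cost, and the price is set by the last unit needed to cover demand. The function name, the assumption of inelastic demand, and the sample offers below are all hypothetical and far simpler than a real scenario model.

```python
def clearing_price(offers, demand):
    """Return the marginal cost of the last unit dispatched to cover `demand`.

    offers: list of (capacity_mw, marginal_cost_eur_per_mwh) tuples.
    demand: inelastic demand in MW (a simplifying assumption).
    """
    supplied = 0.0
    # Walk up the supply curve from the cheapest offer to the most expensive.
    for capacity, cost in sorted(offers, key=lambda offer: offer[1]):
        supplied += capacity
        if supplied >= demand:
            return cost
    raise ValueError("demand exceeds total offered capacity")


# Made-up offers: solar at zero cost, then two thermal plants.
offers = [(30, 40.0), (50, 0.0), (40, 90.0)]
print(clearing_price(offers, 60))
```

Adding a new zero-cost plant shifts the supply curve to the right, which is why the expected price trajectory matters so much for the investment decision.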
When we started the project, such computations were made in Excel, with many difficulties and known problems. There was no auditing and no backup; the files were hard to understand and maintain, and could only simulate a few days each year.
We reviewed the whole design, creating a cloud-native application based on AirFlow, Kubernetes, and cloud functions.
The latter were adopted to parallelize the solution of a convex optimization problem meant to model the behaviour of energy storage facilities, such as batteries.
This talk presents the main modules of the system, discussing the architectural decisions and the trade-offs we had to face.
Delivered at BI Days, on 10th January 2023, University of Wroclaw, Poland.
More info at https://p.wz.pwr.edu.pl/~business.intelligence/BI-Days/BIDay23Jan
The Hitchhiker's Guide to asyncio
asyncio is the de facto standard for asynchronous programming in Python and enables concurrent operations without using threads or processes.
In this talk, we will delve into the technical details of asyncio and show how it can be used to improve the performance of Python applications. We will start by discussing the difference between threading, multiprocessing and async programming. Then, we will introduce the basic building blocks of asyncio: Event loops and Coroutines. We will dive deep into the way Coroutines work, discussing their origins and how they are linked to Generators.
Next, we will look at Tasks, which are a higher-level abstraction built on top of Coroutines. Tasks make it easy to schedule and manage the execution of Coroutines. We will cover how to create and manage Tasks and how they can be used to write concurrent code.
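A minimal sketch of Tasks wrapping Coroutines could look like the following. The coroutine `fetch` and its delays are made up for illustration; `asyncio.sleep` stands in for real I/O.

```python
import asyncio


async def fetch(name, delay):
    """A coroutine that pretends to wait on I/O, then returns a label."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # create_task schedules each coroutine on the event loop right away,
    # so both waits overlap instead of running one after the other.
    tasks = [
        asyncio.create_task(fetch("a", 0.02)),
        asyncio.create_task(fetch("b", 0.01)),
    ]
    # gather returns results in the order the tasks were passed in,
    # not in the order they finished.
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
print(results)
```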
Finally, we will also cover some more advanced topics such as Async Loops and Context Managers, and how to handle errors and cancellations in asyncio.
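The sketch below combines an async loop, an async context manager, and cancellation in a few lines. All names (`resource`, `ticker`, `worker`) are hypothetical, and the `events` list only exists to make the order of operations visible.

```python
import asyncio
from contextlib import asynccontextmanager

events = []


@asynccontextmanager
async def resource(name):
    """Async context manager: the cleanup runs even if the body is cancelled."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


async def ticker(n):
    """Async generator, consumed with `async for`."""
    for i in range(n):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield i


async def worker():
    async with resource("db"):
        async for i in ticker(2):
            events.append(f"tick {i}")
        await asyncio.sleep(10)  # cancelled long before this completes


async def main():
    task = asyncio.create_task(worker())
    await asyncio.sleep(0.05)  # give the worker time to reach the long sleep
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("cancelled")


asyncio.run(main())
print(events)
```

Cancellation is delivered as a `CancelledError` raised at the pending `await`, so the `finally` clause of the context manager still gets a chance to release the resource.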
First delivered at Python Milan, 8 March 2023, Milan, Italy
Held also at PyCon Italy, 24-28 May 2023, Florence, Italy
Recorded session at https://www.youtube.com/watch?v=UyRj8Sh3E_Y
Time Series Forecasting on AWS
Applying machine learning in the real world is hard: reproducibility gets lost, datasets are dirty, data flows break down, and the context where models operate keeps evolving. In the last 2-3 years, the emerging MLOps paradigm provided a strong push towards more structured and resilient workflows.
In this talk, we show how to build custom models using SageMaker and its MLOps facilities. We will explore modules such as the experiment tracker, model registry, feature store, and model deployment. As a case study, we will use the forecasting of the Italian power load.
First delivered at AWS User Group Meetup, 18 April 2023, Milan, Italy
Serverless Computing for Mathematical Optimization
Solving large optimization problems is challenging, yet when parallelization is possible, serverless computing can drastically reduce the elapsed time.
In this talk, we consider a real case study: creating power market scenarios by finding the optimal way of using power storage facilities.
This results in a convex optimization problem with hundreds of thousands of variables. Despite parallelization, a conventional solution requires around two hours on a 16-core virtual machine.
We discuss how different serverless approaches can be adopted to reduce the elapsed time and demonstrate how cloud functions help cut it by 96% while keeping costs as low as 3€ per run. Additionally, we show how to connect the optimization module to the rest of the data processing pipeline, implemented in AirFlow.
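The fan-out pattern behind this kind of speed-up can be sketched locally, with a thread pool standing in for cloud functions. Everything here is an assumption for illustration: the problem is taken to split into independent blocks, `solve_block` is a toy convex sub-problem with a closed-form solution, and the baseline of "around two hours" is rounded to 120 minutes.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_block(prices):
    """Toy convex sub-problem: minimise sum((x - p)^2), solved by the mean."""
    return sum(prices) / len(prices)


def solve_all(blocks, max_workers=4):
    """Solve independent blocks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve_block, blocks))


blocks = [[10.0, 20.0], [30.0, 50.0], [5.0, 5.0, 20.0]]
solutions = solve_all(blocks)

# A 96% cut on a baseline of roughly 120 minutes leaves about 5 minutes.
remaining_minutes = 120 * (1 - 0.96)
print(solutions, round(remaining_minutes, 1))
```

In a serverless setting, each call to `solve_block` would become one function invocation, so the elapsed time is driven by the slowest block rather than by the sum of all blocks.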
By the end of this talk, you will have a better understanding of the capabilities of serverless for high-performance computing.
First delivered at PyCon Italy, 24-28 May 2023, Florence, Italy
Recorded session at https://youtu.be/JAOK9Zut_R4
Should You Trust Your Copilot? Limitations and Merits of AI Coding Assistants
More and more developers are using AI coding assistants in their daily work. GitHub Copilot got 400,000 subscribers in the first month and was praised by influential engineers, including Guido van Rossum.
Several studies confirm that AI assistants increase productivity, but other works raise concerns about their adoption.
The Free Software Foundation complained about the training process of Copilot, which allegedly violated the licensing of open-source code. The ACM warned that adopting coding assistants in education may lead to over-dependence and a decrease in learners' understanding. Finally, in a recent preprint, researchers from Stanford describe a controlled experiment suggesting that coding assistants may increase security-related bugs.
By providing an unbiased and independent review of the literature, this talk aims to inform the debate about the trustworthiness of AI coding assistants and to provide insights into their future evolution.
First delivered at Pisa.dev, 19th June 2023, Pisa, Italy
Held also at ODSC Europe, 14-15 June 2023, London, UK
Held also at WeAreDevelopers World Congress, 26-28 July 2023, Berlin, Germany
Recorded session at https://www.youtube.com/live/JHAqIgBo-q8?feature=share
Embeddings, Transformers, RLHF: Three Key Ideas to Understand ChatGPT
ChatGPT has achieved tremendous success and is already transforming the daily routines of many professionals across various industries.
While countless articles highlight the "30 must-know commands," few delve into the actual workings of the technology behind ChatGPT. To understand it, it's essential to grasp three key concepts:
- Embeddings: These represent words and phrases numerically, allowing large language models like GPT to process natural language.
- Transformers: The core component of large language models. Using the attention mechanism, they can focus on semantically related words even when they appear distant from one another.
- RLHF (Reinforcement Learning from Human Feedback): This technique is employed to train models on extensive datasets of questions and answers with minimal human supervision.
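The first two ideas can be illustrated with a toy example: words become vectors, and attention weights each word by how well its vector matches a query. The two-dimensional embeddings below are invented for illustration, and the code is a bare scaled dot-product attention for a single query, without the learned projections or multiple heads of a real transformer.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is a weighted average of the value vectors.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


# Made-up embeddings: "cat" and "dog" point in similar directions, "car" does not.
embeddings = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
words = list(embeddings)
vectors = [embeddings[w] for w in words]
weights, output = attention(embeddings["cat"], vectors, vectors)
print(dict(zip(words, weights)))
```

The query "cat" puts the most weight on itself, then on "dog", and the least on "car", regardless of where the words sit in the sequence.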
In this talk, Emanuele Fabbiani from xtream will provide a concise yet thorough introduction to embeddings, transformers, and RLHF. He'll describe the technology powering ChatGPT, enabling the audience to harness the tool to its fullest potential.
First delivered at Talks at Buildo, 12 July 2023, Milan, Italy
Also held at BI Digital, 7 October 2023, Biella, Italy
Also held at Talks at BitRocket, 10 November 2023, Palermo, Italy
Also held at Boolean Masterclass, 21 November 2023, Milan, Italy
Also held at SIIAM Congress, 7 December 2023, Rome, Italy
Recorded session at https://www.youtube.com/live/Gf1OkqIPo_w?si=Uak7EBxJbma6u7Fy&t=9672
Tool, author, or danger? The role of generative AI in academic research
In the 2023 study titled "Can Linguists Distinguish Between ChatGPT/AI and Human Writing? A Study of Research Ethics and Academic Publishing," a survey revealed that 22% of academic editors are opposed to incorporating Generative AI in scholarly research. However, the same study also indicated that distinguishing AI-generated text from human-written text remains a challenge for both individuals and machines.
Responding to instances where ChatGPT was credited as an author, the prestigious journal 'Nature' revised its authorship guidelines in mid-2023. The new policy states that "Large Language Models do not currently meet our criteria for authorship," effectively disallowing non-human authors.
This talk delves into the controversial role of generative AI in academic research. Is it merely a tool, a potential author, or a threat to be avoided? We will examine recent studies and present diverse viewpoints to shed light on this issue. Our discussion will argue that while generative AI cannot be regarded as an author, at least in the foreseeable future, its value as a research tool is undeniable and cannot be ignored. The industry's use of AI underscores its significant utility, emphasizing the need for its strategic integration into research methodologies.
Participants of this talk will gain insights into incorporating generative AI tools in their research, along with an understanding of the strengths and limitations of current AI technologies.
First delivered at SIIAM Congress, 7 December 2023, Rome, Italy
Ordo Ab Chao: the magic of DALL-E and Midjourney
Have you ever laughed at a picture of the Pope drinking beer or holding a gun? Then you've witnessed the capabilities of Diffusion Models such as Midjourney, Stable Diffusion, and DALL-E.
This talk will discuss why Diffusion Models were developed, how they work, and what makes them so effective at creating images based on a prompt in natural language.
We will delve into the processes of forward and reverse diffusion, explaining how the AI models build realistic pictures from nothing but noise. We will also explain some of their weird features: for instance, why it is so hard for such models to reproduce text and create realistic hands.
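Forward diffusion has a simple closed form in the standard DDPM formulation: the noisy sample at step t is sqrt(a_t) * x_0 + sqrt(1 - a_t) * noise, where a_t is the running product of (1 - beta) over the first t steps. The sketch below applies it to a tiny list of pixel values. The linear beta schedule and its endpoints are common defaults assumed here, not details taken from the talk, and the reverse process (the learned denoiser) is not shown.

```python
import math
import random


def alpha_bar(t, betas):
    """Fraction of the original signal variance left after t noising steps."""
    product = 1.0
    for beta in betas[:t]:
        product *= 1.0 - beta
    return product


def forward_diffuse(x0, t, betas, rng):
    """Jump directly to step t: scale the signal down and add Gaussian noise."""
    a = alpha_bar(t, betas)
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in x0
    ]


# Assumed linear schedule with 1000 steps.
betas = [1e-4 + (0.02 - 1e-4) * i / 999 for i in range(1000)]
rng = random.Random(0)
x0 = [0.2, -0.5, 0.9]
print(alpha_bar(1000, betas))
print(forward_diffuse(x0, 1000, betas, rng))
```

By the last step almost none of the original signal remains, which is why generation can start from pure noise and run the process in reverse.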
At the end of this talk, you will have a clear idea of the main concept powering diffusion models and understand their strengths and limitations. You will also learn how to use Diffusion Models for practical applications.
First delivered at Codemotion Meetup, 13 February 2024, Milan, Italy
Delivered also at Papers We Love Meetup, 11 April 2024, Milan, Italy
Is your marketing effective? Let Bayes decide!
Understanding the effectiveness of various marketing channels is crucial to maximise the return on investment (ROI). However, the limitation of third-party cookies and an ever-growing focus on privacy make it difficult to rely on basic analytics. This talk discusses a pioneering project where a Bayesian model was employed to assess the marketing media mix effectiveness of WeRoad, an innovative Italian tour operator.
The Bayesian approach allows for the incorporation of prior knowledge, seamlessly updating it with new data to provide robust, actionable insights. This project leveraged a Bayesian model to unravel the complex interactions between marketing channels such as online ads, social media, and promotions. We'll dive deep into how the Bayesian model was designed, discussing how we provided the AI system with expert knowledge, and presenting how delays and saturation were modelled.
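Delays and saturation are often modelled in media mix models with a geometric adstock and a diminishing-returns curve. The sketch below shows these two transformations in their simplest form; they are common choices assumed here for illustration, not necessarily the ones used in the WeRoad project, and the parameter values are made up. In a Bayesian model, `decay` and `half_saturation` would be given priors and inferred from data rather than fixed.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of each period's effect over to the next one."""
    carried = 0.0
    result = []
    for x in spend:
        carried = x + decay * carried
        result.append(carried)
    return result


def saturate(x, half_saturation):
    """Diminishing returns: the response reaches 0.5 when x equals half_saturation."""
    return x / (x + half_saturation)


weekly_spend = [100, 0, 0]
delayed = geometric_adstock(weekly_spend, decay=0.5)
response = [saturate(x, half_saturation=50) for x in delayed]
print(delayed, response)
```

A single burst of spend keeps producing an effect in the following weeks, and doubling the spend less than doubles the response.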
Attendees will walk away with:
- A simple understanding of the Bayesian approach and why it matters.
- Concrete examples of the transformative impact on WeRoad's marketing strategy.
- A blueprint to harness predictive models in their own business strategies.
First delivered at PyData, 21 February 2024, Milan, Italy
Recorded session at https://www.youtube.com/live/bo2IidymmX0?si=hnGUYNN9pKRmQnWY&t=1450