Greg Faletto

Information & Communications Technology

Business & Management

Consumer Goods & Services

Media & Information

Data Science, Data Science & AI, Machine Learning, Machine Learning and AI, Business Analysis, Marketing

Los Angeles, California, United States


Ph.D. Student, Dept. of Data Sciences and Operations, USC Marshall School of Business

Greg Faletto is a research assistant and Ph.D. student in the Statistics group of the Department of Data Sciences and Operations at the University of Southern California Marshall School of Business. Greg’s research focuses broadly on statistics, machine learning, feature selection, and applied statistical methods for business and social science. Over the past decade, Greg has developed original machine learning models for companies like Live Nation and ZipRecruiter, and his team won “Best Model” at the Orange County R Users Group Hackathon 2019 for a generalized additive model that correlated regional health outcomes with the presence of water pollutants in California.

Current sessions

Improving Feature Selection Through Stability

Are you a data scientist looking for a way to get more actionable insights from your clients’ data? Or maybe you’re looking for an edge in a machine learning competition? Stability selection, a recently developed feature selection method, can help. Similar to bagging, stability selection repeatedly subsamples the data set, runs a feature selection method on each subsample, and keeps the features that are selected most often. Stability selection has appealing theoretical properties, has received a lot of attention from academic machine learning researchers, and is very easy to implement in R. But the broader data science community hasn’t embraced it yet. In this talk, after briefly discussing why feature selection is important, I’ll explain how stability selection works and what problems it solves. I’ll show how it can easily be implemented using the R package stabs. I’ll also give some examples of when stability selection is useful, and I’ll wrap up by sharing some lessons I’ve learned through my own academic research on stability selection, including some of its limitations. This talk is geared towards data scientists at intermediate or advanced levels who already understand fundamentals like how to use the lasso.
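
To make the subsampling idea concrete, here is a minimal sketch in Python rather than R, using only the standard library. It is not the stabs implementation. As a simplifying assumption, the base selector keeps the k features most correlated with the response instead of running the lasso, and the function names, the 0.8 threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Base selector (a stand-in for the lasso): the k features most correlated with y."""
    p = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
    return set(sorted(range(p), key=lambda j: scores[j], reverse=True)[:k])


def stability_selection(X, y, k=2, n_subsamples=100, threshold=0.8, seed=0):
    """Return (stable feature indices, selection frequency of every feature)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        # Draw half of the rows without replacement, then run the base selector.
        idx = rng.sample(range(n), n // 2)
        selected = select_top_k([X[i] for i in idx], [y[i] for i in idx], k)
        for j in selected:
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j in range(p) if freqs[j] >= threshold]
    return stable, freqs


# Synthetic data (made up): only features 0 and 1 drive the response.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [3 * row[0] - 2 * row[1] + data_rng.gauss(0, 1) for row in X]
stable, freqs = stability_selection(X, y)
print(stable, [round(f, 2) for f in freqs])
```

Features whose selection frequency clears the threshold are the "stable" ones. Swapping in a different base selector, such as the lasso, only requires replacing select_top_k.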


Analyzing a Business Using (Free) Public Data

Modeling customer acquisition, retention, and revenue per user are key tasks for data scientists working in business. But getting experience working on these problems usually requires access to private business data. The focus of this talk will be how to model and forecast these three critical business metrics for subscription-based businesses using only publicly available data. Further, these models can be combined to valuate a business. Because this valuation links forecasts of measurable business metrics to business value, it can be used to create key performance indicators (KPIs) that measure actual dollar value added. This method will be demonstrated by an example valuation of Buffer, a social media management platform, using data anyone can access online for free.

These skills are useful for data scientists working in businesses anywhere from a startup to a Fortune 500 company looking to forecast growth, measure employee performance, or value themselves (or an acquisition target) for an exit or acquisition.

This talk will include:

How to model customer acquisition, retention, and revenue using public data
How to access data needed to build these models
How to use these models to valuate a business
A demonstration of this method to valuate Buffer using publicly available data
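
One way the three models might be combined into a valuation is sketched below in Python. This is a toy illustration, not the method presented in the talk: it assumes a constant number of new customers per period, a constant retention rate, a constant revenue per user, and a profit margin and discount rate that are hypothetical parameters introduced here. All numbers are made-up placeholders, not Buffer's figures.

```python
def forecast_customers(initial_customers, new_per_period, retention_rate, n_periods):
    """Customer count at the end of each period: retained customers plus new ones."""
    counts = []
    current = initial_customers
    for _ in range(n_periods):
        current = current * retention_rate + new_per_period
        counts.append(current)
    return counts


def discounted_value(customer_counts, revenue_per_user, margin, discount_rate):
    """Sum of discounted per-period profit over the forecast horizon."""
    total = 0.0
    for t, customers in enumerate(customer_counts, start=1):
        profit = customers * revenue_per_user * margin
        total += profit / (1 + discount_rate) ** t
    return total


# Placeholder inputs for illustration only.
counts = forecast_customers(
    initial_customers=1000, new_per_period=100, retention_rate=0.9, n_periods=36
)
value = discounted_value(counts, revenue_per_user=10.0, margin=0.5, discount_rate=0.01)
print(round(value, 2))
```

Because the value is an explicit function of acquisition, retention, and revenue per user, changing one input and recomputing shows how much dollar value a change in that metric would add under these assumptions.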


Predicting Concert Set Lists as Bernoulli Autoregressive Processes

Set lists (the lists of songs an artist performs at each concert) are freely available online for many artists, going back many years. The list of songs played at each concert is random to some degree, but also connected to past set lists. So the sequence of concert set lists can be thought of as a random process. This particular random process is a Bernoulli autoregressive process. Such processes have many applications even though they have been studied very little and are not widely known. The focus of this talk will be how to model such processes with Markov chain models, using concert set list prediction as an illustrative example. Other uses for this kind of model will also be discussed, including predicting changes in the stock market, athletic performance, and patient adherence to prescribed treatments.

This talk will cover:

How to access concert set list data for free online
What a Bernoulli autoregressive process is
How to build a model to make predictions from a Bernoulli autoregressive process
Other applications of such models
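
As a rough illustration of the Markov chain idea, the Python sketch below treats a single song as a first-order process: whether it is played tonight depends only on whether it was played at the previous concert. This is a simplifying assumption made here, not necessarily the model from the talk, and the function names, the add-one smoothing, and the play history are all hypothetical.

```python
def fit_two_state_chain(played, smoothing=1.0):
    """Estimate P(played next | state now) for states 0 (not played) and 1 (played).

    `played` is a list of 0/1 flags, one per concert, in chronological order.
    Additive smoothing keeps the estimates away from exactly 0 and 1.
    """
    counts = {0: [0, 0], 1: [0, 0]}  # counts[previous][next]
    for prev, nxt in zip(played, played[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for prev in (0, 1):
        total = counts[prev][0] + counts[prev][1]
        probs[prev] = (counts[prev][1] + smoothing) / (total + 2 * smoothing)
    return probs


def predict_next(played, probs):
    """Probability the song is played at the next concert, given the latest one."""
    return probs[played[-1]]


# Made-up history for one song: 1 = played at that concert, 0 = not played.
history = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
probs = fit_two_state_chain(history)
p_next = predict_next(history, probs)
print(probs, p_next)
```

Fitting one such chain per song gives a probability for every song in the catalog, and a predicted set list could then be formed from the songs with the highest probabilities.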