Most Active Speaker

Scott Bell

Scott Bell

Azure and Databricks SME

Peterborough, United Kingdom

Actions

Scott is currently a principal consultant with RapidData focusing on the Azure Data Platform, Integration Engineering and Analytics.
Previously, he was a a senior consultant and UK&I Databricks SME at Avanade.

He has a master degree in Computer Science with a specialism on Secure Machine Learning in the Cloud.

He is passionate about all things data, AI, Rugby League, Beer, R and Azure

Awards

  • Most Active Speaker 2024

Area of Expertise

  • Information & Communications Technology

Topics

  • Databricks
  • Azure Databricks
  • All things data
  • Azure Data Lake
  • Azure Data Factory
  • Azure Data Platform
  • Big Data
  • Data Science
  • Analytics and Big Data
  • Azure SQL Database
  • Databases
  • Data Science & AI
  • data engineering
  • Azure Data & AI
  • Data Platform
  • Microsoft Data Platform
  • Azure Databricks for AI
  • Data Analytics
  • Database
  • Machine Leaning
  • Azure Functions
  • Azure Logic Apps
  • Azure SQL Server
  • Azure PaaS

Danger in Dialogue: The Security Risks of Large Language Models

In the rapidly evolving landscape of artificial intelligence, Large Language Models (LLMs) stand at the forefront, driving innovation and reshaping industries from automated customer service to sophisticated content creation tools. However, the rapid integration of these models into various facets of society has outpaced the understanding of their inherent security risks. "Danger in Dialogue" aims to bridge this gap by shedding light on the potential threats posed by LLMs and drawing attention to critical threats that stakeholders must navigate to leverage AI's full potential safely.

**Emerging Threats in LLMs**

Prompt Injection: A manipulation technique where attackers craft specific inputs to trigger unintended or harmful responses from the model, potentially leading to misinformation or exploitation of the system for malicious purposes.

Data Exfiltration: The risk that sensitive information, embedded in the training data of LLMs, can be extracted through carefully designed queries, posing significant privacy and security challenges.

Plagiarism: As LLMs generate content with increasing sophistication, the line between original creation and AI-assisted plagiarism becomes blurred, raising concerns over intellectual property rights and the authenticity of digital content.

Hallucinations: Instances where LLMs generate false or misleading information, presenting it as fact. These inaccuracies can undermine trust in AI systems and spread misinformation if not adequately addressed.

Tokenization Errors: Flaws in the process of breaking down text into manageable pieces for the model to understand can introduce biases or distort the intended meaning of the input, leading to erroneous outputs.

Reputation Management: The output of LLMs can influence public perception of individuals, organizations, and concepts, making it crucial to manage and mitigate any reputational damage caused by biased or inaccurate model responses.

The session will delve into each of these threats, offering insights into their mechanics, real-world implications, and the challenges they pose to the ethical and secure deployment of LLMs.

Optimizing Your Delta Lake: Beyond the Defaults

Delta Tables provide amazing out of the box features such as ACID Compliance and Time Travel to your datalake but there's a world of optimization beyond the default settings.

This session is designed to take your Delta Lake to the next level. We'll start with a brief introduction to Delta Tables and quickly dive into advanced optimization techniques.

Learn how to reduce storage costs, supercharge query performance, and govern your Delta Lake with precision.

We'll explore topics such as Delta Transaction Log retention, deletion vectors, liquid clustering, change data capture, and generated columns. Ever wondered, "What the heck is a bloom filter?" We've got you covered.

Join me to unlock the full potential of your Delta Lake and make informed decisions that go beyond just leaving things at their defaults.

Web Scraping with Microsoft Fabric

This session introduces you to the world of web scraping using Microsoft Fabric Notebooks. Whether you're a data engineer a market researcher, or just someone curious about gathering data from the internet, this session is for you.

We'll cover the basics of web scraping, the legal and ethical considerations, and dive into hands-on demonstrations using Microsoft Fabric Notebooks. You'll learn how to efficiently scrape data from websites, manipulate and store the extracted information, and apply it to real-world scenarios. By the end of this session, you'll be equipped with the knowledge and tools to start your web scraping projects using Microsoft Fabric so you can leverage new data in your platform.

Optimize your life with Power Automate

Covid-19 gave us all unprecedented challenges and changes to our ways of working. Some of those challenges allowed us to reflect and reset our lives.

This is an exploration of how the power platform enabled myself to build a series of tools that made life easier during this difficult time.

Some of the topics covered:
- How to be left alone while working from home!
- How to Remember and Optimize everything you do!

Another way to describe this talk would be "Scott does silly things with the power platform that are actually helpful!"

Navigating Data Governance in the Age of Generative AI

In the rapidly evolving world of data analytics, the emergence of Large Language Models (LLMs) has sparked a debate: Are LLMs signaling the end of traditional data analytics? This session delves into the heart of this question, exploring the fundamental workings of LLMs and their transformative impact on the analytics landscape.

Attendees will gain insights into the advantages and potential pitfalls of integrating LLMs into their data strategy. We'll discuss the innovative use cases LLMs unlock and emphasize the paramount importance of governance and lineage in harnessing their full potential. Whether you're intrigued by the brilliance of LLMs or wary of their implications, this session will equip you with a balanced perspective to navigate the future of data analytics.

This session will explore:

The Rise of Generative AI: A quick overview of the development and potential of generative AI technologies, including GPT, DALL-E, and generative adversarial networks (GANs).

Shifting Paradigms: How generative AI is challenging the established norms in data analytics, from data collection to decision-making.

The Importance of Data Governance: Why and how to establish robust data governance models when integrating generative AI into your existing data ecosystem.

Data Strategy Reimagined: Tools and tactics for formulating a forward-thinking data strategy in an age where data can be generated rather than merely analyzed.

Data Lineage in a Generative World: The importance of tracing data lineage for credibility, quality control, and ethical considerations, especially when generative AI is part of the equation.

How to survive (and thrive) as a fully remote worker

Are you considering a move to a fully remote work lifestyle? Do you work remotely but find it a painful experience?

Whether you're the only remote member of your team or you're entire workplace has embraced the Work From Home in your pyjamas lifestyle.This talk aims to provide you with strategies and techniques to survive and hopefully thrive as a remote worker.

It will focusing on themes such as:
- Being unafraid to be your full self
- Managing your Workday and load aka Avoid Burnout
- Forcing yourself to disconnect
- Self Kindness and mental Health breaks
- Setting positive cultures and expectations
- Virtual commutes
- Don't be afraid to say NO!
- Optimize your time via Asynchronous Working
- Building Remote Relationships
- Invest in yourself and your space like companies do with their offices
- Remote doesn't always mean home alone
- Technology can solve some problems but not all!

This isn't about the perfect remote life because they probably don't exist! This about aspiration and improvement of WFH for you. Please bring your experiences and questions too because I'd love to learn something new!

Handling UK Bank Holidays in Synapse Pipelines

Managing data processes in Azure Synapse or Azure Data Factory (ADF) can be challenging, especially when accounting for variables like UK Bank Holidays. This lighting session provides a guide to handling these challenges in your Synapse Analytics pipelines.

Drawing from real-world experience, we'll explore how to integrate the UK government's public holiday API to ensure your data processes are not running when they shouldn't be.

You'll learn how to set up parameters and variables, configure web activities, and use filter activities to accurately determine if a given day is a UK Bank Holiday.

10 things you (probably) don't know about Databricks

Learn 10 tricks, tips and hacks to better leverage databricks in your platforms.

Learn everything from how to configure your workspace with an undocumented to how to better optimize your delta tables

There should be something new for everyone

From the creator of @dailydatabricks on twitter

Are Microsoft Certifications Valuable?

What value do Microsoft certifications offer? Over the last year I've gone from 1 certification to almost all azure and data certifications.

In this dicussion, I will talk about my experience of bruteforce passing all these certifications. I will give you honest feedback about what has and hasn't given myself and colleagues value. I'll give you tips and resources to help prepare. And tell you when you should focus on strategic and personal development solutions.

Advanced Analytics with R & Databricks

Utilize databricks with R to develop high quality analytical solutions. R is an incredibly powerful statistical language and is often overlooked when building solutions with databricks.

- Did you know that you can run interactive web applications in databricks with R?
- Did you know you can build intuitive data pipelines that leverage the best parts of spark and R?
- Explore how to weave R magic in your existing databricks solutions?

Learn how to build compelling data products in R with databricks!

Cosmos 101

Find out everything you need to know to get started with Azure Cosmos DB in 50 minutes or less.

What's partitioning?

What's consistency?

What does multi modal really mean?

What's a change feed?

Can I leverage Cosmos in my data platform?

DATA BASH '23 Sessionize Event

November 2023

#DataWeekender 6.5 Sessionize Event

November 2023

Southampton Data Platform and Cloud user group - in-person meetup User group Sessionize Event

January 2023 Southampton, United Kingdom

Scott Bell

Azure and Databricks SME

Peterborough, United Kingdom

Actions

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Jump to top