Attacking LLM Detectors with Homoglyph-Based Attacks

This session explores homoglyph-based attacks, an attack vector that effectively evades state-of-the-art AI-generated text detectors.

We'll begin by explaining homoglyphs: characters that look alike but are encoded differently. You'll learn how they can be used to manipulate tokenization and evade detection systems. We'll cover how homoglyphs alter text representation, discuss their impact on existing LLM detectors, present a comprehensive evaluation of their effectiveness against various detection methods, and show how detectors can be protected against these attacks.
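
As a rough illustration of the idea (not the speaker's actual method), the sketch below swaps a few Latin letters for Cyrillic look-alikes and then maps them back. The mapping table and the `substitute` and `restore` helpers are hypothetical names chosen for this example, and the table is deliberately tiny. The text should look unchanged to a reader while its code points and UTF-8 bytes differ, which is the kind of change that can alter what a tokenizer sees; the exact effect depends on the tokenizer and detector in use.

```python
import unicodedata

# Hypothetical, deliberately tiny table: Latin letter -> Cyrillic look-alike.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
}
# Inverse table, used by the illustrative defense below.
REVERSE = {fake: real for real, fake in HOMOGLYPHS.items()}


def substitute(text: str) -> str:
    """Replace every mapped Latin letter with its look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def restore(text: str) -> str:
    """Map known look-alikes back to Latin before running a detector."""
    return "".join(REVERSE.get(ch, ch) for ch in text)


original = "open space"
attacked = substitute(original)

# Same number of characters, but different code points and byte lengths.
print(attacked == original)
print(len(original.encode("utf-8")), len(attacked.encode("utf-8")))
print(unicodedata.name(original[0]), "vs", unicodedata.name(attacked[0]))

# Standard Unicode normalization does not fold Cyrillic into Latin,
# so this sketch relies on an explicit reverse mapping instead.
print(unicodedata.normalize("NFKC", attacked) == attacked)
print(restore(attacked) == original)
```

A real defense would need a far larger confusables table than this one, and it would have to avoid rewriting text that is legitimately written in Cyrillic or another script.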

Join us for an immersive exploration and stay ahead of evolving threats!

Aldan Creo

MS @ UC San Diego

San Diego, California, United States
