Whodunnit - git repository mysteries

With all the recent focus on software supply chain security, let's look at the very far left of this process - how does git know who did what, when, where, and why?

It seems safe to assume that a git repository holds all of this information, but that's probably not the case. In this talk, Natalie will walk through how to answer each of these questions, the edge cases and technical gotchas to watch out for, and why each one is important to your company's security posture.

**Who?** will walk through identity and commit signing in git. This seemingly simple information turns out to be quite hard to determine reliably. We'll review setting your user identity in git, whether and how that links to an external identity provider or your repository hosting service, how that identity interacts with signature verification, the common methods of git commit signing, and what the future of signature verification looks like for git. The walkthrough shows how each of these can leave a gap for auditors and how to close those gaps with reasonable certainty.

**What?** answers the question of which files changed at each point in time, then raises more: how force pushes or history rewrites alter that record, and how to view these trends in bulk. It has become common to rewrite history to remove large files or secrets, but how effective is this in practice? What gaps does it leave in your reporting on which files have changed?

**When?** considers time management in git - local versus universal, when it matters, and looking at how often a file changes through the lens of reporting to someone who doesn't know what git is.
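As a small illustration of the local-versus-universal point: git records each author and committer time as Unix epoch seconds followed by a UTC offset, for example `1700000000 -0700`. The sketch below turns that raw pair into both the local and the UTC reading. The function name `parse_git_timestamp` and the sample value are hypothetical, chosen only for this example.

```python
from datetime import datetime, timedelta, timezone


def parse_git_timestamp(raw: str) -> datetime:
    """Parse '<epoch seconds> <+HHMM|-HHMM>' into a timezone-aware datetime."""
    epoch_text, offset_text = raw.split()
    sign = -1 if offset_text[0] == "-" else 1
    hours = int(offset_text[1:3])
    minutes = int(offset_text[3:5])
    # Rebuild the offset the commit was recorded with.
    recorded_zone = timezone(sign * timedelta(hours=hours, minutes=minutes))
    return datetime.fromtimestamp(int(epoch_text), tz=recorded_zone)


# Hypothetical sample value in the same shape git stores.
local_time = parse_git_timestamp("1700000000 -0700")
print(local_time.isoformat())                           # time as the author saw it
print(local_time.astimezone(timezone.utc).isoformat())  # same instant in UTC
```

Both lines describe the same instant; which one belongs in a report depends on who is reading it.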

**Where?** walks through several "where" questions - the confusing ways that `git checkout` can do different things to different files based on context, leveraging `CODEOWNERS` to provide ownership to teams over parts of a repository, and where git stores credentials. We'll look at what types of controls these credentials may bypass by looking deeply into the dozen or so types of credentials in GitHub and what they can do. Lastly, we'll consider the places `git` can do things automatically with hooks - hooks that run locally or remotely, how to leverage them responsibly, and what they realistically can do for governance.

**Why?** will consider documenting code changes over time. It's difficult to understand why a change occurred without this context and impossible to go back in time to ask yourself why. This section will cover structuring commit messages, then how to enforce these across projects by force and by cultural convention. Zooming out a little more, we'll talk about merge strategies and how they intersect with trying to prove changes over time. It ends on why and how to add additional insights to this history using an Architecture Decision Record to document major decisions, and why not to use Issues or Pull/Merge Requests for these records.
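To make "enforce by force" concrete, here is a minimal sketch of the kind of check a `commit-msg` hook could run. Git passes that hook the path to the message file as its first argument. Everything else here is an assumption for illustration: the 72-character subject limit, the blank-second-line rule, the required body, and the names `check_commit_message` and `main` are not taken from the talk.

```python
from typing import List

MAX_SUBJECT_LENGTH = 72  # assumed team convention, not a git requirement


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    # Drop comment lines that git's default message template adds.
    lines = [line for line in message.splitlines() if not line.startswith("#")]
    if not lines or not lines[0].strip():
        return ["subject line is empty"]
    problems = []
    if len(lines[0]) > MAX_SUBJECT_LENGTH:
        problems.append("subject line is longer than 72 characters")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    if len(lines) < 3 or not any(line.strip() for line in lines[2:]):
        problems.append("body explaining why the change was made is missing")
    return problems


def main(argv: List[str]) -> int:
    """Hook entry point: argv[1] is the message file path supplied by git."""
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_commit_message(handle.read())
    for problem in problems:
        print(problem)
    return 1 if problems else 0  # a non-zero exit makes git reject the commit


# Quick demonstration with a message that has no body.
print(check_commit_message("Fix login\n"))
```

A real hook script would finish by calling `sys.exit(main(sys.argv))`. Because a local hook like this can be skipped with `git commit --no-verify`, it works best alongside the cultural convention rather than in place of it.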

**No really, why?** ends on the existential questions of what it means to write software in a regulated environment, or to pull entirely unregulated software into one. It's possible to do this quickly, reliably, and securely - we just need to know `git` inside and out.

Natalie Somersall

Principal Field Engineer, Public Sector @ Chainguard

Denver, Colorado, United States
