Gaussian Processes are great for Drug Discovery, not so much for living life
Recently, I have been reading about Gaussian Processes and how they work. There are several people on the internet who can explain them much more intuitively than I can, so instead of reinventing the wheel, I'm going to link to an article I used to learn about them.
Now here's why you should care about them: sometimes you have a function that gives you the true value of something you care about (the binding affinity of a drug candidate, the half-life of a chemical), but it's painfully slow to evaluate. Gaussian processes let you take a handful of tested values and build a probability distribution over all the functions that could have produced them. Give the model any new input, and it hands you back two things: a prediction, and a measure of how confident it is. You get a fast oracle that also reports its own uncertainty.
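For the curious, here is a minimal sketch of that idea in Python with NumPy. It isn't taken from any particular library or from the article I learned from: the squared-exponential kernel, the function names `rbf_kernel` and `gp_predict`, the toy sine "experiment", and the noise level are all assumptions I picked for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_new, noise_var=1e-2, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    # Covariance of the observed points, plus assumed Gaussian observation noise.
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_new, length_scale)   # train-vs-new covariance
    K_ss = rbf_kernel(x_new, x_new, length_scale)    # prior covariance at new points

    # Solve with a Cholesky factor instead of inverting K directly.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Five "expensive" measurements of a stand-in true function (here just sin).
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)

# One query near the data, one far away from anything we have tested.
x_new = np.array([0.5, 6.0])
mean, std = gp_predict(x_train, y_train, x_new)

for x, m, s in zip(x_new, mean, std):
    print(f"x = {x:4.1f}: prediction {m:+.3f}, uncertainty (1 std) {s:.3f}")
```

The query near the tested points should come back with a tight uncertainty, while the one far away falls back to the prior: a prediction near zero and an uncertainty close to the prior's standard deviation.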
I like to think we all use these Gaussian processes in our daily lives. Maybe it's because here the true function is expensive not in compute, but in vulnerability. I mean, why put yourself out there in this random, scary, uncertain world when there's a magic oracle right in your head that'll tell you exactly how every single outcome will pan out, before you even try?
I thought of inserting some "blah blah the map is not the territory, abstractions are lossy blah blah, this holier-than-thou undergrad now chides you for running away from the trials and tribulations of growing up." But that's pretty hypocritical. And not as beautiful. I would not be doing justice to the Bayesian gods.
The thing with Gaussian processes is that they don't assume your function passes neatly through every point they've been trained on. No, that would give you very complicated functions that would probably overfit. Instead, they assume each observation carries Gaussian noise (zero mean, fixed variance) and use that to smooth everything out. And we do the same thing. We take the messy, fat-tailed chaos of our actual experiences and quietly compress it into a neat bell curve: this is roughly how things go for people like me. Clean. Manageable. Wrong.
Because life is anything but a Gaussian distribution. To hell with the Central Limit Theorem! Your sample sizes are tiny, your observations are biased towards things you've already done, and the stuff that actually ends up mattering — the friendships, the breakthroughs, the moments that redefine you — those live out in the tails, exactly where your internal model has the least data and the widest confidence intervals.
Braylon Mullins was 0-for-4 from 3-point range when he took the most important shot of his life for the UConn men's basketball team. Any statistician would have told him to pass the ball to someone with better odds. But when the guy who'd been putting up 3's the whole game passed it back to him with seconds left on the clock, he didn't listen to what the Gaussian prior had to say; he knew it had to be him.
Know when these statistical processes might be useful, and when you're better off throwing caution to the wind. You're 0-for-4. The clock is running. The ball finds its way to you anyway. Shoot it.