Every deep learning paper is wrong. Don’t at me. — seen in the bowels of Twitter

Bruh! I’ve been away for five years. Only posting now because I’m productively procrastinating (yes, that is a lifehack link, don’t judge me) on a grant whose writing is going to be like pulling teeth.

This blog was supposed to serve as a journal (as much of a journal as something public-facing can be, when my identity is no secret), and it’s been a while since I put pen to paper. So what’s there to update on? I’m an assistant professor now, of Computer Science, in a wonderfully homey department in the northeast. Hence the grant writing. I’m still working on the same type of problems: machine learning and what I would call information-theoretic applied mathematics.

My personal life is more full-throated now. Maybe more to come on that in the future… I’m still parsing out exactly how to use social media in an age when your students and colleagues can look you up. That’s part of why I’m not on any social media, other than YouTube, where I fastidiously maintain a bevy of troll-adjacent accounts. I can only imagine how hard dating is when there’s the possibility of running into students, postdocs, etc. on dating sites. There’s something about the neutered, purely intellectual nature of academia (which I love, don’t get me wrong) that makes it gauche to be caught trying to start a romantic relationship… maybe that’s just me. In fact, thinking of some other profs I know, that is entirely just me.

So about that quote: I am working on deep learning because that’s almost required nowadays in machine learning. But I am constitutionally opposed to much of the baselessness of it all. It’s a great opportunity for constructing theoretical foundations, and I appreciate that a lot of good work has been done in that direction. But fundamentally, it seems like the advances come from clever engineering of inductive biases, and the theory is forever playing catch-up. And when you read many of these papers closely, the proofs are problematic, incorrect, or implicitly make assumptions that trivialize the results. I can’t give statistics on that; I’ve just seen it enough recently to be bothered by it. What I’m saying is, I want a paradigm shift. Look at this recent paper on foundation models (a term referring to large models trained in an unsupervised manner that can then be fine-tuned for other tasks). It doesn’t leave a good taste in my mouth.

I have more to say, but I’m already late for a meeting. Welcome back to me, and see you later.