Wilkinson on a priori error analysis

I’ve been reading a lot of NLA lately (e.g., a recent paper on communication-avoiding RRQR), and necessarily brushing up on some details I paid scant attention to in my NLA courses, like the different types of pivoting. That led me to this quote by a famous numerical analyst: There is still a …
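
For a concrete picture of the pivoting in question, here is a minimal sketch of column-pivoted QR, the basic rank-revealing strategy that RRQR methods build on. It assumes Python with NumPy/SciPy, which the excerpt doesn't specify; the matrix sizes are arbitrary illustration.

```python
# Column-pivoted QR: the classic rank-revealing pivoting strategy.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
# Build a numerically rank-deficient matrix: rank 5 out of 8 columns.
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 8))

Q, R, piv = qr(A, mode="economic", pivoting=True)  # A[:, piv] = Q @ R
# With column pivoting the diagonal of R is non-increasing in magnitude;
# the sharp drop after the fifth entry reveals the numerical rank.
print(np.abs(np.diag(R)))
```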

Nystrom vs Random Feature Maps

I haven’t seen a truly convincing study comparing Nystrom approximations to Random Feature Map approximations. On the one hand, a NIPS 2012 paper compared the two and argued that, because the bases Nystrom approximations use are adapted to the problem while those used by RFMs are not, Nystrom approximations are more efficient. This is an …
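
To make the comparison concrete, here is a rough sketch, assuming scikit-learn (which the excerpt doesn't mention), that pits the two approximations against each other on an RBF kernel at equal feature counts; all sizes and the bandwidth are my own arbitrary choices.

```python
# Compare Nystrom vs random Fourier features (RFM) by how well the
# low-dimensional feature maps reproduce the exact RBF kernel matrix.
import numpy as np
from sklearn.kernel_approximation import Nystroem, RBFSampler
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
gamma, k = 0.05, 50  # kernel bandwidth and number of features

K = rbf_kernel(X, gamma=gamma)  # exact kernel matrix
Zn = Nystroem(gamma=gamma, n_components=k, random_state=0).fit_transform(X)
Zr = RBFSampler(gamma=gamma, n_components=k, random_state=0).fit_transform(X)

for name, Z in [("Nystrom", Zn), ("RFM", Zr)]:
    err = np.linalg.norm(K - Z @ Z.T, "fro") / np.linalg.norm(K, "fro")
    print(name, err)  # relative Frobenius-norm approximation error
```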

My podcast masterlist

Here’s an early Christmas gift to you: a list of podcasts I enjoy! For listening while you’re doing all your holiday season travelling.

APM: Marketplace
KCRW’s Left, Right, and Center
Newshour
BBC World Update: Daily Commute
Common Sense with Dan Carlin
PRI’s The World: Latest Edition
On the Media
The Young Turks Video Podcast
Citizen …

Mirror descent is, in a precise sense, a second order algorithm

For one of our projects at eBay, I’ve been attempting a Poisson MLE fit on a dataset large enough that Fisher scoring is not feasible. The problem is that the data also has such large variance in the scales of the observations that stochastic gradient descent does not work, period, because of …
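
The excerpt cuts off here, but to make the mirror descent part concrete: below is an illustrative sketch of stochastic mirror descent under the negative-entropy mirror map (exponentiated gradient). The identity-link Poisson model with nonnegative features and positive weights, and every parameter choice, are my own stand-ins, not necessarily the setup from the post.

```python
# Stochastic mirror descent for a Poisson MLE with rates lambda_i = x_i . w.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 20
X = rng.uniform(0.0, 1.0, size=(n, d))   # nonnegative features
w_true = rng.uniform(0.1, 2.0, size=d)
y = rng.poisson(X @ w_true)

w = np.ones(d)   # start in the positive orthant
eta = 0.05
for t in range(2000):
    i = rng.integers(n)
    lam = X[i] @ w
    g = X[i] - y[i] * X[i] / max(lam, 1e-12)  # stochastic gradient of the NLL
    # Mirror step: the gradient of negative entropy is log(w), so the dual
    # update log(w) - eta * g maps back to a multiplicative primal update,
    # which keeps w strictly positive regardless of gradient scale.
    w *= np.exp(-eta * g)

print(np.round(w, 2))       # rough recovery of...
print(np.round(w_true, 2))  # ...the true weights
```

The multiplicative update is exactly what makes mirror descent attractive here: step sizes adapt to the scale of each coordinate, which is the failure mode of plain SGD the excerpt describes.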

a bit on word embeddings

Lately I’ve been working almost exclusively on continuous word representations, with the goal of finding vectorial representations of words that expose semantic and/or syntactic relationships between words. As is typical for any interesting machine learning problem, there is a glut of clever models based on various assumptions (sparsity, hierarchical sparsity, low-rankedness, etc.) that yield respectable …
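
As a tiny, self-contained illustration of the idea, here is a sketch of learning continuous word representations with skip-gram word2vec and probing them for semantic neighbors. It assumes gensim (version 4 or later), which the excerpt doesn't name, and the toy corpus is mine.

```python
# Train skip-gram word2vec on a toy corpus, then query nearest neighbors.
from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "cat", "sleeps", "on", "the", "mat"],
    ["the", "dog", "sleeps", "on", "the", "rug"],
] * 50  # repeat the toy sentences so the model has something to fit

model = Word2Vec(corpus, vector_size=16, window=2, min_count=1, sg=1, epochs=20)
print(model.wv.most_similar("king", topn=3))  # nearby words in vector space
```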

Installing Hadoop on Ubuntu (works for Ubuntu 12.04 and Hadoop 2.4.1)

I’m trying to use LDA on a large amount of data. A quick recap: I tried vowpal wabbit … it’s fast, I’ll give it that, but it’s also useless: the output is dubious (what I think are the topics look like they haven’t changed much from the prior) *and* I have no idea how it …
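
One sanity check worth reaching for when topics look suspiciously like the prior: fit LDA on a manageable sample and eyeball the top words per topic. This sketch assumes a recent scikit-learn and uses the 20 newsgroups dataset as a stand-in corpus; it is an independent cross-check, not the vowpal wabbit pipeline from the post.

```python
# Fit online LDA on a small sample and print the top words of each topic.
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = fetch_20newsgroups(subset="train",
                          remove=("headers", "footers", "quotes")).data[:2000]

vec = CountVectorizer(max_features=10_000, stop_words="english")
dtm = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=20, learning_method="online",
                                random_state=0).fit(dtm)

vocab = np.asarray(vec.get_feature_names_out())
for k, topic in enumerate(lda.components_):
    # Distinct, interpretable word lists suggest the fit moved off the prior.
    print(k, vocab[np.argsort(topic)[::-1][:10]])
```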

Sharing numpy arrays between processes using multiprocessing and ctypes

Because of its global interpreter lock, Python’s threads can’t execute Python bytecode in parallel, so multithreading buys you no real parallelism for CPU-bound code. To me, this is a ridiculous limitation that should be gotten rid of post-haste: a programming language is not modern unless it supports multithreading. Python does support multiprocessing, but the straightforward manner of using multiprocessing requires you to pass data between processes using pickling/unpickling rather than …
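
Here is a minimal sketch of the shared-memory pattern the title describes: back a numpy array with a multiprocessing ctypes array, so worker processes mutate the same buffer instead of shipping pickled copies back and forth. Array sizes and the squaring workload are placeholders.

```python
# Share a numpy array across processes via multiprocessing + ctypes.
import ctypes
import multiprocessing as mp
import numpy as np

N = 8

def worker(shared_arr, i):
    # Re-wrap the shared ctypes buffer as a numpy array inside the child;
    # np.frombuffer makes a view, so writes land in shared memory.
    a = np.frombuffer(shared_arr.get_obj(), dtype=np.float64)
    a[i] = i * i  # each worker touches a distinct index, so no lock needed

if __name__ == "__main__":
    # mp.Array wraps a ctypes buffer (with a lock by default);
    # get_obj() reaches the raw buffer underneath.
    shared = mp.Array(ctypes.c_double, N)
    procs = [mp.Process(target=worker, args=(shared, i)) for i in range(N)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(np.frombuffer(shared.get_obj(), dtype=np.float64))
```

The key point is that only the small handle to the shared buffer crosses the process boundary; the array data itself is never pickled.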

Eigenvector two-condition number for a product of PSD matrices

I’m pushing to submit a preprint on the Nystrom method that has been knocking around for the longest time. I find myself running into problems centered on expressions of the type \(B^{-1}A\), where \(A, B\) are SPSD matrices satisfying \(B \preceq A\). This expression will be familiar to numerical linear algebraists: there \(B\) would be …
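
The excerpt stops mid-thought, but the algebraic fact behind the title is worth sketching, under the assumption that \(B\) is in fact positive definite so that \(B^{-1/2}\) exists. Whitening by \(B\) symmetrizes the product:
\[ B^{-1}A = B^{-1/2}\bigl(B^{-1/2} A B^{-1/2}\bigr) B^{1/2}, \]
so \(B^{-1}A\) is similar to the SPSD matrix \(M = B^{-1/2} A B^{-1/2}\), hence diagonalizable with real, nonnegative eigenvalues; and \(B \preceq A\) forces \(M \succeq I\), so those eigenvalues are at least one. Writing \(M = U \Lambda U^T\) with \(U\) orthogonal, one eigenvector matrix of \(B^{-1}A\) is \(V = B^{-1/2}U\), whose two-condition number obeys
\[ \kappa_2(V) \le \|B^{-1/2}\|_2 \, \|B^{1/2}\|_2 = \kappa_2(B)^{1/2}. \]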