Mirror descent is, in a precise sense, a second order algorithm

For one of our projects at eBay, I’ve been attempting to do a Poisson MLE fit on a large enough dataset that Fisher scoring is not feasible. The problem is that the data also has such large variance in the scales of the observations that stochastic gradient descent does not work, period, because of […]

Algebra: it matters

I’m looking at two different models for learning polynomial functions, and trying to determine if they are equivalent. After a couple days of thinking, I’ve reduced the question to the following: Can every symmetric polynomial of degree \(r\) in \(d\) variables that has no constant term be written as a sum of the \(r\)-th powers […]

Julia, once more

Julia + PyCall + CCall + Gadfly or PyPlot (+ Julia Studio ?) looks delicious. The only feature that absolutely needs to be added is shared memory parallelism (why wasn’t this an initial core feature of the language?), but I’m extremely excited by the current awesomeness of the Julia ecosystem. I recommend you get into […]

a bit on word embeddings

Lately I’ve been working almost exclusively on continuous word representations, with the goal of finding vectorial representations of words which expose semantic and/or syntactic relationships between words. As is typical for any interesting machine learning problem, there is a glut of clever models based on various assumptions (sparsity, hierarchical sparsity, low-rankedness, etc.) that yield respectable […]

Installing Hadoop on Ubuntu (works for Ubuntu 12.04 and Hadoop 2.4.1)

I’m trying to use LDA on a large amount of data. A quick recap: Tried Vowpal Wabbit … it’s fast, I’ll give it that, but it’s also useless: the output is dubious (what I think are the topics look like they haven’t changed very much from the prior) *and* I have no idea how it […]

Sharing numpy arrays between processes using multiprocessing and ctypes

Because of its global interpreter lock, Python doesn’t support truly parallel multithreading. To me, this is a ridiculous limitation that should be gotten rid of post-haste: a programming language is not modern unless it supports multithreading. Python supports multiprocessing, but the straightforward manner of using multiprocessing requires you to pass data between processes using pickling/unpickling rather than […]
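
As a rough illustration of the idea in the title, here is a minimal sketch of sharing one NumPy array across processes without pickling its contents. It assumes `multiprocessing.RawArray` as the shared buffer and `np.frombuffer` to view it; the helper names (`as_numpy`, `scale_rows`, `parallel_scale`) are hypothetical and not taken from the post, and no locking is done because each worker writes to a disjoint block of rows.

```python
import ctypes
import multiprocessing as mp

import numpy as np


def as_numpy(shared, shape):
    """Wrap a shared ctypes buffer in a NumPy view (no data is copied)."""
    return np.frombuffer(shared, dtype=np.float64).reshape(shape)


def scale_rows(shared, shape, start, stop, factor):
    """Worker: multiply rows [start, stop) in place through the shared view."""
    view = as_numpy(shared, shape)
    view[start:stop] *= factor


def parallel_scale(shape=(4, 3), factor=2.0, n_workers=2):
    """Fill a shared array, then let each worker scale its own block of rows."""
    # RawArray allocates unsynchronized shared memory (assumption: workers
    # touch disjoint rows, so no lock is needed).
    shared = mp.RawArray(ctypes.c_double, shape[0] * shape[1])
    data = as_numpy(shared, shape)
    data[:] = np.arange(shape[0] * shape[1], dtype=np.float64).reshape(shape)

    # Split the rows into n_workers contiguous blocks.
    bounds = np.linspace(0, shape[0], n_workers + 1, dtype=int)
    workers = [
        mp.Process(
            target=scale_rows,
            args=(shared, shape, int(lo), int(hi), factor),
        )
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The parent's view reflects the children's writes: same memory.
    return data


if __name__ == "__main__":
    print(parallel_scale())
```

Only the handle to the shared buffer is passed to each child; the array data itself stays in one block of memory that every process views through `np.frombuffer`.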

Eigenvector two-condition number for a product of PSD matrices

I’m pushing to submit a preprint on the Nyström method that has been knocking around for the longest time. I find myself running into problems centering around expressions of the type \(B^{-1}A\), where \(A, B\) are SPSD matrices satisfying \(B \preceq A\). This expression will be familiar to numerical linear algebraists: there \(B\) would be […]