A workaround for installing SQBLib

I spent about an hour today getting Carlos Becker's SQBLib package for gradient boosted tree regression in MATLAB working. The issue is that it depends on liblbfgs, and following the instructions Carlos gives for compiling liblbfgs resulted in errors when MATLAB tried to link liblbfgs into his code.

Specifically, I got a mysterious error like
    /usr/bin/ld: /home/agittens/Downloads/liblbfgs-1.10/lib/lbfgs.o: relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC.

See the original install instructions. Here's the fix that worked for me on a 64-bit Linux box (the commands are collected into a single sketch after the steps):

  1. Run ./configure --disable-shared --enable-static --enable-sse2 from the liblbfgs root directory
  2. Run make --dry-run to find the gcc command that generates lbfgs.o and modify it to add the -fPIC compiler option. In my case, that resulted in the line
    gcc -DHAVE_CONFIG_H -I.. -I. -I../include -msse2 -DUSE_SSE -O3 -ffast-math -fPIC -Wall -c lbfgs.c -o lbfgs.o
  3. Move into the lib/ directory and execute that modified command
  4. Edit the GLN... case of the switch statement in the compileMex.m file to replace the current
    %s/lib/.libs/lbfgs.o portion of the mex call with %s/lib/lbfgs.o
  5. Run compileMex(<liblbfgs root directory>)
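
For reference, here is the whole workaround collected into one shell sketch. The gcc flags are just what make --dry-run printed on my machine, so yours may well differ; treat this as a template rather than a script to run blindly.

    # from the liblbfgs root directory: configure a static, SSE2-enabled build
    ./configure --disable-shared --enable-static --enable-sse2

    # find the gcc invocation that builds lbfgs.o (prints commands without running them)
    make --dry-run | grep 'lbfgs\.c'

    # re-run that invocation by hand from lib/, with -fPIC added
    cd lib/
    gcc -DHAVE_CONFIG_H -I.. -I. -I../include -msse2 -DUSE_SSE -O3 -ffast-math -fPIC -Wall -c lbfgs.c -o lbfgs.o

Then edit compileMex.m as in step 4 and run compileMex from MATLAB as usual.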

Quick note on the Chen, Chi, Goldsmith covariance sketching paper

NB: I will update this post as I read the paper, in case it turns out that the first issue I raised is not legitimately a concern.

Covariance estimation (and its natural extension, precision estimation) has always been an interesting topic for me because it can serve as a concise, concrete, and very broadly applicable instance of applied nonasymptotic random matrix theory. Likewise, I'm quite interested in matrix sketching algorithms. Thus I was very excited to see the latest preprint by Chen, Chi, and Goldsmith on arXiv, which presents a convex optimization algorithm for recovering covariance matrices from randomized sketches obtained in a streaming manner. Their convex optimization problem is essentially the PhaseLift formulation used for recovering phase from magnitude, but their proof shows that it works for covariance matrix recovery.
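
For concreteness, the PhaseLift-style template for this kind of quadratic measurement looks roughly like the following; this is my own schematic notation, not a transcription of their exact formulation. Given sketching vectors $a_i$ and scalar sketches $y_i \approx a_i^T \Sigma a_i$ of a structured (e.g. low-rank) covariance $\Sigma$, one solves the semidefinite program

$$
\hat{\Sigma} \;=\; \operatorname*{arg\,min}_{X \succeq 0} \; \operatorname{tr}(X)
\quad \text{subject to} \quad a_i^T X a_i = y_i, \quad i = 1, \dots, m,
$$

with the equality constraints relaxed (say, to a norm ball around the observed sketches) when the measurements are noisy.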

I have only just started reading this paper, and I’m still excited, but I have two concerns already: first, it is not clear to me that the algorithm they propose is actually the algorithm they provide a guarantee for! At the least, the text must be corrected to make it clear that this is indeed the case. To be more precise, their algorithm is to compute a few random sketching vectors ahead of time, then as the rows of the matrix come in, compute the magnitude of the projection of each row onto *one* randomly chosen sketching vector. The measurement model described mathematically seems to compute the magnitude of the projection of each row onto *each* of the sketching vectors. Big difference there.
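
In symbols, and again this is just my reading: write $x_t$ for the $t$-th row of the data matrix and $a_1, \dots, a_m$ for the sketching vectors drawn ahead of time. The algorithm as stated records one scalar per row,

$$
y_t = \bigl| \langle a_{i_t}, x_t \rangle \bigr|^2, \qquad i_t \ \text{drawn at random from} \ \{1, \dots, m\},
$$

whereas the measurement model in the analysis appears to use every projection of every row,

$$
y_{t,i} = \bigl| \langle a_i, x_t \rangle \bigr|^2, \qquad i = 1, \dots, m.
$$

The first gives $T$ measurements from $T$ rows; the second gives $mT$, so the two are not obviously interchangeable.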

Second, their result provides a Frobenius norm guarantee, which is par for the course, but they make claims about things like subspace detection, for which, as far as I know, Frobenius norm guarantees are too weak to really be of interest. This may be a matter of preference, though, and practitioners probably don't care about sharp guarantees as long as the method works in practice and has at least a semblance of theoretical support.