Sampling models and vector sums

What happens when you sample columns from a matrix whose columns all have roughly the same norm? It’s clear that you should have some kind of concentration: say if you sample \(\ell\) of \(n\) columns, then the Frobenius norm of the resulting matrix, appropriately scaled, should be a very good approximation of the Frobenius norm of the original matrix. This is easy to show (well, it’s easy to establish a rather weak probability tail bound on the norm) using a vector Bernstein inequality, where the `vectors’ are simply the original matrix with all but one column set to zero. The key step is to change the sampling model from one where you randomly select a subset of cardinality \(\ell\) to a simpler one based on i.i.d. Bernoulli r.v.s where you randomly select a subset with expected cardinality \(\ell\). The change is straightforward because in this case, the support of the vectors don’t overlap, so adding more vectors montonically increases the Frobenius norm.

Harder question: what happens when you sample columns from \(\mat{X}\) and rows from \(\mat{Y}\), once again with appropriate rescaling, to form \(\hat{\mat{X}}\) and \(\hat{\mat{Y}}\)? In particular, what can you say about \(\|\hat{\mat{X}}\hat{\mat{Y}} – \mat{X}\mat{Y}\|_F\)? Again, assuming the Frobenius norms of \(\mat{X}\) and \(\mat{Y}\) are evenly distributed across their columns and rows, respectively, this difference should be small. It’s easy to get a bound on the expectation, but what happens in probability? Here you could again use a vector Bernstein inequality to get a reasonable result (I think), with vectors \(\text{vec}(\mat{X}_i \mat{Y}^i)\), but now the support of the vectors overlap, so it’s not clear if the change of sampling models can be done.

So, the question is: let \(\mathbb{P}_{U(\ell)}\) denote the probability under the model of randomly selecting subsets of cardinality exactly \(\ell\) and \(\mathbb{P}_{B(\ell)}\) denote the probability under the model of randomly selecting subsets of cardinality on average \(\ell\). Is it the case that, for vectors in some vector space,
\mathbb{P}_{U(\ell)}\left( \left\|\sum_{i \in I} v_i \right\| \geq t \right) \leq 2 \mathbb{P}_{B(\ell)}\left( \left\|\sum_{i \in I} v_i \right\| \geq t \right)?
The constant may vary, or there may even be a small additive term on the rhs, and I really only need an answer in the case that the norm is the standard Euclidean norm on \(\R^n\).

Verify these matrix properties (easy and fun)

I’m looking at products like \(\mat{G}^t \mat{A} \mat{G}\) where the columns of \(\mat{G}\) are (nonisotropic) normal vectors. Specifically, I’d like to know the distribution of the eigen/singular-values of this product. Surprisingly, I was unable to find any results in the literature on this, so I started reading Gupta and Nagar to see if I could work it out using Jacobian magic.

In the preliminary material, they list some basic matrix algebra facts. Your mission, should you choose to accept it, is to prove the following:

  • If \(\mat{A} > \mat{0},\) \(\mat{B} > \mat{0},\) and \(\mat{A} – \mat{B} > \mat{0},\) then \(\mat{B}^{-1} – \mat{A}^{-1} > \mat{0}\)
  • If \(\mat{A}>\mat{0}\) and \(\mat{B} > \mat{0},\) then \(\det(\mat{A}+\mat{B}) > \det(\mat{A}) + \det(\mat{B})\)

The second’s easy, but for the first I found it necessary to use the fact that every positive matrix has a unique positive square root.

Some more, this time on Kronecker products:

  • If \(\mat{A}\) has eigenvalues \(\{\alpha_i\}\) and \(\mat{B}\) has eigenvalues \(\{\beta_j\},\) then \(\mat{A} \otimes \mat{B}\) has eigenvalues \(\{\alpha_i \beta_j\}.\)
  • If \(\mat{A}\) is \(m \times m\) and \(\mat{B}\) is \(n \times n\), then \(\det(\mat{A} \otimes \mat{B}) = \det(\mat{A})^n \det(\mat{B})^m\)

It’s useful here to note that
(\mat{A} \otimes \mat{B}) (\mat{C} \otimes \mat{D}) = \mat{AC} \otimes \mat{BD},
which has some obvious consequences, like that the Kronecker product of two orthogonal matrices is orthogonal. To be clear, the first Kronecker question can be addressed without any tedious calculations.

George Gently

Wooo woooo! I just found George Gently‘s coming (came?) back for 2011. I also found out that there was a 2010 season! Too bad Netflix doesn’t have it yet 🙁

Here’s the first Netflix review I (remember having) ever wrote. I was that impressed by the show. Hope I can share that enthusiasm. For some reason Netflix removed all the apostrophes in the review.

George Gently’s an older DI at the Yard who thought he was ready to retire after his wife was murdered because of his crusade against corruption in the force, but a fresh lead on the trail of the man he suspects is behind the murder of his wife brings him back into the game, this time in Northumberland. He’s assisted in his investigation by John Bacchus, a young, ambitious, and brash inspector who, from past bitter experience, Gently sees exhibits all the earmarks of becoming corrupt. So, after the case is resolved, Gently decides to stay on in Northumberland and take the youngster in hand. Thus the first episode sets the stage … after that, each episode deals with a new case.

Yes, the cases are interesting, and the setting is verisimilar, but the appeal of the series for me lies first and foremost in the relationship between Gently and Bacchus. Bacchus is, from the beginning, an unsympathetic character: for no readily apparent reason, he’s unsatisfied with his marriage, and perhaps unconsciously sabotaging it; he’s willing to do whatever it takes — beatings, planting evidence, accepting bribes — to close a case; and his general opinions on race and gender issues reflect those of the times in an unflattering manner. Yet, you can’t help but get the feeling that there’s something salvageable under those youthful excesses. Gently does an excellent job of mentoring Bacchus in a show-and-not-tell manner, but it’s unclear if Bacchus is taking in the messages Gently’s trying to get across or if he’s simply humoring him.

The two episodes of season three are the best of the series. In the first, Bacchus and Gently come to loggerheads over a case involving a child, and in the second, we see concrete hints that Bacchus is becoming a better man. I’d love to see more in this series, but this was a great stopping point. Always leave them wanting more, right?

A sequence whose closure is the unit circle

Looks like I’ll be visiting U. Michigan for a week or so at the end of October. What does one do on such a visit? I’ve been invited to give a talk, but that doesn’t occupy more than an afternoon…

I spent this morning reading through some of the literature on subspace tracking, then wound up visiting Nocedal and Wright for a refresher on the augmented Lagrangian method (Chap 17). Flipping through the book, I came across the following question:

Show that every point on the unit circle is a limit point of the sequence
\vec{x}_k = \left(1 + \frac{1}{2^k}\right) \begin{pmatrix} \cos k \\ \sin k \end{pmatrix}.

Not challenging (esp. if you don’t go into the nasty \(\varepsilon-\delta\) details ), but it’s a cute problem.

ChapterZero’s relocated

I still haven’t hammered out the database issues, but I’m tired of my little vacation (i.e. several months!) from posting, so I’m going ahead with the relocation. At some unspecified time in the future (once I figure out the technical details), I’ll migrate the database over from the old location. That site’ll still be up, albeit in its broken state (the math and other escaped code is not being interpreted correctly).

Here’s an inagural problem: Argue that \[\frac{21}{22} \geq \frac{\log 9}{\log 10} \geq \frac{20}{21}\] without using decimal expansions of the involved quantities. Feel free to use arguments that aren’t feasible without a CAS like Mathematica– I sure did.