# Sampling models and vector sums

What happens when you sample columns from a matrix whose columns all have roughly the same norm? It’s clear that you should have some kind of concentration: say if you sample $$\ell$$ of $$n$$ columns, then the Frobenius norm of the resulting matrix, appropriately scaled, should be a very good approximation of the Frobenius norm of the original matrix. This is easy to show (well, it’s easy to establish a rather weak probability tail bound on the norm) using a vector Bernstein inequality, where the `vectors’ are simply the original matrix with all but one column set to zero. The key step is to change the sampling model from one where you randomly select a subset of cardinality $$\ell$$ to a simpler one based on i.i.d. Bernoulli r.v.s where you randomly select a subset with expected cardinality $$\ell$$. The change is straightforward because in this case, the support of the vectors don’t overlap, so adding more vectors montonically increases the Frobenius norm.

Harder question: what happens when you sample columns from $$\mat{X}$$ and rows from $$\mat{Y}$$, once again with appropriate rescaling, to form $$\hat{\mat{X}}$$ and $$\hat{\mat{Y}}$$? In particular, what can you say about $$\|\hat{\mat{X}}\hat{\mat{Y}} – \mat{X}\mat{Y}\|_F$$? Again, assuming the Frobenius norms of $$\mat{X}$$ and $$\mat{Y}$$ are evenly distributed across their columns and rows, respectively, this difference should be small. It’s easy to get a bound on the expectation, but what happens in probability? Here you could again use a vector Bernstein inequality to get a reasonable result (I think), with vectors $$\text{vec}(\mat{X}_i \mat{Y}^i)$$, but now the support of the vectors overlap, so it’s not clear if the change of sampling models can be done.

So, the question is: let $$\mathbb{P}_{U(\ell)}$$ denote the probability under the model of randomly selecting subsets of cardinality exactly $$\ell$$ and $$\mathbb{P}_{B(\ell)}$$ denote the probability under the model of randomly selecting subsets of cardinality on average $$\ell$$. Is it the case that, for vectors in some vector space,
$\mathbb{P}_{U(\ell)}\left( \left\|\sum_{i \in I} v_i \right\| \geq t \right) \leq 2 \mathbb{P}_{B(\ell)}\left( \left\|\sum_{i \in I} v_i \right\| \geq t \right)?$
The constant may vary, or there may even be a small additive term on the rhs, and I really only need an answer in the case that the norm is the standard Euclidean norm on $$\R^n$$.