We generate particle batch meshes with a fixed chunk of quads at the
origin, and then dispatch instances of that batch. The vertex shader
fetches from a shader storage buffer of particle info, computing the
index from the instance ID (which batch) and the vertex ID (where within
the batch) together with the batch size. We need to run these in batches
because GPUs process
instances somewhat serially, so if you have a tiny instance you're
wasting a bunch of spare compute instead of running nicely in parallel.
Keeping each instance smaller than the full particle count helps avoid
rendering lots of hidden particles if you're using dynamic particle
counts, and also gives smaller data uploads (although that's not
particularly significant, since they're static buffers of geometry).
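
As a rough sketch of the indexing described above (written in Python
rather than shader code so it can be run directly), the lookup and the
batch count could work like this. BATCH_SIZE, VERTS_PER_QUAD and all the
function names are assumptions for illustration; the real values and
the exact formula in the shader may differ.

```python
# Hypothetical constants: the note does not state the real values.
VERTS_PER_QUAD = 4   # assumes indexed quads with 4 unique vertices each
BATCH_SIZE = 64      # assumes 64 particles (quads) per batch mesh


def particle_index(instance_id: int, vertex_id: int) -> int:
    """Index into the particle buffer for one vertex-shader invocation.

    The instance ID selects the batch; the vertex ID selects the quad
    inside that batch.
    """
    quad_in_batch = vertex_id // VERTS_PER_QUAD
    return instance_id * BATCH_SIZE + quad_in_batch


def corner_index(vertex_id: int) -> int:
    """Which of the quad's corners this vertex is (0..3)."""
    return vertex_id % VERTS_PER_QUAD


def instances_needed(live_particles: int) -> int:
    """Number of batch instances to dispatch (integer ceiling division)."""
    return (live_particles + BATCH_SIZE - 1) // BATCH_SIZE


def hidden_particles(live_particles: int) -> int:
    """Particle slots that get drawn but have to be hidden."""
    return instances_needed(live_particles) * BATCH_SIZE - live_particles
```

With these assumed numbers, 130 live particles need 3 instances and
leave 62 slots to hide, so the waste is always less than one batch.
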
The shaders do billboarding based on head position rather than the
projection matrices, so the particles don't visibly move when you rotate
your head. It turns out projection-based billboards look really bad in VR.
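
One way to billboard from head position alone, again sketched in Python
rather than shader code, is to build the quad's axes from the direction
between the particle and the head. The function names, the world-up
choice and the corner convention here are assumptions, not taken from
the actual shaders.

```python
import math

WORLD_UP = (0.0, 1.0, 0.0)  # assumption: Y-up world space


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return (v[0] / n, v[1] / n, v[2] / n)


def billboard_corner(particle_pos, head_pos, corner, size):
    """World-space position of one quad corner.

    corner is (x, y) with each component in {-1, +1}. Only the head
    *position* is used; head orientation never enters the calculation.
    Degenerate if the head is directly above or below the particle.
    """
    to_head = normalize(sub(head_pos, particle_pos))
    right = normalize(cross(WORLD_UP, to_head))
    up = cross(to_head, right)
    cx, cy = corner
    return tuple(p + size * (cx * r + cy * u)
                 for p, r, u in zip(particle_pos, right, up))
```

Because head orientation is not an input, turning your head cannot
change where the corners end up; only moving it does.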