* fix ringbuffer thread safety on ARM. fix#715#388
This patch addresses the thread safety problem of `jack_ringbuffer_t`
mentioned in #715 and #388. The overbound read bug caused by this problem
is impossible to reproduce on x86 due to its strong memory ordering, but
it is a problem on ARM and other weakly ordered architectures.
Basically, the main problem is that, on a weakly ordered architecture,
it is possible that the pointer increment after `memcpy` becomes visible
to the other thread before `memcpy` finishes:
memcpy (&(rb->buf[rb->write_ptr]), src, n1);
// vvv can be visible to reading thread before memcpy finishes
rb->write_ptr = (rb->write_ptr + n1) & rb->size_mask;
If this happens, the other thread can read the remaining garbage values
in `rb->buf` due to be overwritten by the unfinished `memcpy`.
To fix this, an explicit pair of release/acquire memory fences [1] is
used to ensure the copy on the other thread *happens after* the `memcpy`
finishes so no garbage values can be read.
[1]: https://preshing.com/20130922/acquire-and-release-fences/
* remove volatile qualifier on ringbuf r/w pointers
The volatile constraints are excess when compiler barriers are present.
It generates unnecessary `mov` instructions when pointers aren't going
to be updated.
* simplify read/write space calculations
This optimization is possible because the buffer size is always a power
of 2. See [1] for details.
[1]: https://github.com/drobilla/zix/pull/1#issuecomment-1212687196
* move acq fences to separate lines