Use atomic_fetch_add from C11's stdatomic if available, or fall back to GCC's __atomic functions. Remove atomic_add because it can trivially be implemented by calling exchange_and_add and discarding the result.
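
A minimal sketch of the selection logic described above, not the actual patch. The counter_t typedef, the int-based signature of exchange_and_add, and the choice of sequentially consistent ordering in the fallback are assumptions made for illustration; the real code may use different types or memory orders.

```c
/* Prefer C11 <stdatomic.h>; otherwise use the GCC/Clang __atomic builtins. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    !defined(__STDC_NO_ATOMICS__)
#include <stdatomic.h>

typedef atomic_int counter_t; /* hypothetical counter type */

/* Atomically adds v to *p and returns the value *p held before the add. */
static inline int exchange_and_add(counter_t *p, int v)
{
    return atomic_fetch_add(p, v); /* sequentially consistent by default */
}
#else
typedef int counter_t; /* hypothetical counter type */

/* Same contract, implemented with the GCC builtin. */
static inline int exchange_and_add(counter_t *p, int v)
{
    /* Assumed ordering: seq_cst, to match the C11 branch. */
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}
#endif
```

A caller that previously wrote atomic_add(&n, 1) can write (void)exchange_and_add(&n, 1) instead, which is why a separate atomic_add is not needed.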