Justin Ruggles
b57e38f52c
ac3dsp: x86: Replace inline asm for in-decoder downmixing with standalone asm
Adds a wrapper function for downmixing which detects channel count changes
and updates the selected downmix function accordingly.
Simplification and porting to current x86inc infrastructure by Diego Biurrun.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
10 years ago
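The wrapper described in this commit can be pictured with a minimal C sketch. It is not the actual ac3dsp code: `DownmixCtx`, `downmix_generic`, `downmix_2_to_1` and the `matrix[out][in]` indexing are all assumptions made for illustration. The idea is to cache the last channel configuration and re-select the function pointer only when it changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; the real context and signatures may differ. */
typedef void (*downmix_fn)(float **samples, float **matrix,
                           int out_ch, int in_ch, int len);

typedef struct DownmixCtx {
    int in_ch, out_ch;   /* configuration the cached function was chosen for */
    downmix_fn fn;       /* currently selected downmix implementation */
} DownmixCtx;

/* Generic fallback: mixes in_ch planar inputs into out_ch outputs in place.
 * Assumes matrix[out][in] layout and at most two output channels. */
static void downmix_generic(float **samples, float **matrix,
                            int out_ch, int in_ch, int len)
{
    assert(out_ch >= 1 && out_ch <= 2);
    for (int i = 0; i < len; i++) {
        float acc[2] = { 0.0f, 0.0f };
        for (int o = 0; o < out_ch; o++)
            for (int c = 0; c < in_ch; c++)
                acc[o] += samples[c][i] * matrix[o][c];
        for (int o = 0; o < out_ch; o++)
            samples[o][i] = acc[o];
    }
}

/* Specialised stereo-to-mono path, standing in for an optimized version. */
static void downmix_2_to_1(float **samples, float **matrix,
                           int out_ch, int in_ch, int len)
{
    (void)out_ch;
    (void)in_ch;
    for (int i = 0; i < len; i++)
        samples[0][i] = samples[0][i] * matrix[0][0] +
                        samples[1][i] * matrix[0][1];
}

/* Wrapper: re-select the implementation only when the channel counts change. */
static void downmix(DownmixCtx *c, float **samples, float **matrix,
                    int out_ch, int in_ch, int len)
{
    if (c->in_ch != in_ch || c->out_ch != out_ch) {
        c->in_ch  = in_ch;
        c->out_ch = out_ch;
        c->fn     = (in_ch == 2 && out_ch == 1) ? downmix_2_to_1
                                                : downmix_generic;
    }
    c->fn(samples, matrix, out_ch, in_ch, len);
}
```

A zero-initialised `DownmixCtx` forces selection on the first call, since no valid stream has zero input channels.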
Justin Ruggles
43717469f9
ac3dsp: Reverse matrix in/out order in downmix()
Also use (float **) instead of (float (*)[2]). This matches the matrix
layout in libavresample so we can reuse assembly code between the two.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
10 years ago
Hendrik Leppkes
8d1267932c
x86/h264_weight: use appropriate register size for weight parameters
This fixes decoding corruption on 64 bit windows.
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
Diego Biurrun
2caa93b813
mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
9 years ago
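The rationale behind this and the following stride commits can be sketched in C. On x86-64 a 32-bit `int` stride has to be widened (with its sign preserved) before it can be added to a 64-bit pointer, whereas a `ptrdiff_t` already has pointer width. `sum_column` below is a hypothetical helper written for this illustration, not a function from the codebase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums one column of a 2-D byte array.
 * stride is the distance in elements between consecutive rows and may be
 * negative (e.g. a bottom-up image). Because it is a ptrdiff_t, it can be
 * added to the pointer directly, with no manual sign extension. */
static int sum_column(const uint8_t *p, ptrdiff_t stride, int rows)
{
    int sum = 0;

    assert(rows >= 0);
    for (int i = 0; i < rows; i++) {
        sum += *p;
        if (i + 1 < rows)
            p += stride;   /* pointer-width arithmetic */
    }
    return sum;
}
```

With an `int` stride, hand-written 64-bit assembly would need an explicit sign-extending move before the same pointer addition.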
Diego Biurrun
e4a94d8b36
h264chroma: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
9 years ago
Diego Biurrun
2ec9fa5ec6
idct: Change type of array stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
9 years ago
Diego Biurrun
009adfd4fb
x86: fpel: Remove unnecessary sign extend
9 years ago
Anton Khirnov
de2ae3c1fa
lavc: add clobber tests for the new encoding/decoding API
9 years ago
Anton Khirnov
12004a9a7f
audiodsp/x86: yasmify vector_clipf_sse
9 years ago
Anton Khirnov
683da86aab
audiodsp: reorder arguments for vector_clipf
This will make the x86 asm simpler.
ARM conversion by Martin Storsjö <martin@martin.st> and Janne Grunau
<janne-libav@jannau.net>
9 years ago
Anton Khirnov
eea9857bfd
blockdsp: drop the high_bit_depth parameter
It has no effect, since the code is supposed to operate the same way for
any bit depth.
9 years ago
Anton Khirnov
75d98e30af
audiodsp/x86: clear the high bits of the order parameter on 64bit
Also change shl to add, since it can be faster on some CPUs.
CC: libav-stable@libav.org
9 years ago
Anton Khirnov
1d6c76e11f
audiodsp/x86: fix ff_vector_clip_int32_sse2
This version, which is the only one doing two processing cycles per loop
iteration, computes the load/store indices incorrectly for the second
cycle.
CC: libav-stable@libav.org
9 years ago
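As a rough scalar illustration of the kind of loop this fix concerns, the C sketch below processes two blocks per iteration and makes the offsets of the second block explicit. It is not the SSE2 code: the function names, the block size of four and the multiple-of-eight length requirement are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a single value to [lo, hi]. */
static int32_t clip_i32(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical scalar model of a loop unrolled to two "cycles" of four
 * elements each. len is assumed to be a multiple of 8. */
static void vector_clip_int32_x2(int32_t *dst, const int32_t *src,
                                 int32_t min, int32_t max, unsigned len)
{
    assert(len % 8 == 0);
    for (unsigned i = 0; i < len; i += 8) {
        /* first cycle: elements i .. i+3 */
        for (unsigned j = 0; j < 4; j++)
            dst[i + j] = clip_i32(src[i + j], min, max);
        /* second cycle: elements i+4 .. i+7; loads and stores must use
         * the same advanced offset */
        for (unsigned j = 0; j < 4; j++)
            dst[i + 4 + j] = clip_i32(src[i + 4 + j], min, max);
    }
}
```

Comparing such a scalar model against the optimized routine on inputs longer than one block is one way to expose index errors confined to the second cycle.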
Diego Biurrun
de452e5037
pixblockdsp: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
9 years ago
Diego Biurrun
721d57e608
vp56: Separate VP5 and VP6 dsp initialization
VP5 has no arch-specific optimizations (nor will it get any in the
future), so it makes no sense to try to share dsp init code with VP6.
9 years ago
Diego Biurrun
3fd22538bc
prores: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "linesize" everywhere.
9 years ago
Diego Biurrun
f81be06cf6
cavs: Change type of stride parameters to ptrdiff_t
ptrdiff_t is the correct type for array strides and similar.
9 years ago
Diego Biurrun
802727b538
vp8: Update some assembly comments left unchanged in bd66f073fe
9 years ago
Diego Biurrun
d9d26a3674
vp56: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
9 years ago
Diego Biurrun
6892df9294
vp3: Change type of stride parameters to ptrdiff_t
This avoids SIMD-optimized functions having to sign-extend their
stride argument manually to be able to do pointer arithmetic.
Also adjust parameter names to be "stride" everywhere.
9 years ago
Diego Biurrun
e2b9993558
simple_idct: x86: Drop disabled IDCT implementation
This gem has been disabled since 2001.
9 years ago
Ronald S. Bultje
9790b44a89
vp9mc/x86: sse2 MC assembly.
Also a slight change to the ssse3 code, which prevents a theoretical
overflow in the sharp filter.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
11 years ago
James Almer
67922b4ee4
vp9mc/x86: add AVX and AVX2 MC
Roughly 25% faster MC than ssse3 for blocksizes 32 and 64.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
11 years ago
Clément Bœsch
3cda179f18
vp9mc/x86: rename ff_* to ff_vp9_*
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
James Almer
8be8444d01
vp9mc/x86: rename ff_avg[48]_sse to ff_avg[48]_mmxext
pavgb is an sse integer instruction, so the mmxext flag is enough
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Clément Bœsch
6ab642d69d
vp9mc/x86: simplify a few inits.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Ronald S. Bultje
3a09494939
vp9mc/x86: add 16px functions (64bit only).
Signed-off-by: Anton Khirnov <anton@khirnov.net>
12 years ago
Anton Khirnov
89466de4ae
vp9/x86: rename vp9dsp to vp9mc
It only contains the MC SIMD, other SIMD will go into different files.
9 years ago
Christophe Gisquet
3c504bc359
x86: deduplicate some constants
Signed-off-by: Anton Khirnov <anton@khirnov.net>
11 years ago
Diego Biurrun
d06dfaa5cb
x86: huffyuv: Use EXTERNAL_SSSE3_FAST convenience macro where appropriate
10 years ago
Diego Biurrun
4efab89332
x86: Use *_FAST/*_SLOW CPU feature detection macros where appropriate
10 years ago
Diego Biurrun
0a39c9ac0b
x86: hpeldsp: Don't check for bitexact flag when initializing VP3-specific code
That code is only ever initialized with that flag set.
10 years ago
Diego Biurrun
95c1df929b
x86: hpeldsp: Drop unused function parameters
10 years ago
Diego Biurrun
c3e83ad3b7
x86: hpeldsp: Use EXTERNAL_SSE2_FAST where appropriate
10 years ago
Diego Biurrun
1dfc3cf89d
x86: hpeldsp: Split off VP3-specific bits into a separate file
10 years ago
James Almer
fca3c3b619
hevc: Add AVX2 DC IDCT
Originally written by Pierre Edouard Lepere <pierre-edouard.lepere@insa-rennes.fr>.
Integrated to Libav by Josh de Kock <josh@itanimul.li>.
Signed-off-by: Alexandra Hájková <alexandra@khirnov.net>
9 years ago
Clément Bœsch
4a081f224e
libavcodec: fix constness in clobber test avcodec_open2() wrappers
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
Anton Khirnov
9df889a5f1
h264: rename h264.[ch] to h264dec.[ch]
This is more consistent with the naming of other decoders.
9 years ago
Martin Storsjö
f1a9eee41c
x86: Add missing movsxd for the int stride parameter
Signed-off-by: Martin Storsjö <martin@martin.st>
9 years ago
Diego Biurrun
1e9c5bf4c1
asm: FF_-prefix internal macros used in inline assembly
These macros conflict with system macros on Solaris, producing
truckloads of warnings about macro redefinition.
9 years ago
Diego Biurrun
dc40a70c57
Drop unnecessary libavutil/x86/asm.h #includes
9 years ago
Diego Biurrun
a6a750c7ef
tests: Move all test programs to a subdirectory
9 years ago
Vittorio Giovara
41ed7ab45f
cosmetics: Fix spelling mistakes
Signed-off-by: Diego Biurrun <diego@biurrun.de>
9 years ago
Diego Biurrun
01621202aa
build: miscellaneous cosmetics
Restore alphabetical order in lists, break overly long lines, do some
prettyprinting, add some explanatory section comments, group parts
together that belong together logically.
10 years ago
Diego Biurrun
1a094af638
fft: Split MDCT bits off from FFT
10 years ago
Diego Biurrun
73ff983e8d
fft: x86: cosmetics: Drop silly comments, add comment, whitespace
10 years ago
Diego Biurrun
257b30af8e
x86: hevc: Fix linking with both yasm and optimizations disabled
Some optimized functions reference optimized symbols, so the functions
must be explicitly disabled when those symbols are unavailable.
10 years ago
Diego Biurrun
15a24614ae
build: Add vc1dsp component for more fine-grained dependencies
10 years ago
Luca Barbato
e280fe1329
v210: Use separate sample_factors
The 10bit and the 8bit functions can now be implemented to process
a different amount of samples.
While at it, simplify the code a little.
10 years ago
James Darnley
15ec7aa417
v210: Add avx2 version of the 10-bit line encoder
Around 25% faster than the ssse3 version.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
10 years ago