Diego Biurrun
994c4bc107
x86util: Port all macros to cpuflags
Also do some small cosmetic changes: Drop pointless _MMX suffix from ABSD2
macro name, drop pointless check for MMX support, we always assume MMX is
available in our SIMD code, fix spelling.
13 years ago
Michael Niedermayer
f59750641a
swscale: x86: Add some forgotten 12-bit planar YUV cases
Signed-off-by: Diego Biurrun <diego@biurrun.de>
13 years ago
Vittorio Giovara
41ed7ab45f
cosmetics: Fix spelling mistakes
Signed-off-by: Diego Biurrun <diego@biurrun.de>
9 years ago
Diego Biurrun
04581c8c77
x86: yasm: Use complete source path for macro helper %includes
This is more consistent with the way we handle C #includes and
it simplifies the build system.
13 years ago
Diego Biurrun
6860b4081d
x86: include x86inc.asm in x86util.asm
This is necessary to allow refactoring some x86util macros with cpuflags.
13 years ago
Henrik Gramner
729f90e268
x86inc improvements for 64-bit
Add support for all x86-64 registers
Prefer caller-saved register over callee-saved on WIN64
Support up to 15 function arguments
Also (by Ronald S. Bultje)
Fix up our asm to work with new x86inc.asm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
13 years ago
Ronald S. Bultje
45fdcc8e2d
swscale: convert hscale() to use named arguments.
13 years ago
Ronald S. Bultje
aba7a827aa
swscale: convert hscale to cpuflags().
13 years ago
Ronald S. Bultje
2254b559cb
swscale: make filterPos 32bit.
Fixes overflows for large image sizes.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
13 years ago
Ronald S. Bultje
3b15a6d742
config.asm: change %ifdef directives to %if directives.
This allows combining multiple conditionals in a single statement.
13 years ago
Ronald S. Bultje
6ea64339c5
swscale: split scale.asm.
scale.asm keeps horizontal scaling functions, whereas output.asm gets
the vertical scaling/output functions.
13 years ago
Diego Biurrun
52de07e1f1
swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions.
13 years ago
Ronald S. Bultje
8283f90a52
swscale: handle unaligned buffers in yuv2plane1
The issue had been introduced in
c435653627
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
13 years ago
Ronald S. Bultje
c435653627
swscale: write yuv2plane1 MMX/SSE2/SSE4/AVX functions.
13 years ago
Ronald S. Bultje
9e66b892e8
swscale: add missing colons to x86 assembly yuv2planeX.
This fixes assembling using "nasm".
13 years ago
Ronald S. Bultje
6cacecdca3
swscale: make yuv2yuvX_10_sse2/avx 8/9/16-bits aware.
Also implement MMX/MMX2 versions and SSE4 versions.
13 years ago
Kieran Kunhya
7fbbf95293
yuv2planeX10 SIMD
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
13 years ago
Ronald S. Bultje
6aa3cac6bf
swscale: use aligned move for storage into temporary buffer.
The intermediate buffer is always aligned.
13 years ago
Ronald S. Bultje
e0c3e07387
sws: implement MMX/SSE2/SSSE3/SSE4 versions for horizontal scaling.
Speed: from 3.9x to 9.6x speed improvement over C, and some small
(up to 15%) speed improvements over existing MMX code (particularly
for bigger filters).
13 years ago