Justin Ruggles
0e8fdd41c2
dsputil: use cpuflags in x86 emu_edge_core
avoids passing around the extra argument among all the macros it uses
13 years ago
Justin Ruggles
395f2e70dd
dsputil: use movups instead of movdqu in ff_emu_edge_core_sse()
This allows emulated_edge_mc_sse() and gmc_sse() to be used under
AV_CPU_FLAG_SSE.
13 years ago
Justin Ruggles
9d06037d48
twinvq: add SSE/AVX optimized sum/difference stereo interleaving
13 years ago
Justin Ruggles
b8f02f5b4e
dsputil: use cpuflags in x86 versions of vector_clip_int32()
13 years ago
Justin Ruggles
4e8e262476
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
13 years ago
Ronald S. Bultje
38e06c2969
Move clipd macros to x86util.asm.
This allows sharing them between multiple .asm files.
14 years ago
Dave Yeo
cc73511e8e
Fix NASM include directive
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
14 years ago
Ronald S. Bultje
3a39195b1d
Move x86inc.asm to libavutil/.
This allows using it in libswscale/ also.
14 years ago
Justin Ruggles
6054cd25b4
ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
14 years ago
Dave Yeo
d69f9a4234
Add support for a.out object format to assembler macros.
This format is still used by e.g. OS/2.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
14 years ago
Diego Biurrun
888fa31eca
Fix FSF address copy paste error in some license headers.
14 years ago
Justin Ruggles
e6e9823488
Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
14 years ago
Mans Rullgard
2912e87a6c
Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Ronald S. Bultje
17cf7c68ed
Fix ff_emu_edge_core_sse() on Win64.
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()
function calls on Win64.
14 years ago
Justin Ruggles
c73d99e672
Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.
Signed-off-by: Mans Rullgard <mans@mansr.com>
14 years ago
Ronald S. Bultje
81f2a3f4ff
Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
14 years ago
Jason Garrett-Glaser
2966cc1849
Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longoer necessary for PIC-compliant loads.
Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Alex Converse
3deb53849e
Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
758c7455f1
fix a crash in ape decoding on x86_32 sse2
Originally committed as revision 20777 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
a4605efdf5
slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe
Originally committed as revision 20743 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
b1159ad928
refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.
Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
b10fa1bb8b
port ape dsp functions from sse2 to mmx
now requires yasm
Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
b07781b6e4
fix linking on systems with a function name prefix (10l in r20287)
Originally committed as revision 20294 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
e17ccf60fe
huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
2f77923d72
simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv
Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
15 years ago
Loren Merritt
3daa434a40
ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.
Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Diego Biurrun
a6493a8fbd
Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.
Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Jason Garrett-Glaser
40c7d0ae3c
Add automatic prefix handling to yasm functions. Does nothing now, but will
be useful for porting x264 asm in the future.
Originally committed as revision 16234 to svn://svn.ffmpeg.org/ffmpeg/trunk
16 years ago
Loren Merritt
7ca7d5fae0
file which should have been added in r14749
Originally committed as revision 14751 to svn://svn.ffmpeg.org/ffmpeg/trunk
17 years ago