242 Commits (c17be817637ade85fb5d8138e8f402e67b51ac23)

Author SHA1 Message Date
  Luca Barbato 22b48b85b6 altivec support for snow 19 years ago
  Loren Merritt 5e8b787afa simplified and slightly faster h264_chroma_mc8_mmx 19 years ago
  Loren Merritt 513fbd8e5a prefetch pixels for future motion compensation. 2-5% faster h264. 19 years ago
  Loren Merritt 5e6a5c4daf 10l 19 years ago
  Loren Merritt fdd3057981 added mmx implementation of h264_chroma_mc2 19 years ago
  Robert Edele e8600e5edc add MMX and SSE versions of ff_snow_inner_add_yblock 19 years ago
  Robert Edele 2c9a0285d4 snow mmx+sse2 optimizations, part 4 19 years ago
  Robert Edele 4567b4bdab Add the mmx and sse2 implementations of ff_snow_vertical_compose(). 19 years ago
  Robert Edele 059715a41c First part of a series of speed-enhancing patches. 19 years ago
  Zuxy Meng 82eb4b0f1b 3DNow! & Extended 3DNow! versions of FFT 19 years ago
  Loren Merritt 548a1c8a35 h264_idct8_add_mmx 19 years ago
  Loren Merritt 6da971f160 h264_idct_add only needs mmx1 19 years ago
  Zuxy Meng 2ffb22d2ad use xorps instead of mulps to toggle the sign of a float, as suggested by Software Optimization Guide for AMD64 Processors. 19 years ago
  Loren Merritt d84f7c61ee gcc2.95 workaround 19 years ago
  Loren Merritt 7a5b2fa812 remove some useless instructions 19 years ago
  Loren Merritt 6a8eb0f45a 4% faster h264_qpel_mc 19 years ago
  Loren Merritt ef9d1d1575 h264: special case dc-only idct. ~1% faster overall 19 years ago
  Loren Merritt 4e295993ba 10l in 1.12 19 years ago
  Loren Merritt 6ee669732d 10l (x86_64) 20 years ago
  Loren Merritt e545f37527 18% faster put_h264_qpel16_mc[13]2_mmx2 20 years ago
  Loren Merritt c03ce51dfb 11% faster put_h264_qpel16_v_lowpass_mmx2 20 years ago
  Loren Merritt 0331f09237 15% faster put_h264_qpel16_hv_lowpass_mmx2 20 years ago
  Steve L'Homme 68b51e58ce MSVC-compatible __align8/__align16 declaration 20 years ago
  Diego Biurrun 5509bffa88 Update licensing information: The FSF changed postal address. 20 years ago
  Loren Merritt e8b562087d tweak h264_biweight 20 years ago
  Loren Merritt cec9395977 fix some potential arithmetic overflows in pred_direct_motion() and 20 years ago
  Diego Biurrun bb270c0896 COSMETICS: tabs --> spaces, some prettyprinting 20 years ago
  Diego Biurrun 115329f160 COSMETICS: Remove all trailing whitespace. 20 years ago
  Guillaume Poirier f6d1338cb5 Add the rest of missing Reg_* macros to support both AMD-64 style regs and IA32 regs. 20 years ago
  Loren Merritt ea15df8048 use sse16_sse2() in nsse 20 years ago
  Loren Merritt a6624e21cb faster h264_chroma_mc8_mmx, added h264_chroma_mc4_mmx. 20 years ago
  Loren Merritt b926572aa9 h264 mmx weighted prediction. up to 3% overall speedup. 20 years ago
  Loren Merritt 5693c08356 sse2 16x16 sum squared diff (306=>268 cycles on a K8) 20 years ago
  Michael Niedermayer 12e9668119 replace a few mov + psrlq with pshufw, there are more cases which could benefit from this but they would require us to duplicate some functions ... 20 years ago
  Reimar Döffinger cd7af76d9e Fix compile without CONFIG_GPL, misplaced #endif caused a missing }. 20 years ago
  Michael Niedermayer 9f211bc6d7 remove unused table entries 20 years ago
  Michael Niedermayer 84740d5980 xvids mmx&mmx2 idcts 20 years ago
  Måns Rullgård 79396ac685 Kill some compiler warnings. Compiled code verified identical after changes. 20 years ago
  Michael Niedermayer d3a9f79871 simplify (d&a) and (d&~a) calculation, hint by skal 20 years ago
  Michael Niedermayer b5b65df7a9 add consts (this was in my local tree, dunno where it came from, probably forgotten from some const patch) 20 years ago
  Måns Rullgård bf4e3bd2d0 kill a bunch of compiler warnings 20 years ago
  Alexander Strasser c11c2bc20b libavutil: Utility code from libavcodec moved to a separate library. 20 years ago
  Loren Merritt d2bb7db135 sort H.264 mmx dsp functions into their own file 20 years ago
  Michael Niedermayer c26ae41db2 adding a few const 20 years ago
  Michael Niedermayer 435b0720a8 100l for myself (breaking amd64) 20 years ago
  Michael Niedermayer 6510f43cf3 merge a few asm blocks so gcc can't unoptimize it (658->631 dezicycles on duron) 20 years ago
  Michael Niedermayer 987ae784e6 get rid of 2 movq (680 -> 658 dezicycles on duron) 20 years ago
  Michael Niedermayer e4b36d4434 avoid one transpose (730->680 dezicycles on duron) 20 years ago
  Loren Merritt 85bbfcd4ee 10l (symbol mangling) 20 years ago
  Michael Niedermayer 1f3dbc09b1 add rounding bias before the horizontal idct (765->730 dezicycles on duron) 20 years ago