FFmpeg

Commit Graph

Author	SHA1	Message	Date
Martin Storsjö	a76bf8cf12	arm: vp9itxfm: Optimize 16x16 and 32x32 idct dc by unrolling This work is sponsored by, and copyright, Google. Before: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub1_add_neon: 273.0 189.5 211.7 235.8 vp9_inv_dct_dct_32x32_sub1_add_neon: 752.0 459.2 862.2 553.9 After: vp9_inv_dct_dct_16x16_sub1_add_neon: 226.5 145.0 225.1 171.8 vp9_inv_dct_dct_32x32_sub1_add_neon: 721.2 415.7 727.6 475.0 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	388e0d2515	aarch64: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter No measured speedup on a Cortex A53, but other cores might benefit. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	fea92a4b57	arm: vp9mc: Calculate less unused data in the 4 pixel wide horizontal filter Before: Cortex A7 A8 A9 A53 vp9_put_8tap_smooth_4h_neon: 378.1 273.2 340.7 229.5 After: vp9_put_8tap_smooth_4h_neon: 352.1 222.2 290.5 229.5 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	5e0c2158fb	aarch64: vp9mc: Simplify the extmla macro parameters Fold the field lengths into the macro. This makes the macro invocations much more readable, when the lines are shorter. This also makes it easier to use only half the registers within the macro. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
James Almer	33ab1d4c6f	avformat/apetag: reorder some code to improve readability This way it's clear the size field accounts for the footer length plus every tag entry, but not the header. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
James Almer	84d874a680	avformat/apetag: account for header size if present when returning the start position The size field in the header/footer accounts for the entire APE tag structure except the 32 bytes from header, for compatibility with APEv1. Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
James Almer	e8d6fef316	avformat/apetag: fix flag value to signal footer presence According to the spec[1], a value of 0 means the footer is present and a value of 1 means it's absent, the exact opposite of header presence flag where 1 means present and 0 absent. The reason for this is compatibility with APEv1 tags, where there's no header, footer presence was mandatory for all files, and the flags field was a zeroed reserved field. [1] http://wiki.hydrogenaud.io/index.php?title=Ape_Tags_Flags Reviewed-by: Paul B Mahol <onemda@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>	9 years ago
Vittorio Giovara	53ea595eec	mov: Rework stsc index validation In order to avoid potential integer overflow change the comparison and make sure to use the same unsigned type for both elements.	9 years ago
Vittorio Giovara	ce6d72d107	imgutils: Document av_image_get_buffer_size()	9 years ago
Paul B Mahol	ba632efa93	avcodec/qdmc: silence gcc 6.2.0 warning Signed-off-by: Paul B Mahol <onemda@gmail.com>	9 years ago
Luca Barbato	b6093e8c72	hlsenc: Correctly write down all 16 bytes in hex Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	9 years ago
Carl Eugen Hoyos	74c576957a	lavf/movenc: Remove two unused variables.	9 years ago
Carl Eugen Hoyos	3ea9773793	lavc/mjpegenc_common: Remove an unused variable.	9 years ago
Matt Wolenetz	36aba43bd5	lavf/mov.c: Avoid heap allocation wraps in mov_read_{senc,saiz}() Core of patch is from paul@paulmehta.com Reference https://crbug.com/643952 (senc,saiz portions) Signed-off-by: Matt Wolenetz <wolenetz@chromium.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Matt Wolenetz	9bbdf5d921	lavf/mov.c: Avoid OOB in mov_read_udta_string() Core of patch is from paul@paulmehta.com Reference https://crbug.com/643952 (udta_string portion) Signed-off-by: Matt Wolenetz <wolenetz@chromium.org> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Martin Storsjö	bc25897630	utvideodec: Add a missing include This was missing from 77c23704c76, fixing building. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Michael Niedermayer	ce6e7a2db1	avcodec/mjpegenc: Simplify by moving assert into ff_mjpeg_encode_huffman_close() Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	3e1507a954	avcodec/mjpegenc: Bypass the 2 pass encoding when optimal tables are not requested This limits the bugs, speedloss and extra memory allocation to the case when optimal tables are needed. Fixes regressions with slice multi-threading Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	f57665b318	avcodec/mjpegenc: Revert some differences in ff_mjpeg_encode_mb() relative to pre optimal huffman The changes are not needed anymore and the return code was never used Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	b39129b68e	avcodec/mjpegenc_huffman: remove unneeded header include Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	d23af72a0c	avcodec/tests/mjpegenc_huffman: Remove static in main() table Avoids false positives when greping for non constant statics Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	daccbe81a2	avcodec/mjpegenc: Drop i_tex misuse, set itex/header bits correctly, fix 2pass encoding Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Michael Niedermayer	e10bd12c25	avcodec/mjpegenc: Remove non functional huffman reallocation and error handling If this is wanted I am not against it but it must be designed to work with all cases like slice threads, and a single growing buffer does not work very well with slices. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	9 years ago
Timo Rothenpieler	a52976c0fe	nvenc: make gpu indices independent of supported capabilities Do not allocate a CUDA context for every available gpu. Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	9 years ago
Derek Buitenhuis	77c23704c7	avcodec: Mark some codecs with threadsafe init as such Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	9 years ago
Martin Storsjö	0c0b87f12d	aarch64: vp9itxfm: Fix incorrect vertical alignment Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	8476eb0d3a	aarch64: vp9itxfm: Update a comment to refer to a register with a different name Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	3dd7827258	aarch64: vp9itxfm: Use the right lane sizes in 8x8 for improved readability Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	ed8d293306	aarch64: vp9itxfm: Use a single lane ld1 instead of ld1r where possible The ld1r is a leftover from the arm version, where this trick is beneficial on some cores. Use a single-lane load where we don't need the semantics of ld1r. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	4da4b2b87f	aarch64: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	3933b86bb9	arm: vp9itxfm: Share instructions for loading idct coeffs in the 8x8 function Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	a63da4511d	aarch64: vp9itxfm: Do separate functions for half/quarter idct16 and idct32 This work is sponsored by, and copyright, Google. This avoids loading and calculating coefficients that we know will be zero, and avoids filling the temp buffer with zeros in places where we know the second pass won't read. This gives a pretty substantial speedup for the smaller subpartitions. The code size increases from 14740 bytes to 24292 bytes. The idct16/32_end macros are moved above the individual functions; the instructions themselves are unchanged, but since new functions are added at the same place where the code is moved from, the diff looks rather messy. Before: vp9_inv_dct_dct_16x16_sub1_add_neon: 236.7 vp9_inv_dct_dct_16x16_sub2_add_neon: 1051.0 vp9_inv_dct_dct_16x16_sub4_add_neon: 1051.0 vp9_inv_dct_dct_16x16_sub8_add_neon: 1051.0 vp9_inv_dct_dct_16x16_sub12_add_neon: 1387.4 vp9_inv_dct_dct_16x16_sub16_add_neon: 1387.6 vp9_inv_dct_dct_32x32_sub1_add_neon: 554.1 vp9_inv_dct_dct_32x32_sub2_add_neon: 5198.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 5198.6 vp9_inv_dct_dct_32x32_sub8_add_neon: 5196.3 vp9_inv_dct_dct_32x32_sub12_add_neon: 6183.4 vp9_inv_dct_dct_32x32_sub16_add_neon: 6174.3 vp9_inv_dct_dct_32x32_sub20_add_neon: 7151.4 vp9_inv_dct_dct_32x32_sub24_add_neon: 7145.3 vp9_inv_dct_dct_32x32_sub28_add_neon: 8119.3 vp9_inv_dct_dct_32x32_sub32_add_neon: 8118.7 After: vp9_inv_dct_dct_16x16_sub1_add_neon: 236.7 vp9_inv_dct_dct_16x16_sub2_add_neon: 640.8 vp9_inv_dct_dct_16x16_sub4_add_neon: 639.0 vp9_inv_dct_dct_16x16_sub8_add_neon: 842.0 vp9_inv_dct_dct_16x16_sub12_add_neon: 1388.3 vp9_inv_dct_dct_16x16_sub16_add_neon: 1389.3 vp9_inv_dct_dct_32x32_sub1_add_neon: 554.1 vp9_inv_dct_dct_32x32_sub2_add_neon: 3685.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 3685.1 vp9_inv_dct_dct_32x32_sub8_add_neon: 3684.4 vp9_inv_dct_dct_32x32_sub12_add_neon: 5312.2 vp9_inv_dct_dct_32x32_sub16_add_neon: 5315.4 vp9_inv_dct_dct_32x32_sub20_add_neon: 7154.9 vp9_inv_dct_dct_32x32_sub24_add_neon: 7154.5 vp9_inv_dct_dct_32x32_sub28_add_neon: 8126.6 vp9_inv_dct_dct_32x32_sub32_add_neon: 8127.2 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	5eb5aec475	arm: vp9itxfm: Do a simpler half/quarter idct16/idct32 when possible This work is sponsored by, and copyright, Google. This avoids loading and calculating coefficients that we know will be zero, and avoids filling the temp buffer with zeros in places where we know the second pass won't read. This gives a pretty substantial speedup for the smaller subpartitions. The code size increases from 12388 bytes to 19784 bytes. The idct16/32_end macros are moved above the individual functions; the instructions themselves are unchanged, but since new functions are added at the same place where the code is moved from, the diff looks rather messy. Before: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub1_add_neon: 273.0 189.5 212.0 235.8 vp9_inv_dct_dct_16x16_sub2_add_neon: 2102.1 1521.7 1736.2 1265.8 vp9_inv_dct_dct_16x16_sub4_add_neon: 2104.5 1533.0 1736.6 1265.5 vp9_inv_dct_dct_16x16_sub8_add_neon: 2484.8 1828.7 2014.4 1506.5 vp9_inv_dct_dct_16x16_sub12_add_neon: 2851.2 2117.8 2294.8 1753.2 vp9_inv_dct_dct_16x16_sub16_add_neon: 3239.4 2408.3 2543.5 1994.9 vp9_inv_dct_dct_32x32_sub1_add_neon: 758.3 456.7 864.5 553.9 vp9_inv_dct_dct_32x32_sub2_add_neon: 10776.7 7949.8 8567.7 6819.7 vp9_inv_dct_dct_32x32_sub4_add_neon: 10865.6 8131.5 8589.6 6816.3 vp9_inv_dct_dct_32x32_sub8_add_neon: 12053.9 9271.3 9387.7 7564.0 vp9_inv_dct_dct_32x32_sub12_add_neon: 13328.3 10463.2 10217.0 8321.3 vp9_inv_dct_dct_32x32_sub16_add_neon: 14176.4 11509.5 11018.7 9062.3 vp9_inv_dct_dct_32x32_sub20_add_neon: 15301.5 12999.9 11855.1 9828.2 vp9_inv_dct_dct_32x32_sub24_add_neon: 16482.7 14931.5 12650.1 10575.0 vp9_inv_dct_dct_32x32_sub28_add_neon: 17589.5 15811.9 13482.8 11333.4 vp9_inv_dct_dct_32x32_sub32_add_neon: 18696.2 17049.2 14355.6 12089.7 After: vp9_inv_dct_dct_16x16_sub1_add_neon: 273.0 189.5 211.7 235.8 vp9_inv_dct_dct_16x16_sub2_add_neon: 1203.5 998.2 1035.3 763.0 vp9_inv_dct_dct_16x16_sub4_add_neon: 1203.5 998.1 1035.5 760.8 vp9_inv_dct_dct_16x16_sub8_add_neon: 1926.1 1610.6 1722.1 1271.7 vp9_inv_dct_dct_16x16_sub12_add_neon: 2873.2 2129.7 2285.1 1757.3 vp9_inv_dct_dct_16x16_sub16_add_neon: 3221.4 2520.3 2557.6 2002.1 vp9_inv_dct_dct_32x32_sub1_add_neon: 753.0 457.5 866.6 554.6 vp9_inv_dct_dct_32x32_sub2_add_neon: 7554.6 5652.4 6048.4 4920.2 vp9_inv_dct_dct_32x32_sub4_add_neon: 7549.9 5685.0 6046.9 4925.7 vp9_inv_dct_dct_32x32_sub8_add_neon: 8336.9 6704.5 6604.0 5478.0 vp9_inv_dct_dct_32x32_sub12_add_neon: 10914.0 9777.2 9240.4 7416.9 vp9_inv_dct_dct_32x32_sub16_add_neon: 11859.2 11223.3 9966.3 8095.1 vp9_inv_dct_dct_32x32_sub20_add_neon: 15237.1 13029.4 11838.3 9829.4 vp9_inv_dct_dct_32x32_sub24_add_neon: 16293.2 14379.8 12644.9 10572.0 vp9_inv_dct_dct_32x32_sub28_add_neon: 17424.3 15734.7 13473.0 11326.9 vp9_inv_dct_dct_32x32_sub32_add_neon: 18531.3 17457.0 14298.6 12080.0 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	79d332ebbd	aarch64: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function This allows reusing the macro for a separate implementation of the pass2 function. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	47b3c2c18d	arm: vp9itxfm: Move the load_add_store macro out from the itxfm16 pass2 function This allows reusing the macro for a separate implementation of the pass2 function. Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	115476018d	aarch64: vp9itxfm: Make the larger core transforms standalone functions This work is sponsored by, and copyright, Google. This reduces the code size of libavcodec/aarch64/vp9itxfm_neon.o from 19496 to 14740 bytes. This gives a small slowdown of a couple of tens of cycles, but makes it more feasible to add more optimized versions of these transforms. Before: vp9_inv_dct_dct_16x16_sub4_add_neon: 1036.7 vp9_inv_dct_dct_16x16_sub16_add_neon: 1372.2 vp9_inv_dct_dct_32x32_sub4_add_neon: 5180.0 vp9_inv_dct_dct_32x32_sub32_add_neon: 8095.7 After: vp9_inv_dct_dct_16x16_sub4_add_neon: 1051.0 vp9_inv_dct_dct_16x16_sub16_add_neon: 1390.1 vp9_inv_dct_dct_32x32_sub4_add_neon: 5199.9 vp9_inv_dct_dct_32x32_sub32_add_neon: 8125.8 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Martin Storsjö	0331c3f5e8	arm: vp9itxfm: Make the larger core transforms standalone functions This work is sponsored by, and copyright, Google. This reduces the code size of libavcodec/arm/vp9itxfm_neon.o from 15324 to 12388 bytes. This gives a small slowdown of a couple tens of cycles, up to around 150 cycles for the full case of the largest transform, but makes it more feasible to add more optimized versions of these transforms. Before: Cortex A7 A8 A9 A53 vp9_inv_dct_dct_16x16_sub4_add_neon: 2063.4 1516.0 1719.5 1245.1 vp9_inv_dct_dct_16x16_sub16_add_neon: 3279.3 2454.5 2525.2 1982.3 vp9_inv_dct_dct_32x32_sub4_add_neon: 10750.0 7955.4 8525.6 6754.2 vp9_inv_dct_dct_32x32_sub32_add_neon: 18574.0 17108.4 14216.7 12010.2 After: vp9_inv_dct_dct_16x16_sub4_add_neon: 2060.8 1608.5 1735.7 1262.0 vp9_inv_dct_dct_16x16_sub16_add_neon: 3211.2 2443.5 2546.1 1999.5 vp9_inv_dct_dct_32x32_sub4_add_neon: 10682.0 8043.8 8581.3 6810.1 vp9_inv_dct_dct_32x32_sub32_add_neon: 18522.4 17277.4 14286.7 12087.9 Signed-off-by: Martin Storsjö <martin@martin.st>	9 years ago
Rostislav Pehlivanov	53234b9ba5	tests/mjpegenc_huffman: align static tables Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	9 years ago
Rostislav Pehlivanov	a70f0927ea	mjpegenc: use s->avctx as a context for av_log rather than NULL Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	9 years ago
Rostislav Pehlivanov	20614e868b	tests/mjpegenc_huffman: replace assert() with av_assert0() Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	9 years ago
Rostislav Pehlivanov	d164ef6589	mjpegenc_common: add missing ff_ prefix to init_uni_ac_vlc Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>	9 years ago
Marton Balint	3aae1eff12	ffplay: change keyboard volume control to logarithmic The command line parameter remains linear. Signed-off-by: Marton Balint <cus@passwd.hu>	9 years ago
Diego Biurrun	c546147db0	configure: Correctly recurse in do_check_deps() Fixes all sorts of configuration problems introduced by `dad7a9c7c0` on non-Linux or non-vanilla configs. Also removes a line made redundant in that commit.	9 years ago
Mark Thompson	d1acab8293	vaapi_encode: Add VP8 support Fixes ticket #6116. (cherry picked from commit `ca62236a89`)	9 years ago
Mark Thompson	be6546a4ff	vaapi_encode: Pass framerate parameters to driver Only do this when building for a recent VAAPI version - initial driver implementations were confused about the interpretation of the framerate field, but hopefully this will be consistent everywhere once 0.40.0 is released. (cherry picked from commit `ff35aa8ca4`)	9 years ago
Mark Thompson	2201c02e6d	vaapi_h264: Enable VBR mode Default to using VBR when a target bitrate is set, unless the max rate is also set and matches the target. Changes to the Intel driver mean that min_qp is also respected in this case, so set a codec default to unset the value rather than using the current default inherited from the MPEG-4 part 2 encoder. (cherry picked from commit `eddfb57210`)	9 years ago
Mark Thompson	ceb28c3cc4	vaapi_encode: Support VBR mode This includes a backward-compatibility hack to choose CBR anyway on old drivers which have no CBR support, so that existing programs will continue to work their options now map to VBR. (cherry picked from commit `f033ba470f`)	9 years ago
Mark Thompson	3b95c7c17d	vaapi_encode: Add MPEG-2 support (cherry picked from commit `ca6ae3b77a`)	9 years ago
Mark Thompson	eefa4b76ee	vaapi_h264: Scale log2_max_pic_order_cnt_lsb with max_b_frames Before this change, it was possible to overflow pic_order_cnt_lsb and generate a stream with invalid POC numbering. This makes sure that the field is large enough that a single IDR B* P sequence uses fewer than half the available POC lsb values. (cherry picked from commit `89725a8512`)	9 years ago
Mark Thompson	c667c0979c	vaapi_encode: Support forcing IDR frames via AVFrame.pict_type (cherry picked from commit `a3c3a5eac2`)	9 years ago
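
The three avformat/apetag commits above (33ab1d4c6f, 84d874a680, e8d6fef316) describe two rules: the footer flag is inverted relative to the header flag, and the size field covers the footer plus every tag entry but not the 32-byte header. A minimal C sketch of those rules follows. It is not the code from those commits: the `EX_`/`ex_` names are hypothetical, and the bit positions (31 for header presence, 30 for footer absence) are an assumption taken from the APE tag flags page the commit message links to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, not FFmpeg's. Bit positions are assumed from the
 * APE tag flags spec referenced in commit e8d6fef316. */
#define EX_APE_FLAG_HAS_HEADER   (UINT32_C(1) << 31) /* 1 = header present */
#define EX_APE_FLAG_LACKS_FOOTER (UINT32_C(1) << 30) /* 1 = footer ABSENT  */
#define EX_APE_HEADER_BYTES      32                  /* per commit 84d874a680 */

/* Header presence uses the usual polarity: bit set means present. */
static bool ex_ape_has_header(uint32_t flags)
{
    return (flags & EX_APE_FLAG_HAS_HEADER) != 0;
}

/* Footer presence is inverted: a cleared bit means the footer exists,
 * which keeps APEv1 tags (flags field all zero) valid. */
static bool ex_ape_has_footer(uint32_t flags)
{
    return (flags & EX_APE_FLAG_LACKS_FOOTER) == 0;
}

/* The size field counts the footer plus all tag entries but never the
 * header, so the header length is added only when the flag says it exists. */
static uint64_t ex_ape_total_bytes(uint32_t size_field, uint32_t flags)
{
    uint64_t total = size_field;
    if (ex_ape_has_header(flags))
        total += EX_APE_HEADER_BYTES;
    return total;
}
```

With a zeroed flags field (the APEv1 case), `ex_ape_has_footer` returns true and `ex_ape_total_bytes` returns the size field unchanged. Subtracting that total from the end of the tag gives the start position that 84d874a680 is concerned with.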

92870 Commits (f7745edeaaeeb3f4f9cc0a4f545d538cd86fa418)