=============================================
SNOW Video Codec Specification Draft 20070103
=============================================

Definitions:
============

MUST    the specific part must be done to conform to this standard
SHOULD  it is recommended to be done that way, but not strictly required

ilog2(x) is the rounded down logarithm of x to base 2
ilog2(0) = 0
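
The ilog2 definition above, including the special case for 0, can be sketched in C as follows. This is only an illustrative helper, not code taken from a reference implementation; the unsigned argument type is an assumption.

```c
#include <stdint.h>

/* Rounded-down base-2 logarithm as defined above.
 * ilog2(0) is defined to be 0, which the loop yields naturally. */
static int ilog2(uint32_t x)
{
    int n = 0;
    while (x > 1) {
        x >>= 1;   /* halve x until only the top bit position is left */
        n++;
    }
    return n;
}
```

For instance, ilog2(1023) is 9 and ilog2(1024) is 10.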

Type definitions:
=================

b   1-bit range coded
u   unsigned scalar value range coded
s   signed scalar value range coded


Bitstream syntax:
=================

frame:
    header
    prediction
    residual

header:
    keyframe                            b   MID_STATE
    if(keyframe || always_reset)
        reset_contexts
    if(keyframe){
        version                         u   header_state
        always_reset                    b   header_state
        temporal_decomposition_type     u   header_state
        temporal_decomposition_count    u   header_state
        spatial_decomposition_count     u   header_state
        colorspace_type                 u   header_state
        chroma_h_shift                  u   header_state
        chroma_v_shift                  u   header_state
        spatial_scalability             b   header_state
        max_ref_frames-1                u   header_state
        qlogs
    }

    spatial_decomposition_type          s   header_state
    qlog                                s   header_state
    mv_scale                            s   header_state
    qbias                               s   header_state
    block_max_depth                     s   header_state

qlogs:
    for(plane=0; plane<2; plane++){
        quant_table[plane][0][0]            s   header_state
        for(level=0; level < spatial_decomposition_count; level++){
            quant_table[plane][level][1]    s   header_state
            quant_table[plane][level][3]    s   header_state
        }
    }

reset_contexts
    *_state[*]= MID_STATE

prediction:
    for(y=0; y<block_count_vertical; y++)
        for(x=0; x<block_count_horizontal; x++)
            block(0)

block(level):
    if(keyframe){
        intra=1
        y_diff=cb_diff=cr_diff=0
    }else{
        if(level!=block_max_depth){
            s_context= 2*left->level + 2*top->level + topleft->level + topright->level
            leaf                        b   block_state[4 + s_context]
        }
        if(level==block_max_depth || leaf){
            intra                       b   block_state[1 + left->intra + top->intra]
            if(intra){
                y_diff                  s   block_state[32]
                cb_diff                 s   block_state[64]
                cr_diff                 s   block_state[96]
            }else{
                ref_context= ilog2(2*left->ref) + ilog2(2*top->ref)
                if(ref_frames > 1)
                    ref                 u   block_state[128 + 1024 + 32*ref_context]
                mx_context= ilog2(2*abs(left->mx - top->mx))
                my_context= ilog2(2*abs(left->my - top->my))
                mvx_diff                s   block_state[128 + 32*(mx_context + 16*!!ref)]
                mvy_diff                s   block_state[128 + 32*(my_context + 16*!!ref)]
            }
        }else{
            block(level+1)
            block(level+1)
            block(level+1)
            block(level+1)
        }
    }
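
The context selection in block(level) amounts to computing an index into block_state[] from the neighboring blocks. The C sketch below only illustrates that arithmetic; the BlockInfo struct and the function names are hypothetical and not part of the bitstream syntax, and the range decoder itself is left out.

```c
#include <stdlib.h>

/* Hypothetical per-block record; field names follow the syntax above. */
typedef struct {
    int level, intra, ref, mx, my;
} BlockInfo;

/* Rounded-down base-2 logarithm with ilog2(0) = 0, as in the Definitions. */
static int ilog2(unsigned x)
{
    int n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

/* State index used when coding the "leaf" flag. */
static int leaf_state_index(const BlockInfo *left, const BlockInfo *top,
                            const BlockInfo *topleft, const BlockInfo *topright)
{
    int s_context = 2*left->level + 2*top->level
                  + topleft->level + topright->level;
    return 4 + s_context;
}

/* State index used when coding the "intra" flag. */
static int intra_state_index(const BlockInfo *left, const BlockInfo *top)
{
    return 1 + left->intra + top->intra;
}

/* Base state index used when coding "ref" (only coded if ref_frames > 1). */
static int ref_state_index(const BlockInfo *left, const BlockInfo *top)
{
    int ref_context = ilog2(2*left->ref) + ilog2(2*top->ref);
    return 128 + 1024 + 32*ref_context;
}

/* Base state index used when coding mvx_diff; ref is the block's own ref. */
static int mvx_state_index(const BlockInfo *left, const BlockInfo *top, int ref)
{
    int mx_context = ilog2(2*abs(left->mx - top->mx));
    return 128 + 32*(mx_context + 16*!!ref);
}

/* Base state index used when coding mvy_diff; ref is the block's own ref. */
static int mvy_state_index(const BlockInfo *left, const BlockInfo *top, int ref)
{
    int my_context = ilog2(2*abs(left->my - top->my));
    return 128 + 32*(my_context + 16*!!ref);
}
```

Note that the motion vector contexts depend on ref only through !!ref, so every nonzero reference index shares one set of states.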

residual:
    FIXME


Tag description:
----------------

version
    0
    this MUST NOT change within a bitstream

always_reset
    if 1 then the range coder contexts will be reset after each frame

temporal_decomposition_type
    0

temporal_decomposition_count
    0

spatial_decomposition_count
    FIXME

colorspace_type
    0
    this MUST NOT change within a bitstream

chroma_h_shift
    log2(luma.width / chroma.width)
    this MUST NOT change within a bitstream

chroma_v_shift
    log2(luma.height / chroma.height)
    this MUST NOT change within a bitstream

spatial_scalability
    0

max_ref_frames
    maximum number of reference frames
    this MUST NOT change within a bitstream

ref_frames
    minimum of the number of available reference frames and max_ref_frames
    for example the first frame after a key frame always has ref_frames=1

spatial_decomposition_type
    wavelet type
    0 is a 9/7 symmetric compact integer wavelet
    1 is a 5/3 symmetric compact integer wavelet
    others are reserved
    stored as delta from last, last is reset to 0 if always_reset || keyframe

qlog
    quality (logarithmic quantizer scale)
    stored as delta from last, last is reset to 0 if always_reset || keyframe

mv_scale
    stored as delta from last, last is reset to 0 if always_reset || keyframe
    FIXME check that everything works fine if this changes between frames

qbias
    dequantization bias
    stored as delta from last, last is reset to 0 if always_reset || keyframe

block_max_depth
    maximum depth of the block tree
    stored as delta from last, last is reset to 0 if always_reset || keyframe

quant_table
    quantization table
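
Several tags above are "stored as delta from last". A minimal C sketch of how a decoder might maintain those running values is given here. The FrameParams struct and apply_header_deltas() are hypothetical names, and the order of operations (reset first, then add the decoded delta) is an assumption read from the wording above.

```c
/* Hypothetical container for the five delta-coded per-frame header fields. */
typedef struct {
    int spatial_decomposition_type;
    int qlog;
    int mv_scale;
    int qbias;
    int block_max_depth;
} FrameParams;

/* Add the decoded deltas to the running values in *last.
 * The running values restart from 0 on a keyframe or if always_reset is set. */
static void apply_header_deltas(FrameParams *last, const FrameParams *delta,
                                int keyframe, int always_reset)
{
    if (keyframe || always_reset) {
        const FrameParams zero = {0, 0, 0, 0, 0};
        *last = zero;
    }
    last->spatial_decomposition_type += delta->spatial_decomposition_type;
    last->qlog                       += delta->qlog;
    last->mv_scale                   += delta->mv_scale;
    last->qbias                      += delta->qbias;
    last->block_max_depth            += delta->block_max_depth;
}
```

Under this reading, the values coded in a keyframe header are effectively absolute, since they are added to 0.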


Range Coder:
============
FIXME

Neighboring Blocks:
===================
left and top are set to the respective blocks unless they are outside of
the image, in which case they are set to the Null block

top-left is set to the top left block unless it is outside of the image, in
which case it is set to the left block

if this block has no larger parent block or it is at the left side of its
parent block and the top right block is not outside of the image then the
top right block is used for top-right else the top-left block is used

Null block
y,cb,cr are 128
level, ref, mx and my are 0
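
The neighbor rules above can be sketched in C as follows. This is an illustration under simplifying assumptions, not a normative procedure: blocks are assumed to be stored in a row-major grid at the resolution of the current tree level, "at the left side of its parent block" is taken to mean an even x coordinate in that grid, and the Block and Neighbors types and find_neighbors() are hypothetical names.

```c
/* Hypothetical block record; field names follow the Null block description. */
typedef struct {
    int y, cb, cr, level, ref, mx, my;
} Block;

/* The Null block: y, cb, cr are 128; level, ref, mx and my are 0. */
static const Block null_block = {128, 128, 128, 0, 0, 0, 0};

typedef struct {
    const Block *left, *top, *topleft, *topright;
} Neighbors;

/* blocks: row-major grid with w columns; (x, y) is the current block.
 * level == 0 means the block has no larger parent block. */
static Neighbors find_neighbors(const Block *blocks, int w,
                                int x, int y, int level)
{
    Neighbors n;
    const Block *cur = &blocks[y*w + x];

    /* left and top fall back to the Null block outside the image */
    n.left = x > 0 ? cur - 1 : &null_block;
    n.top  = y > 0 ? cur - w : &null_block;

    /* top-left falls back to the left block outside the image */
    n.topleft = (x > 0 && y > 0) ? cur - w - 1 : n.left;

    /* top-right needs to be inside the image, and the block must either
     * have no parent or sit at the left side of its parent */
    int tr_inside  = y > 0 && x + 1 < w;
    int tr_allowed = level == 0 || (x & 1) == 0;
    n.topright = (tr_inside && tr_allowed) ? cur - w + 1 : n.topleft;

    return n;
}
```

For the block in the top-left corner of the image all four neighbors end up being the Null block, because top-right falls back to top-left, which in turn falls back to left.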

Motion Vector Prediction:
=========================
1. the motion vectors of all the neighboring blocks are scaled to
compensate for the difference of reference frames

scaled_mv= (mv * (256 * (current_reference+1) / (mv.reference+1)) + 128)>>8

2. the median of the scaled left, top and top-right vectors is used as
motion vector prediction

3. the motion vector actually used is the sum of the predictor and
   (mvx_diff, mvy_diff)*mv_scale
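
The three steps above can be sketched in C as follows. The MotionVector struct and the function names are hypothetical, the median is assumed to be taken per component, and the sketch assumes that >> on a negative int is an arithmetic shift (true for common compilers, but implementation-defined in C).

```c
/* Hypothetical motion vector record: components plus reference index. */
typedef struct {
    int mx, my, ref;
} MotionVector;

/* Step 1: scale one component of a neighbor's vector to the current
 * reference frame, using the scaled_mv formula above.
 * Assumes arithmetic right shift for negative values. */
static int scale_component(int v, int current_ref, int neighbor_ref)
{
    int factor = 256 * (current_ref + 1) / (neighbor_ref + 1);
    return (v * factor + 128) >> 8;
}

/* Median of three integers. */
static int median3(int a, int b, int c)
{
    if (a > b) { int t = a; a = b; b = t; }   /* now a <= b */
    if (b > c) b = c;                         /* b = min(b, c) */
    return a > b ? a : b;                     /* max(a, min(b, c)) */
}

/* Step 2: per-component median of the scaled left, top and top-right. */
static MotionVector predict_mv(MotionVector left, MotionVector top,
                               MotionVector topright, int current_ref)
{
    MotionVector p;
    p.mx = median3(scale_component(left.mx,     current_ref, left.ref),
                   scale_component(top.mx,      current_ref, top.ref),
                   scale_component(topright.mx, current_ref, topright.ref));
    p.my = median3(scale_component(left.my,     current_ref, left.ref),
                   scale_component(top.my,      current_ref, top.ref),
                   scale_component(topright.my, current_ref, topright.ref));
    p.ref = current_ref;
    return p;
}

/* Step 3: add the decoded differences, multiplied by mv_scale. */
static MotionVector reconstruct_mv(MotionVector pred, int mvx_diff,
                                   int mvy_diff, int mv_scale)
{
    pred.mx += mvx_diff * mv_scale;
    pred.my += mvy_diff * mv_scale;
    return pred;
}
```

As an example, a left neighbor with vector (4, 8) that points one reference frame further back than the current block (mv.reference = 1, current_reference = 0) is scaled by 128/256 to (2, 4) before the median is taken.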

Intra DC Prediction:
====================
the luma and chroma values of the left block are used as predictors

the luma and chroma values actually used are the sum of the predictor and
y_diff, cb_diff, cr_diff respectively
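
A minimal C sketch of this prediction is given here. BlockColor and predict_intra_dc() are hypothetical names; at the left image border the left block would be the Null block with y, cb and cr equal to 128.

```c
/* Hypothetical record for the DC color of an intra block. */
typedef struct {
    int y, cb, cr;
} BlockColor;

/* The left block's values are the predictors; the decoded differences
 * are added to them component by component. */
static BlockColor predict_intra_dc(BlockColor left,
                                   int y_diff, int cb_diff, int cr_diff)
{
    BlockColor c;
    c.y  = left.y  + y_diff;
    c.cb = left.cb + cb_diff;
    c.cr = left.cr + cr_diff;
    return c;
}
```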

Motion Compensation:
====================
FIXME

LL band prediction:
===================
FIXME

Dequantization:
===============
FIXME

Wavelet Transform:
==================
FIXME

TODO:
=====
Important:
finetune initial contexts
spatial_decomposition_count per frame?
flip wavelet?
try to use the wavelet transformed predicted image (motion compensated image) as context for coding the residual coefficients
try the MV length as context for coding the residual coefficients
use extradata for stuff which is in the keyframes now?
the MV median predictor is patented IIRC

Not Important:
spatial_scalability b vs u (!= 0 breaks syntax anyway so we can add a u later)

Credits:
========
Michael Niedermayer
Loren Merritt

Copyright:
==========
GPL + GFDL + whatever is needed to make this an RFC