Overview

This project took place over Summer 2023 as part of the Google Summer of Code (GSoC). This project's objective was to implement the in-loop filter portion of the new Versatile Video Codec (VVC, H.266) decoder currently under development in FFmpeg. The In-Loop filters that I chose to implement are the Sample Adaptive Offset (SAO) and Deblocking filters. These filters are already implemented in C, and my objective was to implement x86_64 assembly versions that were more performant than the C "template" version.

The need for SIMD-accelereated code for this new decoder is clear - even with SIMD, software decoding of the VVC requires lots of computational power, and compilers are currently unable to optimize code to fully utilize the SIMD aspects of modern processors.

SAO Filter

SAO is an in-loop filter used in both HEVC and VVC to improve the objective quality of the reconstructed picture by supressing banding artifacts and ringing artifacts caused by quantization errors of high frequency components. The SAO filter can be split into two parts - a band filter and an edge filter.

As part of a "qualification task" to join FFmpeg's GSoC 2023 program, I had to bring over the SAO filter from HEVC to VVC. This is possible as the filter is largely unchanged between the two. During this period before GSoC, I was largely able to bring forward the band filter. However, the edge filter alluded me, and I was not able to complete that.

Thus, my first GSoC task was to complete the edge filter implementation. This was largely done by the middle of June, and my code was merged in this PR.

Deblocking Filter

As with the SAO filter, the deblocking filter can been split into two parts: the luma deblocking filter and the chroma deblocking filter. I chose to start with the chroma deblocking filter. The deblocking filters also shared commonality with the HEVC version to some degree - which I intended to port. I also tried writing the "stronger" version of these filters, but I was ultimately confined to only working on the parts that were already implemented in HEVC.

I chose to start with the chroma deblocking filter, as that had a checkasm test that would be easily modified to work with the new filter. This would guide me in working on the chroma deblocking filter. While the checkasm test did prove helpful, it also gave me a lot of issues!

The chroma deblocking filter took until early August to complete - mainly because I did not catch some pixel boundaries and I was getting lots of segfaults that I was not able to pin down. I then tried writing a version that respected theseboundries, but found myself stuck. However, I decided to benchmark the version I had that worked, and the results came back much more performant than I had expected, so my work will be submitted as such.

This left me with not a lot of time to work on the luma deblocking filter. I had honestly wished that I can say that the luma deblocker was complete to some level and in my final GSoC submission, but this is not to be. Complicating matters was the lease on the place I was renting ended on the 24th, leaving me very little time between moving to work on the filter. I had identified where the HEVC and VVC filters differed (mainly in accessing and calculating tc, and using tc to clip pixels to a limit), but YUView still shows lots of pixels where off-by-one errors on luma pixels occured.

Current State

  • AVX2 SAO Code: Merged

  • AVX2 Chroma Deblocker: Currently blocked by checkasm throwing an error on the GitHub Actions check. I am not sure what that is about, as on my system, checkasm reports all functions passses. Furthermore, all conformance checks pass, so I believe the issue is with checkasm and the Github Actions tester.

  • AVX2 Luma Deblocker: Still needs more work. I am not sure where the error is, as I have accounted for all tc changes. Further time will be needed for the Luma deblocker to be ready.

Out-of-scope(?) work

This was entirely out of scope of my original GSoC project, but I got a MacBook Pro with an ARM-based M1 Pro chip after I had submitted my GSoC proposal. One night, after bashing my head on the chroma filter to no end, I decided to work on a distraction, and did the exact same port of the SAO filter from HEVC to VVC on ARM NEON. I spent a few hours on this and it has been merged in this PR.

Future work

  • Luma Deblocker: the luma deblocker currently found in this PR and the codepack below does not work. Obviously, this needs to be fixed.

  • Chroma Deblocker: shift != 0 (which I believe is where the pixels are close to an end) should be handled in assembly. The strong case should also not be too hard

  • ARM NEON: Once the issues with the luma deblocker are complete and I have a sense of all the changes needed to port the luma deblocking filter, an ARM NEON port of the filter might be a good idea.

Before I go into the next part, I would like to say that I am not abandoning the FFvvc project after the end of GSoC. My availability might not be as good as I will have to juggle between classes, club activities, work, and this project. However, I will continue working on this to the best of my ability.

Where to find my code

Acknowledgement

I would like to extend my utmost appreciation to Nuo Mi, my mentor, who has been an amazing mentor to work with. Despite the timezone difference, Nuo Mi has been able to explain exactly what I needed in order to get to the point where I am right now. I would like to also thank the FFmpeg project as a whole for taking me up on this GSoC project.

I would like to thank my parents for being extremely supportive of me working through GSoC even if they can't quite grasp what I'm doing.

I would like to thank my roommates Tom, August, Hemant, and Kelton for keeping me sane over the summer while I was couped up in my room staring at assembly - thanks for all the beers!

And finally - thank you, the reader, for reading my submission/blog post! I'm glad I was able to keep your attention, and I hope you got something out of my overall GSoC experience!

Till Next Time,

r/l/s