Introduction
I like to grind in games. Part of how I play is that I record all of my gameplay to preserve it in its entirety. That mentality has posed some technical challenges over time, testing hard drive capacity, video codec tuning, and more. That Dark Aether camo in Call of Duty: Black Ops Cold War looks pretty good, but it will take days of gameplay to earn. That's a massive quantity of footage, which would normally take just as long to run through x265. Time for another media project?
Sure. I'm up for the challenge.
Video/Audio Format
Like always, I'm setting up a checklist of requirements for the final video to meet. Here are the specifications I want:
- 2560×1080 resolution at 60 FPS. The video needs to be in HDR10 and compressed as HEVC/H.265.
- Lossless 7.1.4.4 Dolby Atmos Audio as well as a Lossless 7.1 Surround Sound track generated from the 7.1.4.4 audio for compatibility with YouTube and local video players. 7.1.4.4 should be archived via WavPack and 7.1 should be compressed with FLAC (a rough sketch of this step follows the list).
- Separated microphone audio tracks of all participants, captured in the highest quality possible. Sometimes the highest available is from Discord, Ennuicastr, or a raw recording directly from someone's microphone. It's not consistent.
- Separated in-game voice chat. People use the voice chat in-game. Archiving this separately would be nice for more control in post.
- Timestamps of the exact moment the video file was created with nanosecond precision (or 100ns precision on Windows), as well as timestamps of the exact moment the video got processed.
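For the Atmos/7.1 item above, the compression step could look something like this with FFmpeg. It's a rough sketch under a couple of assumptions, not the exact commands my scripts run: the 16-channel capture is assumed to be headerless 32-bit float PCM at 48 kHz, and the 7.1 downmix is assumed to already exist as 7.1.wav (the downmix itself depends on the channel order of the Atmos recording, so it's omitted here):

# Archive the 7.1.4.4 (16-channel) capture losslessly with WavPack
ffmpeg -f f32le -ar 48000 -ac 16 -i "gameplay (16ch).raw" -c:a wavpack "gameplay (16ch).wv"

# Compress the 7.1 downmix losslessly with FLAC
ffmpeg -i "7.1.wav" -c:a flac "7.1.flac"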
Setup
The entire point of a project like this is to automate the whole process. I shouldn't have to do any editing by hand. I should just place files in a folder, run a script, and have a final result a few minutes or hours later.
I'll briefly go over the details of the setup down below. Then, during the breakdown, I'll get into the real details of everything. Consider this my "what I want" wishlist.
Video Recording Setup
I realised early on that the old procedure of recording raw via Dxtory and then compressing via FFmpeg would significantly slow down the entire production pipeline. So, this setup will use GeForce Experience instead. I've been toying around with its recorder for a while now and I think I've made it work in my favour for this specific project. It encodes the video on the GPU as it records, so the separate encoding step is skipped entirely. The videos to be processed will be in an MP4 container. Disgusting.
Here are the video settings for GeForce Experience that were used in this project:
I should note, GPU encoding is usually worse than CPU encoding for videos. I wrote a paper on this in graduate school, demonstrating the differences between how they turn out. The general idea is that CPU-based encoding will yield better quality at the cost of encoding time, while GPU-based encoding will be significantly faster, but the quality suffers. It's more complicated than that, but that's the simple version. The way I get around this is to set the bitrate so high that the quality difference just won't matter.
You can see the comparison of lossless vs. CPU vs. GPU here:
Yes, I used Modern Warfare 2 footage as part of a paper in graduate school. It was awesome. Anyways, you can easily tell the differences in how each one preserves detail. Specifically, look at the third line of text. You can barely make out the text in the CPU-encoded version, and you can't read it at all in the GPU-encoded one. I understand this is a challenging scenario for encoders.
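If you want to run that kind of comparison yourself, two encodes along these lines will do it. The preset and bitrate values here are illustrative guesses, not the settings from the paper:

# CPU encode (libx265): slow, but squeezes the most quality out of every bit
ffmpeg -i "lossless.mkv" -c:v libx265 -preset slow -crf 18 -c:a copy "cpu.mkv"

# GPU encode (NVENC HEVC): much faster; throw bitrate at it to hide the gap
ffmpeg -i "lossless.mkv" -c:v hevc_nvenc -preset slow -b:v 80M -c:a copy "gpu.mkv"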
Before someone suggests OBS to me: it doesn't support HDR10, and the software has never met my demands anyway. In fact, when I push it, it often freezes and I end up losing a significant chunk of my recording. MKV is a wonderful container for recovery, at least.
Audio Recording Setup
GeForce Experience does not record lossless audio. Instead, it uses AAC-LC. Here is an FFmpeg dump of a sample file:
  Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 194 kb/s (default)
    Metadata:
      creation_time   : 2021-08-25T02:49:34.000000Z
      handler_name    : SoundHandle
      vendor_id       : [0][0][0][0]
  Stream #0:2(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 195 kb/s (default)
    Metadata:
      creation_time   : 2021-08-25T02:49:34.000000Z
      handler_name    : SoundHandle
      vendor_id       : [0][0][0][0]
Obviously, this is unacceptable to me. Also, GeForce Experience has issues with audio desynchronisation and drops surround sound channels. Getting around this is easy. I just use Audacity to record the audio losslessly. Then I can use the lossy AAC audio to line up and export the lossless audio from Audacity directly.
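Pulling the AAC reference tracks out of the MP4 is just a stream copy. Something like this works; which stream is game audio and which is the microphone depends on your settings, so the mapping and output names below are guesses:

ffmpeg -i "gameplay.mp4" -map 0:a:0 -c copy "reference_game.aac"
ffmpeg -i "gameplay.mp4" -map 0:a:1 -c copy "reference_mic.aac"

Import those into Audacity, slide the lossless recording until the waveforms line up, and export.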
Here are the audio settings for GeForce Experience that were used in this project:
As for capturing every participant's voice properly separated, my friend group uses Discord to communicate. So there are two options, and we use both of them:
- Craig, a Discord bot that will go into a voice chat and record the Opus audio of each person separately, and DM you links to download the individual tracks.
- Ennuicastr, which takes it a step further by having you open up a tab on your web browser and it will record microphone feed via that. This bypasses Discord's audio quality restrictions and filters. It sounds significantly better.
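Whichever of those ends up being the best source for a given person, I want it sitting next to the video as FLAC before remuxing. Craig can hand out FLAC directly depending on the download option you pick, but if all you have is an Opus/OGG track, a conversion like this keeps it lossless from that point onward. The input filename is just an example of what a Craig download might look like; rename the output to the queue naming scheme covered below:

ffmpeg -i "1-DKK.ogg" -c:a flac "Voice - DKK (Craig).flac"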
Directory Structure
I want a simple but effective structure for my directories. There will be two directories:
- queue - Stores pending videos for processing.
- processed - Stores processed videos. Easy.
In addition, the bash scripts I will run will be stored in the root of this project. So it should look like this:
processed/
queue/
prepare.sh
remux.sh
This code is part of my render_tools suite on GitHub. You can grab the scripts from their branch here: https://github.com/iDestyKK/render_tools/tree/dev/geforce_experience/geforce_experience
Queue File Structure
I want files to be put into the queue directory in a very specific format (Dxtory-style naming). Then, when I run my script, it should magically generate files in processed. Let's say I have a recorded video called gameplay.mp4, along with a few voice files and a 16-channel Dolby Atmos file. It should look like this:
gameplay.mp4
gameplay (16ch).raw
gameplay st0 (Voice - DKK [GeForce Experience]).aac
gameplay st1 (Voice - DKK [Ennuicastr]).flac
gameplay st2 (Voice - DKK [Craig]).flac
gameplay st3 (Voice - SKK [Ennuicastr]).flac
gameplay st4 (Voice - SKK [Craig]).flac
gameplay st5 (Voice - D4 [Ennuicastr]).flac
gameplay st6 (Voice - D4 [Craig]).flac
The script should go through the MP4 and all files sharing its base name and put them all into a brand new MKV file. It'll be a single file containing all of the audio tracks, properly separated and losslessly preserved. The result should show up like this:
This makes it very simple to produce videos. Simply record. Align audio. Extract. Run ./remux.sh. Done. Now let's get into the details.
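For a sense of how the remux step can pick up the sibling files, here's a stripped-down sketch of the idea in bash. This is not the real remux.sh from render_tools; it skips the 16-channel raw handling, track titles, and timestamp tags entirely:

#!/bin/bash
# Sketch only: walk the queue, find each MP4's sibling audio files by their
# shared base name, and stream copy everything into one MKV in processed/.

for mp4 in queue/*.mp4; do
    base="${mp4%.mp4}"
    name="$(basename "$base")"

    # Video from the MP4; the real script also decides which game audio to keep
    inputs=(-i "$mp4")
    maps=(-map 0:v)
    i=1

    # Pick up every "<base> stN (...)" sidecar as its own audio track
    for track in "${base} st"*; do
        [ -e "$track" ] || continue
        inputs+=(-i "$track")
        maps+=(-map "${i}:a")
        i=$((i + 1))
    done

    # Remux without re-encoding
    ffmpeg "${inputs[@]}" "${maps[@]}" -c copy "processed/${name}.mkv"
done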
Recording
Recording Video
I just use the Alt + F9 hotkey and GeForce Experience will record the session to an MP4 file. This is as simple as it gets.
Recording Audio
GeForce Experience records this too. But, as said above, this audio is unacceptable. So it's time to have Audacity record alongside GeForce Experience. The procedure is simple:
- Have Audacity record the entire session (multiple games, if possible).
- Run a magically awesome script to extract the game and microphone audio.
- Import all gameplay audio (AAC) from all tracks in the session into Audacity.
- Line up each AAC track with where they were recorded in Audacity.
- Export each game segment (via Export Selected Audio...). I do this a lot, so I have it hotkey'd to Ctrl+Shift+W. I recommend setting that up in Audacity because it'll be a common occurrence.
Getting exact creation time
If you have Git Bash and the suite of UNIX apps that come with it installed, you can easily get the creation time of a video file. My script does this automatically. But if you were curious:
UNIX> stat "BlackOpsColdWar 2021.08.29 - 23.07.40.04.mp4"
File: BlackOpsColdWar 2021.08.29 - 23.07.40.04.mp4
Size: 5880051128 Blocks: 5742240 IO Block: 65536 regular file
Device: 548f4e94h/1418677908d Inode: 7036874418181045 Links: 1
Access: (0644/-rw-r--r--) Uid: (197609/ idest) Gid: (197609/ UNKNOWN)
Access: 2021-08-30 20:48:23.317670100 -0400
Modify: 2021-08-29 23:16:00.712173500 -0400
Change: 2021-08-30 01:01:35.468077900 -0400
Birth: 2021-08-29 23:07:41.655625200 -0400
It's that Birth: timestamp. This is a Windows-only thing. You can spit it out in a more compliant way (ISO 8601) via sed magic, as usual:
UNIX> stat "BlackOpsColdWar 2021.08.29 - 23.07.40.04.mp4" \
| grep "Birth: " \
| sed 's/.*: \(.*-.*-.*\) \(.*:.*:.*\..*\) \(.*\)\(..\)$/\1T\2\3:\4/'
2021-08-29T23:07:41.655625200-04:00
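As an aside, if your stat supports format strings (GNU coreutils does, and Git Bash ships a build of it), you can ask for the birth time directly and skip the grep; the sed reshuffle into ISO 8601 still applies. It should print something like:

UNIX> stat -c '%w' "BlackOpsColdWar 2021.08.29 - 23.07.40.04.mp4"
2021-08-29 23:07:41.655625200 -0400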
My script will take this timestamp, along with the current exact moment, and store them in the final MKV file as DATE_RECORDED and DATE_ENCODED respectively.
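If you wanted to apply those tags by hand instead of through the script, FFmpeg's -metadata flag during a remux is one way to do it; Matroska will store arbitrary tag names like these. The filenames and the GNU date incantation here are just an illustration:

ffmpeg -i "gameplay.mkv" -c copy \
    -metadata DATE_RECORDED="2021-08-29T23:07:41.655625200-04:00" \
    -metadata DATE_ENCODED="$(date +"%Y-%m-%dT%H:%M:%S.%N%:z")" \
    "gameplay.tagged.mkv"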
Generating a YouTube delivery file
After recording and utilising the remux.sh script mentioned above, the video is archived in a single file. That's nice and all, but there are two formats that I want: one file with all of the multitrack audio in one place, which we have, and another that is essentially a YouTube-ready deliverable. For that, it's very easy to go through the MKV and make one. Just remux with the video and the first audio track only.
Assume the video is VIDEO.MKV, and we want a YouTube-ready deliverable called VIDEO.YT.MKV. Thus,
ffmpeg -i "VIDEO.MKV" -map 0:v -map 0:a:0 -c copy "VIDEO.YT.MKV"
Upload the VIDEO.YT.MKV to YouTube.
If I want voices in the YouTube file?
The entire purpose of the multitrack archival is excessive power and flexibility in post. Someone's microphone too loud? We got the separated tracks. But either way, editing is required to mix them in with the game audio.
Drag the MKV into Audacity and import all tracks. You will want the 7.1 mix and the highest possible quality of every speaker. So, if a raw microphone track exists, pick it over Ennuicastr or Craig. The priority order is Raw > Ennuicastr > Craig. After that, mix everyone's volume until it sounds right. This part is subjective, so do it the way that feels correct to you. Then it's time to mix and export.
I work with surround sound audio. This gameplay has a 7.1.4.4 spatial track and a 7.1 surround sound track. Discord VC audio is mono. When you work with surround sound, mono dialogue audio usually goes into the centre channel (channel 3). So, in Audacity, mix everyone's voices into that track. Make sure to re-amplify everything so no clipping occurs. Then export a new 7.1.flac file.
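If you'd rather not round-trip through Audacity for the voice mix, the same idea can be done entirely in FFmpeg. This sketch assumes the voices are already mixed down to one mono voice_mix.flac and that a 0.8 gain is in the right ballpark; both the filename and the gain are placeholders, and the normalize option needs a reasonably recent FFmpeg:

ffmpeg -i "7.1.flac" -i "voice_mix.flac" -filter_complex \
    "[1:a]volume=0.8,pan=7.1|FC=c0[voice];[0:a][voice]amix=inputs=2:normalize=0[mix]" \
    -map "[mix]" -c:a flac "7.1.with_voices.flac"

The pan filter drops the mono voice bus onto the centre channel of an otherwise silent 7.1 bed, and amix adds it on top of the game audio without rescaling, so watch for clipping just like in Audacity. Either way, you end up with a corrected 7.1 FLAC to mux back in.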
Finally, assume the video VIDEO.MKV, the corrected audio 7.1.flac, and the output file you want, VIDEO.YT.MKV. Thus,
ffmpeg -i "VIDEO.MKV" -i "7.1.flac" -map 0:v -map 1:a -c copy "VIDEO.YT.MKV"
Well, that was simple.
We can do better though
We didn't really check every box up above, and this is purely a software limitation. For instance, GeForce Experience only allows recording game audio plus one additional track of our choice. So, if that extra track goes to my microphone, there's nothing left over for anything else, like in-game voice chat, on its own track. Audacity does grant us one more track, technically. But we use it to grab lossless game audio to replace whatever trash GeForce Experience provides. And if we do that, aligning the audio in Audacity is nearly impossible without reference points, which is exactly why the AAC gets kept around to line things up.
In most cases, the AAC audio is actually good enough. But the moment you need to edit, it's asking for quality degradation. And by the time it hits YouTube, it will have been compressed at least three times if you threw it into an editor first. The only exception is if you exported to a lossless format from your editor. It's good practice to keep the highest quality version of your audio and video until the very last step, if possible.
That's a long-winded way of saying: we can do better. If I wanted to separately record game audio, microphone, in-game voice chat, Discord VC, and so on, it simply isn't possible with GeForce Experience. Dxtory got it right, but then we lose proper HDR10. It's frustrating. So we'll explore more options in the future.