Gusilingiz: The Videos
This post is about the boring inside-baseball minutiae of audio, video, and scripting, so beware.
I uploaded the four videos comprising the entire brief lifespan of my Gusilingiz fortress to my YouTube channel. Since I did some experimental stuff in those videos, I thought I would document it all here, because I love this stuff and you can’t stop me.
By the way, I spelled Gusilingiz wrong in my last post, in case you’re an expert on the dwarven language and noticed that glaring error. And I know, I know, the diacritic is still missing.
Recording videos of playing Dwarf Fortress is a bit of a challenge, since, as everyone knows, the game looks like garbage. Even beyond that, there is no sound. As I’ve said before, the finest examples of Dwarf Fortress on the YouTubes are from Kruggsmash, and it’s obvious he puts a *ton* of work into those videos, basically replacing what Dwarf Fortress actually looks and sounds like with his own drawings and sound effects. Like, full-time job levels of work, with a team of more than one person. Obviously I’m not going to do that. (I mean, unless someone wants to pay me a living wage. Then I would totally do that, because it would be a blast.)
First there is the challenge of getting OBS to record the game at all. Dwarf Fortress normally runs in a window. It has a fullscreen option, but I’ve never been able to get it to work. For some reason it thinks the best resolution for full screen on my system is a size that covers one and a half of my monitors, and some portion that extends down off the bottom of the screens. (I imagine it’s related to my using a non-standard DPI setting in Windows to make the fonts bigger, which basically wrecks 95% of all Windows software because no programmers ever account for that.)
You’d think that you could simply use the Window Capture source in OBS and be done with it. But you’d be wrong, in my case. Because for some reason, OBS records some bizarre size and shape that doesn’t really relate to the size of the Dwarf Fortress window. Again, I’m just going to guess it’s related to that DPI thing.
So I had to use the dreaded Display Capture OBS source, which just captures whatever is showing on your Windows desktop. Game windows, alerts, taskbars, titlebars, browsers, passwords, whatever, it captures it all. It’s usually not what you want to do in OBS, but it seems like the only option that works for me here. I just have to make sure Dwarf Fortress is the frontmost window when I’m playing, which usually isn’t a problem.
Of course, OBS would also capture the window’s titlebar and the Windows taskbar at the bottom of the screen. I have the taskbar on all the time, not auto-hidden, because I like it that way and it’s my desktop and you can’t stop me from setting it up how I like. (Sorry, in 2019, it’s just easier to assume everyone on the Internet is going to fight me on every personal preference.) So I had to set up a special scene in OBS for Dwarf Fortress, go into the Transform settings, and crop out the top and bottom parts of the screen. It was a pain, but I guess that’s the price you pay for having the unmitigated gall to use a larger font setting for your Windows desktop.
Incidentally, I just use OBS, I don’t use that Streamlabs thing, because I don’t do streaming, I just record videos locally to my hard drive. I have no idea if any of this stuff works in SLOBS or not.
I won’t go into a lot of detail here, because it could fill another whole blog post or two, but I have OBS set to record video files at 1920×1080, 60 frames per second, and encode at a bitrate of 10,000 Kbps. I record three separate audio tracks: a microphone track, a game audio track, and a “backup” microphone track, which is just a cheap plastic Logitech desk mic, in case I accidentally mute the main mic. That happens more often than you might think, and after the first time you record 30 minutes of silence, you kind of never want it to happen again. I record the video files to the MKV format just in case the power goes out in the middle of a recording, because it’s the only multi-audio-track format you can recover. (It’s happened to me several times, and again, after the first time, you never want it to happen again.)
Next is the really fun part! I wrote a whole lot of ffmpeg.exe scripts to process the raw MKV files into completed MP4 video files. I’ve been working on them for years now, and they get a little more refined every couple of months. This latest video series is a whole new generation of scripts! Ahhhh! This stuff is so cool! Scriiiiiiipts!!! Automaaaaaation!!!! Hours of manual labor reduced to seconds!!! Ahhhh!
Ahem. Sorry. I guess I should try to be more brief because I’m just getting started here. And this next part is the *really* complicated part.
Basically, my ffmpeg post-processing scripts mix the isolated microphone and game audio tracks together (applying ducking) into one audio track, and render the video to a more compact size, and convert to MP4. Then I can store or archive the video or upload it to YouTube. It’s actually a pretty small script.
You know, looking at that script, it seems to have a big error in it… I may have been mixing the tracks wrong this whole time. That’s why I should document these things more. Because that “;[vox][game] amix=inputs=2 [mix];[mix] volume=2.2 [out]” part looks completely wrong to me now (I mean, obviously, am I right?), but I may have done it like that for a reason that I can’t remember.
Someday I should convert that to PowerShell. I end up making a new script file for each different game I record. Oh well.
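For readers who want to try the mixing step described above, here is a minimal sketch of one way to do it. It is not the original script. It assumes the microphone is audio stream 0 and the game audio is stream 1 of the input file, and it uses ffmpeg's `sidechaincompress` filter for the ducking, which may not be what the original used. The threshold, ratio, and gain values are made-up starting points, and the function names are hypothetical. The labels `vox`, `game`, `mix`, and `out` are borrowed from the fragment quoted above.

```python
def build_mix_filter(duck_threshold=0.05, duck_ratio=8.0, makeup_gain=1.0):
    """Return an ffmpeg -filter_complex graph that ducks game audio under the voice."""
    return (
        # Split the microphone: one copy is heard, the other drives the compressor.
        "[0:a:0]asplit=2[vox][key];"
        # Turn the game track down whenever the key (voice) signal is loud.
        f"[0:a:1][key]sidechaincompress=threshold={duck_threshold}:ratio={duck_ratio}[game];"
        # Sum the voice and the ducked game audio, then apply an overall gain.
        "[vox][game]amix=inputs=2[mix];"
        f"[mix]volume={makeup_gain}[out]"
    )


def build_mix_command(src, dst, filter_graph):
    """Return the ffmpeg argument list: copy the video, encode the mixed audio."""
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", filter_graph,
        "-map", "0:v:0", "-map", "[out]",
        "-c:v", "copy", "-c:a", "aac",
        dst,
    ]
```

The list from `build_mix_command("raw.mkv", "mixed.mp4", build_mix_filter())` can be passed to `subprocess.run`. One possible reason for a `volume` boost after `amix`, and this is only a guess, is that `amix` scales its inputs down by default, so the mix comes out quieter than either track alone.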
This particular script for Dwarf Fortress re-encodes the video down to 2500Kbps (or is it KBps? or bps? or Bps? I can never remember; it’s 2500 “bitrate units”), and re-scales the size down to 1280×720, but maintains the 60fps. I wanted these videos as small as possible, while still being able to read the text after compression. (I use fairly large fonts because a) I can’t read small ones and b) it looks better on video.) I typically record 30-minute videos, and they end up in the 500-600MB range.
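A quick sanity check on those "bitrate units": in ffmpeg, a value like `2500k` for `-b:v` means 2500 kilobits per second. The sketch below builds the video options described above and estimates the size of a 30-minute file. The function names are hypothetical, and the estimate ignores the audio track and container overhead.

```python
def video_args(width=1280, height=720, fps=60, bitrate_kbps=2500):
    """ffmpeg options to rescale the picture and target an average video bitrate."""
    return [
        "-vf", f"scale={width}:{height}",
        "-r", str(fps),
        "-b:v", f"{bitrate_kbps}k",  # "k" multiplies by 1000, in bits per second
    ]


def estimated_megabytes(bitrate_kbps, minutes):
    """Approximate video size in MB for an average bitrate in kilobits per second."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


print(estimated_megabytes(2500, 30))  # 562.5
```

At 2500 kilobits per second, 30 minutes works out to about 562.5 MB of video, which lands inside the 500-600MB range mentioned above.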
On the audio side, which is far more interesting to me, most of the work of microphone compression and gating is done by an outboard Behringer compressor, so I don’t have to do much to the microphone track. But I do apply a small EQ notch in the script at 140Hz because my computer room has a nasty resonance around there. It’s a square box of a room and I have a handful of acoustic foam tiles that I bought way back in the 90s tacked on the walls, which kill enough of the mid- and high-frequency reflections, but I can’t do much about the lower frequencies unless I want to go out and buy some expensive “bass traps.” Oops, I’m getting out of control again. Nobody cares about this stuff, nobody would ever hear it except me, nobody ever even watches my videos, but that resonance still drives me crazy.
The one thing I wanted to note about the microphone track that is new for *these* videos is that I added a low-pass filter to the script to roll off the high frequencies at 11 kHz. My Rode NT1 has a bit of a “sizzle” at the high end that has been bothering me lately. There’s some harshness in the sibilance that I wanted to get rid of. This microphone is probably not the best choice for pure speaking voice work anyway. It’s better for singing and even better for instruments. Anyway, the low-pass filter worked pretty well, at the expense of some definition.
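The two microphone tweaks above (a notch at 140Hz and a low-pass at 11 kHz) could be written as an ffmpeg audio filter chain along these lines. This is a sketch, not the original script: the notch width (Q) and the cut depth in dB are assumptions, since the post doesn't give them, and `mic_filter_chain` is a hypothetical name.

```python
def mic_filter_chain(notch_hz=140, notch_q=4.0, notch_gain_db=-6.0, lowpass_hz=11000):
    """Build an ffmpeg audio filter chain: a narrow EQ cut followed by a low-pass."""
    # equalizer: f = center frequency, t=q means the width w is given as a Q factor,
    # g = gain in dB (negative values cut).
    notch = f"equalizer=f={notch_hz}:t=q:w={notch_q}:g={notch_gain_db}"
    # lowpass: f = cutoff frequency in Hz.
    lowpass = f"lowpass=f={lowpass_hz}"
    return ",".join([notch, lowpass])


print(mic_filter_chain())
```

The resulting string can go after `-af` for a single-track file, or be applied to the microphone stream inside a larger `-filter_complex` graph.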
The other thing that is new for these videos is the game audio track. I think I mentioned that Dwarf Fortress doesn’t have any sound. So normally, there wouldn’t *be* a game audio track. There is a nifty classical guitar piece that plays on repeat by default, but after a while, it gets a bit tiresome. So I added an audio track of royalty-free music that I got from YouTube. I had to look up some new ffmpeg command-line parameters to add that, and even update ffmpeg to the latest version, because there was apparently a bug that prevented -stream_loop from working right. Anyway there’s some funky fresh upbeat happy generic elevator music playing on all these videos where the dwarves are being massacred by werelizards. It’s funny. Well, to me, at least.
Instead of trying to chain together multiple MP3s into one long background music track within the script, I used REAPER to make one long 35-minute MP3 of all the music MP3s. That way I was able to blend the beginnings and ends of the songs together like a DJ would. If for some reason the video is longer than 35 minutes, the music will just repeat from the beginning. I also applied a little bit of EQ to pull out a lot of the bass and notch out a lot of midrange so the music wouldn’t compete too much with the voiceover. And of course compression to try to get all the MP3s to sound roughly the same volume.
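Here is a minimal sketch of how a looping music bed like that can be added with `-stream_loop`. It is not the original script: the function name is hypothetical, and it assumes the voice is the first audio stream of the video file. `-stream_loop -1` is an input option, so it has to come before the `-i` it applies to, and `duration=first` on `amix` ends the mix when the voice track ends, so the endlessly looping music doesn't run on forever.

```python
def build_music_command(video, music, dst):
    """Return an ffmpeg argument list that mixes a looping music file under the voice."""
    return [
        "ffmpeg",
        "-i", video,
        # Loop the music input indefinitely; must precede the -i it modifies.
        "-stream_loop", "-1", "-i", music,
        # Mix voice (input 0) with music (input 1); stop when the voice track ends.
        "-filter_complex", "[0:a:0][1:a:0]amix=inputs=2:duration=first[out]",
        "-map", "0:v:0", "-map", "[out]",
        "-c:v", "copy", "-c:a", "aac",
        dst,
    ]
```

With a 35-minute music file and a 40-minute video, the music would simply start over for the last five minutes, matching the behavior described above.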
And one last thing I wanted to note is that I wrote a PowerShell script to concatenate a series of short videos together into one long video. Dwarf Fortress is, let’s say, not the most exciting game to watch. So I start and stop the recordings a lot, so I only capture the interesting parts when there is something to say. Since the OBS team stubbornly refuses to add a “pause recording” button, that means I end up with a big list of 1-minute-long recording files.
In the past, if I ever needed to “pause” a recording, it was usually just once or twice in a 30-minute period (usually when my dog barks at me), and I could use Avidemux to manually put the short videos together into a long video. But for this series I was “pausing” *all the time* so I had a list of like 20-30 videos to put together. Doing that manually would be a pain. Living the life of a programmer means automating away everything that is painful, so I wrote a PowerShell script to do it for me.
UPDATE 1/26: On the off chance anyone is actually trying to *use* the script below, beware that I just discovered a problem with it. The audio drifts out of sync toward the end of the resulting concatenated video. I’m working on a new script to fix that.
UPDATE 1/28: I’m still not sure why the audio is getting out of sync. The problem is that the audio starts to play before the associated video, so it looks like the video lags behind the audio, and it gets farther out of sync the longer the video plays. I’ve tried a bunch of different things. It only seems to happen in *some* video players, particularly the one on the iPad. VLC plays the videos just fine. It might be a problem with the Windows script up above and the -stream_loop option. I don’t fully understand the inner workings of the timing encoded into videos so it’s a bit of a head-scratcher.
UPDATE 1/29: I’m about to declare it a lost cause. The root cause of the problem, I think, is concatenating videos without key frames at each join point. Supposedly re-rendering the video at the concatenation step would solve the keyframe issue, but handling 30+ videos with the ffmpeg concat filter on one command line is a bit of a challenge, as it would result in a command line length that would wrap around the entire world seven times.
The above script only works if all the video files have the same encoding. You’d have to re-render if they were different. It also assumes all the files were recorded “today.” Also it sorts them by file date, so keep that in mind if you edit any of the files before running the script.
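For illustration, the behavior described above (take today's recordings, sort them by file date, join them without re-encoding) can be sketched with ffmpeg's concat demuxer. This is not the original PowerShell script. All the function names are hypothetical, and it assumes the clips are `.mkv` files whose paths contain no single quotes.

```python
from datetime import datetime
from pathlib import Path


def scan_folder(folder, pattern="*.mkv"):
    """Return (path, last-modified datetime) pairs for the clips in a folder."""
    return [(p, datetime.fromtimestamp(p.stat().st_mtime))
            for p in Path(folder).glob(pattern)]


def select_clips(entries, today):
    """Keep only the clips modified on the given date, oldest first."""
    kept = [(when, str(path)) for path, when in entries if when.date() == today]
    return [path for when, path in sorted(kept)]


def concat_list_text(clips):
    """Text of a concat-demuxer list file: one "file '<path>'" line per clip."""
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)


def concat_command(list_file, dst):
    """ffmpeg arguments that join the listed clips with stream copy (no re-encode)."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(dst)]
```

A typical run would write `concat_list_text(select_clips(scan_folder("."), datetime.now().date()))` to a text file and pass that file to `concat_command`. Because the streams are copied rather than re-rendered, the same caveat applies: every clip needs the same encoding settings.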
Okay I’ve run out of energy, but I think I’ve successfully documented the cool new script stuff I did on these videos for posterity.
UPDATE 1/30: Obviously when I said “I’m about to declare it a lost cause” up there, I meant, “I’m still obsessed with figuring out this drifting audio and I’m never going to stop until I do.” I re-wrote the scripts to concatenate with the ffmpeg filter and render in one step. For this particular workflow, I can do that without suffering a video quality loss. It means I have one PowerShell script that is only for this one specific case of recording dozens of short Dwarf Fortress video clips and appending them together, which will probably never be re-used again in my lifetime, but at least I can concatenate the damn videos and play them on my iPad correctly now. New script to follow after some more testing.
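The concat-filter approach mentioned in that last update can be sketched roughly like this. It is an illustration under assumptions, not the promised script: the function name is hypothetical, it uses only the first video and first audio stream of each clip, and the 2500k bitrate is simply reused from earlier in the post. Generating the inputs and the filter graph in code is one way to avoid typing the very long command line by hand.

```python
def concat_filter_command(clips, dst, bitrate_kbps=2500):
    """Return ffmpeg arguments that join clips with the concat filter and re-encode."""
    inputs = []
    pads = []
    for i, clip in enumerate(clips):
        inputs += ["-i", str(clip)]
        # The concat filter expects each segment's video pad, then its audio pad.
        pads.append(f"[{i}:v:0][{i}:a:0]")
    # n = number of segments; v and a = video and audio streams per segment.
    graph = "".join(pads) + f"concat=n={len(clips)}:v=1:a=1[v][a]"
    return [
        "ffmpeg", *inputs,
        "-filter_complex", graph,
        "-map", "[v]", "-map", "[a]",
        "-b:v", f"{bitrate_kbps}k", "-c:a", "aac",
        str(dst),
    ]
```

Because everything passes through a filter, the video is re-rendered in the same step as the join, at the cost of a full encode. The operating system's limit on command-line length still applies when there are many clips.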