Friday, September 25, 2020

Handwave code and using co2

Let's break down the code in Handwave to see how co2 works as a NES game development language.

Controller logic code in Handwave
Code to read state of the NES controllers



Here, I'll talk a bit about Handwave's code and some of my experiences working with co2 as a programming language.

A Dive Into Handwave's Code

Not unlike my previous game, Handwave is broken up into initialization, game phase loops, game logic, controller logic, PPU control code, and raw data. Additionally, there is audio logic to control the APU, and two large state machines to read the music data (one to convert the data to notes on the screen, and the other to drive detection of when notes should be played and trigger the APU correctly). I was able to get started from co2's packaged-in example code and leaned on some snippets of What Remains's source code for examples and language quirks.

The source code is here for following along.

Initialization

At the top of the file is a nes-header macro that declares some "hardware" configuration (this basically tells emulators what sort of cartridge this game would have run on, were it a real cart). We then have a set of defaddr and defconst declarations to name things. I absolutely love that this language allows for constant declarations; being able to name things takes a lot of cognitive load off of managing the code!

The game's code essentially has two entrypoints defined by the (defvector) declarations: reset is the code run when the system is powered on (or the reset button is pressed), and nmi is the code run every time the vertical blank signal (i.e. a "non-maskable interrupt") is sent to indicate we're between frames of video. In reset, we do the regular setup for preparing the game (initializing variables and rendering an initial screen to the PPU). Several init- and load- subroutines live here, which prep various pieces with various patterns of data. co2 provides several convenience routines in its "standard library," as it were, that simplify this process; ppu-memset is sugar on top of initializing a range of PPU memory with a constant value (by storing an address to the PPU_ADDR address and then looping across storing data to the PPU_DATA address); ppu-memcpy is similar sugar for copying a chunk of RAM into the PPU. A variety of set-sprite- routines do the math to set pieces of sprite data (assuming the sprite data is positioned at the recommended #x200 address).
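To make the PPU_ADDR / PPU_DATA dance concrete, here is a rough Python model of what a routine like ppu-memset does. This is a sketch and not co2's actual implementation: the FakePPU class, its method names, and the assumption that the address auto-increments by 1 after each data write are all mine.

```python
class FakePPU:
    """Toy stand-in for the PPU's address/data ports (not a real emulator)."""

    def __init__(self):
        self.vram = bytearray(0x4000)   # 16 KiB PPU address space
        self.addr = 0
        self.high_byte_next = True      # PPU_ADDR expects the high byte first

    def write_addr(self, byte):
        """Model one write to PPU_ADDR: two writes build a 16-bit address."""
        if self.high_byte_next:
            self.addr = (byte << 8) | (self.addr & 0x00FF)
        else:
            self.addr = (self.addr & 0xFF00) | byte
        self.high_byte_next = not self.high_byte_next

    def write_data(self, byte):
        """Model one write to PPU_DATA: store, then step the address.

        Assumption: increment-by-1 mode (the real PPU can also step by 32).
        """
        self.vram[self.addr & 0x3FFF] = byte
        self.addr = (self.addr + 1) & 0x3FFF


def ppu_memset(ppu, start, length, value):
    """Fill `length` bytes of PPU memory, starting at `start`, with `value`."""
    ppu.write_addr(start >> 8)      # high byte of the target address
    ppu.write_addr(start & 0xFF)    # low byte of the target address
    for _ in range(length):
        ppu.write_data(value)       # the address advances on every write


ppu = FakePPU()
ppu_memset(ppu, 0x2000, 4, 0x24)    # fill the first 4 nametable bytes
```

The point of the sugar is that the caller never touches the two-write address latch or the loop counter directly.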

Game Phase Loops

At the moment, the game has two phases: waiting for start, and playing. Since there are only two, a simple g-playing global variable tracks which one we're in. Every nmi, we first update everything that requires the PPU to be quiescent (;;; TIMING CRITICAL CODE), then we drop into the relevant loop depending on mode.

In the waiting-to-play mode, we listen for buttons to indicate players want to play (check-wave-activations), listen for "netcode mode" to be toggled (toggle-netcode), clear the logged-in players if player 1 hits select (clear-active-handwaves), and start when start is pressed (by setting g-playing to #t). "Netcode mode" is a feature I added to make the game easier to play on multiple machines over a network, which I'll explain in a future post.

Play mode scrolls the notes, checks to see if players are playing notes, and runs the animation loop for moving sprites around.

Game Logic

Several pieces of game logic live here, split into their own subroutines. (check-wave-activations) looks to see if controller buttons have been pushed to sound the handwave "bells". Any buttons pressed trigger (on-wave-triggered), which either logs the player in (if we're not g-playing) or plays the note (there are two ways notes play, depending on whether we're in "netcode mode" or not).

There's a utility function buried in here that I want to highlight: (roll-left-n). This sub gets around a small "bug" (really, a missing feature) in co2: the left-shift operator (<<) can only accept a constant value as the number of bits to shift (note: this is still more functionality than raw 6502 assembly gives us; the underlying ASL instruction always shifts left by exactly one bit!). This utility function accepts a second argument and uses it as the number of bits to shift. The name isn't great; I forgot that ROL is also a 6502 assembly instruction, which "rolls" a field of 8 bits through the carry flag (every bit moves one towards the MSB, the old MSB moves into the carry flag, and the old carry value moves into the LSB). I like how easy it is in co2 to lay down subroutines like this; now that I have it, it's just as easy to use as the existing (<<) routine (though a lot more CPU-intense).
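The idea behind a variable-width shift built from one-bit shifts can be sketched in a few lines of Python. This models the technique only; the function names are hypothetical and the real co2 subroutine may be structured differently.

```python
def asl(value):
    """One 6502-style ASL on an 8-bit value: shift left one bit, drop bit 7."""
    return (value << 1) & 0xFF


def shift_left_n(value, n):
    """Shift an 8-bit value left by n bits using n single-bit shifts.

    This costs one loop iteration per bit, which is why it is more
    CPU-intensive than a shift by a compile-time constant.
    """
    for _ in range(n):
        value = asl(value)
    return value


result = shift_left_n(0b00000011, 3)   # 0b00011000
```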

(handle-next-draw-note) is one of the two "parsers" for the music representation language I created. The language consists of bytes that are either "nodes" or "directives." A "node" guides how music is played, and can be a note or a rest; a "directive" controls the musical staff. It currently supports ending the song, but in the future it can be used to indicate changing the voicing of a handwave. The format is detailed in the area headed by ;;;; SONG DATA. This parser is called once every "beat" (about 8 frames of animation); it consumes one or more bytes of music information to determine what notes should be rendered at the right edge of the screen. The return value is a state machine flag (either read another byte or pause reading for n beats to rest) and 16 bits indicating which notes should be drawn; input to the function is 16 bits serving as accumulators for the notes to be drawn. A global, g-song-render-index, tracks where in memory we are.

Pointers are a very cool feature of co2; they smooth over the fact that in the 6502, arithmetic registers are 8 bits wide but the address space is 16 bits. Several specialized opcodes in the 6502 allow you to do indirect indexed lookup of memory by treating two sequential bytes in "zero-page" memory (the addresses $00-$FF) as one 16-bit address. So co2 has utility functions to represent a single 16-bit value stored in the zero page, to use those values to "peek" and "poke" (read and write) memory, and to increment the pointer (taking care of the overflow in the math to step the MSB of the 16-bit address when the LSB overflows). Handwave uses two pointers to track song progression: the g-song-render-index draws notes, and the g-song-play-index handles audio playback, 27 beats behind the first pointer (to give notes time to crawl right-to-left across the screen).
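Here is a small Python model of what a zero-page pointer amounts to: two adjacent bytes holding the low and high halves of an address, with the increment carrying from the low byte into the high byte. The function names are hypothetical stand-ins, not co2's actual pointer API.

```python
memory = bytearray(0x10000)   # the 6502's full 64 KiB address space


def ptr_set(zp, address):
    """Store a 16-bit address in two zero-page bytes, low byte first."""
    memory[zp] = address & 0xFF
    memory[(zp + 1) & 0xFF] = address >> 8


def ptr_read(zp):
    """Reassemble the 16-bit address held at zero-page location zp."""
    return memory[zp] | (memory[(zp + 1) & 0xFF] << 8)


def ptr_peek(zp, offset=0):
    """Read the byte the pointer refers to (plus an optional index)."""
    return memory[(ptr_read(zp) + offset) & 0xFFFF]


def ptr_increment(zp):
    """Add 1 to the pointer using only 8-bit math."""
    memory[zp] = (memory[zp] + 1) & 0xFF
    if memory[zp] == 0:                       # low byte wrapped around...
        hi = (zp + 1) & 0xFF
        memory[hi] = (memory[hi] + 1) & 0xFF  # ...so step the high byte


ptr_set(0x10, 0x80FF)        # pointer lives at $10/$11, targets $80FF
memory[0x8100] = 0xBB
ptr_increment(0x10)          # crosses a page boundary: $80FF -> $8100
value = ptr_peek(0x10)       # reads the byte at $8100
```

The page-boundary case in the last three lines is exactly the overflow the pointer helpers handle for you.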

Shortly after (handle-next-draw-note) is (handle-next-play-note). It's very similar in structure, but instead of adjusting PPU state, it determines if the player is trying to sound the note and adjusts the audio state. The state machine itself is basically the same, and it also takes in two 8-bit accumulators to track the state of all 16 playable "handwaves." There is some fanciness to account for netcode mode; outside of that mode, the logic determines if a note is for a handwave not controlled by a player and auto-plays it, but inside of that mode it also tracks whether the player is trying to play a note and might be lagging.

Controller Logic

The (read-joypads) sub pulls data from all four controllers and updates a set of global variables:
  1. pad-data, which tracks which buttons were pressed
  2. pad-data-last-frame, which tracks button presses on the previous run of read-joypads
  3. pad-press, which is the buttons that are newly-pressed (i.e. "button down" events)
This data is read via (button-pressed), which gives a view into the buttons newly pressed on this read.
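The relationship between the three variables is a common edge-detection pattern: a button is "newly pressed" if it is down now and was not down on the previous read. A sketch for a single controller follows; the dictionary layout and the button bit masks are assumptions for illustration, not Handwave's actual values.

```python
BUTTON_A = 0x01       # hypothetical bit assignments
BUTTON_START = 0x08


def new_pad_state():
    return {"pad_data": 0, "pad_data_last_frame": 0, "pad_press": 0}


def update_pad(state, raw_buttons):
    """Record one controller read and derive the button-down events."""
    state["pad_data_last_frame"] = state["pad_data"]
    state["pad_data"] = raw_buttons & 0xFF
    # Newly pressed = down now AND NOT down on the previous read.
    state["pad_press"] = state["pad_data"] & ~state["pad_data_last_frame"] & 0xFF
    return state


def button_pressed(state, mask):
    """True only on the read where the button went from up to down."""
    return (state["pad_press"] & mask) != 0


pad = new_pad_state()
update_pad(pad, BUTTON_A)                    # A goes down: an event fires
first = button_pressed(pad, BUTTON_A)        # True
update_pad(pad, BUTTON_A)                    # A is held: no new event
held = button_pressed(pad, BUTTON_A)         # False
```

Without the last-frame comparison, holding a button would retrigger its handwave on every frame.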

Graphics Logic

(plot-notes) draws 16 notes onto the column of the staff just off-screen, using (find-scroll-edge) to locate which column should be updated. Next, there's an (anim-sprites) routine. This is a rewrite of a similar routine I built for Petris; it runs through some small lists of offsets to sprite position each frame to "bump" the sprite until a "stop" indicator of #xFF is reached, which ends the animation. Setting a memory location to an index of one of the animation offsets starts applying the animation to the sprite the next time the subroutine is called.
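The animation scheme can be sketched roughly as follows. The offset values, the use of None to mean "no animation running," and the choice to apply offsets to a single Y coordinate are assumptions of mine; only the #xFF stop marker comes from the description above.

```python
STOP = 0xFF

# Hypothetical "bump" animation: up 2, up 2, down 2, down 2, then stop.
# Offsets are 8-bit two's complement, so 0xFE acts as -2.
ANIM_OFFSETS = bytes([0xFE, 0xFE, 0x02, 0x02, STOP])


def anim_step(sprite_y, anim_index):
    """Advance one frame of animation; returns (new_y, new_index).

    anim_index is None when no animation is running.
    """
    if anim_index is None:
        return sprite_y, None
    offset = ANIM_OFFSETS[anim_index]
    if offset == STOP:
        return sprite_y, None                 # animation finished
    return (sprite_y + offset) & 0xFF, anim_index + 1


# Setting the index to the start of the list kicks off the animation.
y, index = 100, 0
positions = []
while index is not None:
    y, index = anim_step(y, index)
    positions.append(y)
```

One consequence of using #xFF as the terminator in a scheme like this is that an offset of -1 cannot be expressed, since it shares the same byte value.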

Audio Routines

The NES audio toolkit consists of five waveform generators and a simple mixer to combine them. Handwave currently uses only the two square wave ("pulse") generators (the triangle generator has no volume control, which makes it a poor fit for a bell-like sound). The audio processor provides a limited amount of automation, of a sort; it's capable of automatically tracking note play duration, a pitch-bend effect, and a "fade" effect (which we use to give a nice bell sound). Timbre can also be adjusted by tweaking the duty cycle; the square wave can be "on" for 12.5%, 25%, 50%, or 75% of the time. We use a 50% pulse width and a volume-decay of 3, so the envelope logic diminishes volume once per 3+1 = 4 quarter-frames, i.e. once per audio frame (audio frames run at about 60Hz); since the decay level steps from 15 to 0, the sound decays over 15 frames, which is about 1/4 second.
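The decay arithmetic above can be checked with a few lines of Python. The 240Hz quarter-frame rate is an approximation, and the function name is just for illustration.

```python
QUARTER_FRAME_HZ = 240.0   # approximate rate at which the envelope is clocked


def envelope_decay_seconds(decay_period, start_level=15):
    """Time for the envelope to fall from start_level to 0.

    The volume drops one level every (decay_period + 1) quarter-frames.
    """
    quarter_frames_per_step = decay_period + 1
    total_quarter_frames = start_level * quarter_frames_per_step
    return total_quarter_frames / QUARTER_FRAME_HZ


seconds = envelope_decay_seconds(3)   # 15 steps * 4 quarter-frames / 240 = 0.25
```

A smaller decay value would give a shorter, more percussive note; a larger one would let the bell ring longer.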

For Handwave, notes are played by (play-note). We set the pitch from a precomputed lookup table for the 16 notes (calculated by the logic in song.scm, which I'll explain in a future post). A simple flip-flop value in g-which-pulse alternates between 0 and 1 every time a note is played to choose whether it's played on the first or second pulse generator (so up to 2 notes can sound simultaneously).
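A sketch of both pieces, the pitch lookup and the channel flip-flop, might look like this in Python. The timer formula is the standard relationship for NTSC pulse channels (frequency = CPU clock / (16 * (period + 1))); whether song.scm computes its table exactly this way is an assumption, and the class and function names are hypothetical.

```python
CPU_HZ = 1789773   # NTSC NES CPU clock, approximately


def pulse_timer_period(frequency_hz):
    """Timer value that makes a pulse channel sound at frequency_hz."""
    return round(CPU_HZ / (16 * frequency_hz) - 1)


class PulseAllocator:
    """Alternate between the two pulse channels on each note."""

    def __init__(self):
        self.which_pulse = 0

    def next_channel(self):
        channel = self.which_pulse
        self.which_pulse ^= 1      # flip 0 -> 1 -> 0 -> ...
        return channel


allocator = PulseAllocator()
channels = [allocator.next_channel() for _ in range(4)]   # [0, 1, 0, 1]
a4_period = pulse_timer_period(440.0)                     # 253
```

The trade-off of the flip-flop is that a third note always cuts off the oldest one still ringing, which is cheap to implement and sounds reasonable for short bell tones.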


Data Section

The final portion of the code hard-codes various data values, including which buttons play which handwaves, the note pitches, the palette configs for sprites and background, the animation for the handwave icons to "bump" when they're played, and the song data itself.

How NES programming differs from modern web apps

My day job is to do user interfaces for web applications. One of the things I enjoy about NES programming is that the discipline itself has different best practices; I like working on something that makes me shift my focus and remember there are other modes of development. Some big differences:
  • You own the world: Modern app development often has to account for the possibility that the user is doing multiple things. In a web app, the user could close or reload the page at any time, or could put the whole machine to sleep or disconnect the network. None of that is a real concern in a NES game; nothing else is fighting for your resources, the entire CPU, graphics, and audio systems are your own, and if the user resets, it's because they want to restart the game. Battery-backed games can find the need to care about saving and restoring state, but simple games don't worry.

  • Pointer arithmetic is different from data arithmetic: In almost all modern programming on anything as complex as a CPU, data and memory addresses have the same width. Essentially all general-purpose computer architectures in this half-decade use a 64-bit address space and 64-bit arithmetic (at least!). And addresses are unlikely to get larger (2^64 is enough indexes to assign a unique number to every grain of sand on every beach on Earth, with a bit left over... that "bit" being "a second Earth's beaches"). It's extremely convenient for memory addresses to be smaller than (or the same size as) numbers the CPU can do math on; that means "pointer arithmetic" is just regular arithmetic.

    The 6502-series microprocessor in the NES uses 8 bits for arithmetic and 16 for addresses. Address math requires setup, relative to regular math. And the CPU has a whole set of special (slower) opcodes to allow for doing operations on memory locations referenced by 2-byte "pointers"---as long as those pointers live at a memory address between #x00 and #xFF. Fortunately, co2 abstracts much of this with special pointer functions.

  • Global variables are best practice: In big systems applications, context and local state are preferred over global variables because "global" means "anything can read or write it." That gets messy fast when you have multiple components touching multiple pieces; it quickly becomes a system where nobody knows what's going on.

    In a small system like a NES game, we don't have the luxury of context. The stack is extremely shallow and intended mostly for remembering subroutine addresses. And as we've seen, scrubbing back and forth across memory takes extra setup and precious CPU cycles. So for many things (especially game state and position of things on screen), global variables rule the day. There's no reason to pass a pointer down four or five levels of a call stack to reference the beginning of sprite memory if every part of the code that works with sprites just knows the address of the first one!

    Unfortunately, this best practice harms one thing: code re-use. Keeping state local makes it easier to copy-and-paste a chunk of code without dangling references. This matters a lot less in the NES world (not much code can be sensibly copy-and-pasted, since every game is so different), but it does matter.

    Worth noting is that this best practice doesn't imply we never use local state. Like all best practices, it has exceptions; utility subroutines that could be called from many places are harder to use if they need to use global variables to "pass arguments." The co2 language has a great "compiled stack" feature to address this, which can intelligently carve up spare memory at compile-time into slices that are mutually-exclusive, so they can be re-used by different subroutines. You get the benefits of modern heap-allocated memory without heap management.
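To give a feel for how a compiled stack can work, here is a sketch of the general technique in Python. It is not co2's actual algorithm: the call graph, the subroutine sizes, and the function name are invented for illustration, and the sketch assumes no recursion and ignores interrupt handlers.

```python
def compiled_stack_layout(calls, local_sizes, root):
    """Assign each subroutine a fixed offset for its local variables.

    calls:       dict mapping caller -> list of callees (must be acyclic)
    local_sizes: dict mapping subroutine -> bytes of locals it needs
    A callee's locals start just past the locals of its deepest caller,
    so two subroutines that are never active together can share memory.
    """
    offsets = {root: 0}
    worklist = [root]
    while worklist:
        caller = worklist.pop()
        base = offsets[caller] + local_sizes[caller]
        for callee in calls.get(caller, []):
            if offsets.get(callee, -1) < base:
                offsets[callee] = base       # push the callee further out
                worklist.append(callee)      # and re-check its own callees
    return offsets


# Hypothetical call graph and local-variable sizes (in bytes).
calls = {
    "main": ["play_note", "plot_notes"],
    "play_note": ["shift_left_n"],
    "plot_notes": ["shift_left_n"],
}
sizes = {"main": 2, "play_note": 3, "plot_notes": 4, "shift_left_n": 2}

layout = compiled_stack_layout(calls, sizes, "main")
total = max(layout[name] + sizes[name] for name in layout)
```

In this example play_note and plot_notes both land at offset 2 because neither calls the other, and the whole program needs 8 bytes of scratch space instead of the 11 it would take to give every subroutine private storage.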

What comes next?

In subsequent posts, I'll talk a little bit about the mini-language I put together to craft songs and the hack I added to account for lag as we tried playing the game using Kosmi.

