Thursday, August 6, 2020

Petris code and using nbasic

A few more thoughts about creating a NES game with nbasic (see previous story http://fixermark.blogspot.com/2020/08/petris-creating-nes-game.html for the game itself).

Code to read state of the NES controllers




I chose to use nbasic for this game because I had experience with it in the past; in college, I participated in a student-taught course run by the author of nbasic, which was excellent. I still use the course's site as a reference for "toe-dipping" into NES programming (along with the vast nesdev wiki). I and a team of 2 others created a small Zelda knock-off that is, unfortunately, lost to history. But using nbasic was a positive experience, so I sought it out again when I got the bug in my brain to try and write another NES game.

I'll talk a bit about the structure of Petris's code, then about my experience working with nbasic, and my plans moving forward.

A Dive Into Petris's Code

Roughly speaking, Petris is divided into initialization, game phase loops, game logic code, PPU control code (i.e. render logic), a big chunk of raw data to control what is shown on screen, and a mandatory footer. The header and footer are still cargo-cult code for me; I started from the Starter Source on the nbasic site and expanded out from there (leaning on examples also found on the course site as I went).

Initialization

Initialization begins at the start: label, where we switch off the PPU, do some things that are heavily recommended as best-practice, and configure the PPU as we wish to use it later. Disabling the PPU puts nothing on the screen, but guarantees draw behavior won't interfere with heavy-duty memory manipulation for the PPU, as the CPU and PPU both use the same latches for manipulating memory. It's useful for setup.

We then declare a set of arrays of various sizes. Nbasic allows for arrays to be declared that can be fit anywhere, or at specific locations (this is useful for naming dedicated memory addresses), or in "zeropage" memory (it's a quirk of the 6502 CPU design that RAM with a 0x00 high-order-byte is faster to access, because special opcodes can be used to fetch it with a one-byte address identifier). The design of nbasic doesn't lend itself super-well to structures, so instead of, for example, having a structure encapsulating the current controller values, values from the previous frame, and "rising edge buttons" (i.e. button-down events on this frame), I have three separate arrays, each of length 4 for four controllers.

The last step, at initgame:, is  prepping up game-state logic: timers, sprite initial state (position and animation frame), scores, and a playing flag indicating the game is in progress. Then a bunch of subroutines are called to draw the game into PPU memory (since it's a simple one-screen game, there isn't any scrolling and scroll logic is used just to flip between the main game screen and the "Player wins!" screen).

Game Phase Loops

The game has, broadly, four phases: wait for start, countdown to play, play game, and player-win screen. A series of four blocks of code handles these. Each is structured basically the same way:
  1. Wait for NMI ("non-maskable-interrupt",  the signal used for, among other things, telling the CPU that the PPU is entering vertical-blanking mode and it's safe to mess with PPU memory)
  2. Update the PPU to move things around on the screen
  3. Use the vwait_end subroutine to wait for the PPU to start drawing things on screen, which is a good time to run game logic
  4. Do game logic
Game logic varies from phase to phase: some just run counters, some ask for updates from controllers. The playing flag is set while the main-game phase is running, which other subroutines can use to decide if we're actually in the game phase.

Game Logic Code

Various subroutines in this area handle things the game needs to do to maintain game state, including reading controllers, updating scores, and updating logic that drives animations. Some of these subroutines take input in the form of assuming nbasic named variables are set (I could possibly have used the stack for this, but see "Using nbasic" for why I chose not to). They also use named variables to maintain internal state, which caused me to trip over a bug in nesasm; nesasm tops out at 32 characters for any label (a simple buffer-overrun protection), but it handles long labels by ceasing to read characters, not by throwing an error. This approach is memory-wasteful, in that each subroutine with internal state is going to take a couple bytes of memory that nothing else can use (but the alternative is coming up with a clever protocol for re-using bytes, which risks having two subroutines "cross talk" and tap-dance on each other's internal state).

Of particular note here is load_joysticks: and detect_edge:, which populates joy_edge, an array of bits representing which buttons are freshly-pressed this detection cycle. This is needed for distinguishing new taps from buttons held down (scoring is done at the rate of 1 point per fresh tap, to get that old-school "hammer on the controller for more points" feel). 

This is also where animate lives, which is a simple routine for driving "animations" in the form of x and y offsets to a sprite over time. Each sprite's position is controlled by two value sets: the actual displayed x and y position (which lives in the sprite_mem array and is flashed into the PPU every redraw) and the logical x and y position (which live in sprite_x and sprite_y). The sprite_anim_idx value is an offset into an animation data field that drives animation; every frame, the sprite's x and y offset are read from animation at sprite_anim_idx, added to sprite_x and sprite_y, and the result stored to the on-screen position in sprite_mem. Then sprite_anim_idx is rolled forward, and if it points to the sentinel value 0x80, the animation is done. This is the beginning of a mechanism that could allow for more complex animations (for example, we could add changing the displayed CHR for the sprite, or an additional sentinel value could be added to create looping animations).

Squirreled away under the PPU control code and before the raw data is a bit more game logic that includes a shamelessly-stolen randomizer. The NES itself doesn't have a "true" source of entropy (i.e. there's no built-in battery-backed clock, so without battery-backed memory, the NES's operation is fully deterministic from powerup based on code and controller inputs). The game uses the time between start and player 1 pressing 'A' to start the game to seed a random number generator driving where the dog wants to be pet.

PPU Control Code

This code controls moving things to the PPU for display. Huge, repetitive chunks of this code are loading and storing slabs of images on screen. There's plenty of room for improvement, and I sort of got better over time as I had to write more (for example, I'm not at all proud of load_dog in its current form, which is split up to account for the limitations of CPU speed to make room for screen redraws between loading chunks of the dog). In the dog-loading code, I also accounted for a limitation of the 6502 CPU architecture: when loading data from an absolute memory offset, the 8-bit math only allows at most 256 bytes of data to be addressed from one absolute offset. There are much better ways to design this (the 6502 also allows "indirect indexed" and "indexed indirect" memory access, which reads and writes values based on two bytes in zero-page memory and is the closest thing to modern pointer arithmetic the 6502 offers). The nbasic language itself doesn't have a syntax to take advantage of this feature (as far as I could tell), and I shied away from using direct assembly (for reasons I describe in "Using nbasic"), so I repeated a lot of simple "ripper logic" loops for the sole purpose of loading the same kind of thing from different points in memory.

Raw Data

The wide blocks of raw data are mostly used to map objects in the game to their positions on screen and the CHR tiles that make up their images. Generally, these come in two flavors:
  • starting PPU Y and X address, first CHR, and how many CHRs make up the image (mostly for text)
  • starting PPU Y and X address and a list of CHR bytes making up the image horizontally (for big things like the dog)
Finally, the game palette is configured to give some nice colors for the dog and four distinct player colors for the hands.

Were I to write this all over again, indirect-indexed addressing could really cut down on the amount of repeititon here, as well as in the PPU control code (possibly at the cost of speed, since the indirect CPU operations cost more clock cycles... But this is not a game in need of high-performance tuning!).

Using nbasic

Overall, I found nbasic to be relatively straightforward and much preferred to raw assembly-bashing; it makes some things easier and a few things maybe a little harder

Pros

  • The language really simplifies some things that would require verbose proper form in assembly. In particular, abstracting indirect memory access as arrays and simple arithmetic as prefix-based expressions makes that logic much easier to read than gleaning the meaning of blobs of assembly. I found I would quickly fall into a "flow" of bashing out new segments of the code.
  • The language isn't far abstracted from the underlying assembly; in places where it did something surprising, taking the compiler's assembly dump and matching statements up against the statements in nbasic was very easy. For something like this, where individual bytes can matter (for both space and CPU cycles), that lack of indirection on the abstraction is helpful.

Cons

  • I tripped over a couple of bugs in the implementation. One I was able to push a fix for; the others were more challenging to reproduce and included an issue mixing binary, hex, and decimal representations of numbers in one data statement. There are also a few cases where nbasic will emit code that doesn't pass the assembler (for example, the branchto operator doesn't confirm whether or not the branch distance is too far, trusting the assembler to know). These issues were minor and relatively easy to work around.
  • While nbasic allows users to touch the three 6502 registers directly (i.e. a, x, and y refer to the actual registers), the way they interact with nbasic code is ill-specified and very implementation-dependent. In practice, I found it safest to use assembly directly when I manipulated those variables.
  • There is no concept of functions or local state in nbasic. This could be considered both a pro and a con; the NES architecture is small, and cycles matter. Managing internal state like that costs resources. However, a lightweight function or local-state abstraction would be nice-to-have.
  • As far as I can tell, nbasic doesn't really allow for "full" pointer arithmetic; memory can be accessed by offset from absolute addresses, but accessing it via "the 2-byte address stored at this address" seems to be nonexistent (unless I missed it).
  • Partially because of the difficulty of doing pointer arithmetic, one of the bits of toil coding in this environment is repeating functions with minor tweaks, such as changing the labeled input. The ability to do templates or macro expansions would be valuable for decreasing the amount of repetition needed in the code.

Possible Future Extensions

If I were to extend nbasic, the main thing I'd add is a function-call operation. The 6502 has a stack abstraction that makes a push-call-pop argument-passing interface extremely possible. It could be a fun project to extend nbasic in this way. Local state could be a bit trickier, but coupled wit h function calls, the notion of a local variable can be managed by pushing them to the stack when a call occurs.

An abstraction for pointer arithmetic would also be valuable. I can imagine doing it by providing a simple abstraction for setting two subsequent bytes of zero-page memory to a labeled address, then allowing the [array offset] accessor to reference them. It's possible one would even want those two-byte zero-page values to have a special name (maybe a new pointer declaration, like the existing array declaration) to denote their purpose?

What is next?

There are other languages available for programming the 6502, and I'm interested in checking them out. For my next project, I plan to explore the CO2 language, which is a LISP-based system created to build the game "What Remains." I'll be interested to see how a more abstract language changes the game of making the game.

No comments:

Post a Comment