Using SP1 for scroller games

The place for codemasters or beginners to talk about programming any language for the Spectrum.
jorgegv
Microbot
Posts: 112
Joined: Mon Aug 09, 2021 4:50 pm

Using SP1 for scroller games

Post by jorgegv »

Hi dudes,

I'm reposting here some articles I have recently posted on the Z88DK forums; I hope you find them useful. I'll keep posting updates to both forums, if @PeterJ does not forbid me to do so :D

--------
My previous proofs of concept for vertical scrollers on the ZX spectrum were based on double buffering, and the offscreen buffer had a linear memory layout. This layout very much simplifies the scrolling and screen transfer routines, and also the sprite routines (had they been written!). Linear memory layouts are simple and trivial to work with.

Having done the initial POC for the buffer, scrolling and transfer routines, and having the scroller already working on the screen, I found myself having to write the sprite routines if I wanted to do any games. Sprite routines would have been simple to write for the linear buffer, but the point was that I already had extensive experience using the SP1 library (a great sprite library for the Spectrum written by Alvin Albrecht), and even fixing and modifying it; SP1 already has the capabilities for handling the most important layers in a game: the background layer and the sprite layer. I felt it could be really good if SP1 could be used for handling those layers in my Speccy scroller games.

Then I remembered that when specifying the background tiles in SP1 for each character position, you use a 16-bit value that can specify 2 tile types:

- If the value is 0-255 (high byte is 0), then the low byte is used as an index on a global tile table (8 bytes per tile), which needs to be predefined elsewhere, normally during game loading or initialization.

- If the value is greater than 255 (high byte != 0), then the full 16-bit value is used as a pointer to the raw 8 bytes that should go into that screen position.
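
To make the two modes concrete, here is a minimal C sketch of how such a 16-bit value could be decoded. This is not SP1's actual code: `tile_table`, `resolve_tile()` and the `memory` array (which stands in for the Spectrum's 64K address space so the sketch can run on a host) are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TILE_BYTES 8

/* Hypothetical stand-in for the global tile table: 256 tiles x 8 bytes. */
static uint8_t tile_table[256][TILE_BYTES];

/* Resolve a 16-bit tile value to the 8 graphic bytes it designates.
   `memory` emulates the 64K address space; on the real machine the
   value itself would be used as the address. */
static const uint8_t *resolve_tile(uint16_t value, const uint8_t *memory)
{
    assert(memory != NULL);
    if ((value >> 8) == 0)
        return tile_table[value];   /* high byte 0: index into the table */
    return memory + value;          /* otherwise: address of 8 raw bytes */
}
```

The second branch is the one the scroller relies on: any 8 contiguous bytes anywhere in memory can act as a tile.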

I use the second mode in my RAGE1 game library as a way to escape the 255 tile limit, and I am quite familiar with it, so I realized that this method could be used in the following way:

- The offscreen buffer is divided into 8-byte character cells (8x8 pixels) that can be used as SP1 tiles. That is, instead of a linear memory buffer organized by horizontal lines, graphic data is stored as an array of (M rows x N cols) character tiles.

- The SP1 background tile map is set up so that for each screen character cell, its tile pointer is set to the corresponding cell in the offscreen buffer.

- Scrolling is done on the offscreen buffer, and then the scrolling area is fully SP1-invalidated on each frame (`sp1_Invalidate()` function), so that all of it is redrawn when `sp1_UpdateNow()` function is called.

- Since the sprites can also be managed by SP1, the net result is that we can have a scrolling background with our SP1 sprites moving over it.

- Finally, as an optimization, the cell layout in the offscreen buffer is done by columns (instead of by rows): in (row,col) cell coords we have (0,0), then (1,0), then (2,0), and after the first column ends, (0,1), then (1,1), then (2,1), etc. With this storage schema, vertical scrolling on a given column of character cells can be done with just 8 x M LDI/LDD instructions (unrolled), or a single LDIR/LDDR (loop).
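
The column-wise layout can be sketched in portable C as follows. This is only an illustration under assumed sizes (`ROWS`, `COLS` and the function names are hypothetical, not taken from the repo), and `memmove()` stands in for the LDDR/unrolled LDD transfer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ROWS      4              /* M: cell rows (hypothetical size)    */
#define COLS      3              /* N: cell columns (hypothetical size) */
#define COL_BYTES (ROWS * 8)     /* pixel lines in one cell column      */

static uint8_t buffer[COLS * COL_BYTES];

/* Address of the 8-byte cell at (row, col), stored column by column:
   (0,0), (1,0), (2,0), ... then (0,1), (1,1), ... */
static uint8_t *cell_ptr(unsigned row, unsigned col)
{
    assert(row < ROWS && col < COLS);
    return &buffer[(col * ROWS + row) * 8];
}

/* Scroll one cell column down by `pixels` lines. The whole column is one
   contiguous run of bytes, so this is a single overlapping block move.
   The top `pixels` lines keep their old content; the caller is expected
   to refresh them. */
static void scroll_column_down(unsigned col, unsigned pixels)
{
    assert(col < COLS && pixels <= COL_BYTES);
    uint8_t *c = &buffer[col * COL_BYTES];
    memmove(c + pixels, c, COL_BYTES - pixels);
}
```

Because each cell is still 8 contiguous bytes, `cell_ptr(row, col)` is exactly the kind of address that can be handed to SP1 as a tile pointer.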

I have made a simple POC which configures all SP1 tile pointers in the scrolling area to a single tile which I trivially manipulate on each frame to get a feel of the scrolling movement; just to verify if SP1 is quick enough to do this.

I found that of course it can't do 50 fps (I expected that), but so far I have found it quite capable, so I'll do more experiments by tweaking e.g. the scrolling area, the scrolling speed, number of sprites, etc.

I'll report back :D

P.S. This post is copied from the notes in my scroller repo; you can head there for the example code. Of course, any feedback is very welcome as usual.

P.P.S. Just to clarify: my scrolling experiments are always pixel scrolling, not character scrolling.

Re: Using SP1 for scroller games

Post by jorgegv »

SP1 scrolling test 1

It can be found in the src/sp1-randomtiles directory. This is a real scroller, using the workflow that would be used for a real game: tiles are drawn on some non-visible top rows and are brought into view by the scrolling process. In this example, the logic for printing tiles on the top is trivial, but in a real game those would come from a map.

Keep in mind that the top row of invisible tiles is only drawn once in a while (when a full tile row has been scrolled down) and not on every scroll cycle.

The scrolling routine has been unrolled (multiple LDDs instead of one LDDR) for better speed.

The number of pixels to be scrolled down on each scroll cycle can be configured at the top of the demo. Fewer pixels = slower, but smoother. More pixels = faster, but clunkier. 1-pixel scrolling is smooth, but it may or may not be appropriate for the game.

The initial numbers for this scroller are around 12.5 FPS, so we are spending close to 4 display frames (at 50 Hz) per update just for the graphics handling. We could aim for a 10 FPS target (5 display frames per update), leaving about 1 frame for our game calculations, sprite movements, checks, etc.

Also worth considering: SP1's graphics algorithm is designed to not have any flickering, so we can try to run at full tilt, avoiding the HALT and not waiting for Vsync. This would make the game run slower or faster depending on the number of elements on screen, which may not be desirable, but it's worth a try when we have sprites moving on the screen (in the next tests).

Here is the TAP file, code is in the repo.

Re: Using SP1 for scroller games

Post by jorgegv »

SP1 scrolling test 2

It can be found in the src/sp1-sprites directory. It's the previous Test 1 real scroller, but this time with sprites running all over the place, while the background is scrolling down.

The source is fully parameterized, so that different configurations can be tested and conclusions drawn.

The following parameters can be modified by just changing the #define's at the top of the C file:
  • Size and position of the scrolling area (dimensions in char cells)
  • Number of pixels to scroll down on a single scroll cycle (pixels)
  • Size of the top non-visible tile row - the one that brings new tiles from the game map (char cells). This is, practically, the size of the biggest tile in the map.
  • Number of sprites (a maximum of 16 have been defined, but only the number #define'd will be used for the demo)
My findings so far:
  • SP1 seems very capable of running a game with a scrolling background and several sprites moving on the screen
  • The number of simultaneous sprites affects speed, but not as much as I expected. A possible explanation may be that since the whole scroll area is invalidated (currently) the drawing of all sprites does not invalidate more cells when more of them are on screen, so just their calculations and positioning matter.
  • I have the definite impression that not using the HALT to wait for Vsync makes the demo go smoother (opinions?)
My next optimizations will be oriented towards finding a way to invalidate only the affected cells instead of the whole scroll area. This will probably force me to keep track of the current tiles on screen and their positions, and will probably give a real speed boost.

Click here for the TAP file and the source code.

P.S. Assembler functions and unrolled loops have been modified to not require any manual adjustment - just change the #define; also I have switched to SDCC compiler for better C syntax support :-)

Re: Using SP1 for scroller games

Post by jorgegv »

SP1 scrolling test 3

It can be found in the src/sp1-partial-inv-1 directory. Based on Test 2, but this one keeps track of the tiles that are present on each column, and only invalidates the cells that are affected by them. It also keeps track of how the tiles walk down the screen, and properly moves the cells to invalidate.

The information on drawn tiles is known when printing tiles on the top non-visible row.

The basic idea to explore is column cell-range invalidation:
  • Instead of invalidating the whole scroll area with a single sp1_Invalidate() call, we will run an invalidation per column.
  • On each column, only cells that have moving content will be invalidated, taking into account that e.g. a 2-row tile will most of the time invalidate a 3-row area due to vertical pixel offset.
  • Initially, only one range per column will be invalidated (for efficiency reasons). That is, if we have tiles in rows 4-5 and 10-11, the whole 4-11 range will be invalidated.
  • For each column, there is a (min,max) pair which holds the current range of cell invalidations to be done for that column. These pairs are moved down for all columns every 8 scrolled pixels (cell changes occur every 8 scrolled lines). So if the scroll is 1-pixel, this is done every 8 scroll cycles; if the scroll is 2-pixel, every 4 scroll cycles; etc.
  • Since these min and max values will be cell-based, and in ranges 0-31/0-23, they will be of type int8_t (signed) and we will use min = -128 (NO_RANGE constant) to signal that there is no invalidation to be done on that column.
  • Initially, each column starts with a (NO_RANGE,NO_RANGE) pair (no invalidation)
  • When printing a tile on the top non-visible row:
    - Set MIN to -(height of top non-visible row)
    - If MAX is NO_RANGE, set MAX to 0
  • Every 8 scrolled pixels, do the following for each of the columns:
    - Adjust MIN and MAX (add 1 to each of them, saturating at the maximum row value)
    - If MIN is out of the scroll area, set both MIN and MAX to NO_RANGE
  • The previous adjustments and movements will have to take into account 1 additional cell due to the pixel offsets mentioned above.
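
The bookkeeping above can be sketched in C like this. It is a simplified illustration, not the repo code: the sizes and function names are hypothetical, only MAX is saturated (my reading of the rule above, so that MIN can actually leave the area), and the extra cell for the pixel offset is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SCROLL_COLS   16      /* hypothetical scroll area width, in cells  */
#define SCROLL_ROWS   16      /* hypothetical scroll area height, in cells */
#define TOP_ROW_CELLS 2       /* hypothetical height of the hidden top row */
#define NO_RANGE      (-128)  /* min == NO_RANGE: nothing to invalidate    */

struct col_range { int8_t min, max; };
static struct col_range ranges[SCROLL_COLS];

/* Initially no column has anything to invalidate. */
static void ranges_reset(void)
{
    for (int c = 0; c < SCROLL_COLS; c++)
        ranges[c].min = ranges[c].max = NO_RANGE;
}

/* Call when a tile is printed on the hidden top row of column `col`. */
static void range_on_tile_printed(int col)
{
    assert(col >= 0 && col < SCROLL_COLS);
    ranges[col].min = -TOP_ROW_CELLS;
    if (ranges[col].max == NO_RANGE)
        ranges[col].max = 0;
}

/* Call once every 8 scrolled pixels: every active range moves down a row. */
static void ranges_move_down(void)
{
    for (int c = 0; c < SCROLL_COLS; c++) {
        struct col_range *r = &ranges[c];
        if (r->min == NO_RANGE)
            continue;
        r->min++;
        if (r->max < SCROLL_ROWS - 1)
            r->max++;                      /* saturate at the last row   */
        if (r->min > SCROLL_ROWS - 1)
            r->min = r->max = NO_RANGE;    /* range has left the area    */
    }
}
```

On each frame, the invalidation loop would then visit each column and skip those whose `min` is `NO_RANGE`, clamping negative `min` values to row 0 before building the rectangle.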
We could keep track of all tiles on each column and only invalidate what is strictly necessary, but this would imply having a changing list for each column, which means more processing on a critical code path. If we have several tiles on the same column, I have assumed that it's probably cheaper to simply invalidate everything in between than to try to cherry-pick cells to invalidate.

My findings with this test:
  • The idea of only invalidating the needed areas is good and speeds up the demo
  • The current algorithm is not so good: given that on any given column, only a single range of rows can be tracked and invalidated, it quickly degenerates into the full-column invalidation case. E.g.: when the map has few and vertically sparse tiles, the algorithm works well (it can be appreciated at the beginning of the demo, when only a few background tiles can be seen on the screen); but when more tiles start to appear, suddenly all the columns have several tiles and almost the full column is invalidated, defeating the optimization.
A better algorithm which strictly keeps track of all the on-screen tiles and their invalidating rectangles will be explored in Test 4.

As usual, click here for the TAP file and the source code.

Re: Using SP1 for scroller games

Post by jorgegv »

New stack-based scrolling routine

Also in Test 3, a new stack-based scrolling function has been developed and tested.

I have found that the regular unrolled LDD version is hard to beat, mainly due to the fact that when moving the bytes, the source and destination ranges mostly overlap, and this makes it more difficult to optimize the stack transfer.

The approximate T-state count for each of the versions is commented in the code.
clebin
Manic Miner
Posts: 979
Joined: Thu Jun 25, 2020 1:06 pm
Location: Vale of Glamorgan

Re: Using SP1 for scroller games

Post by clebin »

Lovely, thanks @jorgegv. I’ll look forward to digging into this when I have time to get back to Speccy coding. A couple of my game ideas really need some form of scrolling and my skillz aren’t up to the job of figuring it out.

If you’re in tinkering mode again in the future, it would be interesting to see a character based scroll, maybe around a larger map. When I get around to finishing my Gilligan’s Gold update I’m aware that I’m adding to the pile of flip-screen games with platforms. These routines open up all sorts of possibilities that we don’t see much of in Speccy homebrew.

Re: Using SP1 for scroller games

Post by jorgegv »

Well, I'm in tinkering mode right now :D

These tests I'm doing can be configured for cell-sized (=8 pixel) scrolling, but I think you will probably be more interested in on-demand character scrolling in all directions. You can test my routines for 8-pixel scrolling when I include attribute scrolling in them (or you can test them now if you don't mind about the attributes).

So far my next plans/ideas to test are:
  • A better invalidation algorithm for the scrolling tiles
  • Selective scrolling: not the full column, but only the address ranges that have tiles in them
  • Attribute scrolling
  • Vertical parallax scrolling (=scrolling different columns at different speeds)
  • General speed optimizations (local->static vars, etc.)

Re: Using SP1 for scroller games

Post by jorgegv »

SP1 scrolling test 4

It can be found in the src/sp1-partial-inv-2 directory. Based on Test 3, but changing the algorithm which keeps track of the cells to invalidate. Conceptually, this method keeps track of the position of each and every tile which is seen on screen, and periodically adjusts the cells that it invalidates in lockstep with the scrolling routine.

Basic ideas:
  • We keep a global list of printed tiles and their invalidation rectangles. This list is a ring buffer for efficiency, since all tiles in that list will be added and removed in FIFO fashion
  • When printing a tile on the top row, it is also added to this list, together with its position and rectangle definition
  • The list is updated every 8 scrolled pixels and all the positions of all visible tiles and their rectangles are moved down one row
  • When tiles reach the bottom row and go out of sight, they are removed from the global list. So at any moment, the list only contains the visible tiles.
  • The maximum size for the global list is fixed and depends strictly on the scroll area and the tile sizes, so no heap is required. It can be declared statically with a maximum size, and the map function can ensure that there are never more than that number of visible tiles on screen (including the top non-visible row)
  • The invalidation routine walks the global list going over each of the visible tile position records, and invalidating its rectangle. It is run every frame, so every optimization here has a big impact on performance. The initial version created a sp1_Rect for every visible tile based on its coordinates, but this code was quickly moved into the tile printing and moving functions, since these two are not run every frame, but every 8 scrolled pixels.
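
A minimal C sketch of such a statically sized FIFO ring buffer follows. The record layout, sizes and function names are hypothetical and chosen for illustration; the repo code stores a precomputed sp1_Rect per tile, which is replaced here by plain coordinates so the sketch runs on any host.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VISIBLE_TILES 8    /* hypothetical bound; depends on area and tile sizes */
#define SCROLL_ROWS       16   /* hypothetical scroll area height, in cells          */

struct tile_rec {
    int8_t  row;               /* top cell row; negative while in the hidden row */
    uint8_t col, height, width;
};

static struct tile_rec tiles[MAX_VISIBLE_TILES];
static uint8_t head;           /* index of the oldest (lowest on screen) record */
static uint8_t count;          /* number of records currently in the list       */

/* Append a newly printed tile; returns 0 if the list is full. */
static int tiles_add(int8_t row, uint8_t col, uint8_t height, uint8_t width)
{
    assert(height > 0 && width > 0);
    if (count == MAX_VISIBLE_TILES)
        return 0;
    struct tile_rec *t = &tiles[(head + count) % MAX_VISIBLE_TILES];
    t->row = row;
    t->col = col;
    t->height = height;
    t->width = width;
    count++;
    return 1;
}

/* Call every 8 scrolled pixels: move all tiles down one row, then drop
   the oldest ones once they have gone past the bottom row. */
static void tiles_move_down(void)
{
    for (uint8_t i = 0; i < count; i++)
        tiles[(head + i) % MAX_VISIBLE_TILES].row++;
    while (count > 0 && tiles[head].row >= SCROLL_ROWS) {
        head = (uint8_t)((head + 1) % MAX_VISIBLE_TILES);
        count--;
    }
}
```

Since tiles enter at the top and leave at the bottom in the same order, removal only ever happens at `head`, which is what makes a ring buffer sufficient here.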
This version is faster when there are fewer background tiles on the screen, so its real utility will depend on the complexity of our map. For a map with lots of visible tiles, it seems to be more efficient to use the previous versions which simply invalidate the whole scroll area.

So far the experiments indicate that if you have a complex map, then it's better to have a simple routine that invalidates all the area, and spend the (less) remaining CPU time in having a good simple gameplay. Or you can have a simple background and go with a more complex gameplay, faster, with more sprites, etc.

Next test: Test 5, partial scroll. I'll try not to scroll the whole column, but only the address ranges affected by the visible tiles. We already have this info in the invalidation ranges used in Test 3, so Test 3 will be easily adapted for this case.

It will be very difficult to do it by reusing the tile list used in Test 4 because tiles in that list are stored horizontally (i.e. tiles on top row, then tiles below, etc.), but we need the vertical ranges to make the scroll. It will be very time-consuming to scroll every tile rectangle separately.

I'm also thinking of designing a visual way of measuring performance that I can include in all the tests; some counter that is incremented on each interrupt and measured and reset on every scroll cycle, and periodically output to screen in some way that does not interfere much with the graphics update.

I also split the code in several files for easier handling.

You can download the TAP file, or read the code here.

Re: Using SP1 for scroller games

Post by jorgegv »

SP1 scrolling test 5

I have decided not to explore the mentioned Test 5 case (partial column scroll), since again we would have the degenerate case of having to scroll most of the column most of the time, and also the scrolling is not the most time-consuming operation being executed.

SP1 scrolling test 6

It can be found in the src/sp1-parallax directory. Based on Test 3, the assembler scrolling routine is modified to accept a parameter which is the number of pixels to scroll. Since the scroll routine is LDD based, the number of pixels scrolled does not affect the speed of the routine, it's just a matter of adjusting the offset from the source to the destination address.

The scrolling area is now divided in 3 areas: AREA 1 is the center zone, where the action happens (columns 2 to 13); AREA 2 (columns 1 and 14) and AREA 3 (columns 0 and 15) are the parallax effect zones that scroll at different speed than the main one. AREA 1 scrolls at 1 pixel per cycle, AREA 2 at 2 pixels per cycle and AREA 3 at 4 pixels per cycle.
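
The column-to-speed mapping just described can be written down as a small C sketch (the function names are hypothetical; the column numbers and speeds are the ones given above for a 16-column area):

```c
#include <assert.h>

#define SCROLL_COLS 16

/* Pixels scrolled per scroll cycle for a given column of the area. */
static unsigned pixels_per_cycle(unsigned col)
{
    assert(col < SCROLL_COLS);
    if (col >= 2 && col <= 13)
        return 1;               /* AREA 1: center zone            */
    if (col == 1 || col == 14)
        return 2;               /* AREA 2: inner parallax columns */
    return 4;                   /* AREA 3: columns 0 and 15       */
}

/* Scroll cycles between two cell-row changes (8 scrolled pixels). */
static unsigned cycles_per_cell(unsigned col)
{
    return 8 / pixels_per_cycle(col);
}
```

So AREA 1 needs its top row refreshed every 8 cycles, AREA 2 every 4 and AREA 3 every 2.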

These new areas have to be explicitly managed when doing the whole scroll effect. This effect is achieved with 3 main functions: draw_top_row_of_tiles(), scroll_down_area(), and move_down_tile_positions(). All 3 functions have been modified so that they take into account the 3 different areas and their scrolling speeds.

Attributes have been changed on all areas for better observation of the effect, and also sprite movement has been constrained to AREA 1, so they don't move over the parallax zones.

Admittedly I'm not a great artist and the tiles are not of great quality.

I was expecting that parallax would make the general scrolling substantially slower, but surprisingly it didn't. I attribute this to the draw_top_row_of_tiles() and scroll_down_area() functions not doing that much additional work (with respect to Test 3), because the parallax tiles are half the size of the regular ones and they are also printed only occasionally (like the others). The move_down_tile_positions() function has not changed at all, since the work to do is the same for all the columns, no matter in which parallax zone they are.

My next steps will be to optimize the code, add a trivial IM2 routine and deploy the "performance" monitor in all the previous tests, so that the perceived performance can be justified.

I'll continue exploring other scrolling techniques with code in this repo, although it will probably not be SP1 based.

As with previous examples, you can download the TAP file, or read the code here.

Re: Using SP1 for scroller games

Post by jorgegv »

Performance review of all SP1 scrolling tests

In this section I have compiled the performance characteristics for all the previous tests. For this, I needed to normalize the code so that comparisons between measurements are meaningful.

I have disabled the regular ROM interrupt processing and activated a minimal IM2 interrupt routine and performance counter in all the previous SP1 examples, so that we have a consistent way of measuring the performance of all of them.

Performance is then measured as a Frames Per Second counter which is shown every second at the bottom left corner. It is interesting to see how it changes depending on the number of tiles on screen and the different algorithms used for updating the scrolling zones.
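
A counter of this kind can be sketched in C as follows. This is only an assumed structure (the names are hypothetical and the real IM2 handler lives in the repo): the interrupt fires once per display frame, i.e. 50 times per second, and the main loop reports each completed screen update.

```c
#include <assert.h>
#include <stdint.h>

#define INTERRUPTS_PER_SECOND 50   /* one interrupt per 50 Hz display frame */

/* volatile: shared between the interrupt handler and the main loop */
static volatile uint8_t ticks;     /* interrupts seen in the current second */
static volatile uint8_t updates;   /* screen updates done in this second    */
static volatile uint8_t last_fps;  /* value to print on screen              */

/* Would be called from the (hypothetical) IM2 interrupt handler. */
static void perf_on_interrupt(void)
{
    if (++ticks == INTERRUPTS_PER_SECOND) {
        last_fps = updates;        /* latch the count once per second */
        updates = 0;
        ticks = 0;
    }
}

/* Called by the main loop after each full scroll + update cycle. */
static void perf_on_update(void)
{
    assert(updates < 255);
    updates++;
}
```

The main loop only has to print `last_fps` when it changes, so the measurement itself adds very little load.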

Synchronizing to screen retrace via HALT instruction has also been removed in all tests, since it is not needed in SP1 games, and also allows us to measure the raw speed with no artifacts.

Finally, all scroll areas have been normalized to a 16x16-cell zone, and the number of sprites has been set to 6 in all tests that have them.

The following table summarizes the measured performance of each test:


| Test # | Directory         | Description                                                           | FPS   |
| ------ | ----------------- | --------------------------------------------------------------------- | ----- |
| Test 0 | sp1-baseline      | PoC to test raw performance of SP1 updating whole scroll area         | 22    |
| Test 1 | sp1-randomtiles   | Whole scroll/top-row-update loop, full column scroll and invalidation | 17-18 |
| Test 2 | sp1-sprites       | Test 1, with added moving sprites                                     | 13    |
| Test 3 | sp1-partial-inv-1 | Test 2, with added partial column invalidation, method #1             | 17-25 |
| Test 4 | sp1-partial-inv-2 | Test 2, with added partial column invalidation, method #2             | 20-25 |
| Test 5 | -                 | Not implemented                                                       | -     |
| Test 6 | sp1-parallax      | Test 3, with 2 added parallax zones scrolling at different speeds     | 11-17 |
Conclusions:
  • For a 16x16 scrolling area, a baseline framerate of 22 FPS shows that scrolling games with SP1 are quite a possibility, given that SP1 conveniently integrates background _and_ sprite management in a single library.
  • The scrolling routine is not the critical part, but the SP1 update is. Initial baseline measurements indicated that the scrolling code spends only around half a frame scrolling an area of this size, with the rest of the time used by SP1.
  • The optimizations done in Test 3 and Test 4 (partial invalidation instead of fully invalidating the whole scroll area) are indeed valuable and make the FPS occasionally reach the baseline measurement, and even go a bit higher. This is quite remarkable, given that in these tests we have sprites moving all around. The optimization in Test 4 shows slightly better performance than the one in Test 3.
  • In both Test 3 and Test 4, results depend highly on the number of visible background tiles, which is to be expected.
  • The parallax effect, contrary to what was indicated in the previous statements, generates a measurable additional load with respect to code with a single scrolling zone.
  • For a scroller game with a single zone (no parallax), the best algorithm seems to be the one in Test 4

Re: Using SP1 for scroller games

Post by jorgegv »

Updated performance review for a 16x24-cell scrolling zone

I have reconfigured all tests to use a 16x24-cell scrolling zone, which is a more realistic size for a scrolling game on the ZX. The results are proportionally similar to the previous ones, but with bigger differences between the baseline measurements and the optimized ones in Test 3 and Test 4.

Here is the table with the old and new results in an additional column for comparison:


| Test # | Directory         | Description                                                           | FPS-16x16 | FPS-16x24 |
| ------ | ----------------- | --------------------------------------------------------------------- | --------- | --------- |
| Test 0 | sp1-baseline      | PoC to test raw performance of SP1 updating whole scroll area         | 22        | 15        |
| Test 1 | sp1-randomtiles   | Whole scroll/top-row-update loop, full column scroll and invalidation | 17-18     | 12        |
| Test 2 | sp1-sprites       | Test 1, with added moving sprites                                     | 13        | 10        |
| Test 3 | sp1-partial-inv-1 | Test 2, with added partial column invalidation, method #1             | 17-25     | 13-22     |
| Test 4 | sp1-partial-inv-2 | Test 2, with added partial column invalidation, method #2             | 20-25     | 17-23     |
| Test 5 | -                 | Not implemented                                                       | -         | -         |
| Test 6 | sp1-parallax      | Test 3, with 2 added parallax zones scrolling at different speeds     | 11-17     | 9-16      |
In this new test it can clearly be seen that the baselines (the ones that manage and invalidate the full area) suffer a big performance penalty when enlarging the scrolling area, while the optimized ones are not quite that affected. Indeed the algorithm in Test 4 seems a real candidate for a good ZX scrolling game, using all SP1 power.

I'd like to thank everyone for the feedback on this series of articles. As I said, I'll continue on my personal learning path, exploring and documenting some other scrolling techniques, but not SP1 based. I'm really satisfied with all the knowledge gathered here and I hope this can be a reference work for future game developers who want to make a scrolling game.

Have fun
J.
dfzx
Manic Miner
Posts: 683
Joined: Mon Nov 13, 2017 6:55 pm
Location: New Forest, UK

Re: Using SP1 for scroller games

Post by dfzx »

I think a video, or at least some animated GIFs, might help get across what you're saying here. I appreciate you know what you're doing, but it's not an easy story to follow.
Derek Fountain, author of the ZX Spectrum C Programmer's Getting Started Guide and various open source games, hardware and other projects, including an IF1 and ZX Microdrive emulator.

Re: Using SP1 for scroller games

Post by jorgegv »

Mmmm how about direct links to the TAP files?
I take note of the video suggestion though. I think animated GIFs are not a good idea because we are talking about frames per second here, so GIF animation would probably interfere with it.

Re: Using SP1 for scroller games

Post by jorgegv »

dfzx wrote: Mon Jun 12, 2023 9:53 am I think a video, or at least some animated GIFs, might help get across what you're saying here. I appreciate you know what you're doing, but it's not an easy story to follow.
Following Derek's suggestion, I have prepared a video presentation on these articles. Some SP1 background and detailed explanations on how the examples work are included; they are also shown running:

https://youtu.be/wSUX3YdvARw

The video is ~30 minutes long, you have been warned :D

Re: Using SP1 for scroller games

Post by jorgegv »

A small mistake slipped into the video: around 13:05 I say that the fastest scrolling routine is the LDDR one, but it is not. The fastest one is the one with the unrolled LDD loop, as can be seen in the repo code.

Re: Using SP1 for scroller games

Post by jorgegv »

Hi again.

I have continued my SP1 scrolling quest, now with multidirectional scroll... and so far, the baseline looks good! See for yourself:

Image

The demo above scrolls a 128x128 pixel viewport in 1-pixel increments over a 768x384 pixel map, following a predetermined path. The map is built as a 48x24 grid of 16x16-pixel tiles.
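
The numbers above fit together as in this small C sketch (constant and function names are hypothetical; the values are the ones from the demo description):

```c
#include <assert.h>

#define TILE_PX     16    /* tile size in pixels           */
#define MAP_TILES_W 48    /* map width in tiles            */
#define MAP_TILES_H 24    /* map height in tiles           */
#define VIEW_PX     128   /* viewport width/height, pixels */

static unsigned map_width_px(void)  { return MAP_TILES_W * TILE_PX; }
static unsigned map_height_px(void) { return MAP_TILES_H * TILE_PX; }

/* Largest valid top-left viewport coordinate on each axis. */
static unsigned max_scroll_x(void) { return map_width_px()  - VIEW_PX; }
static unsigned max_scroll_y(void) { return map_height_px() - VIEW_PX; }

/* Index (along one axis) of the tile that contains map pixel `px`. */
static unsigned tile_index(unsigned px)
{
    assert(px < map_width_px());
    return px / TILE_PX;
}
```

The viewport therefore shows an 8x8-tile window at a time, and its top-left corner can travel from (0,0) to (640,256) in map pixels.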

With no optimizations (brute force scrolling and SP1 update of full area), the framerate seems good enough to try on some optimizations.

The TAP file can be downloaded here.

I'm preparing a post with a detailed explanation.

Re: Using SP1 for scroller games

Post by dfzx »

As someone who knows a bit about SP1, I can say that this really is an exceptional piece of work.

Can you do a demo video of faster scrolling, say 2 or 4 pixel steps? I'm guessing that's what the parallax example is doing, at least in part?

Re: Using SP1 for scroller games

Post by jorgegv »

dfzx wrote: Tue Jul 04, 2023 2:40 pm As someone who knows a bit about SP1, I can say that this really is an exceptional piece of work.

Can you do a demo video of faster scrolling, say 2 or 4 pixel steps?
Here is 2px scrolling:

Image

And here is 4px scrolling:

Image

The 4px scroll seems a bit clunky, but it has the advantage that vertical and horizontal scrolling are quite comparable in performance, which makes for good gameplay. At lower speeds, the vertical scroll is much, much faster than the horizontal one, and it shows: the framerate drops when any horizontal scroll is going on, while vertical scroll is quite fast.
dfzx wrote: Tue Jul 04, 2023 2:40 pm I'm guessing that's what the parallax example is doing, at least in part?
Not quite. The parallax example came from my previous experiments which only tested vertical scrolling. And the way my vertical scroll routine works, it does not matter if you want to scroll 1 pixel up, or NUM_PIXELS, the performance is exactly the same: the screen data is laid out in columns, so scrolling is just a matter of doing unrolled LDI's with offset NUM_PIXELS.

With horizontal scrolling this is very different: we cannot do that trick, and it is much slower. What's more, one of the main tricks for horizontal scrolling, which would be to use a "RR (HL); INC HL" sequence, cannot be used because of the vertical screen column layout; and we _need_ this layout because of the SP1 tile format...

The only trick that we can use is the RLD/RRD shift by 4 bits with a single instruction, which allows for 4px clunky scroll.
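
For those not familiar with RRD: it rotates three nibbles through the low half of A and the byte at (HL), so chaining it along a pixel line shifts the line by 4 pixels. Here is a C emulation of the idea, as a sketch only (function names are hypothetical, and the `stride` parameter models the column layout, where horizontally adjacent bytes are one cell column apart in memory):

```c
#include <assert.h>
#include <stdint.h>

/* Emulates one Z80 RRD step: the carried nibble (low half of A) enters
   the high half of the byte, the old high half moves to the low half,
   and the old low half is returned as the new carried nibble. */
static uint8_t rrd_step(uint8_t *byte, uint8_t carry_in)
{
    assert(carry_in < 16);
    uint8_t carry_out = *byte & 0x0F;
    *byte = (uint8_t)((carry_in << 4) | (*byte >> 4));
    return carry_out;
}

/* Shift one pixel line 4 pixels to the right across `ncols` cell
   columns. With the column-wise buffer, the byte to the right of
   line[0] is `stride` bytes further on (8 * rows of the buffer). */
static void shift_line_right_4px(uint8_t *line, unsigned ncols, unsigned stride)
{
    uint8_t carry = 0;              /* blank pixels enter from the left */
    for (unsigned c = 0; c < ncols; c++)
        carry = rrd_step(line + c * stride, carry);
}
```

The stride between adjacent bytes is exactly what prevents the plain "RR (HL); INC HL" loop: the pointer has to jump a whole column between two consecutive bytes of the same pixel line.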

Any optimization advice from any of the codemasters lurking here would be very welcome :-)

Re: Using SP1 for scroller games

Post by jorgegv »

Here is also the 8px scroll video, for the sake of completeness:

Image

And here are the TAPs for all the previous demos, for easy access:

1-pixel Multiscroll
2-pixel Multiscroll
4-pixel Multiscroll
8-pixel Multiscroll

Re: Using SP1 for scroller games

Post by jorgegv »

Hi, I have spent some time developing a real scroll-map system (the code that decides when and where to draw new graphics while we are scrolling in any direction). This, and some new tiles specially designed for this kind of map, have made the examples much nicer to look at. But still, the animated GIF files do not do justice to the real effect; I encourage you to download the TAP files and see for yourself in your favourite emulator.

I'm still working on the articles about all this, but anyway, here is the meat for the impatient:

1-pixel scroll (TAP file here):
Image

2-pixel scroll (TAP file here):
Image

4-pixel scroll (TAP file here):
Image

8-pixel scroll (TAP file here):
Image

Re: Using SP1 for scroller games

Post by jorgegv »

Basic multiscroll routines and framerate baselines

My first step at multiscrolling is to do some Proof of Concept routines and establish a code harness where I can run some accounting in the background, and run any routine I want while measuring its performance. Of course this harness is also a testbed to debug the scrolling routines during their development.

As in my previous series on SP1 vertical scrolling examples, my performance measure will be Frames Per Second (FPS).

Also needed for measuring real performance is some way of running a "walk" over a given map by using scrolling, so I have also developed a scroll map system that decides what to draw in the hidden borders to get the effect of a scrolling viewport over a bigger map. This mapping system will be described in detail in a future post.

Back to multidirectional scroll, I need to write specific routines for doing it in all 4 directions (and not just one scroll-down routine like for the vertical scroller series). This is a challenge, because although vertical scrolling is relatively simple, horizontal scrolling needs to be done bit by bit and even though there are specialized Z80 instructions for doing that, I found some obstacles in my way.

The other most important thing is the memory layout of the virtual framebuffer. In this regard, we are completely conditioned by the SP1 tile layout: for SP1 to update the screen correctly, it expects the 8 bytes associated with a screen cell position to be located contiguously in memory. This restriction heavily favours a vertical memory layout by columns, like the one we used in the vertical scrolling series.
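As a rough illustration of that column-major layout, here is a small C sketch of how byte offsets into such a buffer could be computed. The names (FB_COLS, FB_LINES, fb_offset, fb_cell_offset) and the 16x16-cell size are assumptions made for the example; they are not SP1 identifiers and are not taken from the repository.

```c
#include <stdint.h>

/* Hypothetical framebuffer dimensions: 16x16 character cells. */
#define FB_COLS   16          /* width in cells (1 byte wide each) */
#define FB_LINES  (16 * 8)    /* height in pixel lines */

/* Column-major layout: all bytes of column 0 come first, then column 1...
   Moving one line down is +1 byte; moving one column right is +FB_LINES. */
static uint16_t fb_offset(uint8_t col, uint8_t line)
{
    return (uint16_t)(col * FB_LINES + line);
}

/* Offset of the first of the 8 contiguous bytes that make up the
   cell at (row, col), which is the shape SP1 expects for a tile. */
static uint16_t fb_cell_offset(uint8_t row, uint8_t col)
{
    return fb_offset(col, (uint8_t)(row * 8));
}
```

With this layout the 8 bytes of any cell are adjacent in memory, so a pointer to the first one could be handed over as the tile data for that screen position.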

One important problem that we'll see is the wildly different performance between vertical and horizontal scrolling. We will need to sort this out and get both performances approximately on par, so that we can have a screen that moves at similar speed in all directions (i.e. an acceptable gameplay).

The speed difference stems from the different ways the scroll is done:
  • For vertical scrolling, we have a specialized instruction: LDI/LDD. Since the column is laid out vertically in memory, it's just a matter of setting a source register (HL) and a destination one (DE) with an offset equal to the number of pixels we want the column to scroll, and then issuing a number of LDI/LDD instructions equal to the number of lines of the scroll area. Incrementing/decrementing the source and destination pointers is taken care of by the LDI/LDD instruction itself, so we get a really fast and compact transfer loop.
  • For horizontal scrolling, we also have a specialized instruction sequence: RR (HL) + INC HL, which rotates the byte at a memory position by one bit and carries the bit that falls out, through the Carry flag, into the next memory position. This roughly does the equivalent of the LDI/LDD instructions for vertical scroll. We have a problem with this sequence, though: it assumes that the memory is laid out horizontally (and not vertically), i.e. the next byte in memory should be the next one horizontally. But our screen is laid out vertically due to SP1 requirements, so instead of INC HL, we need to add a fixed offset to HL to access the next column. This addition unfortunately destroys the Carry flag previously set by RR (HL), so we also need to save it for later, and our inner loop then becomes more complicated: RR (HL) + PUSH AF + ADD HL,DE + POP AF.
In both cases, the cost of one scroll run of the whole framebuffer is roughly proportional to the dimensions of the scrolling area, WIDTH x LINES (WIDTH measured in cells and LINES measured in pixels). But while the inner loop for vertical scroll is 16 T-states (LDI/LDD), for horizontal scroll it is 15+11+11+10=47 T-states (nearly 3 times longer!). This is the main reason for the very different performance in each direction.

In the case of 4-pixel scrolling we also have a specialized instruction: RRD, which rotates 4 bits between the byte at (HL) and the Accumulator. This instruction does not need the Carry flag, so we can do away with the PUSH/POP in the inner loop, and so RRD + ADD HL,DE only takes 18+11=29 T-states, which is much closer to the 16 T-states of the vertical scroll.
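To make the data movement of these inner loops concrete, here is a minimal C model of a column-major buffer with a vertical scroll, a 1-pixel horizontal scroll and a 4-pixel horizontal scroll. It is only a sketch of the technique, not the Z80 code from the repository: the buffer size and all the names (COLS, LINES, fb, scroll_down, scroll_right_1px, scroll_right_4px) are made up for the example.

```c
#include <stdint.h>
#include <string.h>

#define COLS   4    /* hypothetical buffer width, in cells (bytes) */
#define LINES  16   /* hypothetical buffer height, in pixel lines */

/* Column-major buffer: byte (col, line) lives at fb[col * LINES + line]. */
static uint8_t fb[COLS * LINES];

/* Vertical: scroll down by 'pixels' lines with one block move per column,
   the job a run of LDD instructions does on the Z80. The top 'pixels'
   lines keep stale data and would be refilled by the caller. */
static void scroll_down(uint8_t pixels)
{
    for (uint8_t col = 0; col < COLS; col++) {
        uint8_t *column = &fb[col * LINES];
        memmove(column + pixels, column, (size_t)(LINES - pixels));
    }
}

/* Horizontal, 1 pixel: rotate each byte right through a carry bit
   (like RR (HL)), stepping LINES bytes to reach the next column. */
static void scroll_right_1px(void)
{
    for (uint8_t line = 0; line < LINES; line++) {
        uint8_t carry = 0;  /* bit entering from the left edge */
        for (uint8_t col = 0; col < COLS; col++) {
            uint8_t *p = &fb[col * LINES + line];
            uint8_t next_carry = *p & 1;
            *p = (uint8_t)((*p >> 1) | (carry << 7));
            carry = next_carry;
        }
    }
}

/* Horizontal, 4 pixels: move nibbles the way RRD does, with 'a'
   standing in for the low nibble of the accumulator. */
static void scroll_right_4px(void)
{
    for (uint8_t line = 0; line < LINES; line++) {
        uint8_t a = 0;      /* nibble entering from the left edge */
        for (uint8_t col = 0; col < COLS; col++) {
            uint8_t *p = &fb[col * LINES + line];
            uint8_t m = *p;
            *p = (uint8_t)((a << 4) | (m >> 4));
            a = m & 0x0F;
        }
    }
}
```

Note how both horizontal routines have to jump LINES bytes between neighbouring bytes of the same pixel line, which is the stride that replaces the plain INC HL in the Z80 version.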

In the case of 8-pixel scroll, both horizontal and vertical inner loops can be expressed as LDI/LDD, and so the performance for both directions is identical. 8-pixel scrolling might not be adequate for all games, though.
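The reason the 8-pixel case is symmetric can be seen in a small C model: with a column-major buffer, moving the picture 8 pixels sideways just means moving whole columns, which is a plain byte copy exactly like the vertical case. This is a sketch under assumed names and sizes (COLS, LINES, fb, scroll_right_8px, scroll_down_8px), not the repository code.

```c
#include <stdint.h>
#include <string.h>

#define COLS   4    /* hypothetical buffer width, in cells (bytes) */
#define LINES  16   /* hypothetical buffer height, in pixel lines */

/* Column-major buffer: byte (col, line) lives at fb[col * LINES + line]. */
static uint8_t fb[COLS * LINES];

/* Scroll right by 8 pixels: every column moves one column to the right.
   Columns are contiguous blocks of LINES bytes, so this is a single
   overlapping block move. Column 0 keeps stale data to be redrawn. */
static void scroll_right_8px(void)
{
    memmove(&fb[LINES], &fb[0], (size_t)(COLS - 1) * LINES);
}

/* Scroll down by 8 pixels: one block move inside each column.
   The top 8 lines of each column keep stale data to be redrawn. */
static void scroll_down_8px(void)
{
    for (uint8_t col = 0; col < COLS; col++) {
        uint8_t *column = &fb[col * LINES];
        memmove(column + 8, column, (size_t)(LINES - 8));
    }
}
```

Both routines touch each byte once with a copy and no bit shifting, which matches the identical L/R and U/D figures for the 8-pixel row in the table below.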

This currently makes 4-pixel scroll our best bet for multiscroll games... but we'll see if we can further optimize the other routines. 4-pixel scroll can be OK for some games, but smooth scrolling is... well, The Right Thing, so let's see.

These are the current FPS measurements in all scrolling directions, for a 16x16-cell scrolling viewport and for different pixel increments:

| Pixels \ Direction | U  | D  | L  | R  | UL | UR | DL | DR |
|--------------------|----|----|----|----|----|----|----|----|
| 1-pixel            | 16 | 16 | 12 | 12 | 10 | 10 | 10 | 10 |
| 2-pixel            | 16 | 16 | 7  | 7  | 6  | 6  | 6  | 6  |
| 4-pixel            | 16 | 16 | 12 | 12 | 11 | 11 | 11 | 11 |
| 8-pixel            | 15 | 15 | 15 | 15 | 11 | 11 | 11 | 11 |
The additional drop in performance for diagonal movements is due to scrolling in both directions on the same move, e.g. scroll in UP-LEFT direction = scroll UP + scroll LEFT in sequence.

The code used for these measurements and performance baselines is in the directory src/sp1-multi-map, in the usual repository.

The animated demos and TAP files are the ones provided in my previous posts, although the FPS numbers have been slightly improved since, as can be seen in the table.

Re: Using SP1 for scroller games

Post by jorgegv »

I have polished the map scroller and added a debug panel and view. Scroll speed and debug mode can both be enabled/disabled in realtime so that you can experiment with it and see the inner workings of the scroller. I'm preparing an extensive article on this, but meanwhile you can play with the scroller demo below.

Controls:
  • QAOP: Movement in all directions. You can mix directions (diagonals)
  • S: toggle scrolling speed: 1-2-4-8 pixel steps
  • D: show/hide debugging view and panel
The yellow/black area is the real game window, and the bright bands shown in debug mode are the "hidden" bands where the scroller draws map tiles before they are scrolled into the viewport. When the hidden bands are shown, you can see how/when the scroller decides to draw more tiles.

An animation of the demo can be seen below, but again, I encourage you to download the TAP file and play with it yourself.

Image

Re: Using SP1 for scroller games

Post by jorgegv »

Scroll Map Management

A scrolling system can conceptually be seen as a viewport over a bigger map that does not fit in the view, so we need to move the "viewport" over it in small steps. The visible zone of the map is then determined at all times by the position of the viewport over the global map.

We want to do smooth scrolling, so we should be able to do it at 1-pixel resolution. For this reason the viewport coordinates must be held in pixels, and not in tiles or character cells. With 8-bit integers as pixel coordinates, we could only represent 1 full Spectrum screen (256x192), so we must use 16 bits for the viewport coordinates X and Y. With 16-bit coordinates, the maximum size of the global map is then 65536x65536 pixels. That's 8192 character cells in each dimension, which seems plenty for a Spectrum game :-)

The global map is built with tiles of the same size; e.g. they can be 1x1 character cells (8x8 pixels), 2x2 (16x16 pixels), 3x2 (24x16 pixels), etc. Tile dimensions are integer numbers of cells. Horizontal and vertical tile dimensions do not need to be the same, but it helps with map drawing. In our examples, our map is made of 16x16-pixel tiles (2x2 chars).

The map is stored as a linear byte array of size MAP_WIDTH x MAP_HEIGHT (with dimensions in tiles). Each map position in the array stores a 1-byte tile id and represents the tile at map position (ROW,COL). So for example, if the map size is 20 tile rows x 30 tile columns, the tile id for map position (4,5) is stored at position (4 x 30 + 5) = 125 in the map byte array. There is a global tile table which maps the tile id to the tile data, so that it can be easily used by a tile drawing routine.
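The indexing rule above can be sketched in C as follows. The 30x20 map size comes from the example in the text; the names map, map_index and map_get_tile_id are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

#define MAP_WIDTH   30   /* tile columns, as in the example in the text */
#define MAP_HEIGHT  20   /* tile rows */

/* One tile id per map position, stored row by row. */
static uint8_t map[MAP_WIDTH * MAP_HEIGHT];

/* Position of tile (row, col) inside the linear map array. */
static uint16_t map_index(uint16_t row, uint16_t col)
{
    return (uint16_t)(row * MAP_WIDTH + col);
}

/* Tile id stored at map position (row, col). */
static uint8_t map_get_tile_id(uint16_t row, uint16_t col)
{
    return map[map_index(row, col)];
}
```

For the example position, map_index(4, 5) gives 4 x 30 + 5 = 125, and the returned tile id would then be used to look up the tile data in the global tile table.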

With this scheme, 3 coordinate systems need to be taken into account, and coordinates converted back and forth between them:
  • The Viewport coordinates: high resolution coordinate system with 16 bits per coordinate - Pixels
  • The Map coordinates: measure the position of tiles in the map, also with 16 bits per coordinate, since we saw that the map can be up to 8192x8192 cells, which may not fit in 8 bits - Tiles
  • The Screen/SP1 coordinates: these are the screen-cell coordinates, our familiar 32x24 cell array (8 bits per coordinate) - Cells.
In the previous coordinate systems, if we were using 2x2-cell (16x16 pixel) tiles, an example map could be:
  • Map coordinates: 48x24 tiles (WxH)
  • Viewport coordinates: 768x384 pixels (WxH)
  • Screen/SP1 coordinates: 96x48 cells (WxH)
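Converting between the three systems is just multiplication and division by the cell and tile sizes. The following C helpers are a minimal sketch assuming square 2x2-cell tiles as in the example; all the names are hypothetical.

```c
#include <stdint.h>

#define CELL_PIXELS  8                           /* a cell is 8x8 pixels */
#define TILE_CELLS   2                           /* assumed: 2x2-cell tiles */
#define TILE_PIXELS  (TILE_CELLS * CELL_PIXELS)  /* 16 pixels per tile side */

/* Pixel (viewport) coordinate to cell and tile coordinates. */
static uint16_t pixel_to_cell(uint16_t px) { return (uint16_t)(px / CELL_PIXELS); }
static uint16_t pixel_to_tile(uint16_t px) { return (uint16_t)(px / TILE_PIXELS); }

/* Tile (map) coordinate to cell and pixel coordinates. */
static uint16_t tile_to_cell(uint16_t tile)  { return (uint16_t)(tile * TILE_CELLS); }
static uint16_t tile_to_pixel(uint16_t tile) { return (uint16_t)(tile * TILE_PIXELS); }

/* Pixel offset inside its tile (0..TILE_PIXELS-1). */
static uint8_t pixel_in_tile(uint16_t px) { return (uint8_t)(px % TILE_PIXELS); }
```

Since the sizes are powers of two, a Z80 implementation would normally replace these divisions and multiplications with shifts and masks.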
Since we are going to scroll map data into view from different directions, the virtual framebuffer (the offscreen) needs to be bigger than the viewport. It has a hidden band of tiles surrounding the visible area, which is 1-tile wide all over the perimeter of the viewport. This is where new tiles are drawn (to memory) before coming into view by the scrolling process; it's the same concept that was seen in my previous SP1 vertical scroller examples for the "hidden top row", but applied to all directions (top, left, bottom, right).

We then need to detect when some new graphics are going to be brought inside the visible area, and update the hidden bands with new graphics. These graphics are created by selecting the tiles from the relevant map coordinates and drawing them on the hidden bands before doing the scroll operation. The scrolling routines for each direction detect when new graphics are needed in the hidden band, and call the proper map functions to get tile data and draw it to the proper position in the hidden band.

Example: if we want to show a 16x24-cell scrolling viewport on screen, and our map uses 2x2-cell tiles, the offscreen will be 20x28 cells (16+2+2, 24+2+2): a 16x24 window plus a 2-cell wide band around it.
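The same sizing can be written down as compile-time constants in C. This is only a sketch: the macro names are hypothetical, and the byte count assumes that only the pixel data (8 bytes per cell, no attributes) is kept in the offscreen buffer.

```c
#include <stdint.h>

#define VIEW_W_CELLS  16   /* visible viewport width, in cells */
#define VIEW_H_CELLS  24   /* visible viewport height, in cells */
#define TILE_W_CELLS  2    /* tile width, in cells */
#define TILE_H_CELLS  2    /* tile height, in cells */

/* One hidden tile on each side of the viewport, in both dimensions. */
#define OFFSCREEN_W_CELLS  (VIEW_W_CELLS + 2 * TILE_W_CELLS)
#define OFFSCREEN_H_CELLS  (VIEW_H_CELLS + 2 * TILE_H_CELLS)

/* Assumption: 8 bytes of pixel data per cell and nothing else. */
#define OFFSCREEN_BYTES    (OFFSCREEN_W_CELLS * OFFSCREEN_H_CELLS * 8)

static uint8_t offscreen[OFFSCREEN_BYTES];
```

For these values the buffer comes out at 20x28 cells, i.e. 4480 bytes of pixel data.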

The hidden bands are only updated occasionally, and only if the scrolling function detects that the new scrolling movement will bring graphics data from that band inside the viewport.

Example: if the viewport is at position (X=16,Y=32) and we want to move the viewport UP 1 pixel, this means that the pixels at Y=31 will scroll down and become Y=32 (if we move the viewport UP, then we need to scroll DOWN :-) ). This condition is detected by the scroll routine and will redraw the top hidden band using the tiles at the associated map positions, before attempting to scroll down.

Since the hidden top tile band is 16 pixels high in our example, this operation does not need to be done for each 1-pixel scroll movement: the top hidden band has been fully redrawn and can be scrolled down 16 times before needing another redraw. The performance impact of redrawing the hidden bands is then quite low, since it only needs to be done on average 1/16 of the times the scrolling routine is invoked.
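One possible way to detect that condition is to check whether the move takes the viewport coordinate into a different tile. The C sketch below assumes 16-pixel tiles and a viewport whose size is a multiple of the tile size; the function name is hypothetical and the real scroll routines may well use a different test.

```c
#include <stdint.h>
#include <stdbool.h>

#define TILE_PIXELS  16   /* assumed tile size in pixels (2x2 cells) */

/* True when moving the viewport from old_pos to new_pos (in pixels,
   along one axis) enters a different tile row or column, which is
   when the matching hidden band needs to be redrawn. */
static bool crosses_tile_boundary(uint16_t old_pos, uint16_t new_pos)
{
    return (old_pos / TILE_PIXELS) != (new_pos / TILE_PIXELS);
}
```

With the example above, moving from Y=32 to Y=31 crosses a boundary and triggers a redraw of the top band, while the following 15 one-pixel moves (Y=31 down to Y=16) do not.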

This ensures the illusion of a continuous movement over the map, even though we are drawing it in small tile-sized increments.

You can find a demo of this map scrolling system in the src/sp1-multi-map folder or you can download the TAP file and play with the demo. You can move the viewport with QAOP, enable/disable the debug panel and hidden band display with D, and toggle the scroll speed with S.

Have fun :)