2014-02-01

2014-02-01 Further Tweaks

I gave a short demo of Doom running on the Revo to a few friends (because the video was so poor) and came away with a number of suggestions:

  • Fixing the VSync issue
  • USB Keyboard Device and Input Queue to make the game playable
  • Sound and Music
  • Finding out the actual frame rate
  • USB Serial interface for streaming console output
  • USB Flash Drive performance
  • Doing happy dance

I've been looking into fixing the smaller items on this list.  The vertical sync was the easiest one: it took 3 lines of code.  In the function that updates the front buffer from the back buffer, I put in a spin loop that waits for the VGA device to enter the vertical retrace period before starting the swap.
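For illustration, a wait like this might look roughly as follows.  This is a sketch rather than the actual code from the OS: it assumes the standard VGA Input Status #1 register at port 0x3DA (colour mode), where bit 3 is set during vertical retrace.  The port_read_fn hook and the sim_inb stand-in are hypothetical names, added so the loop can be tried on a host machine instead of real hardware.

```c
#include <stdint.h>

#define VGA_INPUT_STATUS_1  0x3DA   /* VGA Input Status #1 (colour mode) */
#define VGA_VRETRACE_BIT    0x08    /* bit 3: set during vertical retrace */

/* Hypothetical hook: on real hardware this would wrap the x86 `in` instruction. */
typedef uint8_t (*port_read_fn)(uint16_t port);

/* Spin until the VGA reports that it is in the vertical retrace period. */
static void wait_for_vretrace(port_read_fn inb)
{
    while (!(inb(VGA_INPUT_STATUS_1) & VGA_VRETRACE_BIT)) {
        /* busy-wait */
    }
}

/* Simulated port for host testing only: reports retrace on every 5th read. */
static unsigned sim_reads;

static uint8_t sim_inb(uint16_t port)
{
    (void)port;
    sim_reads++;
    return (sim_reads % 5 == 0) ? VGA_VRETRACE_BIT : 0;
}
```

A stricter variant would first wait for any retrace already in progress to end, so the swap always starts at the beginning of a retrace, at the cost of a second loop.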

I found that the VGA emulation in Bochs doesn't actually support the vertical retrace register, which was a bit disappointing, as it means Doom now freezes before the first frame, spinning while it waits for a VSync that will never come.  I mitigated this by adding code to the VGA initialise() routine that tests whether the vertical retrace register changes value within 100 milliseconds.  There should be 6 vertical retraces in this time (assuming 60Hz), so if the value never changes I assume the register isn't supported and clear the "Enable VSync" flag.
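The probe described above can be sketched like this.  It is an illustration under assumptions, not the real initialise() code: ticks_ms_fn is a hypothetical millisecond clock, and the stuck/toggling ports and the fake clock exist only to exercise the logic on a host.

```c
#include <stdbool.h>
#include <stdint.h>

#define VGA_STATUS_PORT   0x3DA   /* VGA Input Status #1 (colour mode) */
#define VGA_RETRACE_MASK  0x08    /* bit 3: vertical retrace */
#define PROBE_WINDOW_MS   100u    /* about 6 retraces at 60Hz */

typedef uint8_t  (*probe_read_fn)(uint16_t port);
typedef uint32_t (*ticks_ms_fn)(void);   /* hypothetical millisecond clock */

/* Returns true if the retrace bit changes state within the probe window. */
static bool vsync_supported(probe_read_fn inb, ticks_ms_fn now_ms)
{
    uint8_t  first = inb(VGA_STATUS_PORT) & VGA_RETRACE_MASK;
    uint32_t start = now_ms();

    while ((uint32_t)(now_ms() - start) < PROBE_WINDOW_MS) {
        if ((inb(VGA_STATUS_PORT) & VGA_RETRACE_MASK) != first)
            return true;    /* the bit toggled, so the register is live */
    }
    return false;           /* never changed: treat as unsupported */
}

/* Host-side stand-ins for testing only. */
static uint32_t fake_ms;
static uint32_t fake_now_ms(void) { return fake_ms++; }   /* 1ms per call */

static uint8_t stuck_inb(uint16_t port) { (void)port; return 0; }

static unsigned probe_reads;
static uint8_t toggling_inb(uint16_t port)
{
    (void)port;
    /* three reads with the bit clear, three with it set, and so on */
    return ((probe_reads++ / 3) % 2) ? VGA_RETRACE_MASK : 0;
}
```

The caller would clear its "Enable VSync" flag whenever vsync_supported() returns false, so the swap routine skips the wait entirely.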

I also investigated why the "wipe" in Doom wasn't working.  The wipe is the screen transition used when going from the menu to the game or vice versa.  The problem was easily resolved.  When I removed the code from the I_Video.c file to get Doom to compile without the Linux-specific video code, I also removed the content of the I_ReadScreen() function, which is used to perform this wipe.  It just returns data from Doom's own internal buffer, so I didn't need to change it and shouldn't have removed it.  With the code reinstated, the wipe now works and, as an added benefit, the background for the HUD now appears correctly.
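For reference, the body of I_ReadScreen() in the released Linux Doom source is a single memcpy out of screens[0].  The sketch below reproduces that one line; the constants, the byte typedef and the backing buffer are stand-ins for Doom's own definitions so that the snippet compiles by itself.

```c
#include <string.h>

/* Stand-ins for Doom's own definitions (doomdef.h, doomtype.h, v_video.c). */
#define SCREENWIDTH   320
#define SCREENHEIGHT  200
typedef unsigned char byte;

static byte framebuffer0[SCREENWIDTH * SCREENHEIGHT];
byte *screens[5] = { framebuffer0 };

/* Copy the current contents of Doom's internal screen 0 into scr. */
void I_ReadScreen(byte *scr)
{
    memcpy(scr, screens[0], SCREENWIDTH * SCREENHEIGHT);
}
```

Nothing here touches the video hardware, which is why the function can be kept unchanged on a new platform.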

Next I looked into the load time issue, because it really shouldn't take five minutes to load.  My first plan was to increase the amount of data that I transfer from the USB device in each transfer.  Until now I transferred one block at a time as required and cached it in the block device driver in case it was accessed again.  If I increase this to load 8 blocks at a time (aligned to an 8 block boundary), it shouldn't take any more time to load from the USB (I figure I can transfer just over 100 blocks in a 1ms frame assuming USB2 speeds and no bus contention) and should speed up reads of the subsequent blocks.  With this change in place, the load time is now a respectable 38 seconds.
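The aligned read-ahead can be sketched as below.  This is a simplified illustration, not the driver's real cache: it keeps a single 8 block group, assumes 512 byte blocks, and every name in it (group_cache, cached_read_block, sim_dev_read) is hypothetical.  The simulated device just fills each block with the low byte of its block number and counts transfers.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE        512u   /* assumed block size */
#define BLOCKS_PER_GROUP  8u     /* must be a power of two */

/* Hypothetical device hook: read `count` blocks starting at `first_lba`. */
typedef int (*read_blocks_fn)(uint32_t first_lba, uint32_t count, uint8_t *dst);

struct group_cache {
    int      valid;
    uint32_t first_lba;
    uint8_t  data[BLOCK_SIZE * BLOCKS_PER_GROUP];
};

/* Read one block, fetching its whole aligned 8 block group on a cache miss. */
static int cached_read_block(struct group_cache *c, read_blocks_fn dev_read,
                             uint32_t lba, uint8_t *dst)
{
    uint32_t first = lba & ~(BLOCKS_PER_GROUP - 1u);   /* round down to group */

    if (!c->valid || c->first_lba != first) {
        c->valid = 0;
        if (dev_read(first, BLOCKS_PER_GROUP, c->data) != 0)
            return -1;
        c->first_lba = first;
        c->valid = 1;
    }
    memcpy(dst, c->data + (lba - first) * BLOCK_SIZE, BLOCK_SIZE);
    return 0;
}

/* Simulated device for host testing only. */
static unsigned sim_transfers;

static int sim_dev_read(uint32_t first_lba, uint32_t count, uint8_t *dst)
{
    sim_transfers++;
    for (uint32_t i = 0; i < count; i++)
        memset(dst + i * BLOCK_SIZE, (int)((first_lba + i) & 0xFF), BLOCK_SIZE);
    return 0;
}
```

With this layout, reading blocks 8 to 15 in order costs one USB transfer instead of eight, which is where a mostly sequential load gets its speed-up.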

Still not happy with that though.  The next target is the delay I have while I wait for the USB bus to transfer the data.  Because my OS has the Programmable Interval Timer set to 100Hz, my delay function has a resolution of 10ms at best, and I'm using two such delays in the block loading (one after issuing the request, one while waiting for the response).  To replace these, I've written a helper function which examines the status flags on the USB Transfer Descriptors and spins until they have been processed, whatever the result.  As the USB transfer should complete in under 1ms, this should save about 19ms per 8 blocks loaded.
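A polling helper along these lines might look like the sketch below.  The details are assumptions: the post doesn't say which host controller is involved, so the position of the "active" flag is a placeholder (bit 23 is where UHCI keeps it; other controllers differ), and td_wait_done is a hypothetical name.  The spin count is bounded so that a transfer which never completes can't hang the machine.

```c
#include <stdint.h>

/* Assumed position of the "still being processed" flag in the TD status word. */
#define TD_STATUS_ACTIVE  (1u << 23)

/*
 * Spin until the controller clears the active flag in the descriptor's
 * status word, or give up after max_spins polls.
 * Returns 0 once the descriptor has been processed, -1 on timeout.
 */
static int td_wait_done(const volatile uint32_t *status, uint32_t max_spins)
{
    while (max_spins--) {
        if (!(*status & TD_STATUS_ACTIVE))
            return 0;
    }
    return -1;
}
```

The status word has to be read through a volatile pointer because the controller updates it behind the CPU's back; without that, the compiler is free to hoist the read out of the loop.  After td_wait_done() returns 0 the caller still needs to check the error bits, since "processed" doesn't mean "succeeded".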

With all that debugged, the OS goes from the Grub screen to the Doom splash screen in 3 seconds ... boom!