Sunday, May 19, 2019

ESP8266 GameBoy Emulator, attempt.

I had an idea: port a GameBoy emulator to an ESP8266 with an LCD screen.
I found a good candidate for porting, Peanut-GB: it's only a single header file (.h) and comes with
plenty of examples.



Very little setup is required, plus a single lcd_draw_line callback: implement LCD drawing
there and voila, a GB emulator on the ESP8266 (NodeMCU clone). But it seems
the raw power of C is not sufficient for this task, and the LCD library
that I chose (TFT_eSPI) and modified was also not fast enough to make
the emulator run at adequate speeds.
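
As a rough sketch of the per-line work such a callback has to do, the helper below turns one 160-pixel emulator scanline into RGB565 values for the display. The helper name gb_line_to_rgb565, the grey palette, and the assumption that the shade sits in the low two bits of each input byte are illustrative choices, not taken from the project sources.

```c
#include <stdint.h>
#include <stddef.h>

#define GB_LCD_WIDTH 160

/* Four shades, lightest to darkest, as RGB565 (hypothetical palette). */
static const uint16_t shade_rgb565[4] = { 0xFFFF, 0xAD55, 0x52AA, 0x0000 };

/* Convert one emulator scanline to RGB565.
 * Assumption: the shade is stored in the low two bits of each input byte;
 * any higher bits are ignored here. */
static void gb_line_to_rgb565(const uint8_t *pixels, uint16_t *out)
{
    for (size_t x = 0; x < GB_LCD_WIDTH; x++)
        out[x] = shade_rgb565[pixels[x] & 0x03];
}
```

Inside lcd_draw_line the converted buffer could then be handed to the display in one call (for example TFT_eSPI's pushImage) instead of being drawn pixel by pixel. Whether the bytes need swapping depends on how the library is configured.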

Games are stored in flash as a byte array generated with bin2hex.

The LCD in use is an ST7789-based 240x240 panel connected to the SPI bus running at 40 MHz.

Pinout:
NodeMCU - LCD
GND -> GND
3.3V -> VCC
3.3V -> BLK
D3 -> DC
D4 -> RES
D5 -> SCL
D7 -> SDA

So what do we have:
1. The emulator runs; tried with Super Mario Land, Tetris and a demo.
2. Uhm... Profit ?

What we don't have:
1. Good frame time: each frame takes 70~200 ms, which translates to roughly 14~5 FPS.
2. Audio (there was not enough time for smooth frame rates, so...)
3. Key input (Yeah.. no..)


It seems that no matter how hard I tried to optimize the code, the drawing part
would always lag. If we remove the LCD part, the core does have enough
time to emulate the game, but I think an emulator without a screen
is not a good emulator :)





What did I try to do:
1. The ESP runs at 160 MHz. I could try NO_SDK by CNLohr (runs at 340 MHz), but the SPI bus is affected by it.
2. Optimized gb_draw_line up the wazoo, replaced all multiplications with bit shifts.
3. Added per-line hash drawing: if the pixels of a line did not change, don't draw them. Helpful with static games, not so much with scrolling.
4. Moving the game code to DRAM did not produce significant speedups. *
5. DynaRec? It would probably require a lot of RAM and we have only 32000 bytes left.
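
The per-line hash idea from item 3 can be sketched roughly as follows. This is not the code from the project: the hash (32-bit FNV-1a) and the names line_hash, line_seen and line_needs_redraw are assumptions made for illustration. A hash collision would wrongly skip a changed line, which is the price of storing 4 bytes per line instead of the whole frame.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define GB_LCD_WIDTH  160
#define GB_LCD_HEIGHT 144

static uint32_t line_hash[GB_LCD_HEIGHT]; /* last hash sent to the LCD, per line */
static bool     line_seen[GB_LCD_HEIGHT]; /* false until a line is drawn once */

/* 32-bit FNV-1a over one scanline (an illustrative choice of hash). */
static uint32_t fnv1a_line(const uint8_t *pixels)
{
    uint32_t h = 2166136261u;
    for (size_t x = 0; x < GB_LCD_WIDTH; x++) {
        h ^= pixels[x];
        h *= 16777619u;
    }
    return h;
}

/* Returns true when the line differs from what was last drawn,
 * and remembers the new hash in that case. */
static bool line_needs_redraw(uint8_t line, const uint8_t *pixels)
{
    if (line >= GB_LCD_HEIGHT)
        return false;
    uint32_t h = fnv1a_line(pixels);
    if (line_seen[line] && line_hash[line] == h)
        return false;
    line_seen[line] = true;
    line_hash[line] = h;
    return true;
}
```

The draw callback would call line_needs_redraw first and return early when it reports false. With a scrolling background nearly every line changes each frame, so the hash only adds work, which matches the result described in item 3.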

* One of the issues that I've encountered is that the game needs to reside in flash, but because of alignment issues (the game is an array of bytes [8-bit] but the memory of the ESP is 32-bit) reading it is slower.
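
One possible way to soften the alignment cost described in the footnote is to read the ROM as aligned 32-bit words and keep the last word around, so that neighbouring bytes do not trigger another flash access. The sketch below is an untested idea, not something from the project: rom_cache_t and rom_read_byte are hypothetical names, and the byte order assumes a little-endian target such as the ESP8266.

```c
#include <stdint.h>

/* One-word cache in front of a ROM image stored as aligned 32-bit words. */
typedef struct {
    const uint32_t *words;        /* ROM image viewed as 32-bit words */
    uint32_t        cached_index; /* index of the word currently cached */
    uint32_t        cached_word;  /* contents of that word */
    int             valid;        /* 0 until the first read */
} rom_cache_t;

/* Read one byte at byte address addr using only aligned word loads. */
static uint8_t rom_read_byte(rom_cache_t *c, uint32_t addr)
{
    uint32_t index = addr >> 2;
    if (!c->valid || c->cached_index != index) {
        c->cached_word  = c->words[index]; /* the only access to the image */
        c->cached_index = index;
        c->valid        = 1;
    }
    /* Little-endian layout: byte 0 of a word is its low 8 bits. */
    return (uint8_t)(c->cached_word >> ((addr & 3u) * 8u));
}
```

This mostly helps sequential reads, such as a multi-byte opcode that happens to sit inside one word. Random accesses still cost one word load each.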



Everything was developed using Sloeber, highly recommended for Arduino development.
Sources: primary project.
Sources: TFT_eSPI modified library
Hope that helps somebody.

R.

5 comments:

  1. Hi Ronen. This is great! You managed to get peanut-gb running on a microcontroller before I managed to!
    It would be great to see how we can get this to a more playable speed.

    I wonder if the main slowdown is redrawing every frame on the screen?

    > we have only 32000 bytes left

    Maybe that memory could be used to keep track of what has been changed in the frame buffer? Since the resolution of the Game Boy is 160x144, we would need only 23,040 bytes to keep the current frame in memory, where each byte stores a single pixel. Or the frame could be packed in memory to 5,760 bytes whereby each byte holds four pixels, since each pixel may only be one of four colours (hence 2-bit for each pixel). The latter would require some processing, but would use much less memory.
    Then, when peanut-gb wants to draw a new line on the screen, you can check to see if that specific line in the frame has actually changed or not. If it hasn't, then skip writing to the screen.

    Also, look at whether the TFT library you're using allows you to pipe pixels to the screen. Instead of writing pixels individually using tft.drawPixel(), you could use tft.pushImage(), which I see you used in your previous lcd_draw_line function. Instead of converting the format of the pixel generated by peanut-gb to what your library requires, you could change the way peanut-gb produces these pixels by modifying the __gb_draw_line() function directly, looking at where the "pixels" variable is being written to.

    The ESP8266 SPI supports a 64-byte buffer. Make sure that the TFT library you're using makes use of this buffer to ensure it's transmitting at full speed.

    I don't have an ESP8266 myself, so I don't know how to compile code for it. Make sure that the compiler is optimising the code, as peanut-gb gets a significant speed boost when optimisation (like -O2) is enabled in GCC.

    Also, peanut-gb has a frame skip option (to 30fps) which you could enable to get a more playable speed hopefully. Using frame skip with interlacing will make the game look rather horrid though. You can check out what it looks like in the SDL2 example.

    Your lcd_draw_line2() seems to be empty in your code.

    Also, does Tetris work with peanut-gb? I tested it a long time ago but I think it would stop at the main menu...

    Anyways, let me know what your thoughts are, as I'm very interested to see where this goes. I might even get an ESP8266 or ESP32 myself just to try this out. :)

    Mahyar

    Replies
    1. > Added per line hash drawing capabilities, so if the pixels of the line did not change, don't draw them, helpful with static games, not so much with scrolling.

      Oops! I forgot that you tried this. Hashing is a great idea! Apologies if I missed anything else.

      > One of the issues that I've encountered is that the game needs to reside on the flash, but because of alignment issues (the game is an array of bytes [8-bit] but the memory of the ESP is 32Bit) the reading is slower.

      I wonder if the four bytes could be cached in a variable, and where some op-codes use 2 or 3 bytes, they could read from that variable instead of from flash again. This is very interesting.

    2. Hi, first, thank you for Peanut-GB.
      As I said, I tried caching per line, because caching per pixel (I did try it; evidence of the many attempts is in the code) makes the code run even worse.

      Regarding the TFT library, I've actually created a separate project for benchmarking different methods of drawing (5 methods). pushImage turned out to be the fastest one. There is still a translation from 8-bit to 16-bit in the library, so maybe I will try something with that.

      Optimization is on by default, at the maximum GCC level.

      Interlacing works, it's on in the video. Frame skip did not produce any notable results.

      lcd_draw_line2 is a leftover from the many tests I've done; it's redundant.

      I've ported the same code to the ESP32 and made many modifications to run the drawing routines on the second core, and it works flawlessly!

      Regarding Tetris, yeah, it hangs at the menu, not sure why.

      R.

    3. > I've ported the same code to ESP32 and made many modifications for the drawing routines in the second core and it works flawlessly!

      That's great! There's also a port of GNUboy to the ESP32 [here](https://github.com/PocketSprite/8bkc-gnuboy). It'll be interesting to see how yours compares. I think Peanut-GB has better overall compatibility than GNUboy, however the latter has sound support integrated I think.

      > Optimization is on by default in the GCC max level.

      I did some benchmarks and found that when profiling peanut-sdl and recompiling, I got a speed improvement of 26% as shown in: https://github.com/deltabeard/Peanut-GB/blob/master/BENCHMARK.md I don't know how you would be able to profile binaries on an embedded system like the ESP32.

      > Regarding Tetris, yeah it hangs at the menu not sure why.

      Peanut-GB emulation isn't perfect by a large margin. Therefore some games, notably games with a lot of weird, fragile code like Tetris, or those with interesting video effects like Prehistorik Man, won't work very well. I've been able to play my favourites though: Legend of Zelda Link's Awakening, Donkey Kong, Pokemon Silver, Tetris Plus.

      Would you be interested in publishing your modified ESP32 code (maybe in a git repo)? It'll be great for the PocketSprite to get some competition. Although you don't have to publish, since Peanut-GB is MIT licensed, so it's up to you. :)

      Let the world know about your progress!

      Thanks,
      Mahyar

    4. I will publish the code for the ESP32. I'm trying to implement the sound options, but I'm not sure how well it will go.

      PocketSprite does have sound, so I'm looking at their code for inspiration, and even trying to compile it for the ESP32.
