Display functions optimization

Re: Display functions optimization

Postby jonnection » Sun Jan 18, 2015 8:54 pm

I want you all to be aware that there is another limiting factor that needs to be considered before any big optimization work is done, especially optimizations that will affect everything from bitmap bit order to screen rotation.

That factor is the response time of a twisted-nematic liquid crystal display (which is what the Nokia 5110 LCD is). According to a Fujitsu spec sheet I have (all TN displays are similar in this respect), it is around 60 ms. http://www.fujitsu.com/downloads/MICRO/fma/pdf/LCD_Backgrounder.pdf

A 60 ms response time works out to 1/0.060 s = 16.67 Hz, or approx. 17 frames per second.

I have also seen this in practice: in "Isle of maniax" I am drawing white houses on top of the black horizon. The end result is that the black horizon is "showing through" the buildings. This is because the LCD response time (60 ms) means that the black pixels do not have time to "turn off" before they are supposed to be white. The result is a blurry mess and it doesn't look nice. You will not see this effect on the Simbuino emulator, only on the real hardware.

What I am basically getting at here is that at 350 ns per call (16x16 drawBitmap with Myndale's optimized putpixel routine), you can draw over the whole Gamebuino screen (84x48 pixels) so many times that the speed of the routine has no real practical meaning:

screen total pixels=84*48=4032
testbmp pixels = 16*16=256
paint whole screen once = pixels/bmppixels=15.75
to paint screen with 16x16 bitmaps takes = paintonce*350ns=0.00000551sec
fps limited by 16x16 bmp painting routine = 1s/topaintscreentakes=181405.89569161 fps

So, even with a 350 ns routine, you could achieve a theoretical FPS of 181000 frames per second. Clearly, your LCD is not going to keep up with that. At this point (really, I am not kidding) whether you have a 350 ns or 150 ns drawing routine doesn't make any difference. You won't be able to use that speed in a meaningful way.

These calculations all assume that Myndale's timing measurements in his demo are correct, which I think they are.
jonnection
 
Posts: 317
Joined: Sun May 04, 2014 8:21 pm

Re: Display functions optimization

Postby rodot » Sun Jan 18, 2015 9:35 pm

jonnection wrote: So, even with a 350 ns routine, you could achieve a theoretical FPS of 181000 frames per second. Clearly, your LCD is not going to keep up with that. At this point (really, I am not kidding) whether you have a 350 ns or 150 ns drawing routine doesn't make any difference. You won't be able to use that speed in a meaningful way.


Being able to draw several layers of parallax with different colors (black, white, gray) would require going over the screen several times per frame. That said, I agree that 150 or 350 ns doesn't really make any difference at this level (though it is still a very interesting topic about optimization). I would add that having "optimized" routines that don't rely on "swizzling" would be a good thing for the portability of the library and bitmaps to other screens... just in case ;)
rodot
Site Admin
 
Posts: 1290
Joined: Mon Nov 19, 2012 11:54 pm
Location: France

Re: Display functions optimization

Postby Myndale » Sun Jan 18, 2015 11:48 pm

jonnection wrote: to paint screen with 16x16 bitmaps takes = paintonce*350ns=0.00000551sec

jonnection wrote: These calculations all assume that Myndale's timing measurements in his demo are correct, which I think they are.


Ah crap, I've been saying nanosecond when I meant microsecond. Sorry guys, my bad. The relative speed-up is the same, I've just been using the wrong nomenclature.

The correct time to paint a full screen is thus 0.0055125 seconds, which at 20fps (50 ms per frame) would net you about 9x overdraw per frame, not including library overhead. That could have quite a significant impact on titles that could use those clock cycles for other things like physics or AI.
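
For anyone who wants to re-check the corrected numbers, here is a small sketch of the arithmetic in plain C++ (meant to run on a PC, not on the Gamebuino). The helper names `fullScreenMicros` and `overdrawPerFrame` are made up for this illustration; the 350 us figure and the 84x48 and 16x16 sizes come from the posts above.

```cpp
#include <cstdint>

// Screen and test-bitmap sizes taken from the posts above.
constexpr uint32_t kScreenPixels = 84UL * 48UL;   // 4032
constexpr uint32_t kBitmapPixels = 16UL * 16UL;   // 256

// Time in microseconds to cover the whole screen once with 16x16 bitmaps,
// given the measured cost of a single drawBitmap call in microseconds.
constexpr double fullScreenMicros(double microsPerBitmap) {
  return microsPerBitmap * kScreenPixels / kBitmapPixels;
}

// How many times the whole screen could be repainted within one frame
// (library overhead and game logic are not included).
constexpr double overdrawPerFrame(double microsPerBitmap, double fps) {
  return (1000000.0 / fps) / fullScreenMicros(microsPerBitmap);
}
```

With 350 us per bitmap this gives 5512.5 us per full screen, and a little over 9 full repaints in a 50 ms frame at 20 fps.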
Myndale
 
Posts: 507
Joined: Sat Mar 01, 2014 1:25 am

Re: Display functions optimization

Postby rodot » Mon Jan 19, 2015 7:03 am

Just a small bump about the Bitmap class I suggested on the previous page... is nobody interested in these more versatile Bitmap and AnimatedBitmap classes?
rodot
Site Admin
 
Posts: 1290
Joined: Mon Nov 19, 2012 11:54 pm
Location: France

Re: Display functions optimization

Postby Myndale » Mon Jan 19, 2015 12:39 pm

I really like the idea in principle, but I think the implementation shown is going to cause a lot of problems memory-wise. As far as I can tell, every bitmap instance of the class shown is going to take 10 bytes, so 10 bitmaps = 100 bytes... and that in an environment where there are realistically only a few hundred bytes spare to begin with.

If it's the OOP and the flexibility of adding extensions that you find appealing, then it's entirely possible to store the bitmap class in PROGMEM as well. The downside is that users would have to access members via get() accessors, those accessors would require an extra PROGMEM read, and of course the bitmaps themselves would still be read-only. Still, if it solves your problems, then something like this illustrates how it could work:

Code:
// flashy new bitmap class
struct Bitmap
{
  const uint8_t * raw;      // address of the raw bitmap data (also in PROGMEM)
  uint8_t extended_data;
 
  // instances live in PROGMEM, so every member has to be read via pgm_read_*
  inline uint8_t getWidth() const {return pgm_read_byte(getRawData());}
  inline uint8_t getHeight() const {return pgm_read_byte(getRawData()+1);}
  inline uint8_t getPixels() const {return pgm_read_byte(getRawData()+2);}   // first byte of pixel data
  inline const uint8_t * getRawData() const {return (const uint8_t *)pgm_read_word(&this->raw);}
  inline uint8_t getExtendedData() const {return pgm_read_byte(&this->extended_data);}
};

// raw bitmap data in current format, stored in PROGMEM (newer avr-gcc requires const here)
const uint8_t raw_bitmap[] PROGMEM = {8, 8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};

// instance of new bitmap class, also stored entirely in PROGMEM and with extended data
const Bitmap bitmap PROGMEM = {raw_bitmap, 123};

void setup() {
  Serial.begin(9600);
  Serial.println(sizeof(bitmap));             // outputs -> 3 (2 bytes for raw ptr, 1 for extended data)
  Serial.println(bitmap.getWidth());          // outputs -> 8
  Serial.println(bitmap.getHeight());         // outputs -> 8
  Serial.println(bitmap.getExtendedData());   // outputs -> 123
}

void loop() {
}

Of course it's not entirely necessary to use the raw ptr the way I have here, I just did it like that to show that it could be done in a way that would maintain backward compatibility with the existing code, should you want to.
Myndale
 
Posts: 507
Joined: Sat Mar 01, 2014 1:25 am

Re: Display functions optimization

Postby cyberic » Thu Jan 22, 2015 9:12 am

Great work everyone!

Is there an optimised drawBitmap function that we can use right now, keeping the same bitmap format?

Do you think there could be a special case for when I want to 'draw' on the whole screen, a kind of memcpy?
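
A full-screen copy along those lines might look like the sketch below. It is only a host-side illustration under a few assumptions: the screen buffer is taken to be 84*48/8 = 504 bytes, with each byte holding a vertical strip of 8 pixels (inferred from how the optimized routines in this thread address the buffer), and the source image must already be stored in that same layout, which is not the current bitmap format. `blitFullScreen` is a hypothetical name. On the Gamebuino itself the destination would be `gb.display.getBuffer()`, and an image kept in PROGMEM would need avr-libc's `memcpy_P` instead of `memcpy`.

```cpp
#include <cstdint>
#include <cstring>

constexpr int kLcdWidth = 84;
constexpr int kLcdHeight = 48;
constexpr int kBufferSize = kLcdWidth * kLcdHeight / 8;   // 504 bytes

// Copies a full-screen image that is ALREADY stored in screen-buffer layout
// (assumption: one byte per vertical strip of 8 pixels, 84 bytes per strip row).
// Nothing is converted or clipped, so this is just one block copy.
void blitFullScreen(uint8_t *screen, const uint8_t *image) {
  std::memcpy(screen, image, kBufferSize);
}
```

Since every byte is overwritten, this only covers the "replace the whole screen" case; there is no transparency and no color argument.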

Thx
cyberic
 
Posts: 27
Joined: Thu May 08, 2014 5:36 pm

Re: Display functions optimization

Postby rodot » Thu Jan 22, 2015 4:37 pm

The second version Myndale posted is compatible with current bitmaps and is 6x faster, but it only supports the color black and doesn't do any screen border check (so you can inadvertently write outside the screen buffer, which will lead to unstable behavior).
If someone adds border and color checks, I will replace the library's default drawBitmap with it :)
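
One cheap way to get a border check without touching the fast inner loop would be to test the bitmap's rectangle once per call and fall back to the existing drawBitmap when it does not fit. The sketch below is only an illustration: `fitsOnScreen` is a hypothetical helper, and it assumes the padding bits of a bitmap whose width is not a multiple of 8 are zero, so they never cause a write.

```cpp
#include <cstdint>

constexpr int16_t kLcdWidth = 84;
constexpr int16_t kLcdHeight = 48;

// True when a w x h bitmap placed at (x, y) lies entirely inside the screen.
// Assumption: unused padding bits in the last byte of each bitmap row are 0,
// so only the declared width matters, not the width rounded up to 8.
constexpr bool fitsOnScreen(int16_t x, int16_t y, int16_t w, int16_t h) {
  return x >= 0 && y >= 0 && x + w <= kLcdWidth && y + h <= kLcdHeight;
}
```

A caller could then use the fast routine when `fitsOnScreen(...)` is true and the slower, bounds-checked one otherwise, so fully visible sprites pay only for four comparisons.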
rodot
Site Admin
 
Posts: 1290
Joined: Mon Nov 19, 2012 11:54 pm
Location: France

Re: Display functions optimization

Postby Myndale » Fri Jan 23, 2015 12:01 am

Try this one instead, it's 230uS and also supports WHITE and INVERT. I've also added another version that's 170 bytes smaller but ~20uS slower. I was writing this with the intention of converting it to assembly but a quick check of the LST file reveals the compiler is pretty much doing what I was going to do anyway.

At some point I'll do another version to handle all the other cases i.e. clipping, flipped etc. It won't be 230uS but should still be much faster than the existing function.

Code:
#include <SPI.h>
#include <Gamebuino.h>
Gamebuino gb;

const byte sprite[] PROGMEM = {
  16, 16, 0x1f,0xf8,0x1f,0xf8,0x1f,0xfc,0x1f,0xff,0x1f,0xff,0xf,0xff,0xf,0xff,0x7,0xff,0x87,0xff,0x3,0xff,0x1,0xff,0x0,0x7f,0x2,0x1f,0x0,0x0,0x0,0x0,0x40,0x0,};

void setup(){
  gb.begin();
}

void loop(){
  long start, finish;
  int drawTime;

  if(gb.update()){
    const int count = 101;  // this has to be odd so we can see INVERT

    // millis() differences are in ms, so 1000L*ms/count is microseconds per call
    start = millis();
    for (int i=0; i<count; i++)
      gb.display.drawBitmap(1, 31, sprite);
    finish = millis();
    drawTime = 1000L*(finish-start)/count;
    gb.display.print(F("drawBitmap: "));
    gb.display.print(drawTime);
    gb.display.println(F("us"));

    start = millis();
    for (int i=0; i<count; i++)
      drawBitmapUnrolled(17, 31, sprite, BLACK);
    finish = millis();
    drawTime = 1000L*(finish-start)/count;
    gb.display.print(F("unrolled: "));
    gb.display.print(drawTime);
    gb.display.println(F("us"));

    start = millis();
    for (int i=0; i<count; i++)
      drawBitmapUnrolled2(33, 31, sprite, BLACK);
    finish = millis();
    drawTime = 1000L*(finish-start)/count;
    gb.display.print(F("unrolled2: "));
    gb.display.print(drawTime);
    gb.display.println(F("us"));
  }
}

/* 782 bytes, 230uS */
void drawBitmapUnrolled(int8_t x, int8_t y, const uint8_t *bitmap, const uint8_t color) {
  int8_t h = pgm_read_byte(bitmap + 1);
  const int8_t byteWidth = (pgm_read_byte(bitmap) + 7) >> 3;
  bitmap += 2;   
  uint8_t * screen_line = gb.display.getBuffer() + (y / 8) * LCDWIDTH_NOROT + x;
 
  if (color == BLACK)
  {
    uint8_t mask = _BV(y & 7);
    while (h--)
    {
      uint8_t * ptr = screen_line;
      uint8_t i = byteWidth;
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] |= mask;
        if (pixels & 0x40) ptr[1] |= mask;
        if (pixels & 0x20) ptr[2] |= mask;
        if (pixels & 0x10) ptr[3] |= mask;
        if (pixels & 0x08) ptr[4] |= mask;
        if (pixels & 0x04) ptr[5] |= mask;
        if (pixels & 0x02) ptr[6] |= mask;
        if (pixels & 0x01) ptr[7] |= mask;
        ptr += 8;
      }
      y++;
      if (!(y & 7))
        screen_line += LCDWIDTH_NOROT;
      mask = (mask & 0x80) ? 1 : (mask<<1);
    }
  }
 
  else if (color == WHITE)
  {
    uint8_t mask = ~_BV(y & 7);
    while (h--)
    {
      uint8_t * ptr = screen_line;
      uint8_t i = byteWidth;
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] &= mask;
        if (pixels & 0x40) ptr[1] &= mask;
        if (pixels & 0x20) ptr[2] &= mask;
        if (pixels & 0x10) ptr[3] &= mask;
        if (pixels & 0x08) ptr[4] &= mask;
        if (pixels & 0x04) ptr[5] &= mask;
        if (pixels & 0x02) ptr[6] &= mask;
        if (pixels & 0x01) ptr[7] &= mask;
        ptr += 8;
      }
      y++;
      if (!(y & 7))
        screen_line += LCDWIDTH_NOROT;
      mask = (mask & 0x80) ? (mask<<1)+1 : 0xfe;
    }
  }
 
  else  // invert
  {
    uint8_t mask = _BV(y & 7);
    while (h--)
    {
      uint8_t * ptr = screen_line;
      uint8_t i = byteWidth;
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] ^= mask;
        if (pixels & 0x40) ptr[1] ^= mask;
        if (pixels & 0x20) ptr[2] ^= mask;
        if (pixels & 0x10) ptr[3] ^= mask;
        if (pixels & 0x08) ptr[4] ^= mask;
        if (pixels & 0x04) ptr[5] ^= mask;
        if (pixels & 0x02) ptr[6] ^= mask;
        if (pixels & 0x01) ptr[7] ^= mask;
        ptr += 8;
      }
      y++;
      if (!(y & 7))
        screen_line += LCDWIDTH_NOROT;
      mask = (mask & 0x80) ? 1 : (mask<<1);
    }
  }
 
}

/* 612 bytes, 250uS */
void drawBitmapUnrolled2(int8_t x, int8_t y, const uint8_t *bitmap, const uint8_t color) {
  int8_t h = pgm_read_byte(bitmap + 1);
  const int8_t byteWidth = (pgm_read_byte(bitmap) + 7) >> 3;
  bitmap += 2;   
  uint8_t * screen_line = gb.display.getBuffer() + (y / 8) * LCDWIDTH_NOROT + x;
  uint8_t mask = _BV(y & 7);
  if (color == WHITE)
    mask = ~mask;
  while (h--)
  {
    uint8_t * ptr = screen_line;
    uint8_t i = byteWidth;
    if (color == BLACK)
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] |= mask;
        if (pixels & 0x40) ptr[1] |= mask;
        if (pixels & 0x20) ptr[2] |= mask;
        if (pixels & 0x10) ptr[3] |= mask;
        if (pixels & 0x08) ptr[4] |= mask;
        if (pixels & 0x04) ptr[5] |= mask;
        if (pixels & 0x02) ptr[6] |= mask;
        if (pixels & 0x01) ptr[7] |= mask;
        ptr += 8;
      }
    else if (color == WHITE)
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] &= mask;
        if (pixels & 0x40) ptr[1] &= mask;
        if (pixels & 0x20) ptr[2] &= mask;
        if (pixels & 0x10) ptr[3] &= mask;
        if (pixels & 0x08) ptr[4] &= mask;
        if (pixels & 0x04) ptr[5] &= mask;
        if (pixels & 0x02) ptr[6] &= mask;
        if (pixels & 0x01) ptr[7] &= mask;
        ptr += 8;
      }
    else // invert
      while (i--)
      {
        const uint8_t pixels = pgm_read_byte(bitmap++);
        if (pixels & 0x80) ptr[0] ^= mask;
        if (pixels & 0x40) ptr[1] ^= mask;
        if (pixels & 0x20) ptr[2] ^= mask;
        if (pixels & 0x10) ptr[3] ^= mask;
        if (pixels & 0x08) ptr[4] ^= mask;
        if (pixels & 0x04) ptr[5] ^= mask;
        if (pixels & 0x02) ptr[6] ^= mask;
        if (pixels & 0x01) ptr[7] ^= mask;
        ptr += 8;
      }
    y++;
    if (!(y & 7))
      screen_line += LCDWIDTH_NOROT;
    if (color == WHITE)
      mask = (mask & 0x80) ? (mask<<1)+1 : 0xfe;
    else
      mask = (mask & 0x80) ? 1 : (mask<<1);
  }
}
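
For readers trying to follow the pointer and mask juggling in the two routines above, the addressing can be pulled out into a few tiny helpers. This is a host-side sketch, not part of the library: the helper names are hypothetical, and 84 stands in for LCDWIDTH_NOROT on the assumption that it equals the screen width mentioned earlier in the thread.

```cpp
#include <cstdint>

constexpr uint16_t kLcdWidth = 84;   // assumed value of LCDWIDTH_NOROT

// Index of the buffer byte holding pixel (x, y):
// each byte covers a vertical strip of 8 pixels.
constexpr uint16_t byteIndex(uint8_t x, uint8_t y) {
  return (y / 8) * kLcdWidth + x;
}

// Bit inside that byte, same as _BV(y & 7) in the routines above.
constexpr uint8_t bitMask(uint8_t y) {
  return static_cast<uint8_t>(1u << (y & 7));
}

// BLACK / INVERT: move the single set bit down one row, wrapping 0x80 -> 0x01.
constexpr uint8_t nextSetMask(uint8_t mask) {
  return (mask & 0x80) ? 1 : static_cast<uint8_t>(mask << 1);
}

// WHITE: the mask is inverted (a single clear bit), so rotate the zero instead,
// wrapping 0x7f -> 0xfe.
constexpr uint8_t nextClearMask(uint8_t mask) {
  return (mask & 0x80) ? static_cast<uint8_t>((mask << 1) + 1) : 0xFE;
}
```

When the set bit wraps back to 0x01 the row has crossed into the next strip of 8, which is where the routines add LCDWIDTH_NOROT to screen_line.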
Myndale
 
Posts: 507
Joined: Sat Mar 01, 2014 1:25 am

Re: Display functions optimization

Postby cyberic » Fri Jan 23, 2015 10:41 am

Thx Myndale!
cyberic
 
Posts: 27
Joined: Thu May 08, 2014 5:36 pm

Re: Display functions optimization

Postby rodot » Sat Jan 24, 2015 8:38 am

Thanks Myndale :)
Would checking that the pixels are drawn within the screen significantly affect performance? The case where a bitmap overlaps the edge of the screen is pretty common, so in my opinion it's a must-have feature... It's a shame, but I didn't manage to implement it properly with your optimized version. I'm sure it would only take you a few minutes to implement it :P
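
One possible approach, sketched here as an assumption rather than a finished implementation, is to clamp the bitmap against the screen edges once per call and let the drawing loops run only over the visible part, so no per-pixel test is needed. `ClipRect` and `clipToScreen` are hypothetical names, and wiring the result into the unrolled byte loop (which handles 8 pixels at a time) is left open.

```cpp
#include <cstdint>

constexpr int16_t kLcdWidth = 84;
constexpr int16_t kLcdHeight = 48;

// Visible part of a bitmap in bitmap-local coordinates, as half-open ranges:
// columns x0..x1-1 and rows y0..y1-1 are on screen.
struct ClipRect {
  int16_t x0, y0, x1, y1;
  bool empty() const { return x0 >= x1 || y0 >= y1; }
};

// Clamps a w x h bitmap placed at (x, y) against the screen edges once,
// so the drawing loops need no per-pixel bounds test.
ClipRect clipToScreen(int16_t x, int16_t y, int16_t w, int16_t h) {
  ClipRect r;
  r.x0 = (x < 0) ? -x : 0;                            // skip columns left of the screen
  r.y0 = (y < 0) ? -y : 0;                            // skip rows above the screen
  r.x1 = (x + w > kLcdWidth) ? kLcdWidth - x : w;     // stop at the right edge
  r.y1 = (y + h > kLcdHeight) ? kLcdHeight - y : h;   // stop at the bottom edge
  return r;
}
```

The cost is a handful of comparisons per bitmap rather than per pixel; a fully off-screen bitmap comes back as `empty()` and can be skipped entirely.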
rodot
Site Admin
 
Posts: 1290
Joined: Mon Nov 19, 2012 11:54 pm
Location: France
