Having fun optimizing a sprite routine

Re: Having fun optimizing a sprite routine

Postby Sorunome » Sat Jan 09, 2016 4:40 pm

[x], [y], etc. are set by the compiler itself, so this seems rather odd to me. Are you using the Arduino IDE and trying to compile as Gamebuino? Which version of the Arduino IDE?
Sorunome
 
Posts: 629
Joined: Sun Mar 01, 2015 1:58 pm

Re: Having fun optimizing a sprite routine

Postby deeph » Sat Jan 09, 2016 5:12 pm

Yes, I'm using the Arduino IDE enhanced release 1.0.5 and compiling as Gamebuino. I don't know if there's a newer version, though.

Edit: never mind, it seems to compile with version 1.6.7.
deeph
 
Posts: 52
Joined: Mon Jul 13, 2015 6:09 am
Location: France

Re: Having fun optimizing a sprite routine

Postby Sorunome » Sun Jan 10, 2016 8:24 pm

Glad you got it working. And yeah, I remember it only working in 1.6.x or later, or something like that.
Sorunome
 
Posts: 629
Joined: Sun Mar 01, 2015 1:58 pm

Re: Having fun optimizing a sprite routine

Postby deeph » Sun Mar 13, 2016 1:26 pm

OK, I got it to work as I wanted, but is there a way to XOR the sprite instead of ORing it? Thanks.
deeph
 
Posts: 52
Joined: Mon Jul 13, 2015 6:09 am
Location: France

Re: Having fun optimizing a sprite routine

Postby Sorunome » Sun Mar 13, 2016 5:09 pm

Try this slight modification (untested):
Code:
void ultraDraw4(byte data[], char x, char y){
        uint8_t* buf = ((y&0xF8)>>1) * 21 + gb.display.getBuffer() + x;
        asm volatile(
        "ldi R16,8\n\t"
        "cpi %[rotnumber],0\n\t"
        "breq LoopAligned\n"
        "LoopStart:\n\t"

          "ld R17,Z+\n\t"
          "eor R18,R18\n\t"
          "mov R19,%[rotnumber]\n\t"
          "LoopShift:\n\t" // carry is clear here: cpi against 0 clears it, each rol R18 shifts out a 0, and dec/eor/ld/st leave it alone
            "rol R17\n\t"
            "rol R18\n\t"
            "dec R19\n\t"
            "brne LoopShift\n\t"

          "ld R19,X\n\t"
          "eor R19,R17\n\t"
          "st X+,R19\n\t"
         
          "ld R19,Y\n\t"
          "eor R19,R18\n\t"
          "st Y+,R19\n\t"
         
          "dec R16\n\t"
          "brne LoopStart\n\t"
        "rjmp End\n"
        "LoopAligned:\n\t"
          "ld R17,Z+\n\t"
          "ld R18,X\n\t"
          "eor R18,R17\n\t"
          "st X+,R18\n\t"
          "dec R16\n\t"
          "brne LoopAligned\n"
        "End:\n"
        ::"x" (buf),"y" (buf + 84),"z" (data),[rotnumber] "r" (y%8):"r16","r17","r18","r19");
}
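
For anyone who doesn't read AVR assembly, here is a plain C++ sketch of what the routine above appears to do for one 8x8 sprite. It assumes the buffer layout implied by the constants in the asm (84 bytes per 8-pixel-high row, 6 rows, bit 0 of each byte being the top pixel). The names `xorSprite8`, `kWidth` and `kRows` are made up for illustration and are not part of the Gamebuino library, and like the asm version it does no clipping.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kWidth = 84;  // bytes per 8-pixel-high row (assumed screen width)
constexpr int kRows  = 6;   // 48 pixels high / 8 pixels per byte

// XORs an 8x8 sprite (8 column bytes) into buf at pixel position (x, y).
// No clipping: the caller must keep the sprite fully on screen.
void xorSprite8(uint8_t* buf, const uint8_t* data, int x, int y) {
    assert(x >= 0 && x + 8 <= kWidth && y >= 0 && y + 8 <= kRows * 8);
    const int offset = (y / 8) * kWidth + x;  // same value as ((y & 0xF8) >> 1) * 21 + x
    const int shift  = y % 8;                 // the asm's rotnumber
    for (int i = 0; i < 8; ++i) {
        // Shift the column down by `shift` pixels; it may now span two rows.
        const uint16_t column = static_cast<uint16_t>(data[i] << shift);
        buf[offset + i] ^= static_cast<uint8_t>(column & 0xFF);  // R17 in the asm
        if (shift != 0) {
            buf[offset + kWidth + i] ^= static_cast<uint8_t>(column >> 8);  // R18 in the asm
        }
    }
}
```

Drawing the same sprite twice at the same position restores the buffer, which is the usual reason to prefer XOR over OR.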
Sorunome
 
Posts: 629
Joined: Sun Mar 01, 2015 1:58 pm

Re: Having fun optimizing a sprite routine

Postby deeph » Sun Mar 13, 2016 6:21 pm

It's working, but I need to be able to switch the copying logic (AND/OR/XOR), mainly to mask sprites. Is there a way to do self-modifying code, or to add an extra parameter to the routine? Thank you!! :)
deeph
 
Posts: 52
Joined: Mon Jul 13, 2015 6:09 am
Location: France

Re: Having fun optimizing a sprite routine

Postby Sorunome » Mon Mar 14, 2016 12:19 pm

Not easily within one routine.
You can make multiple copies of it, however.
There are three key lines:
"eor R19,R17\n\t"

"eor R19,R18\n\t"

"eor R18,R17\n\t"


eor is XOR, or is OR, and and is AND.
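
If one routine with an extra parameter matters more than raw speed, the same choice can be made outside the asm. This is only a sketch in plain C++ of the per-byte operation; `BlendMode` and `blendByte` are hypothetical names, and a version like this will be slower than three dedicated asm copies.

```cpp
#include <cstdint>

enum class BlendMode { Or, Xor, And };

// Combines one sprite byte with one screen-buffer byte.
//   Or  corresponds to the AVR "or"  instruction: sets pixels
//   Xor corresponds to the AVR "eor" instruction: toggles pixels
//   And corresponds to the AVR "and" instruction: keeps pixels set in both
uint8_t blendByte(uint8_t screen, uint8_t sprite, BlendMode mode) {
    switch (mode) {
        case BlendMode::Or:  return static_cast<uint8_t>(screen | sprite);
        case BlendMode::Xor: return static_cast<uint8_t>(screen ^ sprite);
        case BlendMode::And: return static_cast<uint8_t>(screen & sprite);
    }
    return screen;  // not reached for valid modes
}
```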
Sorunome
 
Posts: 629
Joined: Sun Mar 01, 2015 1:58 pm

Re: Having fun optimizing a sprite routine

Postby deeph » Mon Mar 14, 2016 5:57 pm

Hmm, OK. I think I'm going to do the mask with the classic sprite routine and everything else with the asm one. Thank you :)
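
Per buffer byte, the mask-then-draw idea comes down to two bitwise steps: clear the pixels the mask covers, then combine the sprite. A minimal sketch in plain C++, assuming a set bit means a black pixel and that the mask has a 1 wherever the sprite should be opaque (`drawMasked` is an illustrative name, not a library function):

```cpp
#include <cstdint>

// Clears the masked pixels of one screen byte, then XORs the sprite byte in.
uint8_t drawMasked(uint8_t screen, uint8_t mask, uint8_t sprite) {
    const uint8_t cleared = static_cast<uint8_t>(screen & ~mask);  // punch a white hole
    return static_cast<uint8_t>(cleared ^ sprite);                 // draw the sprite into it
}
```

As long as every sprite pixel lies inside the mask, XOR and OR give the same result in the second step, so the XOR routine can be reused unchanged.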
deeph
 
Posts: 52
Joined: Mon Jul 13, 2015 6:09 am
Location: France

Re: Having fun optimizing a sprite routine

Postby Sorunome » Tue Mar 15, 2016 11:04 pm

Hmm, you could actually do some more dark-magic optimizing with a clipped routine directly in the asm code. I currently don't have my Arduino dev environment set up, though :(
Sorunome
 
Posts: 629
Joined: Sun Mar 01, 2015 1:58 pm

Re: Having fun optimizing a sprite routine

Postby deeph » Wed Mar 16, 2016 6:00 am

Yes, I'm using yours: http://gamebuino.com/forum/viewtopic.php?f=12&t=3244&start=10#p10176

Here's a new version of my tilemapper using it for drawing (minus sprite masking):

Image

Code:
#include <SPI.h>
#include <Gamebuino.h>

#define MAP_WIDTH 15
#define MAP_HEIGHT 15
const byte PROGMEM map_test[]={ // one byte per tile is enough and matches the pgm_read_byte() reads below
  0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
  0x05,0x01,0x01,0x01,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
  0x05,0x01,0x00,0x00,0x01,0x07,0x01,0x01,0x01,0x01,0x05,0x05,0x05,0x05,0x05,
  0x05,0x01,0x00,0x00,0x00,0x00,0x07,0x07,0x07,0x00,0x01,0x05,0x05,0x05,0x05,
  0x05,0x01,0x00,0x03,0x00,0x00,0x00,0x03,0x00,0x00,0x00,0x01,0x05,0x05,0x05,
  0x05,0x01,0x00,0x00,0x01,0x01,0x01,0x01,0x01,0x00,0x00,0x00,0x01,0x05,0x05,
  0x05,0x01,0x00,0x01,0x01,0x02,0x02,0x02,0x01,0x01,0x00,0x00,0x01,0x05,0x05,
  0x05,0x05,0x01,0x01,0x02,0x02,0x02,0x02,0x02,0x01,0x09,0x0b,0x01,0x05,0x05,
  0x05,0x01,0x00,0x01,0x01,0x02,0x02,0x02,0x01,0x01,0x0a,0x0c,0x01,0x05,0x05,
  0x05,0x01,0x00,0x09,0x0b,0x01,0x02,0x02,0x01,0x00,0x00,0x00,0x01,0x05,0x05,
  0x05,0x05,0x01,0x0a,0x0c,0x00,0x00,0x02,0x00,0x00,0x00,0x01,0x05,0x05,0x05,
  0x05,0x05,0x01,0x01,0x00,0x00,0x03,0x02,0x08,0x01,0x01,0x01,0x05,0x05,0x05,
  0x05,0x05,0x05,0x05,0x01,0x01,0x00,0x02,0x01,0x05,0x05,0x05,0x05,0x05,0x05,
  0x05,0x05,0x05,0x05,0x05,0x05,0x01,0x02,0x05,0x05,0x05,0x05,0x05,0x05,0x05,
  0x05,0x05,0x05,0x05,0x05,0x05,0x02,0x01,0x05,0x05,0x05,0x05,0x05,0x05,0x05 };

#define TILE_WIDTH 8
#define TILE_HEIGHT 8
#define TILES_PASSABLE_6End 4
#define TILES_ANIMATED_START 3
#define TILES_ANIMATED_6End 5
#define ANIMATION_FREQUENCY 500 // ms

const byte tiles[]={
  0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
  0x00,0x00,0x08,0x00,0x40,0x04,0x00,0x00,
  0x02,0x01,0x48,0x24,0x00,0x04,0x42,0x20,
  0x00,0x40,0x10,0x60,0x00,0x40,0x20,0x00,
  0x00,0x40,0x00,0x70,0x00,0x40,0x20,0x00,
  0xff,0xf7,0x7b,0xf7,0xdf,0xef,0xdd,0xff,
  0xff,0xfb,0xbd,0xfb,0xef,0xf7,0xee,0xff,
  0x70,0xdc,0xaa,0xf1,0xf5,0xea,0xfc,0x78,
  0x1e,0x21,0x2d,0xe9,0xe5,0x29,0x21,0x1e,
  0x80,0x70,0x4c,0x12,0x26,0x12,0x01,0x01,
  0x03,0x0d,0x16,0x3a,0x34,0x68,0xbc,0x99,
  0xa1,0x01,0xa2,0x46,0x92,0x4c,0xf0,0x80,
  0x98,0xb5,0x7a,0x2d,0x3a,0x1d,0x0c,0x03,
 };
const byte* tiles_pointer = tiles;

const byte player_charset[]={
  0x00,0x1e,0xed,0xa5,0xed,0xa5,0x1e,0x04,
  0x00,0xde,0xed,0xe5,0xad,0xe5,0x9e,0x04,
  0x00,0xde,0xad,0xe5,0xed,0xe5,0x9e,0x04,
  0x00,0x5e,0xa5,0xe5,0x65,0xe5,0xa5,0x5e,
  0x00,0x7e,0xd5,0xe5,0x65,0xa5,0xa5,0x7e,
  0x00,0x7e,0xa5,0xa5,0x65,0xe5,0xd5,0x7e,
  0x00,0x04,0x1e,0xa5,0xed,0xa5,0xed,0x1e,
  0x00,0x04,0x9e,0xe5,0xad,0xe5,0xed,0xde,
  0x00,0x04,0x9e,0xe5,0xed,0xe5,0xad,0xde,
  0x00,0x5e,0xa5,0xed,0x65,0xed,0xa5,0x5e,
  0x00,0x7e,0xd5,0xed,0x65,0xed,0xe5,0x7e,
  0x00,0x7e,0xe5,0xed,0x65,0xed,0xd5,0x7e,
 };
const byte* player_charset_pointer = player_charset;

const byte PROGMEM player_mask01[]={
  8,8,0x3c,0x7e,0x7f,0x7e,0x7e,0x3c,0x38,0x3c };
const byte PROGMEM player_mask02[]={
  8,8,0x3c,0x7e,0x7f,0x7e,0x7e,0x3c,0x7c,0x7e };
const byte PROGMEM player_mask03[]={
  8,8,0x3e,0x7f,0x7f,0x7f,0x7f,0x3e,0x7f,0x36 };
const byte PROGMEM player_mask04[]={
  8,8,0x3e,0x7f,0x7f,0x7f,0x7f,0x7f,0x7f,0x36 };
const byte PROGMEM player_mask05[]={
  8,8,0x1e,0x3f,0x7f,0x3f,0x3f,0x1e,0x0e,0x1e };
const byte PROGMEM player_mask06[]={
  8,8,0x1e,0x3f,0x7f,0x3f,0x3f,0x1e,0x1f,0x3f };
const byte* player_masks_pointer[]={
  player_mask01,player_mask02,player_mask02,player_mask03,player_mask04,player_mask04,player_mask05,player_mask06,player_mask06,player_mask03,player_mask04,player_mask04 };

Gamebuino gb;

void setup(){
  gb.begin();
  byte player_x = 2;
  byte player_y = 2;
  byte player_direction = 3;
  byte player_animation = 0;
  int camera_x = player_x*TILE_WIDTH-LCDWIDTH/2+4;
  camera_x = camera_x*(camera_x > 0)+(MAP_WIDTH*TILE_WIDTH-LCDWIDTH-camera_x)*(camera_x > MAP_WIDTH*TILE_WIDTH-LCDWIDTH);
  int camera_y = player_y*TILE_HEIGHT-LCDHEIGHT/2+4;
  camera_y = camera_y*(camera_y > 0)+(MAP_HEIGHT*TILE_HEIGHT-LCDHEIGHT-camera_y)*(camera_y > MAP_HEIGHT*TILE_HEIGHT-LCDHEIGHT);
  while(1){
    if(gb.update()){
      draw_map(camera_x, camera_y);
      draw_player(player_x*TILE_WIDTH-camera_x, player_y*TILE_WIDTH-camera_y, player_direction, player_animation);
      char x_temp = -gb.buttons.repeat(BTN_LEFT, 1)*(player_x > 0)+gb.buttons.repeat(BTN_RIGHT, 1)*(player_x < MAP_WIDTH-1);
      char y_temp = -gb.buttons.repeat(BTN_UP, 1)*(player_y > 0)+gb.buttons.repeat(BTN_DOWN, 1)*(player_y < MAP_HEIGHT-1);
      if(x_temp || y_temp){
        player_direction = (1+x_temp)*(x_temp != 0);
        player_direction = (2+y_temp)*(y_temp != 0 || player_direction == 0);
        if(TILES_PASSABLE_6End-pgm_read_byte(map_test+(player_y+y_temp)*MAP_WIDTH+player_x+x_temp) > 0){
          for(byte i = 1; i <=8; i++){
            player_animation += 1;
            player_animation *= (player_animation < 3 && i < 8);
            camera_x = (player_x*TILE_WIDTH-LCDWIDTH/2+4+i*x_temp);
            camera_x = camera_x*(camera_x > 0)+(MAP_WIDTH*TILE_WIDTH-LCDWIDTH-camera_x)*(camera_x > MAP_WIDTH*TILE_WIDTH-LCDWIDTH);
            camera_y = player_y*TILE_HEIGHT-LCDHEIGHT/2+4+i*y_temp;
            camera_y = camera_y*(camera_y > 0)+(MAP_HEIGHT*TILE_HEIGHT-LCDHEIGHT-camera_y)*(camera_y > MAP_HEIGHT*TILE_HEIGHT-LCDHEIGHT);
            gb.display.clear();
            draw_map(camera_x, camera_y);
            draw_player(player_x*TILE_WIDTH-camera_x+i*x_temp, player_y*TILE_WIDTH-camera_y+i*y_temp, player_direction, player_animation);
            gb.display.update();
          }
          player_x += x_temp;
          player_y += y_temp;
        }
      }
    }
  }
}

void loop(){
}

void draw_map(int camera_x, int camera_y){
  for(byte y = 0; y <= 6; y++){
    for(byte x = 0; x <= 11; x++){
      int tile_x = camera_x/TILE_WIDTH+x;
      int tile_y = camera_y/TILE_HEIGHT+y;
      if(tile_x >= 0 && tile_x < MAP_WIDTH && tile_y >= 0 && tile_y < MAP_HEIGHT){
        byte tile_num = pgm_read_byte(map_test+tile_y*MAP_WIDTH+tile_x);
        tile_num += (tile_num >= TILES_ANIMATED_START && tile_num <= TILES_ANIMATED_6End)*millis()/ANIMATION_FREQUENCY%2;
        draw_sprite(tiles_pointer+tile_num*TILE_HEIGHT, x*TILE_WIDTH-camera_x%TILE_WIDTH, y*TILE_HEIGHT-camera_y%TILE_HEIGHT);
      }
    }
  }
}

void draw_player(int x, int y, byte direction, byte animation){
  gb.display.setColor(WHITE, BLACK);
  gb.display.drawBitmap(x, y, player_masks_pointer[direction*3+animation]);
  draw_sprite(player_charset_pointer+(direction*24+animation*8), x, y);
}


void draw_sprite(const byte data[], char x, char y){  // routine by Sorunome
  uint8_t* buf = (((y+8)&0xF8)>>1) * 21 + x + gb.display.getBuffer();
  asm volatile(
  "mov R20,%[y]\n\t"
  "ldi R17,7\n\t"
  "add R20,R17\n\t"
  "brmi End\n\t"
  "cpi %[y],48\n\t"
  "brpl End\n\t"
  "inc R20\n\t"
  "ldi R16,8\n\t"
  "andi R20,7\n\t"
  "cpi R20,0\n\t"
  "breq LoopAligned\n"
  "LoopStart:\n\t"
  "tst %[x]\n\t"
  "brmi LoopSkip\n\t"
  "cpi %[x],84\n\t"
  "brcc LoopSkip\n\t"
  "ld R17,Z\n\t"
  "eor R18,R18\n\t"
  "mov R19,R20\n\t"
  "clc\n\t"
  "LoopShift:\n\t" // carry was cleared by the clc above (the add/adc pointer increments can set it)
  "rol R17\n\t"
  "rol R18\n\t"
  "dec R19\n\t"
  "brne LoopShift\n\t"
  "tst %[y]\n\t"
  "brmi LoopSkipPart\n\t"
  "ld R19,X\n\t"
  "eor R19,R17\n\t"
  "st X,R19\n\t"
  "LoopSkipPart:\n\t"
  "cpi %[y],40\n\t"
  "brpl LoopSkip\n\t"
  "ld R19,Y\n\t"
  "eor R19,R18\n\t"
  "st Y,R19\n\t"
  "LoopSkip:\n\t"
  "eor R18,R18\n\t"
  "ldi R19,1\n\t"
  "add R26,R19\n\t" // INC DOESN'T CHANGE CARRY!
  "adc R27,R18\n\t"
  "add R28,R19\n\t"
  "adc R29,R18\n\t"
  "add R30,R19\n\t"
  "adc R31,R18\n\t"
  "inc %[x]\n\t"
  "dec R16\n\t"
  "brne LoopStart\n\t"
  "rjmp End\n"
  "LoopAligned:\n\t"
  "tst %[x]\n\t"
  "brmi LoopAlignSkip\n\t"
  "cpi %[x],84\n\t"
  "brcc LoopAlignSkip\n\t"
  "ld R17,Z\n\t"
  "ld R18,X\n\t"
  "eor R18,R17\n\t"
  "st X,R18\n\t"
  "LoopAlignSkip:\n\t"
  "ldi R18,1\n\t"
  "add R26,R18\n\t"
  "adc R27,R20\n\t"
  "add R30,R18\n\t"
  "adc R31,R20\n\t"
  "inc %[x]\n\t"
  "dec R16\n\t"
  "brne LoopAligned\n"
  "End:\n\t"
  ::"x" (buf - 84),"y" (buf),"z" (data),[y] "r" (y),[x] "r" (x):"r16","r17","r18","r19","r20");
}
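
A note on the camera lines in setup(): the expression `camera_x*(camera_x > 0)+(MAX-camera_x)*(camera_x > MAX)` is a branchless clamp to the range 0..MAX, where MAX is `MAP_WIDTH*TILE_WIDTH-LCDWIDTH`. The sketch below, in plain C++ with made-up function names, puts it next to a more readable equivalent. It assumes MAX is not negative, which holds for the 15x15 map here.

```cpp
#include <algorithm>

// The multiply-by-comparison form used in the sketch above.
int clampBranchless(int v, int maxV) {
    return v * (v > 0) + (maxV - v) * (v > maxV);
}

// The same clamp written with min/max.
int clampReadable(int v, int maxV) {
    return std::min(std::max(v, 0), maxV);
}
```

With the values from the code (15 tiles of 8 pixels on an 84-pixel-wide screen), MAX for the x axis is 36, so a camera_x of -10 becomes 0 and a camera_x of 50 becomes 36.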


I don't know how many times faster or smaller it is, but it is both.

I'll probably make a little game out of this :)

Again, thanks !
deeph
 
Posts: 52
Joined: Mon Jul 13, 2015 6:09 am
Location: France
