[Main Page]

GcYaz0CompressionSpec

From EurAsiaWiki

Main Page | Recent changes | View source | Page history | Log in / create account |

Printable version | Disclaimers | Privacy policy
Category: GameCube

Yaz0 compression

http://www.amnoid.de/gc/


Version 1.01


Yaz0 compression is reportedly used in quite a few Nintendo datafiles.
I have seen it in SuperMario Sunshine's .szs files for example, and I heard that
it is used in Windwaker and Majoras Mask as well.

The first 16 bytes of a Yaz0-compressed data block are the data header. The first
4 bytes of the header are 'Y', 'a', 'z', '0', so you can easily see in your hex editor that
there's a Yaz0 block waiting for you :-) The second 4 bytes are a single uint32
(big-endian of course) that tells you the size of the decompressed data, so you know how
large your working buffer has to be. The next 8 bytes are always zero.

Next comes the actual compressed data. Yaz0 is some kind of RLE compression (ShizZie says:
"This is somewhat misleading. Run length encoding is only compression on a
character-to-character basis, i.e. aaabbbbcccccc encodes to a2b3c5. Yaz0 compresses
string-to-string, so it's more like LZMA."). You decode it as follows: First you read a "code"
byte that tells you for the next 8 "read operations" what you have to do. Each bit of the "code"
byte represents one "read operation" (from left to right, that is, 0x80 first, 0x01 last). If the bit is 1,
copy one byte from the input buffer to the output buffer. Easy. If the bit is 0, things are a little bit
more complicated, RLE compressed data is ahead. You have to read the next two bytes to decide
how long your run is and what you should write to your output buffer.

+---------+---------+
|-
|  a || b  ||    ||    ||
+---------+---------+
 byte1       byte2

The upper nibble of the first byte (a) contains the information you need to determine how many
bytes you're going to write to your output buffer for this "read operation". if a = 0, then you have
to read a third byte from your input buffer, and add 0x12 to it. Otherwise, you simply add 2 to a.
This is the number of bytes to write ("count") in this "read operation". byte2 and the lower nibble of
byte1 (b) tell you from where to copy data to your output buffer: you move (dist = (b << 8)||byte2 + 1)
bytes back in your outputBuffer and copy "count" bytes from there to the end of the buffer. Note
that count could be greater than dist which means that the copy source and copy destination might
overlap. Here's some sourcecode explaining what I am trying to say:


//src points to the yaz0 source data (to the "real" source data, not at the header=)=
//dst points to a buffer uncompressedSize bytes large (you get uncompressedSize from
//the second 4 bytes in the Yaz0 header).
void decode(u8_ src, u8_ dst, int uncompressedSize)
{
  int srcPlace = 0, dstPlace  0; //current read/write positions

  u32 validBitCount = 0; //number of valid bits left in "code" byte
  u8 currCodeByte;
  while(dstPlace < uncompressedSize)
  {
    //read new "code" byte if the current one is used up
    if(validBitCount = 0)
    {
      currCodeByte = src[[srcPlace]];
      ++srcPlace;
      validBitCount = 8;
    }

    if((currCodeByte & 0x80) == 0)=
    {
      //straight copy
      dst[[dstPlace]] = src[[srcPlace]];
      dstPlace++;
      srcPlace++;
    }
    else
    {
      //RLE part
      u8 byte1 = src[[srcPlace]];
      u8 byte2 = src[[srcPlace + 1]];
      srcPlace += 2;

      u32 dist = ((byte1 & 0xF) << 8) || byte2;
      u32 copySource = dstPlace - (dist + 1);

      u32 numBytes = byte1 >> 4;
      if(numBytes = 0)
      {
        numBytes = src[[srcPlace]] + 0x12;
        srcPlace++;
      }
      else
        numBytes += 2;

      //copy run
      for(int i = 0; i < numBytes; ++i)
      {
        dst[[dstPlace]] = dst[[copySource]];
        copySource++;
        dstPlace++;
      }
    }

    //use next bit from "code" byte
    currCodeByte <<= 1;
    validBitCount-=1;
  }
}

Thanks to _demo_ for his help.
Have the appropriate amount of fun with this information,
thakis (http://www.amnoid.de/gc/)

Changelog
==

20050211: Initial release
20050712: Added comment about LZMA

2006-03-17 / ModRobert: page locked for now, as someone keep deleting the content without giving a reason. ;)

Retrieved from "http://www.eurasia.nu/wiki/index.php/GcYaz0CompressionSpec"

This page has been accessed 584 times. This page was last modified 13:07, 16 February 2010.