OBJECTPROCESSOR

# -------------------------------------------------------------------
# OBJECTPROCESSOR                  (c) Copyright 1995-1996 Nat! & KKP
# -------------------------------------------------------------------
# These are some of the results/guesses that Klaus and Nat! found
# out about the Jaguar. Since we are not under NDA or anything from
# Atari we feel free to give this to you for educational purposes
# only. Thanks to NEUROMANCER for many worthy corrections and 
# the GPU-object info.
#
# Please note, that this is not official documentation from Atari
# or derived work thereof (both of us have never seen the Atari docs)
# and Atari isn't connected with this in any way.
#
# Please use this informationphile as a starting point for your own
# exploration and not as a reference. If you find anything inaccurate,
# missing, needing more explanation etc. by all means please write
# to us:
#    nat@zumdick.rhein-main.de
# or
#    kkp@gamma.dou.dk
#
# If you could do us a small favor, don't use this information for
# those lame flame-wars on r.g.v.a or the mailing list.
#
# HTML soon ?
# -------------------------------------------------------------------
# $Id: op.html,v 1.28 1997/03/30 02:27:13 nat Exp $
#
# If there are two theories I put the more likely one first.
# -------------------------------------------------------------------

Things to know about the Objectprocessor (OP)
Registers
Object types
Objects as C-structs
OP Bugs
Small Discussion




Things to know about the Objectprocessor (OP):
==============================================
-1    Imagine a phrase being an entity of 64 bits (or 8 bytes for that
      matter).

0.    The object list is a linked list.

1.    The object list is traversed by the object processor for
      each! scanline.

2.    The Objectprocessor probably works like this:

      Whenever a new linebuffer needs to be filled, the OP is called to do
      its chore, while the videosystem is busy displaying the other linebuffer.
      The OP does its work by traversing the objectlist and interpreting 
      each object in sequence. Each object has per linebuffer the chance ONCE 
      to fill the linebuffer. (Note that this does not necessarily mean
      per scanline, since with the special HDB2-mode it can happen that
      two linebuffers are used for each scanline!)

      It fills the linebuffer at a specified horizontal position for a 
      specified width. The data in the linebuffer is always overwritten 
      (except when the Read-Modify-Write bit is set). If the active object 
      has the transparent bit set, it will not overwrite values in the 
      linebuffer when its source pixel has the value zero. The 'transparency' 
      check is done before looking up the pixel's color in the CLUT 
      (1 - 256 color modes).

2.1   The earlier an object appears in the list, the further in the
      background it appears. The linebuffer is initialized by the video
      chip with the linebuffer-backgroundcolor (BG) before the OP starts
      filling the linebuffer.

      One may also assume that the OP normally traverses the
      linebuffer from left to right, except when the horizontal flip
      bit is set. (Very useful information indeed! (har) )

      Each bitmap object is made up of pixels. These pixels can either
      contain the color itself (direct) as in CrY and True-Color modes
      or be an index into a Colorlookuptable (indirect).

2.2   We assume that the OP writes into the linebuffer locally, so that
      the object-data is read over the bus, but not written into the
      linebuffer over the bus (which would be way evil)

2.3   If all these theories are true, then the OP has on the average one
      scanline time to prepare the linebuffer. (In a setup where one
      linebuffer is used per scanline)

2.4   The videosystem can deal with 16bit RGB/CrY-color and 24bit RGB
      pixels, the size of the pixels the OP writes into the linebuffer
      and pulls out of the CLUT, depends on the pixel-type chosen for
      the videosystem.

2.5   The objects in the objectlist are *modified* by the OP. This means
      that an object list is only good for one frame. You need to
      continually refresh your object list each VBLANK.

3.    The last object must be a STOP object.

4.    The Objectlist must be double-phrase aligned. This means
      that the lower nybble of the address must be zero.
      (Maybe this is wrong and it is just object alignment that you
      should take into account)

5.    The address of the image of an object must be (as expected)
      phrase aligned (zero in the lower 3 bits)

6.    There are five different objects that the Objectprocessor knows
      about. These are:

      1. Bitmapped Object
      2. Scaled bitmapped object
      3. GPU-Object (interrupts the GPU)
      4. Branch object
      5. Stop object (marks the end of the object list)

      The objects have different sizes. The minimum size of an object
      is a "phrase". Also note the alignment constraints.

      Object type    Number     Size in phrases  Alignment in phrases
      -------------------------------------------------------------
      BITMAP         0           2                       2
      SCALE          1           3                       4 !!
      GPU            2           1                       1
      BRANCH         3           1                       1
      STOP           4           1                       1


7.    To keep the Objectprocessor from fetching data (and wasting bandwidth)
      during the VBLANK you usually put two branch objects at the beginning
      of the display list, that branch to the stop object if the first
      displayable scanline has not been reached or the last displayable
      scanline has already been displayed.

7.1   The OP must not take more than a scanline's worth of time to process
      the object list, else the display tears. (If using a single linebuffer
      per scanline)

8     The OP usually hogs the bus when doing data transfers, since it is
      normally the highest-priority (interesting) device on the bus.

9     In the special mode where two linebuffers are used for each scanline,
      you should remember that the OP executes the object list twice. That
      will give you quite some headaches. For example sprites crossing 
      the "boundary" will have to be split in two objects, which will be
      really painful, if those sprites are scaled objects.
      See the branch object for an idea of how to set up separate
      lists for each linebuffer.
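
      The size and alignment table from point 6 can be written down as a
      small C helper. This is only a sketch: the names op_info,
      op_object_aligned and op_align_up are made up for illustration, and
      the alignment values are simply the guesses from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Object type numbers as listed in the table above. */
enum op_type { OP_BITMAP = 0, OP_SCALE = 1, OP_GPU = 2, OP_BRANCH = 3, OP_STOP = 4 };

/* Size and alignment per object type, both in phrases (8 bytes each). */
struct op_type_info { unsigned size; unsigned align; };

static const struct op_type_info op_info[5] = {
    { 2, 2 },   /* BITMAP */
    { 3, 4 },   /* SCALE  */
    { 1, 1 },   /* GPU    */
    { 1, 1 },   /* BRANCH */
    { 1, 1 },   /* STOP   */
};

/* Returns 1 if byte address addr satisfies the alignment for the type. */
int op_object_aligned(enum op_type type, uint32_t addr)
{
    assert((unsigned)type <= (unsigned)OP_STOP);
    return (addr % (op_info[type].align * 8u)) == 0;
}

/* Rounds addr up to the next byte address that is valid for the type. */
uint32_t op_align_up(enum op_type type, uint32_t addr)
{
    uint32_t a;
    assert((unsigned)type <= (unsigned)OP_STOP);
    a = op_info[type].align * 8u;          /* always a power of two here */
    return (addr + a - 1u) & ~(a - 1u);
}
```

      When building a list in memory you could call op_align_up() before
      placing each object, which mostly matters for scaled bitmaps.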



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
9                        Your friendly OP-registers
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


RW: OLP ($F00020)
~~~~~~~~~~~~~~~~~
 32       28        24        20       16       12        8        4        0
  +--------^---------^---------^--------+--------^--------^--------^--------+
  |              low_word               |           high_word               |
  +-------------------------------------+-----------------------------------+

low_word:
high_word:

   The address of the object list. The 32 bit address is word swapped.
   So you gotta store it like this:

            move.l   #objlist,d0
            swap     d0
            move.l   d0,OLP

   It seems a good idea to set this on every VBL. (My programs run more
   predictably this way)
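
   The same word swap, sketched in C for those who build their lists from
   C code. The function names are made up; only the swap itself and the
   double-phrase alignment rule come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two 16-bit halves of a 32-bit value, like the 68K SWAP. */
uint32_t olp_word_swap(uint32_t addr)
{
    return (addr << 16) | (addr >> 16);
}

/* Value to store to OLP for an object list at byte address list_addr. */
uint32_t olp_value(uint32_t list_addr)
{
    assert((list_addr & 0xFu) == 0);   /* double-phrase aligned list */
    return olp_word_swap(list_addr);
}
```

   On the real machine you would then store olp_value(...) as one long
   to $F00020, presumably through a volatile pointer.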



RW: OB ($F00010)
~~~~~~~~~~~~~~~~
 32       28        24        20       16       12        8        4        0
  +--------^---------^---------^--------^--------^--------^--------^--------+
0 |                                object-data                              |
  +-------------------------------------------------------------------------+

 64       60        56        52       48       44       40       36        32
  +--------^---------^---------^--------^--------^--------^--------^--------+
1 |                                object-data                              |
  +-------------------------------------------------------------------------+

object-data:
   This is used to pass data/pointer to the GPU when using a GPU object.
   Lord knows what the second phrase is for...



R: OBF ($F00026)
~~~~~~~~~~~~~~~~
 32       28        24        20       16       12        8        4        0
  +--------^---------^---------^--------^--------^--------^--------^-----+--+
  |                                  data                                :f |
  +----------------------------------------------------------------------+--+
   
data + flag (f):  
   The STOP object's data field is copied here. 

flag (f):   
   The object processor flag. You can hook up an IRQ (Level 2) (?) 
   to this bit, which can in turn serve to interrupt the GPU and 
   the 68K (and possibly also the DSP). 
   This can be used to generate HBLANK-like interrupts, although the STOP 
   seldom occurs in the blanking period of the video chip, but 
   much sooner! 



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
10             This is what a branch object looks like:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Phrase #0:

   63      56        48       40       32        24       16       8    3   0
  +--------^---------^-----+---^--------^--------+--------+--+-----^----+---+
  |        unused          |      Link-address   | unused |CC|   VCnt   |011|
  +------------------------+---------------------+--------+--+----------+---+
                               42..........24      23..16 15.14 13...3   2..0
                                    19bits           8bit  2bit 11bits   3bits

   The branch objects are used to compare the current scanline
   with the value stored in the branch object. Depending on the
   branch object's comparison mode, the branch is taken
   either on < == != or >. A taken branch uses the link address
   and continues with the phrase-addressed object found there.
   If the comparison fails, the OP simply examines and handles
   the next object in the list.

   Link-address:
      See the bitmapped object for more infos on the link address.

   VCnt:    
      This is the value you compare the vertical scanline
      counter with (VC). For CC code 10 the operation goes:

      if( object->VCnt < VC)
         goto object->link;


   Condition codes (CC):

       Values     Comparison/Branch
     --------------------------------------------------
        000       Branch on equal            (VCnt==VC)
        001       Branch on less than        (VCnt>VC)
        010       Branch on greater than     (VCnt<VC)
        011       Branch on OP flag (?)      (see OBF and the GPU object)

   [...] HC in the video chip (maybe for
   every scanline (?)), you can branch when the OP detects that it is
   filling the second linebuffer.

   Other theory: CC is 3 bits long and there exists a fifth value:

        100       Branch if on second halfline
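
   To make the layout concrete, here is a sketch in C that packs a branch
   object into a 64 bit phrase and evaluates the three VC comparisons
   (codes 0 to 2). The function and constant names are made up, and the
   comparison directions are the ones guessed above, so treat it as an
   illustration only.

```c
#include <assert.h>
#include <stdint.h>

enum { CC_EQ = 0, CC_LT = 1, CC_GT = 2 };   /* condition codes 000..010 */

/* Build a branch object phrase. link_byte_addr is a byte address. */
uint64_t op_make_branch(uint32_t link_byte_addr, unsigned cc, unsigned vcnt)
{
    assert((link_byte_addr & 7u) == 0);       /* link must be phrase aligned */
    assert(link_byte_addr < (1u << 22));      /* 19 phrase bits: lower 4 MB  */
    assert(cc < 4u && vcnt < (1u << 11));
    return ((uint64_t)(link_byte_addr >> 3) << 24)   /* bits 42..24 */
         | ((uint64_t)cc   << 14)                    /* bits 15..14 */
         | ((uint64_t)vcnt << 3)                     /* bits 13..3  */
         | 3u;                                       /* type 011    */
}

/* Would the branch be taken when the scanline counter reads vc? */
int op_branch_taken(uint64_t phrase, unsigned vc)
{
    unsigned cc   = (unsigned)(phrase >> 14) & 3u;
    unsigned vcnt = (unsigned)(phrase >> 3) & 0x7FFu;
    switch (cc) {
    case CC_EQ: return vcnt == vc;
    case CC_LT: return vcnt >  vc;
    case CC_GT: return vcnt <  vc;
    default:    return 0;   /* other codes depend on OP state, not on VC */
    }
}
```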

      


:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
11                This is what a stop object looks like:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Phrase #0 (1 of 1):

 63       56        48        40       32        24       16       8        0
  +--------^---------^---------^--------^--------^--------^--------^----+---+
  |                            data                                     |100|
  +---------------------------------------------------------------------+---+
   
   data:
      Data is copied into the object status register. 
      The lowest bit can be used to trigger IRQs, the rest of the data
      can be used at the programmer's whim.



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
12.               This is what a bitmap object looks like:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Phrase #0 (1 of 2):

 63       56        48        40       32        24       16       8        0
  +--------^---------^-----+------------^--------+--------^--+-----^----+---+
  |        data-address    |     Link-address    |   Height  |   YPos   |000|
  +------------------------+---------------------+-----------+----------+---+
      63 .............43        42.........24      23....14    13....3   2.0
           21 bits                 19 bits        10 bits     11 bits  3 bits
                                    (11.8)

   data-address:  Pointer to the bitmap      ***DESTROYED BY THE OP***
   link-address:  Pointer to the next object
   height:        Height in pixels
   y-pos:         Vertical position          ***DESTROYED BY THE OP***
   type:          Object type


   data-address:  bits 63-43
      An address is a memory address in terms of phrases. To get the
      byte address you have to shift it up by 3. (In this example,
      to get the data-address you would fetch the upper lword with
      the 68K and do):

         move.l   (a0),d0     ; fetch it  (bits 63-32)
         moveq    #11,d1      ; or some other less lame way
         lsr.l    d1,d0       ; shift it down for phrase address
         lsl.l    #3,d0       ; shift it up by 3 for byte address

   link-address:  bits 42-24
      The link address strings the object list together. So it really
      is a linked list, not just an array. OK, an array would have
      been better and the link could have been a number of phrases
      to skip. It lacks the upper two bits to form a proper full
      24 bit address. This means that objects must reside in the
      lower 4 MB. This also addresses a phrase, not a byte. For
      the byte address shift it up by three.

   height:
      The height of the object is also stored in the first phrase.
      This is the number of pixels an object has in its vertical extent.

   ypos:
      The YPos is predictably the vertical position of the object on
      the screen. The vertical position is the halfline vertical
      position. In video terms the first theoretically possible 
      _visible_ position (depending on your overscanning) will be
      at VDB (see the video documentation).
      Therefore for non interlaced screens this value is Y * 2 + VDB, 
      for interlaced just Y + VDB.

    Theory 1:
      Like on the Falcon the screen is divided into two horizontal
      halflines. Except for really wide screens in excess of 1024
      pixels horizontally, you always stay in the first halfline.
      (That's why it's eleven bits, and the height is only 10 bits.)
      A problem with this theory is, that the Xpos field is 12 bits
      anyway...

    Theory 2:
      This means that in interlace mode this is the "true"
      vertical position on the screen. In non-interlaced
      (non-flicker) modes, you should multiply your Y-Pos by two and
      stuff that into the object.
      (That's why it's eleven bits, and the height is only 10 bits.)

   type:
      Lastly the object type indicates with a 0 (000) that this object
      is a normal non-scaled bitmap object.
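
   The first phrase can be assembled with plain shifts instead of
   bitfields. The following is a sketch in C under the layout shown above;
   the function names are made up, and the YPos helper just encodes the
   Y * 2 + VDB rule for non interlaced screens.

```c
#include <assert.h>
#include <stdint.h>

/* Pack phrase 0 of a bitmap object. data and link are byte addresses. */
uint64_t op_bitmap_phrase0(uint32_t data, uint32_t link,
                           unsigned height, unsigned ypos)
{
    assert((data & 7u) == 0 && (link & 7u) == 0);    /* phrase aligned */
    assert(data < (1u << 24) && link < (1u << 22));  /* 21 and 19 phrase bits */
    assert(height < (1u << 10) && ypos < (1u << 11));
    return ((uint64_t)(data >> 3) << 43)   /* bits 63..43, phrase address */
         | ((uint64_t)(link >> 3) << 24)   /* bits 42..24, phrase address */
         | ((uint64_t)height << 14)        /* bits 23..14 */
         | ((uint64_t)ypos   << 3);        /* bits 13..3, type bits stay 000 */
}

/* Recover the byte address of the bitmap data from phrase 0. */
uint32_t op_bitmap_data_addr(uint64_t phrase0)
{
    return (uint32_t)(phrase0 >> 43) << 3;
}

/* YPos for a non interlaced screen: Y * 2 + VDB. */
unsigned op_ypos_noninterlaced(unsigned y, unsigned vdb)
{
    return y * 2u + vdb;
}
```

   Since the OP destroys data-address and ypos while drawing, such a
   function would be called again for every object on each VBLANK.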


Phrase #1 (2 of 2):

 63       56        48        40       32       24       16        8        0
  +--------^-+------+^----+----^--+-----^---+----^----+---+---+----^--------+
  | unused   |1stpix| flag|  idx  | iwidth  | dwidth  | p | d |   x-pos     |
  +----------+------+-----+-------+---------+---------+---+---+-------------+
    63...55   54..49 48.45  44.38   37..28    27..18 17.15 14.12  11.....0
      9bit      6bit  4bit   7bit    10bit    10bit   3bit 3bit    12bit
                                    (6.4)

   Curiously there seem to be some unused bits in the top half of
   this second phrase. Anyway starting from the left:

   1stpix:           Pixels to skip
   flags  (flag):    How to handle the source data
   index  (idx):     Index into the CLUT
   iwidth:           Width of the image
   dwidth:           Offset to the next line of the image
   pitch:            Increment for the Datapointer
   depth:            Pixeldepth of the bitmap
   x-pos:            Horizontal position of the object


   1stpix:  bits 54-49
      this is a field of 6 bits that contains the number of
      'bits' to skip before fetching the first pixel. This must be
      used whenever your bitmap data isn't phrase aligned.
      Maybe most often used for CLUT modes.
      You get the value you want to write here by calculating:

      pixelindex * bits_per_pixel (f.e. 8 for 256 color mode)


   flags:   bits 48-45
      You can tell the Objectprocessor the way it should
      handle the display data. These are the values you set here:

             Bit3          Bit2          Bit1             Bit0
      ----------------------------------------------------------------
            Release     Transparent  ReadModifyWrite  Horizontal Flip

      Horizontal flip / aka Reflect:      
         Lets the Objectprocessor run its path from the other end 
         of the sprite data, which should effectively flip your 
         sprite data. 
         Ex:
            an eight bit sprite is normally drawn as                

                    01234567
  flipped           76543210
                    ^
                    |
             start at XPOS.


      ReadModifyWrite:  
         The object processor reads the pixel from the line 
         buffer, does something with the bitmap pixel value and the 
         linebuffer pixel value, and stores the result back into the 
         linebuffer.

         For CrY-color the lower byte of the bitmap pixel value is 
         sign extended and added to the lower byte of the 
         linebuffer pixel value, thereby increasing or decreasing 
         (depending on the sign) the intensity of the linebuffer 
         pixel. This is a 'saturating add' meaning that you don't 
         wrap around, but subtractions stick at 0 and additions stick 
         at 255.
         The cry hues (upper byte) are mangled even more strangely, 
         the effect could (with the right values) be like looking 
         through a colored glass (your bitmap object with the 
         RMW-flag set) onto the background (the other bitmap objects 
         below it)
         This might be similar to what happens when gouraud-shading. 
         Refer to the blitter docs.

      Transparent:      
         When the source pixel is zero, this pixel will not be written. 
         This is the way to achieve transparent sprites with the GPU. 
         (Both CLUT and non-CLUT pixels)

      Release:    
         If cleared then the OP 'hogs' the bus for the time it takes to 
         fetch the scanline data of the object. If this bit is set, 
         then the bustime is shared with other processors. If you have 
         lotsa interrupts going, this might be worthwhile.
         Should apparently NOT be set on objects with more than 8 
         bitplanes, probably because then the OP might glitch. 
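
      The saturating add described for the ReadModifyWrite flag can be
      modelled in a few lines of C. This sketch only covers the intensity
      (lower) byte of a CrY pixel; the hue handling is not modelled since
      we do not know it well enough. The function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Add the sign extended bitmap intensity to the linebuffer intensity,
   clamping the result to 0..255 instead of wrapping around. */
uint8_t cry_rmw_intensity(uint8_t linebuf_y, uint8_t bitmap_y)
{
    int delta = (bitmap_y < 128u) ? (int)bitmap_y
                                  : (int)bitmap_y - 256;   /* sign extend */
    int sum = (int)linebuf_y + delta;
    if (sum < 0)   sum = 0;      /* subtractions stick at 0  */
    if (sum > 255) sum = 255;    /* additions stick at 255   */
    return (uint8_t)sum;
}
```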

   index (idx):   bits 44-38
      Index into the ColorLookUpTable (CLUT)
      This information is only used for 1, 2 or 4 bitplane objects,
      to determine the offset in the CLUT to use.

         1 bitplane           2 bitplane       4 bitplane
      -------------------------------------------------------
           iiiiiii           iiiiii0          iiii000

      The value is shifted left once and then used as an index into
      the CLUT. Note that in 2 and 4 bitplane modes not all bits are
      used, because the lower bits are replaced with the pixel value.

      For example in 4-bits-per-pixel mode pixel #7 and an idx value 
      of 64 gives you an index of (64*2)+7 -> 135

      So you preload the CLUT with the colors you want to use, for
      example green at index #241. When you want to display a small
      green arrow on the screen (as a pointer), for example, you set
      your 1 bitplane object to transparent, and the index to 120
      (120*2+1 = 241). When the object processor fetches a set pixel,
      it will write the green value into the linebuffer.
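
      The index calculation can be sketched in C like this. It assumes
      that the pixel value simply replaces the low bits of the shifted
      idx, as described above; clut_entry is a made up name.

```c
#include <assert.h>
#include <stdint.h>

/* CLUT entry selected by an object's idx field and a source pixel value.
   bpp is 1, 2 or 4, the only depths for which idx is said to be used. */
unsigned clut_entry(unsigned idx, unsigned bpp, unsigned pixel)
{
    unsigned base;
    assert(idx < 128u);                            /* 7 bit field */
    assert(bpp == 1u || bpp == 2u || bpp == 4u);
    assert(pixel < (1u << bpp));
    base = idx << 1;                               /* "shifted left once" */
    base &= ~((1u << bpp) - 1u);                   /* low bits come from the pixel */
    return base | pixel;
}
```

      With idx 64 and pixel #7 in 4 bit mode this gives 135, and with
      idx 120 and a set pixel in 1 bit mode it gives 241, matching the
      two examples in the text.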

   iwidth:     bits 37-28
      Tell the OP how many *phrases* to draw in each line. This is 
      the actual number of phrases to draw, not the horizontal offset 
      to index the next line (dwidth). This is probably not just  
            #pixels_to_draw * bits_per_pixel / 64, 
      but rather the number of phrases the object spans. If a 32bit 
      object spans two phrases you should enter a two here.

   dwidth:  bits 27-18     
      The horizontal phrase offset the OP should use to index to the 
      next line. If your data is laid out in consecutive strips of 
      horizontal data like this:

      screen :
         00000000000
         11111111111
         22222222222
         33333333333

      memory :
         00000000000111111111112222222222233333333333

      then this will be just the same as iwidth. But if your data
      is laid out like this:

      00000000000xxxxx11111111111xxxxx22222222222xxxxx33333333333xxxxx

      you should set dwidth to the proper offset so that adding
      dwidth to the phrase-address will bring you to the next line.
      (This might be useful for 'horizontally scrolling' objects).

   pitch (p):  bits 17-15
      If you so desire you can organize your bitmap data in even 
      stranger ways than one would think possible. With this value 
      you control the data-pointer that the OP uses to traverse your 
      bitmap data. This value is added to the data-pointer after the 
      last fetch. If you use a 0 you will always be fetching the same 
      phrase over and over again. Normally you set pitch to 1, to 
      advance through memory contiguously.
      This will come more into play if you want to use Z-buffering
      and/or optimize your screen layouts for blits.

   depth (d):  bits 14-12  
      The number of bits of each pixel. This specifies the rez of the 
      object. You have the choice between direct pixel modes (16 or 
      24/32 bits) and indirect (CLUT) pixel modes. Note that using 
      transparency effectively reduces the number of available colors 
      by one (color #0).

      Values:

         0   1 bit per pixel    2 colors        CLUT
         1   2 bits per pixel   4 colors        CLUT
         2   4 bits per pixel   16 colors       CLUT
         3   8 bits per pixel   256 colors      CLUT
         4   16 bits per pixel  65536 colors    CRY
         5   24 bits per pixel  16 Mio colors   TrueColor
         6   unused
         7   unused

   xpos:    bits 11-0    (-2048 to +2047)
      The horizontal position of the object on the screen (or in the 
      linebuffer if you will).
      Therefore xpos=0 is the leftmost pixel in the linebuffer. If you
      are overscanning (linebuffer (HDB) starts outside the visible
      area of the screen), then you will have some cut off.
      See the video documentation.
      If you have a really big sprite, like f.e. a huge "scrolling"
      background bitmap, you should remember that the data which goes off 
      to either side of the screen still requires memory fetches! Therefore
      it might be wise to change the object definition. 
      Modify your big objects so that only what is seen is drawn.
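
   Putting the fields of the second phrase together could look like the
   following C sketch. The names are made up, the bit positions are the
   ones from the diagram above, and op_phrases_per_line assumes that a
   line starts on a phrase boundary (1stpix = 0).

```c
#include <assert.h>
#include <stdint.h>

/* Phrases needed to hold `pixels` pixels of `bpp` bits each, rounded up. */
unsigned op_phrases_per_line(unsigned pixels, unsigned bpp)
{
    return (pixels * bpp + 63u) / 64u;
}

/* Pack phrase 1 of a bitmap object. xpos is signed (-2048..2047). */
uint64_t op_bitmap_phrase1(unsigned firstpix, unsigned flags, unsigned idx,
                           unsigned iwidth, unsigned dwidth,
                           unsigned pitch, unsigned depth, int xpos)
{
    assert(firstpix < 64u && flags < 16u && idx < 128u);
    assert(iwidth < 1024u && dwidth < 1024u && pitch < 8u && depth < 8u);
    assert(xpos >= -2048 && xpos <= 2047);
    return ((uint64_t)firstpix << 49)      /* bits 54..49 */
         | ((uint64_t)flags    << 45)      /* bits 48..45 */
         | ((uint64_t)idx      << 38)      /* bits 44..38 */
         | ((uint64_t)iwidth   << 28)      /* bits 37..28 */
         | ((uint64_t)dwidth   << 18)      /* bits 27..18 */
         | ((uint64_t)pitch    << 15)      /* bits 17..15 */
         | ((uint64_t)depth    << 12)      /* bits 14..12 */
         | ((uint64_t)xpos & 0xFFFu);      /* bits 11..0, two's complement */
}
```

   For a contiguous 320 pixel wide 16 bit bitmap you would pass
   op_phrases_per_line(320, 16) for both iwidth and dwidth, a pitch of 1
   and a depth of 4.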



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
13.            This is what a scaled bitmap object looks like.
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Phrase #0 (1 of 3):

 63       56        48        40       32       24       16        8    3   0
  +--------^---------^-----+---^--------^--------+--------^--+-----^----+---+
  |       data-address     |    Link-address     |  Height   |   YPos   |001|
  +------------------------+---------------------+-----------+----------+---+
      63 .............43         42..........24   23 ..... 14 13 ..... 3 2.0
            21 bits                  19 bits        10 bits     11 bits  3 bits

   Except for the type, which is different, this is just
   the same as the first phrase of the bitmap (non-scaled)
   object.


Phrase #1 (2 of 3):  This is the same as phrase #1 of the 'bitmapped' object


Phrase #2 (3 of 3):

   63      56        48       40       32        24       16       8       0
  +--------^---------^---------^--------^--------+--------+--------+--------+
  |                  unused                      | remain | VScale | HScale |
  +----------------------------------------------+--------+--------+--------+
                                                   23..16   15...8   7....0
                                                     8bit     8bit    8bit

  remainder:   Keeps the VScale remainder ***DESTROYED BY THE OP***
  v-scale:     Vertical scaling factor
  h-scale:     Horizontal scaling factor


  The scale is a fractional representation, using 3 bits for the integer
  part and 5 bits for the fractional part. Or in ASCII-Graphics:

   76543210 00100000 or 0x20 is 1.0
   iiifffff 00010000 or 0x10 is 0.5

  The remainder is used by the objectprocessor for the vertical scaling,
  as scratch storage. You should initialize it to 0.5 for best results,
  although in a lot of demo-code it's initialized to 1.0.
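
  The 3.5 fixed-point format is easy to get wrong by hand, so here is a
  small C sketch that converts a factor and packs the third phrase. The
  function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a scale factor to the 3.5 fixed-point byte (0x20 is 1.0). */
uint8_t op_scale_from_float(double factor)
{
    unsigned v;
    assert(factor >= 0.0 && factor < 8.0);     /* 3 integer bits */
    v = (unsigned)(factor * 32.0 + 0.5);       /* 5 fractional bits, rounded */
    if (v > 255u) v = 255u;                    /* clamp just below 8.0 */
    return (uint8_t)v;
}

/* Pack phrase 2 of a scaled bitmap object. */
uint64_t op_scale_phrase2(uint8_t remainder, uint8_t vscale, uint8_t hscale)
{
    return ((uint64_t)remainder << 16)   /* bits 23..16 */
         | ((uint64_t)vscale    << 8)    /* bits 15..8  */
         |  (uint64_t)hscale;            /* bits 7..0   */
}
```

  A sprite drawn at double width and normal height, with the remainder
  preset to 0.5, would then be
  op_scale_phrase2(op_scale_from_float(0.5), op_scale_from_float(1.0),
  op_scale_from_float(2.0)).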



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
14.                     The elusive GPU-object
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Phrase #0 (1 of 1):

 63       56        48        40       32       24       16        8    3   0
  +--------^---------^---------^--------^--------^--------^--+-----^----+---+
  |                           data                           |   ypos   |010|
  +----------------------------------------------------------+----------+---+
       63................................14                   13.....3  2..0

ypos:
   When the VC matches the value in ypos, the GPU object is active.
   If all ypos bits are set, then the GPU object is always active.

   The GPU gets an interrupt; it is believed that the OP is not halted 
   because of this action. You might want to stuff some information
   into the unused parts, which the GPU could then read from the OP 
   registers. The GPU can then be used to control OP program flow using 
   OBF ($F00026) and branch object condition 3.

   The whole GPU-object is copied to OB, so that the GPU can examine the
   data part to see which GPU object has triggered the IRQ.



:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
15 You can also look at the objects in terms of C-structs; this is what
   they'd look like.
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

/* DON'T USE THESE BITFIELDS WITH ANYTHING OTHER THAN A
   ***GOOD*** C-COMPILER AND A MOTOROLA PROCESSOR
*/


   #define byte   unsigned char
   #define word   unsigned short
   #define lword  unsigned long
   #define phrase unsigned long long


   typedef struct
   {
       lword   data:21;
       lword   link:19;
       word    height:10;
       word    ypos:11;
       word    type:3;
   } bitmap_obj_phrase_0;


   typedef struct
   {
      word  unused:9;
      word  firstpix:6;
      word  flags:4;
      word  index:7;
      word  iwidth:10;
      word  dwidth:10;
      word  pitch:3;
      word  depth:3;
      word  x_pos:12;
   } bitmap_obj_phrase_1;


   typedef struct
   {
      lword   unused:24;
      word    remainder:8;
      word    v_scale:8;
      word    h_scale:8;
   } scale_obj_phrase_2;


   typedef struct
   {
       lword   unused0:21;
       lword   link:19;
       word    unused1:8;        /* maybe index to register ? */
       word    conditioncode:2;
       word    ypos:11;
       word    type:3;
   } branch_obj_phrase_0;


   typedef struct
   {
       phrase  unused:61;
       word type:3;
   } stop_obj_phrase_0;

   typedef struct
   {
       phrase  unknown:61;
       word type:3;
   } gpu_obj_phrase_0;


   typedef struct
   {
      stop_obj_phrase_0 p0;
   } stop_obj;


   typedef struct
   {
      branch_obj_phrase_0  p0;
   } branch_obj;


   typedef struct
   {
      gpu_obj_phrase_0  p0;
   } gpu_obj;


   typedef struct
   {
      bitmap_obj_phrase_0  p0;
      bitmap_obj_phrase_1  p1;
   } bitmap_obj;


   typedef struct
   {
      bitmap_obj_phrase_0  p0;
      bitmap_obj_phrase_1  p1;
      scale_obj_phrase_2   p2;
      /* need one padding phrase ? */
   } scale_obj;



BUGS:
=====

This might be a bug or not but you should be aware, that the OP is
a high priority bus device, that does not like to be interrupted
by higher priorised devices.
(See: Priorities for more info).

While the OP is walking along its object list and filling the linebuffer,
it is effectively shutting out the rest of the system during that time.
This might not be too convenient, if you have a high frequency interrupt
going (like maybe the DSP playing a Tracker module). 

The RMW-flag is said to be buggy, in that the last pixel of the RMW object
might be corrupted, unless the first pixel of the first following object
is cleared (strange!!)



SMALL DISCUSSION:
================
   Since the object processor walks the object list for each
   scanline, you should consider the following:

   If you have 64 bitmap objects in your object list and a
   vertical rez of 240 lines going and a refreshrate of 60Hz
   the Objectprocessor is pulling

   60 Hz * 240 lines * 64 objects * 2 phrases =  1.8 Mio phrases/s
   ~ 14.7 Mio bytes/s  for the object processor list alone!
      (ca. 14% of the system's bandwidth)


   If you figure you're using 128x128x16bit sprites fully visible,
   you're doing:

   128x128*16bits/64bits = 4096 phrases a sprite
   64 sprites at 60 Hz   = 3840 sprites/s
   yields 15728640 phrases/s or 120 Mbytes/s

   So it is fairly easy to unknowingly saturate the bus with
   a nice object list. (TEST THIS, possibly the OP is smart
   enough to detect, when the scanline is needed by the Video
   chip and stops processing the object list)

   It should be obvious that non-"truecolor" sprites still make
   lotsa sense, when you're using the OP heavily.
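
   If you want to play with these numbers for your own object lists, the
   two estimates above boil down to the following C sketch. The function
   names are made up, and like the text it ignores any cleverness the OP
   might have (clipping, early outs).

```c
#include <assert.h>
#include <stdint.h>

/* Phrases/s spent just re-reading the object list on every visible line. */
uint64_t op_list_phrases_per_sec(unsigned hz, unsigned lines,
                                 unsigned objects, unsigned phrases_per_obj)
{
    return (uint64_t)hz * lines * objects * phrases_per_obj;
}

/* Phrases/s spent fetching pixel data for `count` sprites of w x h pixels. */
uint64_t op_sprite_phrases_per_sec(unsigned hz, unsigned count,
                                   unsigned w, unsigned h, unsigned bpp)
{
    uint64_t bits = (uint64_t)w * h * bpp;
    assert(bits % 64u == 0);               /* whole phrases only */
    return (uint64_t)hz * count * (bits / 64u);
}
```

   Multiply a result by 8 to get bytes/s.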

   It would have been better in our opinion, if Atari had used a
   small 2-Kbit hitbuffer (or single bit Z-Buffer) and reversed
   the object order, so that the nearest object comes first and
   the background last in the object list.

   With such a slightly more complicated scheme, the OP could
   run at a rather constant:

      hrez * vrez * refresh * average_bits_per_pixel
      ---------------------------------------------- phrases/s
               64


   If it is true that the OP has on average one scanline time to
   prepare the linebuffer, we can do a quick estimate how complex
   such a line can be:


   NTSC
         30 Hz refresh rate (2 refreshs a 1/60s)
         525 lines frame

   Therefore 525*30=15750 lines/s 

      13.3 mio phrases/s / 15750 lines/s ~ 844 phrases / scanline
   or ~ 3380 truecolor pixels / scanline

   this means that on a 320 pixel display you can have approximately
   ten layers of overlapping truecolor parallax (sans sprites)!!
   Or if you have a 320 pixel background, you can have about 80 
   32 bit wide truecolor sprites on the same scanline.

   It's doubtful that you'll reach these limits...

   Since the OP with a 320x200 rez is pulling data only on 200 lines
   of 525 scanlines, you can use up (without producing display errors)
   only ~40% of the Jaguar's bus resources this way. Nice!
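
   The per-scanline budget can be recomputed for other timings with a
   sketch like the one below. It assumes, as the estimate above seems to,
   that a "truecolor" pixel is 16 bits wide (four pixels per phrase); the
   function names are made up.

```c
#include <assert.h>

/* Phrases the bus can move during one scanline (rounded down). */
unsigned op_phrases_per_scanline(double bus_phrases_per_sec,
                                 unsigned lines_per_frame,
                                 unsigned frames_per_sec)
{
    assert(lines_per_frame > 0u && frames_per_sec > 0u);
    return (unsigned)(bus_phrases_per_sec /
                      ((double)lines_per_frame * frames_per_sec));
}

/* How many full-width layers of 16 bit pixels fit into that budget. */
unsigned op_layers_per_scanline(unsigned phrases, unsigned screen_pixels)
{
    unsigned phrases_per_layer = (screen_pixels * 16u + 63u) / 64u;
    assert(phrases_per_layer > 0u);
    return phrases / phrases_per_layer;
}
```

   Feeding it 13.3e6 phrases/s, 525 lines and 30 frames/s reproduces the
   NTSC estimate, and a 320 pixel display then comes out at ten layers.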

   Soon you'll find out that the designers did not give the branch 
   object a second link phrase, so that it would seem that in effect 
   you're forced to assemble your OP-list in one contiguous memory
   block anyway. Or you would be restricted to using branch objects 
   only at the beginning of your OP-List, like this:

              +------------+
             /              \ 
            /                v
         branch....branch....stop
                      |
                      v
                    bitmap ----> scaled  ----> stop

   BUT, you can also deploy double branch objects, one acting as a Bcc
   the other as a BRA, to get a two way connection.


NEEDED STUFF:
   Need to document the logic setting up objects, that cross
   boundaries (especially the scaled bitmaps)

Nat! (nat@zumdick.rhein-main.de)

Klaus (kkp@gamma.dou.dk)
