Scott's graphics & demo coding notes
$Id: index.shtml,v 1.15 2001/08/28 18:18:27 sdlpci Exp $

If you have any info or suggestions, send me mail!
demo@umlautllama.com



1.1 Basics

This document was originally intended to be notes about a project I had worked on with Rob Reay from 1996 through 1998, but as I've been writing it, I've been re-implementing the code for a new project I'm working on for Zendragon Software. The primary goal of this document is to elaborate on some of the techniques and effects used in those projects.

Why code a demo? Why work on anything like this at all? Simple. Because it's there. Okay, that was a trite, sucky answer. But seriously, I originally worked on this because I wanted to create a platform so that I could easily try out different graphic hacks. After Rob & I implemented the initial version of this project, we had just that. I was able to try out algorithms I had thought of (Palette Rotations, Static) or techniques I had discovered on the web (Glassmap).

It was also fun to work on, because we weren't trying to be accurate to anything. We would try out an effect, and even if it didn't do what we were expecting it to do, if it looked cool or interesting, we kept it around. Since there was no real pressure to get this completed to any particular spec, or to any particular timescale, we were free to do what we wanted, when we wanted to do it. (My favorite way to work. :)

Most of this information and code is being integrated into my Mujaki graphics engine.

Some conventions used in this document:
All examples of code or pseudo code are given in a preformatted monospaced font like this:


		if (this is an example)
		{
		    do some stuff in here
		}

When ranges of numbers are being expressed, the standard range identifiers will be used: [..] for inclusive, (..) for exclusive. For example, the range 10, 11, 12, 13, 14, 15 can be signified with: [10..15] or (9..16) or [10..16) or (9..15].




1.2 C vs ASM coding and Optimizations

There's a tradeoff if you code in ASM. Sure, your programs have the potential of being much quicker and more efficient, but since (unlike Motorola) Intel never set up a standard assembly language for their chips, most assemblers for the PC have different syntaxes... so porting from one assembler to the next is annoying at best.

Also, in most cases, a compiler will do a better job of building efficient code than you can, and besides, you gain portability. If you use GCC, EGCS, or DJGPP (excellent free compilers), try out "-funroll-loops" and "-finline-functions". Try "-fomit-frame-pointer" if you feel like taking risks. ;) Check out your compiler's options for different optimization levels. Try them all out, and note their performance. (Remember, this is "Computer Science"! ;))

Personally, I always use "-Wall" to show me ALL warnings, and "-pedantic" to be really stingy with warnings and syntax. These have shown me lots of errors and possible problems I wouldn't normally have seen. I also never consider a module I write "finished" until it compiles without ANY warnings.

If you do end up using ASM for some core routines (blits, etc) you might want to document the hell out of those routines, and also do as much in C or C++ as you can, to let it be more portable, maintainable, and reusable.


Generally, adds are quicker than multiplies... at least on Intel x86 processors, this is true. So wherever possible, when it makes sense, replace (X*2) with X+X. When the multiplier gets to be more than 4 or so, stick with the multiply.

Shifts are almost always quicker than multiplies and divides. A handy trick is to remember that X/4 is the same thing as X>>2. Also:

		X/2 = X>>1         X*2 = X<<1

		X/4 = X>>2         X*4 = X<<2

		X/8 = X>>3         X*8 = X<<3
If you wanted to go crazy... you'd think this would work: (but it doesn't)
		X/3 = X>>1 - X    

		X/5 = X>>2 - X

		etc.
But in all actuality:
		X>>1 - X    = -X/2	(not X/3)

		X>>2 - X    = -3/4 * X	(not X/5)

	    However, these are true (note that in C, the shift operators bind looser than + and -, so you would actually write these as (X<<1) + X):

		X<<1 + X    = 3X

		X<<2 + X    = 5X

		etc.

Similarly, the modulo (%) operator is sometimes computationally intensive, so wherever possible, replace it with a bitwise AND. (This trick only works when the right-hand side is a power of two, and in C, only for non-negative values.)

		Y = X % 4      becomes      Y = X & 0x03

		Y = X % 8      becomes      Y = X & 0x07

		etc.
You might want to re-arrange arbitrary bounds in your code to take advantage of this. For example, break up a circle into 256 steps instead of the standard 360. That way, instead of something like:
		bounds_checked_angle = (angle + offset) % 360;
you use:
		bounds_checked_angle = (angle + offset) & 0xff;
It's much quicker.


When you're doing image manipulations, and you want to reference every pixel in an object, you might be tempted to do something like this:

    
    for ( x = 0  ;  x < image.width  ; x++)
    {
	for ( y = 0  ;  y < image.height  ; y++)
	{
	    // process pixel x,y

	    image[ (y * image.width) + x ] = foobar;
	}
    }

First of all, I should say that this will be explained in more detail in Section 2.1, below.

Obviously, here, for every pixel in the array, there is an extra multiply. If you look at this carefully, you will notice that the y offset only needs to be computed once per row. So, what should be done here is a swap of the X and Y loops, moving that multiply out so that it only gets done once per row. (More about exactly what is going on here with this offset below in section 2.1.)

But that's not all we can do. If you look at it again, you'll notice that we can just add on the value of image.width once per row, rather than unnecessarily re-multiplying it each time, giving us:

    int y_offset = 0;
    
    for ( y = 0  ;  y < image.height  ; y++)
    {
	for ( x = 0  ;  x < image.width  ; x++)
	{
	    // process pixel x,y

	    image[ y_offset + x ] = foobar;
	}

	y_offset += image.width;  // rather than (y_offset = y * image.width)
    }

So, in essence, we just replaced one multiply per pixel with one add per row. Quite a savings. Some REALLY GOOD optimizers might do some of this for you... but why rely on an optimizer, when later on you might be using a language (like RSI's interpreted IDL language, for example) that does no optimization at all.

Just be sure to comment your code, so it makes sense when you come back to it later. Some optimizations make sense when you do them, but the code ends up looking like a tangled mess of punctuation later. (Especially if you're coding in Intercal.)




1.3 Lookup Tables

Wherever possible, replace computationally intensive functions with lookup tables. Two specific places that I use this, are for Sine and Random number functions.

Usually, you will only need to find the sine of a whole degree, or perhaps a half-degree. Therefore, with a table (either precomputed into the source code, or computed at runtime) of either 360 or 720 elements, you can reduce the expensive math function sin(angle) down to a lookup table macro:

#define SIN(angle) sin_table[(angle)%360]
and with a little simple math, you can use the same table for cosine as well, since cos(x) = sin(x + 90°):
#define COS(angle) sin_table[((angle)+90)%360]

Likewise, at runtime, you can create a table of a few thousand or so random numbers (if you're concerned with the time it takes to compute a new random number), then reference it similarly. If you're really worried about the lack of randomness, then make a new random number table occasionally. You might want to do this for games. For graphics demos, it's not really necessary.

int rnd_pos = 0;
#define RND_MAX (1000)
#define RND_NUM() (rnd_table[rnd_pos = (rnd_pos + 1) % RND_MAX])
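
Filling the table at startup might look like this. A minimal sketch, assuming rand() from <stdlib.h> is a good-enough source:

	#include <stdlib.h>

	int rnd_table[RND_MAX];

	void init_rnd_table(void)
	{
	    int i;

	    for (i = 0; i < RND_MAX; i++)
	    {
		rnd_table[i] = rand();    // or any other generator you like
	    }
	}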

One problem with this method is that if you try to reference negative angles, (-90) for example, you will go outside the bounds of the lookup table, since C's % operator can return negative results. This is bad. What I've done to compensate for this is to add in a very large multiple of 360, like so:

#define BIG_360_MULTIPLE (360*40)
And then replaced the SIN and COS macros with these:
#define SIN(angle) sin_table[((angle) + BIG_360_MULTIPLE)%360]
#define COS(angle) sin_table[((angle) + BIG_360_MULTIPLE + 90)%360]
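
Building the table at runtime is straightforward. A minimal sketch, assuming a global table of 360 doubles and integer-degree angles:

	#include <math.h>

	double sin_table[360];

	void init_sin_table(void)
	{
	    int angle;

	    for (angle = 0; angle < 360; angle++)
	    {
		// the C library sin() wants radians, not degrees
		sin_table[angle] = sin((double)angle * 3.14159265358979 / 180.0);
	    }
	}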




2.1 Bitmaps and Blitting

This section will be all about 'Bitmaps' and a technique called 'Blitting'. There are two ways of getting image data out to the screen: Raster Graphics and Vector Graphics. Raster graphics are what you're using right now to look at this document. You have rows and rows of pixels that are either on, off, or somewhere in between. Vector graphics instead store lines, or vectors, from one point on the display tube to another. Video games like "Tempest" and "Asteroids" use vector graphics. Older oscilloscopes also use a vector system for their display.

So, what we'll cover here are Bitmaps and Raster Graphics. Basically, a bitmap is a section of memory that has the same or similar properties as the region of memory on your video card, or the region of memory used by your graphics library of choice. Even though the 'local' buffer is just a huge contiguous block of memory (i.e., for 320x200 at 8 bits, paletted, it's 64,000 bytes), think of it as a two-dimensional array. There is one element of the array for each pixel being displayed on the screen. For a paletted video mode, it's usually a single 8 bit byte for each pixel; for a '32 bit truecolor' mode, it's a 32 bit value at each pixel. For argument's sake, I'm using the following terminology, borrowed from x86 asm:

The other part of this equation is how you get the image data from this 'local' buffer into the video hardware or graphics library. The process by which you take this chunk of memory and shove it into said hardware or library is called "Blitting". It can be done with one x86 asm opcode, or a full loop. (For a definition of "blitting", check out the entry for 'blit' in the Jargon File.)
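
As a minimal sketch of the looped form (the function shape and the 'pitch' parameter are my own conventions here, not any particular library's API), blitting an 8-bit local buffer into video memory might look like:

	#include <string.h>

	// copy an 8-bit local buffer into the video buffer, row by row.
	// 'pitch' is the byte width of one row of video memory, which
	// may be larger than the visible width.
	void blit(unsigned char *local, unsigned char *video,
	          int width, int height, int pitch)
	{
	    int y;

	    for (y = 0; y < height; y++)
	    {
		memcpy(video + (y * pitch), local + (y * width), width);
	    }
	}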



2.1.1 Data Structures



2.1.2 Indexing (2d and 1d)



2.1.3 Shortcuts



2.1.4 Blitting




2.2 Bitmapped Fonts



2.2.1 Alpha channel



2.2.2 Pre-render bitmaps




2.3 Timing Lists, and Movement Control

I first used this method for event queueing back in 1995-6 for my PLFi project, for my independent study in Computer Graphics. Since then, I've re-written it from memory for the current project. I have no problems re-using code; in fact, part of the PPM loader for the Extrapolation demo code above was originally written for a 1994 project from my Graphics II class (a 3-D object editor for AmigaDOS), and has been used for at least two other projects since then. I just felt that the original code from the PLFi project was very specific to that project, and that I could re-write it to be more generic, which is exactly what I have done.

The files in the 2-d engine project relating to this are:

	    event_list.c	the generic event list handler code
	    event_list.h	protos & data structures
	    events.h		application specific event #define's
	    timing.c		handles time-checks & pops to-do events
	    timing.h		defines the timing resolution



2.3.1 Timing Event Lists

The event list as I designed it is really two linked lists woven together. There are two different types of elements in the list: the EV_TIME_EVENT element and the EV_EVENT element.

EV_TIME_EVENT

	typedef struct _ev_time_event {
	    struct _time time;            // the time of these events
	    EV_EVENT * events;            // the events for this specific time
	    struct _ev_time_event *next;  // next time node  
	} EV_TIME_EVENT;

EV_EVENT

	typedef struct _ev_event {
	    long id;		// what kind of event is it?
	    long l1;		// misc long parameter
	    long l2;		// misc long parameter
	    void *d1;		// misc data pointer parameter
	    void *d2;		// misc data pointer parameter

	    struct _ev_event *todo;  // next event todo node  
	    struct _ev_event *next;  // next event node  
	    
	} EV_EVENT;

There is also a set of associated lists here. There is the primary EV_TIME_EVENT timing list, containing the time events from zero-time to end-time, and there is the to-do EV_EVENT list, containing the events queued up to be processed.
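
As a sketch of how a new event might be threaded into the timing list (time_cmp() here is a hypothetical comparison helper returning <0, 0, >0 like strcmp(); this is not the actual event_list.c code):

	#include <stdlib.h>

	// thread a new event 'ev' into the timing list at time 't'.
	// returns the (possibly new) list head.
	EV_TIME_EVENT *
	insert_event(EV_TIME_EVENT *list, struct _time t, EV_EVENT *ev)
	{
	    EV_TIME_EVENT **walk = &list;

	    // find the first time node at or after 't'
	    while (*walk && time_cmp((*walk)->time, t) < 0)
		walk = &(*walk)->next;

	    // no node for this exact time yet?  make one.
	    if (!*walk || time_cmp((*walk)->time, t) != 0)
	    {
		EV_TIME_EVENT *tn = malloc(sizeof(EV_TIME_EVENT));
		tn->time   = t;
		tn->events = NULL;
		tn->next   = *walk;
		*walk      = tn;
	    }

	    // push the event onto this time node's event list
	    ev->next = (*walk)->events;
	    (*walk)->events = ev;

	    return list;
	}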


Populating the event list is an entirely different problem altogether. For my old PLFi project, I wrote my own parser to read in the ASCII text datafiles, which built up the lists as needed. For the demo, I had some events hardcoded in with the source code. For the 2d engine, I plan on having a two-step process.

Step One: Read in data from ASCII text files using a Yacc/Lex grammar parser. This checks for errors, as well as building up the data structures to be used. Once everything loads in without any errors, the data structures are saved out in binary form.

Step Two: The displayer executable will read in these pre-parsed, pre-error-checked binary files, and re-construct the data structures as built in the first step.

Doing it this way ensures that the data cannot be looked at easily by prying eyes, and since the data is pre-parsed, we know it will work fine at runtime. During testing, we can eliminate the save step, and parse in and run the file live.



2.3.2 Movement control & roundoff




3.1 2-D Polygon routines



3.1.1 Lines - Horizontal and Vertical



3.1.2 Lines - Arbitrary



3.1.3 Polylines



3.1.4 Filled Rectangles



3.1.5 Filled Trigons




3.2 Flood Fill algorithms

The act of filling a region with a color can be tricky. At a quick glance, it seems like it should be a trivial problem, but you will quickly notice, if you start to think about it, that it is not. Here are two potential solutions to the problem.

3.2.1 Recursive Flood Fill

This one comes directly out of Computer Graphics: Principles and Practice. It works well, but since it recurses for each pixel, it can sometimes use up all of your stack space and crash your program. A simple 320x200 image could use 8 or more megabytes of stack space just to flood fill. It works, but it is memory intensive.

The basic algorithm is this:

    void _floodfill(x, y, oldcolor, newcolor)
    {
	if ((x,y) is beyond image bounds)
	{
	    return;
	}

	if ( the pixel at (x,y)  ==  oldcolor )  /* conditional */
	{
	    set the color at (x,y) to newcolor;
	    _floodfill(x, y+1, oldcolor, newcolor);
	    _floodfill(x, y-1, oldcolor, newcolor);
	    _floodfill(x+1, y, oldcolor, newcolor);
	    _floodfill(x-1, y, oldcolor, newcolor);
	}
    }

And to kick it off: (for example)

    _floodfill(mouse_x, mouse_y, GetColorAt(mouse_x, mouse_y), new_color);

For every adjoining pixel which is the same color as what was initially passed in, it will set that pixel to the new color. And as you can see, for each of the four directions off of that pixel, it will call itself if the neighboring pixel is still the original color.
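
Fleshed out into real C, it might look like this; a sketch assuming hypothetical get_pixel()/set_pixel() helpers and global image dimensions (my names, not any particular library's):

	void _floodfill(int x, int y, int oldcolor, int newcolor)
	{
	    if (oldcolor == newcolor)      // otherwise we'd recurse forever
		return;

	    if (x < 0 || y < 0 || x >= image_width || y >= image_height)
		return;

	    if (get_pixel(x, y) == oldcolor)
	    {
		set_pixel(x, y, newcolor);

		_floodfill(x, y+1, oldcolor, newcolor);
		_floodfill(x, y-1, oldcolor, newcolor);
		_floodfill(x+1, y, oldcolor, newcolor);
		_floodfill(x-1, y, oldcolor, newcolor);
	    }
	}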

One variation is to change the border conditional to check for a boundary color instead of a contiguous color. That way you could have a "fill to a different color as a border" rather than a "fill all of the selected color".



3.2.2 Pat's Algorithm

Thanks to Patrick Stein for this algorithm!

To do a flood fill of a region, the first thing that we need to do is to create some data structure to store the region. I am going to use a simple linked list where each node contains information about a single horizontal line that is in the area of interest.

    typedef struct Extent {
	unsigned int x1;
	unsigned int x2;
	unsigned int yy;

	unsigned int checkedNeighbors;

	struct Extent* next;
    } Extent;

This structure stores the two horizontal coordinates and the vertical coordinate of the horizontal line in the area of interest. It also stores a flag to see whether we have already checked the space above and below this line. And, it keeps track of the next item in the linked list.

Now, here is some pseudo-code for flood filling an area, starting with a given point (sx,sy).

First, we will prime the pump.

    Empty the list of Extents.

    Keep going left until you hit a border to find x1.
    Keep going right until you hit a border to find x2.

    Add (x1,x2,sy) to your list of Extents (with the
	``checkedNeighbors'' field set to false).

Borders are places where two adjacent pixels differ significantly enough. For most purposes, differing at all is differing enough.
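
In C, the two scans might look like this. A sketch, again assuming a hypothetical get_pixel() helper, and treating any pixel that differs from the seed color as a border:

	// scan left and right from (sx,sy) to find the extent [x1..x2]
	void find_extent(int sx, int sy, int seedcolor, int *x1, int *x2)
	{
	    int x;

	    for (x = sx; x > 0 && get_pixel(x - 1, sy) == seedcolor; x--)
		;
	    *x1 = x;

	    for (x = sx; x < image_width - 1 && get_pixel(x + 1, sy) == seedcolor; x++)
		;
	    *x2 = x;
	}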

Then, we will keep exploring regions that are above or below regions we have already found, until we have checked all of the neighbors of all regions of interest. The ``Check pixels in the neighboring row'' step here is broken out below.

    while there are extents with checkedNeighbors set false
	Pick an Extent with unchecked neighbors.

	Set yy to Extent's yy

	Set sx to Extent's x1
	Set sy to Extent's yy-1
	Check pixels in the neighboring row

	Set sx to Extent's x1
	Set sy to Extent's yy+1
	Check pixels in the neighboring row

	Set the Extent's ``checkedNeighbors'' to true.

To check pixels in the neighboring row, we have to loop through the whole horizontal extents of the region we are trying to check. But, we can fast-forward through parts that we already know are in the region of interest.

    while sx is less than or equal to Extent's x2
	If pixel at (sx,sy) matches (sx,yy)
	    If (sx,sy) is already accounted for in the Extents list
		Set x2 from the x2 in the Extent containing (sx,sy)
	    Otherwise
		Go left (in sy's row) until you hit a border to find x1.
		Go right (in sy's row) until you hit a border to find x2.
		Add (x1,x2,sy) to your list of Extents (with the
		``checkedNeighbors'' field set to false).
	    Set sx to x2 + 1

	Increment sx

The reason we can set sx to x2+1 and still increment sx is because we know that part of the region of interest ends at x2. This means that x2+1 is *not* in the region of interest. The next potential pixel for the region of interest is x2+2.

And, that does it for finding the region. Now, you can go through and do what you want with those regions. You may wish to keep the list of Extents sorted so that it is a little quicker to look for things in the list and a little simpler to use the list once you're done with it.

This algorithm will never check any pixels that are outside of the border region. It will check isolated pixels inside the region of interest four times, but it will only check true border pixels two times at the most. And, it can glide over areas that have already been found. There are algorithms that do fewer checks but require more memory. I think this one is a good trade off.




3.3 3-D manipulations and basic Matrix Algebra




4.1 DVF (Digital Video Feedback) effects




4.2 PET (Pixel Enhancement Technology) effects




4.3 Laserfont & Scatter

I was watching a laser draw some text (in the movie "Real Genius", I think) and realized a way to achieve a similar effect.

First, we need to draw text. We'll use a very simple font. The one in the demo source code was designed on an 8x8 grid, with no more than five lines per character. Some letters are pretty crude, but it ends up looking pretty neat and stylized.


Use a DVF with a simple fade, and draw most of the text string each frame, finishing the string on the next frame. For added realism, scatter the beam a little by adding in a random value to each vertex of each character. Plus or minus one pixel should be enough. Pick a random number between 0 and 2, inclusive, then subtract one from it to give you a value in the range [-1..+1]. A pre-generated random number table will really accelerate the code.
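
The jitter itself is tiny; a sketch, reusing the RND_NUM() table macro from section 1.3:

	// scatter one vertex by [-1..+1] pixels on each axis
	jittered_x = vertex_x + (RND_NUM() % 3) - 1;
	jittered_y = vertex_y + (RND_NUM() % 3) - 1;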

For added effect, connect the last vertex in one character with the first vertex in the next character with a dim line. Also, varying the intensity of the "laser" lines as they're drawn will add to the effect, producing a shimmering look to the text.




4.4 Glassmap routine




4.5 Posterization (Segmentation)


This is mainly a palette change on a paletted video mode, or a simple image color modification on non-paletted video modes. The "Posterization" filter in Adobe Photoshop and "Segment" in the ImageMagick toolkit accomplish the same effect.

Plotted out, the raw input intensities form a smooth ramp, while the resulting intensities form a staircase. All you basically do is quantize the input intensity: the output is the input minus (input % step), where step is (number of total intensities)/(number of steps). For this example, there are 10 steps, so if a value in the original image is 0-9 it gets the result of 0. If the value in the original is 10-19, it gets the value of 10, and so on. (The example mandrill image showed the result of a four-level posterization.) Each of the three component colors (red/green/blue) can be done independently, to improve the overall number of colors while still leaving the image segmented.
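
A minimal sketch of the per-value math, assuming 8-bit [0..255] intensities:

	// quantize one 8-bit intensity down to 'steps' levels
	unsigned char posterize(unsigned char value, int steps)
	{
	    int size = 256 / steps;     // width of one step (integer division)

	    return (unsigned char)(value - (value % size));
	}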

Related things would be to reduce the number of palette colors down to say 4, then figuring out the four best fit colors. This is easiest to do with a grayscale image.




4.6 Pixelization (Mosaic)


This one is relatively easy to do, and there are actually two different ways to do it. (The example images showed a mosaic size of 5 and of 10.) The "Mosaic" filter in Adobe Photoshop accomplishes the same effect.

For pixel(x,y) in the result image,
take the value of the original image's pixel at (x - (x%N)), (y - (y%M)),
where N and M are the size of the resulting "pixel".

The other way to do this is to average all of the pixels in the N by M region of the original image corresponding to the pixel in the result image. As would be expected, this is MUCH slower, but produces a more accurate result.

A speedup would be to compute a single row, then copy that row M-1 more times within the result. There's no reason to re-compute those lines, as they will be identical anyway. This improves the speed drastically. (Try it both ways, and check out the results.)
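
A sketch of the first (point-sampled) method, using the flat-buffer layout from section 1.2; the function shape is my own:

	// point-sampled mosaic: every pixel in an N-by-M cell takes the
	// value of that cell's top-left corner pixel.
	void mosaic(unsigned char *source, unsigned char *result,
	            int width, int height, int N, int M)
	{
	    int x, y;

	    for (y = 0; y < height; y++)
	    {
		int src_row = (y - (y % M)) * width;
		int dst_row = y * width;

		for (x = 0; x < width; x++)
		{
		    result[dst_row + x] = source[src_row + (x - (x % N))];
		}
	    }
	}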




4.7 Blur / Low-Res & Bilinear Interpolation


As would be expected, there are also many ways to do this.

The first way to do this is to actually average a bunch of pixels in the source into the current pixel. This method is quite computationally intensive, but generally looks much better, and is more accurate. You can either average all 9 pixels in the original (the bordering pixels, and the one under the current position), just the vertical and horizontal neighbors (5 pixels), or just the diagonal neighbors (5 pixels). I will refer to these using cardinal direction notation in the comments below. (North, South, etc.)

    for ( y = 1  ;  y < image.height-1  ; y++)
    {
	for ( x = 1  ;  x < image.width-1  ; x++)
	{
	    // note: the loops skip the border pixels, so the
	    // neighbor reads below always stay in bounds

	    result[x][y] = ( source[x+0][y+0] +     // current position

			     source[x+0][y-1] +     // North
			     source[x+0][y+1] +     // South
			     source[x-1][y+0] +     // West
			     source[x+1][y+0] +     // East

			     source[x-1][y-1] +     // North West
			     source[x+1][y-1] +     // North East
			     source[x+1][y+1] +     // South East
			     source[x-1][y+1]       // South West
			    ) / 9;
	}
    }

As you can see, this will take quite a bit of time to do. There is a lot of math involved: 9 adds and a divide for EVERY pixel. For a low-resolution 320x240 display, that would be (320 * 240) * 9 adds, or 691,200 adds alone. Yow!

You might want to try other effects, like selecting pixels two pixels away instead of direct neighbors, or not including the current position in the average (which brings the pixel count to 8 or 4, letting you use a standard shift instead of a divide for the average). Experiment... See what you like!


The other is Low-Res interpolation. One of the least accurate ways of doing this would be to basically pixelate the image as above, but instead of "filling" the squares with the solid value, fill them with values interpolated as they progress along X and Y. (Make a lower resolution version of the image, then interpolate the missing pixels in it.) An example is shown below. "A", "B", "C" and "D" are copied from the same coordinates in the original image. All of the pixels in between are interpolated.

    A                  (A*0.75)+(B*0.25)  (A*0.50)+(B*0.50)  (A*0.25)+(B*0.75)  B
    (A*0.75)+(C*0.25)                                           (B*0.75)+(D*0.25)
    (A*0.50)+(C*0.50)          ... etc ...                      (B*0.50)+(D*0.50)
    (A*0.25)+(C*0.75)                                           (B*0.25)+(D*0.75)
    C                  (C*0.75)+(D*0.25)  (C*0.50)+(D*0.50)  (C*0.25)+(D*0.75)  D

Naturally, this will have to be done for R,G, and B, or whatever component colors there are in the colorspace you're using. As would be expected, the following optimizations can be made:

      (G*0.50)+(H*0.50)   becomes   (G+H)/2       or  (G+H)>>1

      (G*0.75)+(H*0.25)   becomes   (G+G+G+H)/4   or  (G+G+G+H)>>2

      etc...
You can also interpolate when you enlarge an image beyond a 1:1 image-pixel-to-screen-pixel ratio. Game machines & video cards will do this in one or two directions. They call it "bilinear" or "trilinear" interpolation.

Another way this can be done is in separate passes. It's more time intensive, and doesn't really offer much over just straight math, but the code is clearer. The easiest way to do this algorithmically is to first interpolate between A and B along every horizontal. With this, you'll end up with horizontal stripes of interpolated data, with blank space between them. Then interpolate the vertical segments that are unfilled. This is pure bilinear interpolation.

I do this in the FMV source, in the file sample.c, at the bottom:


/*
 *  Do a bilinear interpolation
 *
 *   A . . . B
 *   . . . . 
 *   . . . .
 *   . . . .
 *   C       D
 *
 *  first determine the A..B value for every row. 
 *  then do the A..C values for every column.
 */
void
Image_Upsample(
        IMAGE * img,
        int amount
)
{
    IMAGE * newimg;
    PIXEL p1, p2;
    int x,y,w;
    int y1,y2;

    int a = amount -1;

    // range check first...
    if (!img) return;
    if (!img->data) return;
    if (!img->height || !img->width || !img->bits) return;

    // pass one .. do horizontals...
    for (y=0 ; y < img->height-a ; y += amount)
    for (x=0 ; x < img->width-a ; x += amount)
    {
            p1.r = img->data[y*img->width+x].r;
            p2.r = img->data[y*img->width+x+amount].r;

            for (w=1 ; w < amount ; w++)
            {
                img->data[y*img->width+x+w].r =
                img->data[y*img->width+x+w].g =
                img->data[y*img->width+x+w].b = (int) (
                    ((float)p1.r) * (float)(amount - w)/(float)(amount) +
                    ((float)p2.r) * (float)(         w)/(float)(amount)
                    );
            }
    }

    // pass two .. do verticals...

    for (y=0 ; y < img->height-a ; y += amount)
    {
        y1 = y*img->width;
        y2 = (y+amount)*img->width;

        for (x=0 ; x < img->width-a ; x++)
        {
                p1.r = img->data[y1+x].r;
                p2.r = img->data[y2+x].r;

                for (w=1 ; w < amount ; w++)
                {
                    img->data[y1+x+(w*img->width)].r =
                    img->data[y1+x+(w*img->width)].g =
                    img->data[y1+x+(w*img->width)].b = (int) (
                        ((float)p1.r) * (float)(amount - w)/(float)(amount) +
                        ((float)p2.r) * (float)(         w)/(float)(amount)
                        );
                }
        }
    }
}



4.8 Image Enhancement & Extrapolation

This is a neat one. My source for this is here. It's basically an implementation of the Interpolation/Extrapolation technique as found at SGI's old Grafica Obscura site. It's a simple technique that lets you do all sorts of image processing using the same procedure (brightness, contrast, tint, sharpness, saturation). The "Unsharp Mask" in Adobe Photoshop uses this technique.

You need to have two images to create the final image. One is the original, the other is the "degenerate" image. What you basically do is have the degenerate be the reverse extreme of what you want to do. To sharpen, use a blurred degenerate, to increase brightness, use a black degenerate, etc.

Then, for each pixel:


	dest_pixel = (1.0 - alpha) * degen + alpha * orig


	degen   is the corresponding pixel on the degenerate image

	orig    is the corresponding pixel on the original image

Where alpha is a value from 0.0 to 1.0 for interpolation, or greater than 1.0 for extrapolation.

So if you want to sharpen an image, blur the image out to be the degenerate image, then use an alpha of 2.0 or greater. There are demonstration images at the SGI site showing results from using different values. I'll eventually have my own demo images here once I write a simple test app.

Why would you want to do this instead of some other technique? A lot of times it's easier to create the degenerate case for an image rather than diving in and generating the enhanced image. This is definitely the case (as I have no idea how to do it otherwise) for sharpening and saturation enhancements. You can also do multiple enhancements at once: if the degenerate is a blurred grayscale version of the image, the extrapolated image will be sharpened and more saturated. Don't think about it too long. It'll start to rot your mind when you realize that it really does work. It's odd.

Since you're probably going to be using this for realtime effects, you're probably going to want to optimize the code, at least to reduce the number of floating point operations.

One optimization I've thought of is to create a two-dimensional lookup table. The rows of the array are indexed by the original image pixel value; the columns by the degenerate image pixel value. If we assume that the image has an 8 bit range [0..255] per color per pixel, that red, green, and blue are 8 bits each, and that the manipulation is the same for all three colors, then we only need one of these tables for the entire image.
A 256x256 array has 65,536 elements in it. That means that each time the alpha value changes, you need to do just 65,536 calculations; then, for each color of each pixel, it's just a memory dereference rather than a calculation. Obviously, if the source image is small enough, this isn't necessary; e.g., a 320x200 image would need only about three times more calculations than the table itself would require.
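
A sketch of building that table, with extrapolated results clamped back into [0..255]:

	unsigned char lut[256][256];

	// rebuild the table whenever alpha changes
	void build_lut(float alpha)
	{
	    int orig, degen;

	    for (orig = 0; orig < 256; orig++)
	    {
		for (degen = 0; degen < 256; degen++)
		{
		    float v = (1.0f - alpha) * (float)degen
			    + alpha         * (float)orig;

		    // extrapolation can push values out of range; clamp
		    if (v < 0.0f)   v = 0.0f;
		    if (v > 255.0f) v = 255.0f;

		    lut[orig][degen] = (unsigned char)v;
		}
	    }
	}

	// then, per color component of each pixel:
	//     dest_pixel = lut[orig_pixel][degen_pixel];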




4.9 Lens Effects

This is a neat one. I haven't coded this one yet, although it would be simple to do. With this effect, you can easily do "Tunnel" effects, or make it look like you're looking through a lens, shower-door glass, or water.

This is basically a two-dimensional lookup table. The lookup table contains the offsets for each pixel. Those offsets can either be the absolute pixel positions or relative to that position.

To determine which pixel to blit to position x,y on the display buffer, you look at element x,y in the lens table. The values in the lens table tell you the coordinates of the pixel in the source image to copy to the display buffer. The following example shows a 1:1 lens, where it will just copy the data over directly.

	0,0 1,0 2,0 3,0 4,0
	0,1 1,1 2,1 3,1 4,1
	0,2 1,2 2,2 3,2 4,2
	0,3 1,3 2,3 3,3 4,3
	0,4 1,4 2,4 3,4 4,4
    A pass-through 5x5 absolute lens table

	0,0 0,0 0,0 0,0 0,0
	0,0 0,0 0,0 0,0 0,0
	0,0 0,0 0,0 0,0 0,0
	0,0 0,0 0,0 0,0 0,0
	0,0 0,0 0,0 0,0 0,0
    A pass-through 5x5 relative lens table

And now, here's an example where the image gets flipped along the vertical and the horizontal. Do note that there are easier and quicker ways of doing this; this is just a simple example of the lens.

	4,4 3,4 2,4 1,4 0,4
	4,3 3,3 2,3 1,3 0,3
	4,2 3,2 2,2 1,2 0,2
	4,1 3,1 2,1 1,1 0,1
	4,0 3,0 2,0 1,0 0,0
    A horiz/vert-flip 5x5 absolute lens table

	4,4 2,4 0,4 -2,4 -4,4
	4,2 2,2 0,2 -2,2 -4,2
	4,0 2,0 0,0 -2,0 -4,0
	4,-2 2,-2 0,-2 -2,-2 -4,-2
	4,-4 2,-4 0,-4 -2,-4 -4,-4
    A horiz/vert-flip 5x5 relative lens table

As you can see, there are probably advantages to either way. There's more math for the relative table, whereas the absolute is just the dereference.

	Absolute referencing:

		result_x = lens_table[x][y].x;
		result_y = lens_table[x][y].y;

	Relative referencing:

		result_x = x + lens_table[x][y].x;
		result_y = y + lens_table[x][y].y;
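
Applying a relative lens to a whole frame might look like this; a sketch assuming flat source/dest buffers and a lens table whose offsets never point outside the buffer:

	// copy source to dest through a relative lens table
	for (y = 0; y < height; y++)
	{
	    for (x = 0; x < width; x++)
	    {
		int sx = x + lens_table[x][y].x;
		int sy = y + lens_table[x][y].y;

		dest[(y * width) + x] = source[(sy * width) + sx];
	    }
	}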

For added effect, if you animate the lens arrays, you can make it look like the lens or water or whatever is rippling.

The advantages of using an absolute-referenced lens are obvious. However, a relative lens can have some not-so-obvious advantages. The relative lens can be stored as a 24 bit image, where one color, say red, stores the X offset, and another color, like green, stores the Y offset. Have the center of the range (127 for an 8 bit per color image) be a 0-offset; then everything less than 127 represents a negative offset, and everything greater is positive. This does limit the offset to a max of 127 in either direction, though. It does let you use other methods for creating and viewing the lens, so it has its advantages.

	Relative referencing using 8bbc 24bit image data:

		result_x = x - 127 + lens_image[x][y].r;
		result_y = y - 127 + lens_image[x][y].g;

Since I haven't yet experimented with this one, I'm not sure what the best way to store the data is... whether the x & y offsets should be stored in separate arrays or in one array together (or if it even matters), and whether the offsets should be absolute positions or relative positions.

Also, as you might expect, a relative lens can be positioned at a different location on the screen each frame, giving a "roaming lens" effect. This would look like moving a magnifying glass over the source image (assuming that the lens array is a magnifying type of lens).




4.10 TV Static

This effect entails generating static which looks like the snow you see on a television that isn't tuned to any channel. This is actually a lot more difficult than it sounds. If you were to just plot out pixels row by row, column by column, with a brightness range of [0..255], it will look nothing like static. The first thing you should do is limit the color depth. I used four shades of gray, and with a random number function I was able to retrieve the color values with:

		pixel_value = (random_number() & 0x03) << 3;
This gave me a value of 0, 8, 16, or 24. The paletted video mode I was using had a range from black to white in palette entries [0..64].

The big problem here is that the standard random number function isn't very random. You can of course speed this up by pre-generating a few thousand random numbers with the above values, saving the bitwise AND as well as the shift. If you were to just plot out the pixels exactly as it gives them to you, you'll see definite patterns in the visuals. (Especially if you're using a table for your random numbers.)

A trick I found to reduce the repetition so that you don't see it happening is to draw every other pixel each time you draw the frame. It doesn't seem like it should work, but oddly enough, it does. My code pregenerated 4096 random numbers. You might have to try different amounts of random numbers: 4095, 4097, etc.

    int frame_no = 0;
    int pos;           // position in the video buffer

    while (1)
    {
	frame_no++;

	if (frame_no & 1)
	{
	    // odd frames: fill the even-numbered pixels
	    for ( pos = 0 ; pos < BUFFER_SIZE; pos++ )
	    {
		buf[pos++] = (random_number() & 0x03) << 3;
	    }
	} else {
	    // even frames: fill the odd-numbered pixels
	    // (assumes BUFFER_SIZE is even)
	    for ( pos = 0 ; pos < BUFFER_SIZE; pos++ )
	    {
		buf[++pos] = (random_number() & 0x03) << 3;
	    }
	}
    }



4.11 Alpha Channel Effects

If you notice above, the image data structure is 32 bits large. 8 bits for each of the Red, Green, Blue, and Alpha. Technically, we could store it in a 24 bit structure, but this added 8 bits can help us.

Other image in the Alpha channel




4.12 Cellular Automata

This is a really easy way to have dynamic, self-animating images. It's basically a set of rules which, when applied to some data, will produce the next "generation" of that data. One of the most common versions of this is John Conway's "Game of Life".

Conway's Game of Life uses the following set of rules:

    If a pixel has this
    many neighbors...     this happens...   meaning...
    ------------------    ---------------   --------------------------
    fewer than two        it dies           fill with background color
    two                   it survives       leave it alone (do nothing)
    three                 it thrives        increment the color
    more than three       it dies           fill with background color
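
A minimal sketch of one generation under these rules, assuming two separate byte-per-cell buffers cur and next (computing into a second buffer keeps this generation's changes from polluting the neighbor counts; swap the buffers after each pass). The border cells are skipped to keep the neighbor reads in bounds:

	int x, y, dx, dy, n;

	// compute one generation from 'cur' into 'next' (both are
	// width*height byte buffers; 0 is the background color)
	for (y = 1; y < height-1; y++)
	{
	    for (x = 1; x < width-1; x++)
	    {
		// count the live (nonzero) neighbors
		n = 0;
		for (dy = -1; dy <= 1; dy++)
		    for (dx = -1; dx <= 1; dx++)
			if ((dx || dy) && cur[(y+dy)*width + (x+dx)])
			    n++;

		if (n < 2 || n > 3)
		    next[y*width + x] = 0;                      // it dies
		else if (n == 3)
		    next[y*width + x] = cur[y*width + x] + 1;   // it thrives
		else  /* n == 2 */
		    next[y*width + x] = cur[y*width + x];       // it survives
	    }
	}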


Examples:
- - - - -
- - - # -
- # - # -
- - # # -
- - - - -
A "Glider"

More examples are available from links in the Appendix.




A.1 References

Source Code

Related Work

Some Needed Libraries

Other References On The Web

Books

Related & Referenced Applications