Issue 9 of qb:tm

It's the end of the series as we know it

Hello and welcome to the sixth part of my assembly tutorial series. This part is also the very last one. No!!! I hear you screaming ;-) But the reason for this is basically that I have covered almost all of the important aspects of assembly programming. There's more to learn; there's always more to learn, but I think you'll get more use out of other documents from now on. This series has concentrated on the general aspects of asm programming. Now you are ready to start exploring stuff that interests you in particular. Maybe you want to learn more about interrupts, or you may be interested only in I/O port programming. There are lots of good documents to download about every aspect of assembly programming. It's not necessary for me to explain everything.

Grab Absolute Assembly 2.1

So, what should this last part be about? For the first time in this series, it hasn't been easy for me to decide. Therefore, this part will cover some miscellaneous stuff, mostly about some of the general aspects of assembly programming. I'll try to share my experiences of asm programming with you so you don't have to make all the mistakes I've made.

Numbers in assembler:
When programming in assembler, it's very important to know how numbers are stored in the registers, the stack and the memory. It's especially important to know the difference between positive and negative numbers and how they are stored.

We begin by looking at positive integer values, i.e. the numbers 0,1,2,..,n. Positive integer values are the easiest ones to store. When you need to use a positive integer value in your asm code, the first thing you want to ask is: How large are the numbers I need to store? It's easy to select the number of bits needed to store a certain value. If you use n bits, the biggest number that can be stored is 2^n - 1, so if you use 8 bits, you can store any number from 0 up to 2^8 - 1 = 255. With 16 bits you can store any number up to 65535 and so forth. If you only need numbers smaller than 100, you shouldn't use 16 bits to store them, as 8 bits are enough.
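If you want to check these ranges without doing the math by hand, here's a small sketch in Python. It's only a model of the arithmetic, not assembly, and the function names are made up for this example.

```python
# Model of unsigned ranges; the function names are invented for this example.

def unsigned_max(bits):
    """Largest unsigned value that fits in the given number of bits: 2^n - 1."""
    return 2 ** bits - 1


def bits_needed(value):
    """Smallest of the usual register sizes (8, 16, 32) that can hold value."""
    for bits in (8, 16, 32):
        if 0 <= value <= unsigned_max(bits):
            return bits
    raise ValueError("value is negative or does not fit in 32 bits")


print(unsigned_max(8))    # 255
print(unsigned_max(16))   # 65535
print(bits_needed(99))    # 8
print(bits_needed(300))   # 16
```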

The different bits in a binary number are numbered from right to left. The rightmost bit is called the least significant bit and it has the bit number 0. The leftmost bit is called the most significant bit. In an 8-bit number, this bit is bit no. 7.

If you want to calculate what number you get if you set a certain bit to 1, you can calculate this with the following formula:

n = 2^b

where n is the number you want to know and b is the number of the bit. So if you set bit 5 to one, the number you get is 2^5 = 32.
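The formula is easy to try out. The following Python sketch (a model only, with hypothetical function names) calculates the value of a single bit and lists which bit numbers are set in a number, counting from bit 0 at the right.

```python
def bit_value(b):
    """Value contributed by bit number b when it is set: 2^b."""
    return 1 << b


def set_bits(number):
    """Bit numbers (counted from 0 at the right) that are set in number."""
    return [b for b in range(number.bit_length()) if number & bit_value(b)]


print(bit_value(5))          # 32
print(set_bits(0b00100101))  # [0, 2, 5]
print(sum(bit_value(b) for b in set_bits(37)))  # 37
```

Adding up the values of all the set bits gives you the number back, which is all a binary number really is.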

Numbers that can only be positive are called unsigned values. A signed value is a value that can be either positive or negative. They're called signed numbers because the most significant bit of such a number tells the sign of the number. If we have a positive value, the most significant bit is 0, but if we have a negative value, the most significant bit is 1. So 0 means + and 1 means -. This means that a signed number uses one of its bits to store the sign, so with a certain number of bits we get a lower maximum value for a signed number than for an unsigned one. The biggest signed number that can be stored with n bits is 2^(n-1) - 1. So if we have an 8-bit signed number, it can have the maximum value of 2^(8-1) - 1 = 127.

The smallest signed number that can be stored with n bits is -(2^(n-1)), so with 8 bits, you can store any negative integer down to -(2^(8-1)) = -128. Thus, an 8-bit signed number can store all integer values from -128 to 127. Positive values are stored in the same way no matter if the number is signed or unsigned, but negative values are stored in an interesting form: The most significant bit is set to 1 to indicate it's a negative number, and the rest of the bits represent the maximum possible value with those bits, plus one, minus the absolute value of the negative number. So if you have the 8-bit number -10, the most significant bit will be a 1 to indicate it's a negative number, and the rest of the bits will represent the number 127 + 1 - 10 = 118. This form of storing positive and negative numbers is called the two's complement.
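To see the -10 example bit by bit, here's a minimal sketch in Python. It only models how the bit pattern is built; the function names are invented for this example and have nothing to do with any assembler.

```python
def signed_range(bits):
    """(smallest, largest) value of a signed number with the given bit count."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def to_twos_complement(value, bits=8):
    """Bit pattern (as an unsigned integer) that stores value in two's complement."""
    low, high = signed_range(bits)
    if not low <= value <= high:
        raise ValueError("value does not fit in %d bits" % bits)
    return value & (2 ** bits - 1)


def from_twos_complement(pattern, bits=8):
    """Signed value represented by a bit pattern."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


pattern = to_twos_complement(-10)
print(signed_range(8))                 # (-128, 127)
print(format(pattern, "08b"))          # 11110110
print(pattern & 0x7F)                  # 118, the lower seven bits
print(from_twos_complement(pattern))   # -10
```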

If you want to convert a number from positive to negative or negative to positive, you could use the SUB instruction to subtract the value from 0. For example, to get the negated value of AX into BX:

MOV BX, 0
SUB BX, AX ; BX = 0 - AX

This would work fine for all numbers. But you could also use the special assembly instruction NEG. NEG gives the same result, but it's faster, takes less space in memory, needs no extra register, and it's easier to understand what happens when you see a NEG instruction than if you see a SUB instruction. The syntax for NEG is simply:

NEG destination

The destination operand can be either a register or a memory pointer. So if you have the number -10 in AL and execute the instruction NEG AL, AL will become 10.
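Here's a Python sketch that models both ways of negating an 8-bit value and checks that they agree for every possible bit pattern. The function names are hypothetical; this is only a model of the register arithmetic, not real assembly.

```python
def sub_from_zero(value, bits=8):
    """Model of computing 0 - value in a register of the given width."""
    return (0 - value) & ((1 << bits) - 1)


def neg(value, bits=8):
    """Model of NEG: invert every bit and add one, wrapping to the register width."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


al = 0xF6                      # the bit pattern for -10 in 8 bits
print(neg(al))                 # 10
print(neg(10) == 0xF6)         # True
print(all(neg(v) == sub_from_zero(v) for v in range(256)))  # True
```

One thing to watch out for: -128 has no positive counterpart in 8 bits, so negating the pattern 80h gives 80h back.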

Sometimes it's necessary to convert an 8-bit number to a 16-bit number or a 16-bit number to a 32-bit number. This is easy if you have a positive number. For example: If you have a positive integer in AL and you want that number to be treated as a 16-bit value in AX, you only need to make sure AH is 0. But what about negative numbers then? How do you convert a number in the two's complement form upwards? Luckily, there are two instructions that do this for you. They're called CBW and CWD, short for Convert Byte to Word and Convert Word to Doubleword. Their syntax is:

CBW
CWD

Note that they don't have any input/output operands. That's because these instructions always work with the AX register. If you have an 8-bit number that you want to convert to a 16-bit number, you should put it in AL, execute a CBW instruction, and you'll get the correct 16-bit value in AX, no matter if it's a positive or a negative integer you've converted.

If you want to convert a 16-bit value to 32 bits, you should put it in AX and execute a CWD instruction. The result will be stored in DX:AX.
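What the two instructions do is copy the sign bit into all the new upper bits. A small Python model shows the idea; the functions below are made up for this example and simply take and return register values as plain integers.

```python
def cbw(al):
    """Model of CBW: copy bit 7 of AL into every bit of AH, return AX."""
    al &= 0xFF
    ah = 0xFF if al & 0x80 else 0x00
    return (ah << 8) | al


def cwd(ax):
    """Model of CWD: copy bit 15 of AX into every bit of DX, return (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax


print(hex(cbw(0x0A)))    # 0xa     (10 stays 10)
print(hex(cbw(0xF6)))    # 0xfff6  (-10 as a 16-bit number)
print([hex(r) for r in cwd(0xFFF6)])  # ['0xffff', '0xfff6']
```

For a positive number the upper half simply becomes 0, which is the same thing as clearing AH by hand.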

When to create an assembly routine and how to do it:
A little assembly code can often mean a huge improvement for a BASIC program. However, it's important to know when to use it and when to avoid it. There's no point in rewriting everything in assembler. There are two reasons you may have to write an assembly routine: It will speed up your program, or it will enable you to do something that you cannot do in QuickBASIC.

The speed issue is the most common reason for using assembly code. A well-known rule of thumb says that an average program spends 90% of its execution time in 10% of its code. By rewriting that 10% of the code in assembler, you speed up the part where your program spends 90% of its time. It's important to learn how to find that 10% of the source code. Usually, it's something that is repeatedly called by the main program loop and involves a lot of calculations. If you manage to pinpoint the part of your program that slows it down the most, you should consider rewriting it in assembler.

It's also a good idea to use assembler when you need to do something that is hard to handle in QuickBASIC. One example of this was the keyboard handling example in the previous part of this series. The keyboard functions in QuickBASIC don't let you control the keyboard at the level we often need for computer games, but with a little assembly, we get all the control we want.

When you know what you want to rewrite in assembler, you may think it's going to be easy to implement your idea into a working routine, but you may be surprised at how hard it is to write the routine. Transforming your ideas directly to assembler can be very hard because you have to take so many technical issues into account. You may find that you run out of registers, that you start writing really unstructured code or that you simply don't know how to do some things in assembly.

My experience in assembly programming has taught me that the best way to go is this:
First, start writing your routine in BASIC. This way, you'll be sure that you know every step in the process of writing the routine, and you don't have to worry about registers and other low-level stuff.
Then, start optimizing your code. Try to push your BASIC code to its limits. Even if the code still is terribly slow, you will discover how you can make it faster and better. If you wrote everything in assembler from the beginning, you probably would have missed these enhancements. Put comments everywhere in your source code so that you know what each line does.
When you have an optimized BASIC version, start rewriting it in assembler. You shouldn't try to make the final version right from the beginning though. Start by writing a pseudo-version, where you use variable names instead of registers and skip some low-level stuff like setting the memory registers correctly when you want to get a number from the memory. Make your assembly code similar to the BASIC version, so that you know exactly what you're doing. Put LOTS of comments into the code so that you know what every line does.

When you have your pseudo-version finished, start transforming it into a working version. Change variable names into registers, use the correct code to get a number from the memory and so forth. Put in even more comments, as the code will get harder to understand. The commenting should be so extensive that you can tell what each line of assembly code does just from looking at the comments. Divide the source into separate parts by using spacing and commenting, so that you can isolate the different parts of the routine just by looking at the layout of the source code. Even if you know what you're doing right now, you may have forgotten it after a few days. Looking at your own asm source without understanding it is really frustrating. DO NOT try to optimize your assembly code yet, try to make it work first. You will probably have a lot of bugs when you first run your routine, and you must fix them all before continuing.

Now that you have a working assembly routine, you may want to try optimizing it even further by using smart assembly code. This should be the final step in the creation of your routine. If you start optimizing right away, you risk losing control over your coding, but if you have a working routine without optimizations to look back on, you will still know what you're doing and you know that your routine works even if you don't manage to optimize it so much.

Make sure your comments still explain what the routine does. Your optimized code will be harder to understand than the original version, so you need to watch out.

If you follow these steps, you stand a good chance of getting a really well-written and terribly fast assembly routine!

Optimizing your assembly code:
It's one thing to optimize a BASIC program by using assembly code, but you can also optimize your assembly code to gain even more speed.

Optimizing assembly code is a whole science in itself, and I could probably write another tutorial series as long as this one just about asm optimization if I knew enough about it. There's so much to say about assembly optimization that I can only cover some of the basics here.
When optimizing an assembly routine, it's not enough to just make the separate instructions run faster by using smarter code. Many times you also need to look at the code as a whole and ask yourself what the code's actually doing. Your code may run at a near-optimal speed based on what it does, but maybe it's doing more than it has to. For example, you may have a routine that has a very time-consuming MUL instruction inside of a loop. You manage to make it run a lot faster by using a SHL instruction instead, and you think you've been very clever. But if you had taken another look, you might have discovered that you're actually doing the same multiplication over and over, and it would have been enough to do it once before the loop and save the result in a register for later use. By moving the MUL instruction above the loop you would probably have gained much more than by changing it to a SHL instruction.
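The idea of moving a calculation out of a loop isn't specific to assembler, so it can be shown with a small Python model. The scenario here is an assumption made up for this example: calculating the offsets of 100 pixels on one row of a 320-pixel-wide screen. Both functions count how many multiplications they perform.

```python
def offsets_mul_inside(row, width, count):
    """Multiply inside the loop: one multiplication per iteration."""
    offsets, multiplications = [], 0
    for x in range(count):
        base = row * width          # same result every time round
        multiplications += 1
        offsets.append(base + x)
    return offsets, multiplications


def offsets_mul_hoisted(row, width, count):
    """Multiply once before the loop and reuse the result."""
    base = row * width
    multiplications = 1
    offsets = [base + x for x in range(count)]
    return offsets, multiplications


slow = offsets_mul_inside(row=10, width=320, count=100)
fast = offsets_mul_hoisted(row=10, width=320, count=100)
print(slow[0] == fast[0])   # True: both produce the same offsets
print(slow[1], fast[1])     # 100 1
```

No matter how fast you make each multiplication, doing it once instead of 100 times is the bigger win.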

Keep in mind though, that because your assembly code is so fast from the beginning compared to BASIC code, it's often unnecessary to bother optimizing it. If your routine runs adequately fast for your program, it's better to leave it unoptimized since it's more readable that way. But it can also be really important to optimize your assembly code because it runs so often. Concentrate on optimizing loops and other assembly code that runs frequently. If you don't have room for all your variables within the registers, use the stack or the memory for the values you use the least, and save the free registers for the code inside loops.

Switching Registers:
Sometimes you need to exchange the values of two registers. If you only use MOV to exchange them, you will need a free register for temporary storage of one of the values. For example, if we have one value in AX and one value in BX and we want them to switch places, we could do this:

MOV CX, AX ; Store value of AX temporarily in CX
MOV AX, BX ; AX = BX
MOV BX, CX ; BX = CX (which is the old value of AX)

If you need to do this and you have no free register, the only solution seems to be to push a value to free a register, but there's a better way. You can use the instruction XCHG, short for eXCHanGe, to do it. The syntax for XCHG is:

XCHG destination, source

Where the source and destination operands can be registers or memory pointers. However, the source and the destination cannot both be memory pointers at the same time. With XCHG you won't need a temporary register, and thus the stack doesn't need to be used either.

Earlier parts of Petter Holmberg's assembly series can be found in the Archive. They are in issues 4 thru 8.

Remove your jumps!
Another way to make your code faster is to avoid jumps in the code. This includes loops. Let's suppose you want to increase the AL register four times. The following solution seems obvious:

MOV CX, 4 ; LOOP always counts with the whole CX register
IncLoop:
INC AL
LOOP IncLoop

But what's actually happening when the computer executes this code? Well, this is how the computer sees it:

Set CX to 4
Increase AL
Decrease CX
Is CX 0? No: Jump back one line
Increase AL
Decrease CX
Is CX 0? No: Jump back one line
Increase AL
Decrease CX
Is CX 0? No: Jump back one line
Increase AL
Decrease CX
Is CX 0? Yes: Continue execution

Now suppose that we rewrote the code into this:

INC AL
INC AL
INC AL
INC AL

When this code is executed, this is what the computer does:

Increase AL
Increase AL
Increase AL
Increase AL

As you can see, a loop isn't always the best choice. Replacing loops with repeated code is called unrolling. If the loop is executed so many times that it's not practical to unroll it completely, you can do a partial unrolling. For example, if you have a loop that executes 80 times, you can change it to a loop that executes 10 times with 8 copies of the loop code inside it.
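To get a feeling for how much unrolling saves, you can count the steps in the same way as in the traces above. This Python sketch is only a model with made-up function names, and it counts steps, not clock cycles, so the real gain on a given CPU will differ.

```python
def looped_steps(times):
    """Steps in the article's trace of a LOOP: set the counter once, then
    increase, decrease the counter and test it for every pass."""
    return 1 + 3 * times


def unrolled_steps(times, copies):
    """Steps when each pass holds `copies` copies of the loop body.
    copies == times means the loop is gone completely."""
    if times % copies:
        raise ValueError("copies must divide times evenly")
    if copies == times:
        return times                      # no counter, no test, no jump
    passes = times // copies
    return 1 + passes * (copies + 2)      # body copies + decrease + test


print(looped_steps(4))         # 13
print(unrolled_steps(4, 4))    # 4
print(looped_steps(80))        # 241
print(unrolled_steps(80, 8))   # 101
```

In this model the partially unrolled 80-pass loop does less than half the work of the plain loop, at the price of eight copies of the loop body in memory.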

It's not only certain instructions that make an assembly routine slow: Wait states, bus transfers, the prefetch queue and dynamic RAM refreshes can also steal time, and avoiding these bad guys can be really hard.

For every new CPU that's released, assembly optimization becomes less important, not only because the processors get faster and faster, but also because they execute time-consuming assembly instructions more efficiently. A MUL instruction for example, isn't as slow compared to an ADD instruction on a Pentium as it was on the 80286. On the other hand, the Pentium can execute several assembly instructions simultaneously if they fulfill certain requirements, so it's possible to optimize asm code for a Pentium by using these new features.

Finding errors:
One of the biggest pains of writing assembly code is to find and eliminate the bugs. Once you're done writing an assembly routine and test it for the first time, it almost never works properly. Since the error often hangs your computer so that you have to restart it, you'll probably get no hint at where the error is. All there is to do is to start going through your code and search for bugs. This can be very hard and time-consuming. Here are a few ways to make the debugging easier:

First of all, "execute" the code in your head and try to see what the routine is actually doing. Write it down on paper and you may discover that you're doing something wrong.

If you still have problems finding the errors, try to execute only a part of your program. Take the first snippet of code, insert a JMP to the end of the routine and return the values that you're working with to the BASIC program so you can look at them and see if they're correct so far. Then, move down the JMP instruction a couple of lines and see if everything still is correct. If you continue doing this, you will eventually find out where the computer hangs and where the error is.

Keep in mind that some assembly instructions don't work with certain combinations of registers, memory pointers and direct values. If DEBUG won't accept an instruction that may seem correct, it may be because you're using a combination of input/output values that cannot be handled by that instruction.

Also, make sure the call to and return from the assembly routine is done correctly. Many times, I've struggled to find the errors in completely correct assembly code that won't run, just to find out that the error was in the CALL ABSOLUTE line in QuickBASIC. Check this before you start searching for errors in the assembly code.

There's an error in CALL ABSOLUTE assembly routines that's so common it's important that you know about it. It's very nasty because you don't have to write a single line incorrectly to get this error. It has to do with the DS register. Be VERY careful when you set the DS register. You already know that it's important to PUSH the value of DS before you change it and POP it back at the end of your routine, but there's another thing you have to know about DS. When you set it to another value than the original, you won't be able to read or write QuickBASIC variables. Consider the following code snippet:

PUSH DS ; Preserve DS
MOV BX, A000 ; Set DS to A000h
MOV DS, BX
MOV BX, [BP+08] ; Get the offset of a QB variable from the stack
MOV AX, [BX] ; Try to get its value
POP DS ; Restore DS

Even if this seems correct, AX won't get the value of the QB variable. Reading the offset from [BP+08] still works, because addresses based on BP use the SS register, but [BX] is read from DS:BX, and DS no longer points to QuickBASIC's data segment. The value AX will receive is whatever happens to be at that offset in segment A000h. You must make sure DS holds the segment address it had from the beginning whenever you read from or write to QB variables.
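Why the segment register matters so much becomes clear when you look at how a real-mode address is formed: the segment value times 16 plus the offset. Here's a small Python model of that calculation. The data segment and the variable offset below are made-up values for illustration only; QuickBASIC decides the real ones when your program runs.

```python
def physical_address(segment, offset):
    """Real-mode address: segment * 16 + offset."""
    return (segment << 4) + offset


QB_DATA_SEGMENT = 0x2F00   # made-up value; QuickBASIC picks the real one
VIDEO_SEGMENT = 0xA000     # the value loaded into DS in the snippet above
variable_offset = 0x0C36   # made-up offset of a QB variable

print(hex(physical_address(QB_DATA_SEGMENT, variable_offset)))  # 0x2fc36
print(hex(physical_address(VIDEO_SEGMENT, variable_offset)))    # 0xa0c36
```

The offset is the same in both cases, but with the wrong segment it points to a completely different place in memory.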

Doing nothing:
If you feel an urge to do nothing for a while, the assembly language has an instruction just for that: NOP. The NOP instruction takes up one byte in the memory, and when it's executed... Well, nothing happens at all! The syntax for NOP is:

NOP

Even though it doesn't do anything, it takes up a small amount of time. Actually, NOP is equivalent to XCHG AX, AX, which, of course, doesn't change anything.

It may seem silly to have such an instruction in the assembly language, but there's actually some use for it sometimes, for example if you want to leave some empty space in a program for data to be initialized later on.

And that was the final thing I had to say. I hope you've enjoyed this tutorial series and that you've managed to understand everything. Thanks to everyone who's sent me emails with comments on and suggestions for this series. If you want to learn more about assembly programming, there's LOTS of good info to find on the Internet. For example, check out:

http://cs.smith.edu/~thiebaut/ArtOfAssembly/ArtofAsm.html

I thought I could finish with a list of all the assembly instructions we've covered in this series. This list divides them into groups based on what they do and explains their different functions. Try to tell what they mean before looking at the explanations and see if you know them all!

Good luck with your assembly programming!

List of assembly instructions:

Name     Meaning
Data transfer:
MOV       Copy values between registers and the memory
PUSH      Put values on the stack
POP       Get values from the stack
LODS      Load string
STOS      Store string
MOVS      Move string
IN        Read values from I/O ports
OUT       Write values to I/O ports
NOP       Do nothing
Jumps:
CALL      Call subroutine
RET       Return from subroutine
RETF      Return from subroutine in another segment
JMP       Jump to another offset address
LOOP      Perform a loop
INT       Call interrupt routine
IRET      Return from interrupt routine
Conditional:
CMP       Compare two values
TEST      Compare two values bit by bit (AND without storing the result)
JB        Jump if Below
JBE       Jump if Below or Equal
JE        Jump if Equal
JAE       Jump if Above or Equal
JA        Jump if Above
JL        Jump if Less (signed)
JLE       Jump if Less or Equal (signed)
JGE       Jump if Greater or Equal (signed)
JG        Jump if Greater (signed)
JNB       Jump if Not Below
JNBE      Jump if Not Below or Equal
JNE       Jump if Not Equal
JNAE      Jump if Not Above or Equal
JNA       Jump if Not Above
JNL       Jump if Not Less (signed)
JNLE      Jump if Not Less or Equal (signed)
JNGE      Jump if Not Greater or Equal (signed)
JNG       Jump if Not Greater (signed)
Data manipulation:
ADD       Add two values together
SUB       Subtract value from another
INC       Increase value by one
DEC       Decrease value by one
MUL       Multiply two values together (unsigned)
IMUL      Multiply two values together (signed)
DIV       Divide value by another (unsigned)
IDIV      Divide value by another (signed)
NEG       Negate value (invert its sign)
SHL       Shift bits to the left (unsigned)
SHR       Shift bits to the right (unsigned)
SAL       Shift bits to the left (signed)
SAR       Shift bits to the right (signed)
ROL       Rotate bits to the left
ROR       Rotate bits to the right
AND       Logical AND (1 if both bits are 1)
OR        Logical OR (1 if one or both bits are 1)
XOR       Logical XOR (1 if one bit is 1 and one bit is 0)
NOT       Logical NOT (invert bit)
CBW       Convert byte value to word value (signed)
CWD       Convert word value to doubleword value (signed)

Tell Petter that Norwegians give better massages than Swedes at this address.