Assembly language programming tutorial part 6: Closing up
By Petter Holmberg of Enhanced Creations
Edited version (original version posted in QB:tm)

Hello and welcome to the sixth part of my assembly tutorial series. This part is also the very last one. No!!! I hear you screaming ;-) But the reason for this is basically that I have covered almost all of the important aspects of assembly programming. There's more to learn, there's always more to learn, but I think you'll have more use of other documents from now on. This series has been concentrating on the general aspects of asm programming. Now you are ready to start exploring stuff that interests you in particular. Maybe you want to learn more about interrupts, or you may be interested only in I/O port programming. There are lots of good documents to download about every aspect of assembly programming. It's not necessary for me to explain everything.

So, what should this last part be about? For the first time in this series, it hasn't been easy for me to decide. Therefore, this part will cover some miscellaneous stuff, mostly about some of the general aspects of assembly programming. I'll try to share my experiences of asm programming with you so you don't have to make all the mistakes I've made.

Numbers in assembler:

When programming in assembler, it's very important to know how numbers are stored in the registers, the stack and the memory. It's especially important to know the difference between positive and negative numbers and how they are stored. I've promised to discuss this earlier, so here it comes!

We begin by looking at positive integer values, i.e. the numbers 0, 1, 2, ..., n. Positive integer values are the easiest ones to store. When you need to use a positive integer value in your asm code, the first thing you want to ask is: How large are the numbers I need to store? It's easy to select the number of bits needed to store a certain value.
If you use n bits, the biggest number that can be stored is 2^n - 1, so if you use 8 bits, you can store any number from 0 up to 2^8 - 1 = 255. With 16 bits you can store any number up to 65535 and so forth. If you only need to store numbers no bigger than 100, you shouldn't use 16 bits to store them, as 8 bits are enough.

The different bits in a binary number are numbered from right to left. The rightmost bit is called the least significant bit and it has the number 0. The leftmost bit is called the most significant bit. In an 8-bit number this is bit no. 7. If you want to know what number you get if you set a certain bit to 1, you can calculate it with the following formula:

    n = 2^b

where n is the number you want to know and b is the number of the bit. So if you set bit 5 to one, the number you get is 2^5 = 32.

Numbers that can only be positive are called unsigned values. A signed value is a value that can be both positive and negative. They're called signed numbers because the most significant bit of such a number tells the sign of the number. If we have a positive value, the most significant bit is 0, but if we have a negative value, the most significant bit is 1. So 0 means + and 1 means -. This means that a signed number uses one of its bits just to store the sign, so with a given number of bits, a signed number has a lower maximum value than an unsigned one. The biggest signed number that can be stored with n bits is 2^(n-1) - 1. So if we have an 8-bit signed number, it can have the maximum value of 2^(8-1) - 1 = 127. The smallest signed number that can be stored with n bits is -(2^(n-1)), so with 8 bits, you can store any negative integer down to -(2^(8-1)) = -128. Thus, an 8-bit signed number can store all integer values from -128 to 127.
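
The formulas above are easy to check with a few lines of code. The following is only a sketch in Python, not assembler, and the function names (unsigned_max, signed_range, bit_value, sign_bit) are made up for this illustration:

```python
# Plain Python model of the formulas above; not assembler code.
# All function names are hypothetical and exist only for this example.

def unsigned_max(bits):
    """Largest unsigned value that fits in `bits` bits: 2^n - 1."""
    return 2 ** bits - 1

def signed_range(bits):
    """(smallest, largest) signed value that fits in `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def bit_value(b):
    """Number you get when only bit number b is set: 2^b."""
    return 2 ** b

def sign_bit(pattern, bits):
    """Most significant bit of a stored pattern (0 means +, 1 means -)."""
    return (pattern >> (bits - 1)) & 1

print(unsigned_max(8))         # 255
print(unsigned_max(16))        # 65535
print(signed_range(8))         # (-128, 127)
print(bit_value(5))            # 32
print(sign_bit(0b10000000, 8)) # 1
```

Since 100 is not bigger than unsigned_max(8), 8 bits are enough for it, which is the point made above.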
Positive values are stored in the same way no matter if the number is signed or unsigned, but negative values are stored in an interesting form: The most significant bit is set to 1 to indicate it's a negative number, and the rest of the bits represent the maximum possible value with those bits minus the absolute value of the negative number plus one. So if you have the 8-bit number -10, the most significant bit will be a 1 to indicate it's a negative number, and the remaining seven bits will represent the number 127 - 10 + 1 = 118. The whole byte is then 11110110 in binary. This form of storing positive and negative numbers is called the two's complement.

If you want to convert a number from positive to negative or negative to positive, you could use the SUB instruction to subtract the value from 0. Since SUB subtracts the source from the destination, the 0 must be in the destination. For example, to get the negated value of AX in BX:

    MOV BX, 0
    SUB BX, AX      ; BX = 0 - AX

This works, but you could also use the special assembly instruction NEG. NEG works in just the same way, but it's faster, takes less space in memory, doesn't need an extra register, and it's easier to understand what happens when you see a NEG instruction than when you see a SUB instruction. The syntax for NEG is simply:

    NEG destination

The destination operand can be either a register or a memory pointer. So if you have the number -10 in AL and execute the instruction NEG AL, AL will become 10.

Sometimes it's necessary to convert an 8-bit number to a 16-bit number or a 16-bit number to a 32-bit number. This is easy if you have a positive number. For example: If you have a positive integer in AL and you want that number to be treated as a 16-bit value in AX, you only need to make sure AH is 0. But what about negative numbers then? How do you convert a number in the two's complement form upwards? Luckily, there are two instructions that do this for you. They're called CBW and CWD, short for Convert Byte to Word and Convert Word to Doubleword. Their syntaxes are:

    CBW
    CWD

Note that they don't have any input/output operands. That's because these instructions always take their input from the AX register (CBW uses the AL part of it).
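
To see the two's complement form in action, here is a small sketch in Python (not assembler) that models how a signed value is stored, what NEG does to it, and how CBW and CWD widen it. The helper names (to_pattern, from_pattern, neg, cbw, cwd) are invented for this example and only imitate the instructions. They are not how the CPU implements them.

```python
# Python model of two's complement storage; not assembler code.
# All helper names are hypothetical and only imitate the instructions.

def to_pattern(value, bits=8):
    """Bit pattern (as an unsigned int) that stores a signed value."""
    return value & ((1 << bits) - 1)

def from_pattern(pattern, bits=8):
    """Signed value stored in a bit pattern."""
    if pattern & (1 << (bits - 1)):      # sign bit set: negative number
        return pattern - (1 << bits)
    return pattern

def neg(pattern, bits=8):
    """Model of NEG: 0 minus the operand, cut to the register width."""
    return (0 - pattern) & ((1 << bits) - 1)

def cbw(al):
    """Model of CBW: widen AL to AX, keeping the signed value."""
    return to_pattern(from_pattern(al, 8), 16)

def cwd(ax):
    """Model of CWD: widen AX to DX:AX, returned as (dx, ax)."""
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax

al = to_pattern(-10)
print(format(al, "08b"))          # 11110110
print(al & 0x7F)                  # 118, the lower seven bits
print(from_pattern(neg(al)))      # 10
print(hex(cbw(al)))               # 0xfff6
print(cwd(to_pattern(-10, 16)))   # (65535, 65526)
```

Note how the widened values simply get their new upper bits filled with copies of the sign bit, while a positive number would get zeros there.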
If you have an 8-bit number that you want to convert to a 16-bit number, you should put it in AL and execute a CBW instruction. You'll get the correct 16-bit value in AX, no matter if the number you converted is positive or negative. If you want to convert a 16-bit value to 32 bits, you should put it in AX and execute a CWD instruction. The result will be stored in DX:AX, with the upper 16 bits in DX.

When to create an assembly routine and how to do it:

A little assembly code can often mean a huge improvement for a BASIC program. However, it's important to know when to use it and when to avoid it. There's no use in rewriting everything in assembler. There are two reasons you may have to write an assembly routine: It will speed up your program, or it will enable you to do something that you cannot do in QuickBASIC.

The speed issue is the most common reason for using assembly code. A common rule of thumb says that in an average program, 90% of the execution time is used to execute 10% of the code. By rewriting that 10% of the code in assembler, you speed up the part where your program spends 90% of its time. It's important to learn how to find that 10% of the source code. Usually, it's something that is repeatedly called by the main program loop and involves a lot of calculations. If you manage to pinpoint the part of your program that slows it down the most, you should consider rewriting it in assembler.

It's also a good idea to use assembler when you need to do something that is hard to handle in QuickBASIC. One example of this was the keyboard handling example in the previous part of this series. The keyboard functions in QuickBASIC don't let you control the keyboard at the level we often need for computer games, but with a little assembly, we get all the control we want.

When you know what you want to rewrite in assembler, you may think it's going to be easy to implement your idea as a working routine, but you may be surprised at how hard it is to write the routine.
Transforming your ideas directly to assembler can be very hard because you have to take so many technical issues into account. You may find that you run out of registers, that you start writing really unstructured code or that you simply don't know how to do some things in assembly. My experience in assembly programming has taught me that the best way to go is this:

First, start writing your routine in BASIC. This way, you'll be sure that you know every step in the process of writing the routine, and you don't have to worry about registers and other low-level stuff.

Then, start optimizing your code. Try to push your BASIC code to its limits. Even if the code still is terribly slow, you will discover how you can make it faster and better. If you wrote everything in assembler from the beginning, you probably would have missed these enhancements. Put comments everywhere in your source code so that you know what each line does.

When you have an optimized BASIC version, start rewriting it in assembler. You shouldn't try to make the final version right from the beginning though. Start by writing a pseudo-version, where you use variable names instead of registers and skip some low-level stuff like setting the memory registers correctly when you want to get a number from the memory. Make your assembly code similar to the BASIC version, so that you know exactly what you're doing. Put LOTS of comments into the code so that you know what every line does.

When you have your pseudo-version finished, start transforming it into a working version. Change variable names into registers, use the correct code to get a number from the memory and so forth. Put in even more comments, as the code will get harder to understand. The commenting should be so extensive that you can tell what each line of assembly code does just from looking at the comments.
Divide the source into separate parts by using spacing and commenting, so that you can isolate the different parts of the routine just by looking at the layout of the source code. Even if you know what you're doing right now, you may have forgotten it after a few days. Looking at your own asm source without understanding it is really frustrating.

DO NOT try to optimize your assembly code yet, try to make it work first. You will probably have a lot of bugs when you first run your routine, and you must fix them all before continuing. If you haven't hurried too much, these bugs should all be about low-level issues such as writing data to the wrong position in the memory or not returning to QuickBASIC in the right way. They should not have to do with the purpose of the routine itself.

Now that you have a working assembly routine, you may want to try optimizing it even further by using smart assembly code. This should be the final step in the creation of your routine. If you start optimizing right away, you risk losing control over your coding, but if you have a working routine without optimizations to look back on, you will still know what you're doing and you know that your routine works even if you don't manage to optimize it much. Make sure your comments still explain what the routine does. Your optimized code will be harder to understand than the original version, so you need to watch out.

If you follow these steps, you stand a good chance of getting a really well-written and terribly fast assembly routine!

Optimizing your assembly code:

It's one thing to optimize a BASIC program by using assembly code, but you can also optimize your assembly code to gain even more speed. Optimizing assembly code is a whole science in itself, and I could probably write another tutorial series as long as this one just about asm optimization if I knew enough about it. There's so much to say about assembly optimization that I can only cover some of the basics here.
When optimizing an assembly routine, it's not enough to just make the separate instructions run faster by using smarter code. Many times you also need to look at the code as a whole and ask yourself what the code's actually doing. Your code may run at a near-optimal speed based on what it does, but maybe it's doing more than it has to. For example, you may have a routine that has a very time-consuming MUL instruction inside of a loop. You manage to make it run a lot faster by using a SHL instruction instead, and you think you've been very clever. But if you had taken another look, you might have discovered that you're actually doing the same multiplication over and over, and it would have been enough to do it once before the loop and save the result in a register for later use. By moving the MUL instruction out of the loop you would probably have gained much more than by changing it to a SHL instruction.

Keep in mind though, that because your assembly code is so fast from the beginning compared to BASIC code, it's often unnecessary to bother optimizing it. If your routine runs adequately fast for your program, it's better to leave it unoptimized since it's more readable that way. But when a piece of assembly code runs very often, optimizing it can be really important. Concentrate on optimizing loops and other assembly code that runs frequently. If you don't have room for all your variables within the registers, use the stack or the memory for the values you use the least, and save the free registers for the code inside loops.

Sometimes you need to exchange the values of two registers. If you only use MOV to exchange them, you will need a free register for temporary storage of one of the values.
For example, if we have one value in AX and one value in BX and we want them to switch places, we could do this:

    MOV CX, AX      ; Store value of AX temporarily in CX
    MOV AX, BX      ; AX = BX
    MOV BX, CX      ; BX = CX (which is the old value of AX)

If you need to do this and you have no free register, the only solution seems to be to push a value to free a register, but there's a better way. You can use the instruction XCHG, short for eXCHanGe, to do it. The syntax for XCHG is:

    XCHG destination, source

where the source and destination operands can be registers or memory pointers. However, the source and the destination cannot both be memory pointers at the same time. With XCHG you won't need a temporary register, and thus the stack doesn't need to be used either. The three MOV instructions above can be replaced by a single XCHG AX, BX.

Another way to make your code faster is to avoid jumps in the code. This includes loops. Let's suppose you want to increase the AL register four times. The following solution seems obvious (note that LOOP always counts with the whole CX register, not just CL):

    MOV CX, 4
    IncLoop:
    INC AL
    LOOP IncLoop

But what's actually happening when the computer executes this code? Well, this is how the computer sees it:

    Set CX to 4
    Increase AL
    Decrease CX
    Is CX 0? No: Jump back to IncLoop
    Increase AL
    Decrease CX
    Is CX 0? No: Jump back to IncLoop
    Increase AL
    Decrease CX
    Is CX 0? No: Jump back to IncLoop
    Increase AL
    Decrease CX
    Is CX 0? Yes: Continue execution

Now suppose that we rewrote the code into this:

    INC AL
    INC AL
    INC AL
    INC AL

When this code is executed, this is what the computer does:

    Increase AL
    Increase AL
    Increase AL
    Increase AL

As you can see, a loop isn't always the best choice. Replacing loops with repeated code is called unrolling. If the loop is executed so many times that it's not practical to unroll it completely, you can do a partial unrolling. For example, if you have a loop that executes 80 times, you can change it to a loop that executes 10 times with 8 copies of the loop code inside it.
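
The structure of a partial unrolling may be easier to see in a high-level sketch. The example below is written in Python, not assembler, and it only shows the shape of the idea: the same 80 additions are done in 10 passes instead of 80, so the loop bookkeeping runs far fewer times. The function names and the choice of summing a list are assumptions made for this illustration, and Python itself won't necessarily run the unrolled version faster.

```python
# Python sketch of partial loop unrolling; not assembler code.
# Function names and the summing task are made up for this example.

def sum_rolled(values):
    """One element per pass: 80 passes for 80 values."""
    total = 0
    passes = 0
    for v in values:
        total += v
        passes += 1          # counts how often the loop overhead runs
    return total, passes

def sum_unrolled_by_8(values):
    """Eight elements per pass; assumes len(values) is a multiple of 8."""
    total = 0
    passes = 0
    for i in range(0, len(values), 8):
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        total += values[i + 4]
        total += values[i + 5]
        total += values[i + 6]
        total += values[i + 7]
        passes += 1
    return total, passes

data = list(range(80))
print(sum_rolled(data))          # (3160, 80)
print(sum_unrolled_by_8(data))   # (3160, 10)
```

Both versions give the same result. The price for the unrolled one is longer code, which is the same trade-off you make when unrolling in assembler.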
It's not only certain instructions that make an assembly routine slow: Wait states, bus transfers, the prefetch queue and dynamic RAM refreshes can also steal time, and avoiding these bad guys can be really hard.

For every new CPU that's released, assembly optimization becomes less important, not only because the processors get faster and faster, but also because they execute time-consuming assembly instructions more efficiently. A MUL instruction, for example, isn't as slow compared to an ADD instruction on a Pentium as it was on the 80286. Also, a Pentium CPU can execute several assembly instructions simultaneously if they fulfill certain requirements, such as certain positioning of the instructions in the memory and the order of multiple operations. So it's possible to optimize asm code specifically for a Pentium by using these new features.

Finding errors:

One of the biggest pains of writing assembly code is finding and eliminating the bugs. When you're done writing an assembly routine and test it for the first time, it almost never works properly. Since the error often hangs your computer so that you have to restart it, you'll probably get no hint of where the error is. All there is to do is to start going through your code and search for bugs. This can be very hard and time-consuming. Here are a few ways to make the debugging easier:

First of all, "execute" the code in your head and try to see what the routine is actually doing. Write it down on paper and you may discover that you're doing something wrong. Keeping track of the state of the registers is often important. Maybe some things don't work like you thought they did. If so, look them up in a document somewhere to make sure you know what you're doing!

If you still have problems finding the errors, try to execute only a part of your program.
Take the first snippet of code, insert a JMP to the end of the routine and return the values that you're working with to the BASIC program so you can look at them and see if they're correct so far. Then, move the JMP instruction down a couple of lines and see if everything is still correct. If you continue doing this, you will eventually find out where the computer hangs and where the error is.

Also, keep in mind that some assembly instructions don't work with certain combinations of registers, memory pointers and direct values. If DEBUG won't accept an instruction that may seem correct, it may be because you're using a combination of input/output values that cannot be handled by that instruction.

Also, make sure the call to and return from the assembly routine is done correctly. Many times, I've struggled to find the errors in completely correct assembly code that won't run, just to find out that the error was in the CALL ABSOLUTE line in QuickBASIC. Check this before you start searching for errors in the assembly code!

Doing nothing:

If you feel an urge to do nothing for a while, the assembly language has an instruction just for that: NOP. The NOP instruction takes up one byte in the memory, and when it's executed... Well, nothing happens at all! The syntax for NOP is:

    NOP

Even though it doesn't do anything, it takes up a small amount of time. Actually, NOP is equivalent to XCHG AX, AX, which, of course, doesn't change anything. It may seem silly to have such an instruction in the assembly language, but there's actually some use for it sometimes, for example if you want to leave some empty space in a program for data to be initialized later on.

And that was the final thing I had to say. I hope you've enjoyed this tutorial series and that you've managed to understand everything. Thanks to everyone who's sent me emails with comments on and suggestions for this series.
If you want to learn more about assembly programming, there's LOTS of good info to find on the Internet. For example, check out:

http://cs.smith.edu/~thiebaut/ArtOfAssembly/ArtofAsm.html

I thought I could finish with a list of all the assembly instructions we've covered in this series. This list divides them into groups based on what they do and explains their different functions. Try to tell what they mean before looking at the explanations and see if you know them all!

Good luck with your assembly programming!

Petter

List of assembly instructions:

Name    Meaning
-------------------------------------------------------------------
Data transfer:
MOV     Copy values between registers and the memory
PUSH    Put values on the stack
POP     Get values from the stack
LODS    Load string
STOS    Store string
MOVS    Move string
IN      Read values from I/O ports
OUT     Write values to I/O ports
NOP     Do nothing

Jumps:
CALL    Call subroutine
RET     Return from subroutine
RETF    Return from subroutine in another segment
JMP     Jump to another offset address
LOOP    Perform a loop
INT     Call interrupt routine
IRET    Return from interrupt routine

Conditional:
CMP     Compare two values (using subtraction)
TEST    Compare two values (using logical AND)
JB      Jump if Below
JBE     Jump if Below or Equal
JE      Jump if Equal
JAE     Jump if Above or Equal
JA      Jump if Above
JL      Jump if Less (signed)
JLE     Jump if Less or Equal (signed)
JGE     Jump if Greater or Equal (signed)
JG      Jump if Greater (signed)
JNB     Jump if Not Below
JNBE    Jump if Not Below or Equal
JNE     Jump if Not Equal
JNAE    Jump if Not Above or Equal
JNA     Jump if Not Above
JNL     Jump if Not Less (signed)
JNLE    Jump if Not Less or Equal (signed)
JNGE    Jump if Not Greater or Equal (signed)
JNG     Jump if Not Greater (signed)

Data manipulation:
ADD     Add two values together
SUB     Subtract value from another
INC     Increase value by one
DEC     Decrease value by one
MUL     Multiply two values together (unsigned)
IMUL    Multiply two values together (signed)
DIV     Divide value by another (unsigned)
IDIV    Divide value by another (signed)
NEG     Negate value (invert its sign)
SHL     Shift bits to the left (unsigned)
SHR     Shift bits to the right (unsigned)
SAL     Shift bits to the left (signed)
SAR     Shift bits to the right (signed)
ROL     Rotate bits to the left
ROR     Rotate bits to the right
AND     Logical AND (1 if both bits are 1)
OR      Logical OR (1 if one or both bits are 1)
XOR     Logical XOR (1 if one bit is 1 and one bit is 0)
NOT     Logical NOT (invert bit)
CBW     Convert byte value to word value (signed)
CWD     Convert word value to doubleword value (signed)