File Manipulation In QBasic/FreeBASIC #4: Database Indexing
Written by Stéphane Richard (Mystikshadows)
INTRODUCTION:
Welcome to the 4th part of the series on File Manipulation. It's been a while since I last wrote about this subject. After exchanging a few emails since way back then with a few people about the subject, it seemed that alot of them wated to build on my binary file example. Let's first do a little trip down memory lane and tell you what the first three parts were all about.
- Part 1 - Sequential Files
In this first part I talked about what sequential files were and gave you the code to create a sequential file that held book related information. The code provided read information from the sequential file at the end I gave the code to create the data file itself. - Part 2 - Random Access Files
In this second part we covered random access files and what they are good for. We created an example that allowed you to manage a list of contact information that used random access files as it's storage media. - Part 3 - Binary Files
In this last part of the series we talked about what binary files were exactly, what role they play and how they worked. We created an example that essentially created a storage system that could be defined at runtime and later retrieved.
In this 4th part of the series I will be extending the last example in 3rd part by giving the database structure we created some indexing features so that you can create indexes to present the records in any order you need for the same of your project. We will give our structure the ability to have index references so that al relative indexes can be loaded in one shot and used accordingly. So let's get started shall we?
WHAT AN INDEX ACTUALLY IS:
An Index is a file that is created and used to present records in a database file in a preset order. That order is typically determined by one of the fields that define a database. It is a technique that saves the trouble of actually sorting every record in a database (which can be quite time consuming in a 2 gigabyte file). Imagine having to load up 35,000 employee records, sorting them all in the order we need and them saving all this back into the database file. Instead you can use index files so that you don't have to worry about loading huge files and sorting them every time you need them in a specific order. The main reasons for using indexes are:
- Smaller files to handle when the time comes to sort them.
- Since records are sorted you can use algorithms to search for a specific records instead of looping through them until you find the right one.
- Increased search speed. Not just because the files are smaller or you can use algorithms to do searches but because index operations are typically done in memory, not on the data file itself except when you need to lead or save the index file.
This last feature we will detail later in this tutorial but for now, what would you say if I told you that in that 35,000 record database, you could find any record located anywhere in the database with 20 comparisons maximum? if you didn't use that you would have to loop through the index records and compare each field in the index with the search criteria which could mean 35,000 IFs in a loop if your record happens to be the last one in the index file. I'll teach you the method later in this document.
WHAT WE NEED TO CREATE INDEX FILES:
In a typical index file there are only two fields to worry about, the record number or file position and the field value that will be used for comparison later in the search operations. So in our example database file we created in the 3rd part of the series, if we want the database indexed on the employee's name, the index file would have the record number and the employee's name as actual fields. The question is now what else is needed in order to make this index functionality within our database structure.
- The Name Of The Index File:
Of course this will help us know exactly which file to either look for to open the index file or which filename to use when we generate this index file. - A Relation To The field Used:
We need to know this so that we know which field to compare when we go about doing the search algorithms later. This will be stored in our database structure and controled outside the index file itself. - The Sorting Order:
There's two possible values here, Ascending sort which sorts from A to Z and the Descending sort which sorts from Z to A. - A Flag To Know We Have Indexes To Open:
Ths will be in our DBHeader structure it will be a simply byte field that will have either 0 if there are no index files or the total number of index files that are to be opened.
The main thing to remember here is that index files are best stured outside the main data file because it gives the main data file more room for extra records this way. Now, with all this information at hand we will be able to modify the code in the 3rd part to allow it to hold and manage index files pretty easily.
IMPLEMENTING THE INDEX FILES:
It's now time to change our structures so that index files can be integrated to it. You can find the code to this right here. We will need to create a user define type to hold the index definitions within the header of our file. Since we are referencing fields we will add these index references after our field definition sections. So then, first we change our HeaderInformation User Defined Type to the following:
' -------------------------------------- ' The header to the database structure ' -------------------------------------- TYPE HeaderInformation DatabaseName AS STRING * 30 DatabaseVersion AS LONG Created AS STRING * 10 ' STRING representing a date Modified AS STRING * 10 ' STRING representing a date FieldCount AS INTEGER IndexCount AS INTEGER ' Added to manage the number of index files RecordCount AS LONG HeaderLength AS INTEGER ' Length of this header structure RecordLength AS LONG ' Length of the Values taken by a record RecordOffset AS LONG ' Position of the start of the 1st Record END TYPE
Next I create a structure that will hold the information described above. The header of the database will include one of these for each index file we define. The structure is as follows:
' ------------------------------------------------------------- ' This structure holds information about each index reference ' ------------------------------------------------------------- TYPE IndexInformation IndexNumber AS LONG IndexName AS STRING * 50 FieldReference AS INTEGER ' Index reference in Fields() array SortOrder AS INTEGER ' 0 Ascending, 1 Descending END TYPE
Finally I also created to user defined types that will hold the contents of the index files. We will be creating one index that will present the records in the employee name ordered alphabetically from A to Z and an index will be by hourly rate and will be in descending order (so that the bigger salary appear first in the list). The user defined types are as follows:
' --------------------------------------------------------------- ' This structure holds information about each name index record ' --------------------------------------------------------------- TYPE NameIndexInformation RecordNumber AS LONG NameValue AS STRING * 40 END TYPE ' --------------------------------------------------------------- ' This structure holds information about each rate index record ' --------------------------------------------------------------- TYPE RateIndexInformation RecordNumber AS LONG HourlyRate AS DOUBLE END TYPE
With all this in place we create some arrays and variables to work with these user defined types. Below you'll see just what I added, for the original code you can refer to the included source file in the third part of the series above. Here are the variable declarations we'll need.
' --------------------------- ' Globally Shared Variables ' --------------------------- DIM SHARED DBIndexes(1 TO 2) AS IndexInformation ' 2 Index File References DIM SHARED WorkLength AS INTEGER DIM SHARED TotalLengths AS INTEGER DIM SHARED IndexCounter AS LONG DIM SHARED InnerLoop AS LONG ' Used for Index Sorting DIM SHARED OuterLoop AS LONG ' Used for Index Sorting DIM SHARED WorkIndex AS STRING '$DYNAMIC DIM SHARED NameIndex(1) AS NameIndexInformation ' Employee Name Index DIM SHARED RateIndex(1) AS RateIndexInformation ' Hourly Rate Index
Now,, just like we created an AddField() subroutine to save lines when defining fields for are DBFields() array we will also create a subroutine to add index to our DBIndexes() array. In our example we'll create two indexes but keep in mind you don't always have a fixed number of indexes to deal with, so using an array can become essential in these situations where you don't know how many items you're dealing with. Here's the subroutine to add index definitions.
SUB AddIndex (index AS INTEGER, FileName AS STRING, FieldReference AS INTEGER, SortOrder AS INTEGER) DBIndexes(index).IndexNumber = index DBIndexes(index).IndexName = FileName DBIndexes(index).FieldReference = FieldReference DBIndexes(index).SortOrder = SortOrder END SUB
We will use the subroutine above to add our two index definitions to the DBIndexes() array. If you remember the two index files with be one to order the records in alphabetical order by employee name, the second index will be by hourly rate in reverse order (biggest to smallest salary). The code to do that is as defined below:
CALL AddIndex(1, "EMPNAME", 2, 0) CALL AddIndex(2, "EMPRATE", 10, 1)
After populating the DBHeader user defined type as well as the DBFields and DBIndexes arrays as shown above we can define our recordOffset field and save our complete definition as the header of the database (field definition and index file definitions included). Here is the code that does precisely that:
' ----------------------------------------- ' This decides where the record offset is ' based on if index files exist or not ' ----------------------------------------- IF DBHeader.IndexCount = 0 THEN DBHeader.RecordOffset = WorkLength + TotalLengths ELSE DBHeader.RecordOffset = WorkLength + TotalLengths + (DBHeader.IndexCount * LEN(DBIndexes(1))) + 177 END IF ' ---------------------------------------------------------------- ' Write The contents of the file definition to the database file ' ---------------------------------------------------------------- DatabaseHandle = FREEFILE OPEN RTRIM$(DBHeader.DatabaseName) + ".df" FOR BINARY AS #DatabaseHandle PUT #DatabaseHandle, , DBHeader FOR Counter = 1 TO UBOUND(DBFields) PUT #DatabaseHandle, , DBFields(Counter) NEXT Counter FOR Counter = 1 TO UBOUND(DBIndexes) PUT #DatabaseHandle, , DBIndexes(Counter) NEXT Counter
At this point our database definition is saved to the data file and our record ofset is calculated accordingly. In our example we'll add 4 records to the database just so we can see a good difference later when we display them. To do so, we'll basically populate our CurrentEmployee structure with values and save it to the data file. Here is the code to do that.
' ---------------------------------------------------------------------- ' Assign values to the First Employee to be Saved to the database file ' ---------------------------------------------------------------------- CurrentEmployee.EmployeeNumber = 1 CurrentEmployee.EmployeeName = "Stephane Richard" CurrentEmployee.Address1 = "I don't know" CurrentEmployee.Address2 = "And I never will" CurrentEmployee.City = "Somewhere" CurrentEmployee.State = "Out There" CurrentEmployee.ZipCode = 12345 CurrentEmployee.Telephone = "(000) 000-0000" CurrentEmployee.Fax = "(000) 000-0000" CurrentEmployee.HourlyRate = 37.5 ' --------------------------------------------- ' Write this information to the database file ' --------------------------------------------- PUT #DatabaseHandle, , CurrentEmployee ' ---------------------------------------------------------------------- ' Assign values to the Second Employee to be Saved to the database file ' ---------------------------------------------------------------------- CurrentEmployee.EmployeeNumber = 2 CurrentEmployee.EmployeeName = "Kristian Virtanen" CurrentEmployee.Address1 = "Ain't got a clue" CurrentEmployee.Address2 = "blueberry hill" CurrentEmployee.City = "South of North" CurrentEmployee.State = "down here" CurrentEmployee.ZipCode = 12121 CurrentEmployee.Telephone = "(111) 111-1111" CurrentEmployee.Fax = "(111) 111-1111" CurrentEmployee.HourlyRate = 38.5 ' more than fair on salaries. ' --------------------------------------------- ' Write this information to the database file ' --------------------------------------------- PUT #DatabaseHandle, , CurrentEmployee ' ---------------------------------------------------------------------- ' Assign values to the Second Employee to be Saved to the database file ' ---------------------------------------------------------------------- CurrentEmployee.EmployeeNumber = 3 CurrentEmployee.EmployeeName = "Michael Wirth" CurrentEmployee.Address1 = "Am not saying" CurrentEmployee.Address2 = "Highway To Hell" CurrentEmployee.City = "Not Paradise" CurrentEmployee.State = "down here" CurrentEmployee.ZipCode = 131313 CurrentEmployee.Telephone = "(333) 333-3333" CurrentEmployee.Fax = "(333) 333-3334" CurrentEmployee.HourlyRate = 48.2 ' more than fair on salaries. ' --------------------------------------------- ' Write this information to the database file ' --------------------------------------------- PUT #DatabaseHandle, , CurrentEmployee ' ---------------------------------------------------------------------- ' Assign values to the Second Employee to be Saved to the database file ' ---------------------------------------------------------------------- CurrentEmployee.EmployeeNumber = 4 CurrentEmployee.EmployeeName = "Dave Osborne" CurrentEmployee.Address1 = "You wish" CurrentEmployee.Address2 = "But I don't" CurrentEmployee.City = "Definitaly Here" CurrentEmployee.State = "down here" CurrentEmployee.ZipCode = 141414 CurrentEmployee.Telephone = "(444) 444-4444" CurrentEmployee.Fax = "(444) 444-4445" CurrentEmployee.HourlyRate = 58.6 ' more than fair on salaries. ' --------------------------------------------- ' Write this information to the database file ' --------------------------------------------- PUT #DatabaseHandle, , CurrentEmployee
We now have our data file with the records we want to work with. The next step is to populate our index arrrays and save them to their respective index file. To do that we'll loop through the records of the database and add items to our index arrays to begin with. Here is the code to do that:
' -------------------------------------------------------------------- ' Position our file pointer and loop to read and add values to index ' -------------------------------------------------------------------- CLS IndexCounter = 0 REDIM NameIndex(1 TO 4) AS NameIndexInformation REDIM RateIndex(1 TO 4) AS RateIndexInformation SEEK #DatabaseHandle, DBHeader.RecordOffset FOR Counter = 1 TO DBHeader.RecordCount GET #DatabaseHandle, , CurrentEmployee IndexCounter = IndexCounter + 1 ' --------------------------------------- ' Redimension Name Index and Add Values ' --------------------------------------- NameIndex(IndexCounter).RecordNumber = DBHeader.RecordOffset + ((Counter - 1) * LEN(CurrentEmployee)) NameIndex(IndexCounter).NameValue = CurrentEmployee.EmployeeName ' --------------------------------------- ' Redimension Rate Index and Add Values ' --------------------------------------- RateIndex(IndexCounter).RecordNumber = DBHeader.RecordOffset + ((Counter - 1) * LEN(CurrentEmployee)) RateIndex(IndexCounter).HourlyRate = CurrentEmployee.HourlyRate NEXT Counter
This loop populated both our index arrays with elements, you can see that since we're using binary files the record positions are calculated as the offset of the first record plus the record number times the length of a record. Right now the arrays have elements but they are not sorted in the order they need to be. The code that follows sorts both arrays so that records will be in the intended order we'll need.
' ------------------------------- ' Sort Name Index Based On Name ' ------------------------------- FOR OuterLoop = 1 TO UBOUND(NameIndex) FOR InnerLoop = OuterLoop + 1 TO UBOUND(NameIndex) IF NameIndex(OuterLoop).NameValue > NameIndex(InnerLoop).NameValue THEN SWAP NameIndex(OuterLoop), NameIndex(InnerLoop) END IF NEXT InnerLoop NEXT OuterLoop ' ------------------------------- ' Sort Name Index Based On Name ' ------------------------------- FOR OuterLoop = 1 TO UBOUND(RateIndex) FOR InnerLoop = OuterLoop + 1 TO UBOUND(RateIndex) IF RateIndex(OuterLoop).HourlyRate < RateIndex(InnerLoop).HourlyRate THEN SWAP RateIndex(OuterLoop), RateIndex(InnerLoop) END IF NEXT InnerLoop NEXT OuterLoop
The index files are now created and sorted accordingly and held in their respective arrays. Though you could wait until later in the program to save them let's save them now so that they are saved and we don't have to worry about them. But at this point the index arrays are now ready to be used by the program. So let's save them with the following piece of code.
' ------------------------------------ ' Create and populate the Name Index ' ------------------------------------ WorkIndex = RTRIM$(DBIndexes(1).IndexName) + ".IDX" IndexHandle = FREEFILE OPEN WorkIndex FOR BINARY AS #IndexHandle FOR Counter = 1 TO UBOUND(NameIndex) PUT #IndexHandle, , NameIndex(Counter) NEXT Counter CLOSE #IndexHandle ' ------------------------------------ ' Create and populate the Rate Index ' ------------------------------------ WorkIndex = RTRIM$(DBIndexes(2).IndexName) + ".IDX" IndexHandle = FREEFILE OPEN WorkIndex FOR BINARY AS #IndexHandle FOR Counter = 1 TO UBOUND(NameIndex) PUT #IndexHandle, , RateIndex(Counter) NEXT Counter CLOSE #IndexHandle
Let's now program the display routines. We will first display the records in the order they were saved in the data file. Then, we will display them in alphabetical name order from A to Z and finally we'll display them in hourly rate so you can see that the index arrays are as they are supposed to be. Here's the code to display the records in all three mentioned orders.
' --------------------------------------------------------------------------- ' Display the records in 1 natural order, 2 name order, 3 hourly rate order ' --------------------------------------------------------------------------- PRINT "NATURAL ORDER IN THE DATABASE:" PRINT "------------------------------" SEEK #DatabaseHandle, DBHeader.RecordOffset FOR Counter = 1 TO DBHeader.RecordCount GET #DatabaseHandle, , CurrentEmployee PRINT LTRIM$(STR$(CurrentEmployee.EmployeeNumber)); " - "; PRINT CurrentEmployee.EmployeeName; PRINT USING "##.##"; CurrentEmployee.HourlyRate NEXT Counter PRINT ' --------------------------------------------------------------------------- ' Display the records in 1 natural order, 2 name order, 3 hourly rate order ' --------------------------------------------------------------------------- PRINT "RECORDS SORTED BY NAME:" PRINT "-----------------------" FOR Counter = 1 TO UBOUND(NameIndex) SEEK #DatabaseHandle, NameIndex(Counter).RecordNumber GET #DatabaseHandle, , CurrentEmployee PRINT LTRIM$(STR$(CurrentEmployee.EmployeeNumber)); " - "; PRINT CurrentEmployee.EmployeeName; PRINT USING "##.##"; CurrentEmployee.HourlyRate NEXT Counter PRINT ' --------------------------------------------------------------------------- ' Display the records in 1 natural order, 2 name order, 3 hourly rate order ' --------------------------------------------------------------------------- PRINT "RECORDS SORTED BY HOURLY RATE (Bigger to smaller rate):" PRINT "-------------------------------------------------------" FOR Counter = 1 TO UBOUND(RateIndex) SEEK #DatabaseHandle, RateIndex(Counter).RecordNumber GET #DatabaseHandle, , CurrentEmployee PRINT LTRIM$(STR$(CurrentEmployee.EmployeeNumber)); " - "; PRINT CurrentEmployee.EmployeeName; PRINT USING "##.##"; CurrentEmployee.HourlyRate NEXT Counter
There you have it, the way it works is that in the case of the indexed views (the last 2 loops), instead of looping through the database file, we loop through the index arrays and position the pointer in the data file to the right position as it is stored in the index array (the RecordNumber field of the user defined type). When the pointer is positioned we simply load the record and display it and repeat this for each element of the array. Below is a sample output of what you should see when you run the program.
NATURAL ORDER IN THE DATABASE: ------------------------------ 1 - Stephane Richard 37.50 2 - Kristian Virtaken 38.50 3 - Michael Wirth 48.20 4 - Dave Osborne 58.60 RECORDS SORTED BY NAME: ----------------------- 4 - Dave Osborne 58.60 2 - Kristian Virtaken 38.50 3 - Michael Wirth 48.20 1 - Stephane Richard 37.50 RECORDS SORTED BY HOURLY RATE (Bigger to smaller rate): ------------------------------------------------------- 4 - Dave Osborne 58.60 3 - Michael Wirth 48.20 2 - Kristian Virtanen 38.50 1 - Stephane Richard 37.50
As you can see the indexed record are exactly in the expected order. There is one more thing to cover in this tutorial. How to search for a specific record in the data file. OF course the first and oldest method of finding a record is to browse through the index and compare the values. something like the following accomplishes this quite easily.
FOR Counter = 1 TO UBOUND(NameIndex) IF UCASE$(TRIM$(SearchValue)) = UCASE$(RTRIM$(NameIndex(Counter).NameValue)) THEN SEEK #DatabaseHandle, NameIndex(Counter).RecordNumber GET #DatabaseHandle, , CurrentEmployee PRINT LTRIM$(STR$(CurrentEmployee.EmployeeNumber)); " - "; PRINT CurrentEmployee.EmployeeName; PRINT USING "##.##"; CurrentEmployee.HourlyRate EXIT FOR END IF Next Counter
This will loop until it finds the record represented by the SearchValue string variable. If it find it it will display it on the screen, if not, nothing gets displayed. This is great for rather small data files or if you need to perform a specific operation to each record of the data file. But what if you had a huge data file (maybe 500,000 records of information). A loop like the one above can become long and cumbersome to execute just to find a single record. There is a better way and yes, we are about to cover how it works. Let me described to you a search algorithm called the dichotomic search.
THE SEARCH ALGORITHM EXPLAINED:
A Dichotomic Search is actually quite simple in concept. It's only prerequesite is that the array of indexed values being used are sorted alphabetically or numerically (when dealing with numerical values). The general idea of a dichotomic search is what's called a divide and conquer algorithm. Imagine you have a list of 30 alphabetically sorted employee name and you want to find one of these names in particular.
In order to achieve this you will need to keep track of three things. The lower bound of the list (typically the first element in the array at the beginning of the search), the upper bound of the list (the last element of the array at the beginning of the search) and of course which element you are currently comparing. Hence we will need three variables like so:
' ---------------- ' Work Variables ' ---------------- DIM LowerBound AS LONG DIM UpperBound AS LONG DIM Current AS LONG
Now, assuming the list of names is in an array, we start these variables with these startup values:
' ------------------------ ' Startup Initialisation ' ------------------------ LowerBound = LBOUND(NameIndex) UpperBound = UBOUND(NameIndex) IF INT(UpperBound - (LowerBOund - 1) / 2) = UpperBound - (LowerBOund - 1) / 2 THEN Current = INT(UpperBound - (LowerBOund - 1) / 2) ELSE Current = INT(UpperBound - ((LowerBOund - 1) / 2) + .5) END IF
What we did here is assign the upper and lower limits to the upper and lower bounds of the array. The aim of this search is to start at the middle of the list and start comparing. The IF statement I added here will make it so that if we have an uneven number of elements (say we have 31 names instead of 30) the current variables will get the value 16 which is the element in the exact middle of the list when you have an uneven number of elements in the array.
At this point we are ready to begin our search loop. We start by testing the current element to be sure we do need to go into the loop or not. Then once in the loop we cut the list by halves on every itteration by changing either the lower bound (if the search value is greater than the current field) or the upper bound (if the search value is smaller than the current element) and positioning our Current variable to the middle of the new range of elements. Take a look at the code below and I'll explain it after.
' -------------------------------------------------- ' If we happen to be on the right record already ' there's no need to go in the loop so we skip it. ' -------------------------------------------------- IF NameIndex(Current).NameValue = SearchValue THEN DichotomicSearch = NameIndex(Current).RecordNumber ELSE DO WHILE UpperBound <> LowerBound OR NameIndex(Current).NameValue <> SearchValue ' -------------------------------------------------- ' This will either cut the list upward or downward ' Depending on if the current record is greater or ' lower than the search value being compared. ' -------------------------------------------------- IF NameIndex(Current).NameValue < SearchValue THEN UpperBound = Current - 1 ELSE LowerBound = Current + 1 END IF ' ----------------------------------------------- ' Once we know which way we need to cut down we ' we reposition Current to the middle of the ' new valid range to look in and compare with ' ----------------------------------------------- IF INT(UpperBound / 2) = UpperBound / 2 THEN Current = INT(UpperBound / 2) ELSE Current = UpperBound / 2 + .5 END IF ' ----------------------------------------------------- ' If the current Element is equal to the search value ' we simply exit the loop as we are done comparing ' ----------------------------------------------------- IF NameIndex(Current).NameValue = SearchValue THEN EXIT DO END IF LOOP ' ----------------------------------------------------------- ' Since Current could be anywhere when the loop is exited ' we set it to 0 if the current record is not the right one ' which means the record was not found. ' ----------------------------------------------------------- IF NameIndex(Current).NameValue <> SearchValue THEN Current = 0 END IF END IF
In order to explain what exactly is happening in this code, I will give you a sample run so that you see how everything works together. We will use our list of 31 elements as an example. Let's assume the record we are looking for is in position 30 in the list.
Description | Lower Bound | Upper Bound | Current | Range |
1. Initial State 2. First Cut 3. Second Cut 4. Third Cut 5. Fourth Cut |
1 17 24 28 30 |
31 31 31 31 31 |
16 16 16 16 16 |
31 15 8 4 2 |
As you can see from this table we found our record in 5 comparisons (including the original position we started in). In the traditional approach we would have had to loop 30 times in the data file to compare each record in the data file until we find a match. In a 31 record data file we saved 26 comparisons. Imagine if we had a 5000 or 500,000 record database, imagine all the comparisons we would save in a data file that size. I won't let you figure it out, any records in a 500,000 thousand records could be found in 19 comparisons or less. You would save 499,981 comparisons if you happen to be looking for the last record in the database. Imagine the speed gains from a system like this.
IMPORTANT NOTES ABOUT USING THESE TECHNIQUES:
There are a few things to consider when reading this tutorial and/or attempting to use the included sample source file. These are recommendation in order to help you make the most out of this code and make adapting the code to your project's needs that much easier.
- Database Records and all index files must be in sync:
This means that when you add a record to the data file itself you'll need to be sure to add a corresponding record to each of the maintained index files. This can be done on each file write to the data file or in bulk when exiting the data entry module for example. Likewise, if you delete a record from teh data file, you need to delete it's corresponding index file record in all index files. - There is a rule for when and how you open your files:
This is more of a balancing act rather than a rule. There'something about open and closing files the least amount of time possible. There is also a recomemndation that states to open a file for the shortest time possible. The balance between how often you open a data file versus how long you open them for make up this recommendation. Sometimes it makes sense to open the fil and keep it opened, other times it makes more sense to open the file, do your work on it and close it right back up. - Use dynamic arrays whenever possible:
In the case of this tutorial we knew alot of factors about the database system we created, not every database situation will give you these upfront like I did here. The code is already mean to use LBOUND and UBOUND wherever possible to adjust if you add more indexes or fields to the definitions but just to be sure use them unless you're 100% sure the current set values will not change in your proejct ever. - The code here is tested in QuickBasic:
This should mean that it will work fine in QBasic, QuickBasic 4.5, QuickBASIC P.D.S. 7.1, VB-DOS 1.0 and quite possibly FreeBasic (in -lang qb mode). To have this code work in FreeBasic's native mode you will need to remove the CALL in my subroutine calls as well as changing RTRIM$ to TRIM$ and the rest should work as expected.
By following htese recommendations you will create a system that can adapt to just about any database jobs you can throw at it. After all, what good is it to create something like this if it's not going ot be used in atleast 2 if not all your data file related projects? The important thing here, as with any project is to build it for the needs of today AND tomorrow. This is how you build libraries of stuff you need very often and that you can rely on (if tested accordingly of course). This means that in project like this one it's worth taking the time to plan
IN CONCLUSION:
When would you need something like what I explained in this document? The answer might actually be quite surprising. When you mention databases people automatically think business applications, reporting, Oracle, SQL Server and other database engines like MySQL, PostgreSQL and others. Depending on the type of 3D game youre making, having the ability to move around a file of object definitions for example might prove quite useful especially in games that need to update an environment according to other given sets of parameters. There are other game and tool/utilities that can probably benefit from a good database system like this one too. Maybe you can think of some of these projects already. The bottom line is a database system with good indexing and search algorithms have been used thousands of times in places many people never would imagine and might be more useful even to game designers than expected.
Needless to say the subject coverred here is a more complex one by definition. It's quite possible that not everything listed and explained here might have been crystal clear. A few other things could be coded in order to create a really complete usable system as well. But I blieve this will definitaly put you in the right frame of mind. However, if anything is unclear, feel free to email me about it. I will then do my best to make it clear for you so that you know everything you need to know to start your database related project. Until next time, happy coding, reading, and database indexing.
Stéphane Richard
srichard@adaworld.com