Fast read from a file
Thomas Laguzzi

FIRST OF ALL: Before you read this tutorial, save the file CREATE.BAS and be ready to run it whenever you are asked to. (You can also compile it for speed.) That program creates an empty dummy file used by the programs in this tutorial. The file is created as C:\TEST.DAT. If you already have a file with that name, modify the source of the program... Now, let's go!

Have you ever tried to open a big file as binary and store it into memory?

All the programmers I know (OK, I don't know a lot of programmers :-) read a file byte by byte to get data out of it. This method is really slow, especially with big files. For example, create a 200 KB file (using CREATE.EXE with the value 204800), then run this program:

OPEN "C:\test.dat" FOR BINARY AS #1 'Open the file
T = TIMER                           'Take the initial time
FOR i = 1 TO 204800                 'Issue 204800 small reads
GET #1, , a%                        'Each GET fills a%, a 2-byte INTEGER
NEXT i                              'Repeat
PRINT TIMER - T                     'Print the elapsed time
CLOSE #1                            'Close the file
END

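This program reads the file into the variable a% with one GET per value. (Strictly speaking, a% is an INTEGER, which is two bytes wide, so each GET fetches two bytes and the 204800 iterations ask for more data than the 200 KB file holds; the point here is the cost of so many tiny reads.) If you watch the LED of your hard disk, you'll hardly see it light up...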

On my 133 MHz PC the program takes 9 seconds. That depends on the speed of your CPU, but it's still slow. You may ask: why would I need to read 200 KB of data into a BASIC program? For an image, for example. If you're using SVGA, each 640x480 screen in 256 colors needs 300 KB of memory. If you want to load a BMP, you can't wait 9+ seconds!!!
The first solution is to use long integers instead of simple integers. Try to use
GET #1, , a& instead of GET #1, , a%

8.5 seconds... hmm... only half a second of difference. OK, after this failed attempt, here is the right solution to the problem. Q(uick)Basic lets you GET strings too. For example, if you have a file containing ABCDEF and you use this:

A$=" "
GET #1, , A$
 

A$ will contain the first character, A. Now, Try this:

A$="    "   ' Four spaces this time
GET #1, , A$

A$ will contain ABCD: GET reads as many bytes as the string is long. I think you can see the solution. If you use a string 1024 characters long, you'll get 1 KB every time you access the file. Try this code:

OPEN "C:\TEST.DAT" FOR BINARY AS #1
Buffer$ = STRING$(1024, " ")   ' Creates a 1 KB buffer
T = TIMER
FOR I% = 1 TO 200              ' The file contains 200 buffers (200 KB)
    GET #1, , Buffer$          ' Get the buffer
NEXT I%
PRINT TIMER - T
CLOSE #1
END

The result: 0. Yes, on my 133 I get a big 0. Don't worry, the code works fine. If you insert a PRINT Buffer$ instruction after the GET, you'll get a number greater than 0. You get 0 because the code runs in less than 1/18 of a second, which is the resolution of the DOS timer. Try generating a 10 MB file (10485760), then use:

FOR I% = 1 to 10240

instead of the old FOR line. 4.5 seconds: half the time it took to read 200 KB byte by byte... Also, use

FOR I% = 1 TO 2560

and

Buffer$= STRING$ (4096," ")

Now we use a 4 KB buffer, let's see the result... the same!
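
If you'd rather compare several buffer sizes in one run, here is a rough sketch. It assumes C:\TEST.DAT already exists and simply rewinds the file with SEEK before each pass; the list of sizes (1024, 4096, 16384) is only an example. Keep in mind that a disk cache may make the later passes look faster than the first one.

```basic
' Sketch: time a full read of C:\TEST.DAT with several buffer sizes.
' Assumes the file was created beforehand with CREATE.BAS.
OPEN "C:\TEST.DAT" FOR BINARY AS #1
FileSize& = LOF(1)
BuffSize& = 1024
DO WHILE BuffSize& <= 16384
    Buffer$ = STRING$(BuffSize&, " ")      ' Buffer of the size under test
    SEEK #1, 1                             ' Rewind to the first byte
    T! = TIMER
    FOR I& = 1 TO FileSize& \ BuffSize&    ' Whole buffers only
        GET #1, , Buffer$
    NEXT I&
    PRINT BuffSize&; "bytes per GET:"; TIMER - T!; "seconds"
    BuffSize& = BuffSize& * 4              ' 1024, 4096, 16384
LOOP
CLOSE #1
END
```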
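Now, we'll create a program that reads a file of unknown size with a variable buffer. We'll use some variables:

FileSize&    the size of the file, in bytes
BuffSize&    the size of the buffer, in bytes
HowMany&     how many full buffers the file contains
Remaining&   how many bytes are left over after the last full buffer

We use the Remaining& variable because if we have a 1026-byte file and a buffer of only 1024 bytes, two bytes are left over. To calculate HowMany& and Remaining&, use these formulas: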

HowMany& = FileSize& \ BuffSize&
Remaining& = FileSize& MOD BuffSize&

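Note that you should use the '\' operator and not the '/' one: '\' is integer division, while '/' gives a floating-point result that would be rounded (possibly upwards) when stored in HowMany&.
Here's a commented program, which is better than a theoretical explanation.
Try generating files of different sizes, and try different buffer sizes...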
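
To see the difference between the two operators, here is a tiny worked example. The sizes (1800 and 1024) and the variable Rounded& are made up for illustration only.

```basic
FileSize& = 1800
BuffSize& = 1024
HowMany& = FileSize& \ BuffSize&      ' Integer division: 1
Remaining& = FileSize& MOD BuffSize&  ' Remainder: 776
Rounded& = FileSize& / BuffSize&      ' 1.7578... is rounded on assignment: 2
PRINT HowMany&; Remaining&; Rounded&
```

With '/' the program would try to read two full buffers from a file that holds only one.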

CLS
BuffSize& = 16384                    'Buffer size (up to 32767): modify as you want...
OPEN "C:\TEST.DAT" FOR BINARY AS #1  'Open the file
FileSize& = LOF(1)                   'The LOF function returns the size of the file in bytes...
HowMany& = FileSize& \ BuffSize&     'Calculate how many full buffers are in the file
Remaining& = FileSize& MOD BuffSize& 'The remaining bytes
Buffer$ = STRING$(BuffSize&, " ")    'Generate a buffer
T = TIMER
FOR I& = 1 TO HowMany&               'Read the data from the file
    GET #1, , Buffer$                '... Here you can process the data that you have read
NEXT I&
IF Remaining& > 0 THEN               'Nothing left if the size is an exact multiple
    Buffer$ = STRING$(Remaining&, " ") 'Create a buffer for the remaining bytes
    GET #1, , Buffer$                'And read them from the file
END IF
PRINT TIMER - T                      'Elapsed time
CLOSE #1
END
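
As an idea of what "processing the data" could look like, here is a small sketch that counts how often one byte value occurs in the file. Target$, Count&, Done& and Chunk& are illustrative names, and the byte searched for (a space) is an arbitrary choice. Instead of a separate step for the remainder, this variant shrinks the last buffer inside a single loop; it is just one way to arrange it.

```basic
' Sketch: count how many times one byte value occurs in C:\TEST.DAT.
Target$ = CHR$(32)                       ' Byte to look for (a space here)
BuffSize& = 16384
OPEN "C:\TEST.DAT" FOR BINARY AS #1
FileSize& = LOF(1)
Done& = 0                                ' Bytes read so far
Count& = 0                               ' Matches found so far
DO WHILE Done& < FileSize&
    Chunk& = FileSize& - Done&           ' Bytes still unread
    IF Chunk& > BuffSize& THEN Chunk& = BuffSize&
    Buffer$ = STRING$(Chunk&, " ")
    GET #1, , Buffer$
    P% = INSTR(Buffer$, Target$)         ' First match in this buffer
    DO WHILE P% > 0
        Count& = Count& + 1
        P% = INSTR(P% + 1, Buffer$, Target$)
    LOOP
    Done& = Done& + Chunk&
LOOP
CLOSE #1
PRINT Count&
END
```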

If you have a file with many numeric values, you can extract them from the buffer using PEEK and POKE, etc... I can't explain how to do it in this tutorial, but if you want to know, write to me and I'll tell you! (Or I'll write another tutorial!)
You can say that you have understood this topic if you can write a COPYING program that copies one file to another using a buffer.
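
If you want something to check your own attempt against, here is one possible sketch of such a program. It is only an outline under a few assumptions: the paths in Source$ and Dest$ are hypothetical, the source file must already exist, and because BINARY mode does not truncate an existing file, the destination should not exist beforehand.

```basic
' Sketch of a buffered copy. Source$ and Dest$ are hypothetical paths.
Source$ = "C:\TEST.DAT"
Dest$ = "C:\TEST.CPY"
BuffSize& = 16384
OPEN Source$ FOR BINARY AS #1
OPEN Dest$ FOR BINARY AS #2              ' Created if it does not exist
FileSize& = LOF(1)
HowMany& = FileSize& \ BuffSize&         ' Full buffers
Remaining& = FileSize& MOD BuffSize&     ' Leftover bytes
Buffer$ = STRING$(BuffSize&, " ")
FOR I& = 1 TO HowMany&
    GET #1, , Buffer$                    ' Read one buffer from the source
    PUT #2, , Buffer$                    ' Write it to the destination
NEXT I&
IF Remaining& > 0 THEN
    Buffer$ = STRING$(Remaining&, " ")
    GET #1, , Buffer$
    PUT #2, , Buffer$
END IF
CLOSE #1, #2
END
```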



This article originally appeared in The BASIX Fanzine Issue 16 from September 1999.