-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                       MANIPULATING SOUND IN QBASIC
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
(A while ago I found a bunch of ZIPs which contained posts from old QBasic
BBSes. Most of the posts were from 1991-93. A lot of the information in the
ZIPs was either very primitive or was just code with no explanation of
how it works. I did however find this one good tutorial which explains some
very cool stuff about the use of sound in QBasic. This document has been
left in its original state. -ed)
 From:  EDWARD SCHLUNDER          Sent: 07-24-93 05:18
   To:  MATTHEW MCLIN             Rcvd: -NO-
   Re:  FORMAT OF MOD, SAM (1/9)
<-=-=-=-=- Matthew Mclin to All -=-=-=-=>
 MM> Does anybody know the format of MOD/SAM/WAV/VOC file? Info on any
 MM> of those formats (how to read/write/play them using a PC Speaker or
 MM> LPT 1 with a mono DAC) would be greatly appreciated.
      You know, you are quite lucky that I just decided to pick up the
   Pascal echo even though I'm not a Pascal programmer. I have ALL of
   these file formats! Lucky you! I have had to search high and low all
   over the place for this junk and you're getting it all in one shot.
      Not only do I have those file formats, but I also understand how
   to play them back on the PC's Internal Speaker, LPT DACs, and Sound
   Blaster. I'll be posting that too.
      I have been interested in this field for quite a while, which is
   how I gathered up all this information. If I had enough ambition,
   time, and patience, I'd probably write a book on it all because there
   is not ONE SINGLE book that explains how to play digital sound
   directly (i.e., without special drivers), with such drivers, what the
   file formats are, and includes code to do all that stuff.
      Gee, I bet that would make a lot of money, perhaps I should do
   that after all.... Those guys on the 80XXX Assembler echo would
   probably be able to do a better job as they are more knowledgeable on
   this, but most of them are into writing demos and creating
   faster/better MOD players.
      Ok, since this will take up a lot of room, I'll be splitting it up
   into separate messages. The simplest stuff goes in this message.
 MM> I would also like info on raw sound data and how to edit/play it.
      Newbie to Digital Sound, eh? Well, you've come to the right place
   for information, or rather, the right person has come to you. Ok, the
   basics. A digital sound file is basically just a bunch of volume
   settings. On the PC, a volume setting of 128 is normally silence.
   Values farther away from 128 in either direction are louder, in
   proportion to their distance from 128. 0 and 255 are the loudest
   volumes. One thing I should make clear: silence is not necessarily
   exactly 128. When making a recording, there is always background
   noise. So, what may sound like silence to you is actually 126-130 or
   so.
      Now, you have probably seen those neat little graphs that some
   programs make when displaying a digital sound file. VEdit (which
   comes with the Sound Blaster) shows the waveform in the modify part
   of it. If you wanted to display a graph yourself, you could just load
   in a byte from the file, then use that byte for the Y location. The
   X location is where in the file you are at (which byte). You just
   keep loading in bytes until the end of the screen. I could go on and
   on, but this is just a message, not a book!
      Hmm, you said you wanted to play a digital sound file on the PC's
   Internal Speaker and on a printer port DAC. Well, here comes that
   part. I'll explain usage of printer port DACs first because they are
   easier to understand. To play a VOC, WAV, SND, etc. file on the DAC,
   you just read in one byte from the file, output it to the printer
   port, and do it again with the next byte. To get the I/O address of
   the printer port, read the word at memory location 40h:8h for LPT1,
   40h:0Ah for LPT2, 40h:0Ch for LPT3, and, on a non-PS/2 machine,
   40h:0Eh for LPT4.
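      The waveform graph described earlier in this message can be
   sketched roughly as follows. This is Python rather than Assembler,
   and the function name, the 320-column width, and the 200-row height
   are assumptions for illustration only; the scaling of the byte to
   the screen height is also an assumption, since a byte can reach 255
   and most screen modes of the time had fewer rows than that.

```python
def waveform_points(samples, width=320, height=200):
    """Map unsigned 8-bit samples to (x, y) screen coordinates.

    x is the byte's position in the file and y is the byte value
    scaled to the screen height. Plotting stops at the right edge.
    """
    points = []
    for x, value in enumerate(samples[:width]):
        # Scale 0..255 down to 0..height-1 (assumed, not from the post).
        y = value * (height - 1) // 255
        points.append((x, y))
    return points
```

   Each returned pair could then be handed to whatever pixel-plotting
   routine is available.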
      The internal speaker is a bit more tricky, you have to do certain
   things to set it up correctly before outputting sound. Before you do
   ANY sound output, you must do the following (sorry, I'm not a Pascal
   programmer, so this is in Assembler):
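   ;Please make note: This code was written by a friend of mine in
   ;Australia named Phil Inch. He posted code in the 80x86 Assembler
   ;echo (GTPN, not Fido) for the public domain. Thanks Phil!!
   Mov   al, 0B6h                      ;Timer 2: low/high byte, mode 3
   Out   43h, al
   Mov   al, 0FFh                      ;Count low byte = 255
   Out   42h, al
   Mov   al, 0                         ;Count high byte = 0
   Out   42h, al
   Mov   al, 90h                       ;Timer 2: low byte only, mode 0
   Out   43h, al
   In    al, 61h                       ;Read speaker control port
   Or    al, 3                         ;Turn on timer gate and speaker
   Out   61h, al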
      Ok, the above sets the timer chip up correctly. From there it is
   pretty simple. Get a byte from the sound file. Divide the byte by a
   'shift' number (I'll explain about this later). Then, output this new
   byte to port 42h. Repeat this for the whole file.
      Ok, now, about that shift value. The PC's Internal Speaker wasn't
   designed for playing digital sound; it's just that brainy guys like
   Phil have figured out how to do with software what should have been
   done with hardware. Anyway, the PC's Internal Speaker isn't very
   loud, so the range of volumes is much less than on a Sound Blaster
   or printer port DAC. This shift value varies from computer to
   computer; it depends on the size of your speaker and other stuff.
   Generally, a shift value of 4 works on all computers. On my
   computer, I can get away with 3 on most files. The smaller the
   shift value, the louder the file will be played, but too small a
   shift value will cause distortion. Experiment!
      After you are finished playing the sound file, you must put the
   timer chip back the way it was, or otherwise the next program that
   tries to make a noise on the internal speaker will make the noise
   but will not stop! Here is the code for that (again, sorry about the
   Assembler, it's just that I'm not a Pascal programmer):
   Mov   al, 0B6h                      ;Timer 2 back to mode 3
   Out   43h, al
   In    al, 61h
   And   al, 0FCh                      ;Turn off timer gate and speaker
   Out   61h, al
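      The scaling step described above (divide each byte by the shift
   value before sending it to port 42h) might look like this. It is
   only a sketch in Python, it does no port output at all, and it
   assumes plain integer division, which is how the text words it. The
   function name is made up for illustration.

```python
def scale_for_speaker(samples, shift=4):
    """Divide each unsigned 8-bit sample by the shift value.

    Assumption: ordinary integer division, as the message describes.
    A smaller shift gives louder output but risks distortion.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1")
    return bytes(s // shift for s in samples)
```

   With the default shift of 4, the byte range 0-255 is squeezed down
   to 0-63.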
      There, that should do it. I hope I haven't totally confused you.
   Please write back if you have ANY questions what-so-ever. Gee, I'm
   already on line 107, time to go to a new message!
<-=-=-=-=- Matthew Mclin to All -=-=-=-=>
 MM> Note that these .MOD
 MM> and .SAM files are in the Amiga Module format (just in case there are
 MM> any others). Oh, there's also the .SND files. Or even .MID/.MDI files
 MM> if you can play them thru a DAC on an LPT port or the PC Speaker. Note
 MM> that I don't have a Sound Blaster (or any other sound card). Thanks.
SAM Files:
      As far as I know, these do not contain any header or specific
   structure. They are just raw sound files. The only trick you have to
   remember about these files is that they are signed, which means that
   when the 7th bit (the top bit) is set, the number is negative. When
   the 7th bit is clear, the number is positive. This is completely
   different from digital sound files that originated on the PC.
   Remember, MOD and SAM files originated from the Amiga, so they have
   this weird encoding.
      To convert a signed file to an unsigned file, just read in one
   byte from the original file. Add 128 to that byte. Output the answer
   to a new file. In the Amiga world, a byte of 0 is equivalent to
   silence. A byte of -128 (or +127) is as loud as it gets on the
   Amiga. On the PC, however, 0 (or 255) is as loud as it gets. A byte
   of 128 is equivalent to silence on the PC. So, when we add 128 to a
   -128, we get a zero, which is the same volume on the PC as -128 is
   on the Amiga.
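      A minimal sketch of the signed-to-unsigned conversion, in Python.
   The function name is made up for illustration. The bytes are taken
   as they come off the disk (0-255), so the addition is done modulo
   256 to get the same result as adding 128 to the signed value.

```python
def signed_to_unsigned(data):
    """Convert signed 8-bit samples (Amiga style) to unsigned (PC style).

    Adding 128 modulo 256 maps signed -128..127 onto unsigned 0..255.
    """
    return bytes((b + 128) & 0xFF for b in data)
```

   Running the same function over the result converts it back again,
   since adding 128 twice wraps around to the original byte.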
      The following text was written by Edward Schlunder and was based
   on information provided by Tony Cook on the GT Power Network's 80x86
   Assembler echo.
                               WAV File Format
                       By: Edward Schlunder. 5-17-93
 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
-----------------------------------------------------------------------
 00 - 03        "RIFF"                        Just an identification block.
                                              The quotes are not included.
 04 - 07        ???                           This is a long integer. It
                                              gives the number of bytes
                                              that follow this field, which
                                              is the file size minus 8.
 08 - 11        "WAVE"                        Just another I.D. thing.
 12 - 15        "fmt "                        Just another I.D. thing.
 16 - 19        16, 0, 0, 0                   Size of the format data that
                                              follows (bytes 20 - 35).
 20 - 21        1, 0                          Format tag.
 22 - 23        1, 0                          Channels
 24 - 27        ???                           Sample rate, or (in other
                                              words), samples per second.
 28 - 31        ???                           Average bytes per second.
 32 - 33        1, 0                          Block align.
 34 - 35        8, 0                          Bits per sample. Ex: Sound
                                              Blaster can only do 8, Sound
                                              Blaster 16 can make 16.
                                              Normally, the only valid values
                                              are 8, 12, and 16.
 36 - 39        "data"                        Marker that comes just before
                                              the actual sample data.
 40 - 43        ???                           The number of bytes in the
                                              sample.
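      The table above can be read in one go with Python's struct
   module. This is only a sketch: it assumes the simple 44-byte layout
   shown in the table, with the "data" marker directly after the format
   fields, and the function and key names are made up for illustration.

```python
import struct

def parse_wav_header(header):
    """Unpack the 44-byte WAV header laid out in the table above."""
    if len(header) < 44:
        raise ValueError("need at least 44 bytes")
    (riff, riff_size, wave, fmt, fmt_size, format_tag, channels,
     sample_rate, bytes_per_sec, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", header[:44])
    if (riff, wave, fmt, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not the simple WAV layout described here")
    return {
        "riff_size": riff_size,          # bytes 04 - 07
        "format_tag": format_tag,        # bytes 20 - 21
        "channels": channels,            # bytes 22 - 23
        "sample_rate": sample_rate,      # bytes 24 - 27
        "bytes_per_sec": bytes_per_sec,  # bytes 28 - 31
        "block_align": block_align,      # bytes 32 - 33
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,          # bytes 40 - 43
    }
```

   The sample data itself would start at byte 44 and run for data_size
   bytes.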
VOC File Format:
      This file format was written by Phil Inch on the 80x86 Assembler
   echo on the GTPN. Thanks Phil!!
BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
-----------------------------------------------------------------------
00 - 19        "Creative Voice File", 26     Just an identification block.
                                             The quotes are not included,
                                             and the 26 is byte 26 (1Ah) which
                                             is an end-of-file marker. There-
                                             fore, if you TYPE a VOC file, you
                                             will just see Creative Voice File.
20 - 21        26, 00                        This is a low byte, high
                                             byte sequence which gives
                                             the offset of the first
                                             block of sound data in the
                                             file.  Currently this is 26
                                             ( 00 x 256 + 26 ) which is
                                             the length of the header,
                                             but it's probably good
                                             programming practice to
                                             read and use this value
                                             anyway in case the format
                                             changes later.
22 - 23        10,1                          These bytes give the
                                             version number of the VOC
                                             file, subnumber first, then
                                             main number. The default,
                                             as you can see, is 1.10.
24 - 25        41,17                         These bytes are "check
                                             digits". These allow you to
                                             be absolutely SURE that you
                                             are working with a VOC
                                             file.  To use them, convert
                                             the version number (above)
                                             and this number to
                                             integers. Do this with the
                                             formula below, where for
                                             convention the above bytes
                                             have been listed as byte1,
                                             byte2.
                                             (byte2*256)+byte1
                                             Therefore, for the default
                                             values we get the following
                                             integers:
                                             (1 x 256)+10  =  266
                                             (17 x 256)+41 = 4393
                                             When you add the two
                                             results, you get 4659.  If
                                             you do these calcs and get
                                             4659, then you can be
                                             almost certain you're
                                             working with a VOC file.
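      The header check described in the table can be sketched like
   this in Python. The function name is made up for illustration; the
   offsets and the magic total of 4659 come straight from the table.

```python
def parse_voc_header(header):
    """Validate a VOC header and return (data_offset, (major, minor))."""
    if len(header) < 26:
        raise ValueError("need at least 26 bytes")
    if header[:20] != b"Creative Voice File\x1a":
        raise ValueError("identification block not found")
    data_offset = header[20] + header[21] * 256   # low byte, high byte
    version = header[22] + header[23] * 256
    check = header[24] + header[25] * 256
    if version + check != 4659:                   # the "check digits" test
        raise ValueError("check digits do not match")
    return data_offset, (header[23], header[22])
```

   The returned offset is where the first data block starts, so a
   reader would seek there rather than assuming 26.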
OK, that takes care of the header information.  I hope you realise that
I'll never get a registration for VOCHDR now!  Oh well <sigh> perhaps
people will buy my games!
   Having gotten to byte 26, we now start encountering data blocks.
There are eight types in all, conveniently numbered 0 - 7.  For each
block, the first byte will always tell you the type.
For notational convenience, bx means byte x, eg b5 means byte 5.
BLOCK 0 - THE "END BLOCK"
   Structure:     Byte 1: '0' to denote "end block" type
   This block is located at the END of a VOC file.  When a VOC player
   encounters a block 0, it should stop playing the VOC file.
BLOCK 1 - THE "DATA BLOCK"
   Structure:     Byte 1: '1' to denote "data block" type
                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)
                       5: Sampling rate: Calculated as 1000000 / (256-b5)
                       6: Pack type byte:
                              0 = data is not packed
                              1 = data is packed to four bits
                              2 = data is packed to 2 bits
                              3 = data is packed to 1 bit
                       7: Actual sample data starts here
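   A small sketch of decoding the start of a data block in Python. The
   function name is made up for illustration, and rounding the sampling
   rate down to a whole number is an assumption. Note that "byte 1" in
   the structure above is index 0 in the code.

```python
def parse_data_block_header(block):
    """Return (length, sample_rate, pack_type) for a type 1 block."""
    if len(block) < 6 or block[0] != 1:
        raise ValueError("not a data block")
    length = block[1] + block[2] * 256 + block[3] * 65536
    sample_rate = 1000000 // (256 - block[4])   # b5 in the text
    pack_type = block[5]                        # b6 in the text
    return length, sample_rate, pack_type
```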
BLOCK 2 - THE "MORE DATA BLOCK"
   Structure:     Byte 1: '2' to denote "more data block" type
                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)
                       5: Actual sample data starts here
   The point of this is simple:  If you have a sample that you want to
   chop up into smaller portions (the maximum block length in a VOC file
   is 16,777,215 bytes but who's counting?), then define a "more data"
   block. This "carries over" the previously found sampling rate and
   pack type byte, so a "data block" should have been encountered
   earlier somewhere along the line.
BLOCK 3 - THE "SILENCE" BLOCK
   Structure:     Byte 1: '3' to denote "silence block" type
                       2: \
                       3: | These bytes give the length:
                       4: / b2 + (b3*256) + (b4*65536)
                          (Note that this value is usually 3 for a
                          silence block.)
                       5: Duration ( b5+(b6*256) ).  This gives the equivalent
                       6: number of bytes to "play" during the silence.
                       7: Sampling rate: Calculated as 1000000 / (256-b7)
   A silence block is used for long periods of silence.  When long
   silences are required, it's more efficient in size terms to insert
   one of these blocks, as seven bytes can then represent up to 65,536.
BLOCK 4 - THE "MARKER BLOCK"
   Structure:     Byte 1: '4' to denote "marker block" type
                       2: \
                       3: | The length of the block, as usual
                       4: /
                       5: Marker value, as low-high (ie b5 + (b6*256) )
                       6:
   The marker block is read by CT-VOICE.DRV.  When a marker block is
   encountered, the value in the marker value bytes (5 and 6) is copied
   into the status word specified when CT-VOICE was initialized.
   This allows your program to judge where in the sample you currently
   are, thus allowing for progress counters and the like.  It's also
   useful if you're trying to synchronize other processes to the playing
   of the sound.
   For example, by using appropriate marker blocks, you could send
   signals to your software to move the lips of a person on-screen in
   time with the speech in the VOC.  However, this does take some doing
   and a VERY good VOC editor!
BLOCK 5 - THE "MESSAGE BLOCK"
   Structure:     Byte 1: '5' to denote "message block" type
                       2: \
                       3: | The length of the block, as usual
                       4: /
                   5 - ?: Message, as ASCII text.
                       ?: 0, to denote end of text
   The message block simply allows you to embed text into a VOC file.
   Presumably you could use this to detect when other people have
   pinched your VOC files for their own applications.
BLOCK 6 - THE "REPEAT BLOCK"
   Structure:     Byte 1: '6' to denote "repeat block" type
                       2: \
                       3: | The length of the block, as usual
                       4: /
                       5: Number of times that data should be repeated
                       6: Total = 1 + b5 + (b6*256)
   Every "playable" data block between a block 6 and a block 7 will be
   repeated the number of times specified in b5 and b6.  Note that you
   add one to this value - the data blocks are ALWAYS played at least
   once.  However, if b5 and b6 are zero, then you really don't need a
   repeat block, do you!
   I'm told that you cannot "nest" repeat blocks, but I've never tried
   it. This limitation would only apply to CT-VOICE.DRV I would have
   thought, but it depends how good other VOC players are.
BLOCK 7 - THE "END REPEAT BLOCK"
   Structure:     Byte 1: '7' to denote "end repeat block" type
                       2: \
                       3: | The length of the block, as usual
                       4: /
   This, as explained, marks the end of the block of blocks (!) that you
   wish to repeat.  Note that the "length" is always zero, so I don't
   know why the length bytes are required at all.
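   Putting the block descriptions together, a reader could walk the
   blocks roughly like this. It is a Python sketch with a made-up
   function name. It assumes the three length bytes count only what
   comes after them, which fits the notes above that a silence block
   usually has length 3 and an end repeat block has length zero.

```python
def iter_voc_blocks(data, offset=26):
    """Yield (block_type, payload) pairs until the end block is found."""
    pos = offset
    while pos < len(data):
        block_type = data[pos]
        if block_type == 0:
            # End block: a single type byte with no length bytes.
            yield 0, b""
            return
        if pos + 4 > len(data):
            raise ValueError("truncated block header")
        length = data[pos + 1] + data[pos + 2] * 256 + data[pos + 3] * 65536
        start = pos + 4
        yield block_type, data[start:start + length]
        pos = start + length
```

   The offset argument would normally be the value read from bytes 20
   and 21 of the header.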
   There, finally... Ahh. Well, next up is the MOD and SND file
formats...
This was picked up off the 80XXX Assembler echo on FidoNet. There are
many other file formats for MODs, but I have found this one to be the
most complete.
Protracker 2.3A Song/Module Format:
-----------------------------------
Offset  Bytes  Description
------  -----  -----------
   0     20    Songname. Remember to put trailing null bytes at the end...
               When written by ProTracker this will be only uppercase;
               there are only historical reasons for this. (And the
               historical reason is that Karsten Obarski, who made the
               first SoundTracker, was stupid.)
Information for sample 1-31:
Offset  Bytes  Description
------  -----  -----------
  20     22    Samplename for sample 1. Pad with null bytes. Will only
               be uppercase.  The samplenames are often used for storing
               messages from the author; in particular, samplenames
               starting with a '#' sign will generally be a message.
               This convention is a result of a player called
               IntuiTracker displaying all samples starting with # as a
               message to the person playing the module.
  42      2    A WORD with samplelength for sample 1.  Stored as number of
               words.  Multiply by two to get real sample length in
               bytes. This is a big-endian number; for all PC
               programmers out there, this means that to get the value
               in the PC's own (little-endian) byte order, you have to
               swap the two bytes.
  44      1    Lower four bits are the finetune value, stored as a
               signed four bit number. The upper four bits are not used,
               and should be set to zero. They should also be masked out
               when reading; you can never be sure what some stupid
               program could have stored here...
  45      1    Volume for sample 1. Range is $00-$40, or 0-64 decimal.
  46      2    Repeat point for sample 1. Stored as number of words
               offset from start of sample. Multiply by two to get
               offset in bytes.
  48      2    Repeat Length for sample 1. Stored as number of words in
               loop. Multiply by two to get replen in bytes.
Information for the next 30 samples starts here. It's just like the info
for sample 1.
Offset  Bytes  Description
------  -----  -----------
  50     30    Sample 2...
  80     30    Sample 3...
   .
   .
   .
 890     30    Sample 30...
 920     30    Sample 31...
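The 30-byte sample records above could be decoded along these lines. It
is a Python sketch with made-up function and key names. The ">" in the
struct format asks for big-endian words, which takes care of the byte
swapping mentioned in the table.

```python
import struct

def parse_mod_sample(record):
    """Decode one 30-byte sample record from a MOD file."""
    if len(record) != 30:
        raise ValueError("a sample record is exactly 30 bytes")
    name, length_w, finetune, volume, repeat_w, replen_w = struct.unpack(
        ">22sHBBHH", record)
    finetune &= 0x0F              # mask out the unused upper four bits
    if finetune > 7:
        finetune -= 16            # signed four bit number: -8..7
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "length": length_w * 2,   # stored in words, returned in bytes
        "finetune": finetune,
        "volume": volume,
        "repeat_start": repeat_w * 2,
        "repeat_length": replen_w * 2,
    }

def parse_mod_samples(module):
    """Decode all 31 sample records, which start at offset 20."""
    return [parse_mod_sample(module[20 + 30 * i:50 + 30 * i])
            for i in range(31)]
```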
Offset  Bytes  Description
------  -----  -----------
 950      1    Songlength. Range is 1-128.
 951      1    This byte is set to 127, so that old trackers will search
               through all patterns when loading. Noisetracker uses this
               byte for restart, ProTracker doesn't.
 952    128    Song positions 0-127.  Each hold a number from 0-63 (or
               0-127) that tells the tracker what pattern to play at
               that position.
1080      4    The four letters "M.K." - This is something Mahoney &
               Kaktus inserted when they increased the number of samples
               from 15 to 31. If it's not there, the module/song uses 15
               samples or the text has been removed to make the module
               harder to rip. Startrekker puts "FLT4" or "FLT8" there
               instead. If there are more than 64 patterns, PT2.3 will
               insert "M!K!" here. (Hey - Noxious - why didn't you
               document the part here relating to YOUR OWN PROGRAM?
               -Vishnu)
Offset  Bytes  Description
------  -----  -----------
1084    1024   Data for pattern 00.
   .
   .
   .
xxxx  Number of patterns stored is equal to the highest patternnumber
      in the song position table (at offset 952-1079) plus one, since
      pattern numbers start at 00.
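A sketch of working out how many patterns follow the header, in Python.
The function name is made up for illustration. Because pattern numbers
start at 00, the count is taken as the highest number in the position
table plus one.

```python
def mod_pattern_info(module):
    """Return (song_length, signature, pattern_count) for a 31-sample MOD."""
    if len(module) < 1084:
        raise ValueError("file too short for a 31-sample module")
    song_length = module[950]
    positions = module[952:1080]          # 128 song positions
    signature = bytes(module[1080:1084])  # "M.K." and friends
    pattern_count = max(positions) + 1    # patterns are numbered from 0
    return song_length, signature, pattern_count
```

With 1024 bytes per pattern, the sample data would then begin at offset
1084 + 1024 * pattern_count.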
  Each note is stored as 4 bytes, and all four notes at each position in
the pattern are stored after each other.
00 -  chan1  chan2  chan3  chan4
01 -  chan1  chan2  chan3  chan4
02 -  chan1  chan2  chan3  chan4
etc.
Info for each note:
 _____byte 1_____   byte2_    _____byte 3_____   byte4_
/                \ /      \  /                \ /      \
0000          0000-00000000  0000          0000-00000000
Upper four    12 bits for    Lower four    Effect command.
bits of sam-  note period.   bits of sam-
ple number.                  ple number.
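The bit layout in the diagram can be unpacked like this. It is a Python
sketch with a made-up function name; the four bytes of one note are
passed in as they appear in the file.

```python
def decode_note(b):
    """Split one 4-byte note into (sample_number, period, effect)."""
    if len(b) != 4:
        raise ValueError("a note is exactly 4 bytes")
    # Upper four bits come from byte 1, lower four from byte 3.
    sample_number = (b[0] & 0xF0) | (b[2] >> 4)
    # 12 bits: low nibble of byte 1 followed by all of byte 2.
    period = ((b[0] & 0x0F) << 8) | b[1]
    # 12 bits: low nibble of byte 3 followed by all of byte 4.
    effect = ((b[2] & 0x0F) << 8) | b[3]
    return sample_number, period, effect
```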
      One thing you should keep in mind about MOD files is that they
   originated from the Amiga, so the samples are signed, see the
   discussion about SAM files for more information.
   Note:
      Sounder and Sound Tool both use the same file extension, but have
   different file formats. To tell the difference, read the first 6
   bytes of the file. If they match the magic number for Sound Tool .SND
   files, it is a Sound Tool file. Otherwise, it's a Sounder file or a
   raw file.
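      The test in the note above might be written like this in Python.
   The function name and the returned strings are made up for
   illustration; the magic number is the "SOUND", 26 identification
   block listed in the Sound Tool table further down.

```python
def snd_kind(first_bytes):
    """Guess which kind of .SND file this is from its first 6 bytes."""
    if first_bytes[:6] == b"SOUND\x1a":
        return "soundtool"
    # No magic number: a Sounder file or a raw file.
    return "sounder-or-raw"
```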
Sounder File Format:
 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
-----------------------------------------------------------------------
 00 - 01        0, 0                          Bits per sample. Ex: Sound
                                              Blaster can only do 8,
                                              Sound Blaster 16 can make
                                              16. Normally, the only
                                              valid value is 0, which is
                                              the code for an 8 bit
                                              sample. Future versions of
                                              Sounder and DSOUND.DLL may
                                              allow 16 bit samples and
                                              such.
 02 - 03        ???                           Sampling rate. Currently,
                                              only 22 KHz, 11 KHz, 7.33
                                              KHz, and 5.5 KHz are
                                              valid. If given a value
                                              like 9 KHz, it will be
                                              played at the next closest
                                              rate (in this case, 11
                                              KHz). The sampling rate is
                                              calculated as follows:
                                              SampRate = Byte1 + (256 * Byte2)
 04 - 05        ???                           Volume to play the sample
                                              back at. Note: On the PC's
                                              Internal Speaker, there is
                                              a definite upper limit as
                                              to the volume, depending
                                              on the shift value (see
                                              below). The Sound Blaster
                                              and the Disney Sound
                                              Source aren't quite as
                                              restricted, but still are
                                              at some high value.
 06 - 07        4, 0                          Shift value. This is the
                                              number that each byte is
                                              divided by to "scale" the
                                              volume down to a point
                                              where the PC's Internal
                                              Speaker can handle it. See
                                              the discussion on playing
                                              back digitized sound for
                                              more details.
   Information from Sounder text files and Sound Tool help (.HLP) files.
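   A sketch of reading the 8-byte Sounder header in Python. The
   function and key names are made up for illustration, and it assumes
   all four fields are stored low byte first, like the sampling rate
   formula in the table.

```python
import struct

def parse_sounder_header(header):
    """Unpack the four 16-bit fields of a Sounder .SND header."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes")
    bits_code, sample_rate, volume, shift = struct.unpack("<4H", header[:8])
    return {
        "bits_code": bits_code,      # 0 is the code for 8 bit samples
        "sample_rate": sample_rate,  # Byte1 + (256 * Byte2)
        "volume": volume,
        "shift": shift,
    }
```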
Sound Tool File Format:
 BYTE(S)        NORMAL CONTENTS               PURPOSE/DESCRIPTION
-----------------------------------------------------------------------
 00 - 05        "SOUND", 26                   Just an identification
                                              thing. Helps a lot when
                                              you are trying to
                                              distinguish between
                                              Sounder .SND files and
                                              Sound Tool .SND files.
 08 - 11        ???                           This is the number of
                                              bytes in the sample. It is
                                              calculated as follows:
       ByteSam = Byte1 + (256 * Byte2) + (65536 * Byte3) + (16777216 * Byte4)
 12 - 15        ???                           This points to the first
                                              byte to play in the file.
                                              It is calculated the same
                                              way as the number of bytes
                                              in the sample (see above).
 16 - 19        ???                           This points to the last
                                              byte in the sample to
                                              play. Calculated the same
                                              as above.
 20 - 21        ???                           Sampling rate of the
                                              sample. Valid values are
                                              22 KHz, 11 KHz, 7.33 KHz,
                                              and 5.5 KHz, but if given
                                              a number not listed above,
                                              it will be played at the
                                              closest valid sampling
                                              rate. So, 9 KHz would be
                                              played at 11 KHz. This is
                                              calculated as follows:
                                              SamRate = Byte1 + (256 * Byte2)
 22 - 23        ???                           Bits per sample. Ex: Sound
                                              Blaster can only do 8,
                                              Sound Blaster 16 can make
                                              16. Normally, the only
                                              valid value is 0, which is
                                              the code for an 8 bit
                                              sample. Future versions of
                                              Sounder and DSOUND.DLL may
                                              allow 16 bit samples and
                                              such.
 24 - 25        ???                           Volume to play the sample
                                              back at. Note: On the PC's
                                              Internal Speaker, there is
                                              a definite upper limit as
                                              to the volume, depending
                                              on the shift value (see
                                              below). The Sound Blaster
                                              and the Disney Sound
                                              Source aren't quite as
                                              restricted, but still are
                                              at some high value.
 26 - 27        4, 0                          Shift value. This is the
                                              number that each byte is
                                              divided by to "scale" the
                                              volume down to a point
                                              where the PC's Internal
                                              Speaker can handle it. See
                                              the discussion on playing
                                              back digitized sound for
                                              more details.
 28 - 123       ???                           This is the name of the
                                              sample. It is followed by
                                              an ASCII 0.
   Information from Sounder text files and Sound Tool help (.HLP) files.
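   The Sound Tool header could be read in a similar way. This Python
   sketch uses made-up function and key names, treats the four-byte
   fields as ordinary low-byte-first long integers, and skips bytes 06
   and 07, which the table does not describe.

```python
import struct

def parse_soundtool_header(header):
    """Unpack the 124-byte Sound Tool .SND header."""
    if len(header) < 124:
        raise ValueError("need at least 124 bytes")
    (magic, sample_bytes, first_byte, last_byte, sample_rate,
     bits_code, volume, shift, name) = struct.unpack(
        "<6s2xIIIHHHH96s", header[:124])
    if magic != b"SOUND\x1a":
        raise ValueError("not a Sound Tool file")
    return {
        "sample_bytes": sample_bytes,  # bytes 08 - 11
        "first_byte": first_byte,      # bytes 12 - 15
        "last_byte": last_byte,        # bytes 16 - 19
        "sample_rate": sample_rate,    # bytes 20 - 21
        "bits_code": bits_code,        # bytes 22 - 23
        "volume": volume,              # bytes 24 - 25
        "shift": shift,                # bytes 26 - 27
        # The name ends at the first ASCII 0.
        "name": name.split(b"\0")[0].decode("ascii", "replace"),
    }
```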
      Whoo! That was a TON of typing. WHOA!! I just literally spent all
   night preparing those messages for you. I believe I started it around
   12:00 am and now it's 5:00 am! Let me apologize if I made any
   mistakes in the previous messages; it's hard to type perfectly when
   your eyelids keep falling down <g>.
      I don't know the file format for MDI and MID files, and I don't
   think that they can be played on the internal speaker or printer port
   DACs. Sorry!
      Well, I hope I've answered all your questions. If you have any more,
just post to me! Have fun with the new information!