loading excel files

If you have questions about any aspect of QBasic programming, or would like to help fellow programmers solve their problems, check out this board!

Moderators: Pete, Mods

Post Reply
Guest

loading excel files

Post by Guest » Tue Jan 17, 2006 1:59 am

Is it possible to load excel files?

RayBritton
Veteran
Posts: 109
Joined: Thu Feb 10, 2005 12:38 pm

Post by RayBritton » Tue Jan 17, 2006 2:24 am

it is possible but realistically no

Quibbler

Post by Quibbler » Tue Jan 17, 2006 9:16 am

Sure you can do it. I've done it and it works well. It's just a matter of reading the excel file and looking for delimiters. It's a bit of trial and error to get the data you're looking for here's my prog can't remember how it works think it was looking for chr$(64)+chr$(0)+chr$(0) as a three byte delimiter then the length of the data is in there somewhere seems like second byte after the "00 00 64"

Code: Select all

B$ = CHR$(0) + CHR$(0) + CHR$(0)
C$ = CHR$(64) + CHR$(0) + CHR$(0)
DEFLNG I
M = 99
OPEN "O", #2, "C:\outfile.txt"
OPEN "r", #1, "C:\infile.XLS", 1
FIELD #1, 1 AS AA$
FOR I = 1 TO 1000000
IF EOF(1) GOTO 20
GET #1, I
B$ = RIGHT$(B$, 2) + AA$
IF B$ <> C$ GOTO 100
GET #1, I + 1
I2 = ASC(AA$)
FOR I1 = I + 2 TO I + 1 + I2
GET #1, I1
D$ = D$ + AA$
NEXT I1
PRINT #2, D$; "  "
D$ = ""
100 NEXT I
20 CLOSE #1
CLOSE #2

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Re: loading excel files

Post by moneo » Tue Jan 17, 2006 2:01 pm

Anonymous wrote:Is it possible to load excel files?
If you try to load (read) Excel files directly into your QB program, you're asking for trouble.

A little "manual intervention" in advance from Excel will really simplify the problem, as follows:
1) Get into Excel and open the file.
2) Do a "Save as".
3) You can save the file as comma delimited, tab delimited or space delimited.
4) I always save it as Formatted Text (Space delimited) (*.prn), and it works fine.
5) Space delimited eliminates the need for parsing the text in QB to find comma or tab delimiters.
6) Now, open the *.prn file in QB and treat it just like a regular text file. View the file with an editor to determine in which position each field (column) starts, and insert these starting positions into your program as constants.
7) By viewing the text file first, you may discover additional problems that you'll need to handle. Excel files are not always created by professionals.
8] Sometimes the Excel file may have data that's wider than the column that it's in, in which case you'll have to make the column wider in Excel before saving the file.
9) The manual process of cleaning up the data first will save you a lot of grief later in your program.

Try it!
*****Moneo*****

Antoni
Veteran
Posts: 132
Joined: Wed Jun 15, 2005 3:01 pm
Contact:

Post by Antoni » Wed Jan 18, 2006 7:07 am

Two comments:
If you export the file as a cvs, and your windows regional settings are a comma as list separator, and nothing as thousands separator, you can import the each element in the table to QB by using the INPUT # statement in a loop.

And for correct formatting , sometimes is a good idea to copy the area you intend to export to a blank new sheet, then export as cvs from there, it could save you a lo of time in designing the reading program in QB.

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo » Thu Jan 19, 2006 2:53 pm

Antoni wrote:.....
And for correct formatting , sometimes is a good idea to copy the area you intend to export to a blank new sheet, then export as cvs from there, it could save you a lo of time in designing the reading program in QB.
An excellent suggestion to copy the area you intend to export to a blank new sheet. This allows you to make any necessary adjustments before exporting the file.

I personally prefer "space delimited" over "comma delimited" to avoid writing a parsing routine which sometimes can deviate too much time and debugging effort from the main program itself.
*****

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo » Fri Jan 20, 2006 8:07 pm

Warning re Space Delimited files.

I just remembered a problem I had about 3 years ago with a Excel file that was quite wide. When I converted it to a space delimited .prn file, for some strange reason it truncated each line at 240 characters. The characters from 241 on were placed at the end of the file, line by line.

I was able to get around the problem. but had to write a special little utility program to put the lines of the file together again.

Here's what Microsoft has to say about this at:
http://support.microsoft.com/kb/131554/en-us

"In Microsoft Excel, when you save a worksheet as a Formatted Text (Space Delimited) (.prn) file, characters beyond 240 are wrapped to the next line.
NOTE: If several rows on the same sheet contain text beyond 240 characters, the text begins wrapping at the row after the last row that contains text."

*****

User avatar
Quibbler
Coder
Posts: 16
Joined: Tue Jan 24, 2006 7:29 am
Location: Trinidad and Tobago

Post by Quibbler » Tue Jan 24, 2006 7:50 am

It's easier to read in the excel file as is particularly if you have a lot of them to deal with. I just used a HEX viewer to find the delimiters and the field length. Oh also I would never use excel it horrible to use and it discourages people from learning how to program - always write your own database program it may waste a bit of time at the begining but it will be more flexible.

Antoni
Veteran
Posts: 132
Joined: Wed Jun 15, 2005 3:01 pm
Contact:

Post by Antoni » Tue Jan 24, 2006 2:43 pm

90% of time Excel is used for it's great user input editing and cut and paste capabilities, and despite it's null input validating functions...

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo » Wed Jan 25, 2006 3:17 pm

Quibbler wrote:.....Oh also I would never use excel it horrible to use and it discourages people from learning how to program - always write your own database program it may waste a bit of time at the begining but it will be more flexible.
In general what you say here is true and valid. However, almost all users that resort to using Excel are not programmers, their job functions don't require that they know how to program, and therfore they are not interested in learning how to program.
Quibbler wrote:It's easier to read in the excel file as is particularly if you have a lot of them to deal with. I just used a HEX viewer to find the delimiters and the field length.
I assume here that you mean reading the .xls file as is in QB binary mode. I created a very simple Excel .xls file with 3 cells wide and 2 lines long. Then I used a hex editor to view the file. Holy cow! You have to be an expert in deciphering .xls files just to find the data!

The file has a bunch of header information which you have to wade through before you get to the data. I finally found the data by visual inspection. I can't figure out how I would find the start of the data from within a program. The "delimiters" between fields alse come in different flavors. Determining where the data finally ends is another challenge.

In summary, you perhaps are an expert on the Excel .xls file formats, but I would never recommend this approach to anyone else. These .xls files are native to Excel and not intended to be referenced by user programs. Different Excel versions may also introduce changes in the .xls file formats. That's why Excel offers so many standard export file options.
*****

RyanKelly
Coder
Posts: 48
Joined: Sun Jan 22, 2006 6:40 pm
Contact:

Post by RyanKelly » Wed Jan 25, 2006 7:44 pm

Hey, I don't know much about the excel file formats, but I can suggest a few techniques for deciphering an unknown file format. Most of these will seem obvious.

Create the simplest file with default settings.
Use this as your reference point.
Make a list of each file option and create a new file
for each option.
Write a simple program to compare each file byte by byte with your reference file and store the points of difference in a file that you can easily inspect.

Using your reference file, create alternative files with different data (say a single cell in a spread sheet set to a different value).
Again to a byte by byte $$$.

Using the information you've gained from this proccess, use a hex editor or a custom program to experimentally alter your reference file.

Start with intuitive ideas. Search for strings (in ASCII and Unicode).
Along the way you can look out for simple clues such as numerical values stores in byte, two byte, or four byte size location in both big endian and little endian form (certain tamper protection schemes invert the byte ordering of data).
Also look out for bit packing (several small values packed into one or two bytes).

Look out for indexes. If an identified piece of data show up in different places when you make changes to your reference file (say a string seems to move about), try to find a numerical value that changes by the difference in the file position. If the length of something changes, look for a value that changes as well.

Also note data that never changes, no matter what you do. These are often format identifiers. Look to checksum values and file length indicators (for example many wave players won't function if the length of the wave data isn't correct in the file header)

In most cases, file formats will succumb easily in this way. The real challenges come when data has been compressed.

This probably more effort then you are willing to expend on this project, and I'm sure you work out a simpler hack based on everyone elses suggestions, but if you have the time, cracking a file format can be a lot of fun.

User avatar
Patz QuickBASIC Creations
Veteran
Posts: 399
Joined: Wed Mar 02, 2005 9:01 pm
Location: Nashville, Tennessee
Contact:

Post by Patz QuickBASIC Creations » Thu Jan 26, 2006 6:30 pm

RyanKelly wrote:Hey, I don't know much about the excel file formats, but I can suggest a few techniques for deciphering an unknown file format. Most of these will seem obvious.

Create the simplest file with default settings.
Use this as your reference point.
Make a list of each file option and create a new file
for each option.
Write a simple program to compare each file byte by byte with your reference file and store the points of difference in a file that you can easily inspect.

Using your reference file, create alternative files with different data (say a single cell in a spread sheet set to a different value).
Again to a byte by byte $$$.

Using the information you've gained from this proccess, use a hex editor or a custom program to experimentally alter your reference file.

Start with intuitive ideas. Search for strings (in ASCII and Unicode).
Along the way you can look out for simple clues such as numerical values stores in byte, two byte, or four byte size location in both big endian and little endian form (certain tamper protection schemes invert the byte ordering of data).
Also look out for bit packing (several small values packed into one or two bytes).

Look out for indexes. If an identified piece of data show up in different places when you make changes to your reference file (say a string seems to move about), try to find a numerical value that changes by the difference in the file position. If the length of something changes, look for a value that changes as well.

Also note data that never changes, no matter what you do. These are often format identifiers. Look to checksum values and file length indicators (for example many wave players won't function if the length of the wave data isn't correct in the file header)

In most cases, file formats will succumb easily in this way. The real challenges come when data has been compressed.

This probably more effort then you are willing to expend on this project, and I'm sure you work out a simpler hack based on everyone elses suggestions, but if you have the time, cracking a file format can be a lot of fun.
Takes MUCHO time, which many people don't have. I suggest moneo's method, since he is usually right :)

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo » Fri Jan 27, 2006 3:25 pm

Patz QuickBasic Creations wrote:.....
Takes MUCHO time, which many people don't have. I suggest moneo's method, since he is usually right :)
Patz, thanks for your vote of confidence.

Not only will the mentioned approach take MUCHO time, but working on an "unknown" file format, you'll never know whether or not you programmed all the possible data variations and data content.
*****

RyanKelly
Coder
Posts: 48
Joined: Sun Jan 22, 2006 6:40 pm
Contact:

Post by RyanKelly » Sat Jan 28, 2006 3:15 am

Don't get me wrong. For serious work cracking a file format is probably a waste of time, but if your only in it for a personal challenge... I've never tackled anything as complex as a full spread sheet format, but for saved game files or level editing its usually not that much of a time investment.
Regardless, I think you can get pretty close to a full format specification with enough time and patience. It's an iterative process. You change one thing at a time.

moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo » Sat Jan 28, 2006 8:56 pm

Ryan 'ol buddy, you're beating a dead horse here. The original thread was about Excel files.

Given the circumstance of having a file of totally unknown format, the we would probably have to resort to something like your proposed approach.
*****
If you are ahead of me, lead.
If you are behind me, follow.
If you are not doing anything,
Get out of the way.

Post Reply