OUT OF STRING SPACE
Hi All!
New to QBasic. Running the QB included w/ W98|| on a W98|| system.
The quest:
Process fixed-width astronomical database files so that they load in Aladin.
I feel no attachment to QB, it just seems like a language I can handle. With a little help
from other topics in this forum, I produced a 15900-line 38-column CSV file. Great...
The source file runs to 2,501,314 lines (aka records) and 210 Bytes/line. CSVed
runs out of swap~:-(
ISTR something about loading large text files as binary to skate around LINE INPUT's limits...
The database is distributed as 30 files of at least 34000 lines each. They were put
into a directory and from DOS prompt:
COPY *.TXT I-280B.TXT
30 filenames slooowly scroll up... Done :-)
Thanks in Advance!
PS- The file at hand is The All-Sky Compiled Catalog of the 2.5 million Brightest Stars V3.
This box has 768MB RAM.
- burger2227
- Veteran
- Posts: 2466
- Joined: Mon Aug 21, 2006 12:40 am
- Location: Pittsburgh, PA
So what's the question?
Nice story, but lacking in detail.
Please acknowledge and thank members who answer your questions!
QB64 is a FREE QBasic compiler for WIN, MAC(OSX) and LINUX : https://www.qb64.org/forum/index.php
Get my Q-Basics demonstrator: https://www.dropbox.com/s/fdmgp91d6h8ps ... s.zip?dl=0
Sorry for the ambiguity.
Is there a workaround for the LINE INPUT string space problem?
Should the file be processed as BINARY?
----------------
Since that post, I tried cutting the file into 32000-line chunks (with SPLIT.EXE). Got
79 files, each 6,720,000 Bytes. Still too large:(
Even the 3,617,044 Byte file #79 was too large.
- burger2227
- Veteran
- Posts: 2466
- Joined: Mon Aug 21, 2006 12:40 am
- Location: Pittsburgh, PA
Why are you using LINE INPUT # to read a CSV file? Normally they are read using INPUT #.
I'm not sure how many variables you'd need; that depends on the number of values separated by commas. String and number type values can be mixed in one line of data, but each line of data should have the same data types in the same order.
I.e., read the file the same way WRITE #1 was used to make it.
WRITE #1, a, b, c, d, e, f, ....
INPUT #1, g, h, i, j, k, l, m...
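As a minimal sketch of that WRITE # / INPUT # pairing (the file name demo.csv and the variable names are made up for illustration, they are not from this thread), the following writes two records and reads them back:

```basic
' Write two comma-separated records, then read them back field by field.
' demo.csv, label$, mag! and n% are illustrative names only.
OPEN "demo.csv" FOR OUTPUT AS #1
WRITE #1, "first", 5.25, 12
WRITE #1, "second", 7.5, 3
CLOSE #1

OPEN "demo.csv" FOR INPUT AS #1
DO UNTIL EOF(1)
    ' The variable types must match the order used by WRITE #.
    INPUT #1, label$, mag!, n%
    PRINT label$, mag!, n%
LOOP
CLOSE #1
```

WRITE # puts quotation marks around strings and commas between values, which is what lets INPUT # split each line back into the same set of variables.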
What are you doing with the data? QB size and memory limitations will probably not allow you to store it all in arrays, so you would need to work with chunks of data rather than all of it at once. INPUT$(bytes, 1) can only read less than 32767 bytes at a time.
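A rough sketch of working in chunks, assuming the file is opened FOR BINARY and read with GET into a small buffer. The block size of 4096 and all variable names are assumptions chosen for illustration; the example only counts line feeds, it does not convert anything:

```basic
' Sketch: count the lines in a large file by reading it in blocks,
' so that only one small buffer is held in string space at a time.
' blockSize and the variable names are assumptions; adjust to suit.
CONST blockSize = 4096
DIM total AS LONG, done AS LONG, nLines AS LONG
DIM n AS INTEGER, p AS INTEGER

OPEN "c:\windows\desktop\str1.txt" FOR BINARY AS #1
total = LOF(1)
DO WHILE done < total
    ' The last block may be shorter than blockSize.
    IF total - done < blockSize THEN n = total - done ELSE n = blockSize
    buf$ = SPACE$(n)
    GET #1, , buf$              ' fills buf$ with the next n bytes
    done = done + n

    ' Count the line feeds (CHR$(10)) in this block.
    p = INSTR(buf$, CHR$(10))
    DO WHILE p > 0
        nLines = nLines + 1
        p = INSTR(p + 1, buf$, CHR$(10))
    LOOP
LOOP
CLOSE #1
PRINT nLines; "line feeds found"
```

Records can straddle a block boundary, so splitting fields out of blocks like this needs extra bookkeeping. For a straight line-by-line conversion, a LINE INPUT loop that handles one record at a time is simpler.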
QB64 is the same as QB, but does not have those size and memory limitations. LINK is in my signature. It's written in C code for newer machines. Not 98's.
Ted
Thanks for the help :)
My main reference is a book called 'QBasic by Example'. It shows how INPUT can be used to
read the fields from a CSV.
I'm trying to *make* a CSV from a very large fixed-width file. LINE INPUT can do that, but it seems to choke when there are too many lines in the source file. Here's the code:
CLS
col1 = 12
col2 = 12
col3 = 4
col4 = 4
col5 = 6
col6 = 5
col7 = 7
col8 = 7
col9 = 5
col10 = 5
col11 = 5
col12 = 5
col13 = 4
col14 = 4
col15 = 4
col16 = 1
col17 = 1
col18 = 1
col19 = 1
col20 = 2
col21 = 1
col22 = 1
col23 = 1
col24 = 1
col25 = 20
col26 = 4
col27 = 5
col28 = 1
col29 = 6
col30 = 6
col31 = 8
col32 = 7
col33 = 5
col34 = 4
col35 = 5
col36 = 4
col37 = 5
col38 = 4
'OPEN "c:\280b\i280b.txt" FOR INPUT AS #1
OPEN "c:\windows\desktop\str1.txt" FOR INPUT AS #1
DO
OPEN "c:\windows\desktop\catalog.txt" FOR APPEND AS #2
LINE INPUT #1, jn$
PRINT #2, MID$(jn$, 1, col1) + "," + MID$(jn$, 14, col2) + "," + MID$(jn$, 27, col3) + "," + MID$(jn$, 32, col4) + "," + MID$(jn$, 37, col5) + "," + MID$(jn$, 44, col6) + "," + MID$(jn$, 50, col7) + "," + MID$(jn$, 58, col8) + "," + MID$(jn$, 66, col9) + "," + MID$(jn$, 72, col10) + "," + MID$(jn$, 78, col11) + "," + MID$(jn$, 84, col12) + "," + MID$(jn$, 90, col13) + "," + MID$(jn$, 95, col14) + "," + MID$(jn$, 100, col15) + "," + MID$(jn$, 105, col16) + "," + MID$(jn$, 106, col17) + "," + MID$(jn$, 107, col18) + "," + MID$(jn$, 108, col19) + "," + MID$(jn$, 109, col20) + "," + MID$(jn$, 111, col21) + "," + MID$(jn$, 112, col22) + "," + MID$(jn$, 113, col23) + "," + MID$(jn$, 114, col24) + "," + MID$(jn$, 116, col25) + "," + MID$(jn$, 137, col26) + "," + MID$(jn$, 141, col27) + "," + MID$(jn$, 146, col28) + "," + MID$(jn$, 148, col29) + "," + MID$(jn$, 155, col30) + "," + MID$(jn$, 162, col31) + "," + MID$(jn$, 171, col32) + "," + MID$(jn$, 179, col33) + "," + MID$(jn$, 185, col34) + "," + MID$(jn$, 190, col35) + "," + MID$(jn$, 196, col36) + "," + MID$(jn$, 201, col37) + "," + MID$(jn$, 207, col38)
CLOSE #2
LOOP UNTIL EOF(1)
CLOSE #1
PRINT
PRINT "Done"
---------------------
The colX = n assignments hold the desired column widths.
This code works to produce a 15900 line (aka record) CSV file... but not a
2501314 line CSV.
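One way to narrow down where a run like this fails is to read the source file without writing anything and watch the free string space as it goes. The sketch below is an assumption about how that check might look, not something taken from the program above; the 1000-line reporting interval is arbitrary:

```basic
' Diagnostic sketch: read the file without writing anything, and report
' progress and free string space every 1000 lines.
DIM count AS LONG
OPEN "c:\windows\desktop\str1.txt" FOR INPUT AS #1
DO UNTIL EOF(1)
    LINE INPUT #1, jn$
    count = count + 1
    IF count MOD 1000 = 0 THEN PRINT count; "lines read, free string space:"; FRE("")
LOOP
CLOSE #1
PRINT count; "lines read in total"
```

If this loop runs to the end of the file, the read side is probably not the problem. If the reported free space keeps shrinking, or the run stops at a particular count, that count shows roughly which record to inspect.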
- burger2227
- Veteran
- Posts: 2466
- Joined: Mon Aug 21, 2006 12:40 am
- Location: Pittsburgh, PA
Are you sure that LINE INPUT is the culprit? Did you try running just LINE INPUT # without any PRINT #?
Why are you opening and closing #2 on every pass through the loop? That can't be good.
When you start numbering variable names, it's time to think about using an array. Create an array to hold the number of characters needed in each colX.
Code: Select all
DIM SHARED col(38) AS INTEGER ' SHARED makes the array visible in any SUB without passing it as a parameter
col(1) = 12
col(2) = 12
col(3) = 4
etc.
Do you know how to make SUB programs? Put the PRINT #2 code into a SUB. Place parentheses around the col numbers, as shown below, because you are reading the array now:
Code: Select all
SUB CreateCSV (jn$)
PRINT #2, MID$(jn$, 1, col(1)) + "," + MID$(jn$, 14, col(2)) + "," + MID$(jn$, 27, col(3)) + "," + MID$(jn$, 32, col(4)) + "," + MID$(jn$, 37, col(5)) + "," + MID$(jn$, 44, col(6)) + "," + MID$(jn$, 50, col(7)) + "," + MID$(jn$, 58, col(8)) + "," + MID$(jn$, 66, col(9)) + "," + MID$(jn$, 72, col(10)) + "," + MID$(jn$, 78, col(11)) + "," + MID$(jn$, 84, col(12)) + "," + MID$(jn$, 90, col(13)) + "," + MID$(jn$, 95, col(14)) + "," + MID$(jn$, 100, col(15)) + "," + MID$(jn$, 105, col(16)) + "," + MID$(jn$, 106, col(17)) + "," + MID$(jn$, 107, col(18)) + "," + MID$(jn$, 108, col(19)) + "," + MID$(jn$, 109, col(20)) + "," + MID$(jn$, 111, col(21)) + "," + MID$(jn$, 112, col(22)) + "," + MID$(jn$, 113, col(23)) + "," + MID$(jn$, 114, col(24)) + "," + MID$(jn$, 116, col(25)) + "," + MID$(jn$, 137, col(26)) + "," + MID$(jn$, 141, col(27)) + "," + MID$(jn$, 146, col(28)) + "," + MID$(jn$, 148, col(29)) + "," + MID$(jn$, 155, col(30)) + "," + MID$(jn$, 162, col(31)) + "," + MID$(jn$, 171, col(32)) + "," + MID$(jn$, 179, col(33)) + "," + MID$(jn$, 185, col(34)) + "," + MID$(jn$, 190, col(35)) + "," + MID$(jn$, 196, col(36)) + "," + MID$(jn$, 201, col(37)) + "," + MID$(jn$, 207, col(38))
END SUB
Place the SUB code after the main program code or create it from the Edit menu. The Edit menu has New SUB...; just type a name in the box to make one.
Now instead of the PRINT # code in the loop, place the SUB call after the LINE INPUT.
Code: Select all
OPEN "c:\windows\desktop\catalog.txt" FOR APPEND AS #2 ' NOT in the loop!
DO UNTIL EOF(1) ' you cannot read it if it is empty
LINE INPUT #1, text$
CALL CreateCSV (text$) ' use a different variable name, the SUB won't care
LOOP ' using EOF here might cause an error because an empty file would be read once
CLOSE ' closes all files!
PRINT
PRINT "Done"
END ' or SYSTEM to close the program. Place the SUB code after this line.
SUB calls can help prevent string errors like "Out of String Space" because every time a SUB is called, its local variables start out new.
If parsing the string values does not work, then just use INPUT # and try reading it as a CSV file. You'll need a long statement, but ANY type of values can be read directly. The code below assumes that the data is all one numerical or string type! If they are a mixture, just use appropriate variable types without an array. Put this in place of the loop:
Code: Select all
DIM item(38) AS STRING ' or INTEGER, LONG, SINGLE, DOUBLE, whichever matches the data
DO UNTIL EOF(1)
INPUT #1, item(1), item(2), item(3)......item(38)
WRITE #2, item(1), item(2), item(3)....item(38)
LOOP
If this doesn't work, then QB can't work with files of that size. Try QB64 on a newer machine.
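Taking the array idea in this post one step further, the start positions could go into an array as well, so that the SUB loops over the fields instead of spelling out every MID$. This is only a sketch under stated assumptions: startPos, wid, nFields and WriteCsvLine are hypothetical names, only the first three fields of the original program are listed, and the output file is opened FOR OUTPUT (which overwrites it) rather than FOR APPEND:

```basic
' Sketch: field start positions and widths are held in arrays filled from DATA.
' Only the first three fields are listed; extend nFields and the DATA line
' for the rest. All names here are illustrative, not from the thread.
DECLARE SUB WriteCsvLine (rec$)
CONST nFields = 3
DIM SHARED startPos(1 TO nFields) AS INTEGER
DIM SHARED wid(1 TO nFields) AS INTEGER

FOR i = 1 TO nFields
    READ startPos(i), wid(i)
NEXT i
DATA 1, 12, 14, 12, 27, 4

OPEN "c:\windows\desktop\str1.txt" FOR INPUT AS #1
OPEN "c:\windows\desktop\catalog.txt" FOR OUTPUT AS #2   ' opened once, outside the loop
DO UNTIL EOF(1)
    LINE INPUT #1, rec$
    WriteCsvLine rec$
LOOP
CLOSE
PRINT "Done"
END

SUB WriteCsvLine (rec$)
    ' Write each field as it is cut out, with commas between fields.
    FOR i = 1 TO nFields
        PRINT #2, MID$(rec$, startPos(i), wid(i));
        IF i < nFields THEN PRINT #2, ",";
    NEXT i
    PRINT #2, ""    ' end the output line
END SUB
```

Writing each field with its own PRINT # avoids building one long concatenated string per record, and adding or correcting a field becomes a matter of editing the DATA line.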
Last edited by burger2227 on Sun Sep 26, 2010 5:49 pm, edited 1 time in total.
I couldn't really get this to work properly.
Some backup files seem to run into an error, and I don't know why.
- burger2227
- Veteran
- Posts: 2466
- Joined: Mon Aug 21, 2006 12:40 am
- Location: Pittsburgh, PA
Post your code and list the errors. You can get many errors with files.
When an error occurs, note the line of code it stops at. It will almost always point at the error!