Sorting Files

Started by Brian, October 16, 2011, 10:01:10 AM

Brian

Hi, just need a bit of prodding in the right direction here...

I have two files, one with 26000 records (lines), and one with 2300 records.
I have got them loaded into memory OK, using Fletchie's DynaStore functions.

I want to extract to a listview (already working) the lines in the smaller file
which are not in the larger file. Some lines in the smaller file will exist in the
larger file, and I only want the ones that don't.

Can anyone tell me the sequence of operations to do this, because it's
driving me mad at the moment!

Many thanks,

Brian

LarryMc

If I understand you correctly, this should do it for you.
The following takes each string in the small dyna file and compares it against each line in the big file.
If a match is found, the found flag is set and the BREAK ends the for y loop (no reason to keep looking once we've found a match).

At the exit of the for y loop we check the found flag, because at that point we don't know whether we left the loop on a match or went through all 26000 lines without finding one.
If we didn't find a match, we add the line to the listview.
We then go back to the top of the for x loop, reset the found flag and get another string to compare.

LarryMc

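int x,y,found
string search$,test$

for x=1 to 2300 'small file
    found = 0 'reset the flag for each new search string
    DynaGetStr(DynaStore1,x,search$)
    for y=1 to 26000 'big file
        DynaGetStr(DynaStore2,y,test$)
        if search$=test$
            found = 1
            break 'stop scanning the big file once a match is found
        endif
    next y
    if found = 0
        'no match anywhere in the big file: add search$ to the listview here
    endif
next x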
LarryMc
Larry McCaughn :)
Author of IWB+, Custom Button Designer library, Custom Chart Designer library, Snippet Manager, IWGrid control library, LM_Image control library

billhsln

OK, I know this may be a stupid question, but:

Why load the large file into memory? Just load the small file, then check it against each line of the larger file as you read each record. I see no reason to keep the larger file in memory, unless you need it somewhere else further down in the code.
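
A rough sketch of how that could look, in case it helps. The one extra step is that a small-file line can only be reported as missing after the whole big file has been read, so each small-file line needs a "seen" marker. The file name "big.txt", the seen[] array and the ok flag are made up for illustration. DynaStore1 is assumed to already hold the 2300 small-file lines, and DynaGetStr is used exactly as in Larry's code. OPENFILE, READ, EOF and CLOSEFILE are assumed to behave as described in the IWBasic help (0 returned on success). Untested, so treat it as a starting point only.

```basic
' Sketch only: "big.txt", seen[] and ok are assumptions for illustration.
int x, ok
def seen[2300] as int      ' arrays are zero-based, DynaStore lines start at 1
def bigFile as file
string search$, test$

' nothing in the small file has been matched yet
for x = 1 to 2300
    seen[x-1] = 0
next x

if openfile(bigFile, "big.txt", "R") = 0
    ok = 1
    ' read the big file one line at a time instead of storing it
    while (eof(bigFile) = 0) & (ok = 1)
        if read(bigFile, test$) = 0
            ' mark every small-file line that equals this big-file line
            for x = 1 to 2300
                if seen[x-1] = 0
                    DynaGetStr(DynaStore1, x, search$)
                    if search$ = test$
                        seen[x-1] = 1
                    endif
                endif
            next x
        else
            ok = 0         ' read error: stop instead of looping forever
        endif
    endwhile
    closefile bigFile
endif

' whatever is still unmarked never appeared in the big file
for x = 1 to 2300
    if seen[x-1] = 0
        DynaGetStr(DynaStore1, x, search$)
        ' add search$ to the listview here
    endif
next x
```

The inner loop does not stop at the first match, so a line that appears more than once in the small file gets marked every time. The number of string comparisons is about the same as with both files in memory. Only the big file's storage is saved.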

Bill
When all else fails, get a bigger hammer.

Brian

Larry,
Thanks very much - I haven't tested it thoroughly yet, but I can see it is going to work.
I need to set up some control data with numbers at the beginning of each line so that
I can see at a glance what output I get.

Bill,
There is no reason to load the larger file, other than that the DynaStore loadfile function
loads extremely quickly, the searches in memory are much faster, and we're talking about
maybe a 3 MB file to loop through. Don't forget, it has to loop through the larger
file once for every line in the smaller file to reach a result: up to
2300 x 26000 = 59,800,000 comparisons.

Brian

billhsln

I would think that with I/O buffering the difference would be minimal, since most of the time would be taken by the loop that runs after each read, going through the smaller file to see if it matches.

Bill
When all else fails, get a bigger hammer.