IonicWind Software

IWBasic => General Questions => Topic started by: Brian on October 16, 2011, 10:01:10 AM

Title: Sorting Files
Post by: Brian on October 16, 2011, 10:01:10 AM
Hi, just need a bit of prodding in the right direction here...

I have two files, one with 26000 records (lines), and one with 2300 records.
I have got them loaded into memory OK, using Fletchie's DynaStore functions

I want to extract to a listview (already working), the lines in the smaller file
which are not in the larger file. Some lines in the smaller file will exist in the
larger file, and I only want the different lines

Can anyone tell me the sequence of operations to do this, because it's
driving me mad at the moment!

Many thanks,

Brian
Title: Re: Sorting Files
Post by: LarryMc on October 16, 2011, 12:11:49 PM
If I understand you correctly this should do it for you.
The following takes each string in the small dyna file and compares it against each line in the big file.
If a match is found, the found flag is set and the BREAK ends the for y loop (no reason to keep looking once we've found a match).

At the exit of the for y loop we check whether we found a match (at that point we don't know whether we left the loop because of a match or because we went through all 26000 lines without finding one).
If we didn't find a match, we add the line to the listview.
We then go back to the top of the for x loop, reset the found flag, and get another string to compare.

LarryMc

int x,y,found
string search$,test$

for x=1 to 2300 'small file
    found = 0 'reset the flag for each new search string
    DynaGetStr(DynaStore1,x,search$)
    for y=1 to 26000 'big file
        DynaGetStr(DynaStore2,y,test$)
        if search$=test$
            found = 1
            break 'match found, stop scanning the big file
        endif
    next y
    if found = 0
        'add search$ to listview (your existing listview code goes here)
    endif
next x
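
For anyone who wants to try the logic outside IWBasic, here is a minimal Python sketch of the same nested loop. It is only an illustration: the function name lines_missing_from is made up, and plain Python lists stand in for the two DynaStore handles.

```python
def lines_missing_from(small_lines, big_lines):
    """Return the lines of small_lines that never appear in big_lines."""
    missing = []
    for search in small_lines:      # small file
        found = False               # reset the flag for each search string
        for test in big_lines:      # big file
            if search == test:
                found = True
                break               # match found, stop scanning the big file
        if not found:
            missing.append(search)  # stand-in for "add to listview"
    return missing
```

Order is preserved, so the result comes out in the same sequence as the lines of the smaller file.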
Title: Re: Sorting Files
Post by: billhsln on October 16, 2011, 10:20:19 PM
Ok, I know this may be a stupid question, but:

Why load the large file into memory? Just load the small file, then check it against each record of the larger file as you read it. I see no reason to keep the larger file in memory, unless you need it somewhere else further down in the code.

Bill
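
A rough Python sketch of this read-as-you-go idea, under assumptions: the name unmatched_lines is hypothetical, big_file is any object that yields one record per iteration (an open text file would do), and only the small file is held in memory.

```python
import io

def unmatched_lines(small_lines, big_file):
    """Scan big_file once and report small_lines entries it never contained."""
    found = [False] * len(small_lines)  # one flag per line of the small file
    for record in big_file:             # read the big file record by record
        record = record.rstrip("\r\n")  # drop the line ending before comparing
        for i, line in enumerate(small_lines):
            if line == record:
                found[i] = True
    return [line for i, line in enumerate(small_lines) if not found[i]]

# Usage with an in-memory stand-in for the big file:
big = io.StringIO("beta\nzeta\nalpha\n")
result = unmatched_lines(["alpha", "beta", "gamma"], big)
```

The big file is read only once here, but each record is still compared against every line of the small file, so the total number of comparisons is about the same as with both files in memory.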
Title: Re: Sorting Files
Post by: Brian on October 17, 2011, 02:20:18 AM
Larry,
Thanks very much - I haven't tested it thoroughly yet, but I can see it is going to work.
I need to set up some control data with numbers at the beginning of each line so that
I can visually see what output I get

Bill,
There is no reason to load the larger file, other than that the DynaStore loadfile function
loads extremely quickly, the searches in memory are much faster, and we're talking about
maybe a 3 MB file here to loop through. Don't forget, it has to loop through the larger
file once for every line in the smaller file to reach a result

Brian
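
To put a number on "larger file times the number of lines in the smaller file", here is a small Python sketch. The second function is not something proposed in this thread: it swaps in a hash-set lookup as an alternative, and both function names are made up for illustration.

```python
def worst_case_comparisons(small_count, big_count):
    """Upper bound on string comparisons for the nested-loop approach."""
    return small_count * big_count

def lines_missing_via_set(small_lines, big_lines):
    """Alternative technique: build a set of the big file once, then test each small line."""
    big_set = set(big_lines)  # one pass over the big file
    return [line for line in small_lines if line not in big_set]

total = worst_case_comparisons(2300, 26000)  # 59,800,000 comparisons at most
```

With the set, each line of the smaller file needs one lookup instead of a scan of up to 26000 records, at the cost of holding the set in memory.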
Title: Re: Sorting Files
Post by: billhsln on October 17, 2011, 10:29:31 AM
I would think that with I/O buffering the difference would be minimal, since most of the time would be spent in the loop after each read, going through the smaller file to see if it matches.

Bill