Title: Sorting Files
Post by: Brian on October 16, 2011, 10:01:10 AM

Hi, just need a bit of prodding in the right direction here...

I have two files, one with 26000 records (lines), and one with 2300 records.
I have got them loaded into memory OK, using Fletchie's DynaStore functions

I want to extract to a listview (already working) the lines in the smaller file
that are not in the larger file. Some lines in the smaller file will also exist in the
larger file, and I only want the ones that don't.

Can anyone tell me the sequence of operations to do this, because it's
driving me mad at the moment!

Many thanks,

Brian

Title: Re: Sorting Files
Post by: LarryMc on October 16, 2011, 12:11:49 PM

If I understand you correctly this should do it for you.
The following takes each string in the small dyna file and compares it against each of the lines in the big file.
If a match is found, the found flag is set and the BREAK ends the for y loop (no reason to keep looking once we've found a match).

At the exit of the for y loop we check whether we found a match. (At that point we don't know whether we got there because we found a match or because we went through all 26000 lines without finding one.)
If we didn't find a match, we add the line to the listview.
We then go back to the top of the for x loop, reset the found flag and get another string to compare.

LarryMc

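int x,y,found
string search$,test$

for x=1 to 2300		'each line of the small file
	found = 0		'reset the flag for this line
	DynaGetStr(DynaStore1,x,search$)
	for y=1 to 26000	'each line of the big file
		DynaGetStr(DynaStore2,y,test$)
		if search$=test$
			found = 1
			break	'match found, stop scanning the big file
		endif
	next y
	if found = 0
		'no match in the big file: add search$ to the listview here
	endif
next x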
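
For a sense of scale, the nested loops above make up to 2300 x 26000 = 59,800,000 string comparisons in the worst case. A common alternative (not what Larry posted) is to put the big file's lines into a hash set first, so each lookup takes roughly constant time on average. The sketch below is in Python rather than IWBasic and does not use DynaStore; the name `lines_only_in_small` is made up for illustration.

```python
def lines_only_in_small(small_lines, big_lines):
    """Return the lines of small_lines that never appear in big_lines.

    Order is preserved, and a line repeated in the small file is reported
    once per occurrence, the same as the nested-loop version.
    """
    big_set = set(big_lines)  # one pass over the big file
    return [line for line in small_lines if line not in big_set]


if __name__ == "__main__":
    small = ["apple", "banana", "cherry"]
    big = ["banana", "date", "apple", "elderberry"]
    for line in lines_only_in_small(small, big):
        print(line)  # these are the lines that would go into the listview
```

Whether this is worth the trouble in IWBasic depends on having some kind of hash or dictionary container to hand; at these file sizes the plain nested loop may well be fast enough.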

Title: Re: Sorting Files
Post by: billhsln on October 16, 2011, 10:20:19 PM

OK, I know this may be a stupid question, but:

Why load the large file into memory at all? Just load the small file, then check each record of the larger file against it as you read it. I see no reason to keep the larger file in memory unless you need it somewhere else further down in the code.

Bill
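
Bill's suggestion could look roughly like the sketch below: only the small file is held in memory, and the big file is read one record at a time. It is in Python, not IWBasic, and the name `unmatched_after_streaming` is hypothetical. It also assumes that a line repeated in the small file only needs to be reported once, which differs slightly from the nested-loop code.

```python
import io


def unmatched_after_streaming(small_lines, big_stream):
    """Keep only the small file in memory; read the big file line by line."""
    # dict keeps insertion order, so results come back in small-file order
    remaining = dict.fromkeys(small_lines)
    for raw in big_stream:
        # drop a candidate as soon as it turns up in the big file
        remaining.pop(raw.rstrip("\r\n"), None)
        if not remaining:
            break  # every small line has been matched, nothing left to report
    return list(remaining)


if __name__ == "__main__":
    small = ["apple", "banana", "cherry"]
    big_file = io.StringIO("banana\ndate\napple\nelderberry\n")  # stands in for the real file
    print(unmatched_after_streaming(small, big_file))
```

With a real file, the stream would be an opened file object instead of the `io.StringIO` stand-in.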

Title: Re: Sorting Files
Post by: Brian on October 17, 2011, 02:20:18 AM

Larry,
Thanks very much - I haven't tested it thoroughly yet, but I can see it is going to work.
I need to set up some control data with numbers at the beginning of each line so that
I can visually see what output I get

Bill,
There is no reason to load the largest file, other than that the DynaStore loadfile function
loads extremely quickly, the searches in memory are much faster, and we're talking about
maybe a 3 MB file here to loop through. Don't forget, it has to loop through the larger
file once for every line in the smaller file to reach a result.

Brian
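
One way to build the kind of numbered control data Brian describes is sketched below, so that the expected listview contents are known in advance. This is Python rather than IWBasic, and the function name, the record text and the counts are all invented for illustration.

```python
def make_control_data(big_count, small_count, overlap):
    """Build numbered test lines where the correct answer is known up front.

    The first `overlap` lines of the small file are copied from the big file;
    the rest carry numbers that the big file does not contain.
    """
    big = [f"{i:05d} record" for i in range(1, big_count + 1)]
    extra = [
        f"{i:05d} record"
        for i in range(big_count + 1, big_count + 1 + small_count - overlap)
    ]
    small = big[:overlap] + extra
    expected = extra  # exactly the lines that should end up in the listview
    return big, small, expected


if __name__ == "__main__":
    big, small, expected = make_control_data(big_count=10, small_count=4, overlap=1)
    print(small)
    print(expected)
```

Because every line starts with its number, it is easy to see at a glance whether the output contains only the numbers above the big file's range.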

Title: Re: Sorting Files
Post by: billhsln on October 17, 2011, 10:29:31 AM

I would think that with I/O buffering the difference would be minimal, since most of the time would be taken by the loop that runs after each read, going through the smaller file to see if anything matches.

Bill

IonicWind Software

IWBasic => General Questions => Topic started by: Brian on October 16, 2011, 10:01:10 AM