Parsing a line

Started by Rock Ridge Farm (Larry), January 11, 2006, 05:06:08 AM

Rock Ridge Farm (Larry)

In my old days programming in C I had a routine that would parse a line of delimited data.
The same routine is not working in 'A'.
I have a line -> "item1|item2|item3|item4" I want to extract the individual data into
variables a,b,c,d such that a ="item1", b = "item2" ........
Can someone show me how to do this?

Zen

I cannot give you an example, as I am not sure exactly how I would do it myself. I can, however, point you in a direction that I think is good.

You can use the instr function to search for the | character, and then use a linked list to store the values that are separated by it.

Here is a description of the instr command...
Quote
unsigned int = instr(string target,string search,OPT unsigned int pos = 1);

Description
Finds the first occurrence of a search string in a target string.

Parameters
target - String to perform the search on.
search - String to search for.
pos - Optional ones based starting position to begin the search.

Return value
Returns the ones based position of the first occurrence of the search string in the target string. Returns 0 if the search string could not be found.

Example usage
Test = "A string to search for something";
writeln(instr(Test,"search"));

Lewis
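
For readers coming from C, as Larry is, the quoted description can be modelled roughly as follows. This is only a sketch: instr1 is a hypothetical name, and returning 0 for an out-of-range start position is an assumption, since the quoted documentation does not say what instr does in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C model of instr as described in the quote: returns the
   ones-based position of `search` in `target`, scanning from the
   ones-based position `pos`, or 0 if it is not found. */
static unsigned int instr1(const char *target, const char *search,
                           unsigned int pos)
{
    assert(target != NULL && search != NULL);
    size_t len = strlen(target);
    if (pos == 0 || pos > len)
        return 0;                       /* assumption: bad start means "not found" */
    const char *hit = strstr(target + (pos - 1), search);
    return hit ? (unsigned int)(hit - target) + 1 : 0;
}
```

With the line from the first post, "item1|item2|item3|item4", this model reports the delimiters at positions 6, 12 and 18, and returns 0 once the search starts past the last one.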

Rock Ridge Farm (Larry)

That would serve to find the delimiter, but how would you break the individual
components out of the line? In C I could use a pointer assigned to the start of the string,
find the delimiter, assign a null byte at that point, copy the value, then increment the pointer and move along. In 'A' I am having issues with declaring and assigning pointers.

Zen

You can do it with the mid$ function...

Quote
string = mid$(string str,int start,OPT int count=-1);

Description
Extracts zero or more characters from any position in the specified input string.

Parameters
str - The input string to extract characters from.
start - Ones based starting position of the extraction.
count - Optional. Number of characters to extract. If omitted then all characters from the starting position to the end of the string are included in the extracted substring.

Return value
A copy of the extracted substring.

Example usage
Test = "This is a test";

if(mid$(Test,6,2) = "is") {
   writeln("cool");
}

writeln(mid$(Test,9));

Lewis
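
The point that matters for parsing is that the third argument is a character count, not an end position. A rough C model of the quoted description, with mid1 as a hypothetical name and a caller-supplied output buffer standing in for the returned string copy:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C model of mid$: copies `count` characters of `str`,
   beginning at the ones-based position `start`, into `out` (capacity
   `outsz`). A negative count means "to the end of the string".
   Assumption: an out-of-range start yields an empty string. */
static char *mid1(char *out, size_t outsz, const char *str,
                  int start, int count)
{
    assert(out != NULL && outsz > 0 && str != NULL);
    size_t len = strlen(str);
    out[0] = '\0';
    if (start < 1 || (size_t)start > len)
        return out;
    size_t avail = len - (size_t)(start - 1);   /* characters left from start */
    size_t n = (count < 0 || (size_t)count > avail) ? avail : (size_t)count;
    if (n > outsz - 1)
        n = outsz - 1;                          /* truncate to fit the buffer */
    memcpy(out, str + (start - 1), n);
    out[n] = '\0';
    return out;
}
```

Applied to the quoted example string "This is a test", a start of 6 with a count of 2 gives "is", and a start of 9 with the count omitted (modelled here as -1) gives "a test".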

Rock Ridge Farm (Larry)

Here is the routine in 'C':

char *
mktok(p)
register char *p;
{
        while(*p && *p != '|' && *p != '\n')
                ++p;
        if(*p == '\n')
                *p = '\0';
        else if(*p)
                *p++ = '\0';
        return(p);
}

How can this be converted to 'A'?
The way it was used in 'C':
      Typical calling segment:

              char *c;

              c = fgets(obuf,512,empf);
              if(c == NULL)
                      return(0);
              for(i=0;i<5;i++){
                      fptrs[i] = c;
                      c = mktok(c);
              }

This will fill fptrs[0] - fptrs[4] with data items 1 - 5.


I am attempting it with the mid$ function but would really like to use something
like the above.
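
For reference, here is the same routine restated in ANSI C together with a small driver, so the in-place technique can be tried on its own. The split_fields helper and its parameters are hypothetical and not part of the original program. Unlike the fixed five-pass loop above, it stops when the line runs out, which also means a trailing empty field is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* mktok from the post above, with an ANSI prototype: terminates the
   current field in place and returns a pointer to the next one. */
static char *mktok(char *p)
{
    while (*p && *p != '|' && *p != '\n')
        ++p;
    if (*p == '\n')
        *p = '\0';          /* end of line: terminate, do not advance */
    else if (*p)
        *p++ = '\0';        /* delimiter: terminate and step past it */
    return p;
}

/* Hypothetical helper: splits `buf` in place and stores up to `max`
   field pointers in `fptrs`. Returns the number of fields found. */
static int split_fields(char *buf, char *fptrs[], int max)
{
    assert(buf != NULL && fptrs != NULL);
    int n = 0;
    char *c = buf;
    while (n < max && *c) {
        fptrs[n++] = c;
        c = mktok(c);
    }
    return n;
}
```

Called on a writable buffer holding "item1|item2|item3|item4\n", split_fields returns 4 and leaves fptrs[0] through fptrs[3] pointing at the four items. The buffer must be writable because the delimiters are overwritten with null bytes.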

Rock Ridge Farm (Larry)

I wrote the subroutine as follows - but it still does not work - help!
SUB DoParse(ClistView *Plist){
   string tbfr;
   unsigned int pos;
   unsigned int curpos;

   curpos = 1;
   pos = instr(inln,"|",curpos);
   pos--;
   pList->InsertItem(curln,mid$(inln,curpos,pos));
   curpos = curpos + pos + 1;
   pos = instr(inln,"|",curpos);
   pos--;
   pList->SetItemText(curln,1,mid$(inln,curpos,pos));
   curpos = curpos + pos + 1;
   pos = instr(inln,"|",curpos);
   pos--;
   pList->SetItemText(curln,2,mid$(inln,curpos,pos));
   curpos = curpos + pos + 1;
   pos = instr(inln,"|",curpos);
   pos--;
   pList->SetItemText(curln,3,mid$(inln,curpos,pos));
   curln++;
   return;
}

Parker

What doesn't work about it? Are there compiler errors or does it just not display right? What is the sample output from a list? I always found it odd how INSTR and MID$ require a one-based position while arrays are zero based, so you may want to change the parameters a little to see if it works. I always have to go to the IBPro docs to look it up.

Rock Ridge Farm (Larry)

Compiles and runs - just does not load any data.

Parker

A possibility is that the 'inln' string isn't declared in the right scope. But I'm not sure what could be happening. Try defining that string inside the function to see if it works there, then at least we'll know if that's the problem.

Rock Ridge Farm (Larry)

I got it to partially work with the code above - first field parses ok - 2,3,4 do not.
It seems that:
   curpos = curpos + pos;
   pos = instr(inln,"|",curpos);
   pList->SetItemText(curln,1,mid$(inln,curpos,pos - 1));
does not return the second field. The value of curpos is OK, but pos is wrong.
Any ideas?
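
A likely explanation, assuming mid$ takes a character count as its third argument (as the description quoted earlier says): pos - 1 is only the right count when curpos is 1, which is why the first field parses and the rest do not. In general the count is pos - curpos, and the next field starts at pos + 1 rather than at curpos + pos. The arithmetic can be checked with a small C sketch, where field_count is a hypothetical helper and all positions are ones-based:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given the ones-based position `curpos` of the
   first character of a field, returns how many characters the field
   has, i.e. the count that mid$ would need. */
static size_t field_count(const char *line, size_t curpos)
{
    assert(line != NULL && curpos >= 1 && curpos <= strlen(line) + 1);
    const char *start = line + (curpos - 1);
    const char *bar = strchr(start, '|');
    /* ones-based position of the delimiter, or one past the end */
    size_t pos = bar ? (size_t)(bar - line) + 1 : strlen(line) + 1;
    return pos - curpos;    /* a count, not pos - 1 */
}
```

For "item1|item2|item3|item4" the second field starts at position 7 and its delimiter is at position 12, so the count is 12 - 7 = 5, whereas pos - 1 would ask for 11 characters.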

Rock Ridge Farm (Larry)

I got it to work - not a generic method but it works.


SUB DoParseSys(ClistView *Plist){
unsigned int pos;
unsigned int curpos;

curpos = 1;
pos = instr(sysbuf,"|",curpos);
pList->InsertItem(syscnt,mid$(sysbuf,curpos,pos - 1));
sysdat[syscnt].systemname = mid$(sysbuf,curpos,pos - 1);
curpos = pos + 1;
pos = instr(sysbuf,"|",curpos);
pList->SetItemText(syscnt,1,mid$(sysbuf,curpos,pos - curpos));
sysdat[syscnt].location = mid$(sysbuf,curpos,pos - curpos);
curpos = pos + 1;
pos = instr(sysbuf,"|",curpos);
pList->SetItemText(syscnt,2,mid$(sysbuf,curpos,pos - curpos));
sysdat[syscnt].certified = mid$(sysbuf,curpos,pos - curpos);
curpos = pos + 1;
pos = instr(sysbuf,"|",curpos);
pList->SetItemText(syscnt,3,mid$(sysbuf,curpos,pos - curpos));
sysdat[syscnt].ip = mid$(sysbuf,curpos,pos - curpos);
curpos = pos + 1;
pos = instr(sysbuf,"|",curpos);
pList->SetItemText(syscnt,4,mid$(sysbuf,curpos,pos - curpos));
sysdat[syscnt].os = mid$(sysbuf,curpos,pos - curpos);
syscnt++;
return;
}
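
The repeated find, extract, advance steps above can be folded into a loop to get a more generic version. The following is a sketch in C rather than 'A'; split_line and FIELD_MAX are hypothetical names, and the fields are copied into fixed-size buffers instead of being written to the list view.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64    /* hypothetical per-field buffer size */

/* Copies up to `max` '|'-delimited fields of `line` into `fields`,
   leaving `line` untouched. Over-long fields are truncated. Returns
   the number of fields stored. */
static int split_line(const char *line, char fields[][FIELD_MAX], int max)
{
    assert(line != NULL && fields != NULL);
    int n = 0;
    const char *cur = line;
    while (n < max) {
        size_t len = strcspn(cur, "|\n");       /* length of this field */
        size_t copy = len < FIELD_MAX - 1 ? len : FIELD_MAX - 1;
        memcpy(fields[n], cur, copy);
        fields[n][copy] = '\0';
        n++;
        if (cur[len] != '|')
            break;                              /* newline or end of string */
        cur += len + 1;                         /* step past the delimiter */
    }
    return n;
}
```

Each pass corresponds to one instr / mid$ / curpos = pos + 1 group in the routine above, so the same loop shape should carry over to 'A' with the field index driving SetItemText.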


Parker

I never really got the hang of parsing with INSTR. Maybe it's because INSTR is 1 based and arrays are 0 based. But I always end up writing my own tokenizing subroutine whenever I need it.