IonicWind Software

IWBasic => Database => Topic started by: Jim Scott on December 30, 2006, 07:44:13 PM

Title: Continuation of Connecting to a CSV file
Post by: Jim Scott on December 30, 2006, 07:44:13 PM
The following code connects to the data without the schema.ini file.
Def pdb as Pointer
Def hstmt as int
def error as string
Def NumOfCols as Int
def CLIENT_TYPE as string
def LAST_NAME as string
def FIRST_NAME as string
def SITE_LOC as string
def MEMBER_NUM as string
def CHECKIN as DBDATE
def CHECKOUT as DBDATE
def UNIT_NUM as string

OPENCONSOLE
pdb=dbConnect("Microsoft Text Driver (*.txt; *.csv)",GetStartPath,"")
If pdb <> Null
Print "Connection To Database Established"
hstmt = dbExecSQL(pdb,"SELECT * FROM SmallFile.csv")
error = dbGetErrorCode(hstmt)

If Len(error) > 0
Print
Print "Error Code: ", error
Print "Error Text: ", dbGetErrorText(hstmt)
Print
EndIf

If hstmt <> Null
NumOfCols = dbGetNumCols(hstmt)
Print "Number of Columns in Dataset = ", NumOfCols
dbBindVariable(hstmt,1,Client_Type)
dbBindVariable(hstmt,2,Last_Name)
dbBindVariable(hstmt,3,First_Name)
dbBindVariable(hstmt,4,Site_Loc)
dbBindVariable(hstmt,5,Member_Num)
dbBindDate(hstmt,6,CheckIn)
dbBindDate(hstmt,7,CheckOut)
dbBindVariable(hstmt,8,Unit_Num)

While dbGetNext(hstmt) <> Null
print Len(Client_Type)," ,", Len(Last_Name)," ,", Len(First_Name),_
," ,", Len(Site_Loc)," ,", Len(Member_Num)," ,", Len(CheckIn),_
," ,", Len(CheckOut)," ,", Len(Unit_Num)
Print Client_Type,",",Last_Name,",",First_Name,",",Site_Loc,_
,",",Member_Num,",",CheckIn,",",CheckOut,",",Unit_Num
Print
EndWhile
dbFreeSQL(hstmt)
EndIf

dbDisconnect(pdb)
Else
Print "Connection Not Established"
EndIf

Print "Press any key to exit"
Do:Until INKEY$ <> ""
CLOSECONSOLE
END


The file "Smalldb.CSV" contains the following;
Client Type,Last Name,First Name,Site Loc.,Member#,CheckIn,CheckOut,Unit #
Member,ADAMS,WILLIAM,53,296252920,26-Dec-2006,1-Jan-2007,41
Member,AMES,ALBION,33,200068864,14-Dec-2006,28-Dec-2006,65

The output to the console is the following;
Connection To Database Established
Number of Columns in Dataset = 8
59  ,0  ,0  ,0  ,0  ,6  ,6  ,0
Member,ADAMS,WILLIAM,53,296252920,26-Dec-2006,1-Jan-2007,41,,,,,,,

58  ,0  ,0  ,0  ,0  ,6  ,6  ,0
Member,AMES,ALBION,33,200068864,14-Dec-2006,28-Dec-2006,65,,,,,,,

Press any key to exit

Which shows the entire line, for each dbGetNext command, being sucked into the "Client_Type" variable and not parsing the line using the commas.

Am I missing something here?  Can someone point me to some reading that may help me understand this stuff better?
Title: Re: Continuation of Connecting to a CSV file
Post by: Ionic Wind Support Team on December 30, 2006, 07:55:50 PM
When you don't use a schema file the driver tries to "guess" at the format of each field, and the separator for each column.  It may be your column tag line that is messing it up.

Just for an experiment try chaning the first line to:

ClientType,LastName,FirstName,SiteLoc,Member,CheckIn,CheckOut,Unit

What is probably happening is the driver is seeing the space in the "Client Type" column and treating that as the separator for the whole file.  The schema file, when you get it to work correctly, specifies the separator and the format of all the fields.  If you look at DominqueB's post in your other thread he did get it to work properly with the schema file.


[SmallFile.csv]
ColNameHeader=True
Format=CSVDelimited
MaxScanRows=0
CharacterSet=ANSI
Col1=CLIENT_TYPE Char Width 255
Col2=LAST_NAME Char Width 255
Col3=FIRST_NAME Char Width 255
Col4=SITE_LOC Char Width 255
Col5=MEMBER_NUM Char Width 255
Col6=CHECKIN DateTime
Col7=CHECKOUT DateTime
Col8=UNIT_NUM Char Width 255


Keep at it ;)
Title: Re: Continuation of Connecting to a CSV file
Post by: Jim Scott on December 30, 2006, 08:33:47 PM
Ok, here's what I did.

Program output after putting the header to;

ClientType,LastName,FirstName,SiteLoc,Member,CheckIn,CheckOut,Unit

and having no schema.ini file present;

****************************************
Connection To Database Established
Number of Columns in Dataset = 8
6  ,5  ,7  ,2  ,9  ,6  ,6  ,2
Member,ADAMS,WILLIAM,53,296252920,,,41

6  ,4  ,6  ,2  ,9  ,6  ,6  ,2
Member,AMES,ALBION,33,200068864,,,65
****************************************

Program output after putting the header row back to;

Client Type,Last Name,First Name,Site Loc.,Member#,CheckIn,CheckOut,Unit #

and putting the following schema.ini file back into the directory;

****************************************
[smallfile.csv]
ColNameHeader=True
Format=TabDelimited
MaxScanRows=0
CharacterSet=ANSI
Col1=CLIENT_TYPE Char Width 255
Col2=LAST_NAME Char Width 255
Col3=FIRST_NAME Char Width 255
Col4=SITE_LOC Char Width 255
Col5=MEMBER_NUM Char Width 255
Col6=CHECKIN Date
Col7=CHECKOUT Date
Col8=UNIT_NUM Char Width 255
****************************************

****************************************
Connection To Database Established
Number of Columns in Dataset = 8
59  ,0  ,0  ,0  ,0  ,6  ,6  ,0
Member,ADAMS,WILLIAM,53,296252920,26-Dec-2006,1-Jan-2007,41,,,,,,,

58  ,0  ,0  ,0  ,0  ,6  ,6  ,0
Member,AMES,ALBION,33,200068864,14-Dec-2006,28-Dec-2006,65,,,,,,,

****************************************

That's interesting that we can get it to guess correctly if I don't throw it any funny characters and formating, but it doesn't seem to be using the schema.ini file for anything.

Actually, its doing something.  As another test, I put the header back to your suggested way and included the schema.ini file and that seems to break it again.  Interesting.....

I guess I need to learn more about what its doing but this will get me going onto other branches of my re-education.  I'll report if I find out why it behaves this way.

Thanks again Paul.
Title: Re: Continuation of Connecting to a CSV file
Post by: Ionic Wind Support Team on December 30, 2006, 09:40:41 PM
Remove the 'MaxScanRows=0' line in the schema file.  According to Microsoft that entry causes the entire file to be scanned looking for the column formats.

The link I gave you before should give you enough information to tweak the file.

Sorry I am not much more help.  Haven't worked with CSV files through ODBC much.  Usually I will just import them into Access and save as an access .mdb file.

Paul.
Title: Re: Continuation of Connecting to a CSV file
Post by: Jim Scott on December 31, 2006, 05:32:29 AM
Paul, you've been a great help.  A friend pointed out that a Format=TabDelimited statement snuck into my schema.ini file.  That will certainly mess things up.  After changing it to Format=CSVDelimited it works much better.  I am getting some funny behavior that I think is due to handling of the date data types.  When I set everything to Char and String and fix the schema.ini file, then it is consistent and not sensitive to the header data.

I've spent several hours this morning just playing with letting the driver guess at the data types by leaving out the schema.ini file.  I've played with different headers and changed the cardinality of the file and mixed it up with pure numbers, numbers with chars, pure text etc..  I think I'm going to stick with using the schema.ini file as I think I'll be chasing my tail less this way.

Thanks again
Title: Re: Continuation of Connecting to a CSV file
Post by: Vikki on December 31, 2006, 10:22:13 AM
Still playing with this also.

Try using dbGetData and dbGetDate instead of dbBindVariable. I have been able to print just the column information for any column I choose with that instead of getting the full row as discussed before. The only issue I am having now is with the dates. I'll keep working on it and post if I get it to work properly.

Also, you can have spaces in your header names if you enclose them in quotes.  :)
Title: Re: Continuation of Connecting to a CSV file
Post by: Ionic Wind Support Team on December 31, 2006, 11:34:36 AM
Reading more on microsoft the text driver expects the date to use the / character as a separator.  Although they hint at the fact you can specify the format of dates in the schema.ini file.  See the entry for DateTimeFormat in the link I gave you.  Try someting like:

DateTimeFormat = dd-MMM-yyyy


Also be aware of the text drivers limitations:

Quote
The ODBC Text driver supports both delimited and fixed-width text files. A text file consists of an optional header line and zero or more text lines.

Although the header line uses the same format as the other lines in the text file, the ODBC Text driver interprets the header line entries as column names, not data.

A delimited text line contains one or more data values separated by delimiters: commas, tabs, or a custom delimiter. The same delimiter must be used throughout the file. Null data values are denoted by two delimiters in a row with no data between them. Character strings in a delimited text line can be enclosed in double quotation marks (""). No blanks can occur before or after delimited values.

The width of each data entry in a fixed-width text line is specified in a schema. Null data values are denoted by blanks.

Tables are limited to a maximum of 255 fields. Field names are limited to 64 characters, and field widths are limited to 32,766 characters. Records are limited to 65,000 bytes.

A text file can be opened only for a single user. Multiple users are not supported.

Existing records are treated as read-only byt the text driver.  You can insert rows, according to the docs.

Paul.
Title: Re: Continuation of Connecting to a CSV file
Post by: Vikki on December 31, 2006, 12:44:18 PM
Thanks Paul,

That fixes my date problem. The corrected code is below:

EBA File:


Def pdb as Pointer
Def hstmt as int
def error as string

def CLIENT_TYPE as string
def LAST_NAME as string
def FIRST_NAME as string
def SITE_LOC as string
def MEMBER_NUM as string
def CHECKIN as DBDATE
def CHECKOUT as DBDATE
def UNIT_NUM as string

OPENCONSOLE
Print "Looking in " + GetStartPath

pdb=dbConnect("Microsoft Text Driver (*.txt; *.csv)",GETSTARTPATH,"")

Print "pdb Value Returned = ", pdb

IF pdb <> NULL
    PRINT "Connection Established"

   hstmt = dbExecSQL(pdb,"SELECT * FROM SmallFile.csv")
   error = dbGetErrorCode(hstmt)
   IF LEN(error)
       PRINT
       PRINT "Error Code: ", error
       PRINT "Error Text: ", dbGetErrorText(hstmt)
       PRINT
   ENDIF
   
   IF hStmt

       WHILE dbGetNext(hstmt)

dbgetdata(hstmt,1,CLIENT_TYPE)
dbgetdata(hstmt,2,LAST_NAME)
dbgetdata(hstmt,3,FIRST_NAME)
dbgetdata(hstmt,4,SITE_LOC)
dbgetdata(hstmt,5,MEMBER_NUM)
dbgetdate(hstmt,6,CHECKIN)
dbgetdate(hstmt,7,CHECKOUT)
dbgetdata(hstmt,8,UNIT_NUM)
       
         print
           PRINT CLIENT_TYPE, ", ", LAST_NAME, " ", FIRST_NAME, ", ", SITE_LOC, ", ", MEMBER_NUM, ", ", CHECKIN.day, "/", CHECKIN.month, "/", CHECKIN.year, ", ", CHECKOUT.day,"/",CHECKOUT.month,"/",CHECKOUT.year, ", ", UNIT_NUM
print "<-------------->"
print LAST_NAME, "  ", MEMBER_NUM
print "<-------------->"
           PRINT
       ENDWHILE
   dbFreeSQL(hstmt)
   ENDIF
   
    dbDisconnect(pdb)
Else
   Print "Connection Not Established"
ENDIF

PRINT "Press any key to exit"
DO:UNTIL INKEY$ <> ""
CLOSECONSOLE
END



Schema.ini


[smallfile.csv]
ColNameHeader=True
Format=TabDelimited
MaxScanRows=0
CharacterSet=ANSI
DateTimeFormat=dd-mmmm-yyyy
Col1=CLIENT_TYPE Char Width 255
Col2=LAST_NAME Char Width 255
Col3=FIRST_NAME Char Width 255
Col4=SITE_LOC Char Width 255
Col5=MEMBER_NUM Char Width 255
Col6=CHECKIN Date
Col7=CHECKOUT Date
Col8=UNIT_NUM Char Width 255


smallfile.csv


Client_Type Last Name First Name Site Loc Member# CheckIn CheckOut Unit#
Member    AMES    ALBION    33    200068864         14-12-2006    28-12-2006    65
Member    PAUL    QUIERST    22    305634213         22-08-2006    01-01-2009    38
Member    ADAMS    WILLIAM    53    296252920         09-12-2006    10-12-2007    41


I altered the csv file to be tab delimited but should work as a comma seperated or custom delimited after edditing the csv file to reflect the type of delimitation you want to use. If you use the csv file by copy and paste you'll probably have trouble. You'll need to copy it to notepad or some other text editing software and reformat it I think.

This was a very good excercise and I'm still playing with it more.

Thanks Jim. I think we are all learning a lot from your question.  :)
Title: Re: Continuation of Connecting to a CSV file
Post by: Jim Scott on January 03, 2007, 08:33:38 AM
Thanks Vikki, I'm now easily sucking in large data sets.  Pretty cool and thanks again for your help.  Do you know what the real differences are between dbBindVariable versus dbGetData? 

Although I'm not yet ready to post a formal question about memory management, since I'm doing more testing of my own understanding, I do think that I may be stepping on one variable while loading another.  More on this later...
Title: Re: Continuation of Connecting to a CSV file
Post by: Vikki on January 03, 2007, 09:53:49 AM
Hi Jim,

Actually I'm not sure the real programmatically correct reason. I just know that I can retrieve column information independent of the rest of the record data with dbgetdata and I can retrieve the full record with dbbindvariable. Having said that...I am pretty sure there is  way to use dbbindvariable to do the same thing..I just haven't figured it out yet and I continue to work on it.  ;)

Maybe someone who knows will ljump in and explain it to us.  ;)