Title: Reading .reg files
Post by: Andy on November 07, 2017, 12:29:09 AM

This one is just for curiosity.

I can create (export) .reg files containing the same registry details that regedit exports, and both my program and regedit can read them.

But I can't read any .reg files created by regedit itself, even though Notepad opens them okay.

I can't find any info on MSDN at all.

Does anyone know what format they are stored in?

Here is the simple program to read a file:

string ln
file myfile

openconsole

IF(OPENFILE(myfile, getstartpath + "10.reg", "R") = 0)
	WHILE EOF(myfile) = 0
		IF(READ(myfile,ln) = 0)
			print "ln = ",ln,"    Length is ",len(ln)
		ENDIF
	WEND

	CLOSEFILE myfile
ENDIF

do:until inkey$ <> ""
closeconsole
end

Attached is a simple .reg file

Please rename it from .txt to .reg!

Right click on it and choose EDIT to have a look at it and then try to read it with the above code.

Title: Re: Reading .reg files
Post by: fasecero on November 07, 2017, 11:01:24 AM

It's the encoding. Open your text file in Notepad, choose Save As, and at the bottom you should see the "Encoding" dropdown. Select ANSI and save.

Title: Re: Reading .reg files
Post by: Andy on November 08, 2017, 05:27:09 AM

Yes I realised it was in unicode format eventually (really wish they didn't do that as it's a pain in the you know what!).

I've used the W2S function to convert it, but it only converts the very first line. All the other lines with text are displayed as question mark symbols (?).

So that leaves me puzzled again.

And one further question, how can I tell if a file is in unicode format?

Thanks,
Andy.
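
A possible explanation for the "only the first line converts" symptom, sketched in Python rather than IWBasic so it can be run anywhere. It assumes the line reader splits the file on single line-feed bytes (0x0A), which the thread does not confirm for READ. In UTF-16LE a line feed is the two bytes 0A 00, so after the first split every following chunk starts with a stray 00 byte and is shifted by one byte. The sample text and the helper name are made up for the illustration.

```python
# Stand-in for a regedit export: UTF-16LE with a BOM and CRLF line endings.
text = "Windows Registry Editor Version 5.00\r\n\r\n[HKEY_CURRENT_USER\\Software\\Demo]\r\n"
raw = b"\xff\xfe" + text.encode("utf-16-le")


def split_like_byte_reader(data):
    """Split on the single byte 0x0A, as a byte-oriented line reader might (assumption)."""
    return data.split(b"\n")


chunks = split_like_byte_reader(raw)

# The first chunk still starts with the BOM and has an even length, so it decodes.
first_line = chunks[0].decode("utf-16").rstrip("\r")

# Later chunks begin with the leftover 0x00 of the previous line feed,
# so every 16-bit unit is read from the wrong byte boundary.
garbled = chunks[2].decode("utf-16-le", errors="replace")

# Decoding the whole buffer first and splitting afterwards keeps the lines intact.
proper_lines = raw.decode("utf-16").splitlines()

print(first_line)
print(repr(chunks[2][:6]))
print(proper_lines)
```

If this is what is happening, converting line by line cannot work; the file has to be read as raw bytes (or with wide-character file functions) and converted before it is split into lines.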

Title: Re: Reading .reg files
Post by: LarryMc on November 08, 2017, 01:33:36 PM

I found this Andy

Quote: "It isn't enough to just determine Unicode vs. ASCII because Unicode itself comes in various flavors (UTF-8, UTF-16BE, UTF-16LE, etc). The file format that you are reading should define how the text is encoded (or how to determine it from a header, but that is specific to the file type).

For text (and CSV) files, Windows provides an API that you can use to determine if a given byte sequence is Unicode. The function name is (no surprise) IsTextUnicode (http://msdn.microsoft.com/en-us/library/dd318672). Another thing you should probably do is check for a BOM (Byte Order Mark), which, if present, tells you that the text is for sure Unicode, and what the encoding is (UTF-16BE vs. UTF-16LE, for example).

Raymond Chen wrote an article that gives us insight into how Notepad determines the encoding of a text file: http://blogs.msdn.com/b/oldnewthing/archive/2004/03/24/95235.aspx

You can read more about BOMs from this FAQ on the Unicode organization's web site: http://unicode.org/faq/utf_bom.html"
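
The BOM check described in the quote can be sketched as follows. This is Python, not IWBasic, and `detect_bom` is a hypothetical helper name; the BOM byte values come from Python's standard `codecs` module. A file without a BOM can still be Unicode, so a `None` result only means "no BOM found", not "ANSI".

```python
import codecs

# Longer BOMs come first: the UTF-32LE BOM (FF FE 00 00) starts with
# the UTF-16LE BOM (FF FE), so it has to be tested before it.
# Caveat: a UTF-16LE file whose first character is U+0000 would be
# reported as UTF-32LE by this simple check.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head):
    """Return (encoding, bom_length) if `head` starts with a known BOM, else (None, 0)."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return None, 0


print(detect_bom(b"\xff\xfeW\x00i\x00"))
print(detect_bom(b"REGEDIT4\r\n"))
```

Only the first four bytes of the file are needed, so the check is cheap to run before deciding how to read the rest.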

Title: Re: Reading .reg files
Post by: fasecero on November 08, 2017, 02:37:47 PM

Yep, not an easy task to be honest. Here you can read a discussion about this topic. They also talk about the BOM and IsTextUnicode.
https://stackoverflow.com/questions/4672659/whats-the-best-way-to-identify-unicode-encoded-text-files-in-windows

Title: Re: Reading .reg files
Post by: fasecero on November 08, 2017, 04:53:31 PM

Man this one was tough. Couldn't find any example so I made one myself - maybe it will need some revision. You have to use two different files: "10_ansi.txt" & "10_unicode.txt"

$include "windowssdk.inc"

' var
INT j
STRING fullpath

' entry point
OPENCONSOLE

fullpath = GETSTARTPATH + "10_ansi.txt" ' "10_ansi.txt" - "10_unicode.txt"

IF IsFileUNICODE(fullpath) = 0 THEN
	PRINT "FILE IS ANSI"
	PRINT fullpath
	PRINT 
	
	STRING ln
	FILE myfile
	
	' read the same file that was just tested
	IF(OPENFILE(myfile, fullpath, "R") = 0)
		WHILE EOF(myfile) = 0
			IF READ(myfile, ln) = 0 THEN
				PRINT ln
			ENDIF
		WEND
		
		CLOSEFILE myfile
	ENDIF
ELSE
	PRINT "FILE IS UNICODE"
	PRINT fullpath
	PRINT 
	
	' TODO: get unicode data as strings
	' OPENFILEW/EOFW/READW/CLOSEFILEW
ENDIF

PRINT 
PRINT "  Press any key to exit..."
DO:UNTIL INKEY$ <> ""
END

SUB IsFileUNICODE(string path), INT
	HANDLE hFile = CreateFileW(S2W( path), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL)
	INT response = 0
	
	IF hFile <> INVALID_HANDLE_VALUE THEN
		' get file size
		LARGE_INTEGER size
		
		IF GetFileSizeEx(hFile, &size) THEN
			INT filesize = size.QuadPart ' size of the file in bytes
			INT buffsize = 100 ' number of bytes we want to check for ansi or unicode content
			IF buffsize > filesize THEN buffsize = filesize
			
			pointer buffer = NEW(char,buffsize + 1)
			DWORD NumberOfBytesRead = 0
			ReadFile(hFile, buffer, buffsize, &NumberOfBytesRead, NULL)

			IF NumberOfBytesRead THEN
				response = IsTextUnicode(buffer, NumberOfBytesRead, NULL)
			ENDIF
				
			DELETE buffer
		ENDIF

		CloseHandle(hFile)
	ENDIF
	
	RETURN response
ENDSUB

Just pass in any text file to IsFileUNICODE(path) and hopefully it should tell you if the file is unicode or not.
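
For the Unicode branch left as a TODO above, here is one way the reading step could look. It is a Python sketch, not IWBasic, and `read_text_lines` is a hypothetical helper. It assumes the Unicode file is UTF-16 with a BOM, as regedit's "Windows Registry Editor Version 5.00" exports are, and that any other file is in the Windows-1252 "ANSI" code page, which will not hold on every system.

```python
def read_text_lines(path, fallback="cp1252"):
    """Read a text file as a list of lines, decoding UTF-16 when a UTF-16 BOM is present.

    `fallback` is the code page assumed for files without a BOM (an assumption,
    not something the file itself states).
    """
    with open(path, "rb") as f:
        data = f.read()

    if data[:2] in (b"\xff\xfe", b"\xfe\xff"):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        text = data.decode("utf-16")
    else:
        text = data.decode(fallback)

    # Split only after decoding, so CR/LF are handled as characters, not bytes.
    return text.splitlines()
```

Usage would be along the lines of `for line in read_text_lines("10.reg"): print(line)`. The whole file is decoded before it is split into lines, which avoids splitting a two-byte character in the middle.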

Title: Re: Reading .reg files
Post by: jalih on November 08, 2017, 10:23:24 PM

Quote from: fasecero on November 08, 2017, 04:53:31 PM
Man this one was tough. Couldn't find any example so I made one myself - maybe it will need some revision.

Here is my old version written in MiniBASIC

##ifndef WIN32
##define WIN32
##endif
 
##ifdef WIN32
##define WIN32_LEAN_AND_MEAN
##endif
 
##include "winsdk\windef.mbi"
##include "winsdk\winbase.mbi"
##include "winsdk\wingdi.mbi"
##include "winsdk\winuser.mbi"


type FILETIME
	UINT64 qwTime
end type

type WIN32_FILE_ATTRIBUTE_DATA
  UINT dwFileAttributes
  FILETIME ftCreationTime
  FILETIME ftLastAccessTime
  FILETIME ftLastWriteTime
  UINT nFileSizeHigh
  UINT nFileSizeLow
end type


WIN32_FILE_ATTRIBUTE_DATA dat

string filter = "Text files|*.txt|All Files|*.*||"
string filename = ChooseFile("Select file",NULL,1,filter,"txt")

GetFileAttributesExA(filename,0,dat)

int64 size64
size64 = dat.nFileSizeHigh
size64 = size64 << 32
size64 |= dat.nFileSizeLow

uint size = size64

pointer buffer = new(string, size)

hFile = fopen(filename, "R")
int bytesread = fread(hFile, buffer, size)
fclose(hFile)

if IsTextUnicode(buffer, size)
	print "Text file is probably Unicode."
else
	print "Text file is probably ANSI."
endif

delete(buffer) 


do:until inkey$ <> ""

Title: Re: Reading .reg files
Post by: fasecero on November 09, 2017, 11:58:41 AM

Thank you, good to have a reference in case of any trouble. I see that you passed the entire file to IsTextUnicode; I just take the first bytes. If a problem arises, we could either increase the 'block' size or use all the content. I'm avoiding this for now to gain (hypothetically) some speed.

IonicWind Software

IWBasic => General Questions => Topic started by: Andy on November 07, 2017, 12:29:09 AM