WAKom versus WAKomEncoded

John M McIntosh johnmci at smalltalkconsulting.com
Thu Feb 4 21:34:51 MET 2010


First, let me assume that WAKomEncoded is what I should be starting, rather than WAKom?

We old Smalltalkers remember starting WAKom, so that is what happens in WikiServer startup.

I *guess* it really should be WAKomEncoded?

So what's the fallout? I mean, I can stuff UTF8 chars into PRPages... Happy Happy.

Well, not quite. I got a support email from South Korea saying that the UTF8 character entered for the
page title was being mangled. In fact, if they use the *wrong* character, the app hangs while loading
from binary storage to instantiate the PRPage.

Looking into this, it turns out that because WAKom is used, the UTF8 data from the request is passed
as a String into PRStructure (instance var name). Later, lazy initialization is used to populate title:

title
	"Answer the title of the receiver, essentially the name but starting uppercase."

	^ title ifNil: [ title := self name capitalized]
	
Now here is the bad part: capitalized runs Character>>asUppercase, which is actually kind of Unicode aware,
so it's attempting to deal only with wide characters. But since the UTF8 character is stored as multiple bytes
in a String, it mangles the first byte to uppercase, thus destroying the meaning of the UTF8 sequence.
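
A minimal sketch of that mangling, written in Python rather than Squeak so it can be run standalone. It assumes the String holds one character per raw UTF8 byte (modelled here by decoding as Latin-1) and that uppercasing treats each of those characters as Latin-1 text. The title "한글" and the helper capitalize_first are made up for illustration.

```python
# Illustration only: a Python stand-in for a String holding raw UTF8 bytes.

def capitalize_first(s: str) -> str:
    """Uppercase only the first character, as the title accessor does via capitalized."""
    return s[:1].upper() + s[1:]

title = "한글"                      # hypothetical page title
raw = title.encode("utf-8")        # the bytes as they arrive in the request
as_string = raw.decode("latin-1")  # assumption: one character per byte, no decoding

# Uppercasing touches only the first "character", i.e. the UTF8 lead byte.
mangled = capitalize_first(as_string).encode("latin-1")

print(raw.hex())      # bytes before capitalizing
print(mangled.hex())  # same bytes, except the first one

try:
    mangled.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False
print(still_valid)
```

In this example the lead byte 0xED reads as "í" and is uppercased to "Í" (0xCD), which is the lead byte of a two-byte sequence, so the third byte is left dangling and the data no longer decodes as UTF8.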

However, if I now restart with WAKomEncoded, the squeak to utf8 conversion then messes up the UTF8 data
that was already stored in the binary data file.
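
A rough sketch of why the stored data gets messed up, again in Python as a stand-in. It assumes the old strings hold one character per UTF8 byte and that the encoded server converts every character to UTF8 on output. All three function names are hypothetical, and reinterpret is only one possible repair, not something Pier or Seaside provides.

```python
# Illustration only: models strings saved under WAKom, then served under WAKomEncoded.

def store_without_decoding(text: str) -> str:
    """Assumed old behaviour: raw UTF8 bytes kept one per character."""
    return text.encode("utf-8").decode("latin-1")

def encode_on_output(stored: str) -> bytes:
    """Assumed new behaviour: internal string converted to UTF8 when sent."""
    return stored.encode("utf-8")

def reinterpret(stored: str) -> str:
    """Hypothetical one-off repair: read the stored characters back as UTF8 bytes."""
    return stored.encode("latin-1").decode("utf-8")

original = "한글"                   # hypothetical page title
stored = store_without_decoding(original)
sent = encode_on_output(stored)

print(len(original.encode("utf-8")), len(sent))  # byte counts before and after
print(sent == original.encode("utf-8"))          # is the output still the original UTF8?
print(reinterpret(stored) == original)           # does the repair recover the text?
```

Under these assumptions every stored byte is encoded a second time, so the output doubles in size and no longer matches. Reinterpreting the bytes undoes that only for strings that are still intact; a title whose first byte was already uppercased would fail to decode and would need separate handling.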

So thoughts on how to fix things when I load the PRPages from storage, and on which fields would need fixing, are welcome.
--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================






