#pentagram@irc.freenode.net logs for 27 Sep 2005 (GMT)

[00:12:20] --> megawatt has joined #pentagram
[00:24:15] --> _watt has joined #pentagram
[00:24:58] <-- watt has left IRC (Nick collision from services.)
[00:25:19] --- _watt is now known as watt
[00:25:29] --- ChanServ gives channel operator status to watt
[00:25:42] <watt> there.. finally back with a static ip
[00:31:23] <-- Kirben has left IRC (Read error: 110 (Connection timed out))
[00:31:50] <watt> http://fallofrome.dyndns.org:8080/pentagram.ini
[00:31:58] <watt> And there you are.
[00:32:54] <watt> IP conflict on the static IP as I had previously guessed.
[00:36:35] <-- megawatt has left IRC (Read error: 110 (Connection timed out))
[03:04:49] <Colourless> unicode
[03:05:03] <Colourless> like that wasn't known :-)
[03:20:22] <Colourless> or more specifically it's UTF-16
[03:20:44] <Colourless> it can be detected by the first two bytes being FF FE
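
The check described above could be sketched as follows. The function name `has_utf16le_bom` is hypothetical and not taken from Pentagram's source; it only tests the two leading bytes mentioned in the log.

```cpp
#include <cstddef>

// Returns true when the buffer starts with the bytes FF FE, the byte order
// mark of little-endian UTF-16 (what Notepad writes when saving as "Unicode").
// Caveat: a UTF-32LE file starts with FF FE 00 00, so it would also match.
bool has_utf16le_bom(const unsigned char* data, std::size_t size) {
    return size >= 2 && data[0] == 0xFF && data[1] == 0xFE;
}
```

A config loader would presumably read the first bytes of the file, call this, and only then decide how to decode the rest.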
[03:24:30] <Colourless> since we are only likely to encounter the problem under windows with people using notepad and saving as unicode it might be a nice idea to add in special win32 handling if we attempt to read unicode ini files
[03:25:29] <-- Lightkey has left IRC (Read error: 104 (Connection reset by peer))
[03:25:31] <Colourless> i would use the win32 functions to convert from the unicode text file to the system's multibyte code page
[03:34:11] --> Lightkey has joined #pentagram
[03:53:08] <Colourless> hmm of course that isn't exactly a simple task...
[03:55:20] <Colourless> could always just code to utf8...
[03:55:41] <Colourless> but need to have the file open functions to support utf-8
[03:57:47] <Colourless> should just then convert to ansi... but since the windows file system is unicode, that might kill pathnames
[04:04:10] <Colourless> hmmm utf-8 is probably a good idea thinking about it. pentagram's ttf font support is already utf-8
[04:05:34] <Colourless> so what i'm thinking is this
[04:06:24] <Colourless> convert 'unicode' ini files if found to utf-8. Then when opening files, convert the utf-8 text back to widechars and use things like wopen
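
A minimal, platform-independent sketch of the first half of that plan, converting a UTF-16LE buffer to UTF-8. The names `append_utf8` and `utf16le_to_utf8` are illustrative assumptions, not Pentagram functions, and replacing unpaired surrogates with U+FFFD is a choice made here, not something the log specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out`, encoded as UTF-8.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert little-endian UTF-16 bytes to UTF-8, skipping a leading FF FE mark.
std::string utf16le_to_utf8(const unsigned char* data, std::size_t size) {
    std::string out;
    std::size_t i = 0;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) i = 2;
    while (i + 1 < size) {
        std::uint32_t unit = static_cast<std::uint32_t>(data[i]) |
                             (static_cast<std::uint32_t>(data[i + 1]) << 8);
        i += 2;
        // A high surrogate followed by a low surrogate forms one code point.
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < size) {
            std::uint32_t low = static_cast<std::uint32_t>(data[i]) |
                                (static_cast<std::uint32_t>(data[i + 1]) << 8);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        // Anything still in the surrogate range was unpaired.
        if (unit >= 0xD800 && unit <= 0xDFFF) unit = 0xFFFD;
        append_utf8(out, unit);
    }
    return out;
}
```

On Windows the same conversion could instead go through the win32 conversion functions mentioned earlier in the log; the hand-written version is shown only because it behaves the same on every platform.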
[04:47:01] <watt> standard UTF-16? Really? I know almost nothing about encoding, but I'm surprised my editors seemed to not play very nice.
[05:11:39] <Colourless> yeah standard UTF-16 :-)
[05:11:51] * watt shrugs
[05:14:56] <Colourless> hmm
[05:15:23] <Colourless> actually looks like pentagram is set up to use the char system that ultima8 used.... dos code page
[05:15:39] <Colourless> not utf-8
[05:16:31] * Colourless should have remembered that
[05:39:51] <watt> neat... now I can bind double tapping of any key or button.
[05:42:16] <watt> I'm ummm... yeah, changing stuff... thinking more along the lines of interaction between gumps and mouse buttons along the lines that they don't have to be right and left mouse.. or even mouse buttons altogether...
[05:43:09] <watt> but that's still a ways off.. I'm not switching focus until I'm positive about how the "Normal" bindings should work.
[05:43:34] <watt> and that they won't change.
[05:43:47] <watt> anywho.. sleepy time
[05:45:09] <Colourless> night :-)
[07:48:18] <wjp> Colourless: we did talk about using utf-8 internally
[07:48:35] <Colourless> yeah
[07:48:44] <Colourless> i think moving to utf8 might be a good idea
[07:48:55] <wjp> the tt font rendering does use unicode internally (ucs2 I think)
[07:49:39] <Colourless> if we moved to utf8 it would mean we could accept all characters from keyboard input
[07:50:37] <Colourless> and since the ttf support is unicode it means we can even display it
[07:50:49] <wjp> well, that would be somewhat problematic
[07:51:17] <Colourless> how so?
[07:51:25] <wjp> in linux at least the unicode character space isn't covered by one single ttf
[07:51:55] <wjp> not even the characters we could reasonably expect
[07:53:08] <wjp> and SDL_ttf has no support at all for using different TTFs as far as I can tell
[07:53:33] <Colourless> hmm
[07:53:43] <Colourless> that would be a big problem
[07:53:48] <wjp> we _could_ use fontconfig/freetype to handle this, but to be honest I think that would be overkill
[07:54:02] <wjp> I assume the situation in windows is similar?
[07:55:00] <Colourless> many of the core windows fonts come with all characters
[07:55:12] <Colourless> example:
[07:55:12] <Colourless> 30/11/2000 05:40 PM 23,274,572 ARIALUNI.TTF
[07:56:04] <Colourless> but..
[07:56:34] <Colourless> i'm pretty sure windows will in general have the same problem
[07:56:41] <Colourless> fonts can be in multiple ttf files
[08:01:51] <wjp> I'd really like to avoid having to write code for using multiple TTFs in a single string
[08:04:46] <Colourless> unless you do it 'properly' it's kind of hard to do it at all
[08:15:47] <Colourless> probably not worth the effort of getting multiple font rendering working
[08:21:44] --> Kirben has joined #pentagram
[08:21:44] --- ChanServ gives channel operator status to Kirben
[08:43:19] --> Kirben_ has joined #pentagram
[08:43:19] --- ChanServ gives channel operator status to Kirben_
[09:01:42] <-- Kirben has left IRC (Read error: 110 (Connection timed out))
[09:10:08] <-- Kirben_ has left IRC (Read error: 110 (Connection timed out))
[09:26:01] <Colourless> would you have any particular objections if i converted pentagram's internal string handling to utf8?
[09:27:42] <wjp> hm, I might :-)
[09:28:03] <wjp> it's a bit too potentially far-reaching to immediately see if I would have objections :-)
[09:28:47] <wjp> main issue would be removing any potential confusing about in which encoding a string is at any given point
[09:29:07] <wjp> s/confusing/confusion/
[09:29:17] <Colourless> i'm thinking as soon as it comes from usecode it gets translated into unicode
[09:31:17] <wjp> do you mean only strings in UCMachine?
[09:32:47] <Colourless> i mean the push string opcode would convert the string to utf8 when it pushes it to the stack
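
A rough sketch of what such a conversion step might look like, assuming a single-byte source encoding whose upper half is described by a 128-entry lookup table. The names `HighTable`, `codepage_to_utf8` and `make_sample_table` are hypothetical. The sample table fills in only two entries, borrowed from IBM code page 437 purely for illustration; the real table for Ultima 8's character set is not given in this log.

```cpp
#include <array>
#include <string>

// Maps bytes 0x80..0xFF of a single-byte code page to Unicode code points.
using HighTable = std::array<char32_t, 128>;

// Convert a code-page string to UTF-8. Bytes below 0x80 are taken as ASCII.
std::string codepage_to_utf8(const std::string& in, const HighTable& high) {
    std::string out;
    for (unsigned char c : in) {
        char32_t cp = (c < 0x80) ? static_cast<char32_t>(c) : high[c - 0x80];
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Illustrative table only: unmapped bytes become U+FFFD, and two entries
// follow code page 437 (0x80 is U+00C7, 0x82 is U+00E9).
HighTable make_sample_table() {
    HighTable table;
    table.fill(0xFFFD);
    table[0x80 - 0x80] = 0x00C7;
    table[0x82 - 0x80] = 0x00E9;
    return table;
}
```

Doing the conversion once, at the point where a string enters the engine, would address the concern raised above about not knowing which encoding a string is in at any given point.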
[09:33:36] * wjp nods
[09:33:47] <wjp> did you also want to change config files, paths, UI strings, etc... to utf-8?
[09:34:07] <wjp> at least the translation .ini's should probably be converted to utf-8 at some point
[09:34:11] <Colourless> that was the primary reason of doing it since in windows paths can be unicode
[09:34:31] <wjp> (they're currently in u8's ancient-dos encoding, which is not the most portable encoding :-) )
[09:34:52] <Colourless> yes :-)
[09:35:06] <wjp> path encoding in linux is a bit of a mess as far as I can tell
[09:35:23] <wjp> they're just byte sequences
[09:35:29] <Colourless> STL doesn't help....
[09:36:15] <Colourless> fstreams only support ASCII filenames
[09:37:00] <Colourless> i can get around that in windows though by using the GetShortPathName() function
[09:37:24] <Colourless> short path names are only ASCII
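
As an alternative to short path names, the idea floated earlier in the log (convert UTF-8 to wide characters and call a wide open function) might look roughly like this. `open_utf8` and `widen` are hypothetical names. The Windows branch uses `MultiByteToWideChar` and `_wfopen`; the other branch simply passes the bytes through, matching the expectation that Linux hands path bytes straight to the C library. Error handling is kept minimal in this sketch.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <wchar.h>

// Convert a UTF-8 string to UTF-16 for the wide-character Windows APIs.
static std::wstring widen(const std::string& s) {
    if (s.empty()) return std::wstring();
    int n = MultiByteToWideChar(CP_UTF8, 0, s.data(),
                                static_cast<int>(s.size()), nullptr, 0);
    if (n <= 0) return std::wstring();
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, s.data(),
                        static_cast<int>(s.size()), &w[0], n);
    return w;
}
#endif

// Open a file whose path is stored as UTF-8. Returns nullptr on failure.
std::FILE* open_utf8(const std::string& path, const char* mode) {
#ifdef _WIN32
    return _wfopen(widen(path).c_str(), widen(mode).c_str());
#else
    // Paths are plain byte sequences here, so no conversion is attempted.
    return std::fopen(path.c_str(), mode);
#endif
}
```

Returning a C `FILE*` sidesteps the fstream filename limitation mentioned above, at the cost of not using the STL stream classes.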
[09:37:37] <wjp> I don't think that will form a problem in linux, as I expect the pathnames are just literally passed to the c library
[09:38:25] <wjp> just need to make sure the encoding of the actual path and the encoding of the path in the config file is the same
[09:39:10] <wjp> I'm assuming/hoping there will be sufficient restrictions on filenames to prevent you from using encodings that have NUL bytes...
[09:45:48] <Colourless> linux kernel is supposed to support utf-8 in filenames
[09:50:43] <wjp> what would 'support' mean, though?
[09:57:55] <Colourless> all got to do with locale settings
[09:58:19] <Darke> "It doesn't die"? I've got quite a few files with kanji in their filenames and they just appear as ?'s to the appropriate length of characters in a terminal. But I don't have full locale support set up for it.
[09:59:05] <Colourless> http://www.cl.cam.ac.uk/~mgk25/unicode.html << "UTF-8 and Unicode FAQ for Unix/Linux"
[10:02:16] <Colourless> my linux box is currently set to use LANG=en_AU.UTF-8
[10:03:25] <Colourless> anyway in pentagram using the setlocale() function should let us tell the c runtime to use UTF-8
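
A small sketch of that idea, assuming the program adopts whatever locale the user's environment requests (for example `LANG=en_AU.UTF-8`) rather than forcing a specific one. `use_environment_locale` and `mentions_utf8` are hypothetical helper names, and inspecting the locale name for "UTF-8" is only a heuristic.

```cpp
#include <cctype>
#include <clocale>
#include <string>

// Heuristic: does a locale name such as "en_AU.UTF-8" or "en_AU.utf8"
// mention UTF-8? Comparison ignores case and the optional hyphen.
bool mentions_utf8(const std::string& locale_name) {
    std::string folded;
    for (unsigned char c : locale_name) {
        if (c != '-') folded += static_cast<char>(std::toupper(c));
    }
    return folded.find("UTF8") != std::string::npos;
}

// Ask the C runtime to use the locale from the environment (LANG, LC_*).
// On success, stores the resulting locale name in name_out and returns true.
bool use_environment_locale(std::string& name_out) {
    const char* name = std::setlocale(LC_ALL, "");
    if (name == nullptr) return false;  // requested locale is not installed
    name_out = name;
    return true;
}
```

Note that this affects the C runtime's character handling and multibyte conversions; it does not by itself change how the kernel stores filenames, which remain byte sequences.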
[10:46:34] --> Kirben has joined #pentagram
[10:46:34] --- ChanServ gives channel operator status to Kirben
[11:04:12] <-- Kirben has left IRC ("System Meltdown")
[11:11:28] --> Kirben has joined #pentagram
[11:11:28] --- ChanServ gives channel operator status to Kirben
[12:58:14] <-- Colourless has left IRC ("casts improved invisibility")
[13:41:55] <-- Kirben has left IRC ("System Meltdown")
[20:48:10] <-- Darke has left IRC (Read error: 104 (Connection reset by peer))
[21:06:27] --> Darke has joined #pentagram
[22:25:46] --> Colourless has joined #Pentagram
[22:25:47] --- ChanServ gives channel operator status to Colourless
[23:24:38] --> Kirben has joined #pentagram
[23:24:38] --- ChanServ gives channel operator status to Kirben