• unicode/cp437 door

    From bugz@21:4/110 to All on Wednesday, March 31, 2021 23:25:58
    Hello,

    I've been working on writing my own door and door library, but I've run into some interesting issues with ENiGMA. In my library, I try to detect if utf8/unicode or cp437 is active, and adjust my output accordingly.
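The post does not say how the library detects the encoding. One common approach, sketched here as an assumption rather than as bugz's actual method, is to send a few multi-byte UTF-8 characters, ask the terminal for its cursor position (ESC [ 6 n), and see how far the cursor moved. A UTF-8 terminal advances one column per character; a CP437 terminal advances one column per byte. The names PROBE, parseCursorReport, and guessEncoding are hypothetical.

```javascript
// Probe: carriage return, three 3-byte UTF-8 characters, then a cursor
// position request (ESC [ 6 n). All names here are illustrative.
const PROBE_TEXT = '\u2591\u2592\u2593';
const PROBE = Buffer.concat([
    Buffer.from('\r', 'ascii'),
    Buffer.from(PROBE_TEXT, 'utf8'),
    Buffer.from('\x1b[6n', 'ascii'),
]);

// Parse a cursor position report of the form ESC [ row ; col R.
function parseCursorReport(reply) {
    const match = /\x1b\[(\d+);(\d+)R/.exec(reply);
    return match ? { row: Number(match[1]), col: Number(match[2]) } : null;
}

// Guess the encoding from the 1-based column reported after the probe.
function guessEncoding(reply) {
    const report = parseCursorReport(reply);
    if (!report) {
        return 'unknown';
    }
    const chars = [...PROBE_TEXT].length;                // characters sent
    const bytes = Buffer.byteLength(PROBE_TEXT, 'utf8'); // bytes sent
    if (report.col === 1 + chars) {
        return 'utf8';
    }
    if (report.col === 1 + bytes) {
        return 'cp437';
    }
    return 'unknown';
}
```

A door would write PROBE to the caller, read the reply, and pass it to guessEncoding. If the BBS translates the door's output on the way to the caller, the probe only ever reports the encoding the BBS was configured to expect from the door, which matches the behaviour described in this post.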

    The problem (I think) is that ENiGMA wants to help. Under Linux, using stdio, you have to choose utf8 or cp437. With cp437 selected, I can only ever detect cp437 and never unicode -- probably because ENiGMA is encoding/translating in the background for me. If I select utf8, then cp437 output gets mojibaked.

    I thought maybe if I could set up a "raw" mode that wasn't translated I'd have better luck, but it doesn't seem like it. Here's what I tried.

    In core/door.js, I tried doing something like this:

    doorDataHandler(data) {
        if ('raw' === this.encoding) {
            this.client.term.write(data);
        } else {
            this.client.term.write(decode(data, this.encoding));
        }
    }

    But it still seemed like it's getting mojibaked. Is this where the doors get encoded? Is there somewhere else that encoding is going on?

    Thanks for reading and take care,

    bugz


    --- ENiGMA 1/2 v0.0.12-beta (linux; x64; 14.16.0)
    * Origin: BZ&BZ BBS (21:4/110)
  • From bugz@21:1/182 to bugz on Thursday, April 01, 2021 00:40:00
    bugz wrote to All <=-

    I've been working on writing my own door and door library, but I've run into some interesting issues with ENiGMA. In my library, I try to
    detect if utf8/unicode or cp437 is active, and adjust my output accordingly.

    Or I could just have ENiGMA call my door with the user's encoding set correctly. There's an access condition for that.

    In my door menu:

    {
        value: { command: "0" }
        action: [
            {
                acs: EC0
                action: @menu:doorAce
            }
            {
                action: @menu:doorAceU8
            }
        ]
    }

    And I set encoding: utf8 in the doorAceU8 block, and there we go.

    If the caller's encoding is CP437, they get doorAce. Otherwise,
    they get doorAceU8 configured with utf8.

    Now, I just need to finish writing the door.

    Take care,

    bugz

    ... Don't sweat petty things.... or pet sweaty things.
    --- MultiMail/Linux v0.52


    --- Talisman v0.16-dev (Linux/x86_64)
    * Origin: HappyLand v2.0 - telnet://happylandbbs.com:11892/ (21:1/182)
  • From NuSkooler@21:1/121 to bugz on Saturday, April 03, 2021 10:42:06

    On Thursday, April 1st bugz muttered...
    In core/door.js, I tried doing something like this:
    doorDataHandler(data) {
        if ('raw' === this.encoding) {
            this.client.term.write(data);
        } else {
            this.client.term.write(decode(data, this.encoding));
        }
    }

    Sounds like you already found a work around via ACS, but you were on the right track here.

    When the external process is spawned we disable all encoding so everything is "raw" between ENiG and the process. The handler you have above takes that raw |data| and decodes it based on the specified encoding -- on the way out back to the client's term it has to be re-encoded to whatever their term specified.

    e.g. PID -> UTF-8 -> Decode to JS/Unicode -> Encode CP437 -> Terminal
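The chain above can be sketched in a few lines of Node.js. This is an illustration only, not ENiGMA's code: the conversion table is a hypothetical five-entry stand-in for a real conversion library, and decodeDoorOutput and encodeForCp437Terminal are made-up names.

```javascript
// Hypothetical, deliberately tiny Unicode -> CP437 table. A real setup would
// use a full conversion library; only a few glyphs are listed here.
const UNICODE_TO_CP437 = new Map([
    ['\u2591', 0xb0], // light shade
    ['\u2592', 0xb1], // medium shade
    ['\u2593', 0xb2], // dark shade
    ['\u2588', 0xdb], // full block
    ['\u00e9', 0x82], // e with acute accent
]);

// Step 1: raw bytes from the door process -> JS (Unicode) string.
function decodeDoorOutput(data, encoding) {
    return data.toString(encoding);
}

// Step 2: JS string -> bytes in the caller's terminal encoding (CP437 here).
function encodeForCp437Terminal(text) {
    const out = [];
    for (const ch of text) {
        const code = ch.codePointAt(0);
        if (code < 0x80) {
            out.push(code); // ASCII range is passed through unchanged
        } else if (UNICODE_TO_CP437.has(ch)) {
            out.push(UNICODE_TO_CP437.get(ch));
        } else {
            out.push(0x3f); // '?' placeholder for unmapped characters
        }
    }
    return Buffer.from(out);
}

// A UTF-8 door writes "café" plus four block characters.
const fromDoor = Buffer.from('caf\u00e9 \u2588\u2593\u2592\u2591', 'utf8');
const toTerminal = encodeForCp437Terminal(decodeDoorOutput(fromDoor, 'utf8'));
console.log(toTerminal.toString('hex')); // 6361668220dbb2b1b0
```

The point of the two-step design is that the door's encoding and the caller's encoding are independent: each side only has to convert to or from the JS string in the middle.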

    This is used successfully in a lot of setups, so I'd have to have more details as to why it's not working for you I guess -- assuming the work around isn't too much of a PITA/you want to muck with it more :)



    --
    ■ NuSkooler // Xibalba - "The place of fear"
    ■ xibalba.l33t.codes (44510/telnet, 44511/ssh)
    ■ ENiGMA 1/2 WHQ | Phenom | 67 | iMPURE | ACiDic
    --- ENiGMA 1/2 v0.0.12-beta (linux; x64; 14.15.4)
    * Origin: Xibalba -+- xibalba.l33t.codes:44510 (21:1/121)
  • From bugz@21:1/182 to NuSkooler on Sunday, April 04, 2021 18:49:00
    NuSkooler wrote to bugz <=-

    Sounds like you already found a work around via ACS, but you were on
    the right track here.

    I guessed right. :D I amaze myself sometimes...

    When the external process is spawned we disable all encoding so
    everything is "raw" between ENiG and the process. The handler you have above takes that raw |data| and decodes it based on the specified
    encoding -- on the way out back to the client's term it has to be re-encoded to whatever their term specified.

    Actually, I have a problem with the normal process, also. My card door game is trying to use CP437 0x03, 0x04, 0x05, 0x06 (hearts, diamonds, clubs, spades). But I'm thinking that "iconv" (or whatever is doing the encoding) ignores these as control codes. [RATS!]

    e.g. PID -> UTF-8 -> Decode to JS/Unicode -> Encode CP437 -> Terminal

    ^ This here doesn't like the "control codes" symbols.

    This is used successfully in a lot of setups, so I'd have to have more details as to why it's not working for you I guess -- assuming the work around isn't too much of a PITA/you want to muck with it more :)

    The "raw" workaround above failed on CP437 to CP437 output. It came out mojibaked/garbled. It's like editing ANSI files/high ascii with vim, where it "fixes" the broken utf-8 it sees.

    Yes, there are ways to make vim work right ...

    vim --cmd "set encoding=Latin1" WELCOME.ANS

    Am I missing something further along in the terminal output part to get
    "raw"? I think I'll probably stick to the ACS fix until I see what happens with the door running on other BBS software.

    I don't think there's many BBSes that handle unicode, and unicode is weird.
    And fewer (if any?) doors that are crazy enough to try unicode. I just happened to be crazy enough to try writing a CP437/unicode door.

    They don't call it unipain without a reason!

    And I'm not sure how other BBS systems handle these cases. That's for
    testing out another day. I don't think I'd even be trying unicode
    support with them.

    I think you've got a great way of dealing with unicode/CP437. Thanks
    for reading and the feedback.

    Take care,
    bugz

    ... You must know your limits to break through them.

    --- MultiMail/Linux v0.52


    --- Talisman v0.16-dev (Linux/x86_64)
    * Origin: HappyLand v2.0 - telnet://happylandbbs.com:11892/ (21:1/182)
  • From fusion@21:1/616 to NuSkooler on Monday, April 05, 2021 09:27:53
    On 03 Apr 2021, NuSkooler said the following...

    This is used successfully in a lot of setups, so I'd have to have more details as to why it's not working for you I guess -- assuming the work around isn't too much of a PITA/you want to muck with it more :)

    tbh i'd love it if most bbses just defaulted to utf-8 and auto-translated art/doors/message bases/etc to utf-8. the target bbs of echomail for example.. make it their responsibility to force their bases to cp437 should they wish to do so. i don't think that'd fly though considering what we saw here with ONE non-English-language message but.. ;)

    mystic (a46 at least) SEEMS to have utf-8 detection but doing just 'ssh bbs' from a linux command prompt doesn't seem to figure it out. seems to work with ZOC. i'm assuming Netrunner also works right. but tbh the target all along
    for me feels like it should be "nothing installed, user has ssh already" .. just like how windows used to always come with 'telnet'.

    in windows 10 only but.. openssh even automatically sets the command prompt to utf-8, the command prompt's ansi emulation is better than most (no more ansi.sys monkey business) AND it's hardware accelerated. without installing anything.

    i dunno, i'm probably preaching to the choir ;)

    --- Mystic BBS v1.12 A46 2020/08/26 (Windows/32)
    * Origin: cold fusion - cfbbs.net - grand rapids, mi (21:1/616)
  • From NuSkooler@21:1/121 to bugz on Monday, April 05, 2021 16:32:04

    On Monday, April 5th bugz was heard saying...
    Actually, I have a problem with the normal process, also. My card door game is trying to use CP437 0x03, 0x04, 0x05, 0x06 (hearts, diamonds, clubs, spades). But I'm thinking that "iconv" (or whatever is doing the encoding) ignores these as control codes. [RATS!]

    Shoot I should have pointed this out prior. Yes, this is a known issue. iconv doesn't make an attempt to convert some of the "control" characters in the CP437 table but instead spits out a placeholder. I have a ticket in with the author (it's a blocker for a game I'm working on) & will fork and PR to them when I get a chance if they don't fix it.
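Until the converter handles these bytes, one possible workaround is to special-case them before conversion. The sketch below is an assumption about how that could look, not the fix NuSkooler has in mind; decodeCp437WithSuits is a hypothetical helper, and the high half of the CP437 table is left out for brevity.

```javascript
// CP437 glyphs for the four card-suit bytes mentioned in the thread.
const CP437_SUITS = {
    0x03: '\u2665', // hearts
    0x04: '\u2666', // diamonds
    0x05: '\u2663', // clubs
    0x06: '\u2660', // spades
};

// Hypothetical helper: turn CP437 door output into a JS string, mapping the
// suit bytes to their glyphs and leaving every other byte below 0x80 as-is,
// so BEL, BS, CR, LF and ESC still act as control codes.
function decodeCp437WithSuits(data) {
    let text = '';
    for (const byte of data) {
        if (byte in CP437_SUITS) {
            text += CP437_SUITS[byte];
        } else if (byte < 0x80) {
            text += String.fromCharCode(byte);
        } else {
            text += '\ufffd'; // high half omitted in this sketch
        }
    }
    return text;
}
```

For example, decodeCp437WithSuits(Buffer.from([0x41, 0x03, 0x0d, 0x0a])) gives "A", a heart, and an untouched CR/LF pair. Only the bytes a door is known to use as glyphs are remapped, which avoids breaking the control codes that terminals still rely on.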



    --
    ■ NuSkooler // Xibalba - "The place of fear"
    ■ xibalba.l33t.codes (44510/telnet, 44511/ssh)
    ■ ENiGMA 1/2 WHQ | Phenom | 67 | iMPURE | ACiDic
    --- ENiGMA 1/2 v0.0.12-beta (linux; x64; 14.15.4)
    * Origin: Xibalba -+- xibalba.l33t.codes:44510 (21:1/121)
  • From bugz@21:1/182 to NuSkooler on Wednesday, April 07, 2021 22:11:00
    NuSkooler wrote to bugz <=-

    Shoot I should have pointed this out prior. Yes, this is a known issue. iconv doesn't make an attempt to convert some of the "control"
    characters in the CP437 table but instead spits out a placeholder. I

    Well, not all of the control codes can be converted, we still need some of them:

    Ctrl-G (07/bell), Ctrl-H (08/backspace), Ctrl-M (enter). There's some
    control codes you really can't touch.

    Ctrl-L (formfeed) I remember seeing that to "clear" the screen for very
    old terminal types.

    But, I'm good with my unicode/CP437 detection -- so this doesn't really get
    me anymore. Actually, I've been going crazy with it -- I actually have it swapping (C) for the unicode "\u00a9" (©) copyright symbol.
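A swap like that can be as small as the sketch below. It assumes the door already knows the caller's encoding as the string 'utf8' or 'cp437'; localizeCopyright is a hypothetical name, not a function from bugz's library.

```javascript
// Hypothetical helper: replace the ASCII fallback "(C)" with the real
// copyright sign only when the caller's terminal is known to be UTF-8.
// CP437 has no copyright glyph, so CP437 callers keep "(C)".
function localizeCopyright(text, encoding) {
    return encoding === 'utf8' ? text.replace(/\(C\)/g, '\u00a9') : text;
}
```

Writing the text once with the ASCII fallback and upgrading it at output time keeps the door's strings readable on both kinds of terminal.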

    Take care,
    bugz

    ... Let's split up, we can do more damage that way.
    --- MultiMail/Linux v0.52


    --- Talisman v0.16-dev (Linux/x86_64)
    * Origin: HappyLand v2.0 - telnet://happylandbbs.com:11892/ (21:1/182)