Full Sail: Power User Tips

Computers
With Character(s)

by
Norman L. De Forest
Beacon Correspondent

          In part one of this series, "Full Sail Vol.2 No.2 Browsing With Character", you learned how to configure lynx to match the character set on your computer. In part two, "Full Sail Vol.2 No.3 Creating Web Pages With Character", you learned how to include special characters on your web site or in your email.

          In this episode, you will learn how to get special characters on your computer. You may find it preferable to use the ISO-8859-1 or the Windows cp1252 character set on CCN because (a) you don't lose any accented characters; (b) you don't have problems with character 128 on your system (C-cedilla on IBM PCs or A-umlaut on the Macintosh), which lynx will print to a file but won't display on the screen; (c) you can read accented characters in news postings or email (pine doesn't translate to your computer's character set as lynx does but just warns you that some characters may be displayed incorrectly); and (d) leaving the text unchanged by lynx allows you to use other fonts or utilities which can translate the text for you if it isn't really in a Latin character set. In some circumstances this may require some reconfiguring of your computer.
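
          The point about character 128 can be checked with a short script. The following is a minimal Python sketch, not part of the original setup: it decodes one byte value under three of Python's built-in codecs. Treating "cp437" as the IBM PC set and "mac_roman" as the Macintosh set is an assumption made for illustration, and the function name is hypothetical.

```python
import unicodedata

# Python codec names standing in for the machines mentioned above.
# The pairing of codec to machine is an assumption for illustration.
CODECS = ["cp437", "mac_roman", "cp1252"]

def describe_byte(value):
    """Return {codec: character} for a single byte value under each codec."""
    raw = bytes([value])
    return {codec: raw.decode(codec) for codec in CODECS}

if __name__ == "__main__":
    for codec, char in describe_byte(128).items():
        # Print the Unicode name so the output is plain ASCII.
        print(f"{codec}: {unicodedata.name(char)}")
```

          The same byte comes out as a different character under each codec, which is why leaving the text untranslated and matching the font to it avoids the problem.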

[At this point, if I were a good cartoonist, you would be seeing a picture of a cloud of dust raised by a crowd hurrying to the exit and a picture of me getting up and trying to brush the footprints off my clothes. But I'm not so you won't.]

          Is configuration *that* scary? After all, this column is supposed to be for advanced users.

          Unfortunately, I can't cover every system. Some computers can't have the character sets changed and for some others I do not have any information or experience.

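          I do have some information on:

  • MS-DOS using VGA video cards,
  • Windows programs,
  • MS-DOS programs running in a Windows DOS box, and
  • the Macintosh (references only).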

          Any information on machines not covered will be gratefully accepted and possibly included in a subsequent article -- or perhaps you could get yourself in print by submitting such an article to the Beacon editor.

 

MS-DOS using VGA video cards.


          The early IBM PCs (and compatibles) using monochrome video adapters, CGA video adapters, and EGA video adapters could not have the font changed when operating in text mode. Only the later models of Hercules graphics adapters had provisions for software-loaded fonts.

          Those using VGA and compatible cards are in luck. You can change the font used in text mode. "How?" you ask. Whenever I am looking for a utility to do something, the first places I look are the software archives.

          A search of the Simtel and Garbo software archives revealed one font editor/loader that has been released as freeware: fpman220.zip, a VGA font editor and loader package.

          With that, I created a VGA-compatible font with the ISO-8859-1 character set. Later, when I found the specifications for the ISO-8859-1-Windows-3.1-Latin-1 font (also known as Code Page 1252, a superset of the ISO-8859-1 character set), I amended my VGA font to match the entire defined cp1252 character set (available here as a zipped file). (For the undefined characters I used miniature 7-segment hexadecimal numbers so they can be distinguished from each other.) I then changed my lynx settings to select the win cp1252 character set. (See the first article in this three-part series if you missed how to change your lynx settings.)
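
          To see which positions cp1252 adds to ISO-8859-1 and which it leaves undefined, a script can walk the range 0x80 to 0x9F. This is a minimal Python sketch that relies on Python's own "cp1252" codec table, which may differ in detail from the specification used for the font; the function name is hypothetical.

```python
def cp1252_extras():
    """Split bytes 0x80-0x9F into those cp1252 defines and those it does not.

    Returns (defined, undefined): a dict of byte value -> character,
    and a list of byte values that Python's cp1252 codec rejects.
    """
    defined, undefined = {}, []
    for value in range(0x80, 0xA0):
        try:
            defined[value] = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            undefined.append(value)
    return defined, undefined

if __name__ == "__main__":
    defined, undefined = cp1252_extras()
    print("defined positions:  ", " ".join(f"{v:02X}" for v in defined))
    print("undefined positions:", " ".join(f"{v:02X}" for v in undefined))
```

          The undefined positions are the ones that need a placeholder glyph, such as the hexadecimal numbers described above.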

          Now, before starting up my terminal program, I load the cp1252 font into my VGA card with the VGA utility from the fontman package using the command:
        C:\path\VGA  FONT  C:\path\CP1252.FNT
(with the actual path substituted for '\path\' above) and can then view email and web pages with any of the defined cp1252 characters.

          My VGA character set before and after loading in the new font:

[a screen shot of the default VGA character set.]

[a screen shot of my cp1252 VGA character set.]

          The advantages of this:

  1. I get to see all of the Windows cp1252 characters that lynx supports and also can see them properly when viewing email or news-postings in which people have used such characters. If I had lynx change the characters displayed to the DOS character set, accented characters would still be seen incorrectly with pine when viewing such characters in email or on usenet.
  2. I can include such characters in my email and see that they are the correct characters (but it is only useful if I know the recipient's computer can also display such characters).
  3. If web pages are not in the ISO-8859-1 character set but in a non-Latin character set I can still print them to my home directory (with the lynx 'p' command), download them, and then print them out in the original language by editing the file with Write (telling Write not to convert from MS-DOS text) and using a font for that character set.

          The disadvantages are:

  1. Since the text is not converted to the character set used by my system and my printer, it can be a bit difficult to print out such text files in DOS without having to convert them manually. I'm still working on that.
  2. Windows still thinks plain text files are in the IBM PC character set and will "convert" the files to the Windows character set if they are imported into a Windows word processor even though they are already in the Windows character set. It then gets all of the characters wrong. Currently, I make two copies of the file, open one with the word processor and the other in DOS text mode, and change the DOS font to a cp1252 font which has been flagged as an OEM font. (I don't mind lying to my computer to get things done.) I then manually change the word processor file to match the DOS text file. (If anybody knows how I can get Windows to stop converting the character sets of text files, the information would be greatly appreciated.)
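
          Both disadvantages come down to bytes being read under the wrong character set. The following minimal Python sketch illustrates the two cases. It assumes the DOS side uses code page 437 (many systems use 850 or another page instead), and the function names are hypothetical. It is not a description of what Windows or any particular printer driver does internally.

```python
def cp1252_to_dos(data, dos_codec="cp437"):
    """Re-encode cp1252 bytes for a DOS code page.

    Characters with no equivalent in the DOS code page become '?'.
    """
    text = data.decode("cp1252", errors="replace")
    return text.encode(dos_codec, errors="replace")

def misread_as_dos(data, dos_codec="cp437"):
    """Show what cp1252 bytes turn into when wrongly treated as DOS text."""
    return data.decode(dos_codec)

if __name__ == "__main__":
    sample = b"caf\xe9"  # "cafe" with e-acute, stored as cp1252
    print(cp1252_to_dos(sample))          # bytes suitable for a cp437 printer
    print(ascii(misread_as_dos(sample)))  # the wrong character, shown escaped
```

          The first function is the manual conversion done by script; the second shows why a file already in the Windows character set comes out wrong when it is "converted" a second time.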

 

Windows programs


          Also found in the Simtel archives was a font editor, Softy (sfty107a.zip) that can:

  • change font headers to redefine ANSI (Windows) fonts to OEM (PC) fonts so they can be selected by applications that won't allow you to select non-OEM fonts,
  • edit bitmapped fonts,
  • edit TrueType fonts, or
  • convert TrueType fonts to bitmapped fonts.

          With this, you can pick your favourite font and edit it to change its name and type and change the character set to cp1252 and then install it on your system.

          The tactic to use here, to be able to select a Windows font with an application that insists on an OEM font (such as HyperTerminal or Terminal), is to pick a suitable Windows bitmapped font and edit the headers to redefine it as an OEM font and save it with a new name. It can now be selected by your application. While you are at it, you can also update those characters in the font that don't meet the new cp1252 standard and change the undefined box characters to something that can be distinguished from each other.

          Here is a picture of the Softy dialogue box that is displayed when you select "Font", "Header..." with the mouse cursor showing where I have just changed the character set definition from "ANSI" (Windows character set) to "OEM":

[ A picture of the dialogue box. ]

          You can then use the font with HyperTerminal or Terminal and get the same advantages as you would get with the VGA font mentioned above.

 

MS-DOS programs running in a Windows DOS box


          I also was able to use the Softy font editor to create an edited version of the font used in a DOS box in Windows. Getting the system to use it with Windows 3.1 was a bit of a problem. Windows 95 users have it easier here. It is possible to select the font used in a DOS box on a window-by-window basis. (Gack! I can't believe I just said something nice about Windows 95. OK, so my bias is showing.) For some reason, Windows 3.1 wouldn't accept my changes to the default fonts and would reselect one of the already-installed Terminal fonts as the default font for a DOS box. My solution (which may not be optimal) was to:

  1. Back up the following OEM fonts in the \windows\system\ directory
    (apparently they all register as "Terminal" fonts with Windows):
    • copy 8514oem.fon *.nof
    • copy cga40woa.fon *.nof
    • copy cga80woa.fon *.nof
    • copy dosapp.fon *.nof
    • copy ega40woa.fon *.nof
    • copy ega80woa.fon *.nof
    • copy vgaoem.fon *.nof
  2. Back up the following .ini files in the \windows\ directory:
    • copy win.ini *.nin
    • copy system.ini *.nin
  3. Select the "Main" group, select "Control Panel", select "Fonts", and then select and "Remove" each of those fonts from the set of installed fonts.
  4. Copy my modified font to \windows\system\vgaoem.fon
  5. Select the "Main" group, select "Control Panel", select "Fonts", and then use the "Add" option to install the new font.

          There may be an easier method but this worked for me.
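
          The two backup steps above can also be scripted. This is a minimal Python sketch of the copy commands only, assuming a Python interpreter is available on the machine holding the files; the function name and the idea of passing the directory as an argument are illustrative. It does not touch the Control Panel steps.

```python
import shutil
from pathlib import Path

# File names taken from the steps above.
OEM_FONTS = ["8514oem.fon", "cga40woa.fon", "cga80woa.fon", "dosapp.fon",
             "ega40woa.fon", "ega80woa.fon", "vgaoem.fon"]
INI_FILES = ["win.ini", "system.ini"]

def back_up(directory, names, new_suffix):
    """Copy each named file that exists to the same name with new_suffix.

    Returns the names of the backup copies that were written.
    """
    copied = []
    for name in names:
        source = Path(directory) / name
        if source.is_file():
            target = source.with_suffix(new_suffix)
            shutil.copy2(source, target)  # the original is left in place
            copied.append(target.name)
    return copied
```

          For example, back_up(r"C:\windows\system", OEM_FONTS, ".nof") mirrors step 1 and back_up(r"C:\windows", INI_FILES, ".nin") mirrors step 2.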

          Here is a picture of 4DOS running in a DOS box after running a batch file to display all 256 characters:

[ A picture of the Terminal cp1252 font. ]

 

Macintosh references


          I am going to have to plead ignorance as far as Macintosh computers are concerned and merely suggest that some of the techniques I have used with Windows (both those mentioned above and those mentioned below) can probably be used on the Macintosh if you have the right fonts or font utilities. Following are some links to notes about the Macintosh, to fonts for the Macintosh, and to font utilities. I regret that this is all the help I can offer at this time.

  1. ISO-8859-1 and the Mac platform
  2. Directory: /pub/i18n/ucs/EversonMono10646 -- Multilingual fonts for Mac
  3. Tommy of Escondido's Alien Fonts Page other SF font section -- plus TT to Mac font convertor
  4. INFO-MAC HyperArchive ROOT -- includes notes on fonts
  5. font/_TrueType
  6. _Font -- Macintosh font archive -- includes 'foreign' fonts.
  7. font/_Utility -- Font utilities for Macintosh

 

Using Cyrillic and other fonts


          A message to userhelp arrived in my mailbox last March that I couldn't read. It was in Russian. I had no way of telling whether this was a query to userhelp about CCN or just another piece of junk email (a.k.a. 'spam').

          I downloaded the message to my machine and made a search for a font with a Cyrillic character set. I found a site with such fonts and downloaded one and installed it on my system. I then logged off, entered Windows and opened the file with Write, making sure I selected no conversion to Write format:

[ A picture of the mouse cursor selecting 'No Conversion'. ]

          At first the message was as unreadable as before:

[ A screen shot of the message with some headers removed. ]

          I selected the message text with the mouse and clicked on the menu item "Characters", "Fonts..." and then selected the Cyrillic font, "Bulgarian Courier":

[ A screen shot of the mouse cursor selecting the font. ]

          Clicking on the "OK" button and clicking on the document gave me Russian text but misformatted on the display:

[ A screen shot of the Russian text. ]

          With a little formatting by adding and deleting line-breaks, I had the text tidy enough to print:

[ A screen shot of the edited Russian text. ]

          I was then able to print out the text and take it to the Department of Russian Studies at Dalhousie University where one of the professors (a very friendly gentleman) was able to translate the message for me.

          It was spam -- for printer-management hardware or software.

          Similar tactics could be used to render other languages such as Greek.
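
          Choosing a font for such a message amounts to guessing which character set the bytes were written in. The article does not say which encoding the Russian message used, so the following minimal Python sketch simply tries several common Cyrillic codecs and leaves the judgement to a human reader. The candidate list and function name are assumptions for illustration.

```python
# Common Cyrillic encodings, as named by Python's codec registry.
CANDIDATES = ["koi8_r", "cp1251", "cp866", "iso8859_5"]

def decode_candidates(data):
    """Return {codec: text} for every candidate codec that can decode data."""
    results = {}
    for codec in CANDIDATES:
        try:
            results[codec] = data.decode(codec)
        except UnicodeDecodeError:
            continue  # this codec has no character for some byte
    return results

if __name__ == "__main__":
    # A Russian greeting ("Privet"), stored here as KOI8-R bytes.
    sample = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("koi8_r")
    for codec, text in decode_candidates(sample).items():
        print(f"{codec}: {ascii(text)}")
```

          Only one of the results will read as sensible Russian; the others are the script equivalent of picking the wrong font.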

 

The Lumber Cartel (TINLC)
"Sooper Sekrit Decoder Ring"
or Rot-13 Decoding with fonts



          I have also created a rot-13 (see footnote 1, Full Sail Vol.2 No.1) version of the VGA font so those using pine or lynx to read news postings can shell out from TELIX (or whatever else they use) to DOS, load the "ROT13.FNT" font, exit from DOS back to TELIX (o.w.e.t.u.), and read rot-13 news postings without having to decode them manually. They can then shell out to DOS again and reload the ISO font to get back to normal. Download ROT13.FNT as a ZIP file and try it out with:

          Gur dhvpx oebja sbk whzcf bire gur ynml qbt.

          GUR DHVPX OEBJA SBK WHZCF BIRE GUR YNML QBT.

Note that the ROT-13 font is a conversion of my older ISO-8859-1 font and does not have all of the cp1252 additions and corrections. Since the characters affected are rarely seen ('S', 'Z', 's', and 'z' with a caron), adding those characters hasn't had a very high priority in my schedule.

[a screen shot of the ROT-13 VGA character set.]
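
          The font performs the letter swap on the screen; the same swap can be done on the text itself. This is a minimal Python sketch of ROT-13 using a translation table, offered as an alternative to the font and not as the method used to build it. The function names are hypothetical.

```python
import string

def rot13_table():
    """Build a table that swaps A-M with N-Z and a-m with n-z."""
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    return str.maketrans(
        lower + upper,
        lower[13:] + lower[:13] + upper[13:] + upper[:13],
    )

def rot13(text):
    """Encode or decode text; ROT-13 is its own inverse."""
    return text.translate(rot13_table())

if __name__ == "__main__":
    print(rot13("Gur dhvpx oebja sbk whzcf bire gur ynml qbt."))
```

          Characters outside A-Z and a-z, including accented letters, pass through unchanged.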

          There is one other way you can view ROT-13-encoded text in usenet postings and one way you can ROT-13-encode or ROT-13-decode text on CCN:

  1. Using the tin newsreader:
    1. View the posting with the ROT-13-encoded text.
    2. Press 'd' (lower-case 'D', without the quotes) to toggle ROT-13 decoding on.
    3. Read the encoded text.
    4. Press 'd' again to toggle things back to normal.
  2. Using the lynx 'f' ('File menu') command:
    1. Make sure no files are currently tagged.
    2. Use the up/down arrow keys to select the file you wish to view or encode/decode.
    3. Press 'f' (without the quotes) to get the File menu.
    4. Select item 13 ("rot13 (current selection)") and press your ENTER key to view the file with ROT-13 encoding/decoding.
    5. Press the Left-Arrow key to exit if you just wanted to view the file.
    6. Press 'p' for "Print" and select item 1 to print the encoded/decoded version to another file in your CCN home directory if you want a permanent copy (or want an encoded version of plaintext to paste into a usenet posting) and then use the Left-Arrow key to exit from the file.

          If you use Softy again, you can also create a ROT-13 font for Windows applications by swapping the 'A'-'M' with 'N'-'Z' and 'a'-'m' with 'n'-'z' in a cp1252 font. It can then be used to selectively decode parts of a usenet posting after you have downloaded it by highlighting the text to be decoded while viewing it with a Windows editor such as Write and changing the font for the selected text only:

[Screen shot: A usenet posting being viewed with Write.]

[Screen shot: the same posting with the encoded URLs decoded.]

 

Viewing Chinese, Korean, and Japanese text


          Those using Windows can also view Chinese text.

          For some reason, I have been receiving a lot of spam lately in Chinese. You can view one such message full of HTML as text with full email headers or as an HTML file with the email headers stripped out.

          After much searching for something to use to view and print the Chinese messages in order to take them to Dalhousie University for translation, I saw, in a posting to the news.admin.net-abuse.email newsgroup, the URL for a viewer that can be used in Windows to display the text of such messages in Chinese. The NJStar Asian Software Development (南極星軟件公司) site in Australia has available for download as shareware their NJWIN CJK Viewer for Windows 3.1/95/98/NT. While most of their other CJK (Chinese, Japanese, and Korean) software requires Windows 95/98/NT/2000, their viewer will also work with Windows 3.1. It intercepts text written to the screen in Windows and substitutes the appropriate Chinese, Japanese, or Korean text. (At least, it is supposed to do so. I was unable to get it to recognise and redraw a Japanese web page I tried viewing. It seems to work for Chinese and I have not encountered any Korean sites to test.) Note that for this to work while you are viewing pages with lynx, you must have lynx configured to assume your system uses the ISO-8859-1 or the cp1252 character set or else lynx will attempt to convert the characters to the character set it thinks your computer is using and NJWIN will no longer be able to recognise the encoding for the Chinese characters.
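
          The reason the lynx setting matters can be shown with a few bytes. The following minimal Python sketch uses a two-character GB2312 sample and compares passing the bytes through untouched with translating them from Latin-1 to a DOS code page. The translation function is only a rough stand-in for what a character-set conversion does; it is not lynx's actual code, and the function names and the choice of cp437 are assumptions.

```python
def passes_through(raw, terminal_codec):
    """True if decoding and re-encoding with terminal_codec keeps raw intact."""
    try:
        return raw.decode(terminal_codec).encode(terminal_codec) == raw
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False

def translate_for_display(raw, assumed_source="latin-1", display="cp437"):
    """Rough stand-in for a Latin-1 to DOS conversion; unmappable -> '?'."""
    return raw.decode(assumed_source).encode(display, errors="replace")

if __name__ == "__main__":
    raw = "\u4e2d\u6587".encode("gb2312")  # two Chinese characters as GB bytes
    print(raw.hex(), passes_through(raw, "latin-1"))
    print(translate_for_display(raw).hex())  # no longer valid GB bytes
```

          Once the bytes have been translated, the original two-byte sequences are gone and a viewer has nothing left to recognise.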

          The NJWIN CJK menu offers you a number of options:

                           ----------------------------
                            Ansi / No CJK Support
                           ----------------------------
                            1 Chinese Auto Simplified
                            2 Chinese Auto Traditional
                            3 Chinese GB Simplified
                            4 Chinese GB Traditional
                            5 Chinese Big5 Simplified
                            6 Chinese Big5 Traditional
                           ----------------------------
                            7 Japanese Auto-Detect
                            8 Japanese EUC - JIS
                            9 Japanese Shift-JIS
                           ----------------------------
                            0 Korean KSC 5601
                           ----------------------------

[ A picture of the NJWIN Menu. ]

          Here are a couple of screen shots of me viewing a Chinese HTML spam with lynx while logged in with Terminal for Windows 3.1, first with the translation off and then with it turned on:

[Screen shot: Chinese spam with NJWIN translation turned off.]

[Screen shot: The same spam with NJWIN translation turned on.]

          NJWIN CJK even works for text displayed in a DOS box in Windows:

[Screen shot: Chinese spam with NJWIN translation turned off.]

[Screen shot: The same spam with NJWIN translation turned on.]

 


          Footnote:
TINLC = There Is No Lumber Cartel.

 

You may direct comments or suggestions about this column to:

Norman L. De Forest,  af380@chebucto.ns.ca

 
