HPR
TECHNICAL NOTE
TN 95/12
Revision 01












UUENCODING
for
Binary File Transfers
via E-Mail


JULY 1996











H. Paul Robinson, P.Eng.
170 Merrimac Drive
Dartmouth, Nova Scotia
Canada B2W 4P8

E-Mail: probinso@fox.nstn.ns.ca
p.robinson@ieee.ca

Updates: http://www.ccn.cs.dal.ca/~am074/





This material may not be reproduced without the express written permission of the author, except that permission is granted to copy this document provided it is kept intact and is distributed for private, non-commercial use.

1.0 Introduction

It is often necessary to send a binary file via an electronic
mail (e-mail) system. Unfortunately, binary file transfers are
not supported by all e-mail systems. Examples of binary files
include: .ZIP compressed files, .DOC and .WP word processor
files, .EXE and .COM program files. One solution is encoding.

Encoding creates representative text files from binary files so
that the encoded files can be transmitted via an e-mail system
and the binary file re-constructed by the recipient. Popular
encoding formats include UUencoding, XXencoding, MIME, and
BinHex.

Some e-mail programs handle the encoding and decoding
automatically. This note is written to help those using e-mail
systems which do NOT handle this encoding and decoding
automatically.

The use of UUencoding and XXencoding will be described. Also in
this document, the use of the CompuServe mail to Internet e-mail
gateway with encoded files is specifically addressed.

2.0 Encoding Programs

There are many encoding and decoding programs which support
UUencoding and XXencoding. One package for the PC is UU-ENCODE 95
(v40) by Richard Marks, distributed as UUEXE540.zip. It contains
UUENCODE.exe, UUDECODE.exe, and documentation files. These are
fully functional, smart, and robust encoding and decoding
programs. The $10 shareware version also decodes MIME; the
freeware version can encode MIME but cannot decode it.

The freeware version of this .ZIP file is available from many
Internet anonymous FTP sites. See Appendix D.


3.0 ENCODING

ASCII files contain only letters, numbers, and punctuation
characters. Although the term is not completely accurate, ASCII
characters are sometimes called printable characters. Binary
files, on the other hand, may contain ASCII characters plus other
characters. That is, binary files may contain all byte values
from 0 to 255.

Encoding creates a representative text file from a binary file.
The output encoded file contains only ASCII text characters and
can be sent by an e-mail system.
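
The idea can be illustrated with a short Python sketch. This is
not the UUENCODE program described in this note; it is a minimal
illustration which assumes the standard Python 'binascii' module
and uses a hypothetical function name, 'uuencode_bytes'. It adds
no section headers or checksums.

```python
import binascii


def uuencode_bytes(data: bytes, name: str, mode: str = "644") -> str:
    """Return a UUencoded text representation of data (illustrative only)."""
    lines = [f"begin {mode} {name}"]
    # Each encoded line carries at most 45 bytes of the binary input.
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        # b2a_uu returns one encoded line ending in a newline character.
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    lines.append("`")    # conventional zero-length line before "end"
    lines.append("end")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(uuencode_bytes(b"Cat", "cat.txt"), end="")
```

Every character in the result is plain ASCII text, so the result
can be pasted into the body of an e-mail message.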

3.1 UUENCODE Recommended Usage

The following command line syntax is recommended:

uuencode [-X] -s filename.ext [output-filename]

Parameters in square brackets [] are optional.

-X The -X switch (uppercase X) is recommended, but use with
care. This performs XXencoding instead of UUencoding.
XXencoded files are less likely to be corrupted by modern
e-mail gateways, but XXdecoding programs are not as popular.

Encoding with the lowercase -x switch will produce output
files with the ".UUE" extension, but the files will be
XXencoded. Uppercase -X produces ".XXE" file extension.
The filename extension is not important as this UUDECODE
program correctly adapts automatically. Also, the filename
is usually lost during the e-mail message preparation.

Ensure that the recipient can decode XXencoded files,
otherwise do not use this switch.

-s The -s switch is recommended, but also use with care. This
switch creates the largest possible file, a SINGLE output
file which is easier to decode.

Problems can easily occur if the e-mail system, or e-mail
gateway, does not handle large e-mail messages and splits
them into many smaller e-mail messages before sending. This
can happen without the knowledge of the sender.

Do not use this switch with e-mail systems which subdivide
large e-mail messages into smaller messages. The resulting
smaller messages may not all contain section headers. The
missing header information will make decoding the files more
difficult. The preferred solution is to keep each binary
file small. An alternative solution is to use multisection
encoded files.

-s nnn
Multisection (multiple) output files are supported by the
program UUENCODE. Multisection files are files which are
divided into sections by the encoding program. If used
properly, each section, and therefore each message, will be
small enough to be transmitted intact without being
subdivided. Also, each section will include a header
containing the name of the file and position (order)
information to instruct the decoding program during
decoding. These instructions will ensure the file is
reconstructed correctly.

The size of the output files is controlled by the 'nnn'
parameter. This parameter specifies the maximum number of
lines allowed in each output file. The resulting file size
is approximately 64 bytes per line. See the following
sections for more information on multisection file size.

filename.ext
This is the name of the file which is to be encoded. It is
also the filename which will be automatically assigned to
the reconstructed file by the decoding program during
decoding. It is therefore the name of the file which will
be received by the recipient.

output-filename
This is a temporary name assigned to the encoded file. If
multiple section output files are created during encoding,
"output-filename" should be used. Output-filename should be
less than 7 characters in length and should not end in a
number. The 7 character limit is required because the
encoding program creates output files with consecutively
numbered file names.

For example, if a large file, 'FILE.ZIP', is encoded using:

uuencode FILE.ZIP OUT

the encoded output file names will be in the form:

OUT1.UUE
OUT2.UUE
OUT3.UUE
OUT4.UUE
etc.

The output-filename must be short enough to allow UUENCODE
room to add the consecutive numbers to it. UUENCODE will
automatically provide the output filename extension, 'UUE' or
'XXE'.

The output file names are seldom important to the recipient,
as they are usually lost during the e-mail message
preparation. The original binary file name, however, is
preserved in the header information in each output file.

See UUSER.TXT for more information on encoding.

3.2 File Size

Encoding does not compress files. Output text files from
UUencoding are approximately 40% larger than the input file.
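
The 40% figure can be checked with a short Python sketch. It
assumes the standard 'binascii' module, 45 input bytes per
encoded line, and MS-DOS (CR LF) line endings. The function name
'encoded_size' is hypothetical, and the begin/end lines, section
headers, and checksums added by a real encoding program are
ignored.

```python
import binascii


def encoded_size(data: bytes, eol: bytes = b"\r\n") -> int:
    """Count the bytes of UUencoded text produced for data, line by line."""
    total = 0
    for start in range(0, len(data), 45):
        # One encoded line: a length character plus the encoded characters.
        line = binascii.b2a_uu(data[start:start + 45]).rstrip(b"\n")
        total += len(line) + len(eol)
    return total


sample = bytes(range(256)) * 450   # 115,200 bytes of arbitrary binary data
growth = encoded_size(sample) / len(sample) - 1.0
print(f"{len(sample)} bytes in, {encoded_size(sample)} bytes out, "
      f"{growth:.0%} larger")
```

Every 3 input bytes become 4 text characters, and each line also
carries a length character and a line ending, which together
account for the growth.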

If it is desired to keep the output files small, and this is
usually the case, then large .zip files containing many files
should be avoided. A large .zip file will produce either a
large encoded file or multisection (multiple) output files
during encoding. Some e-mail systems or gateway systems,
CompuServe for example, do not handle large e-mail messages. In
addition, decoding multiple output files is cumbersome.

To keep the encoded file small, encode each binary file
separately and send each encoded output file in a separate
e-mail message. In addition, each binary file could be zipped
before encoding to further reduce its size. Zipping also helps
protect file integrity, as the recipient will know if the file
has been corrupted during transmission.

3.3 CompuServe Mail

CompuServe mail supports binary file transfers (up to 2 MB
maximum in size), but only between CompuServe accounts. UU or XX
encoding is one solution for binary file transfers between
CompuServe and other e-mail systems, like Internet e-mail.

DOS CompuServe Information Manager, DOSCIM, version 2.2.3 was
tested using a DX-486 with 8 MB of RAM. E-mail messages up to
119,000 bytes were tested and were successfully transferred back
and forth between CompuServe and Internet e-mail. Larger
messages failed during the CompuServe send. Both UUencoded and
XXencoded files were transferred successfully.

The following instructions are recommended for sending encoded
binary files from CompuServe to Internet e-mail:

1 Create a .zip file. The preferred maximum zip file size is
80,000 bytes as this will create approximately the maximum
size file which can be transferred in a single CompuServe
e-mail message.

2 Encode the .zip file using the following:

uuencode -s1800 filename.ext

or

uuencode -X -s1800 filename.ext

(The '-s1800' limits the maximum output file size to under
113600 bytes.)

3 Prepare a new CompuServe e-mail message.

4 Select File/Import from the DOSCIM menu. Select the single
output .UUE or .XXE file, or select the first of multiple
.UUE or .XXE output files. The file will be entered into
the message body.

5 Select Out Basket. Wait. The software will take a few
seconds to store the message in the mail Out Basket.

6 Repeat the steps 3 to 5 for additional files, or further
parts of multisection (multiple) output files. Ensure each
file or section is sent in a separate e-mail message.
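
The number of messages needed for a given file can be estimated
before encoding. The following Python sketch is only an estimate:
it assumes 45 input bytes per encoded line and the rule of thumb
of 64 output bytes per line used in this note, and it ignores the
header lines added to each section. The function name
'plan_sections' is hypothetical.

```python
import math

BYTES_PER_LINE_IN = 45    # binary bytes carried by one full encoded line
BYTES_PER_LINE_OUT = 64   # approximate size of one encoded line (this note)


def plan_sections(input_bytes: int, lines_per_section: int = 1800):
    """Estimate (number of sections, size of largest section in bytes)."""
    total_lines = math.ceil(input_bytes / BYTES_PER_LINE_IN)
    sections = max(1, math.ceil(total_lines / lines_per_section))
    largest = min(total_lines, lines_per_section) * BYTES_PER_LINE_OUT
    return sections, largest


for size in (80_000, 250_000):
    sections, largest = plan_sections(size)
    print(f"{size} byte file: {sections} message(s), "
          f"largest about {largest} bytes")
```

With the default of 1800 lines, an 80,000 byte .zip file is
estimated to fit in a single message, while a larger file is
split into several sections, each sent separately as in step 6.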

3.4 Other E-Mail Systems and Gateways

CompuServe mail is generous in the size of text message it
permits. The maximum size is approximately 119,000 bytes.
Internet e-mail appears to be even more generous, as there does
not seem to be any limit on the size of Internet e-mail messages.

On the other hand, e-mail messages passing through some gateways
to the Internet are limited to approximately 22,000 bytes each.

Encoded files, or sections of multisection files, must be small
enough so as not to be split by the e-mail system. Otherwise,
the recipient will receive encoded sections which do not contain
header information. If the user is uncertain, the system should
be tested to determine the maximum permitted e-mail size and the
encoding adjusted appropriately.

The following command will limit the file size to 340 lines, or
under 22,000 bytes:

uuencode -s340 filename.ext

Other approximate file sizes can be calculated by assuming
64 bytes per line.
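
This calculation can be sketched in Python as follows. The
function name 'max_lines_for_limit' is hypothetical, and the
result is an upper bound based only on the 64 bytes per line
assumption. In practice a somewhat smaller value should be
chosen, as with the -s340 and -s1800 examples in this note, to
leave room for section headers and e-mail headers.

```python
def max_lines_for_limit(limit_bytes: int, bytes_per_line: int = 64) -> int:
    """Largest line count whose estimated size stays within limit_bytes."""
    return limit_bytes // bytes_per_line


for limit in (22_000, 119_000):
    lines = max_lines_for_limit(limit)
    print(f"limit {limit} bytes: at most {lines} lines per section")
```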


4.0 DECODING

When an encoded file is received in an e-mail message it must be
decoded to recover the original binary file. Before decoding it
may be necessary to determine the type of encoding used on the
file. Appendix C should help with this.

The UUDECODE program decodes UUencoded and XXencoded files. No
switches are required if the files were encoded using UUENCODE or
most other encoding programs. UUDECODE supports single and
multisection files and in most cases handles the whole process
automatically.

4.1 UUDECODE Recommended Usage for Single Files

The following command line syntax is recommended for single
encoded files:

uudecode filename.ext

The filename extension, '.ext', is optional, provided the file
extension is either '.UUE' or '.XXE'.

4.2 DECODING Multisection (Multiple) Files

When large binary files are encoded, the output is sometimes
placed in more than one file, called multisection files. In this
case, each section will be received in a separate e-mail message.
Before decoding, the received messages must be saved in files
using the following rules.

To decode multisection (multiple) files:

- Ensure that the encoded section in each e-mail message
begins with a header line. The header contains the following
information:

section 2 of 3 of file filename.ext

Where filename.ext is the name of the original binary file.

- The multisection (multiple) files must be named in
consecutive order. For example:

FILE1.uue
FILE2.uue
etc.

- The first file, FILE1.uue or FILE1.xxe, must contain the
first section of the multiple section encoded file. The
first section contains a line like:

begin 644 filename.ext

- The remaining sections may be in any order within the files
FILE2.uue, FILE3.uue, etc. UUDECODE will locate and
reassemble the remaining sections correctly using the
information in the header line at the start of each section.
(If the sections do not contain headers then the order of
the sections within the files is important. In this case
the sections must be in order.)

Using the following command line syntax, UUDECODE will decode
each of the above files, FILE1.uue, FILE2.uue, in turn and will
automatically recreate the original binary file complete with
original file name:

uudecode FILE

For multisection (multiple) files, the filename used on the
command line, 'FILE' in this example, must not include a period
and must not include a filename extension.
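
Before saving the messages, it can be useful to check that all
sections have arrived and to sort them. The following Python
sketch is an illustration only, not part of UUDECODE. It assumes
each message contains a header of the form shown above
("section 2 of 3 of file filename.ext"), and the function name
'order_sections' is hypothetical.

```python
import re

# Matches headers such as "section 2 of 3 of file junk.zip".
SECTION_RE = re.compile(
    r"section\s+(\d+)\s+of\s+(\d+)\s+of\s+file\s+(\S+)", re.IGNORECASE)


def order_sections(messages):
    """Return the message texts sorted by section number.

    Raises ValueError if a header is missing or a section has not arrived.
    """
    found = {}
    total = None
    for text in messages:
        match = SECTION_RE.search(text)
        if match is None:
            raise ValueError("message does not contain a section header")
        number, count = int(match.group(1)), int(match.group(2))
        if total is None:
            total = count
        elif count != total:
            raise ValueError("messages disagree on the number of sections")
        found[number] = text
    missing = [n for n in range(1, (total or 0) + 1) if n not in found]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return [found[n] for n in sorted(found)]
```

The sorted texts could then be saved as FILE1.uue, FILE2.uue,
and so on, which satisfies the naming rules above.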

See UUSER.TXT for more information on decoding.

4.3 DECODING E-Mail Files Containing Encoded Sections

It is not necessary to separate or remove the encoded section
from an e-mail message. It is only necessary to save each
message as a file, as described above, and to run UUDECODE.
UUDECODE will ignore the e-mail header and extra text in each
file and will decode the encoded sections recreating the desired
binary file.
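
The way a decoder can skip the surrounding text can be sketched
in Python. This is a minimal illustration for a single UUencoded
section only, not a description of how UUDECODE is implemented.
It assumes the standard 'binascii' module, and the function name
'decode_from_message' is hypothetical.

```python
import binascii


def decode_from_message(text: str):
    """Return (filename, data) for the first UUencoded block in text."""
    name = None
    data = bytearray()
    for line in text.splitlines():
        if name is None:
            # Skip mail headers and other text until a "begin" line appears.
            parts = line.split(" ", 2)
            if len(parts) == 3 and parts[0] == "begin" and parts[1].isdigit():
                name = parts[2].strip()
            continue
        if line.strip() == "end":
            return name, bytes(data)
        if line.strip() in ("", "`"):
            continue                      # zero-length or blank line
        data += binascii.a2b_uu(line)     # decode one encoded line
    raise ValueError("no complete begin/end block found")
```

Everything before the "begin" line and after the "end" line,
such as the e-mail header or a signature, is ignored.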


5.0 Conclusions

Binary files can be sent between users of e-mail systems.
Encoding programs convert binary files to e-mail compatible text
files which can be e-mailed. Testing of this procedure between
CompuServe and Internet e-mail was successful using the following
command line syntax:

uuencode -s1800 filename.ext output-filename

This limits the size of the resulting e-mail message to
approximately 119,000 bytes.

Some gateways only preserve messages which are smaller than
22,000 bytes and the following should be used:

uuencode -s340 filename.ext output-filename

Arbitrary file names may be assigned to the encoded files
(output-filename), but the file name of the original binary file
is preserved by the encoding / decoding process.

Decoding multisection files requires saving each e-mail message
in a separate file before decoding.


___________________________________________________


Appendix A

MIME


A1 Introduction

MIME is an acronym for Multipurpose Internet Mail Extensions.
The purpose of MIME is the same as that of UUencoding, that is, to send
binary files and other files over e-mail systems which handle
only simple ASCII (text) characters. MIME will eventually allow
e-mail systems to automatically identify the type of information
in the file. For example, simple text, sound, pictures, and
video files will be automatically played and displayed by e-mail
systems. For the time being, not all e-mail systems are fully
MIME compliant, so human intervention is sometimes necessary.

A2 Identifying MIME Files

MIME e-mail can be identified by required lines at the beginning
of the message.

The MIME specification requires that the beginning of the message
contain a MIME-Version header field, which uses a version number
to declare that a message conforms to the MIME standard.

MIME-Version: 1.0

Since some e-mail systems are now adopting the MIME standard,
this header will often appear at the beginning of e-mail text
messages (just e-mail). It is therefore necessary to look
further to verify that the binary file encoding used is really
MIME and not just some other encoding format being sent by a MIME
compliant e-mail system.

This can usually be done by looking for the following line, which
is found near the beginning of MIME encoded binary files:

Content-Transfer-Encoding: BASE64

This line identifies a MIME encoded binary file.
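
This check can also be made by program. The following Python
sketch assumes the standard 'email' package and a message which
has been saved as text. The function name 'base64_attachments'
is hypothetical. The header value is compared without regard to
case, since both "BASE64" and "base64" appear in practice.

```python
import email


def base64_attachments(raw_message: str):
    """Return a list of (filename, data) for base64 encoded parts."""
    msg = email.message_from_string(raw_message)
    results = []
    # walk() visits the message itself and, if multipart, each of its parts.
    for part in msg.walk():
        encoding = part.get("Content-Transfer-Encoding", "").strip().lower()
        if encoding == "base64":
            # get_payload(decode=True) reverses the transfer encoding.
            results.append((part.get_filename(),
                            part.get_payload(decode=True)))
    return results
```

A message with the MIME-Version header but no base64 part
returns an empty list, which matches the caution given above.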

Identification of MIME files is not strictly required, as
decoding programs will usually indicate success or failure. In
addition, some decoding programs decode more than one format.
For example, the shareware ($10) version of the R. E. Marks
UUDECODE program will identify and decode MIME, UUencoded, and
XXencoded files automatically.

A3 MIME Encoding and Decoding Programs

You can encode and decode MIME files on your computer using these
programs.

MIME64d.zip is a DOS MIME encoder / decoder and can be found at
the FTP site:

ftp.coast.net

in directory:

/mirrors/SimTel/msdos/decode

or via the WWW at:

http://www.coast.net/SimTel/msdos/decode.html


The R. E. Marks UUENCODE and UUDECODE will also handle MIME but
only the $10 shareware version will decode MIME. The shareware
versions of UUENCODE and UUDECODE, which I use, are available
from Richard Marks. See Appendix D for his address.


___________________________________________________


Appendix B

BinHex


B1 Introduction

BinHex is a binary file encoding format commonly used by
Macintosh computers. It is not a popular format.

B2 Identifying BinHex Files

See Appendix C for information on identifying BinHex files.

B3 BinHex Encoding and Decoding Programs

You can encode and decode BinHex files on your computer using
BINHEX13.zip. BINHEX version 1.3 is a DOS BinHex encoder /
decoder and can be found at the FTP site:

ftp.coast.net

in directory:

/mirrors/SimTel/msdos/mac

or via the WWW at:

http://www.coast.net/SimTel/msdos/mac.html


___________________________________________________


Appendix C

IDENTIFYING ENCODED FILES


C1 Identifying Encoded File Types

There are four common types of file encoding formats. These are
MIME, UUencode, XXencode, and BinHex. Some identifying
characteristics of each will be discussed.

C2 MIME

MIME encoded files usually have the line:

Content-transfer-encoding: base64

at the beginning of the encoded file. The following is an
example of a MIME file. It includes only the first few lines.

Content-Type: text/plain; charset=US-ASCII; name=junk.zip
Content-transfer-encoding: base64

UEsDBBQAAAAIAAeNrRo+KQfi/C4AAJ2MAAAIAAAAbWltZS50eHSlfdt2Gsmy
4LvW6B9qNA/b1gC62G7L7stsJKFu2kayBPLePWedhwIKKKuownURwt93Pmzi


This second example of MIME encoding was created by UUENCODE
(uuexe540.zip) written by Richard Marks:

MIME-Version: 1.0
Content-Description: "Base64 encode of junk.zip by UUDECODE
95 (v40)"
Content-Type: message/partial;
id="17900212345678901234567890123456789012345678";
number=1; total=7
Message-ID: <179002012034_junk.zip>
Content-Type: unknown; charset=us-ascii; name="junk.zip"
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename="junk.zip"

UEsDBBQAAQAIALKLfR5jNDAJ3F0AAIDcAAAMAAAARVBST00xU1Qu
W6BZ9cDlmdB883S-LH+wwckwqHdbjFt87smBod4erm0+-GWN7Hbu

C3 UUencoded Files

UUencoded files usually have the line:

begin 644 filename

at the beginning of the encoded file. Filename is the file name
of the original file. The encoded file also has the letter 'M'
as the first character of each full-length line. The following
is an example of a UUencoded file. It includes only the first
few lines.

section 1 of 8 of file junk.zip < uuencode 95 (v40) by
R.E.M. >

begin 644 junk.zip
M4$L#!!0``0`(`+*+?1YC-#`)W%T``(#<```,````15!23TTQ4U0N
M*0'B%PXAD0\Y6Z!9]<#EF=!\\W2_+'^PP

Note that this is the first section of an 8 section multisection
file.

C4 XXencoded Files

XXencoded files also usually have the line:

begin 644 filename

at the beginning of the encoded file. Filename is the file name
of the original file. The encoded file has the letter 'h' as the
first character of each full-length line, where UUencoded files
contain the letter 'M'. The following is an example of an
XXencoded file. It also includes only the first few lines.

section 1 of 8 of file junk.zip < xxencode 95 (v40) by
R.E.M. >

begin 644 junk.zip
hI2g1--E++E+6+989TFtXB1+7r3o++61Q+++A++++FJ-GHoolIpEi
h8E5W3ksVYEwtKu-NxQ1ZaR-wwrGz95ykkQYke5RPX3hwvga-cRsS

Note again that this is the first section of an 8 section
multisection file.
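
The relationship between the two formats can be sketched in
Python. UUencoding and XXencoding carry the same 6-bit values
and differ only in the characters used to represent them. The
sketch assumes the standard 'binascii' module and the usual
XXencode character set; the function name 'xxencode_line' is
hypothetical.

```python
import binascii

# The 64 XXencode characters, in order of the 6-bit value they represent.
XX_CHARS = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def xxencode_line(chunk: bytes) -> str:
    """Encode up to 45 bytes as one XXencoded line (no line ending)."""
    uu_line = binascii.b2a_uu(chunk).decode("ascii").rstrip("\n")
    # A UU character represents the value (code - 32) & 63; map that
    # value to the corresponding XX character instead.
    return "".join(XX_CHARS[(ord(ch) - 32) & 63] for ch in uu_line)


print(binascii.b2a_uu(b"PK\x03").decode("ascii").rstrip("\n"))
print(xxencode_line(b"PK\x03"))
```

The three bytes used here are the start of a .zip file, and the
characters produced match the start of the data in the junk.zip
examples above ("4$L#" and "I2g1"). A full 45 byte line begins
with 'M' in UUencoding and with 'h' in XXencoding because both
characters represent the length value 45.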

C5 BinHex

The first character of the encoded data in a BinHex file is a
colon ':'. It is also the last character of the last line. The
following is an example of a BinHex file. It includes only the
first few lines. The first line, in parentheses, is often seen
at the start of a BinHex section.

(This file must be converted with BinHex 4.0)

:$%Y&@5e'38Y&,N0263"*3Ne#58*0)!#3"!H[!*!%#p[T@3BJ5d9C
b,M!D!*!&qj`ZJ$iA!3"e"TdZrbi6!5lq$KF"kr-!N!EX!3!!qj`Z


___________________________________________________


Appendix D

REFERENCES


D1 UUENCODE/UUDECODE

The $10 shareware version of UUEXE540.zip, called UUDEC542.zip,
contains UUENCODE and UUDECODE and is available from:

Richard Marks
931 Sulgrave Lane
Bryn Mawr, PA
19010 USA

Send an e-mail address with your order and Mr. Marks will e-mail
the program back directly to you.

These programs are also available as freeware as UUEXE540.zip.
The freeware version handles UU and XX encoding and decoding, and
MIME encoding but will not decode MIME. Only the shareware
version will decode MIME.

The freeware version of this .ZIP file is available from many
Internet anonymous FTP sites, including:

ftp.coast.net

in directory:

/mirrors/SimTel/msdos/decode

or via the WWW at:

http://www.coast.net/SimTel/msdos/decode.html


D2 UUSER.doc

This file contains user documentation for the UUENCODE and
UUDECODE programs by Richard Marks. This file is included in
UUEXE540.zip.

D3 UUTECH.doc

This file contains a write-up of the more technical aspects of
UUencoding and UUdecoding. This file is also included in
UUEXE540.zip.

D4 MIME.TXT

MIME.txt by Mark Grand (e-mail: mark@premenos.sf.ca.us) is
entitled "MIME Overview" and is dated May 1993.

This is a readable discussion of the MIME standard. MIME.txt is
packaged with MIME64d.zip, a DOS MIME encoder / decoder, which
can be found at the FTP site:

ftp.coast.net

in directory:

/mirrors/SimTel/msdos/decode

or via the WWW at:

http://www.coast.net/SimTel/msdos/decode.html

D5 Common Internet Terms

For definitions of common Internet terms see document:

http://www.matisse.net/files/glossary.html

D6 File Formats, Compression, and Archiving

For an understanding of file formats, compression, and archiving,
see Indiana University document:

http://www.indiana.edu/~ucspubs/f033/

This document discusses Mac, DOS, and Unix platforms.

For another source of information on common Internet file formats
see document:

http://www.matisse.net/files/formats.html

D7 PKZIP Compression Software

PKZIP is a very popular compression program. PKZIP creates a
compressed '.zip' file and PKUNZIP is used to recover the
original file or files by decompressing the '.zip' file.
PKZ204g.exe can be found at the FTP site:

ftp.coast.net

in directory:

/mirrors/SimTel/msdos/zip

or via the WWW at:

http://www.coast.net/SimTel/msdos/zip.html

___________________________________________________


Appendix E

UUENCODE and UUDECODE Command Line Syntax


The command line syntax is included here for reference. See
UUSER.TXT for detailed info.

UU-ENCODE 95 (v40) FOR PC. by Richard Marks

Usage: uuencode [-clshouxt] []
-c : do not create any checksums.
-l : put checksum on every line.
-s : do not split output file into sections.
-s nnn : section contains nnn lines (default=950 lines).
-h nnn : leave room in first file for nnn header text lines.
-o : write to standard output.
-u : create unix format file (default is MS-DOS)
-x : encode using XXDECODE characters.
upper case X, also default to .XXE extension.
-6 : encode using Mime compatible base 64 form
-t : put character mapping table into output


UU-DECODE 95 (v40) FOR PC. by Richard Marks

Usage: uudecode [-uxclsizeqyn] []
-u/x : use UU/XX decode mode. (default: auto determine)
-6 : base64 MIME compliant decode
-c : do not analyze any checksums
(-C: no input file checksums)
-l : do not analyze checksums on lines
(use if invalid char on line message with valid lines)
-s : do not handle split-up input files
-i : input from stdin (implies -s)
-z : "cut line" (in quotes, exact, case sensitive, match)
use if autodetect of cut line, in multisection file,
fails
-e : call UNARCUUE.BAT on successful completion:
%1=output file name, %2=extension
%3=input file name, %4=extension,
%5=number of sections
(-E: followed by name of .BAT file)
-q : quiet mode, turn off sounds
-y/n : assume Y/N responses, permits responseless operation


___________________________________________________




File: TN951201.doc .txt

-30-