com.hunnysoft.jmime
Class QuotedPrintableDecoder

java.lang.Object
  extended by com.hunnysoft.jmime.QuotedPrintableDecoder

public class QuotedPrintableDecoder
extends java.lang.Object

Class that performs quoted-printable decoding.

Quoted-printable encoding encodes 8-bit text into printable ASCII characters for sending through the Internet mail system. The encoding is required because 8-bit characters cannot pass reliably through the Internet mail system. In the quoted-printable encoding, a character that is not in the ASCII 7-bit character set is encoded as an equals sign (the character "=") followed by two hex digits (0-9 and A-F). Certain ASCII characters are also encoded. The quoted-printable encoding also encodes "soft" line breaks, since long lines also cannot pass reliably through the Internet mail system. The details of quoted-printable encoding can be found in RFC 2045.
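
The encoding rule described above can be illustrated with a minimal sketch in plain Java. This is not the JMIME implementation: the class QpRuleSketch and its methods are hypothetical names introduced only for illustration, and the sketch leaves out details of RFC 2045 such as the removal of trailing white space on encoded lines.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; hypothetical class, not part of JMIME.
class QpRuleSketch {
    // Decodes "=XX" escapes and removes soft line breaks ("=" at end of line).
    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            int b = in[i] & 0xFF;
            if (b != '=') {
                out.write(b);                  // ordinary character: copy as-is
                i++;
            } else if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3;                        // soft line break "=" CR LF: drop it
            } else if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;                        // soft line break with a bare LF
            } else if (i + 2 < in.length && hex(in[i + 1]) >= 0 && hex(in[i + 2]) >= 0) {
                out.write((hex(in[i + 1]) << 4) | hex(in[i + 2]));
                i += 3;                        // "=XX": one byte from two hex digits
            } else {
                out.write(b);                  // malformed escape: copy the "=" through
                i++;
            }
        }
        return out.toByteArray();
    }

    // Convenience wrapper: ASCII-encoded input, output interpreted as UTF-8.
    static String decodeUtf8(String encoded) {
        byte[] raw = decode(encoded.getBytes(StandardCharsets.US_ASCII));
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Returns the value of a hex digit, or -1 if the byte is not a hex digit.
    private static int hex(byte c) {
        return Character.digit((char) (c & 0xFF), 16);
    }
}
```

For example, QpRuleSketch.decodeUtf8("caf=C3=A9") yields the four-character string "café", because =C3=A9 is the UTF-8 encoding of "é".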

QuotedPrintableDecoder provides two interfaces for performing quoted-printable decoding.

A high-level interface decodes from an input ByteString to an output ByteString. This interface comprises a single method, decode(ByteString).

A low-level interface allows decoding by passing multiple buffers to the decoder. The correct procedure for using this interface is described below.

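QuotedPrintableDecoder allows you to change options that affect the behavior of the decoder. The option documented on this page is the CRLF end-of-line characters option, which you read with outputCrLf() and set with setOutputCrLf(boolean).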

Using the Low-Level Interface

The low-level interface allows you to decode data one buffer at a time; thus you may decode data of unlimited size using a limited amount of memory. For example, if you want to decode data from an input file to an output file, you may read from the input file one buffer at a time, pass each buffer to the decoder, and write to the output file one buffer at a time.

The low-level interface comprises three methods: start(), decodeSegment(ByteBuffer,ByteBuffer), and finish(ByteBuffer). The procedure is described here:

  1. Call start() to initialize the decoder.
  2. Initialize an input buffer and an output buffer. These buffers are instances of ByteBuffer. To initialize an input buffer named inBuf, set inBuf.bytes to a byte array that contains the data to be decoded, set inBuf.pos to the offset of the beginning of the data in inBuf.bytes, and set inBuf.endPos to the offset of the first byte past the end of the data in inBuf.bytes. To initialize an output buffer named outBuf, set outBuf.bytes to a byte array, set outBuf.pos to zero, and set outBuf.endPos to the length of the array referenced by outBuf.bytes.
  3. Call decodeSegment(ByteBuffer,ByteBuffer) with the input buffer and output buffer as arguments.
  4. Check to see if the output buffer is full or if the input buffer is empty. If outBuf.pos == outBuf.endPos, then the output buffer is full, and you must make room in the output buffer before you call decodeSegment again. If inBuf.pos == inBuf.endPos, then the input buffer is empty, and you must supply the input buffer with more data before you call decodeSegment again.
  5. Repeat steps 3 and 4 until the last input buffer is empty.
  6. Call finish(ByteBuffer) to flush any internally buffered data to the output buffer. If the output buffer is full after finish returns, you must make room in the output buffer and call finish again. If finish returns and the output buffer is not full, then the decoding is finished.
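
The six steps above can be sketched as a driver loop. So that the example is self-contained, it does not use the JMIME classes: Buf and StandInDecoder are hypothetical stand-ins that mirror only the members named in the procedure (bytes, pos, endPos, start, decodeSegment, finish), and the stand-in decoder assumes well-formed input. Treat it as a sketch of the calling pattern, not of the library's internals.

```java
import java.io.ByteArrayOutputStream;

// Stand-in for the buffer type: only the three fields named in the procedure.
class Buf {
    byte[] bytes;
    int pos;
    int endPos;
}

// Hypothetical stand-in decoder, NOT the JMIME class. It assumes well-formed
// input and holds the bytes of an incomplete "=XX" escape between calls.
class StandInDecoder {
    private final byte[] esc = new byte[3];
    private int escLen;   // bytes of the current escape seen so far
    private int flushed;  // held bytes already written by finish

    void start() { escLen = 0; flushed = 0; }

    void decodeSegment(Buf in, Buf out) {
        // Stops when the input is empty or the output is full.
        while (in.pos < in.endPos && out.pos < out.endPos) {
            byte b = in.bytes[in.pos++];
            if (escLen == 0 && b != '=') { out.bytes[out.pos++] = b; continue; }
            esc[escLen++] = b;
            if (escLen == 2 && b == '\n') {
                escLen = 0;                                // soft break "=" LF
            } else if (escLen == 3) {
                if (!(esc[1] == '\r' && esc[2] == '\n')) { // not a soft break "=" CR LF
                    out.bytes[out.pos++] = (byte) ((hex(esc[1]) << 4) | hex(esc[2]));
                }
                escLen = 0;
            }
        }
    }

    void finish(Buf out) {
        // Writes a dangling partial escape literally, as far as space allows.
        while (flushed < escLen && out.pos < out.endPos) out.bytes[out.pos++] = esc[flushed++];
        if (flushed == escLen) { escLen = 0; flushed = 0; }
    }

    private static int hex(byte c) { return Character.digit((char) (c & 0xFF), 16); }
}

class LowLevelSketch {
    // Decodes data in input chunks of inChunk bytes through an output buffer of outSize bytes.
    static byte[] decodeAll(byte[] data, int inChunk, int outSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StandInDecoder dec = new StandInDecoder();
        dec.start();                                       // step 1
        Buf in = new Buf();                                // step 2
        in.bytes = data;
        Buf out = new Buf();
        out.bytes = new byte[outSize];
        out.pos = 0;
        out.endPos = outSize;
        int next = 0;
        while (next < data.length) {                       // step 5
            in.pos = next;                                 // supply the next chunk of input
            in.endPos = Math.min(next + inChunk, data.length);
            next = in.endPos;
            while (in.pos < in.endPos) {
                dec.decodeSegment(in, out);                // step 3
                if (out.pos == out.endPos) drain(out, sink); // step 4: make room
            }
        }
        while (true) {                                     // step 6
            dec.finish(out);
            boolean wasFull = out.pos == out.endPos;
            drain(out, sink);
            if (!wasFull) break;                           // not full: decoding is finished
        }
        return sink.toByteArray();
    }

    // "Makes room" by moving the decoded bytes to the sink and resetting pos.
    private static void drain(Buf out, ByteArrayOutputStream sink) {
        sink.write(out.bytes, 0, out.pos);
        out.pos = 0;
    }
}
```

In a file-to-file decode, drain would write to the output file and the outer loop would refill the input buffer from the input file. The result does not depend on the buffer sizes, which is the point of the procedure.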

You may use the same decoder object for multiple decode operations.

Dealing With Errors

The decoder correctly decodes all data that is correctly encoded. If the data is not correctly encoded, the decoder detects the errors. All decoding errors are treated as non-fatal: the decoder tries to recover. After a decode operation, you may call errorDetected() to find out whether the decoder detected a decoding error.
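
The idea of non-fatal errors can be illustrated with a small sketch. LenientQpSketch is a hypothetical class, not the JMIME decoder, and its recovery strategy (copy a malformed "=" through unchanged and record a flag) is an assumption; this page does not say how the real decoder recovers.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of "recover and remember" error handling; not JMIME code.
class LenientQpSketch {
    private boolean error;

    // True if the most recent decode call met a malformed escape.
    boolean errorDetected() { return error; }

    String decode(String encoded) {
        error = false;
        byte[] in = encoded.getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < in.length; i++) {
            if (in[i] != '=') { out.write(in[i]); continue; }
            int hi = i + 1 < in.length ? Character.digit((char) (in[i + 1] & 0xFF), 16) : -1;
            int lo = i + 2 < in.length ? Character.digit((char) (in[i + 2] & 0xFF), 16) : -1;
            if (hi >= 0 && lo >= 0) {
                out.write((hi << 4) | lo);   // valid "=XX" escape
                i += 2;
            } else if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 2;                      // soft line break
            } else {
                error = true;                // assumed recovery: keep the "=" and go on
                out.write('=');
            }
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

A caller following this pattern still presents the decoded text to the user and treats the flag as a warning, not as a reason to discard the output.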

See Also:
QuotedPrintableDecoderW, Quoted-printable in RFC 2045

Constructor Summary
QuotedPrintableDecoder()
          Default constructor.
 
Method Summary
 ByteString decode(ByteString encoded)
          Performs single-step buffer-to-buffer quoted-printable decoding.
 void decodeSegment(ByteBuffer inBuf, ByteBuffer outBuf)
          Decodes data from the input buffer to the output buffer.
 boolean errorDetected()
          Indicates if an error occurred while decoding.
 void finish(ByteBuffer outBuf)
          Finishes a multiple-buffer decode operation.
 boolean outputCrLf()
          Gets the CRLF end-of-line characters option.
 void setOutputCrLf(boolean b)
          Sets the CRLF end-of-line characters option.
 void start()
          Starts a multiple-buffer decode operation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

QuotedPrintableDecoder

public QuotedPrintableDecoder()
Default constructor.

Method Detail

setOutputCrLf

public void setOutputCrLf(boolean b)
Sets the CRLF end-of-line characters option.

If this option is true, then the decoder uses CR LF as the end-of-line characters for hard line breaks in the decoded output. If this option is false, then the decoder uses LF alone.

Normally, you do not need to set this option directly. Instead, when your program starts, and before you create any threads, set TextUtil.EOL to either TextUtil.LF_EOL or TextUtil.CRLF_EOL. (The default is TextUtil.LF_EOL.) The quoted-printable decoder then sets the value of this option based on the value of TextUtil.EOL.

Parameters:
b - true causes the decoder to output CR LF as the end-of-line characters; false causes it to output LF alone.
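
The effect of this option on hard line breaks can be pictured with a short sketch. EolSketch and normalizeEol are hypothetical names, and the sketch works on already-decoded text; it only illustrates the two possible outputs and is not how the decoder itself is implemented.

```java
// Hypothetical illustration of the two end-of-line conventions; not JMIME code.
class EolSketch {
    // Rewrites every hard line break (CR LF or bare LF) to the chosen convention.
    static String normalizeEol(String decoded, boolean outputCrLf) {
        String eol = outputCrLf ? "\r\n" : "\n";
        // First collapse CR LF to LF so that no line break is converted twice.
        return decoded.replace("\r\n", "\n").replace("\n", eol);
    }
}
```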

outputCrLf

public boolean outputCrLf()
Gets the CRLF end-of-line characters option.

Returns:
boolean value of this option
See Also:
setOutputCrLf(boolean)

errorDetected

public boolean errorDetected()
Indicates if an error occurred while decoding.

After a decode operation, you may check the return value of this method to discover if the decoder detected any errors while decoding.

The decoder correctly decodes all content that is encoded according to the MIME standard. However, if the content is encoded incorrectly, the decoder detects an error, and the decoder may not be able to correctly decode the content. The decoder treats all errors as non-fatal and tries to recover from them.

The decoder treats errors as non-fatal errors because most quoted-printable-encoded content is text. Even if there are errors in decoding the content, the text should still be presented to a human user, since humans are much better than machines at understanding text.

Returns:
true if the decoder detected an error while decoding

start

public void start()
Starts a multiple-buffer decode operation.

If you use the low-level interface for multiple-buffer decoding, you must call start to begin the decode operation. You may use a QuotedPrintableDecoder instance for many decode operations, but you must call start to begin each operation.

For more information on using the low-level interface, see the overview section for QuotedPrintableDecoder.

You do not need to call this method if you use the decode(ByteString) method for decoding.


decodeSegment

public void decodeSegment(ByteBuffer inBuf,
                          ByteBuffer outBuf)
Decodes data from the input buffer to the output buffer.

This method is an essential part of the low-level interface and performs most of the work of decoding for the QuotedPrintableDecoder class. It takes an input buffer and an output buffer as parameters, and decodes data from the input buffer until the input buffer is empty or the output buffer is full. In other words, at least one of the following conditions is guaranteed to be satisfied when the method returns:

  inBuf.pos == inBuf.endPos (the input buffer is empty)
  outBuf.pos == outBuf.endPos (the output buffer is full)

You may call the method multiple times to decode multiple buffers of input data. However, before you call the method, both of the following conditions should be true:

  inBuf.pos < inBuf.endPos (the input buffer is not empty)
  outBuf.pos < outBuf.endPos (the output buffer is not full)

For more information on using the low-level interface, see the overview section for QuotedPrintableDecoder.

Parameters:
inBuf - input buffer
outBuf - output buffer

finish

public void finish(ByteBuffer outBuf)
Finishes a multiple-buffer decode operation.

When you use the low-level interface, the decoder buffers some data internally. Therefore, after you have passed all input data to the decoder, you must call this method to flush the internal buffer.

The following condition must be satisfied when you call the method:

  outBuf.pos < outBuf.endPos (the output buffer is not full)

The above condition must also be satisfied after the method returns in order to guarantee that all output data has been written to the output buffer. You may need to call finish more than once before the above condition is satisfied when the method returns.

For more information on using the low-level interface, see the overview section for QuotedPrintableDecoder.

Parameters:
outBuf - output buffer

decode

public ByteString decode(ByteString encoded)
Performs single-step buffer-to-buffer quoted-printable decoding.

To perform quoted-printable decoding using this method, create a ByteString containing the data you want to decode and pass it as the method's argument. The returned ByteString contains the decoded output.

This method makes it very simple to perform quoted-printable decoding. The disadvantage of this method is that it requires all the data to be kept in memory for processing. You may use the low-level interface, described in the overview section, to perform quoted-printable decoding of large data using limited memory.

This method uses the low-level interface internally. Any options set for the decoder object have the same effect using either this method or the low-level interface.

Parameters:
encoded - byte string containing the encoded data
Returns:
byte string containing the decoded data