MimeParser.h File Reference

Parse MIME messages.


Functions

XMIME_API void MimeParser_setMinimumFieldBodyBufferSize (size_t n)
 Sets the minimum buffer size for saving a partial header field body.
XMIME_API void MimeParser_setDefaultMaxDepth (int n)
 Sets the default value of the maximum parsing depth for the MIME Parser.
XMIME_API MimeParser * MimeParser_create ()
 Creates and returns a new, initialized MimeParser object.
XMIME_API void MimeParser_setVtable (MimeParser *parser, MimeParserVtable *vtable, void *data)
 Installs the vtable of callback functions into a MimeParser object.
XMIME_API void MimeParser_destroy (MimeParser *parser)
 Destroys a MimeParser object.
XMIME_API void MimeParser_setMaxDepth (MimeParser *parser, int n)
 Sets the maximum parsing depth for the MIME Parser object.
XMIME_API void MimeParser_start (MimeParser *parser)
 Starts the parsing of a MIME message.
XMIME_API void MimeParser_parseBuffer (MimeParser *parser, const char *buffer, size_t length)
 Continues the parsing of a MIME message.
XMIME_API void MimeParser_finish (MimeParser *parser)
 Finishes the parsing of a message.
XMIME_API size_t MimeParser_bytePos (MimeParser *parser)
 Gets the current byte position in the stream.


Detailed Description

The MimeParser module provides an object that can parse MIME messages or documents, including complex nested multipart documents. The parser is highly optimized for speed. It is, perhaps, the fastest MIME parser available.

Concepts

Before using the MimeParser module, it is important to understand a few basic concepts behind the operation of the MIME parser.

First, the MIME parser is stream-based. This means that a MIME message is presented to the parser one buffer at a time, which allows a program to parse very large messages using only a small amount of memory. For example, you may choose a buffer size of 8192 bytes and present all messages to the parser 8192 bytes at a time. If a message is very large, say 1,000,000 bytes, you call the parser's parsing function 122 times with a full 8192-byte buffer and one final time with a partial buffer of 576 bytes (1,000,000 = 122 * 8192 + 576). The amount of memory used by the parser is very small: in addition to the 8192-byte buffer, only a few thousand additional bytes of memory are required.
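The buffer arithmetic in the example can be checked with a small C sketch. ChunkPlan and plan_chunks() are hypothetical helpers introduced here for illustration; they are not part of the MimeParser API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: how a message splits into fixed-size buffers. */
typedef struct {
    size_t full_buffers;  /* calls made with a completely full buffer */
    size_t final_bytes;   /* size of the last, partial buffer (0 if none) */
} ChunkPlan;

/* Computes how many full buffers and how many leftover bytes are needed to
 * present message_size bytes to a stream-based parser, buffer_size bytes
 * at a time. */
ChunkPlan plan_chunks(size_t message_size, size_t buffer_size)
{
    ChunkPlan plan;
    assert(buffer_size > 0);
    plan.full_buffers = message_size / buffer_size;
    plan.final_bytes = message_size % buffer_size;
    return plan;
}
```

For a 1,000,000-byte message and an 8192-byte buffer, plan_chunks() yields 122 full buffers and a final buffer of 576 bytes, matching the figures above.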

Second, the MIME parser is event-based. This means that the parser will report to your program various events that are considered significant as it parses the buffers. It is your program's responsibility to track these events and update its state based on them. You must understand these events to use the parser effectively. Examples of events include: Begin Message, Begin Headers, Begin Body Part, End Body Part, End Headers, End Message, and so on. See the reference documentation below for a complete listing of the events that are reported. Experienced XML programmers will recognize the event-based parser interface as being similar to the SAX interface used for parsing XML documents.

To handle the events, you must define your event handler functions, create an instance of the MimeParserVtable structure, and assign pointers to your functions to the members of the structure. The event handler functions are callback functions. They are documented in the MimeParserVtable.h reference page. Before you begin parsing, you assign the vtable structure to the MimeParser object. Note that it is not necessary to allocate a MimeParserVtable structure dynamically: you can define the structure and initialize it statically. See the example code in simple.c.
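The general pattern of a statically initialized callback table can be sketched as follows. This is a generic illustration only: DemoVtable, DemoState, and every function and member name in it are hypothetical. The real structure is MimeParserVtable, and its actual members and signatures are documented in MimeParserVtable.h, not here.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL stand-in for a callback table. It is NOT MimeParserVtable;
 * the member names and signatures are invented for this illustration. */
typedef struct {
    void (*beginMessage)(void *clientData);
    void (*bytes)(void *clientData, const char *data, size_t length);
    void (*endMessage)(void *clientData);
} DemoVtable;

/* Client state that the handlers update as events arrive. */
typedef struct {
    int messages;      /* number of completed messages */
    size_t bodyBytes;  /* bytes reported for the current message */
} DemoState;

void demo_beginMessage(void *clientData)
{
    ((DemoState *)clientData)->bodyBytes = 0;
}

void demo_bytes(void *clientData, const char *data, size_t length)
{
    (void)data;
    ((DemoState *)clientData)->bodyBytes += length;
}

void demo_endMessage(void *clientData)
{
    ((DemoState *)clientData)->messages++;
}

/* Statically initialized table: no dynamic allocation, nothing to free. */
const DemoVtable demo_vtable = {
    demo_beginMessage, demo_bytes, demo_endMessage
};

/* Simulates the event sequence a parser might report for one tiny message. */
void demo_fire_events(const DemoVtable *vt, void *clientData)
{
    vt->beginMessage(clientData);
    vt->bytes(clientData, "hello", 5);
    vt->endMessage(clientData);
}
```

The client data pointer carries the application's state into each handler, which is why the handlers themselves can stay free of global variables.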

Using MimeParser

To use the MIME parser, follow these steps:

  1. Call MimeParser_create() to create an instance of the parser.

  2. Call MimeParser_setVtable() to install your callback functions and client data.

  3. Call MimeParser_start() to start a parse operation. When you call this function, the parser calls some of the callback functions that you installed in Step 2.

  4. Call MimeParser_parseBuffer() as many times as necessary to present the entire MIME message to the parser. When you call this function, the parser calls some of the callback functions that you installed in Step 2.

  5. Call MimeParser_finish() to finish the parse operation. When you call this function, the parser calls some of the callback functions that you installed in Step 2.

  6. You may repeat Steps 3 through 5 many times. When you have finished using the parser, call MimeParser_destroy() to destroy the parser object and to free its memory.

All programs that use the MIME parser must follow the steps listed above. To accomplish your programming goals, however, you must create callback functions specific to your application. The callback functions are described in the MimeParserCallback.h reference page.
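The six steps can be sketched in C as shown below. This is a minimal sketch under stated assumptions: HAVE_XMIME is a hypothetical build macro, parse_one() and parse_all() are hypothetical helper names, and the stand-in definitions in the #else branch exist only so that the sketch compiles without the library. The stand-ins count bytes and report no events; the real functions are the ones declared in MimeParser.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef HAVE_XMIME  /* hypothetical macro: build against the real library */
#include "MimeParser.h"
#else
/* STAND-INS so the sketch compiles on its own. They are not library code. */
typedef struct MimeParserVtable MimeParserVtable;  /* members not shown */
typedef struct MimeParser { size_t bytesSeen; } MimeParser;
MimeParser *MimeParser_create(void) { return (MimeParser *)calloc(1, sizeof(MimeParser)); }
void MimeParser_setVtable(MimeParser *p, MimeParserVtable *vt, void *data) { (void)p; (void)vt; (void)data; }
void MimeParser_start(MimeParser *p) { p->bytesSeen = 0; }
void MimeParser_parseBuffer(MimeParser *p, const char *buf, size_t n) { (void)buf; p->bytesSeen += n; }
void MimeParser_finish(MimeParser *p) { (void)p; }
void MimeParser_destroy(MimeParser *p) { free(p); }
#endif

/* Steps 3 to 5 for one in-memory message, presented in pieces of at most
 * chunk bytes. Returns the number of MimeParser_parseBuffer() calls made. */
int parse_one(MimeParser *parser, const char *msg, size_t len, size_t chunk)
{
    int calls = 0;
    size_t off = 0;
    assert(chunk > 0);
    MimeParser_start(parser);                          /* Step 3 */
    while (off < len) {                                /* Step 4 */
        size_t n = (len - off < chunk) ? len - off : chunk;
        MimeParser_parseBuffer(parser, msg + off, n);
        off += n;
        calls++;
    }
    MimeParser_finish(parser);                         /* Step 5 */
    return calls;
}

/* Steps 1, 2 and 6 around count parse operations on one reused parser.
 * Returns the total number of parseBuffer calls, or -1 if creation fails. */
int parse_all(MimeParserVtable *vtable, void *clientData,
              const char *const *msgs, const size_t *lens,
              size_t count, size_t chunk)
{
    int total = 0;
    size_t i;
    MimeParser *parser = MimeParser_create();          /* Step 1 */
    if (parser == NULL)
        return -1;
    MimeParser_setVtable(parser, vtable, clientData);  /* Step 2 */
    for (i = 0; i < count; i++)
        total += parse_one(parser, msgs[i], lens[i], chunk);
    MimeParser_destroy(parser);                        /* Step 6 */
    return total;
}
```

With the stand-ins, a NULL vtable is harmless. When building against the real library, pass a pointer to a MimeParserVtable that you have filled in with your own callback functions.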

Tips

There is a certain amount of overhead associated with saving the parser state when leaving MimeParser_parseBuffer() and restoring the parser state when entering MimeParser_parseBuffer(). This overhead is negligible when buffer sizes are large. A good buffer size for file I/O is 8192 bytes.

Be careful that the buffer you use is not too large. You will get much better performance if the data you pass to the parser is in the processor's secondary cache. If you use a buffer that is too large, you will exceed the capacity of the secondary cache when you fill the buffer. The processor will then have to wait for the data to be retrieved from main memory, slowing performance.

Because of the way the parser operates, it is inefficient to present the message data to the parser one line at a time. The parser can't send the line-terminating CRLF to the Bytes() callback function until it knows that the CRLF is not part of a multipart boundary that follows the CRLF. Therefore, when you present the message data one line at a time, the parser makes two calls to the Bytes() callback function for every line in the body of a body part.
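To get a feel for the difference in call counts, the following sketch compares the number of MimeParser_parseBuffer() calls a program would make when presenting a message one line at a time with the number it would make using fixed-size buffers. Both function names are hypothetical helpers, and the sketch only counts calls; it does not invoke the parser.

```c
#include <assert.h>
#include <stddef.h>

/* Number of calls needed if each line (terminated by '\n', as in CRLF)
 * is presented to the parser separately. An unterminated last line
 * counts as one more call. */
size_t calls_line_at_a_time(const char *msg, size_t len)
{
    size_t calls = 0, lineStart = 0, i;
    for (i = 0; i < len; i++) {
        if (msg[i] == '\n') {
            calls++;
            lineStart = i + 1;
        }
    }
    if (lineStart < len)
        calls++;
    return calls;
}

/* Number of calls needed if the message is presented in buffers of
 * chunk bytes (the last buffer may be partial). */
size_t calls_fixed_chunks(size_t len, size_t chunk)
{
    assert(chunk > 0);
    return (len + chunk - 1) / chunk;
}
```

A message of a few thousand short lines fits in a handful of 8192-byte buffers, so the fixed-size approach makes far fewer calls and also avoids the extra Bytes() callback per line described above.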

For best performance, create an instance of the parser and re-use that instance for multiple parse operations. Not only does this avoid the overhead of creating and destroying a parser object -- which can be significant if many small messages must be parsed -- but it also allows the parser object to use its pool of cached objects (these are objects that are used internally by the parser).


Function Documentation

XMIME_API size_t MimeParser_bytePos ( MimeParser *  parser  ) 

Gets the current byte position in the stream.

The parser tracks the byte position in the stream. You may call this function to get the current byte position.

Currently, this function returns the accurate byte position only when it's called from the Event callback function.

Note: The byte position is relative to the beginning of the stream. If the stream is presented to the parser in multiple buffers, the byte position might not be a position in the current buffer. For example, if buffer A contains the bytes at positions 0-8191, and buffer B contains bytes at positions 8192-16383, a byte position reported while the parser processes buffer B may be 8190, which refers to a position in buffer A.
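One way to handle this in application code is to record the stream offset at which each buffer starts and test a reported position against the current buffer before using it as an index. The sketch below assumes the application tracks that offset itself (for example, as the sum of the lengths of all buffers passed so far); both function names are hypothetical and not part of the MimeParser API.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if streamPos lies inside the buffer that starts at stream
 * offset bufferStart and holds bufferLength bytes; returns 0 otherwise. */
int pos_in_current_buffer(size_t streamPos, size_t bufferStart,
                          size_t bufferLength)
{
    return streamPos >= bufferStart && streamPos - bufferStart < bufferLength;
}

/* Converts a stream position to an index into the current buffer.
 * Call only after pos_in_current_buffer() has returned 1. */
size_t pos_to_buffer_index(size_t streamPos, size_t bufferStart)
{
    assert(streamPos >= bufferStart);
    return streamPos - bufferStart;
}
```

In the example above, position 8190 tested against buffer B (start 8192, length 8192) is reported as outside the current buffer, so the application must not index buffer B with it.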

Parameters:
parser the parser object
Returns:
current byte position in the stream

XMIME_API MimeParser* MimeParser_create (  ) 

Creates and returns a new, initialized MimeParser object.

The function allocates memory for the object, which you must eventually free by calling MimeParser_destroy().

If there is an error allocating memory, the function returns a NULL pointer.

Returns:
pointer to an initialized parser object if successful; NULL pointer if unsuccessful

XMIME_API void MimeParser_destroy ( MimeParser *  parser  ) 

Destroys a MimeParser object.

The function frees all memory allocated for the parser object. Failure to call this function when you have finished using a parser object causes a memory leak.

After MimeParser_destroy() returns, the parser pointer should be considered invalid.

Parameters:
parser the parser object

XMIME_API void MimeParser_finish ( MimeParser *  parser  ) 

Finishes the parsing of a message.

You must call this function after your last call to MimeParser_parseBuffer() in order to finish any pending processing. Before you use the parser to parse another message, you must call MimeParser_start() again after you call MimeParser_finish().

While executing MimeParser_finish(), the parser calls the callback functions in the installed vtable.

Parameters:
parser the parser object

XMIME_API void MimeParser_parseBuffer ( MimeParser *  parser,
const char *  buffer,
size_t  length 
)

Continues the parsing of a MIME message.

MimeParser_parseBuffer() presents a single buffer of byte data to the parser for processing. You may call this function multiple times after you call MimeParser_start() and before you call MimeParser_finish().

buffer is a pointer to the byte data to be processed by the parser. length is the number of bytes in buffer to be processed.

While executing MimeParser_parseBuffer(), the parser calls the callback functions in the installed vtable.

Parameters:
parser the parser object
buffer pointer to a char array containing bytes to parse
length number of bytes in the buffer to parse

XMIME_API void MimeParser_setDefaultMaxDepth ( int  n  ) 

Sets the default value of the maximum parsing depth for the MIME Parser.

This function sets the default value of the maximum depth property for all MIME Parser objects. The initial default value is 100. When your program starts, you may call this function to change the default value. When you create a MIME Parser object, the MIME Parser constructor sets its maximum depth property to the default value. If the object requires a non-default value, you may then call MimeParser_setMaxDepth() to change the property for that specific object.

See the description of the MimeParser_setMaxDepth() function for details on how the maximum depth property affects the parser's behavior.

Parameters:
n the new default value
See also:
MimeParser_setMaxDepth()

XMIME_API void MimeParser_setMaxDepth ( MimeParser *  parser,
int  n 
)

Sets the maximum parsing depth for the MIME Parser object.

MIME messages (or documents) may contain nested MIME parts. In the MIME specification, there is no limit to the depth to which parts may be nested. However, for practical reasons, it is desirable to be able to limit the depth to which the MIME parser will descend when parsing nested parts. This depth is controlled by the maximum depth property of the MIME parser object. You may call this function to set the value of this property.

The value of the maximum depth property determines how many multipart bodies the parser will parse. For example, if the value is zero, the parser will not parse any multipart body; instead, it will report the multipart body as a simple body. If the value is one, the parser will parse the first multipart body, presenting the parts it contains, but will not parse any nested multipart bodies.
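The rule can be modeled with a one-line predicate. This is an interpretation of the two examples given above, extended to larger values by assumption; it is not library code, and multipart_is_parsed() is a hypothetical name. The nesting level counts multipart bodies from the outside in, with 1 for the outermost multipart body.

```c
#include <assert.h>

/* Model of the maximum depth rule: a multipart body at nesting level
 * `level` (1 = outermost multipart body) is parsed into its parts only
 * if level <= maxDepth. Otherwise it is reported as a simple body. */
int multipart_is_parsed(int level, int maxDepth)
{
    assert(level >= 1);
    return level <= maxDepth;
}
```

Under this model, a maximum depth of 0 parses no multipart body, and a maximum depth of 1 parses the outermost multipart body but none nested inside it.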

When the MIME parser constructor creates a new MIME parser object, it sets the value of the maximum depth property to the default value, which is initially set to 100, and which you may change by calling MimeParser_setDefaultMaxDepth().

Parameters:
parser the parser object
n the new value
See also:
MimeParser_setDefaultMaxDepth()

XMIME_API void MimeParser_setMinimumFieldBodyBufferSize ( size_t  n  ) 

Sets the minimum buffer size for saving a partial header field body.

When MimeParser_parseBuffer() finishes a buffer while it's in the process of parsing header fields, either in the message or a body part, it might need to copy a partial header field body to a buffer it maintains. MimeParser creates and resizes this buffer dynamically, as needed. This parameter determines the minimum initial buffer size.

The default value for this parameter is 8192.

This parameter is a global value. If you want a value other than the default value, then you should set it once when your application starts.

Parameters:
n the new minimum buffer size

XMIME_API void MimeParser_setVtable ( MimeParser *  parser,
MimeParserVtable *  vtable,
void *  clientData 
)

Installs the vtable of callback functions into a MIME parser object.

parser is the MIME parser object into which you want to install the vtable. vtable is a pointer to the struct containing pointers to the callback functions that the parser will call. clientData is the pointer that the parser supplies as the client data argument to the callback functions.

See the MimeParserVtable.h reference page for detailed descriptions of the callback functions.

Note: The library code does not free the memory pointed to by the vtable argument. This means you may create a single static instance of the vtable to be shared by all parser instances. However, if you dynamically allocate memory for a vtable structure, then your code also has the responsibility to free the memory. Similarly, the library code does not free the memory pointed to by the clientData parameter. This makes sense, because the library code treats the client data pointer as an opaque pointer, about which it has no information.

Parameters:
parser the parser object
vtable pointer to the structure containing pointers to callback functions
clientData opaque data passed back to the client in the callback functions

XMIME_API void MimeParser_start ( MimeParser *  parser  ) 

Starts the parsing of a MIME message.

You must call this function before you call MimeParser_parseBuffer() for the first time. You may use a single parser to parse more than one MIME message, provided that you call MimeParser_start() to start the parsing of each message and call MimeParser_finish() to finish the parsing of each message.

While executing MimeParser_start(), the parser calls the callback functions in the installed vtable.

Parameters:
parser the parser object

Copyright © 2001-2010 Hunny Software, Inc. All rights reserved.