
AddrParser.h File Reference


Detailed Description

The AddrParser module provides an object that can parse email address lists, such as those found in the field bodies of the "To" or "From" header fields of an Internet mail message. As a special case, an AddrParser object can parse a single email address, which is just a list containing a single address.

Email Address Terminology

There are two kinds of email addresses: mailbox names and groups. A mailbox name is what most people recognize as an email address: it's what they put on their business cards. A typical mailbox name is the following:

Pete Sake <psake@example.com>

This mailbox name contains three parts: a display name, a local name, and a domain. The display name has no significance other than providing a user-friendly name that is often displayed to a human in a user interface. The display name is always optional. In the example mailbox name above, the display name is "Pete Sake". The local name is the part of the mailbox name that follows the display name and precedes the '@' symbol. In the example mailbox name above, the local name is "psake". Finally, the domain is the part of the mailbox name that follows the '@' symbol. In the example mailbox name above, the domain is "example.com".
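To make the local name and domain concrete, here is a minimal sketch that splits a bare address such as psake@example.com at its last '@'. The helper split_addr_spec() is hypothetical and is not part of AddrParser. It is deliberately naive: it does not handle display names, angle brackets, quoted strings, comments, or groups, which is why a real parser is needed for header fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of AddrParser.
 * Splits a bare "local@domain" string at its last '@'.
 * On success, stores the length of the local name in *local_len,
 * points *domain at the first character after the '@', and returns 1.
 * Returns 0 if the string contains no '@'. */
static int split_addr_spec(const char *addr, size_t *local_len,
                           const char **domain)
{
    const char *at;

    assert(addr != NULL && local_len != NULL && domain != NULL);

    at = strrchr(addr, '@');
    if (at == NULL) {
        return 0;
    }
    *local_len = (size_t)(at - addr); /* characters before the '@' */
    *domain = at + 1;                 /* remainder after the '@' */
    return 1;
}
```

The local name is returned as a length rather than a copy so that the sketch needs no allocation; the caller can copy addr[0..local_len) if it needs a terminated string.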

(The local name is sometimes referred to as a user or user name. This is technically inaccurate, since there may not be a user -- in the sense that it is usually understood in computer terminology -- associated with the local name. For example, if the mailbox name is sales@example.com, there is probably no user in any system with the user name "sales". Perhaps the best descriptive term would be local mailbox name. A good term for the fully qualified mailbox name might also be global mailbox name.)

(Similarly, the domain is sometimes referred to as the host or host name. This, too, is technically inaccurate, since the domain need not be the name of any host. The domain is a name that can be looked up in the DNS distributed database to retrieve MX records. The MX records indicate the name of a host that the mail should be delivered to. Consider the mailbox name john@aol.com. There is no host named "aol.com"; however, "aol.com" is the name of a valid mail domain. If there are no MX records in the DNS database for a particular domain name, only then will an MTA treat the domain name as a host name.)

A group identifies one or more mailboxes. Groups are somewhat rare, but they are used often enough that they must not be ignored. An example of a group is the following:

Three Stooges: moe@silly.tv, larry@silly.tv, curly@silly.tv;

A group contains a group name and zero or more mailbox names separated by commas. The example group above has the group name "Three Stooges" and a list of three mailbox names. A group always ends with a semicolon.

Using AddrParser

AddrParser is a tool for parsing email address lists. The parser correctly parses all address lists that conform to the syntax of RFC 2822, including mailbox names and groups.

An AddrParser object must be initialized before it can be used. You call the function AddrParser_initialize() to initialize an AddrParser object. You need to initialize the AddrParser object only once. After it has been initialized, you may use it multiple times to parse different address lists.

After you initialize an AddrParser object, you call AddrParser_start() to set the address list that you want to parse and to reset the internal state of the parser. You then call AddrParser_parseNext() and test the return value. Depending on the return value, you may call zero or more of the functions AddrParser_displayName(), AddrParser_route(), AddrParser_localName(), and AddrParser_domain(). When the return value from AddrParser_parseNext() indicates the end of the list, you call AddrParser_finish() to free any memory that the parser owns.

If the address list contains only mailbox names -- the typical case -- then AddrParser_parseNext() returns ADDR_PARSER_MAILBOX_NAME each time it is called until it reaches the end of the list, when it returns ADDR_PARSER_END_LIST. After AddrParser_parseNext() returns ADDR_PARSER_MAILBOX_NAME, you call AddrParser_displayName(), AddrParser_localName(), and AddrParser_domain() to get the display name, local name, and domain, respectively. You may also call AddrParser_route() to get the route, if you need it.

If the address list contains a group, AddrParser_parseNext() returns ADDR_PARSER_BEGIN_GROUP after it has parsed the group name. You may then call AddrParser_displayName() to get the group name. AddrParser_parseNext() then returns ADDR_PARSER_MAILBOX_NAME zero or more times -- once for each mailbox name in the group. When the parser detects the end of the group, AddrParser_parseNext() returns ADDR_PARSER_END_GROUP.

If the parser detects an error in the syntax of the address list, AddrParser_parseNext() returns ADDR_PARSER_ERROR. The parser tries to recover, so that you can continue to call AddrParser_parseNext() to parse the rest of the list.

When the parser reaches the end of the list, AddrParser_parseNext() returns ADDR_PARSER_END_LIST. You must then call AddrParser_finish() to free the memory allocated by the parser.
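The sequence of return values described above can be handled with a small dispatch function. The following is a sketch, not the library's own code: ListSummary and handle_result() are hypothetical names, and the enum at the top is a placeholder so that the block compiles without the library (the real constants and their values come from AddrParser.h). The comment at the end shows roughly how the function would be driven by the documented calls.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder constants so this sketch compiles on its own.
 * With the real library, remove this enum and include AddrParser.h;
 * the actual values may differ. */
enum {
    ADDR_PARSER_MAILBOX_NAME,
    ADDR_PARSER_BEGIN_GROUP,
    ADDR_PARSER_END_GROUP,
    ADDR_PARSER_ERROR,
    ADDR_PARSER_END_LIST
};

/* Hypothetical bookkeeping for one pass over an address list. */
typedef struct {
    int in_group;     /* nonzero between BEGIN_GROUP and END_GROUP */
    size_t mailboxes; /* all mailbox names seen */
    size_t grouped;   /* mailbox names seen inside a group */
    size_t errors;    /* syntax errors reported by the parser */
} ListSummary;

/* Updates the summary for one result code.
 * Returns 1 to keep looping, 0 when the list has ended
 * (or when the code is not one of the documented values). */
static int handle_result(ListSummary *s, int result)
{
    assert(s != NULL);

    switch (result) {
    case ADDR_PARSER_MAILBOX_NAME:
        s->mailboxes++;
        if (s->in_group) {
            s->grouped++;
        }
        return 1;
    case ADDR_PARSER_BEGIN_GROUP:
        s->in_group = 1;
        return 1;
    case ADDR_PARSER_END_GROUP:
        s->in_group = 0;
        return 1;
    case ADDR_PARSER_ERROR:
        s->errors++; /* the parser recovers, so keep going */
        return 1;
    case ADDR_PARSER_END_LIST:
    default:
        return 0;
    }
}

/* With the real library, the driving loop would look roughly like:
 *
 *   AddrParser parser;
 *   ListSummary summary = {0, 0, 0, 0};
 *   AddrParser_initialize(&parser);
 *   AddrParser_start(&parser, buf, 0, strlen(buf));
 *   while (handle_result(&summary, AddrParser_parseNext(&parser))) {
 *       // call AddrParser_displayName() etc. here as needed
 *   }
 *   AddrParser_finish(&parser);
 */
```

Treating an unrecognized result code as the end of the list is a defensive choice in this sketch; it keeps the loop from spinning if a value outside the documented set ever appears.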


Functions

void AddrParser_initialize (AddrParser *parser)
 Initializes the parser object.
void AddrParser_start (AddrParser *parser, const char *buffer, size_t begin, size_t end)
 Starts a parsing operation.
void AddrParser_finish (AddrParser *parser)
 Finishes a parsing operation.
int AddrParser_parseNext (AddrParser *parser)
 Continues a parsing operation.
const char * AddrParser_displayName (AddrParser *parser)
 Gets the display name of a mailbox name or a group.
const char * AddrParser_route (AddrParser *parser)
 Gets the route of a mailbox name.
const char * AddrParser_localName (AddrParser *parser)
 Gets the local name of a mailbox name.
const char * AddrParser_domain (AddrParser *parser)
 Gets the domain of a mailbox name.


Function Documentation

const char * AddrParser_displayName (AddrParser *parser)
 

Gets the display name of a mailbox name or a group.

You may call this function after AddrParser_parseNext() returns ADDR_PARSER_MAILBOX_NAME or ADDR_PARSER_BEGIN_GROUP.

The return value is a pointer that becomes invalid after the next call to AddrParser_parseNext(). If you wish to store the display name, you must copy the string.

The function never returns a NULL pointer. If the mailbox name or group does not contain a display name, then the function returns a zero-length string.

Parameters:
parser the parser object
Returns:
display name of the most recently parsed mailbox name or group
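
Because the returned pointer becomes invalid after the next call to AddrParser_parseNext(), a caller that wants to keep the string has to copy it first. The following is one possible way to do that in portable C; copy_string() is a hypothetical helper, not part of AddrParser, and the same approach would apply to the strings returned by the other accessor functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of AddrParser.
 * Returns a heap-allocated copy of s, or NULL if allocation fails.
 * The caller owns the copy and must release it with free(). */
static char *copy_string(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);

    n = strlen(s) + 1; /* include the terminating NUL */
    copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}
```

A typical use would be to call copy_string(AddrParser_displayName(&parser)) immediately after AddrParser_parseNext() returns, and to free the copy once it is no longer needed.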

const char * AddrParser_domain (AddrParser *parser)
 

Gets the domain of a mailbox name.

You may call this function after AddrParser_parseNext() returns ADDR_PARSER_MAILBOX_NAME.

The return value is a pointer that becomes invalid after the next call to AddrParser_parseNext(). If you wish to store the domain, you must copy the string.

The function never returns a NULL pointer, but it may return an empty string.

Parameters:
parser the parser object
Returns:
domain of the most recently parsed mailbox name

void AddrParser_finish (AddrParser *parser)
 

Finishes a parsing operation.

You must call this function in order to free any memory that the parser allocated during the parse operation.

Parameters:
parser the parser object

void AddrParser_initialize (AddrParser *parser)
 

Initializes the parser object.

You must initialize the parser object before you use it. After you have initialized it, you may use the parser object for multiple parse operations.

Parameters:
parser the parser object

const char * AddrParser_localName (AddrParser *parser)
 

Gets the local name of a mailbox name.

You may call this function after AddrParser_parseNext() returns ADDR_PARSER_MAILBOX_NAME.

The return value is a pointer that becomes invalid after the next call to AddrParser_parseNext(). If you wish to store the local name, you must copy the string.

The function never returns a NULL pointer, but it may return an empty string.

Parameters:
parser the parser object
Returns:
local name of the most recently parsed mailbox name

int AddrParser_parseNext (AddrParser *parser)
 

Continues a parsing operation.

After you call the function AddrParser_start(), you call this function to parse addresses from the address list. The function's return value indicates the result of the operation.

If the function returns ADDR_PARSER_MAILBOX_NAME, then the parser has just parsed a mailbox name. You call the functions AddrParser_displayName(), AddrParser_route(), AddrParser_localName(), or AddrParser_domain() to get the parts of the mailbox name.

If the function returns ADDR_PARSER_BEGIN_GROUP, then the parser has just parsed the beginning of a group. You call the function AddrParser_displayName() to get the name of the group. Any mailbox names that follow, up until AddrParser_parseNext() returns ADDR_PARSER_END_GROUP, are part of the mailbox list contained in the group.

If the function returns ADDR_PARSER_ERROR, then the parser has failed to parse an address. The parser tries to recover, so you can continue to call AddrParser_parseNext().

If the function returns ADDR_PARSER_END_LIST, then the parser has reached the end of the address list. When you get this value, you terminate your loop, since there are no more addresses to parse.

Parameters:
parser the parser object
Returns:
enumerated value that indicates the result of the parsing

void AddrParser_start (AddrParser *parser, const char *buffer, size_t begin, size_t end)
 

Starts a parsing operation.

You call this function to begin parsing an address list.

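The parser begins parsing at the character buffer[begin]. The parser processes characters in buffer up to and including the character buffer[end-1]. In other words, begin is inclusive and end is exclusive, so the parser reads end - begin characters.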

The parser does not copy the array pointed to by buffer; therefore, you must not free the array until you have finished parsing the address list and you have called AddrParser_finish().

The parser does not free the array pointed to by buffer.

Parameters:
parser the parser object
buffer pointer to a char array containing the address list to parse
begin offset in the array to begin parsing
end offset in the array to end parsing
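
As an illustration of the begin and end offsets, the sketch below computes the range covering the field body of a single, unfolded header line such as "To: psake@example.com" followed by CRLF. The helper field_body_range() is hypothetical and not part of AddrParser; it assumes the line is NUL-terminated and contains no folded continuation lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of AddrParser.
 * Finds the field body of one unfolded header line.
 * On success, sets *begin to the offset just after the first ':' and
 * *end to the offset one past the last character before any trailing
 * CR or LF characters, then returns 1. Returns 0 if there is no ':'. */
static int field_body_range(const char *line, size_t *begin, size_t *end)
{
    const char *colon;
    size_t len;

    assert(line != NULL && begin != NULL && end != NULL);

    colon = strchr(line, ':');
    if (colon == NULL) {
        return 0;
    }

    len = strlen(line);
    while (len > 0 && (line[len - 1] == '\r' || line[len - 1] == '\n')) {
        len--; /* drop the line terminator from the range */
    }

    *begin = (size_t)(colon - line) + 1;
    *end = len;
    return 1;
}
```

The resulting offsets could then be passed as AddrParser_start(&parser, line, begin, end). Any space after the colon is left inside the range in this sketch.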

Copyright © 2001-2006 Hunny Software, Inc. All rights reserved.