Address Syntax Check

The address syntax check works similarly to the quality check, but focuses purely on the syntax of an e-mail address. The input address can include encoded Unicode characters.

To call the syntax check, use the following URL format:

Syntax     …/svc/2.0/address/syntax/<e-mail address>
Example    …/svc/2.0/address/syntax/foo@bar.com
Parameter  An e-mail address as the last part of the URL
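
A minimal Python sketch of building the request URL, assuming the address has to be percent-encoded before it is appended to the path. The function name build_syntax_url and the base_url parameter are illustrative only; the service prefix elided as … above has to be supplied by the caller.

```python
from urllib.parse import quote


def build_syntax_url(base_url: str, address: str) -> str:
    """Append a percent-encoded e-mail address to the syntax check path.

    base_url stands in for the service prefix that this page elides;
    it is a placeholder, not a real endpoint.
    """
    # safe="@" leaves the at sign as-is, matching the example URL above.
    # Non-ASCII characters are percent-encoded from their UTF-8 bytes.
    encoded = quote(address, safe="@")
    return f"{base_url.rstrip('/')}/svc/2.0/address/syntax/{encoded}"
```

Whether the service expects UTF-8 percent-encoding for Unicode input is an assumption of this sketch.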

Result

As XML:

<syntaxStatus>
    <decoded>foo@xn--blmchen-o2a.de</decoded>
    <extSyntax>1</extSyntax>
    <result>2</result>
</syntaxStatus>

As JSON:

   {
       "decoded": "foo@xn--blmchen-o2a.de",
       "extSyntax": 1,
       "result": 2
   }

result

Tests the syntax of the address against the e-mail addressing standards. Possible result values are:

  • 0: invalid syntax
  • 1: valid syntax
  • 2: probably valid syntax, Unicode problems were solved, see decoded

The test is stricter than the standards because it requires a valid domain name in the address. Localhost addresses and other exotic cases are not accepted, because they are unlikely to be valid business e-mail addresses.

If the syntax result is 0 (invalid) or 2 (probably valid), the structure also contains syntax warnings explaining the problems, see below.
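
A minimal Python sketch of how a client might act on the three result values, assuming the JSON form of the response. The function name address_for_further_tests is illustrative and not part of the service.

```python
import json
from typing import Optional


def address_for_further_tests(response_text: str, original: str) -> Optional[str]:
    """Return the address to keep working with, or None if it is invalid."""
    data = json.loads(response_text)
    result = data.get("result")
    if result == 1:
        # Valid as entered: keep the original address.
        return original
    if result == 2:
        # Unicode problems were solved: use the decoded ASCII address.
        return data.get("decoded")
    # 0 (or anything unexpected) is treated as invalid in this sketch.
    return None
```

Treating unknown result values as invalid is a design choice of the sketch, not documented service behavior.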

decoded

If the syntax test ended with a result of 2, this field contains the decoded ASCII address. A syntax test result of 2 means that the address contained Unicode characters (e.g., umlauts, Arabic or Chinese characters), which are invalid in an e-mail address. These characters were successfully converted, and the resulting valid ASCII address was stored in decoded. Further tests should always use this decoded address.

The decoding during the syntax test is done in two stages:

  1. The local part of the address is checked for German umlauts. If found, they are converted to their usual ASCII counterparts (ü ⟶ ue, ä ⟶ ae, ö ⟶ oe, ß ⟶ ss).
  2. The domain part of the address is transformed to Punycode, according to the standard for international domains.
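
The two stages can be approximated in a few lines of Python. This is a sketch, not the service's implementation: it assumes Python's built-in "idna" codec (which follows the older IDNA 2003 rules) is an acceptable stand-in for the Punycode step, and the names UMLAUT_MAP and decode_address are illustrative.

```python
# Lowercase German umlauts and their usual ASCII replacements, as listed above.
UMLAUT_MAP = {"ü": "ue", "ä": "ae", "ö": "oe", "ß": "ss"}


def decode_address(address: str) -> str:
    """Return an ASCII form of address using the two stages described above."""
    if "@" not in address:
        raise ValueError("address has no @ sign")
    local, _, domain = address.rpartition("@")

    # Stage 1: replace German umlauts in the local part.
    for umlaut, ascii_form in UMLAUT_MAP.items():
        local = local.replace(umlaut, ascii_form)

    # Stage 2: convert the domain to its Punycode (ASCII-compatible) form.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"
```

For example, decode_address("müller@blümchen.de") yields "mueller@xn--blmchen-o2a.de", while a plain ASCII address passes through unchanged.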

extSyntax

Many e-mail providers have their own rules for valid e-mail addresses in their domains. These typically include the minimum length of an address, which punctuation characters are allowed, etc. The extended syntax check verifies addresses against these rules. Possible results are:

  • 0: invalid syntax for this domain
  • 1: valid syntax for this domain

If the extended syntax check fails, the result structure will include syntax warnings explaining the problem, see below.
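
To illustrate the idea of provider-specific rules, here is a small Python sketch of a data-driven check. Everything in it is hypothetical: the ProviderRule type, the RULEBASE contents, the example.com entry and its limits are invented for illustration and do not reflect the actual rulebase.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderRule:
    min_local_length: int      # minimum length of the part before the @
    allowed_punctuation: str   # punctuation characters the provider accepts


# Hypothetical rulebase: the domain and limits are made up.
RULEBASE = {
    "example.com": ProviderRule(min_local_length=3, allowed_punctuation="._"),
}


def ext_syntax(address: str) -> int:
    """Return 1 if the address satisfies its provider's rule, else 0."""
    local, _, domain = address.rpartition("@")
    rule = RULEBASE.get(domain.lower())
    if rule is None:
        # Assumption: a domain without a rule has nothing to violate.
        return 1
    if len(local) < rule.min_local_length:
        return 0
    allowed = set(string.ascii_letters + string.digits + rule.allowed_punctuation)
    return 1 if all(ch in allowed for ch in local) else 0
```

Keeping the rules as data rather than code mirrors the point that provider rules can change without touching the checking logic.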

syntaxWarnings

If one of the syntax checks fails, the result structure will include one or more syntaxWarnings elements:

<syntaxStatus>
    ...
    <syntaxWarnings>synm002</syntaxWarnings>
    <syntaxWarnings>....
</syntaxStatus>

Each element contains a message code, which can be used to identify the problem. See the page syntax warnings for codes and explanations.
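
A short Python sketch of collecting the warning codes from an XML response, assuming the syntaxWarnings elements are direct children of syntaxStatus as in the sample above. The function name syntax_warnings is illustrative.

```python
import xml.etree.ElementTree as ET
from typing import List


def syntax_warnings(xml_text: str) -> List[str]:
    """Return all warning codes found in a syntaxStatus document."""
    root = ET.fromstring(xml_text)
    # Each syntaxWarnings element carries exactly one message code.
    return [el.text.strip() for el in root.findall("syntaxWarnings") if el.text]
```

An empty list means the response carried no warnings.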

domainScores

The domainScores element consists of a list of similar-sounding domain names, ordered by a calculated score:

<syntaxStatus>
  ...
  <domainScores>
    <domainScore>
      <domain>teleos-web.de</domain><score>1.0</score>
    </domainScore>
  </domainScores>
</syntaxStatus>

The higher the score, the higher the probability that this domain was intended. The system calculates the score by searching for similar domain names that are popular with e-mail marketing users.
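
A client could read the suggestions from the XML response as sketched below in Python. The element names follow the sample above; the function name domain_suggestions is illustrative, and the explicit sort is a precaution rather than something the service requires.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def domain_suggestions(xml_text: str) -> List[Tuple[str, float]]:
    """Return (domain, score) pairs, highest score first."""
    root = ET.fromstring(xml_text)
    suggestions = []
    for entry in root.findall("./domainScores/domainScore"):
        domain = entry.findtext("domain")
        score = entry.findtext("score")
        if domain is not None and score is not None:
            suggestions.append((domain.strip(), float(score)))
    # Sort defensively so the most likely intended domain comes first.
    suggestions.sort(key=lambda pair: pair[1], reverse=True)
    return suggestions
```

The first entry of the returned list, if any, is the most likely intended domain.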

Functionality

This section describes the execution of the address syntax check in detail. The check consists of the following steps:

  1. checking the mailbox (local part) of the address for Unicode characters. Unicode is not allowed in the local part, so this routine only checks for typical, language-specific typos. Currently this includes only German umlauts. If found, they are converted to their usual ASCII counterparts: ü ⟶ ue, ä ⟶ ae, ö ⟶ oe, ß ⟶ ss. If at least one of these conversions happens, the syntax warning synm018 is added to the result, and the decoded ASCII value of the mailbox is stored in decoded.

  2. checking the domain name of the address for Unicode characters. If Unicode is found, the domain is probably an internationalized domain name (IDN), and the domain name will be transformed according to the Punycode standard (RFC 3492). In this case the syntax warning synm017 is added to the result, and the decoded ASCII value of the domain name is stored in decoded.

  3. checking the syntax according to the standards. As mentioned above, exotic cases, like localhost addresses or addresses with comments, will be rejected. The check expects real addresses usable for e-mail transfer across domains. If the syntax check fails, the test assumes an input error and the result includes a list of similar-sounding, popular domain names, taken from the domain_response table, in domainScores.

  4. checking the syntax against the rulebase of extended syntax criteria. These are provider-specific syntax rules that can be changed in the rulebase, without recoding.

Each execution of a quality check will be documented as a business event (database table business_event) with type 103.