EDI 810 vs 850 Schema Mapping

Reconciling purchase orders (EDI 850) against invoices (EDI 810) is the foundational control point for procurement-to-pay accuracy. The two transaction sets describe the same commercial event from opposite sides of the ledger — one asserts intent to buy, the other asserts an obligation to pay — and the reconciliation engine only works if every field that should agree is mapped to a single canonical key before comparison. When that mapping is loose, three-way match logic silently leaks variance into ERP settlement, and month-end close becomes a forensic exercise instead of an automated pass.

This is a decision-heavy mapping problem, not a one-shot translation. You have to choose which segment elements are authoritative anchors, where tolerance is allowed, and how to route a transaction when the 810 and 850 disagree. The patterns below are implementation-ready: deterministic X12 parsing, a canonical schema both documents normalize into, configurable tolerance routing, idempotent orchestration, and the dead-letter-queue (DLQ) recovery flow that keeps a noisy trading-partner stream auditable. Within the broader Core Architecture & Data Mapping for Reconciliation reference, this page defines the mapping contract that the match engine downstream depends on.

Core Concept & Decision Criteria

X12 850 and 810 transactions share a hierarchical segment grammar but diverge in intent. The 850 opens with the BEG segment (Beginning Segment for Purchase Order) and carries procurement intent; the 810 opens with BIG (Beginning Segment for Invoice) and asserts a financial obligation. Both lean on the N1/N2/N3/N4 party loops. Line-item detail lives in PO1 on the order and IT1 on the invoice, two segments that share the same positional layout for line number, quantity, unit price, and product ID. Control totals arrive in CTT on both documents, with TDS carrying the invoice total on the 810. The reconciliation engine’s job is to collapse those two grammars into one canonical record so that a field-by-field comparison is meaningful.

The decision signal that governs everything else is which element is the authoritative join key. The PO number is the spine: it appears as BEG03 on the order and resurfaces on the invoice as either BIG04 or a REF segment with qualifier PO (purchase order number; PK is the packing list number and should not be used as the join key). If a trading partner populates both inconsistently, you must pick a precedence order at mapping time rather than discovering the ambiguity during settlement. The second decision is where tolerance is permitted: quantity and unit price almost always need a tolerance window, whereas the PO number and supplier identity must match exactly.

The table below is the field-alignment contract the rest of this page implements. Treat the “Match rule” column as the policy the engine enforces.

| Concept | EDI 850 (Purchase Order) | EDI 810 (Invoice) | Match rule |
| --- | --- | --- | --- |
| Transaction header | `BEG` (purpose, PO type, PO number, date) | `BIG` (invoice date, invoice number, PO date, PO number) | Structural anchor only |
| PO number (join key) | `BEG03` | `BIG04` or `REF*PO` | Exact, normalized |
| Invoice number | n/a | `BIG02` | Idempotency component |
| Supplier / vendor | `N1*SU` party loop | `N1*SU` party loop | Exact on agreed ID qualifier |
| Line sequence | `PO101` | `IT101` | Exact ordinal alignment |
| Quantity | `PO102` (ordered) | `IT102` (invoiced) | Tolerance window |
| Unit price | `PO104` | `IT104` | Tolerance window |
| Product ID | `PO106/PO107` qualifier+value | `IT106/IT107` qualifier+value | Exact after alias resolution |
| Dates | `DTM` (requested ship) | `DTM` (invoice/ship) | ISO-8601 normalized |
| Control total | `CTT` / `AMT` (total amount) | `TDS01` (total) | Cross-foot, tolerance |

Two anchors deserve extra care. DTM segments encode dates in CCYYMMDD and frequently arrive in a trading partner’s local time, so they must be normalized to a single UTC anchor before they are compared or used for FX lookups — the deterministic patterns in Timezone Normalization for Global Supply Chains prevent off-by-one-day discrepancies at period boundaries. And monetary elements must never be coerced through floating point; carry them as fixed-point decimals from the moment they leave the parser.
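To make the date anchor concrete, here is a minimal sketch that reads a CCYYMMDD value as local midnight at the trading partner and converts it to UTC. The function name `dtm_to_utc` and the `partner_offset_hours` setting are assumptions made for illustration; a production mapper would more likely resolve a named zone from partner onboarding data.

```python
from datetime import datetime, timedelta, timezone


def dtm_to_utc(ccyymmdd: str, partner_offset_hours: int) -> datetime:
    """Interpret a CCYYMMDD date as local midnight at the partner and return UTC.

    `partner_offset_hours` is a hypothetical per-partner setting. A fixed
    offset ignores daylight-saving transitions.
    """
    local_zone = timezone(timedelta(hours=partner_offset_hours))
    local_midnight = datetime.strptime(ccyymmdd, "%Y%m%d").replace(tzinfo=local_zone)
    return local_midnight.astimezone(timezone.utc)


# A January 1 date stamped in a UTC+9 zone still belongs to December 31 in UTC.
utc_anchor = dtm_to_utc("20240101", 9)
print(utc_anchor.isoformat())  # 2023-12-31T15:00:00+00:00
```

The example shows why a naive string comparison of two `DTM` values can land in the wrong fiscal period when the partners sit in different zones.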

Implementation

The mapping layer is a stateless transformer: it parses raw X12 into a positional segment index, projects each document into the canonical schema above, validates it with pydantic, and emits a reconciliation-ready record. Keeping it stateless is what makes the pipeline replayable — the same pair of payloads always produces the same canonical record, which is the precondition for idempotency further downstream. Structured logging at each stage gives you the audit fields the recovery section depends on.

```python
import logging
from datetime import datetime, timezone
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError

logger = logging.getLogger("edi.recon.mapping")

# PO1 (850) and IT1 (810) share one positional layout:
# [seg_id, line_no, qty, uom, unit_price, basis, prod_qual, prod_id, ...]
LINE_NO, QTY, PRICE, PROD_QUAL, PROD_ID = 1, 2, 4, 6, 7


class LineItem(BaseModel):
    po_line: str
    item_id: str
    qty_ordered: Decimal = Decimal("0")
    qty_invoiced: Decimal = Decimal("0")
    unit_price_ordered: Decimal = Decimal("0")
    unit_price_invoiced: Decimal = Decimal("0")
    qty_tolerance_units: Decimal = Decimal("0")
    price_tolerance_pct: Decimal = Decimal("0.015")


class ReconciliationRecord(BaseModel):
    # min_length=1 makes a missing key fail validation instead of passing as "".
    po_number: str = Field(min_length=1)
    invoice_number: str = Field(min_length=1)
    supplier_id: str = Field(min_length=1)
    invoice_date: datetime
    line_items: List[LineItem]
    total_ordered: Decimal
    total_invoiced: Decimal
    status: str = Field(default="PENDING")


def parse_x12_segments(raw_edi: str) -> Dict[str, List[List[str]]]:
    """Index an X12 envelope by segment id. Pure and side-effect free.

    Assumes "~" as the segment terminator and "*" as the element separator.
    """
    parsed: Dict[str, List[List[str]]] = {}
    for seg in (s.strip() for s in raw_edi.split("~") if s.strip()):
        parts = seg.split("*")
        parsed.setdefault(parts[0], []).append(parts)
    logger.debug("parsed_segments segment_types=%d", len(parsed))
    return parsed


def _elem(seg_rows: List[List[str]], idx: int, pos: int) -> Optional[str]:
    """Safely read element `pos` of occurrence `idx` of a segment."""
    if idx < len(seg_rows) and pos < len(seg_rows[idx]):
        return seg_rows[idx][pos].strip() or None
    return None


def _dec(row: List[str], pos: int) -> Decimal:
    """Read a numeric element as Decimal; a missing or blank element becomes 0."""
    raw = row[pos].strip() if pos < len(row) else ""
    try:
        return Decimal(raw) if raw else Decimal("0")
    except InvalidOperation as exc:
        raise ValueError(f"non-numeric element {pos} in {row[0]}: {raw!r}") from exc


def _po_number(p810: Dict[str, List[List[str]]]) -> Optional[str]:
    """Resolve the join key with explicit precedence: BIG04 then REF*PO."""
    big04 = _elem(p810.get("BIG", []), 0, 4)
    if big04:
        return big04
    for ref in p810.get("REF", []):
        if len(ref) > 2 and ref[1] == "PO":
            return ref[2].strip() or None
    return None


def _supplier_id(doc: Dict[str, List[List[str]]]) -> Optional[str]:
    """Return N104 (ID code), else N102 (name), from the N1*SU party loop."""
    for n1 in doc.get("N1", []):
        if len(n1) > 1 and n1[1] == "SU":
            return _elem([n1], 0, 4) or _elem([n1], 0, 2)
    return None


def extract_alignment(raw_850: str, raw_810: str) -> ReconciliationRecord:
    """Project a matched 850/810 pair into the canonical reconciliation record."""
    p850, p810 = parse_x12_segments(raw_850), parse_x12_segments(raw_810)

    po_num = _elem(p850.get("BEG", []), 0, 3)
    inv_num = _elem(p810.get("BIG", []), 0, 2)
    supplier = _supplier_id(p850)
    inv_raw = _elem(p810.get("BIG", []), 0, 1)
    if not inv_raw:
        # Falling back to "now" would make a replay produce a different record.
        raise ValueError("810 is missing BIG01 (invoice date)")
    inv_date = datetime.strptime(inv_raw, "%Y%m%d").replace(tzinfo=timezone.utc)

    join_810 = _po_number(p810)
    if join_810 and po_num and join_810 != po_num:
        logger.warning("po_join_mismatch order=%s invoice=%s", po_num, join_810)

    # The 850 carries line items in PO1; the 810 carries them in IT1.
    lines_850, lines_810 = p850.get("PO1", []), p810.get("IT1", [])
    if len(lines_850) != len(lines_810):
        # zip() below stops at the shorter list, so surface the dropped lines.
        logger.warning(
            "line_count_mismatch po=%s ordered=%d invoiced=%d",
            po_num, len(lines_850), len(lines_810),
        )
    items: List[LineItem] = [
        LineItem(
            po_line=_elem([l850], 0, LINE_NO) or "",
            item_id=_elem([l850], 0, PROD_ID) or "",
            qty_ordered=_dec(l850, QTY),
            qty_invoiced=_dec(l810, QTY),
            unit_price_ordered=_dec(l850, PRICE),
            unit_price_invoiced=_dec(l810, PRICE),
        )
        for l850, l810 in zip(lines_850, lines_810)
    ]

    try:
        record = ReconciliationRecord(
            po_number=po_num or "",
            invoice_number=inv_num or "",
            supplier_id=supplier or "",
            invoice_date=inv_date,
            line_items=items,
            total_ordered=sum((i.qty_ordered * i.unit_price_ordered for i in items), Decimal("0")),
            total_invoiced=sum((i.qty_invoiced * i.unit_price_invoiced for i in items), Decimal("0")),
        )
    except ValidationError as exc:
        logger.error("canonical_validation_failed po=%s errors=%s", po_num, exc.errors())
        raise

    logger.info(
        "mapped po=%s invoice=%s lines=%d delta=%s",
        record.po_number, record.invoice_number, len(items),
        record.total_invoiced - record.total_ordered,
    )
    return record
```

The validation step is deliberately strict: a payload that cannot be coerced into the canonical schema is rejected at the boundary rather than allowed to contaminate the ledger. The same pydantic-first discipline applies across every ingress on this site — the contract design is covered in depth in Schema Validation Using Pydantic.

Configuration & Threshold Calibration

Tolerance windows are the single most important configuration surface, and they should be vendor-tier specific rather than global. A line passes the financial gate only when both the quantity and the unit price fall inside their respective windows, where $q$ is quantity, $p$ is unit price, and the subscripts mark the invoice and PO values:

$$
\left| q_{inv} - q_{po} \right| \le \tau_{qty} \quad\text{and}\quad \left| \frac{p_{inv} - p_{po}}{p_{po}} \right| \le \tau_{price}
$$

Calibrate $\tau_{qty}$ and $\tau_{price}$ per trading-partner tier. High-volume commodity suppliers with clean EDI typically run a tight ±0.5% price window and zero quantity slack; strategic partners with legacy translators may need ±1.5% and a one- or two-unit quantity allowance to absorb pack-rounding. Never widen a tolerance to clear a backlog — that converts a data-quality alert into silent financial leakage. The methodology for choosing these bands, including how to derive them from historical variance distributions, is detailed in Setting Quantity and Price Tolerance Windows.
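The gate can be sketched as a single predicate over `Decimal` values. The function name `within_tolerance` and the zero-price handling are assumptions for illustration, not part of the mapping code above.

```python
from decimal import Decimal


def within_tolerance(
    qty_po: Decimal,
    qty_inv: Decimal,
    price_po: Decimal,
    price_inv: Decimal,
    tau_qty: Decimal,
    tau_price: Decimal,
) -> bool:
    """Return True when both the quantity and unit-price deltas are inside their windows."""
    if price_po == 0:
        # The relative price delta is undefined for a zero PO price;
        # this sketch only accepts an exact zero-price match.
        price_ok = price_inv == 0
    else:
        price_ok = abs((price_inv - price_po) / price_po) <= tau_price
    qty_ok = abs(qty_inv - qty_po) <= tau_qty
    return qty_ok and price_ok


# 100 units at 10.00 ordered; 101 units at 10.10 invoiced (a 1% price drift).
args = (Decimal("100"), Decimal("101"), Decimal("10.00"), Decimal("10.10"))
strict = within_tolerance(*args, tau_qty=Decimal("0"), tau_price=Decimal("0.005"))
relaxed = within_tolerance(*args, tau_qty=Decimal("2"), tau_price=Decimal("0.015"))
print(strict, relaxed)  # False True
```

The same line fails under the tight commodity-tier window and passes under the wider legacy-translator window, which is why the thresholds belong in per-tier configuration rather than in code.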

| Parameter | Recommended default | Tier override range | Rationale |
| --- | --- | --- | --- |
| `price_tolerance_pct` | `0.015` | `0.005`–`0.025` | Absorbs FX rounding and pack-price drift |
| `qty_tolerance_units` | `0` | `0`–`2` | Allows pack/case rounding for bulk SKUs |
| `po_join_precedence` | `["BIG04", "REF*PO"]` | partner-specific | Resolves duplicate PO references |
| `date_format` | `%Y%m%d` → UTC | fixed | Eliminates regional `DTM` ambiguity |
| `monetary_type` | `Decimal` | fixed | Prevents float rounding breaches |
| `control_total_check` | `TDS01` vs `Σ(IT1)` | `±0.01` | Catches truncated/oversized line sets |
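The `control_total_check` row can be sketched as a cross-foot of the invoice lines against `TDS01`. In X12, `TDS01` is transmitted with two implied decimal places, so `125050` means 1250.50. The helper names are hypothetical, and the sketch assumes the invoice carries no allowance, charge, or tax segments that would legitimately separate the total from the line sum.

```python
from decimal import Decimal
from typing import List, Tuple


def tds01_to_decimal(tds01: str) -> Decimal:
    """Convert TDS01 to an amount, assuming two implied decimal places."""
    return Decimal(tds01).scaleb(-2)


def control_total_ok(
    lines: List[Tuple[Decimal, Decimal]],
    tds01: str,
    tolerance: Decimal = Decimal("0.01"),
) -> bool:
    """Cross-foot sum(qty * unit_price) over the IT1 lines against TDS01."""
    line_total = sum((qty * price for qty, price in lines), Decimal("0"))
    return abs(tds01_to_decimal(tds01) - line_total) <= tolerance


# Each tuple is (quantity, unit price) taken from one IT1 line.
lines = [(Decimal("10"), Decimal("100.00")), (Decimal("5"), Decimal("50.10"))]
print(control_total_ok(lines, "125050"))      # True
print(control_total_ok(lines[:1], "125050"))  # False, one line is missing
```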

When the 850 and 810 originate in different fiscal zones, the price tolerance must be applied after currency conversion, with the FX rate pinned to the invoice timestamp. Folding rate selection into the mapping layer is where reconciliation drift creeps in; route it through the standardized Multi-Currency Reconciliation Frameworks instead so rate application and rounding are consistent across every settlement batch.
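The ordering matters more than the arithmetic: convert first, compare second. The sketch below assumes a rate already pinned to the invoice timestamp by the currency framework; the function name, the four-decimal precision, and the half-up rounding mode are illustrative choices, not settings taken from that framework.

```python
from decimal import ROUND_HALF_UP, Decimal


def price_delta_after_fx(price_po: Decimal, price_inv: Decimal, fx_rate: Decimal) -> Decimal:
    """Convert the invoice price into PO currency, then return the relative delta.

    `fx_rate` is PO-currency units per invoice-currency unit, supplied by the caller.
    """
    converted = (price_inv * fx_rate).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
    return (converted - price_po) / price_po


# PO priced at 10.00 USD, invoice at 9.30 EUR, hypothetical pinned rate 1.0800 USD/EUR.
delta = price_delta_after_fx(Decimal("10.00"), Decimal("9.30"), Decimal("1.0800"))
print(delta)                             # 0.0044
print(abs(delta) <= Decimal("0.005"))    # True
```

Comparing the raw 9.30 against 10.00 would show a 7% gap and trip every tolerance tier, even though the converted price is within half a percent.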

Orchestration & Integration

The mapping layer sits between ingestion and the match engine, and it must guarantee exactly-once settlement under retry. Derive an idempotency key from po_number + invoice_number + line_sequence and persist it before posting; a replayed 810 (a VAN redelivery, an AS2 retry) then resolves to the same key and is suppressed at the commit boundary rather than double-paying a supplier.
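A minimal sketch of that key derivation follows. The separator, the SHA-256 digest, and the in-memory `InMemoryKeyStore` are assumptions for illustration; in production the store would be a durable table with a unique constraint on the key.

```python
import hashlib
from typing import Set


def idempotency_key(po_number: str, invoice_number: str, line_sequence: str) -> str:
    """Hash the three components with a separator so ("AB", "C") and ("A", "BC") differ."""
    parts = (po_number, invoice_number, line_sequence)
    canonical = "\x1f".join(part.strip() for part in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemoryKeyStore:
    """Stand-in for a durable store with a unique constraint on the key."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def claim(self, key: str) -> bool:
        """Return True the first time a key is claimed, False on any replay."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


store = InMemoryKeyStore()
key = idempotency_key("PO-1001", "INV-778", "1")
first = store.claim(key)   # True: safe to post
replay = store.claim(key)  # False: suppress the redelivery
print(first, replay)
```

Joining with a separator before hashing avoids the collision that plain string concatenation allows when component boundaries shift.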

Upstream, the canonical record consumes whatever the parsing stage produces and assumes it is already structurally valid — that boundary contract belongs to the ingestion layer, not here. Downstream, the matched record flows into the broader Core Architecture & Data Mapping for Reconciliation pipeline, which adjudicates exact, tolerance, and similarity tiers. Keep parsing, mapping, validation, and routing as distinct micro-batches so each scales independently and a spike in malformed payloads cannot stall settlement. Once the canonical record is produced, the field-level transformation into your warehouse’s procurement tables — flattening the IT1 loops into relational rows or JSONB, and applying explicit casting rules per trading-partner variant — is covered in How to Map EDI 810 Invoices to Internal PO Schemas. Adhering to the version-specific segment requirements in the X12 standards documentation keeps the mapper compatible with both modern and legacy trading partners.

Debugging & Pipeline Recovery

When an 810/850 pair fails to reconcile, the goal is a self-clearing exception queue, not a manual scavenger hunt. Route every failure to a structured DLQ that carries the full mapping context, then tag it so root-cause analytics can spot systemic partner issues.

DLQ payload contract. Each entry stores the raw 850 and 810 envelopes, the resolved canonical record, the normalized join key, and the computed quantity/price deltas. Without the deltas, an analyst has to re-derive the failure by hand.
Failure-reason taxonomy. Tag every record with one of NO_PO_MATCH, QTY_TOLERANCE_EXCEEDED, PRICE_TOLERANCE_EXCEEDED, SUPPLIER_MISMATCH, CONTROL_TOTAL_MISMATCH, or SCHEMA_INVALID. This single field turns a flat queue into a triage dashboard.
Audit log fields. Emit po_number, invoice_number, line_sequence, match_status, tolerance_applied, fx_rate_id, and mapped_at for every record — matched or not. Write them to append-only storage so SOX and internal audit reviews can replay any decision.
Monitoring signals & alert thresholds. Track the failure-reason distribution per partner. A CONTROL_TOTAL_MISMATCH rate above ~1% usually means truncated line sets from a translator change; a climbing NO_PO_MATCH rate points at PO-reference drift on the invoice side. Alert the trading-partner onboarding team rather than loosening a tolerance. For date parsing rigor that eliminates regional misalignment at the source, reference the ISO 8601 date and time format specification.
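The payload contract and taxonomy above can be sketched as a small typed structure. The class names `FailureReason` and `DlqEntry` and the JSON serialization are assumptions for illustration; the field set follows the contract described in the list.

```python
import json
from dataclasses import asdict, dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional


class FailureReason(Enum):
    NO_PO_MATCH = "NO_PO_MATCH"
    QTY_TOLERANCE_EXCEEDED = "QTY_TOLERANCE_EXCEEDED"
    PRICE_TOLERANCE_EXCEEDED = "PRICE_TOLERANCE_EXCEEDED"
    SUPPLIER_MISMATCH = "SUPPLIER_MISMATCH"
    CONTROL_TOTAL_MISMATCH = "CONTROL_TOTAL_MISMATCH"
    SCHEMA_INVALID = "SCHEMA_INVALID"


@dataclass(frozen=True)
class DlqEntry:
    raw_850: Optional[str]           # None when no matching order was found
    raw_810: str
    canonical_record: Optional[dict]
    join_key: Optional[str]
    qty_delta: Decimal
    price_delta_pct: Decimal
    reason: FailureReason

    def to_json(self) -> str:
        """Serialize for the queue; Decimals are written as strings to stay exact."""
        payload = asdict(self)
        payload["reason"] = self.reason.value
        return json.dumps(payload, default=str, sort_keys=True)


entry = DlqEntry(
    raw_850="BEG*00*SA*PO-1001**20240102~",
    raw_810="BIG*20240105*INV-778**PO-1001~",
    canonical_record=None,
    join_key="PO-1001",
    qty_delta=Decimal("3"),
    price_delta_pct=Decimal("0.000"),
    reason=FailureReason.QTY_TOLERANCE_EXCEEDED,
)
print(entry.to_json())
```

Storing the deltas alongside the raw envelopes means the triage dashboard can group by `reason` without re-running the mapper.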

FAQ

Why do invoices reference the PO number in two different places?

X12 lets the PO number live in BIG04 and also in a REF segment qualified PO, and trading partners are inconsistent about which they populate. Resolve it with an explicit precedence list (BIG04 first, then REF*PO) at mapping time and log a po_join_mismatch whenever the two disagree, so you catch partner-side drift before it reaches settlement instead of silently picking the wrong key.
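A minimal sketch of that precedence rule, returning both the chosen key and a mismatch flag. The function name is hypothetical, and `ref_qualifier` defaults to `PO`, the X12 code for a purchase order number; it is a parameter so a partner-specific qualifier can be substituted.

```python
from typing import List, Optional, Tuple


def resolve_po_reference(
    segments: List[List[str]], ref_qualifier: str = "PO"
) -> Tuple[Optional[str], bool]:
    """Return (join_key, mismatch) using BIG04-first precedence."""
    big04: Optional[str] = None
    ref_value: Optional[str] = None
    for seg in segments:
        if seg[0] == "BIG" and len(seg) > 4 and big04 is None:
            big04 = seg[4].strip() or None
        elif seg[0] == "REF" and len(seg) > 2 and seg[1] == ref_qualifier and ref_value is None:
            ref_value = seg[2].strip() or None
    # A mismatch only exists when both locations are populated and disagree.
    mismatch = big04 is not None and ref_value is not None and big04 != ref_value
    return (big04 or ref_value), mismatch


raw = "BIG*20240105*INV-778**PO-1001~REF*PO*PO-1002"
segments = [s.split("*") for s in raw.split("~")]
join_key, mismatch = resolve_po_reference(segments)
print(join_key, mismatch)  # PO-1001 True
```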

Should I match line items by IT101 sequence or by product ID?

Use the product ID (PO106/PO107 on the order, IT106/IT107 on the invoice) as the authoritative line key after alias resolution, and treat the line sequence (PO101/IT101) as a fallback ordinal only when product identifiers are missing. Sequence-only alignment breaks the moment a supplier reorders, splits, or consolidates lines on the invoice, which is common with partial shipments.
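One way to sketch that pairing rule, assuming alias resolution has already run and each line is a plain dict with hypothetical `seq` and `product_id` keys. It does not handle a PO that lists the same product on two lines; that case would need a composite key.

```python
from typing import Dict, List, Optional, Tuple

Line = Dict[str, str]  # hypothetical canonical line: {"seq": ..., "product_id": ...}


def pair_lines(po_lines: List[Line], inv_lines: List[Line]) -> List[Tuple[Optional[Line], Line]]:
    """Pair each invoice line with a PO line: product ID first, sequence as fallback."""
    by_product = {line["product_id"]: line for line in po_lines if line.get("product_id")}
    by_seq = {line["seq"]: line for line in po_lines if line.get("seq")}
    pairs: List[Tuple[Optional[Line], Line]] = []
    for inv in inv_lines:
        product_id = inv.get("product_id")
        if product_id:
            match = by_product.get(product_id)
        else:
            # No product identifier on the invoice line: fall back to the ordinal.
            match = by_seq.get(inv.get("seq", ""))
        pairs.append((match, inv))
    return pairs


po_lines = [{"seq": "1", "product_id": "SKU-A"}, {"seq": "2", "product_id": "SKU-B"}]
# The supplier reordered the lines and omitted the product ID on the second one.
inv_lines = [{"seq": "1", "product_id": "SKU-B"}, {"seq": "2", "product_id": ""}]
pairs = pair_lines(po_lines, inv_lines)
print([(po["seq"] if po else None, inv["seq"]) for po, inv in pairs])  # [('2', '1'), ('2', '2')]
```

An unmatched invoice line comes back paired with `None`, which leaves the decision about how to classify it to the match engine.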

Why must monetary values be parsed as Decimal rather than float?

Floating-point arithmetic introduces representation error that can push an otherwise-clean line just outside a tight price window, generating a false PRICE_TOLERANCE_EXCEEDED. Carrying every amount as a fixed-point Decimal from the parser through the tolerance check keeps the comparison exact and the audit trail defensible.
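A two-line cross-foot is enough to show the effect. The amounts are illustrative: two line totals of 0.10 and 0.20 checked against a stated total of 0.30.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 or 0.20 exactly, so the sum drifts.
float_total = 0.10 + 0.20
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```

Note that the `Decimal` values are built from strings. Constructing them from floats, as in `Decimal(0.10)`, would carry the representation error into the fixed-point value.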

How do I stop a redelivered 810 from paying a supplier twice?

Persist an idempotency key of po_number + invoice_number + line_sequence before you post to ERP. A VAN redelivery or AS2 retry then maps to the same key and is suppressed at the commit boundary, so the pipeline stays exactly-once even though the transport layer is at-least-once.

What belongs in the mapping layer versus the match engine?

The mapping layer only normalizes both documents into the canonical schema and validates structure; it does not decide whether a pair reconciles. Tier selection (exact, tolerance, similarity) and exception adjudication belong to the match engine downstream, which keeps the mapper stateless and replayable.
