How to Map EDI 810 Invoices to Internal PO Schemas
When an EDI 810 invoice arrives, your reconciliation pipeline must resolve it against an existing 850 purchase order, normalize line-level quantities and pricing, and commit the result to your ERP or data warehouse without introducing duplicate financial records. This guide provides the exact parsing patterns, schema definitions, and recovery mechanisms required to build a production-grade mapping pipeline. The architecture assumes a Python-based ETL stack, Pydantic for validation, and transactional database writes. For foundational design principles governing this workflow, refer to the Core Architecture & Data Mapping for Reconciliation framework before implementing segment-level transformations.
1. X12 810 Parsing & Segment Normalization
EDI 810 files typically use a tilde (~) as the segment terminator and an asterisk (*) as the element delimiter, although each sender declares its actual delimiters in the ISA header. Naive string splitting alone is not enough for production use: a full parser must also track the hierarchical loops (IT1, PID, AMT) and tolerate carriage returns or malformed line breaks. The following implementation is the first stage of such a parser: it normalizes line endings, isolates segments, and groups them by identifier for downstream traversal.
```python
from typing import Dict, List


def parse_edi_810(raw_payload: str) -> Dict[str, List[List[str]]]:
    """
    Splits raw EDI 810 into a structured dictionary of segments.
    Handles ~ terminators, strips trailing whitespace, and preserves element arrays.
    """
    normalized = raw_payload.replace('\r\n', '\n').replace('\r', '\n')
    segments = [seg.strip() for seg in normalized.split('~') if seg.strip()]
    parsed: Dict[str, List[List[str]]] = {}
    for seg in segments:
        parts = seg.split('*')
        seg_id = parts[0]
        parsed.setdefault(seg_id, []).append(parts)
    return parsed
```
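The tilde and asterisk are the most common delimiters, but X12 lets each sender declare its own in the fixed-width, 106-character ISA header. The sketch below reads them from that header instead of hard-coding them. The function names `detect_delimiters` and `split_segments` are illustrative and not part of the pipeline above, and the code assumes the payload begins with a well-formed ISA segment.

```python
from typing import List, Tuple

ISA_LENGTH = 106  # the ISA interchange header is fixed-width in X12


def detect_delimiters(raw_payload: str) -> Tuple[str, str]:
    """Return (element_separator, segment_terminator) read from the ISA header."""
    payload = raw_payload.lstrip()
    if not payload.startswith("ISA") or len(payload) < ISA_LENGTH:
        raise ValueError("payload does not start with a complete ISA segment")
    element_sep = payload[3]                # character right after "ISA"
    segment_term = payload[ISA_LENGTH - 1]  # last character of the ISA segment
    return element_sep, segment_term


def split_segments(raw_payload: str) -> List[List[str]]:
    """Split a payload into segments using the delimiters the sender declared."""
    payload = raw_payload.lstrip()
    element_sep, segment_term = detect_delimiters(payload)
    segments = []
    for chunk in payload.split(segment_term):
        chunk = chunk.strip()  # drops CR/LF that some senders add after the terminator
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments
```

Detecting the delimiters first means the same splitter can serve trading partners that use a pipe, a caret, or a newline in place of the usual characters.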
Configuration Tuning:
- Set `MAX_SEGMENT_LENGTH = 350` to reject truncated transmissions before they hit your mapper.
- Enable `STRICT_LOOP_ORDER = True` if your trading partners consistently violate X12 4010/5010 loop sequencing.
- Log raw payloads to a secure S3 bucket with `KMS` encryption before parsing to satisfy audit requirements.
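One way to carry these settings is a small immutable config object that the parser consults before mapping. This is a sketch only: `ParserConfig` and `find_oversized_segments` are hypothetical names, and the length check shown is one plausible reading of how `MAX_SEGMENT_LENGTH` would be enforced.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParserConfig:
    # Field names mirror the settings listed above; the defaults are illustrative.
    max_segment_length: int = 350
    strict_loop_order: bool = True


def find_oversized_segments(raw_payload: str, config: ParserConfig,
                            terminator: str = "~") -> List[int]:
    """Return the 0-based positions of segments longer than the configured limit."""
    oversized = []
    for index, segment in enumerate(raw_payload.split(terminator)):
        if len(segment.strip()) > config.max_segment_length:
            oversized.append(index)
    return oversized
```

A non-empty result is a reasonable signal to quarantine the file before it reaches the mapper, since a lost terminator tends to fuse two segments into one long one.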
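Debugging Parsing Failures:
- Verify segment terminator consistency using `grep -o '~' <file> | wc -l` against the expected segment count. (`grep -c` counts matching lines rather than occurrences, so it under-reports when the whole interchange sits on one line.)
- If `parts[0]` comes back empty or holds an unrecognized segment ID, check for empty trailing tildes or newlines embedded within element data.
- Validate envelope counts: the number of `ISA` segments must equal the number of `IEA` segments, and `ST` must equal `SE`. Mismatches usually indicate a truncated transmission.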
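The envelope check in the last item can be automated with a simple counter. The sketch below is a minimal version with a hypothetical function name, `check_envelopes`. It also checks the `GS`/`GE` functional group pair, which the list above does not mention but which follows the same pairing rule in X12.

```python
from collections import Counter
from typing import List


def check_envelopes(raw_payload: str, terminator: str = "~",
                    element_sep: str = "*") -> List[str]:
    """Return a list of human-readable envelope problems (empty when balanced)."""
    ids = [
        seg.strip().split(element_sep)[0]
        for seg in raw_payload.split(terminator)
        if seg.strip()
    ]
    counts = Counter(ids)
    problems = []
    for opener, closer in (("ISA", "IEA"), ("GS", "GE"), ("ST", "SE")):
        if counts[opener] != counts[closer]:
            problems.append(
                f"{opener}/{closer} mismatch: {counts[opener]} vs {counts[closer]}"
            )
    return problems
```

Counting pairs catches truncation but not every envelope defect; control-number agreement between each opener and its closer would need a separate check.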
2. Internal PO Schema Definition
Define a strict target schema using Pydantic v2. This prevents downstream type coercion errors and enforces procurement compliance rules before reconciliation begins. The schema below models both header-level financial data and line-item granularity.
```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationInfo, field_validator


class InvoiceLineItem(BaseModel):
    line_number: int = Field(ge=1)
    supplier_sku: str = Field(min_length=1)
    internal_sku: Optional[str] = None
    quantity_invoiced: Decimal = Field(ge=0, decimal_places=2)
    unit_price: Decimal = Field(ge=0, decimal_places=4)
    uom: str = Field(min_length=2, max_length=3, pattern="^[A-Z]{2,3}$")
    # validate_default=True makes the validator below run even when the field is omitted.
    extended_amount: Optional[Decimal] = Field(default=None, validate_default=True)

    @field_validator('extended_amount', mode='before')
    @classmethod
    def calc_extended(cls, v, info: ValidationInfo) -> Decimal:
        if v is not None:
            # mode='before' receives raw input, so coerce to Decimal first.
            return Decimal(str(v)).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
        qty = info.data.get('quantity_invoiced')
        price = info.data.get('unit_price')
        # Compare against None so a legitimate zero quantity or price is accepted.
        if qty is not None and price is not None:
            return (qty * price).quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
        raise ValueError("extended_amount requires quantity_invoiced and unit_price")


class InvoiceHeader(BaseModel):
    invoice_number: str = Field(min_length=1, max_length=20)
    po_number: str = Field(min_length=1, max_length=20)
    invoice_date: date
    supplier_id: str
    currency_code: str = Field(min_length=3, max_length=3)
    line_items: List[InvoiceLineItem]
    total_amount: Decimal = Field(ge=0, decimal_places=2)
```
Schema validation must occur immediately after parsing. Refer to the EDI 810 vs 850 Schema Mapping reference when aligning po_number fields against your internal procurement master.
3. Segment-to-Schema Transformation Logic
Mapping X12 segments to the Pydantic model requires deterministic traversal. The following table defines the rule applied to each source element:
| X12 Segment | Target Field | Transformation Rule |
|---|---|---|
| BIG (02) | `invoice_number` | Strip leading/trailing whitespace. Validate against duplicate index. |
| BIG (01) | `invoice_date` | Parse YYYYMMDD. Reject if future-dated beyond T+3 business days. |
| BIG (04) or REF*PO | `po_number` | Cross-reference against active PO table. Fail unless status is OPEN or PARTIAL. |
| IT1 (01) | `line_number` | Cast to int. Validate sequential ordering. |
| IT1 (02) | `quantity_invoiced` | Cast to Decimal. Apply ROUND_HALF_UP to 2 places. |
| IT1 (04) | `unit_price` | Cast to Decimal. Apply ROUND_HALF_UP to 4 places. |
| IT1 (03) | `uom` | Normalize to uppercase. Map supplier UOM to internal standard (e.g., CS → EA via conversion factor). |
| TDS (01) | `total_amount` | Shift the two implied decimal places (12551 → 125.51), then compare against SUM(line.extended_amount). Allow ±0.05 rounding tolerance. |
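The mapping rules can be sketched as a plain function over the grouped segments produced in section 1. This is an illustrative outline, not the definitive mapper. It assumes the standard X12 004010 element positions (BIG01 invoice date, BIG02 invoice number, BIG04 PO number, IT103 unit of measure, and TDS01 with two implied decimal places). It returns a plain dict rather than the Pydantic model so that it runs with the standard library alone. The names `map_segments` and `total_within_tolerance` are hypothetical.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP
from typing import Any, Dict, List

CENT = Decimal("0.01")


def map_segments(parsed: Dict[str, List[List[str]]]) -> Dict[str, Any]:
    """Turn grouped 810 segments into a plain dict shaped like the target schema."""
    big = parsed["BIG"][0]
    lines = []
    for it1 in parsed.get("IT1", []):
        qty = Decimal(it1[2]).quantize(CENT, rounding=ROUND_HALF_UP)
        price = Decimal(it1[4]).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
        lines.append({
            "line_number": int(it1[1]),
            "quantity_invoiced": qty,
            "uom": it1[3].strip().upper(),
            "unit_price": price,
            "extended_amount": (qty * price).quantize(CENT, rounding=ROUND_HALF_UP),
        })
    # TDS01 carries two implied decimal places, so divide the raw integer by 100.
    total = (Decimal(parsed["TDS"][0][1]) / 100).quantize(CENT)
    return {
        "invoice_date": datetime.strptime(big[1], "%Y%m%d").date(),
        "invoice_number": big[2].strip(),
        "po_number": big[4].strip(),
        "line_items": lines,
        "total_amount": total,
    }


def total_within_tolerance(invoice: Dict[str, Any],
                           tolerance: Decimal = Decimal("0.05")) -> bool:
    """Check the header total against the sum of line extended amounts."""
    line_sum = sum((line["extended_amount"] for line in invoice["line_items"]),
                   Decimal("0"))
    return abs(invoice["total_amount"] - line_sum) <= tolerance
```

The resulting dict still lacks `supplier_id`, `supplier_sku`, and `currency_code`, which come from segments the table does not cover, so it would need enriching before being handed to `InvoiceHeader`.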
UOM & Currency Normalization:
- Maintain a lookup table mapping supplier UOMs to base units. Multiply `quantity_invoiced` by the conversion factor before committing, and divide `unit_price` by the same factor so the extended amount is preserved.
- If `currency_code != USD`, apply daily FX rates from your treasury API. Store both original and converted amounts for audit trails.
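A minimal sketch of both normalizations follows. The conversion table, its factors, and the function names are hypothetical placeholders. The FX rate is passed in as an argument because the treasury API is not specified here.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, Tuple

# Hypothetical conversion table: supplier UOM -> (base UOM, base units per supplier unit).
UOM_FACTORS: Dict[str, Tuple[str, Decimal]] = {
    "EA": ("EA", Decimal("1")),
    "CS": ("EA", Decimal("12")),
    "DZ": ("EA", Decimal("12")),
}


def to_base_units(quantity: Decimal, unit_price: Decimal,
                  uom: str) -> Tuple[Decimal, Decimal, str]:
    """Convert a line to base units; qty * price is preserved up to rounding."""
    base_uom, factor = UOM_FACTORS[uom.upper()]  # KeyError signals an unmapped UOM
    base_qty = (quantity * factor).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    base_price = (unit_price / factor).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
    return base_qty, base_price, base_uom


def convert_amount(amount: Decimal, currency_code: str,
                   usd_rate: Decimal) -> Dict[str, object]:
    """Return both the original and the USD amount so the audit trail keeps each."""
    converted = amount if currency_code == "USD" else amount * usd_rate
    return {
        "original_amount": amount,
        "original_currency": currency_code,
        "usd_amount": converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP),
    }
```

Letting an unmapped UOM raise `KeyError` is deliberate in this sketch: silently defaulting to a factor of 1 would understate or overstate quantities without any trace.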
4. Production Debugging & Exception Recovery
Reconciliation pipelines fail predictably. Implement the following diagnostic workflow to isolate and resolve mapping errors:
- PO Mismatch (`404` or `STALE`)
  - Query: `SELECT status, last_updated FROM purchase_orders WHERE po_number = ?`
  - Action: If `CLOSED`, route to `AP_HOLD`. If `DRAFT`, trigger PO activation webhook.
  - Log: `{"error": "PO_NOT_FOUND", "po": po_number, "invoice": invoice_number}`
- Quantity/Price Variance Exceeds Tolerance
  - Threshold: `abs(invoiced_qty - po_qty) / po_qty > 0.02` or `abs(price_diff) > 0.05`
  - Action: Flag line item for manual review. Do not auto-reject unless variance > 15%.
  - Debug: Compare `IT1` elements against `850` baseline. Check for partial shipments or backorder splits.
- Duplicate Invoice Submission
  - Constraint: Unique composite index on `(supplier_id, invoice_number, currency_code)`
  - Action: Return `HTTP 409 Conflict`. Log duplicate attempt. Do not process.
  - Validation: Pydantic `@model_validator` can pre-check against a Redis cache of recently processed invoices.
- Schema Drift from Trading Partner
  - Symptom: Unexpected `NTE` or `AMT` loops breaking `IT1` parsing.
  - Action: Set `STRICT_LOOP_ORDER = False` temporarily. Implement fallback regex extraction for critical fields.
  - Monitor: Track segment count deviations using Prometheus metrics (`edi_segment_anomalies_total`).
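The variance thresholds above translate into a small classifier. The sketch below is one possible reading: it assumes the 15% auto-reject limit applies to the relative quantity variance, which the text leaves open. The function name and return labels are illustrative.

```python
from decimal import Decimal

QTY_TOLERANCE = Decimal("0.02")    # 2% relative quantity variance
PRICE_TOLERANCE = Decimal("0.05")  # absolute unit-price difference
AUTO_REJECT = Decimal("0.15")      # assumed to apply to quantity variance


def classify_line(invoiced_qty: Decimal, po_qty: Decimal,
                  invoiced_price: Decimal, po_price: Decimal) -> str:
    """Return 'OK', 'REVIEW', or 'REJECT' for a single invoice line."""
    if po_qty == 0:
        return "REVIEW"  # relative variance is undefined; let a person decide
    qty_variance = abs(invoiced_qty - po_qty) / po_qty
    price_diff = abs(invoiced_price - po_price)
    if qty_variance > AUTO_REJECT:
        return "REJECT"
    if qty_variance > QTY_TOLERANCE or price_diff > PRICE_TOLERANCE:
        return "REVIEW"
    return "OK"
```

Using `Decimal` for the thresholds keeps the comparison exact; with floats, a variance that is nominally equal to 2% could land on either side of the boundary.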
5. Transactional Commit & Audit Compliance
Once validation passes, execute an idempotent upsert to prevent financial duplication. Wrap the operation in a database transaction with explicit rollback on constraint violation.
```python
import psycopg2
from psycopg2.extras import execute_values


def commit_reconciliation(invoice: InvoiceHeader, db_conn):
    with db_conn.cursor() as cur:
        try:
            # Header insert
            cur.execute(
                """INSERT INTO invoices (invoice_number, po_number, invoice_date,
                       supplier_id, currency_code, total_amount, status)
                   VALUES (%s, %s, %s, %s, %s, %s, 'RECONCILED')
                   ON CONFLICT (supplier_id, invoice_number, currency_code)
                   DO UPDATE SET status = EXCLUDED.status""",
                (invoice.invoice_number, invoice.po_number, invoice.invoice_date,
                 invoice.supplier_id, invoice.currency_code, invoice.total_amount)
            )
            # Line items batch insert
            line_data = [
                (invoice.invoice_number, item.line_number, item.supplier_sku,
                 item.internal_sku, item.quantity_invoiced, item.unit_price,
                 item.uom, item.extended_amount)
                for item in invoice.line_items
            ]
            execute_values(cur,
                """INSERT INTO invoice_lines (invoice_number, line_number, supplier_sku,
                       internal_sku, qty_invoiced, unit_price, uom, extended_amount)
                   VALUES %s
                   ON CONFLICT DO NOTHING""",
                line_data
            )
            db_conn.commit()
        except Exception as e:
            db_conn.rollback()
            raise RuntimeError(f"Transaction failed for invoice {invoice.invoice_number}") from e
```
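To see why the upsert is idempotent, it can help to replay the same header twice and count rows. The sketch below swaps in SQLite's in-memory database as a stand-in for PostgreSQL so that it runs without a server. It assumes SQLite 3.24 or later for `ON CONFLICT ... DO UPDATE`, and the trimmed table layout is illustrative only.

```python
import sqlite3


def upsert_invoice(conn: sqlite3.Connection, supplier_id: str, invoice_number: str,
                   currency_code: str, total_amount: str) -> None:
    """Insert the header once; a replay only refreshes the status column."""
    with conn:  # commits on success, rolls back if the block raises
        conn.execute(
            """INSERT INTO invoices (supplier_id, invoice_number, currency_code,
                                     total_amount, status)
               VALUES (?, ?, ?, ?, 'RECONCILED')
               ON CONFLICT (supplier_id, invoice_number, currency_code)
               DO UPDATE SET status = excluded.status""",
            (supplier_id, invoice_number, currency_code, total_amount),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE invoices (
           supplier_id TEXT, invoice_number TEXT, currency_code TEXT,
           total_amount TEXT, status TEXT,
           UNIQUE (supplier_id, invoice_number, currency_code))"""
)
upsert_invoice(conn, "SUP-1", "INV-1001", "USD", "125.51")
upsert_invoice(conn, "SUP-1", "INV-1001", "USD", "125.51")  # replayed delivery
row_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
```

The composite unique constraint is what makes the replay safe; without it, the second call would insert a second financial record.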
Final Validation Checklist:
Adhere to Pydantic v2 validation patterns when extending field-level checks, and consult the ASC X12 Standards Documentation for version-specific segment requirements. This pipeline is designed to provide deterministic mapping, audit-ready state transitions, and financial drift bounded by the rounding tolerances defined above.