Data Mapping & Transformation: Handling Different APIs & Data Formats

Master field mapping, data normalization, unit conversion, currency handling, and category taxonomy mapping across eCommerce APIs and data formats.

E

ECOSIRE Research and Development Team

ECOSIRE-Team

15. März 202610 Min. Lesezeit2.2k Wörter

Dieser Artikel ist derzeit nur auf Englisch verfügbar. Die Übersetzung folgt bald.

Teil unserer eCommerce Integration-Serie

Den vollständigen Leitfaden lesen

Data Mapping & Transformation: Handling Different APIs & Data Formats

Every eCommerce platform speaks a different language. Amazon sends orders as XML with nested address objects. Shopify returns JSON with flat string fields. eBay uses a mix of REST and legacy XML-RPC. WooCommerce embeds metadata in key-value arrays. Your ERP expects everything in a specific internal format with validated data types.

Data mapping and transformation is the translation layer that makes multi-channel integration work. Get it right and data flows silently between systems. Get it wrong and you spend hours debugging why customer phone numbers are populating the city field or why product weights are off by a factor of 2.2.

Key Takeaways

  • A canonical data model (internal standard) eliminates N-to-N mapping in favor of N-to-1 plus 1-to-N
  • Unit conversion errors are the most common and most expensive data mapping bug in cross-border eCommerce
  • Defensive parsing — validate every field, default every missing value — prevents cascading failures
  • Version your mappings alongside your code; API changes break integrations silently without versioned schemas

The Canonical Data Model

Without a canonical model, connecting 5 channels to your ERP requires 10 unique mappings (5 inbound + 5 outbound), each handling the other system's quirks. Adding a sixth channel requires 2 more mappings.

With a canonical model, each channel maps to and from a single internal format. Adding a sixth channel requires only 1 new inbound mapper and 1 new outbound mapper — regardless of how many other channels exist.

Designing the Canonical Model

Your canonical model should be:

  • Superset of all channels: Include every field any channel might need, even if some channels do not use every field
  • Strongly typed: Dates are ISO 8601, weights are in grams, currencies use ISO 4217 codes, prices are integers (cents) not floats
  • Versioned: Schema changes are explicit and backward-compatible
  • Documented: Every field has a description, data type, validation rule, and source mapping

Example: Canonical Order Model

A simplified canonical order:

| Field | Type | Source: Shopify | Source: Amazon | Source: eBay | |-------|------|----------------|---------------|-------------| | externalOrderId | string | order.id | AmazonOrderId | OrderID | | customerEmail | string | order.email | BuyerInfo.BuyerEmail | TransactionArray.Transaction.Buyer.Email | | shippingName | string | order.shipping_address.name | ShippingAddress.Name | ShippingAddress.Name | | lineItems[].sku | string | line_items[].sku | OrderItems[].SellerSKU | TransactionArray.Transaction.Item.SKU | | lineItems[].quantity | integer | line_items[].quantity | OrderItems[].QuantityOrdered | TransactionArray.Transaction.QuantityPurchased | | lineItems[].priceInCents | integer | line_items[].price * 100 | OrderItems[].ItemPrice.Amount * 100 | TransactionArray.Transaction.TransactionPrice * 100 | | currency | string (ISO 4217) | order.currency | OrderTotal.CurrencyCode | TransactionArray.Transaction.AmountPaid.currencyID | | shippingMethod | enum | order.shipping_lines[0].title | ShipServiceLevel | ShippingServiceSelected.ShippingService | | orderDate | string (ISO 8601) | order.created_at | PurchaseDate | CreatedDate |

Notice how every source maps to the same canonical structure. The transformation handles path differences (nested vs flat), naming differences (camelCase vs PascalCase vs snake_case), and format differences (dates, numbers, currencies).


Common Mapping Challenges and Solutions

Data mapping is full of edge cases. Here are the most common problems and how to solve them.

| Challenge | Example | Solution | |----------|---------|----------| | Missing fields | eBay does not send customer email for guest checkout | Default to empty string, flag for manual review | | Different date formats | Shopify: ISO 8601, Amazon: ISO 8601, eBay: US format sometimes | Parse with a library (dayjs, date-fns), always store as ISO 8601 | | Price as float vs integer | Shopify: "19.99" (string), Amazon: 19.99 (float) | Multiply by 100, round, store as integer cents | | Name splitting | One field: "John Smith" vs two fields: first/last | Split on last space, handle edge cases (Jr., III, van der) | | Address formatting | US: state as 2-letter code, UK: no state, DE: different format | Normalize to structured address (line1, line2, city, state, postal, country) | | Phone number formats | "+1 (555) 123-4567" vs "5551234567" vs "+15551234567" | Strip non-digits, parse with libphonenumber, store in E.164 format | | Weight units | Shopify: pounds/ounces, Amazon: configurable, eBay: varies | Convert everything to grams internally, convert outbound per channel | | HTML in text fields | Description with HTML tags vs plain text requirement | Strip HTML for plain text channels, preserve for HTML channels | | Enum mismatches | Order status: "paid" vs "Completed" vs "CONFIRMED" | Map to internal enum via lookup table | | Null vs empty string | Some APIs distinguish null (not provided) from "" (explicitly empty) | Normalize to null for missing, "" for explicitly empty |


Unit Conversion

Unit conversion errors cause real financial damage. A product listed at 2.2 kg on your site appearing as 2.2 lbs on Amazon means shipping cost estimates are wrong, dimensional weight calculations are wrong, and customers receive a product twice as heavy as expected.

Weight Conversion

| From | To Grams | To Ounces | To Pounds | To Kilograms | |------|---------|----------|----------|-------------| | 1 gram | 1 | 0.03527 | 0.002205 | 0.001 | | 1 ounce | 28.3495 | 1 | 0.0625 | 0.02835 | | 1 pound | 453.592 | 16 | 1 | 0.45359 | | 1 kilogram | 1000 | 35.274 | 2.20462 | 1 |

Rule: Store all weights in grams internally. Convert outbound to whatever unit each channel requires. Never trust the unit label from incoming data — validate that the value makes sense for the product category. A laptop weighing 2 grams is obviously in kilograms.

Dimension Conversion

Dimensions are equally treacherous. Amazon US expects inches. Amazon DE expects centimeters. Your shipping software might need millimeters.

Rule: Store all dimensions in millimeters internally. Convert outbound per channel. Validate that dimensions are physically plausible.

Currency Conversion

Multi-currency handling adds another layer. Your canonical model stores prices in the smallest unit of the base currency (cents for USD, pence for GBP, centimes for EUR).

For cross-border orders, store both the original currency amount and the converted base currency amount with the exchange rate used. This creates an audit trail for currency-related discrepancies.


Data Normalization Patterns

Raw marketplace data is messy. Normalization cleans it up before it enters your canonical model.

Text Normalization

  • Trim whitespace: Leading and trailing spaces are common in API responses
  • Normalize Unicode: Convert full-width characters, smart quotes, and special characters to their ASCII equivalents where appropriate
  • Case standardization: Store internal data in a consistent case (e.g., UPPER for country codes, Title Case for names, lower for emails)
  • HTML entity decoding: &amp; to &, &lt; to <, etc.

Address Normalization

Addresses are the most inconsistent data type across channels. A normalization pipeline should:

  1. Parse free-text addresses into structured components (street, city, state, postal, country)
  2. Validate postal codes against country format rules
  3. Normalize country to ISO 3166-1 alpha-2 codes (US, GB, DE — not "United States", "UK", "Germany")
  4. Normalize state/province to standard abbreviations
  5. Validate that city/state/postal combinations are geographically consistent

SKU Normalization

SKUs from different sources may use different formats for the same product:

  • Supplier: "ABC-001-BLK-L"
  • Amazon: "ABC001BLKL"
  • Shopify: "abc-001-black-large"
  • eBay: "ABC 001 Black L"

Your canonical model should use a single internal SKU format and maintain a lookup table mapping external SKU formats to internal IDs.


API Format Handling

Different APIs return data in different formats. Your transformation layer must handle all of them.

JSON (Shopify, Walmart, TikTok Shop)

Most modern APIs use JSON. Parsing is straightforward, but watch for:

  • Numeric precision: JSON numbers can lose precision for large integers (order IDs above 2^53). Parse as strings if needed.
  • Nested structures: Shopify nests shipping addresses inside orders inside the response. Use proper path navigation.
  • Pagination: Cursor-based (Shopify) or page-based. Handle rate limiting between pages.

XML (Amazon SP-API reports, eBay)

XML adds complexity with namespaces, attributes vs elements, and encoding declarations.

  • Namespace handling: Amazon reports use XML namespaces that must be registered before XPath queries work
  • CDATA sections: Text content may be wrapped in CDATA, which some parsers strip and others preserve
  • Character encoding: Always parse as UTF-8. Some legacy feeds declare ISO-8859-1.

CSV/TSV (Google Shopping, Amazon flat files)

Feed-based channels accept tabular data.

  • Column order matters: Some feeds are position-dependent, not header-dependent
  • Escaping: Fields containing commas must be quoted. Fields containing quotes must use double-quoting.
  • Encoding: BOM (Byte Order Mark) at file start causes parsing failures in some systems. Strip it.
  • Line endings: Windows (CRLF) vs Unix (LF). Normalize before parsing.

EDI (Enterprise retail, 3PLs)

Electronic Data Interchange is still used by large retailers and 3PLs. EDI documents (850 Purchase Order, 856 Advance Ship Notice, 810 Invoice) use fixed-width or delimiter-separated formats defined by X12 or EDIFACT standards.


Error Handling in Transformation

When data does not match your expected schema, the transformation layer must decide: fail, default, or flag.

Strategy Matrix

| Error Type | Strategy | Example | |-----------|---------|---------| | Missing required field | Fail (reject the record) | Order without customer email | | Missing optional field | Default value | No phone number — default to null | | Invalid format | Attempt correction, flag if unable | Date "03/15/2026" parsed as ISO | | Out of range value | Flag for review | Weight of 0 grams (likely missing) | | Unknown enum value | Map to "other", flag for review | New shipping method not in lookup | | Encoding issues | Clean and log | Mojibake in product titles | | Schema version mismatch | Transform using version adapter | API v2 response to v3 handler |

Validation Pipeline

Every record should pass through a validation pipeline after transformation:

  1. Schema validation: Does the record match the expected structure?
  2. Type validation: Are numbers actually numbers, dates actually dates?
  3. Business rule validation: Is the order total positive? Is the shipping address in a country you serve?
  4. Referential validation: Does the SKU exist in your product catalog?

Records that fail validation are quarantined — stored in an error queue for manual review rather than silently dropped or processed with bad data.

For monitoring these validation failures, see Integration Monitoring.


Versioning and Change Management

APIs change. Shopify introduces a new API version every quarter. Amazon updates SP-API models periodically. eBay deprecates legacy endpoints. Your mapping layer must handle these changes without downtime.

Versioning Strategy

  • Pin API versions: Always specify the API version you are calling. Shopify lets you request 2025-01. Amazon SP-API uses dated model versions.
  • Version your mappers: When a channel API changes, create a new mapper version rather than modifying the existing one. Run both versions in parallel during the transition.
  • Automated regression tests: For each mapper, maintain a set of sample inputs and expected outputs. When a mapper changes, tests catch unintended regressions.
  • Deprecation monitoring: Subscribe to API changelogs and sunset notifications. Plan migrations 60 days before deprecation dates.

For the complete integration architecture, see the pillar post: The Ultimate eCommerce Integration Guide.


Frequently Asked Questions

How do I handle fields that exist on one channel but not another?

Your canonical model includes the superset of all fields. When transforming inbound data from a channel that lacks a field, set it to null or a sensible default. When transforming outbound to a channel that does not accept a field, simply omit it. The canonical model acts as a universal translator — not every language has a word for every concept, and that is fine.

What is the best library for data transformation in a Node.js stack?

For JSON transformations, libraries like JSONata, Lodash (for path access and manipulation), and Zod (for validation) cover most needs. For XML, use fast-xml-parser for parsing and xmlbuilder2 for construction. For CSV, Papa Parse handles edge cases well. For complex ETL pipelines, consider Apache NiFi or custom transformation functions with thorough unit tests.

How do I test data mappings without hitting live APIs?

Record real API responses as fixtures and use them in unit tests. Each mapper should have a comprehensive test suite with real-world examples, edge cases (empty fields, maximum lengths, special characters), and error cases (malformed data). Run these tests in CI/CD on every commit that modifies mapping code. Tools like Nock (Node.js) or WireMock (Java) can mock API endpoints for integration tests.

Should I use an ETL tool or write custom transformation code?

For standard eCommerce integrations with well-known platforms, custom code in your application layer (Node.js/TypeScript or Python) is more maintainable than a separate ETL tool. ETL platforms (Fivetran, Airbyte, Apache NiFi) add value when you are integrating 20+ data sources with complex transformation pipelines. For 3-8 channel eCommerce integrations, purpose-built mappers with good test coverage are simpler and more debuggable.


What Is Next

Data mapping is the unglamorous foundation that makes multi-channel integration reliable. When your transformation layer handles every edge case gracefully, the rest of your integration stack operates on clean, consistent, validated data — and the late-night debugging sessions disappear.

Explore ECOSIRE's integration services for pre-built data mappers connecting Odoo to 15+ marketplaces, or contact our team to discuss custom transformation requirements for your integration.


Published by ECOSIRE — helping businesses scale with AI-powered solutions across Odoo ERP, Shopify eCommerce, and OpenClaw AI.

E

Geschrieben von

ECOSIRE Research and Development Team

Entwicklung von Enterprise-Digitalprodukten bei ECOSIRE. Einblicke in Odoo-Integrationen, E-Commerce-Automatisierung und KI-gestützte Geschäftslösungen.

Chatten Sie auf WhatsApp