Published - 2 months ago | 7 min read

Model Once, Represent Everywhere: How Netflix’s Unified Data Architecture (UDA) Solves the Spider-Man Problem

Imagine watching Stranger Things on Netflix. Now picture all the behind-the-scenes work that gets a new episode from a director’s camera to your living room—editing, cataloging, translating, encoding, reporting, promoting, even recommending. Each department and platform—production, media processing, marketing, streaming, analytics—has to handle information about the same episode: its name, cast, season, assets, language versions, and more.

Now, what happens if one team calls the show a “series,” another calls it a “title,” and a third models it as a collection of “assets”? Multiply that by every film, series, game, live event, or ad Netflix manages. Suddenly, you have dozens of models of the same core business concepts, each isolated, slightly different, and often conflicting. 

This is the “Spider-Man Pointing Meme” of data: every system is pointing at the same thing, but nobody agrees on what it is. 

Netflix, like most tech-driven companies, hit a wall: inconsistent models, broken references, and data that doesn’t connect across microservices. The answer? Model once—represent everywhere. That’s the core promise of UDA (Unified Data Architecture), Netflix’s ambitious approach to semantic data integration at a global scale.

Let’s break down why this matters, how it works, and how Netflix is quietly reshaping the data plumbing behind the world’s favorite streaming platform.

The Pain of Isolated Data Models

The Real-World Problem

Imagine you’re launching a new Netflix original movie. Here’s how the same concept—say, “The Gray Man”—gets modeled in different systems:
- Enterprise GraphQL Gateway: Defines a Movie type with actors, genres, and release date.
- Asset Management Platform: Tracks media files as “assets” linked to production IDs.
- Encoding Pipelines: Sees the movie as a set of video/audio encodings with technical metadata.
- Ad Platform: Treats the title as an ad inventory unit.

Each system uses its own language, definitions, and data structures. When the marketing team wants to launch a campaign, they struggle to join data from these sources. Worse, technical teams spend weeks mapping, reconciling, and fixing mismatched definitions—work that multiplies with every new service or data product.

Key Issues:

- Duplicated, inconsistent models
- Terminology chaos (series vs. title vs. asset)
- Broken references and data quality nightmares
- Limited connectivity—data can’t flow or be joined across systems

Introducing UDA—Unified Data Architecture

Netflix’s UDA solves the Spider-Man problem with a deceptively simple principle: “Define your business concepts once, then project (represent) those concepts everywhere you need them.” UDA lets Netflix model “Movie,” “Actor,” or “Game” in a single place, with rich semantics and relationships, and then automatically generate consistent technical schemas for each system: GraphQL, SQL, Avro, Java, and more.

How UDA Works (The Netflix Way)

- Model Once: Domain experts and engineers collaborate to define core business concepts in a single, shared model.
- Connect and Catalog: Each model is mapped to real-world data containers—databases, APIs, file stores—so everyone knows where and how to find the data behind the concept.
- Transpile & Project: UDA can automatically generate the schemas needed for each system (e.g., GraphQL types, SQL tables) directly from the conceptual model, keeping them in sync.
- Faithful Data Movement: Data flows between systems—say, from asset management to analytics—without losing meaning or integrity, because all sides speak the same language.
- Discovery & Graph Traversal: Anyone can search or programmatically explore the unified model, following relationships (e.g., “which movies does this actor appear in?”) without guessing terminology.
- Code Integration: APIs and SDKs let engineers interact with the unified model in their language of choice.
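To make the "model once, project everywhere" steps concrete, here is a minimal Python sketch. It is not Netflix's implementation: the `Concept` and `Attribute` classes, the type tables, and the `to_graphql` / `to_sql` helpers are all hypothetical names invented for illustration. It only shows the shape of the idea, where one conceptual definition drives two generated schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: str              # conceptual type: "string", "int", "boolean", or "date"
    required: bool = True


@dataclass(frozen=True)
class Concept:
    name: str
    attributes: tuple


# Hypothetical type tables; a real system would cover far more cases.
# GraphQL has no built-in date scalar, so dates fall back to String here.
GRAPHQL_TYPES = {"string": "String", "int": "Int", "boolean": "Boolean", "date": "String"}
SQL_TYPES = {"string": "VARCHAR", "int": "INTEGER", "boolean": "BOOLEAN", "date": "DATE"}


def to_graphql(concept: Concept) -> str:
    """Project the concept into a GraphQL type definition."""
    lines = [f"type {concept.name} {{"]
    for attr in concept.attributes:
        bang = "!" if attr.required else ""
        lines.append(f"  {attr.name}: {GRAPHQL_TYPES[attr.kind]}{bang}")
    lines.append("}")
    return "\n".join(lines)


def to_sql(concept: Concept) -> str:
    """Project the same concept into a CREATE TABLE statement."""
    columns = []
    for attr in concept.attributes:
        null = " NOT NULL" if attr.required else ""
        columns.append(f"  {attr.name} {SQL_TYPES[attr.kind]}{null}")
    return f"CREATE TABLE {concept.name.lower()} (\n" + ",\n".join(columns) + "\n);"


# The single conceptual definition that both projections read from.
movie = Concept(
    "Movie",
    (Attribute("title", "string"), Attribute("release_date", "date", required=False)),
)

print(to_graphql(movie))
print(to_sql(movie))
```

Because both outputs are derived from the same `movie` object, renaming or adding an attribute changes the GraphQL type and the SQL table together.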

Think of UDA as Netflix’s “Rosetta Stone” for business data.

A Netflix Analogy—How UDA Works Behind the Scenes

Let’s use a Netflix example. Suppose you’re working on two Netflix projects:
- Launching a New Show
- Generating a “Top 10” Report

Old Ways:

- Each app, database, and report defines its own “Show” object.
- Data engineers write custom scripts to map/convert fields between systems.
- Any change to “Show” (e.g., adding “IsInteractive”) requires manual updates across every schema.
- Reports often break due to misaligned field names or data types.

With UDA:

- The product team defines “Show” once in UDA, complete with title, cast, genres, release window, and so on.
- UDA automatically generates GraphQL types for the API, Avro schemas for data pipelines, and SQL for reporting.
- All systems refer to the same concept, with mapped relationships and attributes.
- When “IsInteractive” is added, UDA propagates the change to every system’s schema.
- Generating a “Top 10” report? Just ask for “Show” and its relationships; UDA’s knowledge graph connects the dots across APIs and tables.

UDA lets Netflix scale new features, automate schema updates, and join data reliably—all while reducing duplicated effort and human error.
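The "IsInteractive" scenario can be pictured as a drift check between the shared model and each system's schema. The sketch below is an assumption-laden toy, not UDA's actual mechanism: the `missing_fields` function, the field names, and the three system names are hypothetical. It reports which systems have not yet picked up a field that exists in the model.

```python
def missing_fields(model, systems):
    """Return {system: [fields defined in the model but absent from that system]}.

    Systems that are fully in sync with the model are omitted from the result.
    """
    gaps = {}
    for system, present in systems.items():
        missing = [name for name in model if name not in present]
        if missing:
            gaps[system] = missing
    return gaps


# Hypothetical conceptual model for "Show" after "is_interactive" was added.
show_model = {"title": "string", "cast": "list", "is_interactive": "boolean"}

# Hypothetical snapshot of the fields each system currently exposes.
system_schemas = {
    "graphql": {"title", "cast"},
    "avro": {"title", "cast", "is_interactive"},
    "sql": {"title"},
}

print(missing_fields(show_model, system_schemas))
```

In a generated-schema workflow this report would ideally always be empty, because the schemas are regenerated from the model rather than edited by hand.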

The Technical Backbone—Knowledge Graphs and Upper

UDA is a Knowledge Graph

At its core, UDA is a knowledge graph—a graph-based model that connects business concepts, technical schemas, and real-world data containers. Instead of isolated definitions, everything’s a node with relationships:
- “Actor” → appears_in → “Show”
- “Show” → has_asset → “Media File”
- “Show” → has_ad_slot → “Ad Inventory”

Technologies:
- RDF (Resource Description Framework): Provides the flexible graph structure.
- SHACL (Shapes Constraint Language): Supports validation and constraint checks.
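The node-and-relationship examples above can be written as subject, predicate, object triples, which is the basic shape of RDF data. The following sketch uses plain Python tuples rather than an RDF library, and the `neighbors` and `reachable` helpers are hypothetical names, so treat it as an illustration of graph traversal and not as how UDA stores its graph.

```python
from collections import deque

# The three relationships from the text, as (subject, predicate, object) triples.
TRIPLES = [
    ("Actor", "appears_in", "Show"),
    ("Show", "has_asset", "Media File"),
    ("Show", "has_ad_slot", "Ad Inventory"),
]


def neighbors(subject, predicate=None):
    """Objects linked from `subject`, optionally filtered by predicate."""
    return [
        o for s, p, o in TRIPLES
        if s == subject and (predicate is None or p == predicate)
    ]


def reachable(start):
    """Every node that can be reached from `start` by following edges forward."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(neighbors("Show", "has_asset"))
print(sorted(reachable("Actor")))
```

Starting from "Actor", the traversal reaches "Show" directly and then the media file and ad inventory through it, which is the kind of connection that isolated schemas cannot express.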

Upper: The Model of All Models

UDA’s secret sauce is a metamodel called Upper. Upper is a formal language for describing any domain—business or system—in a way that’s introspectable, versionable, and programmatically accessible. It enables:
- Controlled vocabularies (think: business glossaries)
- Taxonomies and hierarchies (e.g., genre trees)
- Domain extensions (e.g., “InteractiveShow” extends “Show”)
- Rich datatypes and relationships

Upper itself is self-referencing, self-describing, and self-validating. It lets UDA bootstrap its own infrastructure: every other model, schema, or mapping is an extension of Upper.

Analogy: Upper is like Netflix’s “master screenplay”—every character, plot point, and relationship is defined in one canonical place.
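What "self-describing" means can be shown with a toy example. This is not Upper's actual design, which the article does not spell out: the `GRAPH` dictionary, the `kind` and `extends` keys, and the `validate` function are assumptions made for illustration. The point is that the entry describing what a class is lives in the same structure as the domain classes, and is checked by the same rules.

```python
# Each entry names its own kind; "Class" is described in terms of itself.
GRAPH = {
    "Class": {"kind": "Class"},                              # the metamodel entry
    "Show": {"kind": "Class"},                               # a domain concept
    "InteractiveShow": {"kind": "Class", "extends": "Show"},  # a domain extension
}


def validate(graph):
    """Check that every entry's kind and parent are themselves defined in the graph."""
    errors = []
    for name, node in graph.items():
        kind = node.get("kind")
        if kind not in graph:
            errors.append(f"{name}: unknown kind {kind!r}")
        parent = node.get("extends")
        if parent is not None and parent not in graph:
            errors.append(f"{name}: unknown parent {parent!r}")
    return errors


print(validate(GRAPH))  # the same rules validate "Class" and the domain models

# A broken extension is caught by the same check.
broken = dict(GRAPH, LiveEvent={"kind": "Class", "extends": "Broadcast"})
print(validate(broken))
```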

Connecting Models to Data—Representations, Mappings, and Projections

1. Data Container Representations

Each system—GraphQL, Data Mesh, Iceberg tables, Java APIs—stores information differently. UDA represents each container’s schema as part of the knowledge graph, mapping every field, type, and relationship.

2. Mappings

Mappings are the glue. They connect the high-level “Show” model in UDA to the actual database table, API endpoint, or file column where its data lives. Need to know which table holds “Actor”? Follow the mapping in the graph.

Mappings also power intent-based automation: Want to move data from the ad platform to analytics? UDA can automatically configure the right data pipelines because it knows the meaning and structure behind every piece.
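A mapping can be thought of as a record that ties one attribute of a concept to a physical location. The sketch below assumes a hypothetical `Mapping` record, a `locate` helper, and made-up container locations such as `catalog.actors.full_name`; none of these come from Netflix. It answers the "which table holds Actor?" question by lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    concept: str      # conceptual model name, e.g. "Actor"
    attribute: str    # attribute on that concept
    container: str    # kind of data container, e.g. "iceberg" or "graphql"
    location: str     # hypothetical physical address of the data


# Hypothetical mappings; the locations are illustrative only.
MAPPINGS = [
    Mapping("Actor", "name", "iceberg", "catalog.actors.full_name"),
    Mapping("Actor", "name", "graphql", "Actor.name"),
    Mapping("Show", "title", "iceberg", "catalog.shows.title"),
]


def locate(concept, attribute=None, container=None):
    """Find where a concept (optionally one attribute, in one container) lives."""
    return [
        m for m in MAPPINGS
        if m.concept == concept
        and (attribute is None or m.attribute == attribute)
        and (container is None or m.container == container)
    ]


for m in locate("Actor", container="iceberg"):
    print(f"{m.concept}.{m.attribute} -> {m.location}")
```

With both ends of a data movement described this way, a tool could in principle match source and target locations by shared concept and attribute instead of by hand-written field lists.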

3. Projections

Projections are UDA’s way of “projecting” the conceptual model into a technical schema for a given system. For example:
- Transpiling a UDA domain model into a GraphQL schema
- Generating an Avro schema for a data pipeline
- Creating a new Iceberg table, ready for population

All with semantic consistency—no more hand-coded schemas.
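As one example of a projection, the sketch below turns a small conceptual definition into an Avro record schema. The `to_avro` function, the `AVRO_TYPES` table, and the "Show" attributes are hypothetical; only the output format follows the Avro specification, where a record lists named fields and an optional field is a union with `"null"` plus a `null` default.

```python
import json

# Hypothetical mapping from conceptual types to Avro primitive types.
AVRO_TYPES = {"string": "string", "int": "int", "boolean": "boolean"}


def to_avro(name, attributes):
    """Build an Avro record schema from (field_name, conceptual_type, required) tuples."""
    fields = []
    for field_name, kind, required in attributes:
        avro_type = AVRO_TYPES[kind]
        if required:
            fields.append({"name": field_name, "type": avro_type})
        else:
            # Optional fields become a nullable union with a null default.
            fields.append({"name": field_name, "type": ["null", avro_type], "default": None})
    return {"type": "record", "name": name, "fields": fields}


show_schema = to_avro(
    "Show",
    [
        ("title", "string", True),
        ("season_count", "int", True),
        ("is_interactive", "boolean", False),
    ],
)

print(json.dumps(show_schema, indent=2))
```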

Real-World Use Cases—PDM and Sphere

1. Primary Data Management (PDM): Controlled Vocabularies

PDM is Netflix’s platform for managing the “official” definitions of things like genres, show statuses, production milestones, or ad types. Here’s how UDA makes this easy:
- Governance: All business terms are defined once, with controlled vocabularies (think: what values are valid for “Show Status”?)
- Automation: UDA auto-generates the schemas and APIs needed for every downstream system.
- Consistency: No matter where you use a genre—marketing, UI, reporting—it’s always the same, thanks to UDA’s knowledge graph.
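A controlled vocabulary is, at its simplest, a closed set of allowed values, similar in spirit to a SHACL `sh:in` constraint. The sketch below is a plain-Python stand-in rather than PDM itself: the `SHOW_STATUS` values, the sample records, and the `invalid_values` helper are all invented for illustration.

```python
# Hypothetical controlled vocabulary for "Show Status".
SHOW_STATUS = frozenset({"in_development", "in_production", "released"})


def invalid_values(records, field, vocabulary):
    """Return (index, value) for every record whose field is outside the vocabulary."""
    return [
        (i, record.get(field))
        for i, record in enumerate(records)
        if record.get(field) not in vocabulary
    ]


# Hypothetical records arriving from a downstream system.
records = [
    {"title": "A", "status": "released"},
    {"title": "B", "status": "Released"},   # wrong casing, not in the vocabulary
    {"title": "C"},                          # status missing entirely
]

print(invalid_values(records, "status", SHOW_STATUS))
```

Checking every system against the same set is what keeps "released" from quietly becoming "Released" or "live" in one team's tables.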

2. Sphere: Operational Reporting

Business users need quick, self-service reports—“Top 10 actors by screen time,” “Shows released this month.” With UDA:
- Users search with familiar terms—no need to know technical table names.
- UDA’s graph connects “actor” to “show” to “asset” to “ad campaign,” no matter where those concepts live.
- Sphere auto-generates correct, joinable SQL queries by traversing the graph.
- The system prevents impossible or semantically wrong queries by leveraging the unified model.

Result: Business users get what they need, while engineers maintain consistency and control.
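The graph-traversal step behind generated SQL can be sketched as a shortest-path search over known join relationships. Everything named here is an assumption: the table names, the join conditions, and the `join_path` and `build_sql` helpers are hypothetical and are not Sphere's API. The sketch also shows how an unconnected pair of tables is rejected instead of producing a meaningless query.

```python
from collections import deque

# Hypothetical join edges between tables, with the condition that links them.
JOINS = {
    ("actor", "appearance"): "actor.id = appearance.actor_id",
    ("appearance", "show"): "appearance.show_id = show.id",
    ("show", "asset"): "show.id = asset.show_id",
}

# Undirected adjacency: table -> {neighbor: join condition}.
ADJ = {}
for (a, b), condition in JOINS.items():
    ADJ.setdefault(a, {})[b] = condition
    ADJ.setdefault(b, {})[a] = condition


def join_path(start, goal):
    """Breadth-first search for the shortest chain of joinable tables."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(ADJ.get(path[-1], {})):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def build_sql(start, goal):
    """Generate a SELECT with the joins along the path, or refuse if none exists."""
    path = join_path(start, goal)
    if path is None:
        raise ValueError(f"no known relationship between {start!r} and {goal!r}")
    sql = f"SELECT * FROM {path[0]}"
    for left, right in zip(path, path[1:]):
        sql += f" JOIN {right} ON {ADJ[left][right]}"
    return sql


print(build_sql("actor", "show"))
```

Asking for `build_sql("actor", "invoice")` raises a `ValueError`, because no relationship to an `invoice` table is defined in this toy graph.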

Why This Matters—A Strategic Advantage

Netflix’s UDA is about much more than making engineers’ lives easier (though it does that too). It enables:
- Faster feature rollout: Add new business concepts or features once, and they appear everywhere.
- Lower integration costs: No more costly mapping exercises between misaligned schemas.
- Data quality and trust: Errors, inconsistencies, and data drift are caught and fixed at the model level.
- Business agility: New business lines (games, live events, ads) can be integrated using the same unified backbone.
- Innovation at scale: With a semantic layer, automation and AI initiatives (like recommendation or content discovery) have a solid, trustworthy foundation.

In short: UDA is Netflix’s way of future-proofing its data architecture for whatever comes next.

Conclusion

Next time you’re scrolling Netflix, remember: behind every recommendation, every personalized banner, every instant playback, there’s a silent infrastructure making sure “movie,” “series,” “game,” and every other concept mean the same thing everywhere, no matter which app, report, or microservice you touch.

That’s the power of UDA. By solving the “Spider-Man problem” of duplicate models and disconnected data, Netflix has built a scalable foundation for innovation, automation, and operational excellence.

Key takeaway: If your organization is wrestling with data silos, inconsistent definitions, and costly integrations, Netflix’s approach offers a practical, proven blueprint: model your business concepts once, represent them everywhere, and let your knowledge graph do the heavy lifting.

Written by / Author

Manasi Maheshwari
