pydantic-ai

Advanced Pydantic Patterns

Master advanced Pydantic patterns for production systems. Learn polymorphism, performance optimization, and type safety.

3 modules · 12 lessons · free to read

What you'll learn

  • Design polymorphic APIs with discriminated unions
  • Optimize validation for high-throughput scenarios
  • Create generic models that work with any type
  • Integrate with type checkers for static analysis
  • Build production-ready data processing pipelines

01 Complex Models and Validation

Master advanced Pydantic patterns including discriminated unions, root models, cross-field validators, and conditional validation.

1. Discriminated Unions and Polymorphism

In real-world APIs, you often need to handle multiple types of data in the same field. For example, a notification system might send emails, SMS, or push notifications. A discriminated union uses one tag field, such as type, to tell Pydantic which model a given payload belongs to.

python
from pydantic import BaseModel, Field
from typing import Union, Literal

class EmailNotification(BaseModel):
    type: Literal["email"] = "email"
    email: str
    subject: str

class SMSNotification(BaseModel):
    type: Literal["sms"] = "sms"
    phone: str
    message: str

class PushNotification(BaseModel):
    type: Literal["push"] = "push"
    title: str
    body: str

class NotificationRequest(BaseModel):
    notification: Union[EmailNotification, SMSNotification, PushNotification] = Field(..., discriminator="type")

request = NotificationRequest(
    notification={"type": "email", "email": "user@example.com", "subject": "Hello"}
)

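The discriminator parameter tells Pydantic which field identifies the type. Pydantic reads that field first and validates the payload against the matching model only, rather than trying each union member in turn. Frameworks built on Pydantic, such as FastAPI, deserialize request bodies the same way.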
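
To see the discriminator at work, the sketch below validates raw dictionaries and checks which class comes back. It assumes Pydantic v2 and trims the example to two notification types; `parse_notification` is a hypothetical helper name used only for illustration.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class EmailNotification(BaseModel):
    type: Literal["email"] = "email"
    email: str
    subject: str


class SMSNotification(BaseModel):
    type: Literal["sms"] = "sms"
    phone: str
    message: str


class NotificationRequest(BaseModel):
    notification: Union[EmailNotification, SMSNotification] = Field(discriminator="type")


def parse_notification(payload: dict):
    """Hypothetical helper: validate a payload and return the concrete notification."""
    return NotificationRequest(notification=payload).notification


sms = parse_notification({"type": "sms", "phone": "+15550100", "message": "Hi"})
print(type(sms).__name__)  # SMSNotification

# A tag that matches no member is rejected without trying each model in turn.
try:
    parse_notification({"type": "fax", "number": "123"})
except ValidationError as exc:
    print(exc.error_count(), "validation error")
```

Because the tag decides the target model up front, error messages mention only the selected model's fields instead of listing failures for every union member.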

Constraints

  • Use Literal types for the discriminator field
  • Use Union with Field discriminator
  • Return both text and image content
Practice Lesson 1

2. Root Models and Custom Data Structures

RootModel lets you wrap any data structure with Pydantic validation. Instead of a model with fields, you validate a single value or custom type:

python
from pydantic import RootModel
from typing import List

class Items(RootModel[List[str]]):
    root: List[str]

class Matrix(RootModel[List[List[int]]]):
    root: List[List[int]]

items = Items(["a", "b", "c"])
matrix = Matrix([[1, 2], [3, 4]])

This is useful when:

  1. You want to validate a list or dict directly
  2. You need custom serialization for non-standard types
  3. You're wrapping external data structures

You can also add validators to root models:

python
from pydantic import RootModel, field_validator
from typing import List

class PositiveNumbers(RootModel[List[int]]):
    root: List[int]

    @field_validator("root")
    @classmethod
    def validate_positive(cls, v):
        for num in v:
            if num <= 0:
                raise ValueError("All numbers must be positive")
        return v
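
A short usage sketch may help here. Assuming Pydantic v2, it repeats the `PositiveNumbers` model so the block runs on its own, then shows how the wrapped value is read back and what happens when validation fails.

```python
from typing import List

from pydantic import RootModel, ValidationError, field_validator


class PositiveNumbers(RootModel[List[int]]):
    root: List[int]

    @field_validator("root")
    @classmethod
    def validate_positive(cls, v: List[int]) -> List[int]:
        if any(num <= 0 for num in v):
            raise ValueError("All numbers must be positive")
        return v


numbers = PositiveNumbers([1, 2, 3])
print(numbers.root)          # [1, 2, 3]
print(numbers.model_dump())  # [1, 2, 3]  (the bare value, not a dict)

try:
    PositiveNumbers([1, -2, 3])
except ValidationError as exc:
    print(exc.error_count())  # 1
```

Note that `model_dump()` on a root model returns the wrapped value itself, which is what makes root models convenient for top-level lists and dicts.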

Constraints

  • Use RootModel with type parameters
  • Create instances with the data
  • Return the root values
Practice Lesson 2

3. Advanced Validators with Dependencies

Advanced validators can access other fields and perform complex logic:

python
from pydantic import BaseModel, field_validator, model_validator

class User(BaseModel):
    username: str
    password: str
    password_confirm: str
    age: int

    @field_validator("password")
    @classmethod
    def validate_password(cls, v):
        if len(v) < 8:
            raise ValueError("Password must be 8+ characters")
        return v

    @model_validator(mode="after")
    def check_passwords_match(self):
        if self.password != self.password_confirm:
            raise ValueError("Passwords do not match")
        return self

    @field_validator("age")
    @classmethod
    def validate_age(cls, v):
        if v < 0 or v > 150:
            raise ValueError("Invalid age")
        return v

Key patterns:

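  • mode='before': Validate the raw input, before type coercion
  • mode='after': Validate after type coercion (the default for field_validator; model_validator requires an explicit mode)
  • Use model_validator to access multiple fields
  • Return the (possibly modified) value, or self in an 'after' model validator, or raise ValueError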
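
The difference between the two modes is easiest to see on one field. The sketch below assumes Pydantic v2; `TagList` and its validators are hypothetical names chosen for illustration. The 'before' validator receives the raw input and reshapes it, while the 'after' validator receives an already coerced list.

```python
from typing import List

from pydantic import BaseModel, field_validator


class TagList(BaseModel):
    tags: List[str]

    @field_validator("tags", mode="before")
    @classmethod
    def split_commas(cls, v):
        # Runs on the raw input, so v may still be a plain string here.
        if isinstance(v, str):
            return [part.strip() for part in v.split(",")]
        return v

    @field_validator("tags")  # mode="after" is the default for field validators
    @classmethod
    def lowercase(cls, v: List[str]) -> List[str]:
        # Runs after coercion, so v is already a list of str.
        return [tag.lower() for tag in v]


print(TagList(tags="Python, Pydantic ,API").tags)  # ['python', 'pydantic', 'api']
```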

Constraints

  • Use @model_validator(mode='after')
  • Compare both fields
  • Raise ValueError if invalid
Practice Lesson 3

4. Conditional Models and Dynamic Schemas

Sometimes a field is required only if another field has a specific value. Pydantic lets you create conditional validations:

python
from pydantic import BaseModel, field_validator, model_validator
from typing import Optional, Literal

class ShippingInfo(BaseModel):
    address_type: Literal["home", "business", "po_box"]
    street: str
    city: str
    postal_code: str
    business_name: Optional[str] = None
    po_box_number: Optional[str] = None

    @model_validator(mode="after")
    def validate_conditional(self):
        if self.address_type == "business" and not self.business_name:
            raise ValueError("business_name required for business addresses")
        if self.address_type == "po_box" and not self.po_box_number:
            raise ValueError("po_box_number required for PO box addresses")
        return self
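
As a quick check of how such a conditional rule behaves, the sketch below uses a trimmed-down `ShippingInfo` (fewer fields than the lesson's model, so the block stays short) and assumes Pydantic v2.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError, model_validator


class ShippingInfo(BaseModel):
    address_type: Literal["home", "business"]
    street: str
    business_name: Optional[str] = None

    @model_validator(mode="after")
    def validate_conditional(self):
        if self.address_type == "business" and not self.business_name:
            raise ValueError("business_name required for business addresses")
        return self


# A home address passes without the conditional field.
home = ShippingInfo(address_type="home", street="1 Main St")
print(home.business_name)  # None

# A business address without business_name fails. The error is raised by the
# model validator, so it is reported for the model rather than a single field.
try:
    ShippingInfo(address_type="business", street="2 Market St")
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```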

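Alternatively, keep the type as a plain string and carry type-specific data in a free-form dict:

python
from pydantic import BaseModel, Field

class DynamicModel(BaseModel):
    type: str  # free-form tag; discriminator= is only valid on a Union field
    metadata: dict = Field(default_factory=dict)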

Constraints

  • Use Literal for the type field
  • Make payment fields optional
  • Validate based on type
Practice Lesson 4

02 Performance Optimization

Optimize Pydantic validation for high-throughput scenarios with streaming, caching, and batch processing.

1. Lazy Validation and Streaming

When processing large datasets, validating everything at once can be slow. Lazy validation validates data on-demand:

python
from pydantic import BaseModel, ConfigDict

class Item(BaseModel):
    model_config = ConfigDict(validate_assignment=False)  # False is already the default
    name: str
    price: float

# Validate only when needed
def process_items(items_data):
    for item_dict in items_data:
        item = Item(**item_dict)  # Validate one at a time
        yield item  # Streaming processing

For streaming large files:

python
import json

def stream_parse_json(file_path):
    # Expects one JSON object per line (JSON Lines); Item is the model defined above
    with open(file_path) as f:
        for line in f:
            data = json.loads(line)
            item = Item(**data)
            yield item

This validates one record at a time rather than loading the entire file into memory.
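
A variation on the same idea, sketched under the assumption of Pydantic v2 and JSON Lines input: `model_validate_json` parses and validates each line in one step, and bad lines are skipped rather than aborting the stream. The `stream_items` name and the skip-on-error policy are illustrative choices, not part of the lesson.

```python
import io
from typing import Iterator, TextIO

from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    price: float


def stream_items(lines: TextIO) -> Iterator[Item]:
    """Yield one validated Item per non-empty line; skip lines that fail."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            # Parses the JSON text and validates it against Item in one call.
            yield Item.model_validate_json(line)
        except ValidationError as exc:
            print(f"line {line_number}: {exc.error_count()} error(s), skipped")


# io.StringIO stands in for an open file so the example is self-contained.
sample = io.StringIO(
    '{"name": "pen", "price": 1.5}\n'
    '{"name": "broken", "price": "n/a"}\n'
    '{"name": "book", "price": 12}\n'
)
for item in stream_items(sample):
    print(item.name, item.price)
```

Only one record is held in memory at a time, so the same function works on files far larger than available RAM.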

Constraints

  • Use a generator function with yield
  • Validate one record at a time
  • Return a list of validated records
Practice Lesson 1

2. Model Caching and Memory Management

Reusing model instances and caching avoids repeating validation work for identical inputs. Frozen models suit this well because they are immutable and hashable:

python
from pydantic import BaseModel, ConfigDict
from functools import lru_cache

class Config(BaseModel):
    model_config = ConfigDict(frozen=True)  # Immutable, hashable
    app_name: str
    debug: bool
    version: str

@lru_cache(maxsize=128)
def get_config(app_name: str, debug: bool, version: str):
    return Config(app_name=app_name, debug=debug, version=version)

# Reuse the same instance
config1 = get_config("MyApp", True, "1.0")
config2 = get_config("MyApp", True, "1.0")  # Same object
assert config1 is config2  # True

Frozen models:

  • Are immutable (can't change fields after creation)
  • Are hashable (can be used as dict keys)
  • Can be cached with lru_cache
  • Can be shared safely, so one cached instance replaces many identical copies
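
The properties in this list can be checked directly. The sketch below assumes Pydantic v2, where assigning to a field of a frozen model raises a `ValidationError`; the `lookup` dict is only there to demonstrate use as a dict key.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Config(BaseModel):
    model_config = ConfigDict(frozen=True)

    app_name: str
    debug: bool


a = Config(app_name="MyApp", debug=True)
b = Config(app_name="MyApp", debug=True)

# Equal field values give equal hashes, so instances work as dict keys.
print(a == b, hash(a) == hash(b))  # True True
lookup = {a: "cached result"}
print(lookup[b])  # cached result

# Assignment is rejected on a frozen model.
try:
    a.debug = False
except ValidationError as exc:
    print(exc.errors()[0]["type"])  # frozen_instance
```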

Constraints

  • Use ConfigDict(frozen=True)
  • Use @lru_cache decorator
  • Return whether objects are the same
Practice Lesson 2

3. Batch Processing and Bulk Operations

Handling records as a batch lets you validate a whole input set in one pass and separate the valid records from the invalid ones instead of stopping at the first failure:

python
from pydantic import BaseModel, ValidationError
from typing import List

class Item(BaseModel):
    id: int
    name: str
    price: float

def validate_batch(items_data: List[dict]):
    valid = []
    invalid = []
    for item_dict in items_data:
        try:
            item = Item(**item_dict)
            valid.append(item)
        except ValidationError as e:
            invalid.append({"data": item_dict, "errors": e.errors()})
    return {"valid": valid, "invalid": invalid}

Batch processing:

  • Reduces function call overhead
  • Allows partial failures
  • Better for database bulk inserts
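
The loop shown in this lesson still validates records one call at a time. When partial failures are not needed, one alternative (sketched here assuming Pydantic v2; `item_list_adapter` is a hypothetical name) is to validate the whole list in a single call with `TypeAdapter`. The trade-off is all-or-nothing behavior: one bad row rejects the batch.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter, ValidationError


class Item(BaseModel):
    id: int
    name: str
    price: float


# Build the adapter once and reuse it for every batch.
item_list_adapter = TypeAdapter(List[Item])

rows = [
    {"id": 1, "name": "pen", "price": 1.5},
    {"id": 2, "name": "book", "price": 12.0},
]
items = item_list_adapter.validate_python(rows)
print(len(items), items[0].name)  # 2 pen

# The whole batch is rejected if any row is invalid; each error's
# location starts with the index of the offending row.
try:
    item_list_adapter.validate_python(rows + [{"id": "x", "name": "bad", "price": 1}])
except ValidationError as exc:
    print([err["loc"] for err in exc.errors()])  # [(2, 'id')]
```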

Constraints

  • Use try/except for ValidationError
  • Collect both valid and invalid results
  • Return summary counts
Practice Lesson 3

4. Benchmarking and Profiling

Measure validation performance to find bottlenecks:

python
import time
from pydantic import BaseModel
from typing import List

class Item(BaseModel):
    id: int
    name: str
    price: float
    description: str

def benchmark_validation(num_items: int):
    items_data = [
        {"id": i, "name": f"Item {i}", "price": 19.99, "description": f"Description {i}"}
        for i in range(num_items)
    ]

    start = time.perf_counter()
    validated = [Item(**item) for item in items_data]
    end = time.perf_counter()

    elapsed = end - start
    avg_per_item = (elapsed / num_items) * 1000  # milliseconds

    return {
        "total_time_seconds": elapsed,
        "avg_per_item_ms": avg_per_item,
        "items_per_second": num_items / elapsed
    }

Benchmarks show you:

  • Total validation time
  • Time per item
  • Throughput (items/second)
  • Which validators are slow, when you compare runs with and without them
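
A single timed run can be noisy. One common refinement, sketched here with the standard library's `timeit` module (the payload and run counts are arbitrary illustrative values), is to repeat the measurement and keep the fastest run.

```python
import timeit

from pydantic import BaseModel


class Item(BaseModel):
    id: int
    name: str
    price: float


payload = {"id": 1, "name": "Item 1", "price": 19.99}
calls_per_run = 10_000

# repeat() returns one total per run; the minimum is the least noisy estimate.
run_times = timeit.repeat(lambda: Item(**payload), number=calls_per_run, repeat=5)
best = min(run_times)

print(f"best of 5: {best / calls_per_run * 1e6:.2f} microseconds per item")
print(f"throughput: {calls_per_run / best:,.0f} items/second")
```

The absolute numbers depend on the machine and the Pydantic version, so they are most useful for comparing two variants of the same model.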

Constraints

  • Use time.perf_counter() for accurate timing
  • Calculate items per second
  • Return performance metrics
Practice Lesson 4

03 Type Safety and Advanced Features

Master generics, custom serializers, validator selection, and type checker integration.

1. Generic Models and Type Variables

Generic models work with any type, making your code reusable:

python
from pydantic import BaseModel
from typing import Generic, TypeVar  # TypeVar lives in typing, not pydantic

T = TypeVar('T')

class Response(BaseModel, Generic[T]):
    data: T
    status: str
    code: int

class User(BaseModel):
    id: int
    name: str

class Post(BaseModel):
    id: int
    title: str

user_response = Response[User](data=User(id=1, name="Alice"), status="ok", code=200)
post_response = Response[Post](data=Post(id=1, title="My Post"), status="ok", code=200)

Generic models:

  • Work with any data type
  • Maintain type safety
  • Reduce code duplication
  • Enable flexible APIs
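
What "maintain type safety" means at runtime can be sketched as follows, assuming Pydantic v2 and a shortened `Response` model: the type parameter drives validation, so raw data is converted to it and mismatched data is rejected.

```python
from typing import Generic, List, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T")


class Response(BaseModel, Generic[T]):
    data: T
    status: str


class User(BaseModel):
    id: int
    name: str


# A raw dict is validated and converted to the type parameter.
single = Response[User](data={"id": 1, "name": "Alice"}, status="ok")
print(type(single.data).__name__)  # User

# Parameters can be nested types as well.
many = Response[List[User]](data=[{"id": 2, "name": "Bob"}], status="ok")
print(many.data[0].name)  # Bob

# Data that does not match the parameter is rejected.
try:
    Response[int](data="not a number", status="ok")
except ValidationError as exc:
    print(exc.error_count())  # 1
```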

Constraints

  • Define generic model with Generic[T]
  • Create specific subclasses
  • Maintain type safety
Practice Lesson 1

2. Custom Serializers and JSON Encoding

Custom serializers control how fields are converted to JSON:

python
from pydantic import BaseModel, field_serializer
from datetime import datetime

class Event(BaseModel):
    name: str
    timestamp: datetime

    @field_serializer('timestamp')
    def serialize_timestamp(self, value: datetime) -> str:
        return value.isoformat()

event = Event(name="Deployment", timestamp=datetime.now())
event.model_dump_json()  # timestamp serialized as ISO string

Custom serializers:

  • Format dates and times
  • Encrypt sensitive data
  • Convert special types to JSON-compatible formats
  • Control field output

Example with decimal:

python
from decimal import Decimal
from pydantic import BaseModel, field_serializer

class Price(BaseModel):
    amount: Decimal

    @field_serializer('amount')
    def serialize_amount(self, value: Decimal) -> str:
        return str(value)
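
A serializer does not have to apply to every kind of output. The sketch below assumes Pydantic v2 and uses the `when_used` argument of `field_serializer` so that the string formatting applies only to JSON, while `model_dump()` keeps the original `Decimal`. The two-decimal format is an illustrative choice.

```python
from decimal import Decimal

from pydantic import BaseModel, field_serializer


class Price(BaseModel):
    amount: Decimal

    # when_used="json" limits the serializer to JSON-mode output.
    @field_serializer("amount", when_used="json")
    def serialize_amount(self, value: Decimal) -> str:
        return f"{value:.2f}"


price = Price(amount=Decimal("9.5"))
print(price.model_dump())       # {'amount': Decimal('9.5')}
print(price.model_dump_json())  # {"amount":"9.50"}
```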

Constraints

  • Use @field_serializer decorator
  • Convert date to ISO format
  • Return serialized model
Practice Lesson 2

3. Field Validators vs Model Validators

Choose the right validator type for your needs:

Field Validators:

  • Run on individual fields
  • Can modify field values
  • Run before mode='after' model validators
  • Good for single-field constraints
python
# Inside a BaseModel subclass that declares an `age: int` field
@field_validator('age')
@classmethod
def validate_age(cls, v):
    if v < 0:
        raise ValueError('Age cannot be negative')
    return v

Model Validators:

  • Run on entire model
  • Access all fields
  • With mode='after', run after field validators
  • Good for cross-field logic
python
# Inside a BaseModel subclass that declares `start_date` and `end_date` fields
@model_validator(mode='after')
def check_consistency(self):
    if self.start_date > self.end_date:
        raise ValueError('Start must be before end')
    return self

Mode matters:

  • mode='before': Before type coercion, on the raw input
  • mode='after': After type coercion (the default for field_validator; model_validator requires an explicit mode)
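
The execution order can be observed by recording each call. This is a minimal sketch assuming Pydantic v2; `Booking` and the module-level `calls` list are hypothetical names used only to trace the order.

```python
from typing import List

from pydantic import BaseModel, field_validator, model_validator

calls: List[str] = []  # records the order in which validators run


class Booking(BaseModel):
    nights: int

    @model_validator(mode="before")
    @classmethod
    def model_before(cls, data):
        calls.append("model before")
        return data

    @field_validator("nights", mode="before")
    @classmethod
    def field_before(cls, v):
        calls.append("field before")
        return v

    @field_validator("nights")
    @classmethod
    def field_after(cls, v: int) -> int:
        calls.append("field after")
        return v

    @model_validator(mode="after")
    def model_after(self):
        calls.append("model after")
        return self


Booking(nights="3")
print(calls)  # ['model before', 'field before', 'field after', 'model after']
```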

Constraints

  • Field validator for single field
  • Model validator for cross-field
  • Return validated account
Practice Lesson 3

4. Integration with Type Checkers

Use mypy and pyright with Pydantic for static type checking:

python
from pydantic import BaseModel
from typing import Optional

class User(BaseModel):
    id: int
    name: str
    email: Optional[str] = None

def get_user(user_id: int) -> User:
    return User(id=user_id, name="Alice", email="alice@example.com")

user = get_user(1)
print(user.name)  # OK: name is str
print(user.email.lower())  # Type checker error: email is Optional[str], could be None

if user.email:
    print(user.email.lower())  # OK: checked for None first

Type checkers catch:

  • Type mismatches
  • Missing fields
  • Optional field access without checks
  • Function return type errors
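
For the Optional case in particular, a pattern that both mypy and pyright accept is an early return that narrows the type. The sketch below is illustrative; `email_domain` is a hypothetical function, not part of the lesson.

```python
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    email: Optional[str] = None


def email_domain(user: User) -> Optional[str]:
    """Return the part after '@' in lowercase, or None when no email is set."""
    if user.email is None:
        return None
    # Past the early return, type checkers narrow user.email to str.
    return user.email.rsplit("@", 1)[-1].lower()


print(email_domain(User(id=1, name="Alice", email="alice@Example.com")))  # example.com
print(email_domain(User(id=2, name="Bob")))  # None
```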

Configure in pyproject.toml:

toml
[tool.mypy]
python_version = "3.10"
warn_return_any = true
warn_unused_configs = true

Constraints

  • Use type hints on all functions
  • Return proper types
  • Handle Optional fields safely
Practice Lesson 4

Frequently Asked Questions

What is a discriminated union?
A discriminated union uses a field (like 'type') to identify which model to use.
When would you use RootModel?
RootModel is used when your model represents a single value like a list or custom type.
How do you validate across multiple fields?
@model_validator(mode='after') runs after all fields are parsed, letting you access them all.
What does conditional validation enable?
Conditional validation lets you require fields based on other field values.
When using discriminated unions, what must be included?
The discriminator field must be included with the correct value so Pydantic knows which type to use.
How do I create two models, TextResponse (type='text', content: str) and ImageResponse (type='image', url: str), use them in a ContentUnion model with discriminator='type', and write a function that creates both types?
Give each model a Literal type field ('text' or 'image'), then declare a field on ContentUnion typed as Union[TextResponse, ImageResponse] with Field(discriminator='type'), following the notification example in Module 01, Lesson 1.
How do I create a StringList RootModel that wraps List[str] and an IntMatrix RootModel that wraps List[List[int]], and a function that instantiates both and returns their root values?
Subclass RootModel[List[str]] for StringList and RootModel[List[List[int]]] for IntMatrix, instantiate each with its data, and read the validated value back from the root attribute.
How do I create a DateRange model with start_date, end_date, and name, use a model_validator to ensure start_date is before end_date, and write a function that instantiates a valid range?
Add a @model_validator(mode='after') method that compares self.start_date with self.end_date, raises ValueError when the start is not before the end, and returns self.
How do I create a PaymentMethod model with method_type (card/bank/wallet) and optional card_number, bank_account, and wallet_id, with conditional validation that ensures the correct field is provided for each type?
Type method_type as Literal['card', 'bank', 'wallet'], make the three detail fields Optional with a default of None, and use a @model_validator(mode='after') to raise ValueError when the field matching method_type is missing.
Why would you use streaming validation?
Streaming validates one record at a time, avoiding loading huge files into memory.
What makes frozen models good for caching?
Frozen models are hashable and immutable, making them usable with lru_cache.
What's an advantage of batch processing?
Batch processing validates many records together, reducing overhead and allowing selective failures.
What metric matters most for throughput?
Items per second (throughput) shows how many records you can validate in a given time.
What tool would you use to measure validation time?
time.perf_counter() gives accurate timing for benchmarking validation performance.
How do I create a Record model with id, name, and value, plus a stream_records generator that yields validated records one at a time, and test it with a function that processes a list of dicts?
Write stream_records as a generator that loops over the input dicts, validates each one with Record(**data), and yields it; the calling function can then process records one at a time or collect them into a list.
How do I create a frozen Setting model, use lru_cache to cache instances, and test that the same arguments return the same object?
Set model_config = ConfigDict(frozen=True) on Setting, wrap a factory function in @lru_cache, and check with `is` that two calls with the same arguments return the same object.
How do I create batch validation that processes multiple products, separates valid from invalid, and returns counts of each?
Validate each product inside try/except ValidationError, append successes to a valid list and failures (with e.errors()) to an invalid list, then return the length of each list.
How do I create a benchmark function that measures validation time for a given number of items and returns total time and throughput (items/second)?
Record time.perf_counter() before and after validating the items, then return the elapsed time together with num_items / elapsed as items per second.
What's the benefit of generic models?
Generic models are reusable with any type while maintaining full type safety.
When would you use a custom serializer?
Custom serializers control how fields are converted to JSON, especially for dates, decimals, etc.
What does mypy do?
mypy is a static type checker that catches type errors before runtime.
When should you use field_validator vs model_validator?
Use field validators for single-field constraints, model validators for relationships.
What does TypeVar enable?
TypeVar creates flexible generic types that work with any concrete type.
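How do I create a generic Container model with a TypeVar T, create StringContainer and IntContainer subclasses, and instantiate and return both?
Define Container(BaseModel, Generic[T]) with a field of type T, then subclass Container[str] as StringContainer and Container[int] as IntContainer and instantiate each with matching data.
How do I create a Person model with name and birth_date fields, use a field_serializer to convert the date to an ISO format string, and test serialization?
Add a @field_serializer('birth_date') method that returns value.isoformat(), then call model_dump() or model_dump_json() to check the output.
How do I create an Account model with username, password, and confirm_password, using field_validator for username (min 3 chars) and model_validator to ensure passwords match?
Use @field_validator('username') to reject names shorter than 3 characters and @model_validator(mode='after') to raise ValueError when password and confirm_password differ.
How do I create a Product model with an optional description and write type-checked functions that create and process products, ensuring type safety with return types?
Declare description as Optional[str] = None, annotate every function's parameters and return type, and check description for None before calling string methods on it.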
