Data Models

Pydantic data models provide validation and type checking for survey data processing.

Overview

Data models represent individual records (rows) and define:

  • Required and optional fields
  • Field validation rules and constraints
  • Foreign key relationships between tables
  • Pipeline step requirements

Models use Pydantic's BaseModel with custom field validators to ensure data quality throughout the processing pipeline.

Key Features

Field Validation

Each field includes validation rules:

age: AgeCategory = step_field(required_in_steps=["extract_tours"])
home_lat: float = step_field(ge=-90, le=90, required_in_steps=["extract_tours"])
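The range constraints above correspond to Pydantic's standard `Field` arguments. The following is a minimal sketch, assuming Pydantic v2. `HomeLocation` and `is_valid_location` are hypothetical names used only for illustration, the longitude bounds are an assumption, and the project's own `step_field` helper is not reproduced here.

```python
from pydantic import BaseModel, Field, ValidationError


class HomeLocation(BaseModel):
    """Hypothetical stand-in model; not part of data_canon."""

    home_lat: float = Field(ge=-90, le=90)
    # Assumed bounds for longitude; the page does not state them.
    home_lon: float = Field(ge=-180, le=180)


def is_valid_location(lat: float, lon: float) -> bool:
    """Return True when both coordinates pass the range constraints."""
    try:
        HomeLocation(home_lat=lat, home_lon=lon)
    except ValidationError:
        return False
    return True


valid = is_valid_location(37.77, -122.42)
invalid = is_valid_location(137.77, -122.42)  # latitude above 90 is rejected
```

Out-of-range values raise `ValidationError` at construction time, so bad records are caught before they enter later pipeline steps.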

Foreign Key Relationships

Models enforce referential integrity:

hh_id: int = step_field(
    ge=1,
    fk_to="households.hh_id",
    required_child=True,
)
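How the library performs the foreign key check is not shown on this page. As a rough illustration of what such a check amounts to, the sketch below uses plain dictionaries and two hypothetical helpers. Reading `required_child=True` as "every parent row must be referenced by at least one child row" is an assumption.

```python
def find_orphans(child_rows, parent_rows, key="hh_id"):
    """Return child rows whose key has no matching parent row."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


def find_childless_parents(child_rows, parent_rows, key="hh_id"):
    """Return parent keys that no child row references."""
    child_keys = {row[key] for row in child_rows}
    return [row[key] for row in parent_rows if row[key] not in child_keys]


households = [{"hh_id": 100}, {"hh_id": 101}]
persons = [
    {"person_id": 1, "hh_id": 100},
    {"person_id": 2, "hh_id": 999},  # refers to a household that does not exist
]

orphans = find_orphans(persons, households)
childless = find_childless_parents(persons, households)
```

Here `orphans` holds the person pointing at household 999, and `childless` holds household 101, which has no persons.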

Pipeline Step Requirements

Fields specify which processing steps require them:

person_num: int = step_field(ge=1, required_in_steps=["format_ctramp", "format_daysim"])
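One way step requirements like this could be stored and checked is sketched below, assuming Pydantic v2. `step_field_sketch`, `PersonSketch`, and `missing_for_step` are hypothetical names. The real `step_field` may work differently, and keeping the step names in `json_schema_extra` is an assumption made for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field


def step_field_sketch(required_in_steps=(), **constraints):
    """Hypothetical stand-in for step_field: a Field that carries step names."""
    return Field(
        default=None,
        json_schema_extra={"required_in_steps": list(required_in_steps)},
        **constraints,
    )


class PersonSketch(BaseModel):
    """Hypothetical model; not part of data_canon."""

    person_id: int
    person_num: Optional[int] = step_field_sketch(
        ge=1, required_in_steps=["format_ctramp", "format_daysim"]
    )


def missing_for_step(record: BaseModel, step: str) -> list:
    """List fields required by `step` that are unset (None) on the record."""
    missing = []
    for name, info in type(record).model_fields.items():
        extra = info.json_schema_extra
        steps = extra.get("required_in_steps", []) if isinstance(extra, dict) else []
        if step in steps and getattr(record, name) is None:
            missing.append(name)
    return missing


person = PersonSketch(person_id=1)
missing_ctramp = missing_for_step(person, "format_ctramp")
missing_tours = missing_for_step(person, "extract_tours")
```

In this sketch a field is optional at construction time and only becomes mandatory when a step that names it is about to run.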

Usage Example

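from data_canon.models.survey import PersonModel

# AgeCategory, Gender, Employment, and Student are the project's enum types.
# Import them from the module that defines them (not shown on this page).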

person = PersonModel(
    person_id=1,
    hh_id=100,
    person_num=1,
    age=AgeCategory.AGE_35_64,
    gender=Gender.FEMALE,
    employment=Employment.FULL_TIME,
    student=Student.NOT_STUDENT,
    # ... other fields
)

Survey Data Models

Core data models used in the processing pipeline for households, persons, days, trips, and tours.

data_canon.models.survey.HouseholdModel pydantic-model

Household attributes (minimal for tour building).

Fields:

hh_id pydantic-field

hh_id: int

home_lat pydantic-field

home_lat: float

home_lon pydantic-field

home_lon: float

residence_rent_own pydantic-field

residence_rent_own: ResidenceRentOwn

residence_type pydantic-field

residence_type: ResidenceType

income pydantic-field

income: int | None = None

income_bin pydantic-field

income_bin: IncomeBroad

hh_weight pydantic-field

hh_weight: float | None

num_vehicles pydantic-field

num_vehicles: int

complete pydantic-field

complete: bool

data_canon.models.survey.PersonModel pydantic-model

Person attributes for tour building.

Fields:

person_id pydantic-field

person_id: int

hh_id pydantic-field

hh_id: int

person_num pydantic-field

person_num: int

age pydantic-field

age: AgeCategory

gender pydantic-field

gender: Gender

work_lat pydantic-field

work_lat: float | None

work_lon pydantic-field

work_lon: float | None

school_lat pydantic-field

school_lat: float | None

school_lon pydantic-field

school_lon: float | None

job_type pydantic-field

job_type: JobType | None = None

employment pydantic-field

employment: Employment

student pydantic-field

student: Student

school_type pydantic-field

school_type: SchoolType | None

work_park pydantic-field

work_park: WorkParking | None

work_mode pydantic-field

work_mode: Mode | None

race pydantic-field

race: Race | None = None

ethnicity pydantic-field

ethnicity: Ethnicity | None = None

telework_freq pydantic-field

telework_freq: CommuteFreq | None = None

commute_freq pydantic-field

commute_freq: CommuteFreq | None = None

commute_subsidy_use_3 pydantic-field

commute_subsidy_use_3: BooleanYesNo | None = None

commute_subsidy_use_4 pydantic-field

commute_subsidy_use_4: BooleanYesNo | None = None

is_proxy pydantic-field

is_proxy: bool | None = None

num_days_complete pydantic-field

num_days_complete: int = 0

complete pydantic-field

complete: bool | None = None

person_weight pydantic-field

person_weight: float | None = None

data_canon.models.survey.PersonDayModel pydantic-model

Daily activity pattern summary with clear purpose-specific counts.

Fields:

person_id pydantic-field

person_id: int

day_id pydantic-field

day_id: int

hh_id pydantic-field

hh_id: int

travel_date pydantic-field

travel_date: datetime

travel_dow pydantic-field

travel_dow: TravelDow

complete pydantic-field

complete: bool | None = False

day_weight pydantic-field

day_weight: float | None = None

data_canon.models.survey.UnlinkedTripModel pydantic-model

Trip data model for validation.

Fields:

Validators:

  • validate_arrival_after_departure

unlinked_trip_id pydantic-field

unlinked_trip_id: int

day_id pydantic-field

day_id: int

person_id pydantic-field

person_id: int

hh_id pydantic-field

hh_id: int

linked_trip_id pydantic-field

linked_trip_id: int

tour_id pydantic-field

tour_id: int | None

o_lon pydantic-field

o_lon: float

o_lat pydantic-field

o_lat: float

d_lon pydantic-field

d_lon: float

d_lat pydantic-field

d_lat: float

o_purpose pydantic-field

o_purpose: Purpose

d_purpose pydantic-field

d_purpose: Purpose

o_purpose_category pydantic-field

o_purpose_category: PurposeCategory

d_purpose_category pydantic-field

d_purpose_category: PurposeCategory

mode_type pydantic-field

mode_type: ModeType

mode_1 pydantic-field

mode_1: Mode | None

mode_2 pydantic-field

mode_2: Mode | None

mode_3 pydantic-field

mode_3: Mode | None

mode_4 pydantic-field

mode_4: Mode | None

duration_minutes pydantic-field

duration_minutes: float

distance_meters pydantic-field

distance_meters: float

depart_time pydantic-field

depart_time: datetime | None

arrive_time pydantic-field

arrive_time: datetime | None

num_travelers pydantic-field

num_travelers: int

complete pydantic-field

complete: bool | None = None

unlinked_trip_weight pydantic-field

unlinked_trip_weight: float | None = None

validate_arrival_after_departure pydantic-validator

validate_arrival_after_departure() -> UnlinkedTripModel

Ensure arrive_time is after depart_time.

Raises:

ValueError: If arrival time is before or equal to departure time.
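A validator with this signature (no arguments, returning the model) matches Pydantic v2's `model_validator(mode="after")` pattern. The sketch below uses a hypothetical `TripSketch` model with only the two time fields. Skipping the check when either time is missing is an assumption, made because both fields are nullable.

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class TripSketch(BaseModel):
    """Hypothetical stand-in carrying only the two time fields."""

    depart_time: Optional[datetime] = None
    arrive_time: Optional[datetime] = None

    @model_validator(mode="after")
    def validate_arrival_after_departure(self) -> "TripSketch":
        # Assumption: the comparison is skipped when either time is missing.
        if self.depart_time is not None and self.arrive_time is not None:
            if self.arrive_time <= self.depart_time:
                raise ValueError("arrive_time must be after depart_time")
        return self


def trip_is_valid(depart: datetime, arrive: datetime) -> bool:
    """Return True when the pair of times passes validation."""
    try:
        TripSketch(depart_time=depart, arrive_time=arrive)
    except ValidationError:
        return False
    return True


ok = trip_is_valid(datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 8, 30))
bad = trip_is_valid(datetime(2024, 5, 1, 8, 30), datetime(2024, 5, 1, 8, 0))
```

Pydantic wraps the `ValueError` raised inside the validator in a `ValidationError`, which is what callers catch.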

data_canon.models.survey.LinkedTripModel pydantic-model

Linked Trip data model for validation.

Fields:

day_id pydantic-field

day_id: int

person_id pydantic-field

person_id: int

hh_id pydantic-field

hh_id: int

linked_trip_id pydantic-field

linked_trip_id: int

joint_trip_id pydantic-field

joint_trip_id: int | None = None

tour_id pydantic-field

tour_id: int

travel_dow pydantic-field

travel_dow: TravelDow

o_purpose pydantic-field

o_purpose: Purpose

o_purpose_category pydantic-field

o_purpose_category: PurposeCategory

o_lat pydantic-field

o_lat: float

o_lon pydantic-field

o_lon: float

d_purpose pydantic-field

d_purpose: Purpose

d_purpose_category pydantic-field

d_purpose_category: PurposeCategory

d_lat pydantic-field

d_lat: float

d_lon pydantic-field

d_lon: float

mode_type pydantic-field

mode_type: ModeType

driver pydantic-field

driver: Driver

num_travelers pydantic-field

num_travelers: int

access_mode pydantic-field

access_mode: AccessEgressMode | None = None

egress_mode pydantic-field

egress_mode: AccessEgressMode | None = None

duration_minutes pydantic-field

duration_minutes: float

distance_meters pydantic-field

distance_meters: float

depart_time pydantic-field

depart_time: datetime

arrive_time pydantic-field

arrive_time: datetime

tour_direction pydantic-field

tour_direction: TourDirection

complete pydantic-field

complete: bool | None = None

linked_trip_weight pydantic-field

linked_trip_weight: float | None = None

data_canon.models.survey.TourModel pydantic-model

Tour-level records with clear, descriptive field names.

Fields:

Validators:

  • validate_complete_tours

tour_id pydantic-field

tour_id: int

hh_id pydantic-field

hh_id: int

person_id pydantic-field

person_id: int

day_id pydantic-field

day_id: int

tour_num pydantic-field

tour_num: int

subtour_num pydantic-field

subtour_num: int

parent_tour_id pydantic-field

parent_tour_id: int

joint_tour_id pydantic-field

joint_tour_id: int | None = None

tour_purpose pydantic-field

tour_purpose: PurposeCategory | None = None

tour_category pydantic-field

tour_category: TourCategory

single_trip_tour pydantic-field

single_trip_tour: bool = False

origin_depart_time pydantic-field

origin_depart_time: datetime

origin_arrive_time pydantic-field

origin_arrive_time: datetime

dest_arrive_time pydantic-field

dest_arrive_time: datetime | None = None

dest_depart_time pydantic-field

dest_depart_time: datetime | None = None

origin_linked_trip_id pydantic-field

origin_linked_trip_id: int

dest_linked_trip_id pydantic-field

dest_linked_trip_id: int | None = None

o_lat pydantic-field

o_lat: float

o_lon pydantic-field

o_lon: float

d_lat pydantic-field

d_lat: float

d_lon pydantic-field

d_lon: float

o_location_type pydantic-field

o_location_type: LocationType

d_location_type pydantic-field

d_location_type: LocationType

tour_mode pydantic-field

tour_mode: ModeType

outbound_mode pydantic-field

outbound_mode: ModeType | None

inbound_mode pydantic-field

inbound_mode: ModeType | None

num_travelers pydantic-field

num_travelers: int = 1

complete pydantic-field

complete: bool | None = None

tour_weight pydantic-field

tour_weight: float | None = None

validate_complete_tours pydantic-validator

validate_complete_tours() -> TourModel

Validate that complete tours have all required fields.

Single-trip tours (where the person made one trip but did not return home) are allowed to have null tour_purpose, destination times, and dest_linked_trip_id. Complete tours must have these fields populated.
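The rule described above can be sketched as a conditional model-level check, assuming Pydantic v2. `TourSketch` is a hypothetical, reduced model. Treating every tour with `single_trip_tour=False` as a complete tour is an assumption, and `tour_purpose` is typed as a plain string here instead of the project's `PurposeCategory` enum.

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class TourSketch(BaseModel):
    """Hypothetical reduced model; not part of data_canon."""

    tour_id: int
    single_trip_tour: bool = False
    tour_purpose: Optional[str] = None  # simplified; the real field is an enum
    dest_arrive_time: Optional[datetime] = None
    dest_depart_time: Optional[datetime] = None
    dest_linked_trip_id: Optional[int] = None

    @model_validator(mode="after")
    def validate_complete_tours(self) -> "TourSketch":
        # Single-trip tours may leave the destination-related fields empty.
        if self.single_trip_tour:
            return self
        required = (
            "tour_purpose",
            "dest_arrive_time",
            "dest_depart_time",
            "dest_linked_trip_id",
        )
        missing = [name for name in required if getattr(self, name) is None]
        if missing:
            raise ValueError("complete tour is missing: " + ", ".join(missing))
        return self


def tour_is_valid(**fields) -> bool:
    """Return True when the given field values pass validation."""
    try:
        TourSketch(**fields)
    except ValidationError:
        return False
    return True


single_ok = tour_is_valid(tour_id=1, single_trip_tour=True)
incomplete = tour_is_valid(tour_id=2)
complete_ok = tour_is_valid(
    tour_id=3,
    tour_purpose="work",
    dest_arrive_time=datetime(2024, 5, 1, 9, 0),
    dest_depart_time=datetime(2024, 5, 1, 17, 0),
    dest_linked_trip_id=12,
)
```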

data_canon.models.survey.JointTripModel pydantic-model

Joint trip group containing multiple linked trips from the same household.

Represents a detected shared trip where multiple household members traveled together. Each joint trip has a unique ID and aggregated spatiotemporal attributes from its member trips.

Fields:

joint_trip_id pydantic-field

joint_trip_id: int

hh_id pydantic-field

hh_id: int

day_id pydantic-field

day_id: int

num_joint_travelers pydantic-field

num_joint_travelers: int

Number of travelers in this joint trip

o_lat_mean pydantic-field

o_lat_mean: float

Mean origin latitude across member trips

o_lon_mean pydantic-field

o_lon_mean: float

Mean origin longitude across member trips

d_lat_mean pydantic-field

d_lat_mean: float

Mean destination latitude across member trips

d_lon_mean pydantic-field

d_lon_mean: float

Mean destination longitude across member trips

depart_time_mean pydantic-field

depart_time_mean: datetime

Mean departure time across member trips

depart_arrive_mean pydantic-field

depart_arrive_mean: datetime

Mean arrival time across member trips

complete pydantic-field

complete: bool | None = None

joint_trip_weight pydantic-field

joint_trip_weight: float | None = None
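The aggregated attributes of a joint trip are described as means across its member trips. The sketch below shows how such means could be computed with the standard library. `mean_datetime` and `summarize_joint_trip` are hypothetical helpers, not the project's detection code, and counting one traveler per member trip is an assumption.

```python
from datetime import datetime, timedelta
from statistics import fmean


def mean_datetime(times):
    """Average datetimes by averaging their offsets from the earliest one."""
    start = min(times)
    offsets = [(t - start).total_seconds() for t in times]
    return start + timedelta(seconds=fmean(offsets))


def summarize_joint_trip(member_trips):
    """Aggregate member trips (dicts) into joint-trip style mean attributes."""
    return {
        # Assumption: each member trip belongs to a different traveler.
        "num_joint_travelers": len(member_trips),
        "o_lat_mean": fmean(t["o_lat"] for t in member_trips),
        "o_lon_mean": fmean(t["o_lon"] for t in member_trips),
        "depart_time_mean": mean_datetime([t["depart_time"] for t in member_trips]),
    }


member_trips = [
    {"o_lat": 37.0, "o_lon": -122.0, "depart_time": datetime(2024, 5, 1, 8, 0)},
    {"o_lat": 38.0, "o_lon": -123.0, "depart_time": datetime(2024, 5, 1, 8, 10)},
]
summary = summarize_joint_trip(member_trips)
```

Destination coordinates and arrival times would follow the same pattern. A plain arithmetic mean of longitudes is only reasonable when the member trips do not straddle the antimeridian.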

Travel Model-formatted Data Models

DaySim Models

Output file format models for the DaySim activity-based travel demand model.

CTRAMP Models

Output file format models for the CT-RAMP travel demand model.