CT-RAMP Formatting

processing.formatting.ctramp.format_ctramp

CT-RAMP Formatting Step.

Transforms canonical survey data (persons, households, tours, trips) into CT-RAMP (Coordinated Travel - Regional Activity Modeling Platform) format for use with activity-based travel demand models. See CTRAMP Data Models.

This module orchestrates the transformation of households, persons, tours, and trips from canonical format into seven CT-RAMP tables. It fills missing data with documented placeholder values and exposes configurable income thresholds and filtering options.

Components

Data Flow

Key Dependencies

  1. households_ctramp must be created first (needed by all downstream formatters)
  2. individual_tours_ctramp must be created before persons_ctramp (provides tour statistics)
  3. persons_ctramp and trips use already-formatted tour/household data
  4. Joint tables can be processed independently from individual tables

Mermaid Diagram

flowchart TD
    %% Canonical inputs
    subgraph inputs["Canonical Inputs"]
        direction TB
        hh_canon[("households")]
        per_canon[("persons")]
        tours_canon[("tours")]
        trips_canon[("linked_trips")]
        joint_trips_canon[("joint_trips")]
    end

    %% Stage 1
    subgraph s1["Stage 1: Households"]
        direction LR
        fmt_hh["format_households"]
        hh_ctramp{{"households_ctramp"}}
        fmt_hh --> hh_ctramp
    end

    %% Stage 2
    subgraph s2["Stage 2: Tours"]
        direction TB
        fmt_ind_tour["format_individual_tour"]
        ind_tour_ctramp{{"individual_tours_ctramp"}}
        fmt_ind_tour --> ind_tour_ctramp

        fmt_joint_tour["format_joint_tour"]
        joint_tour_ctramp{{"joint_tours_ctramp"}}
        fmt_joint_tour --> joint_tour_ctramp
    end

    %% Stage 3
    subgraph s3["Stage 3: Persons"]
        direction LR
        fmt_persons["format_persons"]
        per_ctramp{{"persons_ctramp"}}
        fmt_persons --> per_ctramp
    end

    %% Stage 4
    subgraph s4["Stage 4: Locations & Trips"]
        direction TB
        fmt_mand_loc["format_mandatory_location"]
        mand_loc_ctramp{{"mandatory_locations_ctramp"}}
        fmt_mand_loc --> mand_loc_ctramp

        fmt_ind_trip["format_individual_trip"]
        ind_trip_ctramp{{"individual_trips_ctramp"}}
        fmt_ind_trip --> ind_trip_ctramp

        fmt_joint_trip["format_joint_trip"]
        joint_trip_ctramp{{"joint_trips_ctramp"}}
        fmt_joint_trip --> joint_trip_ctramp
    end

    %% Canonical inputs to stages (consolidated)
    inputs -.-> s1
    inputs -.-> s2
    inputs -.-> s3
    inputs -.-> s4

    %% Stage ordering (vertical layout)
    s1 ~~~ s2
    s2 ~~~ s3
    s3 ~~~ s4

    %% Formatted table dependencies (thick solid)
    hh_ctramp ==> fmt_ind_tour
    hh_ctramp ==> fmt_joint_tour
    ind_tour_ctramp ==> fmt_persons
    hh_ctramp ==> fmt_mand_loc
    hh_ctramp ==> fmt_ind_trip
    ind_tour_ctramp ==> fmt_ind_trip
    hh_ctramp ==> fmt_joint_trip

    %% Styling
    classDef canonClass fill:#e1f5ff,stroke:#0277bd,stroke-width:2px
    classDef formatterClass fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
    classDef outputClass fill:#e8f5e9,stroke:#2e7d32,stroke-width:3px

    class hh_canon,per_canon,tours_canon,trips_canon,joint_trips_canon canonClass
    class fmt_hh,fmt_ind_tour,fmt_persons,fmt_mand_loc,fmt_ind_trip formatterClass
    class fmt_joint_tour,fmt_joint_trip formatterClass
    class hh_ctramp,per_ctramp,ind_tour_ctramp,ind_trip_ctramp outputClass
    class mand_loc_ctramp,joint_tour_ctramp,joint_trip_ctramp outputClass

Legend:

  • 🔵 Blue cylinders: Canonical input tables
  • 🟠 Orange boxes: Formatter functions
  • 🟢 Green hexagons: CT-RAMP output tables (all formatted)
  • Dashed arrows (⋯→): Canonical inputs available to formatters (see component details above for specifics)
  • Thick solid arrows (⟹): Formatted table dependencies (execution order)

Note: Canonical input details for each formatter are listed in the Components section above.
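
The thick solid arrows in the diagram above define a dependency graph, and any valid execution order is a topological ordering of it. The following is a minimal sketch of that idea using only the standard library. The `DEPENDENCIES` mapping and the `execution_order` helper are illustrative names that are not part of this module, and the edges are copied from the diagram rather than from the implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: each output table -> the formatted tables it needs first.
# Edges mirror the thick solid arrows in the diagram; they are not read from code.
DEPENDENCIES = {
    "households_ctramp": set(),
    "individual_tours_ctramp": {"households_ctramp"},
    "joint_tours_ctramp": {"households_ctramp"},
    "persons_ctramp": {"individual_tours_ctramp"},
    "mandatory_locations_ctramp": {"households_ctramp"},
    "individual_trips_ctramp": {"households_ctramp", "individual_tours_ctramp"},
    "joint_trips_ctramp": {"households_ctramp"},
}


def execution_order(dependencies):
    """Return one ordering in which every table follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, table in enumerate(execution_order(DEPENDENCIES), start=1):
        print(step, table)
```

Because `households_ctramp` is the only table without prerequisites, it always comes first; the relative order of the joint and individual branches is free, which matches the statement that joint tables can be processed independently.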

Execution Stages

  1. TAZ Filtering (optional): Households without valid home_taz are removed, cascading to persons, tours, and trips
  2. Households: Formatted first to establish income brackets needed downstream
  3. Tours: Formatted before persons to provide tour frequency statistics
  4. Persons: Combines tour-derived fields (activity patterns, frequencies) with demographic characteristics
  5. Mandatory Locations: Uses both canonical (TAZ) and formatted (income) household data
  6. Trips: Formatted last, linking to already-formatted tours
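
The optional TAZ filtering stage can be pictured as a filter on households followed by a filter on every child table by hh_id. The sketch below uses plain lists of dicts instead of polars DataFrames to stay dependency-free. The function name `drop_households_missing_taz` is hypothetical, and treating "valid" as "not None" is an assumption; the real check may be stricter.

```python
def drop_households_missing_taz(households, persons, tours, trips, taz_key="home_taz"):
    """Drop households without a TAZ, then cascade the removal by hh_id.

    Assumption: a TAZ is "valid" when it is not None.
    """
    kept_households = [hh for hh in households if hh.get(taz_key) is not None]
    kept_ids = {hh["hh_id"] for hh in kept_households}

    def keep(rows):
        # Child rows survive only if their household survived.
        return [row for row in rows if row["hh_id"] in kept_ids]

    return kept_households, keep(persons), keep(tours), keep(trips)


if __name__ == "__main__":
    hh = [{"hh_id": 1, "home_taz": 101}, {"hh_id": 2, "home_taz": None}]
    per = [{"person_id": 10, "hh_id": 1}, {"person_id": 20, "hh_id": 2}]
    print(drop_households_missing_taz(hh, per, [], []))
```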

Configuration

All thresholds and defaults are managed via CTRAMPConfig.

Implementation Notes

Excluded Fields

  • Random Number Fields: Excluded because they are simulation-specific and not derivable from survey data (household: ao_rn, fp_rn, cdap_rn; tour: imtf_rn, imtod_rn, immc_rn, jtf_rn, jtl_rn, jtod_rn, etc.)
  • Model Output Fields: auto_suff (auto sufficiency), walk_subzone (walk-to-transit accessibility), wait times and logsums

Placeholder Values

When tour data is unavailable or incomplete, person-level tour frequency fields use these defaults:

  • activity_pattern: 'H' (home all day)
  • imf_choice: 0 (no mandatory tours)
  • inmf_choice: 1 (minimum valid code)
  • wfh_choice: 0 (no work from home)
  • jtf_choice: Derived from joint tour data if available; otherwise NONE_NONE (-4)

Empty Data Handling

The module gracefully handles missing data:

  • Empty tour DataFrames result in placeholder person fields
  • Missing joint tour IDs default to individual tour treatment
  • Missing or null values use sensible defaults per field type
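
As a rough illustration of the placeholder behaviour, the sketch below starts from the defaults listed under Placeholder Values and overlays any tour-derived values that are present. The names `PERSON_TOUR_PLACEHOLDERS` and `fill_tour_placeholders` are hypothetical, and the real module operates on polars DataFrames rather than dicts.

```python
# Defaults copied from the Placeholder Values list above.
PERSON_TOUR_PLACEHOLDERS = {
    "activity_pattern": "H",  # home all day
    "imf_choice": 0,          # no mandatory tours
    "inmf_choice": 1,         # minimum valid code
    "wfh_choice": 0,          # no work from home
    "jtf_choice": -4,         # NONE_NONE
}


def fill_tour_placeholders(person, tour_fields=None):
    """Return a copy of `person` with tour fields, falling back to placeholders.

    `tour_fields` is an optional dict of tour-derived values; None entries
    are treated as missing and keep the placeholder.
    """
    merged = dict(person)
    merged.update(PERSON_TOUR_PLACEHOLDERS)
    if tour_fields:
        merged.update({k: v for k, v in tour_fields.items() if v is not None})
    return merged


if __name__ == "__main__":
    print(fill_tour_placeholders({"person_id": 10}))
    print(fill_tour_placeholders({"person_id": 11}, {"activity_pattern": "M", "imf_choice": None}))
```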

format_ctramp

format_ctramp(
    persons: pl.DataFrame,
    households: pl.DataFrame,
    linked_trips: pl.DataFrame,
    tours: pl.DataFrame,
    joint_trips: pl.DataFrame,
    income_low_threshold: int,
    income_med_threshold: int,
    income_high_threshold: int,
    income_base_year_dollars: int,
    taz_field: str = "taz",
    drop_missing_taz: bool = True,
) -> dict[str, pl.DataFrame]

Format canonical survey data to CT-RAMP model specification.

Transforms person, household, tour, and trip data from canonical format to CT-RAMP format required by the activity-based travel demand model. See module docstring for complete component descriptions and data flow.

Parameters:

Name Type Description Default
persons pl.DataFrame

Canonical person data with demographic fields. Required columns: person_id, hh_id, person_num, age, gender, employment, student, school_type, commute subsidies.

required
households pl.DataFrame

Canonical household data with income and dwelling fields. Required columns: hh_id, home_taz, income_detailed, income_followup, num_vehicles.

required
linked_trips pl.DataFrame

Canonical linked trip data. Required columns: linked_trip_id, tour_id, o_purpose_category, d_purpose_category, mode_type, o_taz, d_taz, tour_direction, times, person_id, hh_id.

required
tours pl.DataFrame

Canonical tour data. Required columns: tour_id, hh_id, person_id, person_num, tour_category, tour_purpose, o_taz, d_taz, times, tour_mode, joint_tour_id, parent_tour_id.

required
joint_trips pl.DataFrame

Aggregated joint trip data. Required columns: joint_trip_id, hh_id, num_joint_travelers.

required
income_low_threshold int

Dollar value dividing low from medium income bracket. Must be less than income_med_threshold.

required
income_med_threshold int

Dollar value dividing medium from high income bracket. Must be between income_low_threshold and income_high_threshold.

required
income_high_threshold int

Dollar value dividing high from very high income bracket. Must be greater than income_med_threshold.

required
income_base_year_dollars int

Target year for income conversion (e.g., 2000, 2010). Categorical income values are converted to midpoint dollars in this base year.

required
taz_field str

Field name containing the TAZ ID for CTRAMP formatting (default: "taz").

'taz'
drop_missing_taz bool

If True, remove households without valid TAZ IDs. This cascades to persons, tours, and trips (default: True).

True

Returns:

Type Description
dict[str, pl.DataFrame]

Dictionary with keys:

  • households_ctramp: Formatted household data (7 core fields including income, TAZ, size)
  • persons_ctramp: Formatted person data (20+ fields including person type, activity patterns, tour frequencies)
  • mandatory_locations_ctramp: Mandatory location data (work/school location records)
  • individual_tours_ctramp: Individual tour data (tour-level attributes: purpose, mode, time, stops)
  • individual_trips_ctramp: Individual trip data (trip-level attributes for individual tours)
  • joint_tours_ctramp: Joint tour data (joint tours with composition and participants)
  • joint_trips_ctramp: Joint trip data (trip-level attributes for joint tours)

Example

from processing.formatting.ctramp import format_ctramp

result = format_ctramp(
    persons=canonical_persons,
    households=canonical_households,
    linked_trips=canonical_linked_trips,
    tours=canonical_tours,
    joint_trips=canonical_joint_trips,
    income_low_threshold=60000,          # $60k divides low from medium
    income_med_threshold=150000,         # $150k divides medium from high
    income_high_threshold=250000,        # $250k divides high from very high
    income_base_year_dollars=2000,       # Express income in year-2000 dollars
    drop_missing_taz=True                # Remove households without TAZ
)

# Access formatted tables
households_ctramp = result["households_ctramp"]
persons_ctramp = result["persons_ctramp"]
mandatory_locations_ctramp = result["mandatory_locations_ctramp"]
individual_tours_ctramp = result["individual_tours_ctramp"]
joint_tours_ctramp = result["joint_tours_ctramp"]
individual_trips_ctramp = result["individual_trips_ctramp"]
joint_trips_ctramp = result["joint_trips_ctramp"]

processing.formatting.ctramp.ctramp_config

Configuration model for CT-RAMP formatting.

CTRAMPConfig pydantic-model

Configuration for CT-RAMP formatting.

Attributes:

Name Type Description
income_low_threshold int

Income threshold for low income bracket

income_med_threshold int

Income threshold for medium income bracket

income_high_threshold int

Income threshold for high income bracket

drop_missing_taz bool

If True, remove households without valid TAZ IDs

age_adult int

Age threshold for adult vs child in joint tour composition

income_base_year_dollars int

Base year for income dollar units

gender_default_for_missing str

Default gender for missing/non-binary

Config:

  • extra: forbid

Fields:

Validators:

income_low_threshold pydantic-field

income_low_threshold: int

Income threshold for low income bracket (in dollars). Income below this is classified as 'work_low'. Example: 60000 for $60k annual income

income_med_threshold pydantic-field

income_med_threshold: int

Income threshold for medium income bracket (in dollars). Household income below this threshold (but above low) is classified as 'work_med'. Example: 150000 for $150k annual income

income_high_threshold pydantic-field

income_high_threshold: int

Income threshold for high income bracket (in dollars). Household income below this threshold (but above med) is classified as 'work_high'. Above this is 'work_very high'. Example: 240000 for $240k annual income
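
The three thresholds split household income into four brackets. The sketch below shows one way to read the field descriptions above. The function name `income_bracket` is hypothetical, and placing an income exactly equal to a threshold in the higher bracket is an assumption, since the descriptions only say "below" and "above".

```python
def income_bracket(income, low, med, high):
    """Classify an income (in dollars) into one of four CT-RAMP work brackets.

    Assumption: an income exactly on a threshold falls in the higher bracket.
    """
    if income < low:
        return "work_low"
    if income < med:
        return "work_med"
    if income < high:
        return "work_high"
    return "work_very high"


if __name__ == "__main__":
    for value in (45_000, 60_000, 200_000, 300_000):
        print(value, income_bracket(value, 60_000, 150_000, 250_000))
```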

taz_field pydantic-field

taz_field: str = 'taz'

The field name in the household data that contains the TAZ ID for CTRAMP formatting.

drop_missing_taz pydantic-field

drop_missing_taz: bool = True

If True, remove households without valid TAZ IDs

age_adult pydantic-field

age_adult: int = 4

Age category threshold for adult vs child in joint tours. Use AgeCategory enum value (e.g., 4 = AGE_18_TO_24 and higher are adults). Default: 4 (18+)

income_base_year_dollars pydantic-field

income_base_year_dollars: int

Base year for income dollar units (e.g., 2023 for 2023 dollars). Used for converting income to CT-RAMP format.

gender_default_for_missing pydantic-field

gender_default_for_missing: str = 'f'

Default gender ('m' or 'f') for missing/non-binary values when CT-RAMP requires binary gender. Default: 'f'

fixed_location_buffer_meters pydantic-field

fixed_location_buffer_meters: float = 200.0

Spatial buffer in meters for matching fixed work/school locations to trip destinations when calculating distances.

distance_unit pydantic-field

distance_unit: str = 'miles'

Distance unit for CT-RAMP formatting ('miles', 'kilometers', or 'meters').

model_config class-attribute instance-attribute

model_config = ConfigDict(extra='forbid')

validate_income_ordering pydantic-validator

validate_income_ordering() -> CTRAMPConfig

Validate that income thresholds are in proper order.
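
The ordering rule can be illustrated without pydantic. The sketch below uses a frozen dataclass as a stand-in for CTRAMPConfig and raises on construction when the thresholds are not strictly increasing. `IncomeThresholds` is a hypothetical class for illustration only; the actual validator may word its error differently or check additional conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncomeThresholds:
    """Hypothetical stand-in for the three income fields of CTRAMPConfig."""

    low: int
    med: int
    high: int

    def __post_init__(self):
        # Thresholds must be strictly increasing: low < med < high.
        if not (self.low < self.med < self.high):
            raise ValueError(
                f"Income thresholds must satisfy low < med < high, "
                f"got {self.low}, {self.med}, {self.high}"
            )


if __name__ == "__main__":
    print(IncomeThresholds(60_000, 150_000, 250_000))
    try:
        IncomeThresholds(150_000, 60_000, 250_000)
    except ValueError as exc:
        print("rejected:", exc)
```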

processing.formatting.ctramp.format_households

Household formatting for CT-RAMP.

Transforms canonical household data into CT-RAMP model format, including:

  • Income Conversion: Converts categorical income values to midpoint dollars in configurable base year
  • Income Bracketing: Classifies households into low/med/high/very high income brackets using user-defined thresholds
  • Spatial Fields: Maps home TAZ (optionally filters households without valid TAZ)
  • Joint Tour Frequency: Derives jtf_choice field from joint tour counts by purpose category
  • Excludes: Random number fields (ao_rn, fp_rn, cdap_rn) and auto_suff (model simulation fields)

Note: Model-output fields (walk_subzone, humanVehicles, autonomousVehicles, random number fields, auto_suff) are excluded as they are not derivable from survey data.

format_households

format_households(
    households_canonical: pl.DataFrame,
    persons_canonical: pl.DataFrame,
    tours_canonical: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format household data to CT-RAMP specification.

Transforms household data from canonical format to HouseholdCTRAMPModel. Key transformations:

  • Rename fields to CT-RAMP conventions
  • Convert income categories to midpoint values
  • Compute household aggregates (size, workers, vehicles)
  • Map TAZ using configuration

Parameters:

Name Type Description Default
households_canonical pl.DataFrame

Canonical households DataFrame with hh_id, home_taz, income_detailed, income_followup, num_vehicles

required
persons_canonical pl.DataFrame

Canonical persons DataFrame with hh_id, employment (for computing household size and worker count)

required
tours_canonical pl.DataFrame

Canonical tours DataFrame with hh_id, joint_tour_id, tour_purpose (for computing joint tour frequency)

required
config CTRAMPConfig

CTRAMPConfig instance with configuration parameters

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP household fields:

  • hh_id: Household ID
  • taz: Home TAZ
  • income: Annual household income (midpoint value)
  • autos: Number of automobiles
  • size: Number of persons
  • workers: Number of workers
  • jtf_choice: Joint tour frequency (count of unique joint tours)

Notes
  • Model-output fields (walk_subzone, humanVehicles, autonomousVehicles, random number fields, auto_suff) are excluded as they are not derivable from survey data

processing.formatting.ctramp.format_persons

Person formatting for CT-RAMP.

Transforms canonical person data into CT-RAMP model format, including:

  • Person Type Classification: Derives CT-RAMP person type from age category, employment status, and student status
  • Gender Mapping: Converts to binary m/f format (configurable default for non-binary/missing)
  • Free Parking: Determines eligibility based on commute subsidies
  • Tour Frequency Fields: Derives imf_choice (mandatory tour frequency), inmf_choice (non-mandatory tour frequency), and activity_pattern from tour data when available
  • Work-from-Home: Sets wfh_choice based on work tours vs work days
  • Value of Time: Calculates based on employment type and household income bracket
  • Placeholders: Uses sensible defaults when tour data is unavailable

Note: activity_pattern, imf_choice, and inmf_choice require tour data. When tour data is unavailable, they are set to placeholder values; in a full pipeline they are populated from actual tour extraction results.

format_persons

format_persons(
    persons_canonical: pl.DataFrame,
    tours_ctramp: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format person data to CT-RAMP specification.

Transforms person data from canonical format to CT-RAMP format. Key transformations:

  • Classify person type based on age, employment, student status
  • Map gender to m/f format
  • Determine free parking eligibility
  • Aggregate activity patterns and tour frequencies from tour data

Parameters:

Name Type Description Default
persons_canonical pl.DataFrame

Canonical persons DataFrame with derived person_type field (free parking), commute_subsidy_use_4 (discounted parking), value_of_time

required
tours_ctramp pl.DataFrame

Formatted CT-RAMP tours DataFrame with person_id and tour_purpose (CTRAMP-formatted purpose strings like 'work_low', 'school_grade', etc.)

required
config CTRAMPConfig

CT-RAMP configuration with age thresholds

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP person fields:

  • hh_id: Household ID
  • person_id: Person ID
  • person_num: Person number
  • age: Person age
  • gender: Gender (m/f)
  • type: Person type (1-8)
  • value_of_time: Value of time ($/hour)
  • fp_choice: Free parking choice (1/2)
  • activity_pattern: Daily activity pattern (M/N/H)
  • imf_choice: Individual mandatory tour frequency
  • inmf_choice: Individual non-mandatory tour frequency
  • wfh_choice: Work from home choice (0/1)

Notes
  • activity_pattern: M=mandatory tours, N=non-mandatory only, H=no tours
  • imf_choice: Count of mandatory tours (work/school)
  • inmf_choice: Count of non-mandatory tours
  • wfh_choice: Work from home indicator (currently always 0)
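
The Notes above can be read as a simple aggregation over one person's formatted tour purposes. The sketch below is a minimal interpretation, not the module's implementation. `summarize_person_tours` is a hypothetical name, and treating purposes that start with `work_` or `school_` as mandatory is an assumption based on the example purpose strings ('work_low', 'school_grade'). It returns raw counts; the real formatter may recode them (for example, the placeholder for inmf_choice is 1, described as the minimum valid code). The input is assumed to exclude at-work subtours.

```python
# Assumption: these prefixes identify mandatory (work/school) tour purposes.
MANDATORY_PREFIXES = ("work_", "school_")


def summarize_person_tours(tour_purposes):
    """Derive activity_pattern and raw tour counts from a list of purposes."""
    mandatory = sum(1 for p in tour_purposes if p.startswith(MANDATORY_PREFIXES))
    non_mandatory = len(tour_purposes) - mandatory

    if mandatory > 0:
        pattern = "M"  # at least one mandatory tour
    elif non_mandatory > 0:
        pattern = "N"  # non-mandatory tours only
    else:
        pattern = "H"  # no tours

    return {
        "activity_pattern": pattern,
        "imf_choice": mandatory,
        "inmf_choice": non_mandatory,
    }


if __name__ == "__main__":
    print(summarize_person_tours(["work_low", "shopping"]))
    print(summarize_person_tours(["shopping"]))
    print(summarize_person_tours([]))
```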

processing.formatting.ctramp.format_mandatory_location

Format mandatory locations for CT-RAMP specification.

format_mandatory_location

format_mandatory_location(
    persons_ctramp: pl.DataFrame,
    households_ctramp: pl.DataFrame,
    linked_trips_canonical: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format mandatory locations (work/school) to CT-RAMP specification.

Transforms person and household data to create mandatory location records for persons with work or school locations.

Parameters:

Name Type Description Default
persons_ctramp pl.DataFrame

Formatted CT-RAMP persons DataFrame with person_id, hh_id, person_num, person_type (integer code), age (non-binned continuous), employment_category, student_category, work_taz, school_taz

required
households_ctramp pl.DataFrame

Formatted CT-RAMP households DataFrame with hh_id, income, home_taz (for HomeTAZ field)

required
linked_trips_canonical pl.DataFrame

Canonical linked trips DataFrame for distance calculations

required
config CTRAMPConfig

CT-RAMP configuration with income_base_year_dollars

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP mandatory location fields:

  • HHID, PersonID, PersonNum
  • HomeTAZ, Income
  • PersonType, PersonAge
  • EmploymentCategory, StudentCategory
  • WorkLocation, SchoolLocation

Notes
  • Excludes model-only fields (walk subzones)
  • Filters to only persons with work OR school locations
  • Uses pre-computed employment_category and student_category from persons_ctramp

processing.formatting.ctramp.format_tours

Tour formatting for CT-RAMP.

Transforms canonical tour data into CT-RAMP model format, including:

  • Individual tours (non-joint tours)
  • Joint tours (household member group tours)

format_individual_tour

format_individual_tour(
    tours_canonical: pl.DataFrame,
    linked_trips_canonical: pl.DataFrame,
    persons_canonical: pl.DataFrame,
    households_ctramp: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format individual tours to CT-RAMP specification.

Transforms tour data (excluding joint tours) to CT-RAMP format. Key Transformations:

  • Filtering: Excludes joint tours (keeps only tours where joint_tour_id IS NULL)
  • Purpose Mapping: Converts canonical purpose categories to CT-RAMP tour purposes
  • Mode Translation: Maps canonical mode types to CT-RAMP mode codes
  • Time-of-Day: Extracts start/end hours from departure/arrival times
  • Stop Counts: Computes outbound and inbound stop counts from linked trips
  • Subtour Frequency: Calculates atWork_freq by counting subtours for each parent work tour
  • Excludes: Model simulation fields (random numbers, wait times, logsums)

Parameters:

Name Type Description Default
tours_canonical pl.DataFrame

Canonical tours DataFrame with tour_id, hh_id, person_id, person_num, tour_category, tour_purpose, o_taz, d_taz, origin_depart_time, origin_arrive_time, tour_mode, joint_tour_id (for filtering), parent_tour_id (for subtour counting)

required
linked_trips_canonical pl.DataFrame

Canonical trips DataFrame with tour_id, tour_direction (1=outbound, 2=inbound, 3=subtour)

required
persons_canonical pl.DataFrame

Canonical persons DataFrame with person_id and person_num. The person_type (for mode mapping) and school_type (for purpose mapping) columns are optional and are re-derived if missing or invalid

required
households_ctramp pl.DataFrame

Formatted CT-RAMP households DataFrame with hh_id, income

required
config CTRAMPConfig

CT-RAMP configuration with income thresholds

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP individual tour fields:

  • hh_id, person_id, person_num, person_type
  • tour_id, tour_category, tour_purpose
  • orig_taz, dest_taz
  • start_hour, end_hour
  • tour_mode
  • atWork_freq (subtour count)
  • num_ob_stops, num_ib_stops

Notes
  • Excludes joint tours (keeps only tours where joint_tour_id IS NULL)
  • Excludes all model-only fields (random numbers, wait times, logsums)
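
Stop counts and atWork_freq are both simple aggregations. The sketch below shows one plausible reading using plain dicts. `count_stops` and `count_subtours` are hypothetical helpers, and the rule "stops on a half-tour = trips on that half-tour minus one" is an assumption; the direction codes (1=outbound, 2=inbound) come from the parameter description above.

```python
from collections import Counter

OUTBOUND, INBOUND = 1, 2  # tour_direction codes from the parameter description


def count_stops(trips):
    """Return {tour_id: {"num_ob_stops": n, "num_ib_stops": m}}.

    Assumption: a half-tour with k trips has k - 1 intermediate stops.
    """
    legs = Counter((t["tour_id"], t["tour_direction"]) for t in trips)
    tour_ids = {t["tour_id"] for t in trips}
    return {
        tour_id: {
            "num_ob_stops": max(legs[(tour_id, OUTBOUND)] - 1, 0),
            "num_ib_stops": max(legs[(tour_id, INBOUND)] - 1, 0),
        }
        for tour_id in tour_ids
    }


def count_subtours(tours):
    """Return {parent_tour_id: number of subtours}, i.e. atWork_freq per parent."""
    return Counter(
        t["parent_tour_id"] for t in tours if t.get("parent_tour_id") is not None
    )


if __name__ == "__main__":
    trips = [
        {"tour_id": 1, "tour_direction": 1},
        {"tour_id": 1, "tour_direction": 1},
        {"tour_id": 1, "tour_direction": 2},
    ]
    print(count_stops(trips))
```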

format_joint_tour

format_joint_tour(
    tours_canonical: pl.DataFrame,
    linked_trips_canonical: pl.DataFrame,
    persons_canonical: pl.DataFrame,
    households_ctramp: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format joint tours to CT-RAMP specification.

Transforms joint tour data with participant aggregation. Key Transformations:

  • Identifies tours shared by multiple household members using joint_tour_id
  • Composition: Classifies as adults-only, children-only, or mixed based on configurable age threshold
  • Participant List: Maintains ordered list of person_num for all tour participants
  • Same Fields: Purpose, mode, time-of-day, stops, destinations (like individual tours)

Parameters:

Name Type Description Default
tours_canonical pl.DataFrame

Canonical tours DataFrame with tour_id, joint_tour_id, hh_id, person_id, person_num, tour_category, tour_purpose, o_taz, d_taz, origin_depart_time, origin_arrive_time, tour_mode, subtour_num

required
linked_trips_canonical pl.DataFrame

Canonical trips DataFrame with joint_tour_id, tour_direction, person_id

required
persons_canonical pl.DataFrame

Canonical persons DataFrame with person_id, person_num, age_category (for composition determination)

required
households_ctramp pl.DataFrame

Formatted CT-RAMP households DataFrame with hh_id, income

required
config CTRAMPConfig

CT-RAMP configuration with income thresholds and age_adult category

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP joint tour fields

Notes
  • Filters to joint tours only (joint_tour_id IS NOT NULL)
  • Aggregates participants into space-separated string
  • Determines composition from participant ages

processing.formatting.ctramp.format_trips

Trip formatting for CT-RAMP.

Transforms canonical trip data into CT-RAMP model format, including:

  • Individual trips (trips on individual tours)
  • Joint trips (trips on joint tours)

format_individual_trip

format_individual_trip(
    linked_trips_canonical: pl.DataFrame,
    tours_ctramp: pl.DataFrame,
    persons_canonical: pl.DataFrame,
    households_ctramp: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format individual trips to CT-RAMP specification.

Transforms linked trip data (for individual tours only) to CT-RAMP format. Links trips to individual tours via tour_id. Includes stop purpose, mode, location, and sequence within tour. Distinguishes outbound, inbound, and subtour trips.

Parameters:

Name Type Description Default
linked_trips_canonical pl.DataFrame

Canonical DataFrame with linked trip fields (tour_id, o_purpose_category, d_purpose_category, mode_type, o_taz, d_taz, tour_direction, depart_time, arrive_time, person_id, hh_id)

required
tours_ctramp pl.DataFrame

Formatted CT-RAMP individual tours DataFrame (already filtered to individual tours only, without joint_tour_id). Contains tour_id, tour_purpose, tour_mode, tour_category

required
persons_canonical pl.DataFrame

Canonical persons DataFrame with person_id, person_num, school_type

required
households_ctramp pl.DataFrame

Formatted CT-RAMP households DataFrame with hh_id, income

required
config CTRAMPConfig

CT-RAMP configuration with income thresholds

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP individual trip fields

Notes
  • Excludes trips on joint tours
  • Creates stop_id sequence starting at 1 per half-tour
  • Excludes model-only fields (parking costs, value of time, etc.)
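
The per-half-tour stop_id sequence can be sketched as a grouped enumeration. `assign_stop_ids` is a hypothetical helper; ordering trips within a half-tour by depart_time is an assumption, and depart_time is assumed to be any sortable value (for example an integer minute or an ISO timestamp string).

```python
from itertools import groupby


def _half_tour_key(trip):
    # A half-tour is identified by its tour and direction.
    return (trip["tour_id"], trip["tour_direction"])


def assign_stop_ids(trips):
    """Return copies of `trips` with a stop_id that restarts at 1 per half-tour."""
    ordered = sorted(trips, key=lambda t: (*_half_tour_key(t), t["depart_time"]))
    numbered = []
    for _, group in groupby(ordered, key=_half_tour_key):
        for stop_id, trip in enumerate(group, start=1):
            numbered.append({**trip, "stop_id": stop_id})
    return numbered


if __name__ == "__main__":
    trips = [
        {"tour_id": 1, "tour_direction": 1, "depart_time": 480},
        {"tour_id": 1, "tour_direction": 2, "depart_time": 1020},
        {"tour_id": 1, "tour_direction": 1, "depart_time": 450},
    ]
    for row in assign_stop_ids(trips):
        print(row)
```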

format_joint_trip

format_joint_trip(
    joint_trips_canonical: pl.DataFrame,
    linked_trips_canonical: pl.DataFrame,
    tours_canonical: pl.DataFrame,
    households_ctramp: pl.DataFrame,
    config: CTRAMPConfig,
) -> pl.DataFrame

Format joint trips to CT-RAMP specification.

Transforms joint trip data using mean coordinates from joint_trips table. Links trips to joint tours using aggregated joint trip data. Maintains participant information for each trip leg.

Parameters:

Name Type Description Default
joint_trips_canonical pl.DataFrame

Aggregated joint trip DataFrame with joint_trip_id, hh_id, num_joint_travelers (mean locations)

required
linked_trips_canonical pl.DataFrame

Canonical linked trips DataFrame to bridge joint_trip_id to tour_id (contains joint_trip_id, tour_id, o_purpose_category, d_purpose_category, mode_type, depart_time, arrive_time, num_travelers, o_taz, d_taz)

required
tours_canonical pl.DataFrame

Canonical tours DataFrame with tour_id, joint_tour_id, tour_purpose, tour_category, tour_mode

required
households_ctramp pl.DataFrame

Formatted CT-RAMP households DataFrame with hh_id, income

required
config CTRAMPConfig

CT-RAMP configuration with income thresholds

required

Returns:

Type Description
pl.DataFrame

DataFrame with CT-RAMP joint trip fields

Notes
  • Uses mean coordinates from joint_trips aggregation table
  • Deduplicates to one row per joint_trip_id
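
Deduplication to one row per joint_trip_id can be illustrated with a first-seen dictionary. `dedupe_joint_trips` is a hypothetical helper, and keeping the first row encountered for each joint_trip_id is an assumption; the actual formatter may choose the representative row differently.

```python
def dedupe_joint_trips(linked_trips):
    """Keep one row per joint_trip_id, skipping trips that are not joint.

    Assumption: the first row seen for each joint_trip_id is kept.
    """
    first_seen = {}
    for trip in linked_trips:
        joint_id = trip.get("joint_trip_id")
        if joint_id is None:
            continue  # individual trip, not part of a joint trip
        first_seen.setdefault(joint_id, trip)
    return list(first_seen.values())


if __name__ == "__main__":
    rows = [
        {"joint_trip_id": 7, "person_id": 10},
        {"joint_trip_id": 7, "person_id": 11},
        {"joint_trip_id": None, "person_id": 12},
    ]
    print(dedupe_joint_trips(rows))
```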