TM2 PopulationSim Process Overview

What is PopulationSim?

PopulationSim is a synthetic population generator that creates realistic household and person records for transportation modeling. The TM2 (Travel Model 2) pipeline synthesizes a population for the 9-county San Francisco Bay Area.

Overall Process Flow

PUMS Data → Geographic Crosswalk → Seed Population → Marginal Controls → PopulationSim → Synthetic Population

1. Data Inputs

2. Geographic Framework

Step-by-Step Process

Step 1: PUMS Data Download

Purpose: Obtain raw household and person microdata from the US Census Bureau

Input: Census API or cached files

Output:

What happens:

Step 2: Geographic Crosswalk Creation

Purpose: Link geographic zones (MAZ-TAZ-PUMA-County) for population synthesis

Input:

What happens:

Step 3: Seed Population Generation

Purpose: Process PUMS data into PopulationSim-ready format

Input:

What happens:

Step 4: Marginal Controls Generation

Purpose: Create the target totals that the synthetic population must match

Input:

What happens:

Step 5: PopulationSim Synthesis

Purpose: Generate a synthetic population that matches the marginal controls

Input:

What happens:

Key Algorithms

Geographic Assignment

  1. PUMA Assignment: Households stay in their original PUMA
  2. County Assignment: Based on PUMA-to-county crosswalk
  3. TAZ/MAZ Assignment: PopulationSim chooses based on controls
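
The county assignment step above can be pictured as a simple lookup against the crosswalk. The sketch below is illustrative only: the PUMA codes, county labels, field names (`hh_id`, `PUMA`, `COUNTY`), and the `assign_county` helper are all hypothetical and are not taken from the TM2 code.

```python
# Hypothetical PUMA-to-county crosswalk; codes and labels are made up.
puma_to_county = {"P001": "County A", "P002": "County B"}

# Hypothetical seed households, each keeping its original PUMA.
households = [
    {"hh_id": 1, "PUMA": "P001"},
    {"hh_id": 2, "PUMA": "P002"},
    {"hh_id": 3, "PUMA": "P999"},  # PUMA missing from the crosswalk
]


def assign_county(households, crosswalk):
    """Attach a county to each household via its PUMA.

    Returns the households that matched (with a COUNTY field added)
    and the IDs of households whose PUMA is not in the crosswalk.
    """
    assigned, unmatched = [], []
    for hh in households:
        county = crosswalk.get(hh["PUMA"])
        if county is None:
            unmatched.append(hh["hh_id"])
        else:
            assigned.append({**hh, "COUNTY": county})
    return assigned, unmatched


assigned, unmatched = assign_county(households, puma_to_county)
```

Reporting unmatched households separately, rather than dropping them silently, makes gaps in the crosswalk visible before synthesis starts.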

Control Balancing

  1. IPF (Iterative Proportional Fitting): Adjusts weights to match marginals
  2. Multi-level convergence: Ensures consistency across MAZ/TAZ/County
  3. Constraint handling: Respects geographic and demographic relationships
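
To make the IPF idea concrete, here is a minimal two-dimensional sketch in plain Python. It is a generic textbook version under simplifying assumptions (a small dense table and two sets of marginals whose totals agree), not PopulationSim's internal balancing code. The function name `ipf` and the example numbers are illustrative.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Fit a 2-D seed table to row and column target totals."""
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iter):
        # Scale each row so its sum matches the row target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                factor = target / total
                table[i] = [value * factor for value in table[i]]
        # Scale each column so its sum matches the column target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = target / total
                for row in table:
                    row[j] *= factor
        # Column sums are exact after the column step, so only the
        # row sums need to be checked for convergence.
        max_err = max(
            abs(sum(table[i]) - target)
            for i, target in enumerate(row_targets)
        )
        if max_err < tol:
            break
    return table


# Illustrative seed and marginals; both sets of targets sum to 100.
seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[35.0, 65.0])
```

The fitted table preserves the interaction pattern of the seed while its row and column sums move to the targets, which is the same principle applied when weights are adjusted to match marginal controls.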

Group Quarters Handling (Updated October 2025)

Important Approach Change: PopulationSim now uses person-level group quarters controls that align directly with Census data structure.

  1. Person-level controls: GQ controls count individuals, not households
    • pers_gq_university: University GQ persons (Census P5_008N)
    • pers_gq_noninstitutional: Military + other GQ persons (Census P5_009N+P5_011N+P5_012N)
  2. Selective institutional inclusion:
    • INCLUDED: University/college housing (dorms, student housing)
    • INCLUDED: Military barracks and base housing
    • INCLUDED: Other non-institutional group quarters (group homes, worker housing)
    • EXCLUDED: Nursing homes, prisons, mental health institutions
  3. Two-level structure:
    • Household level: hhgqtype (0=regular, 1=university, 2=noninstitutional)
    • Person level: gq_type (0=regular, 1=university, 2=noninstitutional)
  4. Direct Census alignment: Controls use Census person counts without household-level conversion
  5. Travel behavior: Non-institutional GQ residents participate in regular travel patterns

Rationale: Person-level controls eliminate conversion assumptions and ensure direct data consistency with Census P5 series tables while maintaining travel modeling utility.
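
As a rough illustration of the person-level controls described above, the sketch below sums the Census P5 variables listed in this section into the two control totals. The variable groupings come from the list above. The `gq_controls` helper, the dictionary layout, and the example counts are assumptions made for illustration, not the pipeline's actual code or data.

```python
# Census P5 variables feeding each control, as listed in this section.
UNIVERSITY_VARS = ["P5_008N"]
NONINSTITUTIONAL_VARS = ["P5_009N", "P5_011N", "P5_012N"]

# Codes shared by hhgqtype (household level) and gq_type (person level).
GQ_TYPE_LABELS = {0: "regular", 1: "university", 2: "noninstitutional"}


def gq_controls(census_row):
    """Build person-level GQ controls directly from Census person counts.

    No household-level conversion is applied: each control is a plain
    sum of the relevant P5 person counts.
    """
    return {
        "pers_gq_university": sum(census_row[v] for v in UNIVERSITY_VARS),
        "pers_gq_noninstitutional": sum(
            census_row[v] for v in NONINSTITUTIONAL_VARS
        ),
    }


# Hypothetical counts for one geography (invented numbers).
example_row = {"P5_008N": 120, "P5_009N": 30, "P5_011N": 15, "P5_012N": 5}
controls = gq_controls(example_row)
```

Because the excluded institutional categories are simply never read from the input row, they cannot leak into either control.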

Data Quality Measures

Validation Checks

Key Metrics

Technology Stack

Core Software

Data Sources

Output Uses

Quality Assurance

Automated Validation

Manual Review