Controls
processing.weighting.controls
Weighting control definitions and registry.
Provides the CONTROLS registry mapping control names to
ControlTarget instances, plus
household-level and person-level concrete control classes that define
pums_expr / survey_expr recoding logic.
processing.weighting.controls.base
Base class and helpers for weighting controls.
Contains ControlLevel, ControlTarget base class, and shared
expression helpers used by the household and person control subclasses.
ControlLevel
Whether a control is at the household or person level.
HOUSEHOLD
class-attribute
instance-attribute
HOUSEHOLD = 'household'
PERSON
class-attribute
instance-attribute
PERSON = 'person'
ControlTarget
Base class for a single weighting control.
Subclasses set class attributes and override survey_expr /
pums_expr to return native Polars expressions.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Registry key. |
| level | ControlLevel | |
| description | str | Human-readable label. |
| categories | type[Enum] | |
| survey_fields | tuple[str, ...] | Canonical survey column names (metadata). |
| pums_fields | tuple[str, ...] | PUMS column names (metadata). |
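As a rough illustration of the subclassing pattern, the sketch below shows a subclass that sets the documented class attributes. It uses only the standard library. The real survey_expr / pums_expr return Polars expressions, which are left unimplemented here. ToyControl, ToyCategory, and the field names are hypothetical, and the ControlLevel and ControlTarget definitions are simplified stand-ins rather than the package's own code.

```python
from enum import Enum, IntEnum


class ControlLevel(Enum):
    """Stand-in mirroring the two documented levels."""

    HOUSEHOLD = "household"
    PERSON = "person"


class ControlTarget:
    """Simplified stand-in for the base class (attributes only)."""

    name: str
    level: ControlLevel
    description: str
    categories: type[Enum]
    survey_fields: tuple[str, ...] = ()
    pums_fields: tuple[str, ...] = ()

    def survey_expr(self):
        # The real method returns a Polars expression; omitted in this sketch.
        raise NotImplementedError

    def pums_expr(self):
        # Same as above: the PUMS-side recode is not reproduced here.
        raise NotImplementedError


class ToyCategory(IntEnum):  # hypothetical output bins
    LOW = 0
    HIGH = 1


class ToyControl(ControlTarget):  # hypothetical concrete control
    name = "toy"
    level = ControlLevel.HOUSEHOLD
    description = "Toy two-bin control"
    categories = ToyCategory
    survey_fields = ("toy_survey_col",)
    pums_fields = ("TOYPUMS",)


control = ToyControl()
```

A concrete control would additionally override both expression methods so that survey and PUMS values are recoded into the same category enum.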
processing.weighting.controls.registry
Control registry and resolution helpers.
The CONTROLS dict is the single lookup table mapping control names
to ControlTarget instances. resolve_targets and
pums_variables provide the main query API used by the rest of the
weighting pipeline.
Dynamic cross-tab creation:
register_crosstab creates a CrosstabControlTarget instance at runtime
from dimension control names, allowing cross-tabs to be defined in YAML
config without requiring Python class definitions.
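The idea of building a cross-tab at runtime from dimension names can be sketched as follows. This page does not show the actual signature of register_crosstab or the fields of CrosstabControlTarget, so every name below (ToyControl, ToyCrosstab, toy_register_crosstab) is a hypothetical stand-in, and the choice to enumerate cells as a Cartesian product is an assumption.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ToyControl:  # hypothetical single-dimension control
    name: str
    categories: tuple[int, ...]


@dataclass(frozen=True)
class ToyCrosstab:  # hypothetical cross-tab built from several dimensions
    name: str
    dimensions: tuple[str, ...]
    categories: tuple[tuple[int, ...], ...]


def toy_register_crosstab(registry: dict, name: str, dimensions: list[str]) -> ToyCrosstab:
    """Build a cross-tab from already-registered dimension names and store it."""
    dims = [registry[d] for d in dimensions]  # raises KeyError on an unknown dimension
    # One cell per combination of dimension categories (assumed layout).
    cells = tuple(product(*(d.categories for d in dims)))
    crosstab = ToyCrosstab(name=name, dimensions=tuple(dimensions), categories=cells)
    registry[name] = crosstab
    return crosstab


registry = {
    "size": ToyControl("size", (1, 2, 3)),
    "workers": ToyControl("workers", (0, 1)),
}
xt = toy_register_crosstab(registry, "size_x_workers", ["size", "workers"])
```

Because the cross-tab is derived purely from names, a YAML entry listing the dimension names would be enough to drive a call like this one.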
CONTROLS
module-attribute
CONTROLS: dict[str, ControlTarget] = {
(t.name): t
for t in [
HHTotalControl(),
HHSizeControl(),
HHIncomeControl(),
HHWorkersControl(),
HHVehiclesControl(),
HHChildrenControl(),
PersonTotalControl(),
GenderControl(),
EmploymentControl(),
CommuteModeControl(),
StudentControl(),
EducationControl(),
RaceControl(),
EthnicityControl(),
AgeControl(),
]
}
resolve_targets
resolve_targets(
targets: list[str], level: ControlLevel | None = None
) -> list[ControlTarget]
Return ControlTarget objects for targets, optionally filtered.
pums_variables
pums_variables(level: ControlLevel) -> set[str]
PUMS variable names needed for all controls at level.
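A minimal sketch of how the two query helpers might behave over a name-keyed registry. The toy registry, the control names, and the pairing of controls with PUMS columns are illustrative assumptions, as is raising KeyError for unknown names. The real functions may differ in detail.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Level(Enum):  # stand-in for ControlLevel
    HOUSEHOLD = "household"
    PERSON = "person"


@dataclass(frozen=True)
class ToyTarget:  # stand-in for ControlTarget
    name: str
    level: Level
    pums_fields: tuple[str, ...]


# Same construction pattern as CONTROLS: a dict keyed by each instance's name.
TOY_CONTROLS = {
    t.name: t
    for t in [
        ToyTarget("h_size", Level.HOUSEHOLD, ("NP",)),
        ToyTarget("h_vehicles", Level.HOUSEHOLD, ("VEH",)),
        ToyTarget("p_age", Level.PERSON, ("AGEP",)),
    ]
}


def toy_resolve_targets(targets: list[str], level: Optional[Level] = None) -> list[ToyTarget]:
    """Look up each name, then optionally keep only targets at the given level."""
    resolved = [TOY_CONTROLS[name] for name in targets]  # KeyError if a name is unknown
    if level is not None:
        resolved = [t for t in resolved if t.level is level]
    return resolved


def toy_pums_variables(level: Level) -> set[str]:
    """Union of PUMS columns needed by every registered control at the level."""
    return {f for t in TOY_CONTROLS.values() if t.level is level for f in t.pums_fields}
```

For example, `toy_resolve_targets(["h_size", "p_age"], Level.PERSON)` keeps only the person-level target, and `toy_pums_variables(Level.HOUSEHOLD)` collects the household columns.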
processing.weighting.controls.enums
Custom IntEnum categories for weighting controls.
These define the output bins for the 8 collapsing controls, where the
canonical survey granularity is reduced to fewer bins for weighting. The
5 identity controls (income, education, race, ethnicity, age) reuse
their canonical LabeledEnum definitions directly and are not duplicated here.
HHSizeCategory
Household size bins (1-9, 10+).
SIZE_1
class-attribute
instance-attribute
SIZE_1 = 1
SIZE_2
class-attribute
instance-attribute
SIZE_2 = 2
SIZE_3
class-attribute
instance-attribute
SIZE_3 = 3
SIZE_4
class-attribute
instance-attribute
SIZE_4 = 4
SIZE_5
class-attribute
instance-attribute
SIZE_5 = 5
SIZE_6
class-attribute
instance-attribute
SIZE_6 = 6
SIZE_7
class-attribute
instance-attribute
SIZE_7 = 7
SIZE_8
class-attribute
instance-attribute
SIZE_8 = 8
SIZE_9
class-attribute
instance-attribute
SIZE_9 = 9
SIZE_10_PLUS
class-attribute
instance-attribute
SIZE_10_PLUS = 10
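The top-coded bins above can be illustrated with a small helper that collapses a raw person count into HHSizeCategory. The helper name bin_household_size and its handling of invalid input are assumptions for illustration. The actual controls express this recode as Polars expressions.

```python
from enum import IntEnum


class HHSizeCategory(IntEnum):
    SIZE_1 = 1
    SIZE_2 = 2
    SIZE_3 = 3
    SIZE_4 = 4
    SIZE_5 = 5
    SIZE_6 = 6
    SIZE_7 = 7
    SIZE_8 = 8
    SIZE_9 = 9
    SIZE_10_PLUS = 10


def bin_household_size(n_persons: int) -> HHSizeCategory:
    """Collapse a raw household size into the 1-9, 10+ bins (hypothetical helper)."""
    if n_persons < 1:
        raise ValueError("household size must be at least 1")
    # Anything at or above the top code falls into the open-ended 10+ bin.
    return HHSizeCategory(min(n_persons, HHSizeCategory.SIZE_10_PLUS))
```

The same min-against-the-top-code pattern would apply to the workers, vehicles, and children bins, each with its own cap.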
HHWorkersCategory
Number of workers in household (0-4, 5+).
WORKERS_0
class-attribute
instance-attribute
WORKERS_0 = 0
WORKERS_1
class-attribute
instance-attribute
WORKERS_1 = 1
WORKERS_2
class-attribute
instance-attribute
WORKERS_2 = 2
WORKERS_3
class-attribute
instance-attribute
WORKERS_3 = 3
WORKERS_4
class-attribute
instance-attribute
WORKERS_4 = 4
WORKERS_5_PLUS
class-attribute
instance-attribute
WORKERS_5_PLUS = 5
HHVehiclesCategory
Vehicles available to household (0-5, 6+).
VEH_0
class-attribute
instance-attribute
VEH_0 = 0
VEH_1
class-attribute
instance-attribute
VEH_1 = 1
VEH_2
class-attribute
instance-attribute
VEH_2 = 2
VEH_3
class-attribute
instance-attribute
VEH_3 = 3
VEH_4
class-attribute
instance-attribute
VEH_4 = 4
VEH_5
class-attribute
instance-attribute
VEH_5 = 5
VEH_6_PLUS
class-attribute
instance-attribute
VEH_6_PLUS = 6
HHChildrenCategory
Number of children in household (0-4, 5+).
CHILDREN_0
class-attribute
instance-attribute
CHILDREN_0 = 0
CHILDREN_1
class-attribute
instance-attribute
CHILDREN_1 = 1
CHILDREN_2
class-attribute
instance-attribute
CHILDREN_2 = 2
CHILDREN_3
class-attribute
instance-attribute
CHILDREN_3 = 3
CHILDREN_4
class-attribute
instance-attribute
CHILDREN_4 = 4
CHILDREN_5_PLUS
class-attribute
instance-attribute
CHILDREN_5_PLUS = 5
GenderCategory
Gender bins for weighting (male / female).
PUMS only has binary SEX.
MALE
class-attribute
instance-attribute
MALE = 1
FEMALE
class-attribute
instance-attribute
FEMALE = 2
EmploymentCategory
Employment status for weighting (full / part / not employed).
EMPLOYED_FULL
class-attribute
instance-attribute
EMPLOYED_FULL = 1
EMPLOYED_PART
class-attribute
instance-attribute
EMPLOYED_PART = 2
NOT_EMPLOYED
class-attribute
instance-attribute
NOT_EMPLOYED = 3
CommuteModeCategory
Commute mode for weighting.
NA
class-attribute
instance-attribute
NA = 0
MOSTLY_REMOTE
class-attribute
instance-attribute
MOSTLY_REMOTE = 1
DRIVE_ALONE
class-attribute
instance-attribute
DRIVE_ALONE = 2
CARPOOL
class-attribute
instance-attribute
CARPOOL = 3
TRANSIT
class-attribute
instance-attribute
TRANSIT = 4
WALK
class-attribute
instance-attribute
WALK = 5
BIKE
class-attribute
instance-attribute
BIKE = 6
OTHER
class-attribute
instance-attribute
OTHER = 7
StudentCategory
Student status (not student / K-12 / college).
NOT_STUDENT
class-attribute
instance-attribute
NOT_STUDENT = 0
STUDENT_K12
class-attribute
instance-attribute
STUDENT_K12 = 1
STUDENT_COLLEGE
class-attribute
instance-attribute
STUDENT_COLLEGE = 2
TotalCategory
Single-category enum for h_total / p_total structural controls.
TOTAL
class-attribute
instance-attribute
TOTAL = 1
processing.weighting.controls.household
Household-level weighting controls.
Each class maps raw survey / PUMS values into coarser category ints for household-level weighting targets.
All survey_expr / pums_expr overrides implement the interface
documented in ControlTarget; individual method docstrings are omitted
for brevity (ruff noqa: D102).
HHSizeControl
Household size (1-10+).
HHIncomeControl
Household income (canonical IncomeBroad bins).
HHWorkersControl
Number of workers in household (0-5+).
HHVehiclesControl
Vehicles in household (0-6+).
HHChildrenControl
Children in household (0-5+).
HHTotalControl
Structural control: total households (incidence = 1 per HH).
processing.weighting.controls.person
Person-level weighting controls.
Each class maps raw survey / PUMS values into coarser category ints for person-level weighting targets.
All survey_expr / pums_expr overrides implement the interface
documented in ControlTarget; individual method docstrings are omitted
for brevity (ruff noqa: D102) because the base class documents the
expected behavior.
GenderControl
Gender (male / female).
EmploymentControl
Employment status (full-time / part-time / not employed).
CommuteModeControl
Commute mode (drive, carpool, transit, bike, walk, mostly_remote, other, N/A).
Survey side: Uses a combination of job_type, telework_freq, and
commute_freq to identify mostly-remote workers — those whose telework
frequency exceeds their commute frequency. For all other workers the
observed work_mode determines the category.
PUMS side: JWTRNS=11 ("Worked at home") maps to MOSTLY_REMOTE.
This is the closest analog: the PUMS question asks about the usual
mode of travel to work, so respondents who mostly work remotely select this.
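The survey-side and PUMS-side rules can be sketched in plain Python. This is a simplification under stated assumptions: frequencies are represented as days per week, the job_type input is omitted, non-workers are assumed to map to NA, and the function names are hypothetical. Only the JWTRNS=11 mapping is taken from the description above. Other PUMS codes are not covered.

```python
from enum import IntEnum


class CommuteModeCategory(IntEnum):
    NA = 0
    MOSTLY_REMOTE = 1
    DRIVE_ALONE = 2
    CARPOOL = 3
    TRANSIT = 4
    WALK = 5
    BIKE = 6
    OTHER = 7


def survey_commute_mode(
    is_worker: bool,
    telework_days: float,
    commute_days: float,
    work_mode: CommuteModeCategory,
) -> CommuteModeCategory:
    """Survey-side rule: telework exceeding commuting means mostly remote."""
    if not is_worker:
        return CommuteModeCategory.NA  # assumption: no commute for non-workers
    if telework_days > commute_days:
        return CommuteModeCategory.MOSTLY_REMOTE
    # Otherwise the observed work mode decides the category.
    return work_mode


def pums_is_mostly_remote(jwtrns: int) -> bool:
    """PUMS-side rule: JWTRNS code 11 is treated as mostly remote."""
    return jwtrns == 11
```

Note that a tie between telework and commute frequency is not treated as mostly remote, since the rule requires telework to exceed commuting.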
StudentControl
Student status (K-12 / college / not a student).
Classification priority
- Explicit non-student (student == NONSTUDENT) → NOT_STUDENT
- Known K-12 school type (preschool through high school) → K12
- Known college school type → COLLEGE
- Childcare / at-home (not school in the Census sense) → NOT_STUDENT
- Age-based fallback when both student and school_type are missing: school-age children (5-17) → K12, everyone else → NOT_STUDENT
- Active student with missing school_type → COLLEGE (adult default)
The student field is only collected for persons age 16+ in the
survey instrument; younger children have student = 995 (MISSING)
but typically have a valid school_type, so school_type is checked
before discarding missing students.
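The classification priority can be read as a chain of early returns. The sketch below encodes it in plain Python under simplifying assumptions: student and school_type are represented as short strings with None for missing (the survey uses coded values such as 995), and classify_student is a hypothetical helper rather than the control's actual Polars expression.

```python
from enum import IntEnum
from typing import Optional


class StudentCategory(IntEnum):
    NOT_STUDENT = 0
    STUDENT_K12 = 1
    STUDENT_COLLEGE = 2


def classify_student(
    student: Optional[str],      # "nonstudent", "student", or None if missing
    school_type: Optional[str],  # "k12", "college", "childcare", or None if missing
    age: int,
) -> StudentCategory:
    """Apply the documented rules in priority order (illustrative encoding)."""
    if student == "nonstudent":
        return StudentCategory.NOT_STUDENT
    if school_type == "k12":
        return StudentCategory.STUDENT_K12
    if school_type == "college":
        return StudentCategory.STUDENT_COLLEGE
    if school_type == "childcare":
        return StudentCategory.NOT_STUDENT
    if student is None:
        # Both fields missing: fall back to age.
        if 5 <= age <= 17:
            return StudentCategory.STUDENT_K12
        return StudentCategory.NOT_STUDENT
    # Active student without a school type: adult default.
    return StudentCategory.STUDENT_COLLEGE
```

Checking school_type before the missing-student fallback is what lets a child with a missing student value but a valid K-12 school type land in the K-12 bin.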
EducationControl
Education attainment (canonical Education enum).
RaceControl
Race (canonical Race enum).
EthnicityControl
Hispanic/Latino ethnicity (canonical Ethnicity enum).
AgeControl
Age (canonical AgeCategory breakpoints).
PersonTotalControl
Structural control: total persons (incidence = 1 per person).
When aggregated to the seed table (one row per household), the incidence column becomes the count of persons in the household — effectively the non-top-coded household size.
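To make the aggregation concrete, here is a small standard-library sketch. The toy person table and variable names are made up for illustration. Each person carries an incidence of 1, and summing per household yields the seed-table column, which is not capped the way the household size bins are.

```python
from collections import Counter

# Toy person table as (household id, person id) pairs; household 2 has 12 members.
persons = [(1, 1), (1, 2)] + [(2, i) for i in range(1, 13)]

# Summing an incidence of 1 per person gives the per-household person count.
p_total = Counter(hh_id for hh_id, _ in persons)

# For comparison, the top-coded household size bin caps at 10 (the 10+ bin).
size_bin = {hh_id: min(n, 10) for hh_id, n in p_total.items()}
```

Here household 2 contributes 12 to the person total while its household size bin is 10, which is the distinction the note above draws.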