Roadway

The roadway module contains submodules that define and extend the links, nodes, and shapes dataframe objects held within a RoadwayNetwork object, as well as other classes and methods that support and extend the RoadwayNetwork class.

Roadway Network Objects

Submodules that define and extend the links, nodes, and shapes dataframe objects held within a RoadwayNetwork object. They include classes that define:

  • dataframe schemas to be used for dataframe validation using pandera
  • methods which extend the dataframes

Tables

Datamodels for Roadway Network Tables.

This module contains the datamodels used to validate the format and types of Roadway Network tables.

Includes:

  • RoadLinksTable
  • RoadNodesTable
  • RoadShapesTable
  • ExplodedScopedLinkPropertyTable

network_wrangler.models.roadway.tables.ExplodedScopedLinkPropertyTable

Bases: DataFrameModel

Datamodel used to validate an exploded links_df by scope.

Source code in network_wrangler/models/roadway/tables.py
class ExplodedScopedLinkPropertyTable(DataFrameModel):
    """Datamodel used to validate an exploded links_df by scope."""

    model_link_id: Series[int]
    category: Series[Any]
    timespan: Series[list[str]]
    start_time: Series[dt.datetime]
    end_time: Series[dt.datetime]
    scoped: Series[Any] = Field(default=None, nullable=True)

    class Config:
        """Config for ExplodedScopedLinkPropertySchema."""

        name = "ExplodedScopedLinkPropertySchema"
        coerce = True
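
To make the "exploded by scope" shape concrete, here is a minimal plain-Python sketch that turns each item of a scoped property into its own row with the columns listed in the datamodel above. It is an illustration only, not network_wrangler's implementation: the helper name `explode_scoped_property`, the use of plain dicts instead of dataframes, and the choice to store the item's `value` in the `scoped` column are all assumptions.

```python
from datetime import datetime


def explode_scoped_property(links: list[dict], prop: str) -> list[dict]:
    """Return one row per (link, scoped item) for the property `sc_<prop>`.

    Hypothetical helper; column names follow ExplodedScopedLinkPropertyTable.
    """
    rows = []
    for link in links:
        # A missing or null scoped property contributes no rows.
        for item in link.get(f"sc_{prop}") or []:
            start, end = item["timespan"]
            rows.append(
                {
                    "model_link_id": link["model_link_id"],
                    "category": item.get("category", "any"),
                    "timespan": [start, end],
                    # strptime cannot parse "24:00", so this sketch assumes
                    # HH:MM times strictly before midnight.
                    "start_time": datetime.strptime(start, "%H:%M"),
                    "end_time": datetime.strptime(end, "%H:%M"),
                    # Assumption: the scoped value lands in the `scoped` column.
                    "scoped": item["value"],
                }
            )
    return rows


links = [
    {
        "model_link_id": 1,
        "sc_price": [
            {"timespan": ["06:00", "09:00"], "value": 3},
            {"timespan": ["16:00", "19:00"], "category": "sov", "value": 2.5},
        ],
    },
    {"model_link_id": 2, "sc_price": None},
]
rows = explode_scoped_property(links, "price")
```

Each scoped item on link 1 becomes its own row, while link 2, which has no scoped prices, contributes none.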

network_wrangler.models.roadway.tables.RoadLinksTable

Bases: DataFrameModel

Datamodel used to validate if links_df is of correct format and types.

Attributes:

  • model_link_id (int) –

    Unique identifier for the link.

  • A (int) –

    model_node_id of the link’s start node. Foreign key to road_nodes.

  • B (int) –

    model_node_id of the link’s end node. Foreign key to road_nodes.

  • geometry (GeoSeries) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Simple A→B geometry of the link.

  • name (str) –

    Name of the link.

  • rail_only (bool) –

    If the link is only for rail. Default is False.

  • bus_only (bool) –

    If the link is only for buses. Default is False.

  • drive_access (bool) –

    If the link allows driving. Default is True.

  • bike_access (bool) –

    If the link allows biking. Default is True.

  • walk_access (bool) –

    If the link allows walking. Default is True.

  • truck_access (bool) –

    If the link allows trucks. Default is True.

  • distance (float) –

    Length of the link.

  • roadway (str) –

    Type of roadway per OSM definitions. Default is “road”.

  • projects (str) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Comma-separated list of project names applied to the link. Default is “”.

  • managed (int) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Indicator for the type of managed lane facility. Values can be:

    • 0 indicating no managed lane on this link.
    • 1 indicates that there is a managed lane on the link (std network) or that the link is a managed lane (model network).
    • -1 indicates that there is a parallel managed lane derived from this link (model network).
  • shape_id (str) –

    Identifier referencing the primary key of the shapes table. Default is None.

  • lanes (int) –

    Default number of lanes on the link. Default is 1.

  • sc_lanes (Optional[list[dict]]) –

    List of scoped link values for the number of lanes. Default is None. Example: [{'timespan': ['12:00', '15:00'], 'value': 3}, {'timespan': ['15:00', '19:00'], 'value': 2}].

  • price (float) –

    Default price to use the link. Default is 0.

  • sc_price (Optional[list[dict]]) –

    List of scoped link values for the price. Default is None. Example: [{'timespan': ['15:00', '19:00'], 'category': 'sov', 'value': 2.5}].

  • ref (Optional[str]) –

    Reference numbers for link referring to a route or exit number per the OSM definition. Default is None.

  • access (Optional[Any]) –

    User-defined method to note access restrictions for the link. Default is None.

  • ML_projects (Optional[str]) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Comma-separated list of project names applied to the managed lane. Default is “”.

  • ML_lanes (Optional[int]) –

    Default number of lanes on the managed lane. Default is None.

  • ML_price (Optional[float]) –

    Default price to use the managed lane. Default is 0.

  • ML_access (Optional[Any]) –

    User-defined method to note access restrictions for the managed lane. Default is None.

  • ML_access_point (Optional[bool]) –

    If the link is an access point for the managed lane. Default is False.

  • ML_egress_point (Optional[bool]) –

    If the link is an egress point for the managed lane. Default is False.

  • sc_ML_lanes (Optional[list[dict]]) –

    List of scoped link values for the number of lanes on the managed lane. Default is None.

  • sc_ML_price (Optional[list[dict]]) –

    List of scoped link values for the price of the managed lane. Default is None.

  • sc_ML_access (Optional[list[dict]]) –

    List of scoped link values for the access restrictions of the managed lane. Default is None.

  • ML_geometry (Optional[GeoSeries]) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Simple A→B geometry of the managed lane. Default is None.

  • ML_shape_id (Optional[str]) –

    Identifier referencing the primary key of the shapes table for the managed lane. Default is None.

  • osm_link_id (Optional[str]) –

    Identifier referencing the OSM link ID. Default is “”.

  • GP_A (Optional[int]) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Identifier referencing the primary key of the associated general purpose link start node for a managed lane link in a model network. Default is None.

  • GP_B (Optional[int]) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Identifier referencing the primary key of the associated general purpose link end node for a managed lane link in a model network. Default is None.

User Defined Properties

Additional properties may be defined. If a property name overlaps with an OpenStreetMap key, it is assumed to have the same definition as in OpenStreetMap.

Properties for parallel managed lanes

Properties for parallel managed lanes are prefixed with ML_. (Almost) any property, including an ad-hoc one, can be made to apply to a parallel managed lane by adding the prefix ML_, e.g., ML_lanes.

Warning

The following properties should not be assigned an ML_ prefix by the user because they are assigned one within networkwrangler:

  • name
  • A
  • B
  • model_link_id
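
The prefix convention can be sketched as follows. This is a hypothetical illustration using a plain dict for a link record; the helper names `split_managed_lane_properties` and `reserved_ml_keys` are not part of network_wrangler, and the handling of `sc_ML_` alongside `ML_` is an assumption based on the property names documented above.

```python
# Properties that the user should not prefix with ML_ (from the warning above).
RESERVED = {"name", "A", "B", "model_link_id"}


def split_managed_lane_properties(link: dict) -> tuple[dict, dict]:
    """Split a link record into general-purpose and managed-lane properties.

    Managed-lane keys are returned with the ML_ prefix stripped, so that
    ML_lanes becomes lanes and sc_ML_price becomes sc_price.
    """
    general, managed = {}, {}
    for key, value in link.items():
        if key.startswith("sc_ML_"):
            managed["sc_" + key[len("sc_ML_"):]] = value
        elif key.startswith("ML_"):
            managed[key[len("ML_"):]] = value
        else:
            general[key] = value
    return general, managed


def reserved_ml_keys(link: dict) -> list[str]:
    """List user-supplied ML_ keys that collide with the reserved properties."""
    return sorted(
        key for key in link if key.startswith("ML_") and key[len("ML_"):] in RESERVED
    )


link = {
    "model_link_id": 1,
    "lanes": 3,
    "ML_lanes": 1,
    "ML_price": 2.0,
    "sc_ML_price": [{"timespan": ["06:00", "09:00"], "value": 4}],
}
general, managed = split_managed_lane_properties(link)
flagged = reserved_ml_keys({"ML_name": "express", "ML_lanes": 1})
```

Here `flagged` contains only `ML_name`, since `name` is one of the properties the warning reserves.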

Time- or category-dependent properties

The following properties can be made time-dependent, category-dependent, or both by adding the sc_ prefix. The “plain” property without the prefix becomes the default when no scoped property applies.

| Property | # of Lanes | Price |
| --- | --- | --- |
| Default value | lanes | price |
| Time- and/or category-dependent value | sc_lanes | sc_price |
| Default value for managed lane | ML_lanes | ML_price |
| Time- and/or category-dependent value for managed lane | sc_ML_lanes | sc_ML_price |
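
The fallback from a scoped value to the plain property can be sketched in a few lines. This is a simplified illustration, not the library's lookup logic: the helper `effective_value` is hypothetical, links are represented as plain dicts, timespans are assumed not to cross midnight, and an item without a `category` is assumed to apply to every category.

```python
def _minutes(hhmm: str) -> int:
    """Convert 'HH:MM' or 'HH:MM:SS' to minutes after midnight (seconds ignored)."""
    hours, minutes = hhmm.split(":")[:2]
    return int(hours) * 60 + int(minutes)


def effective_value(link: dict, prop: str, time: str, category: str = "any"):
    """Return the scoped value applying at `time` for `category`, else the default.

    Hypothetical helper for illustration only.
    """
    query = _minutes(time)
    for item in link.get(f"sc_{prop}") or []:
        start, end = (_minutes(t) for t in item.get("timespan", ["00:00", "24:00"]))
        in_time = start <= query < end
        in_category = item.get("category", "any") in ("any", category)
        if in_time and in_category:
            return item["value"]
    # No scoped item applies: fall back to the plain, unprefixed property.
    return link[prop]


link = {
    "price": 0,
    "sc_price": [
        {"timespan": ["06:00", "09:00"], "value": 3},
        {"timespan": ["16:00", "19:00"], "category": "sov", "value": 2.5},
    ],
}
morning = effective_value(link, "price", "07:30")
midday = effective_value(link, "price", "12:00")
evening_truck = effective_value(link, "price", "17:00", category="truck")
evening_sov = effective_value(link, "price", "17:00", category="sov")
```

In this sketch the evening price applies only to `sov`, so a truck at 17:00 falls back to the default `price`.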

previous format for scoped properties

Some earlier tooling was built around a previous method for serializing scoped properties. To retain compatibility with this format:

  • load_roadway_from_dir(), read_links(), and associated functions will “sniff” the network for the old format and apply the converter function translate_links_df_v0_to_v1()
  • write_links() has a boolean parameter, convert_complex_properties_to_single_field, which can also be set from write_roadway() as convert_complex_link_properties_to_single_field.

Defining time-dependent properties

Time-dependent properties are defined as a list of dictionaries with timespans and values.

  • Timespans must be defined as a list of two HH:MM or HH:MM:SS strings using a 24-hour clock: ['06:00', '09:00'].
  • Timespans must not intersect.
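
The two rules above can be checked with a short standalone sketch. It is not the validator network_wrangler uses; the helper names are hypothetical, each timespan is assumed to start before it ends (no spans crossing midnight), and spans that only touch at an endpoint are treated as non-intersecting.

```python
import re
from itertools import combinations

# HH:MM with optional :SS, 24-hour clock.
TIME_RE = re.compile(r"(\d{1,2}):(\d{2})(?::(\d{2}))?")


def to_seconds(time_str: str) -> int:
    """Parse 'HH:MM' or 'HH:MM:SS' into seconds after midnight ('24:00' allowed)."""
    match = TIME_RE.fullmatch(time_str)
    if match is None:
        raise ValueError(f"Not an HH:MM or HH:MM:SS string: {time_str!r}")
    hours, minutes = int(match.group(1)), int(match.group(2))
    seconds = int(match.group(3) or 0)
    total = hours * 3600 + minutes * 60 + seconds
    if minutes >= 60 or seconds >= 60 or total > 24 * 3600:
        raise ValueError(f"Time out of range: {time_str!r}")
    return total


def timespans_intersect(a: list[str], b: list[str]) -> bool:
    """True if two [start, end] timespans share any stretch of time."""
    a_start, a_end = (to_seconds(t) for t in a)
    b_start, b_end = (to_seconds(t) for t in b)
    return a_start < b_end and b_start < a_end


def find_intersections(timespans: list[list[str]]) -> list[tuple]:
    """Return every pair of timespans that intersect."""
    return [(a, b) for a, b in combinations(timespans, 2) if timespans_intersect(a, b)]


peak = [["06:00", "09:00"], ["16:00", "19:00"]]
overlapping = [["06:00", "09:00"], ["08:30", "10:00"]]
adjacent = [["06:00", "09:00"], ["09:00", "10:00"]]
```

`find_intersections(peak)` returns an empty list, while `find_intersections(overlapping)` returns the single offending pair.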

Time-dependent property

$3 peak-period pricing

# default price
'price': 0,
# price scoped by time of day
'sc_price': [
    {
        'timespan': ['06:00', '09:00'],
        'value': 3
    },
    {
        'timespan': ['16:00', '19:00'],
        'value': 3
    }
]

Defining time- and category-dependent properties

Properties that depend on both time and category are defined as a list of dictionaries, each with a value, a category, and a timespan.

time- and category-dependent property

A pricing strategy which only applies in peak period for trucks and sovs:

# default price
'price': 0,
# price scoped by time of day and category
'sc_price': [
    {
        'timespan': ['06:00', '09:00'],
        'category': ('sov', 'truck'),
        'value': 3
    },
    {
        'timespan': ['16:00', '19:00'],
        'category': ('sov', 'truck'),
        'value': 3
    }
]

Tip

There is no limit on other, user-defined properties being listed as time-dependent or time- and category-dependent.

User-defined variable by time of day

Define a variable access to represent which categories can access the network and vary it by time of day.

# access
{
    # default value for access
    'access': 'any',
    # scoped value for access
    'sc_access': [
        {
            'timespan': ['06:00', '09:00'],
            'value': 'no-trucks'
        },
        {
            'timespan': ['16:00', '19:00'],
            'value': ('hov2', 'hov3', 'trucks')
        }
    ]
}

Source code in network_wrangler/models/roadway/tables.py
class RoadLinksTable(DataFrameModel):
    """Datamodel used to validate if links_df is of correct format and types.

    Attributes:
        model_link_id (int): Unique identifier for the link.
        A (int): `model_node_id` of the link's start node. Foreign key to `road_nodes`.
        B (int): `model_node_id` of the link's end node. Foreign key to `road_nodes`.
        geometry (GeoSeries): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Simple A-->B geometry of the link.
        name (str): Name of the link.
        rail_only (bool): If the link is only for rail. Default is False.
        bus_only (bool): If the link is only for buses. Default is False.
        drive_access (bool): If the link allows driving. Default is True.
        bike_access (bool): If the link allows biking. Default is True.
        walk_access (bool): If the link allows walking. Default is True.
        truck_access (bool): If the link allows trucks. Default is True.
        distance (float): Length of the link.
        roadway (str): Type of roadway per [OSM definitions](https://wiki.openstreetmap.org/wiki/Key:highway#Roads).
            Default is "road".
        projects (str): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Comma-separated list of project names applied to the link. Default is "".
        managed (int): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Indicator for the type of managed lane facility. Values can be:

            - 0 indicating no managed lane on this link.
            - 1 indicates that there is a managed lane on the link (std network) or that the link is a
                managed lane (model network).
            - -1 indicates that there is a parallel managed lane derived from this link (model network).
        shape_id (str): Identifier referencing the primary key of the shapes table. Default is None.
        lanes (int): Default number of lanes on the link. Default is 1.
        sc_lanes (Optional[list[dict]]): List of scoped link values for the number of lanes. Default is None.
            Example: `[{'timespan': ['12:00', '15:00'], 'value': 3}, {'timespan': ['15:00', '19:00'], 'value': 2}]`.

        price (float): Default price to use the link. Default is 0.
        sc_price (Optional[list[dict]]): List of scoped link values for the price. Default is None.
            Example: `[{'timespan': ['15:00', '19:00'], 'category': 'sov', 'value': 2.5}]`.
        ref (Optional[str]): Reference numbers for link referring to a route or exit number per the
            [OSM definition](https://wiki.openstreetmap.org/wiki/Key:ref). Default is None.
        access (Optional[Any]): User-defined method to note access restrictions for the link. Default is None.
        ML_projects (Optional[str]): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Comma-separated list of project names applied to the managed lane. Default is "".
        ML_lanes (Optional[int]): Default number of lanes on the managed lane. Default is None.
        ML_price (Optional[float]): Default price to use the managed lane. Default is 0.
        ML_access (Optional[Any]): User-defined method to note access restrictions for the managed lane. Default is None.
        ML_access_point (Optional[bool]): If the link is an access point for the managed lane. Default is False.
        ML_egress_point (Optional[bool]): If the link is an egress point for the managed lane. Default is False.
        sc_ML_lanes (Optional[list[dict]]): List of scoped link values for the number of lanes on the managed lane.
            Default is None.
        sc_ML_price (Optional[list[dict]]): List of scoped link values for the price of the managed lane. Default is None.
        sc_ML_access (Optional[list[dict]]): List of scoped link values for the access restrictions of the managed lane.
            Default is None.
        ML_geometry (Optional[GeoSeries]): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Simple A-->B geometry of the managed lane. Default is None.
        ML_shape_id (Optional[str]): Identifier referencing the primary key of the shapes table for the managed lane.
            Default is None.
        osm_link_id (Optional[str]): Identifier referencing the OSM link ID. Default is "".
        GP_A (Optional[int]): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Identifier referencing the primary key of the associated general purpose link start node for
            a managed lane link in a model network. Default is None.
        GP_B (Optional[int]): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Identifier referencing the primary key of the associated general purpose link end node for
            a managed lane link in a model network. Default is None.

    !!! tip "User Defined Properties"

        Additional properties may be defined. If a property name overlaps with an OpenStreetMap key,
        it is assumed to have the same definition as in OpenStreetMap.

    ### Properties for parallel managed lanes

    Properties for parallel managed lanes are prefixed with `ML_`. (Almost) any property,
    including an ad-hoc one, can be made to apply to a parallel managed lane by applying
    the prefix `ML_`, e.g., `ML_lanes`.

    !!! warning

        The following properties should **not** be assigned an `ML_` prefix by the user
        because they are assigned one within networkwrangler:

        - `name`
        - `A`
        - `B`
        - `model_link_id`

    ### Time- or category-dependent properties

    The following properties can be made time-dependent, category-dependent, or both by adding the `sc_` prefix.
    The "plain" property without the prefix becomes the default when no scoped property applies.

    | Property | # of Lanes | Price |
    | -----------| ----------------- | ---------------- |
    | Default value | `lanes` | `price` |
    | Time- and/or category-dependent value | `sc_lanes` | `sc_price` |
    | Default value for managed lane | `ML_lanes` | `ML_price` |
    | Time- and/or category-dependent value for managed lane | `sc_ML_lanes` | `sc_ML_price` |


    !!! note "previous format for scoped properties"

        Some earlier tooling was built around a previous method for serializing scoped properties. To retain compatibility with this format:

        - `load_roadway_from_dir()`, `read_links()`, and associated functions will "sniff" the network for the old format and apply the converter function `translate_links_df_v0_to_v1()`
        - `write_links()` has a boolean parameter, `convert_complex_properties_to_single_field`, which can also be set from `write_roadway()` as `convert_complex_link_properties_to_single_field`.

    #### Defining time-dependent properties

    Time-dependent properties are defined as a list of dictionaries with timespans and values.

    - Timespans must be defined as a list of two HH:MM or HH:MM:SS strings using a 24-hour clock: `['06:00', '09:00']`.
    - Timespans must not intersect.

    !!! example  "Time-dependent property"

        $3 peak-period pricing

        ```python
        # default price
        'price': 0,
        # price scoped by time of day
        'sc_price': [
            {
                'timespan': ['06:00', '09:00'],
                'value': 3
            },
            {
                'timespan': ['16:00', '19:00'],
                'value': 3
            }
        ]
        ```

    #### Defining time- and category-dependent properties

    Properties that depend on both time and category are defined as a list of dictionaries, each with a value, a category, and a timespan.

    !!! example "time- and category-dependent property"

        A pricing strategy which only applies in peak period for trucks and sovs:

        ```python
        # default price
        'price': 0,
        # price scoped by time of day and category
        'sc_price': [
            {
                'timespan': ['06:00', '09:00'],
                'category': ('sov', 'truck'),
                'value': 3
            },
            {
                'timespan': ['16:00', '19:00'],
                'category': ('sov', 'truck'),
                'value': 3
            }
        ]
        ```

    !!! tip

        There is no limit on other, user-defined properties being listed as time-dependent or time- and category-dependent.

    !!! example "User-defined variable by time of day"

        Define a variable `access` to represent which categories can access the network and vary it by time of day.

        ```python
        # access
        {
            # default value for access
            'access': 'any',
            # scoped value for access
            'sc_access': [
                {
                    'timespan': ['06:00', '09:00'],
                    'value': 'no-trucks'
                },
                {
                    'timespan': ['16:00', '19:00'],
                    'value': ('hov2', 'hov3', 'trucks')
                }
            ]
        }
        ```
    """

    model_link_id: Series[int] = Field(coerce=True, unique=True)
    model_link_id_idx: Optional[Series[int]] = Field(coerce=True, unique=True)
    A: Series[int] = Field(nullable=False, coerce=True)
    B: Series[int] = Field(nullable=False, coerce=True)
    geometry: GeoSeries = Field(nullable=False)
    name: Series[str] = Field(nullable=False, default="unknown")
    rail_only: Series[bool] = Field(coerce=True, nullable=False, default=False)
    bus_only: Series[bool] = Field(coerce=True, nullable=False, default=False)
    drive_access: Series[bool] = Field(coerce=True, nullable=False, default=True)
    bike_access: Series[bool] = Field(coerce=True, nullable=False, default=True)
    walk_access: Series[bool] = Field(coerce=True, nullable=False, default=True)
    distance: Series[float] = Field(coerce=True, nullable=False)

    roadway: Series[str] = Field(nullable=False, default="road")
    projects: Series[str] = Field(coerce=True, default="")
    managed: Series[int] = Field(coerce=True, nullable=False, default=0)

    shape_id: Series[str] = Field(coerce=True, nullable=True)
    lanes: Series[int] = Field(coerce=True, nullable=False)
    price: Series[float] = Field(coerce=True, nullable=False, default=0)

    # Optional Fields
    ref: Optional[Series[str]] = Field(coerce=True, nullable=True, default=None)
    access: Optional[Series[Any]] = Field(coerce=True, nullable=True, default=None)

    sc_lanes: Optional[Series[object]] = Field(coerce=True, nullable=True, default=None)
    sc_price: Optional[Series[object]] = Field(coerce=True, nullable=True, default=None)

    ML_projects: Series[str] = Field(coerce=True, default="")
    ML_lanes: Optional[Series[Int64]] = Field(coerce=True, nullable=True, default=None)
    ML_price: Optional[Series[float]] = Field(coerce=True, nullable=True, default=0)
    ML_access: Optional[Series[Any]] = Field(coerce=True, nullable=True, default=True)
    ML_access_point: Optional[Series[bool]] = Field(
        coerce=True,
        default=False,
    )
    ML_egress_point: Optional[Series[bool]] = Field(
        coerce=True,
        default=False,
    )
    sc_ML_lanes: Optional[Series[object]] = Field(
        coerce=True,
        nullable=True,
        default=None,
    )
    sc_ML_price: Optional[Series[object]] = Field(
        coerce=True,
        nullable=True,
        default=None,
    )
    sc_ML_access: Optional[Series[object]] = Field(
        coerce=True,
        nullable=True,
        default=None,
    )

    ML_geometry: Optional[GeoSeries] = Field(nullable=True, coerce=True, default=None)
    ML_shape_id: Optional[Series[str]] = Field(nullable=True, coerce=True, default=None)

    truck_access: Optional[Series[bool]] = Field(coerce=True, nullable=True, default=True)
    osm_link_id: Series[str] = Field(coerce=True, nullable=True, default="")
    # todo this should be List[dict] but ranch output something else so had to have it be Any.
    locationReferences: Optional[Series[Any]] = Field(
        coerce=True,
        nullable=True,
        default="",
    )

    GP_A: Optional[Series[Int64]] = Field(coerce=True, nullable=True, default=None)
    GP_B: Optional[Series[Int64]] = Field(coerce=True, nullable=True, default=None)

    class Config:
        """Config for RoadLinksTable."""

        add_missing_columns = True
        coerce = True
        unique: ClassVar[list[str]] = ["A", "B"]

    @pa.check("sc_*", regex=True, element_wise=True)
    def check_scoped_fields(cls, scoped_value: Series) -> Series[bool]:
        """Checks that all fields starting with 'sc_' or 'sc_ML_' are valid ScopedLinkValueList.

        Custom check to validate fields starting with 'sc_' or 'sc_ML_'
        against a ScopedLinkValueItem model, handling both mandatory and optional fields.
        """
        if scoped_value is None or (not isinstance(scoped_value, list) and pd.isna(scoped_value)):
            return True
        return validate_pyd(scoped_value, ScopedLinkValueList)

network_wrangler.models.roadway.tables.RoadLinksTable.check_scoped_fields

check_scoped_fields(scoped_value)

Checks that all fields starting with ‘sc_’ or ‘sc_ML_’ are valid ScopedLinkValueList.

Custom check to validate fields starting with ‘sc_’ or ‘sc_ML_’ against a ScopedLinkValueItem model, handling both mandatory and optional fields.

Source code in network_wrangler/models/roadway/tables.py
@pa.check("sc_*", regex=True, element_wise=True)
def check_scoped_fields(cls, scoped_value: Series) -> Series[bool]:
    """Checks that all fields starting with 'sc_' or 'sc_ML_' are valid ScopedLinkValueList.

    Custom check to validate fields starting with 'sc_' or 'sc_ML_'
    against a ScopedLinkValueItem model, handling both mandatory and optional fields.
    """
    if scoped_value is None or (not isinstance(scoped_value, list) and pd.isna(scoped_value)):
        return True
    return validate_pyd(scoped_value, ScopedLinkValueList)
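
The null-handling in this check can be illustrated without pandera or pydantic. The sketch below is a rough stand-in, assuming plain Python values in each cell: missing cells (None or NaN) pass, and any other cell must be a list of dicts that each carry a `value` and at least one of `category` or `timespan` (mirroring the `require_any_of` rule on ScopedLinkValueItem). The helper names are hypothetical and the real validation delegates to `validate_pyd` and `ScopedLinkValueList`.

```python
import math


def is_missing(cell) -> bool:
    """True for None or a float NaN, the two 'empty cell' cases handled here."""
    return cell is None or (isinstance(cell, float) and math.isnan(cell))


def scoped_cell_is_valid(cell) -> bool:
    """Simplified stand-in for the element-wise check on sc_* columns."""
    if is_missing(cell):
        # Optional scoped fields may be left empty.
        return True
    if not isinstance(cell, list):
        return False
    return all(
        isinstance(item, dict)
        and "value" in item
        and ("category" in item or "timespan" in item)
        for item in cell
    )


valid_cell = [{"timespan": ["06:00", "09:00"], "value": 3}]
unscoped_cell = [{"value": 3}]  # neither category nor timespan given
```

Checking for a list before testing for NaN matters in the original as well, since a list-valued cell cannot be passed to a scalar null test.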

network_wrangler.models.roadway.tables.RoadNodesTable

Bases: DataFrameModel

Datamodel used to validate if nodes_df is of correct format and types.

Must have a record for each node used by the links table and by the transit shapes, stop_times, and stops tables.

Attributes:

  • model_node_id (int) –

    Unique identifier for the node.

  • osm_node_id (Optional[str]) –

    Reference to the OpenStreetMap node ID. Used for querying. Not guaranteed to be unique.

  • X (float) –

    Longitude of the node in WGS84. Must be in the range of -180 to 180.

  • Y (float) –

    Latitude of the node in WGS84. Must be in the range of -90 to 90.

  • geometry (GeoSeries) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited.
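
Two of the constraints stated above, that every node used by a link has a record and that coordinates fall within WGS84 bounds, can be sketched with plain dicts. The helper names `missing_node_ids` and `out_of_range_nodes` are hypothetical and are not part of network_wrangler.

```python
def missing_node_ids(links: list[dict], nodes: list[dict]) -> set[int]:
    """Return node ids referenced by link A/B fields that have no node record."""
    known = {node["model_node_id"] for node in nodes}
    used = {link[end] for link in links for end in ("A", "B")}
    return used - known


def out_of_range_nodes(nodes: list[dict]) -> list[int]:
    """Return ids of nodes whose X (longitude) or Y (latitude) is out of bounds."""
    bad = []
    for node in nodes:
        lon_ok = -180 <= node["X"] <= 180
        lat_ok = -90 <= node["Y"] <= 90
        if not (lon_ok and lat_ok):
            bad.append(node["model_node_id"])
    return bad


nodes = [
    {"model_node_id": 1, "X": -122.42, "Y": 37.77},
    # X and Y swapped: a latitude of -122.42 is out of range.
    {"model_node_id": 2, "X": 37.77, "Y": -122.42},
]
links = [{"model_link_id": 10, "A": 1, "B": 2}, {"model_link_id": 11, "A": 2, "B": 3}]
```

With this data, node 3 is reported as missing and node 2 is flagged for its swapped coordinates.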

Source code in network_wrangler/models/roadway/tables.py
class RoadNodesTable(DataFrameModel):
    """Datamodel used to validate if nodes_df is of correct format and types.

    Must have a record for each node used by the `links` table and by the transit `shapes`, `stop_times`, and `stops` tables.

    Attributes:
        model_node_id (int): Unique identifier for the node.
        osm_node_id (Optional[str]): Reference to the OpenStreetMap node ID. Used for querying. Not guaranteed to be unique.
        X (float): Longitude of the node in WGS84. Must be in the range of -180 to 180.
        Y (float): Latitude of the node in WGS84. Must be in the range of -90 to 90.
        geometry (GeoSeries): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
    """

    model_node_id: Series[int] = Field(coerce=True, unique=True, nullable=False)
    model_node_idx: Optional[Series[int]] = Field(coerce=True, unique=True, nullable=False)
    X: Series[float] = Field(coerce=True, nullable=False)
    Y: Series[float] = Field(coerce=True, nullable=False)
    geometry: GeoSeries

    # optional fields
    osm_node_id: Series[str] = Field(
        coerce=True,
        nullable=True,
        default="",
    )
    projects: Series[str] = Field(coerce=True, default="")
    inboundReferenceIds: Optional[Series[list[str]]] = Field(coerce=True, nullable=True)
    outboundReferenceIds: Optional[Series[list[str]]] = Field(coerce=True, nullable=True)

    class Config:
        """Config for RoadNodesTable."""

        add_missing_columns = True
        coerce = True
        _pk: ClassVar[TablePrimaryKeys] = ["model_node_id"]

network_wrangler.models.roadway.tables.RoadShapesTable

Bases: DataFrameModel

Datamodel used to validate if shapes_df is of correct format and types.

Should have a record for each shape_id referenced in links table.

Attributes:

  • shape_id (str) –

    Unique identifier for the shape.

  • geometry (GeoSeries) –

    Warning: this attribute is controlled by wrangler and should not be explicitly user-edited. Geometry of the shape.

  • ref_shape_id (Optional[str]) –

    Reference to another shape_id that it may have been created from. Default is None.
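
The expectation that every `shape_id` referenced in the links table has a shape record can be checked with a set difference. This is a minimal sketch over plain dicts; `unmatched_shape_ids` is a hypothetical helper, not a network_wrangler function.

```python
def unmatched_shape_ids(links: list[dict], shapes: list[dict]) -> set[str]:
    """Return shape_ids referenced by links that are absent from the shapes table."""
    known = {shape["shape_id"] for shape in shapes}
    # shape_id is nullable on links, so skip links that do not reference a shape.
    referenced = {
        link["shape_id"] for link in links if link.get("shape_id") is not None
    }
    return referenced - known


shapes = [{"shape_id": "s1"}, {"shape_id": "s2"}]
links = [
    {"model_link_id": 1, "shape_id": "s1"},
    {"model_link_id": 2, "shape_id": None},
    {"model_link_id": 3, "shape_id": "s9"},
]
unmatched = unmatched_shape_ids(links, shapes)
```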

Source code in network_wrangler/models/roadway/tables.py
class RoadShapesTable(DataFrameModel):
    """Datamodel used to validate if shapes_df is of correct format and types.

    Should have a record for each `shape_id` referenced in `links` table.

    Attributes:
        shape_id (str): Unique identifier for the shape.
        geometry (GeoSeries): **Warning**: this attribute is controlled by wrangler and should not be explicitly user-edited.
            Geometry of the shape.
        ref_shape_id (Optional[str]): Reference to another `shape_id` that it may
            have been created from. Default is None.
    """

    shape_id: Series[str] = Field(unique=True)
    shape_id_idx: Optional[Series[int]] = Field(unique=True)

    geometry: GeoSeries = Field()
    ref_shape_id: Optional[Series] = Field(nullable=True)

    class Config:
        """Config for RoadShapesTable."""

        coerce = True
        _pk: ClassVar[TablePrimaryKeys] = ["shape_id"]

Complex roadway types defined using Pydantic models to facilitate validation.

network_wrangler.models.roadway.types.LocationReferences module-attribute

LocationReferences = conlist(LocationReference, min_length=2)

List of at least two LocationReferences which define a path.

network_wrangler.models.roadway.types.LocationReference

Bases: BaseModel

SharedStreets-defined object for location reference.

Source code in network_wrangler/models/roadway/types.py
class LocationReference(BaseModel):
    """SharedStreets-defined object for location reference."""

    sequence: PositiveInt
    point: LatLongCoordinates
    bearing: float = Field(None, ge=-360, le=360)
    distanceToNextRef: NonNegativeFloat
    intersectionId: str

network_wrangler.models.roadway.types.ScopedLinkValueItem

Bases: RecordModel

Define the value of a link property for a particular timespan or category.

Attributes:

  • category (str) –

    Category or link user that this scoped value applies to, e.g., HOV2, truck, etc. Categories are user-defined with the exception of any, which is reserved as the default category. Default is DEFAULT_CATEGORY, which is any.

  • timespan (list[TimeString]) –

    Timespan of the link property, defined as a list of two HH:MM(:SS) strings. Default is DEFAULT_TIMESPAN, which is ["00:00", "24:00"].

  • value (Union[float, int, str]) –

    Value of the link property for the given category and timespan.

Conflicting or matching scopes are not allowed in a list of ScopedLinkValueItems:

  • matching: a scope that could be applied for a given category/timespan combination. This includes the default scopes as well as scopes that are contained within the given category AND timespan combination.
  • overlapping: a scope that fully or partially overlaps a given category OR timespan combination. This includes the default scopes, all matching scopes and all scopes where at least one minute of timespan or one category overlap.
  • conflicting: a scope that is overlapping but not matching for a given category/timespan.

NOTE: Default scope values of category: any and timespan:["00:00", "24:00"] are not considered conflicting, but are applied to residual scopes.
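
One simplified reading of these rules is sketched below: two items conflict when they share a category and their non-default timespans overlap by any amount. This is an illustration under stated assumptions, not the library's validator. The helper names are hypothetical, items are plain dicts, timespans do not cross midnight, and any item using the default timespan is skipped in line with the note above.

```python
from itertools import combinations

DEFAULT_TIMESPAN = ["00:00", "24:00"]
DEFAULT_CATEGORY = "any"


def _minutes(hhmm: str) -> int:
    """Convert 'HH:MM' or 'HH:MM:SS' to minutes after midnight (seconds ignored)."""
    hours, minutes = hhmm.split(":")[:2]
    return int(hours) * 60 + int(minutes)


def timespans_overlap(a: list[str], b: list[str]) -> bool:
    """True if two [start, end] timespans share at least some time."""
    return _minutes(a[0]) < _minutes(b[1]) and _minutes(b[0]) < _minutes(a[1])


def conflicting_scopes(items: list[dict]) -> list[tuple[dict, dict]]:
    """Return pairs of scoped items with the same category and overlapping timespans."""
    conflicts = []
    for a, b in combinations(items, 2):
        ts_a = a.get("timespan", DEFAULT_TIMESPAN)
        ts_b = b.get("timespan", DEFAULT_TIMESPAN)
        if DEFAULT_TIMESPAN in (ts_a, ts_b):
            # Default scopes apply to whatever time is left over, so skip them.
            continue
        same_category = a.get("category", DEFAULT_CATEGORY) == b.get(
            "category", DEFAULT_CATEGORY
        )
        if same_category and timespans_overlap(ts_a, ts_b):
            conflicts.append((a, b))
    return conflicts


items = [
    {"timespan": ["06:00", "09:00"], "category": "sov", "value": 3},
    {"timespan": ["08:00", "10:00"], "category": "sov", "value": 4},
    {"timespan": ["08:00", "10:00"], "category": "truck", "value": 5},
    {"category": "hov2", "value": 1},
]
conflicts = conflicting_scopes(items)
```

Only the two `sov` items conflict here: the `truck` item overlaps in time but applies to a different category, and the `hov2` item uses the default timespan.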

Source code in network_wrangler/models/roadway/types.py
class ScopedLinkValueItem(RecordModel):
    """Define the value of a link property for a particular timespan or category.

    Attributes:
        `category` (str): Category or link user that this scoped value applies to, ex: `HOV2`,
            `truck`, etc.  Categories are user-defined with the exception of `any` which is
            reserved as the default category. Default is `DEFAULT_CATEGORY`, which is `any`.
        `timespan` (list[TimeString]): timespan of the link property as defined as a list of
            two HH:MM(:SS) strings. Default is `DEFAULT_TIMESPAN`, which is `["00:00", "24:00"]`.
        `value` (Union[float, int, str]): Value of the link property for the given category and
            timespan.

    Conflicting or matching scopes are not allowed in a list of ScopedLinkValueItems:

    - `matching`: a scope that could be applied for a given category/timespan combination. This includes the default scopes as well as scopes that are contained within the given category AND timespan combination.
    - `overlapping`: a scope that fully or partially overlaps a given category OR timespan combination.  This includes the default scopes, all `matching` scopes and all scopes where at least one minute of timespan or one category overlap.
    - `conflicting`: a scope that is overlapping but not matching for a given category/timespan.

    NOTE: Default scope values of `category: any` and `timespan:["00:00", "24:00"]` are **not** considered conflicting, but are applied to residual scopes.
    """

    require_any_of: ClassVar[AnyOf] = [["category", "timespan"]]
    model_config = ConfigDict(extra="forbid")
    category: Optional[Union[str, int]] = Field(default=DEFAULT_CATEGORY)
    timespan: Optional[list[TimeString]] = Field(default=DEFAULT_TIMESPAN)
    value: Union[int, float, str]

    @property
    def timespan_dt(self) -> list[list[datetime]]:
        """Convert timespan to list of datetime objects."""
        return str_to_time_list(self.timespan)

    @field_validator("timespan")
    @classmethod
    def validate_timespan(cls, v):
        """Validate the timespan field."""
        if v is not None:
            return validate_timespan_string(v)
        return v

network_wrangler.models.roadway.types.ScopedLinkValueItem.timespan_dt property

timespan_dt

Convert timespan to list of datetime objects.

network_wrangler.models.roadway.types.ScopedLinkValueItem.validate_timespan classmethod

validate_timespan(v)

Validate the timespan field.

Source code in network_wrangler/models/roadway/types.py
@field_validator("timespan")
@classmethod
def validate_timespan(cls, v):
    """Validate the timespan field."""
    if v is not None:
        return validate_timespan_string(v)
    return v

network_wrangler.models.roadway.types.ScopedLinkValueList

Bases: RootListMixin, RootModel

List of non-conflicting ScopedLinkValueItems.

Source code in network_wrangler/models/roadway/types.py
class ScopedLinkValueList(RootListMixin, RootModel):
    """List of non-conflicting ScopedLinkValueItems."""

    root: list[ScopedLinkValueItem]

    def overlapping_timespans(self, timespan: Timespan):
        """Identify overlapping timespans in the list."""
        timespan_dt = str_to_time_list(timespan)
        return [i for i in self if dt_overlaps(i.timespan_dt, timespan_dt)]

    @model_validator(mode="after")
    def check_conflicting_scopes(self):
        """Check for conflicting scopes in the list."""
        conflicts = []
        for i in self:
            if i.timespan == DEFAULT_TIMESPAN:
                continue
            overlapping_ts_i = self.overlapping_timespans(i.timespan)
            for j in overlapping_ts_i:
                if j == i:
                    continue
                if j.category == i.category:
                    conflicts.append((i, j))
        if conflicts:
            msg = "Conflicting scopes in ScopedLinkValueList:\n"
            WranglerLogger.error(msg + f" Conflicts: \n{conflicts}")
            raise ScopeLinkValueError(msg)

        return self

network_wrangler.models.roadway.types.ScopedLinkValueList.check_conflicting_scopes

check_conflicting_scopes()

Check for conflicting scopes in the list.

Source code in network_wrangler/models/roadway/types.py
@model_validator(mode="after")
def check_conflicting_scopes(self):
    """Check for conflicting scopes in the list."""
    conflicts = []
    for i in self:
        if i.timespan == DEFAULT_TIMESPAN:
            continue
        overlapping_ts_i = self.overlapping_timespans(i.timespan)
        for j in overlapping_ts_i:
            if j == i:
                continue
            if j.category == i.category:
                conflicts.append((i, j))
    if conflicts:
        msg = "Conflicting scopes in ScopedLinkValueList:\n"
        WranglerLogger.error(msg + f" Conflicts: \n{conflicts}")
        raise ScopeLinkValueError(msg)

    return self
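To make the conflict rule concrete, here is a minimal standalone sketch of the same check using plain tuples instead of the pydantic models. `ScopedValue`, `to_minutes`, `overlaps`, and `find_conflicts` are hypothetical names for illustration and are not part of network_wrangler. Timespans are assumed to be same-day `HH:MM` strings, and the default timespan is assumed to be `("00:00", "24:00")`.

```python
from typing import NamedTuple

DEFAULT_TIMESPAN = ("00:00", "24:00")  # assumption: mirrors the library default


class ScopedValue(NamedTuple):
    """Hypothetical stand-in for ScopedLinkValueItem."""

    category: str
    timespan: tuple[str, str]
    value: float


def to_minutes(hhmm: str) -> int:
    """Convert an HH:MM string to minutes after midnight."""
    hours, minutes = hhmm.split(":")[:2]
    return int(hours) * 60 + int(minutes)


def overlaps(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """True if two same-day timespans share at least one minute."""
    return to_minutes(a[0]) < to_minutes(b[1]) and to_minutes(b[0]) < to_minutes(a[1])


def find_conflicts(items: list[ScopedValue]) -> list[tuple[ScopedValue, ScopedValue]]:
    """Return pairs that share a category and have overlapping timespans."""
    conflicts = []
    for i in items:
        if i.timespan == DEFAULT_TIMESPAN:
            continue  # items with the default timespan are skipped, as in the validator
        for j in items:
            if j == i:
                continue
            if j.category == i.category and overlaps(i.timespan, j.timespan):
                conflicts.append((i, j))
    return conflicts


ok = [
    ScopedValue("hov2", ("06:00", "09:00"), 1.5),
    ScopedValue("hov2", ("15:00", "18:00"), 2.0),
    ScopedValue("truck", ("06:00", "09:00"), 3.0),
]
# Adding a second hov2 value that overlaps the morning peak creates a conflict.
bad = ok + [ScopedValue("hov2", ("08:00", "10:00"), 9.9)]
```

With these inputs, `find_conflicts(ok)` is empty, while `find_conflicts(bad)` reports the overlapping `hov2` pair once in each direction, which is also how the loop in the validator collects them.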

network_wrangler.models.roadway.types.ScopedLinkValueList.overlapping_timespans

overlapping_timespans(timespan)

Identify overlapping timespans in the list.

Source code in network_wrangler/models/roadway/types.py
def overlapping_timespans(self, timespan: Timespan):
    """Identify overlapping timespans in the list."""
    timespan_dt = str_to_time_list(timespan)
    return [i for i in self if dt_overlaps(i.timespan_dt, timespan_dt)]

Functions to read in and write out a RoadLinksTable.

read_links(filename, in_crs=LAT_LON_CRS, config=DefaultConfig, nodes_df=None, filter_to_nodes=False)

Reads links and returns a geodataframe of links conforming to RoadLinksTable.

Sets index to be a copy of the primary key. Validates output dataframe using RoadLinksTable.

Parameters:

  • filename (str) –

    file to read links in from.

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system number any link geometries are stored in. Defaults to 4326.

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig.

  • nodes_df (DataFrame[RoadNodesTable], default: None ) –

    a RoadNodesTable to gather geometry from. Necessary if geometry is not provided. Defaults to None.

  • filter_to_nodes (bool, default: False ) –

    if True, will filter links to only those that connect to nodes. Requires nodes_df to be provided. Defaults to False.

Source code in network_wrangler/roadway/links/io.py
@validate_call_pyd
def read_links(
    filename: Path,
    in_crs: int = LAT_LON_CRS,
    config: WranglerConfig = DefaultConfig,
    nodes_df: DataFrame[RoadNodesTable] = None,
    filter_to_nodes: bool = False,
) -> DataFrame[RoadLinksTable]:
    """Reads links and returns a geodataframe of links conforming to RoadLinksTable.

    Sets index to be a copy of the primary key.
    Validates output dataframe using RoadLinksTable.

    Args:
        filename (str): file to read links in from.
        in_crs: coordinate reference system number any link geometries are stored in.
            Defaults to 4326.
        config: WranglerConfig instance. Defaults to DefaultConfig.
        nodes_df: a RoadNodesTable to gather geometry from. Necessary if geometry is not
            provided. Defaults to None.
        filter_to_nodes: if True, will filter links to only those that connect to nodes. Requires
            nodes_df to be provided. Defaults to False.
    """
    WranglerLogger.info(f"Reading links from {filename}.")
    start_t = time.time()
    if filter_to_nodes is True and nodes_df is None:
        msg = "If filter_to_nodes is True, nodes_df must be provided."
        raise ValueError(msg)

    links_df = read_table(filename, read_speed=config.CPU.EST_PD_READ_SPEED)

    if filter_to_nodes:
        WranglerLogger.debug("Filtering links to only those that connect to nodes.")
        links_df = links_df[
            links_df["A"].isin(nodes_df.model_node_id) & links_df["B"].isin(nodes_df.model_node_id)
        ]

    WranglerLogger.debug(f"Read {len(links_df)} links in {round(time.time() - start_t, 2)}.")
    links_df = data_to_links_df(links_df, in_crs=in_crs, nodes_df=nodes_df)
    links_df.attrs["source_file"] = filename
    WranglerLogger.info(
        f"Read + transformed {len(links_df)} links from \
            {filename} in {round(time.time() - start_t, 2)}."
    )
    return links_df
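The `filter_to_nodes` step can be tried in isolation. The sketch below uses toy tables with made-up ids and only plain pandas; it is not a call into network_wrangler itself.

```python
import pandas as pd

# Toy node and link tables; the ids are invented for illustration.
nodes_df = pd.DataFrame({"model_node_id": [1, 2, 3]})
links_df = pd.DataFrame(
    {
        "model_link_id": [10, 11, 12],
        "A": [1, 2, 3],
        "B": [2, 3, 4],  # node 4 is not in nodes_df
    }
)

# Keep a link only if both of its endpoints exist in the node table.
both_ends_known = links_df["A"].isin(nodes_df["model_node_id"]) & links_df["B"].isin(
    nodes_df["model_node_id"]
)
filtered = links_df[both_ends_known]
```

Link 12 is dropped because its B node is missing from the node table. Links 10 and 11 remain.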

write_links(links_df, out_dir='.', convert_complex_properties_to_single_field=False, prefix='', file_format='json', overwrite=False, include_geometry=False)

Writes links to a file.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    DataFrame[RoadLinksTable] to write out.

  • convert_complex_properties_to_single_field (bool, default: False ) –

    if True, will convert complex properties to a single column consistent with v0 format. This format is NOT valid with parquet and many other software packages. Defaults to False.

  • out_dir (Union[str, Path], default: '.' ) –

    directory to write files to. Defaults to “.”.

  • prefix (str, default: '' ) –

    prefix to add to the filename. Defaults to “”.

  • file_format (GeoFileTypes, default: 'json' ) –

    file format to write out to. Defaults to “json”.

  • overwrite (bool, default: False ) –

    if True, will overwrite existing files. Defaults to False.

  • include_geometry (bool, default: False ) –

    if True, will include geometry in the output. Defaults to False.

Source code in network_wrangler/roadway/links/io.py
@validate_call_pyd
def write_links(
    links_df: DataFrame[RoadLinksTable],
    out_dir: Union[str, Path] = ".",
    convert_complex_properties_to_single_field: bool = False,
    prefix: str = "",
    file_format: GeoFileTypes = "json",
    overwrite: bool = False,
    include_geometry: bool = False,
) -> None:
    """Writes links to a file.

    Args:
        links_df: DataFrame[RoadLinksTable] to write out.
        convert_complex_properties_to_single_field: if True, will convert complex properties to a
            single column consistent with v0 format.  This format is NOT valid
            with parquet and many other software packages. Defaults to False.
        out_dir: directory to write files to. Defaults to ".".
        prefix: prefix to add to the filename. Defaults to "".
        file_format: file format to write out to. Defaults to "json".
        overwrite: if True, will overwrite existing files. Defaults to False.
        include_geometry: if True, will include geometry in the output. Defaults to False.
    """
    if not include_geometry and file_format == "geojson":
        file_format = "json"

    links_file = Path(out_dir) / f"{prefix}link.{file_format}"

    if convert_complex_properties_to_single_field and file_format == "parquet":
        WranglerLogger.error(
            "convert_complex_properties_to_single_field is not supported with parquet. "
            "Setting to False."
        )
        convert_complex_properties_to_single_field = False

    # Only translate to the v0 single-field format if the option is still enabled.
    if convert_complex_properties_to_single_field:
        v1_links_df = links_df.copy()
        links_df = translate_links_df_v1_to_v0(v1_links_df)

    if not include_geometry:
        geo_cols = links_df.select_dtypes(include=["geometry"]).columns.tolist()
        links_df = pd.DataFrame(links_df)
        links_df = links_df.drop(columns=geo_cols)

    links_df = order_fields_from_data_model(links_df, RoadLinksTable)
    write_table(links_df, links_file, overwrite=overwrite)
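The output filename follows from `out_dir`, `prefix`, and `file_format`, with `geojson` falling back to `json` when geometry is excluded. A minimal sketch of just that naming step is below. `links_output_path` is a hypothetical helper written for this example, not a network_wrangler function.

```python
from pathlib import Path


def links_output_path(out_dir, prefix="", file_format="json", include_geometry=False):
    """Hypothetical helper mirroring the file-naming steps of write_links."""
    # Without geometry the output is not geographic, so geojson falls back to json.
    if not include_geometry and file_format == "geojson":
        file_format = "json"
    return Path(out_dir) / f"{prefix}link.{file_format}"


no_geom = links_output_path("out", prefix="scenario_", file_format="geojson")
with_geom = links_output_path(
    "out", prefix="scenario_", file_format="geojson", include_geometry=True
)
```

Here `no_geom` ends in `scenario_link.json` and `with_geom` ends in `scenario_link.geojson`.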

Functions for creating RoadLinksTables.

copy_links(links_df, link_id_lookup, node_id_lookup, updated_geometry_col=None, nodes_df=None, offset_meters=-5, copy_properties=None, rename_properties=None, name_prefix='copy of', validate=True)

Copy links and optionally offset them.

Will get geometry from another column if provided, otherwise will use nodes_df and then offset_meters to offset from previous geometry.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    links dataframe of links to copy

  • link_id_lookup (dict[int, int]) –

    lookup of new link ID from old link id.

  • node_id_lookup (dict[int, int]) –

    lookup of new node ID from old node id.

  • updated_geometry_col (str, default: None ) –

    name of the column to store the updated geometry. Will use nodes_df for missing geometries if provided and offset_meters if not. Defaults to None.

  • nodes_df (DataFrame[RoadNodesTable], default: None ) –

    nodes dataframe of nodes to use for new link geometry. Defaults to None. If not provided, will use offset_meters.

  • offset_meters (float, default: -5 ) –

    distance to offset links if nodes_df is not provided. Defaults to -5.

  • copy_properties (list[str], default: None ) –

    properties to keep. Defaults to [].

  • rename_properties (dict[str, str], default: None ) –

    properties to rename. Defaults to {}. Will default to REQUIRED_RENAMES if keys in that dict are not provided.

  • name_prefix (str, default: 'copy of' ) –

    format string for new names. Defaults to “copy of”.

  • validate (bool, default: True ) –

    whether to validate the output dataframe. Defaults to True. If set to false, you should validate the output dataframe before using it.

Returns:

  • DataFrame[RoadLinksTable]

    DataFrame[RoadLinksTable]: offset links dataframe

Source code in network_wrangler/roadway/links/create.py
def copy_links(
    links_df: DataFrame[RoadLinksTable],
    link_id_lookup: dict[int, int],
    node_id_lookup: dict[int, int],
    updated_geometry_col: Optional[str] = None,
    nodes_df: Optional[DataFrame[RoadNodesTable]] = None,
    offset_meters: float = -5,
    copy_properties: Optional[list[str]] = None,
    rename_properties: Optional[dict[str, str]] = None,
    name_prefix: str = "copy of",
    validate: bool = True,
) -> DataFrame[RoadLinksTable]:
    """Copy links and optionally offset them.

    Will get geometry from another column if provided, otherwise will use nodes_df and then
    offset_meters to offset from previous geometry.

    Args:
        links_df (DataFrame[RoadLinksTable]): links dataframe of links to copy
        link_id_lookup (dict[int, int]): lookup of new link ID from old link id.
        node_id_lookup (dict[int, int]): lookup of new node ID from old node id.
        updated_geometry_col (str): name of the column to store the updated geometry.
            Will use nodes_df for missing geometries if provided and offset_meters if not.
            Defaults to None.
        nodes_df (DataFrame[RoadNodesTable]): nodes dataframe of nodes to use for new
            link geometry. Defaults to None. If not provided, will use offset_meters.
        offset_meters (float): distance to offset links if nodes_df is not provided.
            Defaults to -5.
        copy_properties (list[str], optional): properties to keep. Defaults to [].
        rename_properties (dict[str, str], optional): properties to rename. Defaults to {}.
            Will default to REQUIRED_RENAMES if keys in that dict are not provided.
        name_prefix (str, optional): format string for new names. Defaults to "copy of".
        validate (bool, optional): whether to validate the output dataframe. Defaults to True.
            If set to false, you should validate the output dataframe before using it.

    Returns:
        DataFrame[RoadLinksTable]: offset links dataframe
    """
    copy_properties = copy_properties or []
    rename_properties = rename_properties or {}

    REQUIRED_KEEP = ["A", "B", "name", "distance", "geometry", "model_link_id"]

    # Should rename these columns to these columns - unless overridden by rename_properties
    REQUIRED_RENAMES = {
        "A": "source_A",
        "B": "source_B",
        "model_link_id": "source_model_link_id",
        "geometry": "source_geometry",
    }
    # cannot rename a column TO these fields
    FORBIDDEN_RENAMES = ["A", "B", "model_link_id", "geometry", "name"]
    WranglerLogger.debug(f"Copying {len(links_df)} links.")

    rename_properties = {k: v for k, v in rename_properties.items() if v not in FORBIDDEN_RENAMES}
    REQUIRED_RENAMES.update(rename_properties)
    # rename if different, otherwise copy
    rename_properties = {k: v for k, v in REQUIRED_RENAMES.items() if k != v}
    copy_properties += [
        k for k, v in REQUIRED_RENAMES.items() if k == v and k not in copy_properties
    ]

    _missing_copy_properties = set(copy_properties) - set(links_df.columns)
    if _missing_copy_properties:
        WranglerLogger.warning(
            f"Specified properties to copy not found in links_df.\
            Proceeding without copying: {_missing_copy_properties}"
        )
        copy_properties = [c for c in copy_properties if c not in _missing_copy_properties]

    _missing_rename_properties = set(rename_properties.keys()) - set(links_df.columns)
    if _missing_rename_properties:
        WranglerLogger.warning(
            f"Specified properties to rename not found in links_df.\
            Proceeding without renaming: {_missing_rename_properties}"
        )
        rename_properties = {
            k: v for k, v in rename_properties.items() if k not in _missing_rename_properties
        }

    offset_links = copy.deepcopy(links_df)
    drop_before_rename = [k for k in rename_properties.values() if k in offset_links.columns]
    offset_links = offset_links.drop(columns=drop_before_rename)
    offset_links = offset_links.rename(columns=rename_properties)

    offset_links["A"] = offset_links["source_A"].map(node_id_lookup)
    offset_links["B"] = offset_links["source_B"].map(node_id_lookup)
    offset_links["model_link_id"] = offset_links["source_model_link_id"].map(link_id_lookup)
    offset_links["name"] = name_prefix + " " + offset_links["name"]

    if updated_geometry_col is not None:
        offset_links = offset_links.rename(columns={updated_geometry_col: "geometry"})
    else:
        offset_links["geometry"] = None

    if nodes_df is None and offset_links.geometry.isna().values.any():
        WranglerLogger.debug(
            f"Adding offset geometry for {sum(offset_links.geometry.isna())} links."
        )
        # Index with (row mask, column label) rather than a single list.
        offset_links.loc[offset_links.geometry.isna(), "geometry"] = offset_geometry_meters(
            offset_links["geometry"],
            offset_meters,
        )
    if offset_links.geometry.isna().values.any():
        WranglerLogger.debug(
            f"Adding node-based geometry for {sum(offset_links.geometry.isna())} links."
        )
        offset_links.loc[offset_links.geometry.isna(), "geometry"] = linestring_from_nodes(
            offset_links, nodes_df
        )

    offset_links = offset_links.set_geometry("geometry", inplace=False)
    offset_links.crs = links_df.crs
    offset_links["distance"] = length_of_linestring_miles(offset_links["geometry"])

    keep_properties = list(set(copy_properties + REQUIRED_KEEP + list(rename_properties.values())))
    offset_links = offset_links[keep_properties]

    # create and set index for new model_link_ids
    # offset_links.attrs.update(RoadLinksAttrs)
    offset_links = offset_links.reset_index(drop=True)
    offset_links = set_df_index_to_pk(offset_links)

    if validate:
        offset_links = validate_df_to_model(offset_links, RoadLinksTable)
    else:
        WranglerLogger.warning(
            "Skipping validation of offset links. Validate to RoadLinksTable before using."
        )
    return offset_links
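How `rename_properties` interacts with `REQUIRED_RENAMES` and `FORBIDDEN_RENAMES` is easier to see in isolation. The sketch below reproduces only that dictionary logic with the standard library. `resolve_renames` is a hypothetical helper name, and the input property names (`lanes`, `speed`, `county`) are invented for the example.

```python
REQUIRED_RENAMES = {
    "A": "source_A",
    "B": "source_B",
    "model_link_id": "source_model_link_id",
    "geometry": "source_geometry",
}
# A column may not be renamed TO any of these names.
FORBIDDEN_RENAMES = ["A", "B", "model_link_id", "geometry", "name"]


def resolve_renames(rename_properties=None, copy_properties=None):
    """Hypothetical helper: split user requests into renames and plain copies."""
    rename_properties = dict(rename_properties or {})
    copy_properties = list(copy_properties or [])
    # Drop any requested rename whose target is a protected column name.
    allowed = {k: v for k, v in rename_properties.items() if v not in FORBIDDEN_RENAMES}
    # User-supplied renames extend or override the required ones.
    merged = {**REQUIRED_RENAMES, **allowed}
    # Identity renames (k == v) are treated as copies instead.
    renames = {k: v for k, v in merged.items() if k != v}
    copies = copy_properties + [
        k for k, v in merged.items() if k == v and k not in copy_properties
    ]
    return renames, copies


renames, copies = resolve_renames(
    {"lanes": "source_lanes", "speed": "A", "county": "county"}
)
```

In this run the request to rename `speed` to `A` is discarded, `lanes` is renamed alongside the required columns, and `county` ends up in the copy list because its old and new names match.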

data_to_links_df(links_df, in_crs=LAT_LON_CRS, nodes_df=None)

Create a links dataframe from list of link properties + link geometries or associated nodes.

Sets index to be a copy of the primary key. Validates output dataframe using RoadLinksTable.

Parameters:

  • links_df (DataFrame) –

    df or list of dictionaries of link properties

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system id for incoming links if geometry already exists. Defaults to LAT_LON_CRS. Will convert everything to LAT_LON_CRS if it doesn’t match.

  • nodes_df (Union[None, DataFrame[RoadNodesTable]], default: None ) –

    Associated nodes geodataframe to use if geometries or location references not present. Defaults to None.

Returns:

  • DataFrame[RoadLinksTable]

    DataFrame[RoadLinksTable]: validated links dataframe in LAT_LON_CRS, indexed by a copy of the primary key.

Source code in network_wrangler/roadway/links/create.py
@validate_call_pyd
def data_to_links_df(
    links_df: Union[pd.DataFrame, list[dict]],
    in_crs: int = LAT_LON_CRS,
    nodes_df: Union[None, DataFrame[RoadNodesTable]] = None,
) -> DataFrame[RoadLinksTable]:
    """Create a links dataframe from list of link properties + link geometries or associated nodes.

    Sets index to be a copy of the primary key.
    Validates output dataframe using RoadLinksTable.

    Args:
        links_df (pd.DataFrame): df or list of dictionaries of link properties
        in_crs: coordinate reference system id for incoming links if geometry already exists.
            Defaults to LAT_LON_CRS. Will convert everything to LAT_LON_CRS if it doesn't match.
        nodes_df: Associated nodes geodataframe to use if geometries or location references not
            present. Defaults to None.

    Returns:
        DataFrame[RoadLinksTable]: validated links dataframe in LAT_LON_CRS, indexed by a copy
            of the primary key.
    """
    WranglerLogger.debug(f"Creating {len(links_df)} links.")
    if not isinstance(links_df, pd.DataFrame):
        links_df = pd.DataFrame(links_df)
    # WranglerLogger.debug(f"data_to_links_df.links_df input: \n{links_df.head}.")

    v0_link_properties = detect_v0_scoped_link_properties(links_df)
    if v0_link_properties:
        links_df = translate_links_df_v0_to_v1(links_df, complex_properties=v0_link_properties)

    links_df = _fill_missing_link_geometries_from_nodes(links_df, nodes_df)
    # Now that have geometry, make sure is GDF
    links_df = coerce_gdf(links_df, in_crs=in_crs, geometry=links_df.geometry)

    links_df = _fill_missing_distance_from_geometry(links_df)

    links_df = _harmonize_crs(links_df, LAT_LON_CRS)
    nodes_df = _harmonize_crs(nodes_df, LAT_LON_CRS)

    links_df.attrs.update(RoadLinksAttrs)
    links_df = set_df_index_to_pk(links_df)
    links_df.gdf_name = links_df.attrs["name"]
    links_df = validate_df_to_model(links_df, RoadLinksTable)

    if len(links_df) < SMALL_RECS:
        WranglerLogger.debug(
            f"New Links: \n{links_df[links_df.attrs['display_cols'] + ['geometry']]}"
        )
    else:
        WranglerLogger.debug(f"{len(links_df)} new links.")

    return links_df

shape_id_from_link_geometry(links_df)

Create a unique shape_id from the geometry of the link.

Source code in network_wrangler/roadway/links/create.py
def shape_id_from_link_geometry(
    links_df: pd.DataFrame,
) -> pd.Series:
    """Create a unique shape_id from the geometry of the link."""
    # Applying a function element-wise to the geometry column yields a Series of ids.
    shape_ids = links_df["geometry"].apply(create_unique_shape_id)
    return shape_ids
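The implementation of `create_unique_shape_id` is not shown on this page. One plausible way to derive a deterministic id from a geometry is to hash its coordinates, as sketched below. The md5 approach, the 6-decimal rounding, and the helper name `shape_id_from_coords` are all assumptions made for illustration and may differ from what the library does.

```python
import hashlib


def shape_id_from_coords(coords):
    """Hypothetical: derive a deterministic id from a sequence of (x, y) pairs."""
    # Fixed-precision formatting keeps the id stable across float repr differences.
    text = ";".join(f"{x:.6f},{y:.6f}" for x, y in coords)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


line = [(-93.09, 44.95), (-93.08, 44.96)]
id_forward = shape_id_from_coords(line)
id_again = shape_id_from_coords(list(line))
id_reversed = shape_id_from_coords(line[::-1])
```

The same coordinates always map to the same id. In this sketch a reversed line gets a different id, so uniqueness is per direction and subject to the usual caveat about hash collisions.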

Deletes links from RoadLinksTable.

network_wrangler.roadway.links.delete.check_deletion_breaks_transit_shapes

check_deletion_breaks_transit_shapes(links_df, del_link_ids, transit_net)

Check if any transit shapes go on the deleted links.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    DataFrame[RoadLinksTable] to delete links from.

  • del_link_ids (list[int]) –

    list of link ids to delete.

  • transit_net (TransitNetwork) –

    input TransitNetwork

Source code in network_wrangler/roadway/links/delete.py
def check_deletion_breaks_transit_shapes(
    links_df: DataFrame[RoadLinksTable], del_link_ids: list[int], transit_net: TransitNetwork
) -> bool:
    """Check if any transit shapes go on the deleted links.

    Args:
        links_df: DataFrame[RoadLinksTable] to delete links from.
        del_link_ids: list of link ids to delete.
        transit_net: input TransitNetwork

    Returns:
        True if there are broken shapes, False otherwise.
    """
    missing_links = shape_links_without_road_links(
        transit_net.feed.shapes, links_df[~links_df.index.isin(del_link_ids)]
    )
    if not missing_links.empty:
        msg = f"Deletion breaks transit shapes:\n{missing_links}"
        WranglerLogger.warning(msg)
        return True
    return False

delete_links_by_ids(links_df, del_link_ids, ignore_missing=False, transit_net=None)

Delete links from a links table.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    DataFrame[RoadLinksTable] to delete links from.

  • del_link_ids (list[int]) –

    list of link ids to delete.

  • ignore_missing (bool, default: False ) –

    if True, will not raise an error if a link id to delete is not in the network. Defaults to False.

  • transit_net (Optional[TransitNetwork], default: None ) –

    If provided, will check TransitNetwork and warn if deletion breaks transit shapes. Defaults to None.

Source code in network_wrangler/roadway/links/delete.py
def delete_links_by_ids(
    links_df: DataFrame[RoadLinksTable],
    del_link_ids: list[int],
    ignore_missing: bool = False,
    transit_net: Optional[TransitNetwork] = None,
) -> DataFrame[RoadLinksTable]:
    """Delete links from a links table.

    Args:
        links_df: DataFrame[RoadLinksTable] to delete links from.
        del_link_ids: list of link ids to delete.
        ignore_missing: if True, will not raise an error if a link id to delete is not in
            the network. Defaults to False.
        transit_net: If provided, will check TransitNetwork and warn if deletion breaks transit shapes. Defaults to None.
    """
    WranglerLogger.debug(f"Deleting links with ids: \n{del_link_ids}")
    _missing = set(del_link_ids) - set(links_df.index)
    if _missing:
        WranglerLogger.warning(f"Links to delete not found in network: \n{_missing}")
        if not ignore_missing:
            msg = "Links to delete are not in the network."
            raise LinkDeletionError(msg)

    if transit_net is not None:
        check_deletion_breaks_transit_shapes(links_df, del_link_ids, transit_net)
    return links_df.drop(labels=del_link_ids, errors="ignore")
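The missing-id check and the final `drop` can be reproduced on a toy table. This sketch uses only pandas with invented ids and is not a call into network_wrangler.

```python
import pandas as pd

links_df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 4]},
    index=pd.Index([10, 11, 12], name="model_link_id"),
)
del_link_ids = [11, 99]  # 99 does not exist in the table

# Ids requested for deletion that are absent from the index.
missing = set(del_link_ids) - set(links_df.index)

# errors="ignore" drops the labels that exist and silently skips the rest.
remaining = links_df.drop(labels=del_link_ids, errors="ignore")
```

Here `missing` is `{99}`. With `ignore_missing=False` the function above would raise `LinkDeletionError` at that point. With `ignore_missing=True` it would go on to return a table like `remaining`, which keeps links 10 and 12.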

Edits RoadLinksTable properties.

NOTE: Each public method will return a new, whole copy of the RoadLinksTable with associated edits. Private methods may return mutated originals.

Usage:

# Returns copy of links_df with lanes set to 2 for links in link_idx
links_df = edit_link_properties(links_df, link_idx, {"lanes": {"set": 2}})
# Returns copy of links_df with price reduced by 50 for links in link_idx and raises error
# if existing value doesn't match 100
links_df = edit_link_properties(
    links_df,
    link_idx,
    {"price": {"existing": 100, "change": -50}},
)
# Returns copy of links_df with geometry of links with node_ids updated based on nodes_df
links_df = edit_link_geometry_from_nodes(links_df, nodes_df, node_ids)
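The change dictionaries in the usage above use the keys `set`, `existing`, and `change`. As a rough mental model, the sketch below applies one such dictionary to a single scalar value. `apply_change` is a hypothetical function written for this example. The library works on whole dataframe columns and may treat a mismatched `existing` value differently depending on configuration.

```python
def apply_change(current, change):
    """Hypothetical scalar version of a single property change."""
    # "existing" is a guard: the current value must match before editing.
    if "existing" in change and current != change["existing"]:
        msg = f"Expected existing value {change['existing']}, found {current}."
        raise ValueError(msg)
    if "set" in change:
        return change["set"]  # replace the value outright
    if "change" in change:
        return current + change["change"]  # apply a relative adjustment
    msg = "A change must include 'set' or 'change'."
    raise ValueError(msg)


new_lanes = apply_change(3, {"set": 2})
new_price = apply_change(100, {"existing": 100, "change": -50})
```

In this sketch `new_lanes` is 2 and `new_price` is 50. Calling `apply_change(80, {"existing": 100, "change": -50})` raises `ValueError` because the guard does not match.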

edit_link_geometry_from_nodes(links_df, nodes_df, node_ids)

Returns a copy of links with updated geometry for given links for a given list of nodes.

Should be called by any function that changes a node location.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    RoadLinksTable to update

  • nodes_df (DataFrame[RoadNodesTable]) –

    RoadNodesTable to get updated node geometry from

  • node_ids (list[int]) –

    list of node PKs with updated geometry

Source code in network_wrangler/roadway/links/edit.py
@validate_call_pyd
def edit_link_geometry_from_nodes(
    links_df: DataFrame[RoadLinksTable],
    nodes_df: DataFrame[RoadNodesTable],
    node_ids: list[int],
) -> DataFrame[RoadLinksTable]:
    """Returns a copy of links with updated geometry for given links for a given list of nodes.

    Should be called by any function that changes a node location.

    Args:
        links_df: RoadLinksTable to update
        nodes_df: RoadNodesTable to get updated node geometry from
        node_ids: list of node PKs with updated geometry
    """
    # WranglerLogger.debug(f"nodes_df.loc[node_ids]:\n {nodes_df.loc[node_ids]}")
    # TODO write wrapper on validate call so don't have to do this
    links_df.attrs.update(RoadLinksAttrs)
    nodes_df.attrs.update(RoadNodesAttrs)
    links_df = copy.deepcopy(links_df)

    updated_a_geometry = update_nodes_in_linestring_geometry(
        links_df.loc[links_df.A.isin(node_ids)], nodes_df, 0
    )
    links_df.update(updated_a_geometry)

    updated_b_geometry = update_nodes_in_linestring_geometry(
        links_df.loc[links_df.B.isin(node_ids)], nodes_df, -1
    )
    links_df.update(updated_b_geometry)

    _a_or_b_mask = links_df.A.isin(node_ids) | links_df.B.isin(node_ids)
    WranglerLogger.debug(f"links_df: \n{links_df.loc[_a_or_b_mask, ['A', 'B', 'geometry']]}")
    return links_df
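The two calls above pass position `0` for links whose A node moved and `-1` for links whose B node moved. The idea can be sketched with a plain coordinate list. `replace_endpoint` is a hypothetical helper for illustration, not the library's `update_nodes_in_linestring_geometry`, which operates on geodataframes.

```python
def replace_endpoint(coords, new_point, position):
    """Return a copy of a coordinate list with the first (0) or last (-1) point replaced."""
    updated = list(coords)  # copy so the original geometry is left untouched
    updated[position] = new_point
    return updated


line = [(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)]
moved_a = replace_endpoint(line, (0.1, 0.1), 0)  # A node moved
moved_b = replace_endpoint(line, (0.9, 1.1), -1)  # B node moved
```

Interior points are preserved in both cases. Only the endpoint tied to the moved node changes.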

edit_link_properties(links_df, link_idx, property_changes, project_name=None, config=DefaultConfig)

Return copy of RoadLinksTable with edited link properties for a list of links.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    links to edit

  • link_idx (list) –

    list of link indices to change

  • property_changes (dict[str, dict]) –

    dictionary of property changes

  • project_name (Optional[str], default: None ) –

    optional name of the project to be applied

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig.

Source code in network_wrangler/roadway/links/edit.py
@validate_call_pyd
def edit_link_properties(
    links_df: DataFrame[RoadLinksTable],
    link_idx: list,
    property_changes: dict[str, dict],
    project_name: Optional[str] = None,
    config: WranglerConfig = DefaultConfig,
) -> DataFrame[RoadLinksTable]:
    """Return copy of RoadLinksTable with edited link properties for a list of links.

    Args:
        links_df: links to edit
        link_idx: list of link indices to change
        property_changes: dictionary of property changes
        project_name: optional name of the project to be applied
        config: WranglerConfig instance. Defaults to DefaultConfig.
    """
    links_df = copy.deepcopy(links_df)
    # TODO write wrapper on validate call so don't have to do this
    links_df.attrs.update(RoadLinksAttrs)
    ml_property_changes = any(k.startswith("ML_") for k in property_changes)
    # True when none of the selected links is already a managed lane.
    no_existing_managed_lanes = len(links_df.loc[link_idx].of_type.managed) == 0
    flag_create_managed_lane = no_existing_managed_lanes and ml_property_changes

    # WranglerLogger.debug(f"property_changes: \n{property_changes}")
    for property, prop_change in property_changes.items():
        WranglerLogger.debug(f"prop_dict: \n{prop_change}")
        links_df = _edit_link_property(
            links_df,
            link_idx,
            property,
            prop_change,
            config=config,
        )

    # Only want to set this once per project.
    if project_name is not None:
        links_df.loc[link_idx, "projects"] += f"{project_name},"

    # if a managed lane created without access or egress, set it to True for all selected links
    if flag_create_managed_lane:
        if links_df.loc[link_idx].ML_access_point.sum() == 0:
            WranglerLogger.warning(
                "Access point not set in project card for a new managed lane.\
                                   \nSetting ML_access_point to True for selected links."
            )
            links_df.loc[link_idx, "ML_access_point"] = True
        if links_df.loc[link_idx].ML_egress_point.sum() == 0:
            WranglerLogger.warning(
                "Egress point not set in project card for a new managed lane.\
                                   \nSetting ML_egress_point to True for selected links."
            )
            links_df.loc[link_idx, "ML_egress_point"] = True

    links_df = validate_df_to_model(links_df, RoadLinksTable)
    return links_df

Functions to filter a RoadLinksTable based on various properties.

filter_link_properties_managed_lanes(links_df)

Returns the names of the managed lane property columns in the links dataframe.

Source code in network_wrangler/roadway/links/filters.py
def filter_link_properties_managed_lanes(
    links_df: DataFrame[RoadLinksTable],
) -> list[str]:
    """Returns names of managed lane property columns in the links dataframe.

    Excludes the ML_access_point and ML_egress_point flags.
    """
    return [
        i
        for i in links_df.columns
        if (i.startswith("ML_") or i.startswith("sc_ML_"))
        and i not in ["ML_access_point", "ML_egress_point"]
    ]
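The column selection can be tried on a plain list of names. This sketch assumes the intent is to keep every `ML_` or `sc_ML_` column except the two access and egress flags. `managed_lane_properties` is a hypothetical name and the column names are invented.

```python
EXCLUDED = {"ML_access_point", "ML_egress_point"}


def managed_lane_properties(columns):
    """Hypothetical: pick managed lane property names out of a list of column names."""
    # str.startswith accepts a tuple of prefixes.
    return [c for c in columns if c.startswith(("ML_", "sc_ML_")) and c not in EXCLUDED]


columns = ["lanes", "ML_lanes", "sc_ML_price", "ML_access_point", "name"]
ml_columns = managed_lane_properties(columns)
```

With these inputs `ml_columns` contains `ML_lanes` and `sc_ML_price` only.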

filter_links_access_dummy(links_df)

Filters links dataframe to only include all access dummy links connecting managed lanes.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_access_dummy(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all access dummy links connecting managed lanes."""
    return links_df.loc[links_df["roadway"] == "ml_access_point"]

filter_links_centroid_connector(links_df)

Filters links dataframe to only include centroid connector links. Not yet implemented: calling it raises NotImplementedError.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_centroid_connector(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include centroid connector links."""
    raise NotImplementedError

filter_links_drive_access(links_df)

Filters links dataframe to only include all links that vehicles can operate on.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_drive_access(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all links that vehicles can operate on."""
    return filter_links_to_modes(links_df, "drive")

filter_links_dummy(links_df)

Filters links dataframe to only include all dummy links connecting managed lanes.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_dummy(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all dummy links connecting managed lanes."""
    return links_df.loc[
        (links_df["roadway"] == "ml_access_point") | (links_df["roadway"] == "ml_egress_point")
    ]

filter_links_egress_dummy(links_df)

Filters links dataframe to only include all egress dummy links connecting managed lanes.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_egress_dummy(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all egress dummy links connecting managed lanes."""
    return links_df.loc[links_df["roadway"] == "ml_egress_point"]
filter_links_general_purpose(links_df)

Filters links dataframe to only include all general purpose links.

NOTE: This will only return links without parallel managed lanes in a non-model-ready network.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_general_purpose(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all general purpose links.

    NOTE: This will only return links without parallel managed lanes in a non-model-ready network.
    """
    return links_df.loc[links_df["managed"] < 1]
filter_links_general_purpose_no_parallel_managed(links_df)

Filters links df to only include general purpose links without parallel managed lanes.

NOTE: This will only return links without parallel managed lanes in a non-model-ready network.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_general_purpose_no_parallel_managed(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links df to only include general purpose links without parallel managed lanes.

    NOTE: This will only return links without parallel managed lanes in a non-model-ready network.
    """
    return links_df.loc[links_df["managed"] == 0]
filter_links_managed_lanes(links_df)

Filters links dataframe to only include managed lanes.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_managed_lanes(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include managed lanes."""
    return links_df.loc[links_df["managed"] == 1]
filter_links_not_in_ids(links_df, link_ids)

Filters links dataframe to NOT have link_ids.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_not_in_ids(
    links_df: DataFrame[RoadLinksTable], link_ids: Union[list[int], pd.Series]
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to NOT have link_ids."""
    return links_df.loc[~links_df["model_link_id"].isin(link_ids)]
filter_links_parallel_general_purpose(links_df)

Filters links dataframe to only include general purpose links parallel to managed.

NOTE: This will return an empty dataframe when the network is not a model network.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_parallel_general_purpose(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include general purpose links parallel to managed.

    NOTE: This will return an empty dataframe when the network is not a model network.
    """
    return links_df.loc[links_df["managed"] == -1]
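
Read together, the managed, general purpose, and parallel general purpose filters suggest that the managed column uses three codes: 1 for managed lanes, 0 for general purpose links with no parallel managed lane, and -1 for general purpose links parallel to a managed lane. The sketch below illustrates that reading on a toy dataframe. The link IDs and values are made up, and it uses plain pandas rather than the library functions.

```python
import pandas as pd

# Toy links table; IDs and values are invented for illustration only.
links_df = pd.DataFrame(
    {
        "model_link_id": [1, 2, 3, 4],
        "managed": [1, 0, -1, 0],
    }
)

managed = links_df.loc[links_df["managed"] == 1]
gp_no_parallel = links_df.loc[links_df["managed"] == 0]
gp_parallel = links_df.loc[links_df["managed"] == -1]

# "General purpose" is everything with managed < 1, i.e. the union of
# the two general purpose groups above.
general_purpose = links_df.loc[links_df["managed"] < 1]

print(managed["model_link_id"].tolist())
print(gp_no_parallel["model_link_id"].tolist())
print(gp_parallel["model_link_id"].tolist())
print(general_purpose["model_link_id"].tolist())
```

In a network that has not been made model-ready, no link would carry the -1 code, so the parallel filter would select nothing.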
filter_links_pedbike_only(links_df)

Filters links dataframe to only include links that only ped/bikes can be on.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_pedbike_only(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include links that only ped/bikes can be on."""
    return links_df.loc[
        (
            ((links_df["walk_access"].astype(bool)) | (links_df["bike_access"].astype(bool)))
            & ~(links_df["drive_access"].astype(bool))
        )
    ]
filter_links_to_ids(links_df, link_ids)

Filters links dataframe by link_ids.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_ids(
    links_df: DataFrame[RoadLinksTable], link_ids: list[int]
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe by link_ids."""
    return links_df.loc[links_df["model_link_id"].isin(link_ids)]
filter_links_to_ml_access_points(links_df)

Filters links dataframe to only include all managed lane access points.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_ml_access_points(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all managed lane access points."""
    return links_df.loc[links_df["ML_access_point"].fillna(False)]
filter_links_to_ml_egress_points(links_df)

Filters links dataframe to only include all managed lane egress points.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_ml_egress_points(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all managed lane egress points."""
    return links_df.loc[links_df["ML_egress_point"].fillna(False)]
filter_links_to_modes(links_df, modes)

Filters links dataframe to only include links that are accessible by the modes in the list.

Parameters:

  • links_df (RoadLinksTable) –

    links dataframe

  • modes (Union[str, list[str]]) –

    mode or list of modes to filter by. Passing "any" returns all links.

Returns:

  • RoadLinksTable ( DataFrame[RoadLinksTable] ) –

    filtered links dataframe

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_modes(
    links_df: DataFrame[RoadLinksTable], modes: Union[str, list[str]]
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include links that are accessible by the modes in the list.

    Args:
        links_df (RoadLinksTable): links dataframe
        modes (Union[str, list[str]]): mode or list of modes to filter by.

    Returns:
        RoadLinksTable: filtered links dataframe
    """
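    # Normalize a single mode to a list first so that `"any" in modes` tests list
    # membership rather than substring containment.
    if isinstance(modes, str):
        modes = [modes]
    if "any" in modes:
        return links_df
    _mode_link_props = list({p for m in modes for p in MODES_TO_NETWORK_LINK_VARIABLES[m]})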
    return links_df.loc[links_df[_mode_link_props].any(axis=1)]
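
As a rough illustration of the mode filter, the sketch below maps each mode to one or more boolean link columns and keeps links where any of those columns is true. The mapping MODE_TO_COLUMNS and the helper links_for_modes are hypothetical stand-ins; the real MODES_TO_NETWORK_LINK_VARIABLES is defined in network_wrangler and may contain different modes and columns.

```python
import pandas as pd

# Hypothetical mapping for illustration only; not the library's actual table.
MODE_TO_COLUMNS = {
    "drive": ["drive_access"],
    "walk": ["walk_access"],
    "transit": ["bus_only", "rail_only", "drive_access"],
}


def links_for_modes(links_df, modes):
    """Keep links where any column tied to the requested modes is True."""
    if isinstance(modes, str):
        modes = [modes]
    if "any" in modes:
        return links_df
    columns = sorted({col for mode in modes for col in MODE_TO_COLUMNS[mode]})
    return links_df.loc[links_df[columns].any(axis=1)]


links_df = pd.DataFrame(
    {
        "model_link_id": [1, 2, 3],
        "drive_access": [True, False, False],
        "walk_access": [False, True, False],
        "bus_only": [False, False, False],
        "rail_only": [False, False, True],
    }
)

walk_ids = links_for_modes(links_df, "walk")["model_link_id"].tolist()
transit_ids = links_for_modes(links_df, ["transit"])["model_link_id"].tolist()
all_ids = links_for_modes(links_df, "any")["model_link_id"].tolist()
print(walk_ids, transit_ids, all_ids)
```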
filter_links_to_node_ids(links_df, node_ids)

Filters links dataframe to only include links with either A or B in node_ids.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_node_ids(
    links_df: DataFrame[RoadLinksTable], node_ids: list[int]
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include links with either A or B in node_ids."""
    return links_df.loc[links_df["A"].isin(node_ids) | links_df["B"].isin(node_ids)]
filter_links_to_path(links_df, node_id_path_list, ignore_missing=False)

Return selection of links dataframe with nodes along path defined by node_id_path_list.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    Links dataframe to select from

  • node_id_path_list (list[int]) –

    List of node ids.

  • ignore_missing (bool, default: False ) –

    if True, will ignore if links noted by path node sequence don’t exist in links_df and will just return what does exist. Defaults to False.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_to_path(
    links_df: DataFrame[RoadLinksTable],
    node_id_path_list: list[int],
    ignore_missing: bool = False,
) -> DataFrame[RoadLinksTable]:
    """Return selection of links dataframe with nodes along path defined by node_id_path_list.

    Args:
        links_df: Links dataframe to select from
        node_id_path_list: List of node ids.
        ignore_missing: if True, will ignore if links noted by path node sequence don't exist in
            links_df and will just return what does exist. Defaults to False.
    """
    ab_pairs = [node_id_path_list[i : i + 2] for i, _ in enumerate(node_id_path_list)][:-1]
    path_links_df = pd.DataFrame(ab_pairs, columns=["A", "B"])

    selected_links_df = path_links_df.merge(
        links_df[["A", "B", "model_link_id"]],
        how="left",
        on=["A", "B"],
        indicator=True,
    )
    selected_link_ds = selected_links_df.model_link_id.unique().tolist()

    if not ignore_missing:
        missing_links_df = selected_links_df.loc[
            selected_links_df._merge == "left_only", ["A", "B"]
        ]
        if len(missing_links_df):
            WranglerLogger.error(f"! Path links missing in links_df \n {missing_links_df}")
            msg = "Path links missing in links_df."
            raise ValueError(msg)

    return filter_links_to_ids(links_df, selected_link_ds)
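
The path selection pairs consecutive node IDs into A/B pairs and left-merges them onto the links, using the merge indicator to spot pairs with no matching link. A minimal sketch of that idea on invented data (the node and link IDs are assumptions):

```python
import pandas as pd

# Toy links; IDs are invented for illustration.
links_df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 4], "model_link_id": [10, 20, 30]}
)
node_path = [1, 2, 3, 5]  # the 3 -> 5 link does not exist in links_df

# Consecutive node pairs along the path: (1, 2), (2, 3), (3, 5).
ab_pairs = list(zip(node_path[:-1], node_path[1:]))
path_df = pd.DataFrame(ab_pairs, columns=["A", "B"])

# indicator=True adds a "_merge" column: "both" or "left_only".
merged = path_df.merge(links_df, how="left", on=["A", "B"], indicator=True)
missing = merged.loc[merged["_merge"] == "left_only", ["A", "B"]]
found_ids = merged["model_link_id"].dropna().astype(int).tolist()

print(found_ids)
print(missing)
```

With ignore_missing=False, a non-empty set of missing pairs corresponds to the case where the function above raises ValueError.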
filter_links_transit_access(links_df)

Filters links dataframe to only include all links that transit can operate on.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_transit_access(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all links that transit can operate on."""
    return filter_links_to_modes(links_df, "transit")
filter_links_transit_only(links_df)

Filters links dataframe to only include all links that only transit can operate on.

Source code in network_wrangler/roadway/links/filters.py
def filter_links_transit_only(
    links_df: DataFrame[RoadLinksTable],
) -> DataFrame[RoadLinksTable]:
    """Filters links dataframe to only include all links that only transit can operate on."""
    return links_df.loc[(links_df["bus_only"].astype(bool)) | (links_df["rail_only"].astype(bool))]

Functions for updating roadway links with geometry from shapes.

network_wrangler.roadway.links.geo.true_shape

true_shape(links_df, shapes_df)

Updates geometry to have shape of shapes_df where available.

Source code in network_wrangler/roadway/links/geo.py
def true_shape(
    links_df: DataFrame[RoadLinksTable], shapes_df: DataFrame[RoadShapesTable]
) -> DataFrame[RoadLinksTable]:
    """Updates geometry to have shape of shapes_df where available."""
    return update_df_by_col_value(links_df, shapes_df, "shape_id", properties=["geometry"])

Utilities for filtering and querying scoped properties based on scoping dimensions.

This module provides various utility functions for filtering and querying scoped properties based on scoping dimensions such as category and timespan. It includes functions for filtering scoped values based on non-overlapping or overlapping timespans, non-overlapping or overlapping categories, and matching exact category and timespan. It also includes functions for creating exploded dataframes for scoped properties and filtering them based on scope.

Public Functions:

  • prop_for_scope: Creates a dataframe with the value of a property for a given category and timespan. Can return maximum overlapping timespan value given a minimum number of overlapping minutes, or strictly enforce timespans.

Internal function terminology for scopes:

  • matching scope value: a scope that could be applied for a given category/timespan combination. This includes the default scopes as well as scopes that are contained within the given category AND timespan combination.
  • overlapping scope value: a scope that fully or partially overlaps a given category OR timespan combination. This includes the default scopes, all matching scopes and all scopes where at least one minute of timespan or one category overlap.
  • conflicting scope value: a scope that is overlapping but not matching for a given category/ timespan. By definition default scope values are not conflicting.
  • independent scope value: a scope value that is not overlapping.

Usage:

model_links_df["lanes_AM_sov"] = prop_for_scope(links_df, "lanes", timespan=["6:00", "9:00"], category="sov")["lanes"]

network_wrangler.roadway.links.scopes.prop_for_scope

prop_for_scope(links_df, prop_name, timespan=DEFAULT_TIMESPAN, category=DEFAULT_CATEGORY, strict_timespan_match=False, min_overlap_minutes=60, allow_default=True)

Creates a df with the value of a property for a given category and timespan.

Parameters:

  • links_df (DataFrame[RoadLinksTable]) –

    links dataframe (RoadLinksTable) to query

  • prop_name (str) –

    name of property to query

  • timespan (Union[None, list[TimeString]], default: DEFAULT_TIMESPAN ) –

    TimespanString of format ['HH:MM','HH:MM'] to query links_df for overlapping records.

  • category (Union[str, int, None], default: DEFAULT_CATEGORY ) –

    category to query links_df for overlapping records. If None, DEFAULT_CATEGORY is used.

  • strict_timespan_match (bool, default: False ) –

    boolean indicating if the returned df should only contain records that fully contain the query timespan. If set to True, min_overlap_minutes does not apply. Defaults to False.

  • min_overlap_minutes (int, default: 60 ) –

    minimum number of minutes the timespans need to overlap to keep. Defaults to 60.

  • allow_default (bool, default: True ) –

    boolean indicating if the default value should be returned if no scoped values are found. Defaults to True.

Returns:

  • DataFrame

    pd.DataFrame with model_link_id and prop_name
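
To make the strict_timespan_match and min_overlap_minutes parameters concrete, here is a small pure-Python sketch of how a scoped timespan might be tested against a query timespan. The helpers to_minutes, overlap_minutes, and keep_scope are hypothetical and are not the library's internal functions; the sketch assumes timespans do not wrap past midnight and that the overlap threshold is inclusive.

```python
def to_minutes(time_str):
    """Convert 'H:MM' or 'HH:MM' to minutes after midnight."""
    hours, minutes = time_str.split(":")
    return int(hours) * 60 + int(minutes)


def overlap_minutes(span_a, span_b):
    """Minutes shared by two [start, end] timespans (0 if disjoint)."""
    start = max(to_minutes(span_a[0]), to_minutes(span_b[0]))
    end = min(to_minutes(span_a[1]), to_minutes(span_b[1]))
    return max(0, end - start)


def keep_scope(query, scoped, min_overlap_minutes=60, strict=False):
    """Decide whether a scoped timespan applies to the query timespan."""
    if strict:
        # Strict: the scoped timespan must fully contain the query timespan.
        return (
            to_minutes(scoped[0]) <= to_minutes(query[0])
            and to_minutes(query[1]) <= to_minutes(scoped[1])
        )
    # Otherwise: keep if the overlap meets the threshold (assumed inclusive).
    return overlap_minutes(query, scoped) >= min_overlap_minutes


am_peak = ["6:00", "9:00"]
print(overlap_minutes(am_peak, ["8:00", "10:00"]))
print(keep_scope(am_peak, ["8:30", "10:00"]))
print(keep_scope(am_peak, ["5:00", "10:00"], strict=True))
```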

Source code in network_wrangler/roadway/links/scopes.py
@validate_call_pyd
def prop_for_scope(
    links_df: DataFrame[RoadLinksTable],
    prop_name: str,
    timespan: Union[None, list[TimeString]] = DEFAULT_TIMESPAN,
    category: Union[str, int, None] = DEFAULT_CATEGORY,
    strict_timespan_match: bool = False,
    min_overlap_minutes: int = 60,
    allow_default: bool = True,
) -> pd.DataFrame:
    """Creates a df with the value of a property for a given category and timespan.

    Args:
        links_df: links dataframe (RoadLinksTable) to query
        prop_name: name of property to query
        timespan: TimespanString of format ['HH:MM','HH:MM'] to query links_df for overlapping
            records.
        category: category to query links_df for overlapping records. If None,
            DEFAULT_CATEGORY is used.
        strict_timespan_match: boolean indicating if the returned df should only contain
            records that fully contain the query timespan. If set to True, min_overlap_minutes
            does not apply. Defaults to False.
        min_overlap_minutes: minimum number of minutes the timespans need to overlap to keep.
            Defaults to 60.
        allow_default: boolean indicating if the default value should be returned if no scoped
            values are found. Defaults to True.

    Returns:
        pd.DataFrame with `model_link_id` and `prop_name`
    """
    links_df = validate_df_to_model(links_df, RoadLinksTable)
    timespan = timespan if timespan is not None else DEFAULT_TIMESPAN
    category = category if category is not None else DEFAULT_CATEGORY

    if prop_name not in links_df.columns:
        msg = f"{prop_name} not in dataframe."
        raise ValueError(msg)

    # Check if scoped values even exist and if can just return the default.
    if f"sc_{prop_name}" not in links_df.columns or links_df[f"sc_{prop_name}"].isna().all():
        if not allow_default:
            msg = f"{prop_name} does not have a scoped property column or it is null."
            WranglerLogger.error(
                msg + f" Set `allow_default = True` or fill column `sc_{prop_name}`."
            )
            raise ValueError(msg)
        WranglerLogger.debug(f"No scoped values {prop_name}. Returning default.")
        return copy.deepcopy(links_df[["model_link_id", prop_name]])

    # All possible scopings
    candidate_scoped_prop_df = _create_exploded_df_for_scoped_prop(links_df, prop_name)

    # Find scopes that apply
    scoped_prop_df = _filter_exploded_df_to_scope(
        candidate_scoped_prop_df,
        timespan=timespan,
        category=category,
        strict_timespan_match=strict_timespan_match,
        min_overlap_minutes=min_overlap_minutes,
    )

    # Attach them back to all links and update default.
    result_df = copy.deepcopy(links_df[["model_link_id", prop_name]])
    result_df.loc[scoped_prop_df.index, prop_name] = scoped_prop_df["scoped"]
    WranglerLogger.debug(
        f"result_df[prop_name]: \n{result_df.loc[scoped_prop_df.index, prop_name]}"
    )
    return result_df
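
The last step of prop_for_scope writes the scoped values back over the defaults by index, leaving unscoped links unchanged. A minimal sketch of that index-aligned update on invented data (the column name lanes and all values are assumptions):

```python
import pandas as pd

# Default property values for three links; the index identifies each link.
links_df = pd.DataFrame(
    {"model_link_id": [1, 2, 3], "lanes": [2, 2, 3]},
    index=[1, 2, 3],
)

# Suppose only link 2 has a scoped value that applies to the query.
scoped_prop_df = pd.DataFrame({"scoped": [1]}, index=[2])

result_df = links_df[["model_link_id", "lanes"]].copy()
# Rows are matched on the index, so only link 2 is overwritten.
result_df.loc[scoped_prop_df.index, "lanes"] = scoped_prop_df["scoped"]

print(result_df)
```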

Utilities for summarizing a RoadLinksTable.

link_summary(links_df)

Summarizes links by link_summary_cats: count, distance, and lane miles.

Source code in network_wrangler/roadway/links/summary.py
def link_summary(links_df: DataFrame[RoadLinksTable]) -> pd.DataFrame:
    """Summarizes links by `link_summary_cats`: count, distance, and lane miles."""
    data = {
        "count": link_summary_cnt(links_df),
        "distance": link_summary_miles(links_df),
        "lane miles": link_summary_lane_miles(links_df),
    }
    return pd.DataFrame(data, index=link_summary_cats.keys())
link_summary_cnt(links_df)

Dictionary of number of links by link_summary_cats.

Source code in network_wrangler/roadway/links/summary.py
def link_summary_cnt(links_df: DataFrame[RoadLinksTable]) -> dict[str, int]:
    """Dictionary of number of links by `link_summary_cats`."""
    return {k: len(v(links_df)) for k, v in link_summary_cats.items()}
link_summary_lane_miles(links_df)

Dictionary of lane miles by link_summary_cats.

Source code in network_wrangler/roadway/links/summary.py
def link_summary_lane_miles(links_df: DataFrame[RoadLinksTable]) -> dict[str, float]:
    """Dictionary of lane miles by `link_summary_cats`."""
    return {k: calc_lane_miles(v(links_df)).sum() for k, v in link_summary_cats.items()}
link_summary_miles(links_df)

Dictionary of miles by link_summary_cats.

Source code in network_wrangler/roadway/links/summary.py
def link_summary_miles(links_df: DataFrame[RoadLinksTable]) -> dict[str, float]:
    """Dictionary of miles by `link_summary_cats`."""
    return {k: v(links_df).distance.sum() for k, v in link_summary_cats.items()}
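
The summary helpers all iterate over link_summary_cats, which the code treats as a mapping from a category label to a filter function. The sketch below reproduces that pattern with a hypothetical two-entry mapping; the lane-miles formula (lanes times distance) is also an assumption, since calc_lane_miles is not shown here.

```python
import pandas as pd

links_df = pd.DataFrame(
    {
        "managed": [1, 0, 0],
        "distance": [1.0, 2.0, 0.5],
        "lanes": [2, 3, 1],
    }
)

# Hypothetical stand-in for link_summary_cats: label -> filter function.
summary_cats = {
    "managed": lambda df: df.loc[df["managed"] == 1],
    "general purpose": lambda df: df.loc[df["managed"] < 1],
}


def lane_miles(df):
    # Assumption: lane miles = lanes * distance; calc_lane_miles may differ.
    return df["lanes"] * df["distance"]


data = {
    "count": {k: len(f(links_df)) for k, f in summary_cats.items()},
    "distance": {k: f(links_df)["distance"].sum() for k, f in summary_cats.items()},
    "lane miles": {k: lane_miles(f(links_df)).sum() for k, f in summary_cats.items()},
}
summary = pd.DataFrame(data, index=list(summary_cats))
print(summary)
```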

Utilities for validating a RoadLinksTable beyond its data model.

validate_links_df(links_df, nodes_df=None, strict=False, errors_filename=Path('link_errors.csv'))

Validates a links df to RoadLinksTable and optionally checks if nodes are in the links.

Parameters:

  • links_df (DataFrame) –

    The links dataframe.

  • nodes_df (DataFrame, default: None ) –

    The nodes dataframe. Defaults to None.

  • strict (bool, default: False ) –

    If True, will validate to links_df without trying to parse it first.

  • errors_filename (Path, default: Path('link_errors.csv') ) –

    The output file for the validation errors. Defaults to “link_errors.csv”.

Returns:

  • bool ( bool ) –

    True if the links dataframe is valid.

Source code in network_wrangler/roadway/links/validate.py
def validate_links_df(
    links_df: pd.DataFrame,
    nodes_df: Optional[pd.DataFrame] = None,
    strict: bool = False,
    errors_filename: Path = Path("link_errors.csv"),
) -> bool:
    """Validates a links df to RoadLinksTable and optionally checks if nodes are in the links.

    Args:
        links_df (pd.DataFrame): The links dataframe.
        nodes_df (pd.DataFrame): The nodes dataframe. Defaults to None.
        strict (bool): If True, will validate to links_df without trying to parse it first.
        errors_filename (Path): The output file for the validation errors. Defaults
            to "link_errors.csv".

    Returns:
        bool: True if the links dataframe is valid.
    """
    from ...models.roadway.tables import RoadLinksTable  # noqa: PLC0415
    from ...utils.models import TableValidationError, validate_df_to_model  # noqa: PLC0415

    is_valid = True

    if not strict:
        from .create import data_to_links_df  # noqa: PLC0415

        try:
            links_df = data_to_links_df(links_df)
        except Exception as e:
            WranglerLogger.error(f"!!! [Links invalid] - Failed to parse links_df\n{e}")
            is_valid = False

    try:
        validate_df_to_model(links_df, RoadLinksTable, output_file=errors_filename)
    except TableValidationError as e:
        WranglerLogger.error(f"!!! [Links invalid] - Failed Schema validation\n{e}")
        is_valid = False

    # Only check node references when a nodes dataframe was provided.
    if nodes_df is not None:
        try:
            validate_links_have_nodes(links_df, nodes_df)
        except NodesInLinksMissingError as e:
            WranglerLogger.error(f"!!! [Links invalid] - Nodes missing in links\n{e}")
            is_valid = False
    return is_valid
validate_links_file(links_filename, nodes_df=None, strict=False, errors_filename=Path('link_errors.csv'))

Validates a links file to RoadLinksTable and optionally checks if nodes are in the links.

Parameters:

  • links_filename (Path) –

    The links file.

  • nodes_df (DataFrame, default: None ) –

    The nodes dataframe. Defaults to None.

  • strict (bool, default: False ) –

    If True, will validate to links_df without trying to parse it first.

  • errors_filename (Path, default: Path('link_errors.csv') ) –

    The output file for the validation errors. Defaults to “link_errors.csv”.

Returns:

  • bool ( bool ) –

    True if the links file is valid.

Source code in network_wrangler/roadway/links/validate.py
def validate_links_file(
    links_filename: Path,
    nodes_df: Optional[pd.DataFrame] = None,
    strict: bool = False,
    errors_filename: Path = Path("link_errors.csv"),
) -> bool:
    """Validates a links file to RoadLinksTable and optionally checks if nodes are in the links.

    Args:
        links_filename (Path): The links file.
        nodes_df (pd.DataFrame): The nodes dataframe. Defaults to None.
        strict (bool): If True, will validate to links_df without trying to parse it first.
        errors_filename (Path): The output file for the validation errors. Defaults
            to "link_errors.csv".

    Returns:
        bool: True if the links file is valid.
    """
    links_df = pd.read_csv(links_filename)
    return validate_links_df(
        links_df, nodes_df=nodes_df, strict=strict, errors_filename=errors_filename
    )
validate_links_have_nodes(links_df, nodes_df)

Checks if links have nodes and returns a boolean.

Raises: NodesInLinksMissingError if nodes_df is missing an A or B node referenced in links_df.

Source code in network_wrangler/roadway/links/validate.py
def validate_links_have_nodes(links_df: pd.DataFrame, nodes_df: pd.DataFrame) -> bool:
    """Checks if links have nodes and returns a boolean.

    Raises: NodesInLinksMissingError if nodes_df is missing an A or B node referenced in links_df.
    """
    nodes_in_links = list(set(links_df["A"]).union(set(links_df["B"])))
    node_idx_in_links = nodes_df[nodes_df["model_node_id"].isin(nodes_in_links)].index

    fk_valid, fk_missing = fk_in_pk(nodes_df.index, node_idx_in_links)
    if not fk_valid:
        msg = f"Links are missing {len(fk_missing)} nodes."
        WranglerLogger.error(msg + f"\n  Missing: {fk_missing}")
        raise NodesInLinksMissingError(msg)
    return True
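
The intent of the node check is a foreign-key test: every A and B value in the links should appear among the node IDs. A pure-Python sketch of that idea follows; missing_nodes is a hypothetical helper and does not reproduce the library's fk_in_pk.

```python
def missing_nodes(links, node_ids):
    """Return sorted node IDs referenced by links but absent from node_ids."""
    used = {node for link in links for node in (link["A"], link["B"])}
    return sorted(used - set(node_ids))


# Invented data: node 9 is referenced by a link but not defined.
links = [{"A": 1, "B": 2}, {"A": 2, "B": 9}]
node_ids = [1, 2, 3]

missing = missing_nodes(links, node_ids)
print(missing)
```

An empty result means every link endpoint resolves to a known node; a non-empty one is the situation NodesInLinksMissingError is meant to report.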

Dataframe accessor shortcuts for RoadLinksTables allowing for easy filtering and editing.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor

Wrapper for various filters of RoadLinksTable.

Methods:

  • links_df.of_type.managed

    filters links dataframe to only include managed lanes.

  • links_df.of_type.parallel_general_purpose

    filters links dataframe to only include general purpose links parallel to managed.

  • links_df.of_type.general_purpose

    filters links dataframe to only include all general purpose links.

  • links_df.of_type.general_purpose_no_parallel_managed

    filters links dataframe to only include general purpose links without parallel managed lanes.

  • links_df.of_type.access_dummy

    filters links dataframe to only include all access dummy links connecting managed lanes.

  • links_df.of_type.egress_dummy

    filters links dataframe to only include all egress dummy links connecting managed lanes.

  • links_df.of_type.dummy

    filters links dataframe to only include all dummy links connecting managed lanes.

  • links_df.of_type.pedbike_only

    filters links dataframe to only include all links that only ped/bikes can be on.

  • links_df.of_type.transit_only

    filters links dataframe to only include all links that only transit can be on.

  • links_df.of_type.transit_access

    filters links dataframe to only include all links that transit can access.

  • links_df.of_type.drive_access

    filters links dataframe to only include all links that drive can access.

  • links_df.of_type.summary_df

    returns a summary of the links dataframe.

Source code in network_wrangler/roadway/links/df_accessors.py
@pd.api.extensions.register_dataframe_accessor("of_type")
class LinkOfTypeAccessor:
    """Wrapper for various filters of RoadLinksTable.

    Methods:
        links_df.of_type.managed: filters links dataframe to only include managed lanes.
        links_df.of_type.parallel_general_purpose: filters links dataframe to only include
            general purpose links parallel to managed.
        links_df.of_type.general_purpose: filters links dataframe to only include all general
            purpose links.
        links_df.of_type.general_purpose_no_parallel_managed: filters links dataframe to only
            include general purpose links without parallel managed lanes.
        links_df.of_type.access_dummy: filters links dataframe to only include all access dummy
            links connecting managed lanes.
        links_df.of_type.egress_dummy: filters links dataframe to only include all egress dummy
            links connecting managed lanes.
        links_df.of_type.dummy: filters links dataframe to only include all dummy links
            connecting managed lanes.
        links_df.of_type.pedbike_only: filters links dataframe to only include all links that
            only ped/bikes can be on.
        links_df.of_type.transit_only: filters links dataframe to only include all links that
            only transit can be on.
        links_df.of_type.transit_access: filters links dataframe to only include all links
            that transit can access.
        links_df.of_type.drive_access: filters links dataframe to only include all links
            that drive can access.
        links_df.of_type.summary_df: returns a summary of the links dataframe.

    """

    def __init__(self, links_df: DataFrame[RoadLinksTable]):
        """LinkOfTypeAccessor for RoadLinksTable."""
        self._links_df = links_df
        try:
            links_df.attrs["name"] == "road_links"  # noqa: B015
        except AttributeError:
            WranglerLogger.warning(
                "`of_type` should only be used on 'road_links' dataframes. \
                attrs['name'] not found."
            )
        except AssertionError as e:
            WranglerLogger.warning(
                f"`of_type` should only be used on 'road_links' dataframes. \
                Found type: {links_df.attrs['name']}"
            )
            msg = "`of_type` is only available to network_links dataframes."
            raise NotLinksError(msg) from e

    @property
    def managed(self):
        """Filters links dataframe to only include managed lanes."""
        return filter_links_managed_lanes(self._links_df)

    @property
    def parallel_general_purpose(self):
        """Filters links dataframe to general purpose links parallel to managed lanes."""
        ml_properties = filter_link_properties_managed_lanes(self._links_df)
        keep_c = [c for c in self._links_df.columns if c not in ml_properties]
        return filter_links_parallel_general_purpose(self._links_df[keep_c])

    @property
    def general_purpose(self):
        """Filters links dataframe to only include general purpose links."""
        ml_properties = filter_link_properties_managed_lanes(self._links_df)
        keep_c = [c for c in self._links_df.columns if c not in ml_properties]
        return filter_links_general_purpose(self._links_df[keep_c])

    @property
    def general_purpose_no_parallel_managed(self):
        """Filters links dataframe to general purpose links without parallel managed lanes."""
        ml_properties = filter_link_properties_managed_lanes(self._links_df)
        keep_c = [c for c in self._links_df.columns if c not in ml_properties]
        return filter_links_general_purpose_no_parallel_managed(self._links_df[keep_c])

    @property
    def access_dummy(self):
        """Filters links dataframe to access dummy links connecting managed lanes."""
        return filter_links_access_dummy(self._links_df)

    @property
    def egress_dummy(self):
        """Filters links dataframe to egress dummy links connecting managed lanes."""
        return filter_links_egress_dummy(self._links_df)

    @property
    def dummy(self):
        """Filters links dataframe to dummy links connecting managed lanes."""
        return filter_links_dummy(self._links_df)

    @property
    def pedbike_only(self):
        """Filters links dataframe to links that only ped/bikes can be on."""
        return filter_links_pedbike_only(self._links_df)

    @property
    def transit_only(self):
        """Filters links dataframe to links that only transit can be on."""
        return filter_links_transit_only(self._links_df)

    @property
    def transit_access(self):
        """Filters links dataframe to all links that transit can access."""
        return filter_links_transit_access(self._links_df)

    @property
    def drive_access(self):
        """Filters links dataframe to only include all links that drive can access."""
        return filter_links_drive_access(self._links_df)

    @property
    def summary_df(self) -> pd.DataFrame:
        """Returns a summary of the links dataframe."""
        return link_summary(self._links_df)
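
For readers unfamiliar with pandas accessors, the sketch below shows the general mechanism behind links_df.of_type: a class registered with pd.api.extensions.register_dataframe_accessor whose properties return filtered views. The accessor name demo_of_type, the class name, and the roadway values other than the two dummy-link codes are invented for this example.

```python
import pandas as pd


# Registered under a made-up name so it cannot clash with the real accessor.
@pd.api.extensions.register_dataframe_accessor("demo_of_type")
class DemoOfTypeAccessor:
    def __init__(self, df):
        self._df = df

    @property
    def managed(self):
        """Links flagged as managed lanes."""
        return self._df.loc[self._df["managed"] == 1]

    @property
    def dummy(self):
        """Access and egress dummy links."""
        return self._df.loc[
            self._df["roadway"].isin(["ml_access_point", "ml_egress_point"])
        ]


links_df = pd.DataFrame(
    {
        "model_link_id": [1, 2, 3],
        "managed": [1, 0, 0],
        "roadway": ["motorway", "ml_access_point", "residential"],
    }
)

managed_ids = links_df.demo_of_type.managed["model_link_id"].tolist()
dummy_ids = links_df.demo_of_type.dummy["model_link_id"].tolist()
print(managed_ids, dummy_ids)
```

Exposing each filter as a property is what allows the attribute-style calls listed above, with no parentheses.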

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.access_dummy property

access_dummy

Filters links dataframe to access dummy links connecting managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.drive_access property

drive_access

Filters links dataframe to only include all links that drive can access.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.dummy property

dummy

Filters links dataframe to dummy links connecting managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.egress_dummy property

egress_dummy

Filters links dataframe to egress dummy links connecting managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.general_purpose property

general_purpose

Filters links dataframe to only include general purpose links.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.general_purpose_no_parallel_managed property

general_purpose_no_parallel_managed

Filters links dataframe to general purpose links without parallel managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.managed property

managed

Filters links dataframe to only include managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.parallel_general_purpose property

parallel_general_purpose

Filters links dataframe to general purpose links parallel to managed lanes.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.pedbike_only property

pedbike_only

Filters links dataframe to links that only ped/bikes can be on.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.summary_df property

summary_df

Returns a summary of the links dataframe.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.transit_access property

transit_access

Filters links dataframe to all links that transit can access.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.transit_only property

transit_only

Filters links dataframe to links that only transit can be on.

network_wrangler.roadway.links.df_accessors.LinkOfTypeAccessor.__init__

__init__(links_df)

LinkOfTypeAccessor for RoadLinksTable.

Source code in network_wrangler/roadway/links/df_accessors.py
def __init__(self, links_df: DataFrame[RoadLinksTable]):
    """LinkOfTypeAccessor for RoadLinksTable."""
    self._links_df = links_df
    try:
        links_df.attrs["name"] == "road_links"  # noqa: B015
    except AttributeError:
        WranglerLogger.warning(
            "`of_type` should only be used on 'road_links' dataframes. \
            attrs['name'] not found."
        )
    except AssertionError as e:
        WranglerLogger.warning(
            f"`of_type` should only be used on 'road_links' dataframes. \
            Found type: {links_df.attrs['name']}"
        )
        msg = "`of_type` is only available to network_links dataframes."
        raise NotLinksError(msg) from e

network_wrangler.roadway.links.df_accessors.ModeLinkAccessor

Wrapper for filtering RoadLinksTable by modal ability: links_df.mode_query(modes_list).

Parameters:

  • modes (list[str]) –

    list of modes to filter by.

Source code in network_wrangler/roadway/links/df_accessors.py
@pd.api.extensions.register_dataframe_accessor("mode_query")
class ModeLinkAccessor:
    """Wrapper for filtering RoadLinksTable by modal ability: : links_df.mode_query(modes_list).

    Args:
        modes (list[str]): list of modes to filter by.
    """

    def __init__(self, links_df: DataFrame[RoadLinksTable]):
        """ModeLinkAccessor for RoadLinksTable."""
        self._links_df = links_df
        try:
            assert links_df.attrs["name"] == "road_links"
        except AttributeError:
            WranglerLogger.warning(
                "`mode_query` should only be used on 'road_links' dataframes. \
                attrs['name'] not found."
            )
        except AssertionError as err:
            msg = "`mode_query` is only available to network_links dataframes."
            WranglerLogger.warning(msg + f" Found type: {links_df.attrs['name']}")
            raise NotLinksError(msg) from err

    def __call__(self, modes: list[str]):
        """Filters links dataframe to  links that are accessible by the modes in the list."""
        return filter_links_to_modes(self._links_df, modes)

network_wrangler.roadway.links.df_accessors.ModeLinkAccessor.__call__

__call__(modes)

Filters links dataframe to links that are accessible by the modes in the list.

Source code in network_wrangler/roadway/links/df_accessors.py
def __call__(self, modes: list[str]):
    """Filters links dataframe to  links that are accessible by the modes in the list."""
    return filter_links_to_modes(self._links_df, modes)

network_wrangler.roadway.links.df_accessors.ModeLinkAccessor.__init__

__init__(links_df)

ModeLinkAccessor for RoadLinksTable.

Source code in network_wrangler/roadway/links/df_accessors.py
def __init__(self, links_df: DataFrame[RoadLinksTable]):
    """ModeLinkAccessor for RoadLinksTable."""
    self._links_df = links_df
    try:
        assert links_df.attrs["name"] == "road_links"
    except AttributeError:
        WranglerLogger.warning(
            "`mode_query` should only be used on 'road_links' dataframes. \
            attrs['name'] not found."
        )
    except AssertionError as err:
        msg = "`mode_query` is only available to network_links dataframes."
        WranglerLogger.warning(msg + f" Found type: {links_df.attrs['name']}")
        raise NotLinksError(msg) from err

network_wrangler.roadway.links.df_accessors.TrueShapeAccessor

Wrapper for returning a gdf with true_shapes: links_df.true_shape(shapes_df).

Source code in network_wrangler/roadway/links/df_accessors.py
@pd.api.extensions.register_dataframe_accessor("true_shape")
class TrueShapeAccessor:
    """Wrapper for returning a gdf with true_shapes: links_df.true_shape(shapes_df)."""

    def __init__(self, links_df: DataFrame[RoadLinksTable]):
        """TrueShapeAccessor for RoadLinksTable."""
        self._links_df = links_df
        try:
            assert links_df.attrs["name"] == "road_links"
        except AttributeError:
            WranglerLogger.warning(
                "`true_shape` should only be used on 'road_links' dataframes. \
                attrs['name'] not found."
            )
        except AssertionError as err:
            msg = "`true_shape` is only available to network_links dataframes."
            WranglerLogger.warning(msg + f" Found type: {links_df.attrs['name']}")
            raise NotLinksError(msg) from err

    def __call__(self, shapes_df: DataFrame[RoadShapesTable]):
        """Updates geometry to have shape of shapes_df where available."""
        return true_shape(self._links_df, shapes_df)

network_wrangler.roadway.links.df_accessors.TrueShapeAccessor.__call__

__call__(shapes_df)

Updates geometry to have shape of shapes_df where available.

Source code in network_wrangler/roadway/links/df_accessors.py
def __call__(self, shapes_df: DataFrame[RoadShapesTable]):
    """Updates geometry to have shape of shapes_df where available."""
    return true_shape(self._links_df, shapes_df)

network_wrangler.roadway.links.df_accessors.TrueShapeAccessor.__init__

__init__(links_df)

TrueShapeAccessor for RoadLinksTable.

Source code in network_wrangler/roadway/links/df_accessors.py
def __init__(self, links_df: DataFrame[RoadLinksTable]):
    """TrueShapeAccessor for RoadLinksTable."""
    self._links_df = links_df
    try:
        assert links_df.attrs["name"] == "road_links"
    except AttributeError:
        WranglerLogger.warning(
            "`true_shape` should only be used on 'road_links' dataframes. \
            attrs['name'] not found."
        )
    except AssertionError as err:
        msg = "`true_shape` is only available to network_links dataframes."
        WranglerLogger.warning(msg + f" Found type: {links_df.attrs['name']}")
        raise NotLinksError(msg) from err

Roadway Nodes

Functions for reading and writing nodes data.

network_wrangler.roadway.nodes.io.get_nodes

get_nodes(transit_net=None, roadway_net=None, roadway_path=None, config=DefaultConfig)

Get nodes from a transit network, roadway network, or roadway file.

Parameters:

  • transit_net (Optional[TransitNetwork], default: None ) –

    TransitNetwork instance

  • roadway_net (Optional[RoadwayNetwork], default: None ) –

    RoadwayNetwork instance

  • roadway_path (Optional[Union[str, Path]], default: None ) –

    path to a directory with roadway network

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig.

Source code in network_wrangler/roadway/nodes/io.py
def get_nodes(
    transit_net: Optional[TransitNetwork] = None,
    roadway_net: Optional[RoadwayNetwork] = None,
    roadway_path: Optional[Union[str, Path]] = None,
    config: WranglerConfig = DefaultConfig,
) -> GeoDataFrame:
    """Get nodes from a transit network, roadway network, or roadway file.

    Args:
        transit_net: TransitNetwork instance
        roadway_net: RoadwayNetwork instance
        roadway_path: path to a directory with roadway network
        config: WranglerConfig instance. Defaults to DefaultConfig.
    """
    if transit_net is not None and transit_net.road_net is not None:
        return transit_net.road_net.nodes_df
    if roadway_net is not None:
        return roadway_net.nodes_df
    if roadway_path is not None:
        nodes_path = Path(roadway_path)
        if nodes_path.is_dir():
            nodes_path = next(nodes_path.glob("*node*."))
        return read_nodes(nodes_path, config=config)
    msg = "nodes_df must either be given or provided via an associated road_net or by providing a roadway_net path or instance."
    raise ValueError(msg)
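
The order in which `get_nodes` checks its inputs can be sketched as follows. This is only an illustration, not the library code: `resolve_nodes_source` and the `SimpleNamespace` stand-ins are hypothetical, and the helper returns a label for the chosen source instead of a dataframe.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_nodes_source(
    transit_net: Optional[Any] = None,
    roadway_net: Optional[Any] = None,
    roadway_path: Optional[str] = None,
) -> str:
    """Return which input would supply the nodes (hypothetical helper)."""
    # A transit network only counts if it has an attached road network.
    if transit_net is not None and transit_net.road_net is not None:
        return "transit_net.road_net"
    if roadway_net is not None:
        return "roadway_net"
    if roadway_path is not None:
        return "roadway_path"
    raise ValueError("No source of nodes was provided.")


# Stand-ins for network objects (assumptions for illustration only).
transit_without_roads = SimpleNamespace(road_net=None)
roadway = SimpleNamespace(nodes_df="nodes")

chosen = resolve_nodes_source(transit_net=transit_without_roads, roadway_net=roadway)
print(chosen)
```

A transit network without an associated road network is skipped, so the roadway network is used here.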

network_wrangler.roadway.nodes.io.nodes_df_to_geojson

nodes_df_to_geojson(nodes_df, properties)

Converts a nodes dataframe to a geojson.

Attribution: Geoff Boeing: https://geoffboeing.com/2015/10/exporting-python-data-geojson/.

Source code in network_wrangler/roadway/nodes/io.py
@validate_call_pyd
def nodes_df_to_geojson(nodes_df: DataFrame[RoadNodesTable], properties: list[str]):
    """Converts a nodes dataframe to a geojson.

    Attribution: Geoff Boeing:
    https://geoffboeing.com/2015/10/exporting-python-data-geojson/.
    """
    # TODO write wrapper on validate call so don't have to do this
    nodes_df.attrs.update(RoadNodesAttrs)
    geojson = {"type": "FeatureCollection", "features": []}
    for _, row in nodes_df.iterrows():
        feature: dict[str, Any] = {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": []},
        }
        feature["geometry"]["coordinates"] = [row["geometry"].x, row["geometry"].y]
        feature["properties"][nodes_df.model_node_id] = row.name
        for prop in properties:
            feature["properties"][prop] = row[prop]
        geojson["features"].append(feature)
    return geojson
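
The shape of the resulting GeoJSON can be sketched without any dataframe machinery. The example below is a simplified illustration that assumes nodes arrive as plain dictionaries: `node_records_to_geojson` and the sample records are hypothetical and not part of network_wrangler.

```python
import json
from typing import Any


def node_records_to_geojson(nodes: list[dict[str, Any]], properties: list[str]) -> dict:
    """Build a GeoJSON FeatureCollection of points from plain node records."""
    features = []
    for node in nodes:
        feature_props = {"model_node_id": node["model_node_id"]}
        for prop in properties:
            feature_props[prop] = node[prop]
        features.append(
            {
                "type": "Feature",
                "properties": feature_props,
                # GeoJSON positions are ordered [x, y], i.e. [longitude, latitude].
                "geometry": {"type": "Point", "coordinates": [node["X"], node["Y"]]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up sample records, for illustration only.
sample_nodes = [
    {"model_node_id": 1, "X": -93.09, "Y": 44.95, "osm_node_id": "a"},
    {"model_node_id": 2, "X": -93.10, "Y": 44.96, "osm_node_id": "b"},
]
geojson = node_records_to_geojson(sample_nodes, properties=["osm_node_id"])
print(json.dumps(geojson["features"][0], indent=2))
```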

network_wrangler.roadway.nodes.io.read_nodes

read_nodes(filename, in_crs=LAT_LON_CRS, boundary_gdf=None, boundary_geocode=None, boundary_file=None, config=DefaultConfig)

Reads nodes and returns a geodataframe of nodes.

Sets index to be a copy of the primary key. Validates output dataframe using NodesSchema.

Parameters:

  • filename ((Path, str)) –

    file to read nodes in from.

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system number that node data is in. Defaults to LAT_LON_CRS.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig.

Source code in network_wrangler/roadway/nodes/io.py
@validate_call(config={"arbitrary_types_allowed": True})
def read_nodes(
    filename: Path,
    in_crs: int = LAT_LON_CRS,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    config: WranglerConfig = DefaultConfig,
) -> DataFrame[RoadNodesTable]:
    """Reads nodes and returns a geodataframe of nodes.

    Sets index to be a copy of the primary key.
    Validates output dataframe using NodesSchema.

    Args:
        filename (Path,str): file to read nodes in from.
        in_crs: coordinate reference system number that node data is in. Defaults to LAT_LON_CRS.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        config: WranglerConfig instance. Defaults to DefaultConfig.
    """
    WranglerLogger.debug(f"Reading nodes from {filename}.")

    start_time = time.time()

    nodes_df = read_table(
        filename,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
        read_speed=config.CPU.EST_PD_READ_SPEED,
    )
    WranglerLogger.debug(
        f"Read {len(nodes_df)} nodes from file in {round(time.time() - start_time, 2)}."
    )

    nodes_df = data_to_nodes_df(nodes_df, in_crs=in_crs, config=config)
    nodes_df.attrs["source_file"] = filename
    WranglerLogger.info(
        f"Read {len(nodes_df)} nodes from {filename} in {round(time.time() - start_time, 2)}."
    )
    nodes_df = validate_df_to_model(nodes_df, RoadNodesTable)
    return nodes_df

network_wrangler.roadway.nodes.io.write_nodes

write_nodes(nodes_df, out_dir, prefix, file_format='geojson', overwrite=True)

Writes RoadNodesTable to file.

Parameters:

  • nodes_df (DataFrame[RoadNodesTable]) –

    nodes dataframe

  • out_dir (Union[str, Path]) –

    directory to write nodes to

  • prefix (str) –

    prefix to add to nodes file name

  • file_format (GeoFileTypes, default: 'geojson' ) –

    format to write nodes in, e.g. “geojson”, “shp”, “parquet”, “csv”, “txt”. Defaults to “geojson”.

  • overwrite (bool, default: True ) –

    whether to overwrite existing nodes file. Defaults to True.

Source code in network_wrangler/roadway/nodes/io.py
@validate_call_pyd
def write_nodes(
    nodes_df: DataFrame[RoadNodesTable],
    out_dir: Union[str, Path],
    prefix: str,
    file_format: GeoFileTypes = "geojson",
    overwrite: bool = True,
) -> None:
    """Writes RoadNodesTable to file.

    Args:
        nodes_df: nodes dataframe
        out_dir: directory to write nodes to
        prefix: prefix to add to nodes file name
        file_format: format to write nodes in, e.g. "geojson", "shp", "parquet", "csv", "txt".
            Defaults to "geojson".
        overwrite: whether to overwrite existing nodes file. Defaults to True.
    """
    nodes_file = Path(out_dir) / f"{prefix}node.{file_format}"
    nodes_df = order_fields_from_data_model(nodes_df, RoadNodesTable)
    write_table(nodes_df, nodes_file, overwrite=overwrite)
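
The output file name follows the pattern `<out_dir>/<prefix>node.<file_format>`, with the prefix joined directly to `node` and no separator added. A minimal sketch of just that naming step, using a hypothetical `nodes_file_path` helper:

```python
from pathlib import Path
from typing import Union


def nodes_file_path(out_dir: Union[str, Path], prefix: str, file_format: str = "geojson") -> Path:
    """Mirror the file-name pattern used above (hypothetical helper)."""
    return Path(out_dir) / f"{prefix}node.{file_format}"


path = nodes_file_path("out", "scenario_", "parquet")
print(path.name)
```

Because no separator is inserted, a prefix such as `scenario_` should carry its own trailing underscore if one is wanted in the file name.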

Functions for creating nodes from data sources.

network_wrangler.roadway.nodes.create.data_to_nodes_df

data_to_nodes_df(nodes_df, config=DefaultConfig, in_crs=LAT_LON_CRS)

Turn nodes data into official nodes dataframe.

Adds missing geometry. Makes sure X and Y are consistent with geometry GeoSeries. Converts to LAT_LON_CRS. Copies and sets idx to primary_key. Validates output to NodesSchema.

Parameters:

  • nodes_df

    Nodes dataframe or list of dictionaries that can be converted to a dataframe.

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig. NOTE: Not currently used.

  • in_crs (int, default: LAT_LON_CRS ) –

    Coordinate reference system ID that the incoming XY data is in, if it isn’t already in a GeoDataFrame. Defaults to LAT_LON_CRS.

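Returns:

  • DataFrame[RoadNodesTable]

    Validated nodes dataframe in LAT_LON_CRS, indexed by a copy of the primary key.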

Source code in network_wrangler/roadway/nodes/create.py
@validate_call(config={"arbitrary_types_allowed": True})
def data_to_nodes_df(
    nodes_df: Union[pd.DataFrame, gpd.GeoDataFrame, list[dict]],
    config: WranglerConfig = DefaultConfig,  # noqa: ARG001
    in_crs: int = LAT_LON_CRS,
) -> DataFrame[RoadNodesTable]:
    """Turn nodes data into official nodes dataframe.

    Adds missing geometry.
    Makes sure X and Y are consistent with geometry GeoSeries.
    Converts to LAT_LON_CRS.
    Copies and sets idx to primary_key.
    Validates output to NodesSchema.

    Args:
        nodes_df : Nodes dataframe or list of dictionaries that can be converted to a dataframe.
        config: WranglerConfig instance. Defaults to DefaultConfig. NOTE: Not currently used.
        in_crs: Coordinate reference system ID that the incoming XY data is in, if it isn't
            already in a GeoDataFrame. Defaults to LAT_LON_CRS.

    Returns:
        DataFrame[RoadNodesTable]: validated nodes dataframe in LAT_LON_CRS, indexed by a
            copy of the primary key.
    """
    WranglerLogger.debug("Turning node data into official nodes_df")

    if isinstance(nodes_df, gpd.GeoDataFrame) and nodes_df.crs != LAT_LON_CRS:
        if nodes_df.crs is None:
            nodes_df.crs = in_crs
        nodes_df = nodes_df.to_crs(LAT_LON_CRS)

    if not isinstance(nodes_df, gpd.GeoDataFrame) or nodes_df.geometry.isnull().values.any():
        nodes_df = _create_node_geometries_from_xy(nodes_df, in_crs=in_crs, net_crs=LAT_LON_CRS)

    # Make sure values are consistent
    nodes_df["X"] = nodes_df["geometry"].apply(lambda g: g.x)
    nodes_df["Y"] = nodes_df["geometry"].apply(lambda g: g.y)

    if len(nodes_df) < SMALL_RECS:
        WranglerLogger.debug(f"nodes_df: \n{nodes_df[['model_node_id', 'geometry', 'X', 'Y']]}")

    # Validate and coerce to schema
    nodes_df = validate_df_to_model(nodes_df, RoadNodesTable)
    nodes_df.attrs.update(RoadNodesAttrs)
    nodes_df.gdf_name = nodes_df.attrs["name"]
    nodes_df = set_df_index_to_pk(nodes_df)

    return nodes_df

network_wrangler.roadway.nodes.create.generate_node_ids

generate_node_ids(nodes_df, range, n)

Generate unique node ids for nodes_df.

Parameters:

  • nodes_df (DataFrame[RoadNodesTable]) –

    nodes dataframe to generate unique ids for.

  • range (tuple[int]) –

    range of ids to generate from.

  • n (int) –

    number of ids to generate.

Returns:

  • list[int]

    list[int]: list of unique node ids.

Source code in network_wrangler/roadway/nodes/create.py
def generate_node_ids(nodes_df: DataFrame[RoadNodesTable], range: tuple[int], n: int) -> list[int]:
    """Generate unique node ids for nodes_df.

    Args:
        nodes_df: nodes dataframe to generate unique ids for.
        range: range of ids to generate from.
        n: number of ids to generate.

    Returns:
        list[int]: list of unique node ids.
    """
    if n <= 0:
        return []
    existing_ids = set(nodes_df["model_node_id"].unique())
    new_ids = set(range) - existing_ids
    if len(new_ids) < n:
        msg = f"Only {len(new_ids)} new ids available, need {n}."
        raise NodeAddError(msg)

    return list(new_ids)[:n]
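
The id selection is a set difference between candidate ids and ids already in use. The sketch below illustrates that idea under two assumptions that differ from the function above: candidates are passed as a Python `range` object, and the result is sorted so the output is deterministic. `pick_unused_ids` is a hypothetical name, and it raises `ValueError` where the library raises `NodeAddError`.

```python
def pick_unused_ids(existing_ids: set[int], candidates: range, n: int) -> list[int]:
    """Return n ids from candidates that are not in existing_ids (hypothetical helper)."""
    if n <= 0:
        return []
    # Sorting makes the selection reproducible; plain set iteration order is not guaranteed.
    available = sorted(set(candidates) - existing_ids)
    if len(available) < n:
        raise ValueError(f"Only {len(available)} new ids available, need {n}.")
    return available[:n]


new_ids = pick_unused_ids({1, 2, 5}, range(1, 10), 3)
print(new_ids)
```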

Functions for deleting nodes from a nodes table.

network_wrangler.roadway.nodes.delete.delete_nodes_by_ids

delete_nodes_by_ids(nodes_df, del_node_ids, ignore_missing=False)

Delete nodes from a nodes table.

Parameters:

  • nodes_df (DataFrame[RoadNodesTable]) –

    DataFrame[RoadNodesTable] to delete nodes from.

  • del_node_ids (list[int]) –

    list of node ids to delete.

  • ignore_missing (bool, default: False ) –

    if True, will not raise an error if a node id to delete is not in the network. Defaults to False.

Source code in network_wrangler/roadway/nodes/delete.py
def delete_nodes_by_ids(
    nodes_df: DataFrame[RoadNodesTable], del_node_ids: list[int], ignore_missing: bool = False
) -> DataFrame[RoadNodesTable]:
    """Delete nodes from a nodes table.

    Args:
        nodes_df: DataFrame[RoadNodesTable] to delete nodes from.
        del_node_ids: list of node ids to delete.
        ignore_missing: if True, will not raise an error if a node id to delete is not in
            the network. Defaults to False.
    """
    WranglerLogger.debug(f"Deleting nodes with ids: \n{del_node_ids}")

    _missing = set(del_node_ids) - set(nodes_df.index)
    if _missing:
        msg = "Nodes to delete are not in the network."
        WranglerLogger.warning(msg + f"\n{_missing}")
        if not ignore_missing:
            raise NodeDeletionError(msg)
    return nodes_df.drop(labels=del_node_ids, errors="ignore")
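
The effect of `ignore_missing` can be illustrated with a plain dictionary keyed by node id. This is a simplified sketch, not the library implementation: `drop_ids` is a hypothetical helper, and it raises `KeyError` where the function above raises `NodeDeletionError`.

```python
def drop_ids(
    nodes: dict[int, dict], del_ids: list[int], ignore_missing: bool = False
) -> dict[int, dict]:
    """Return a copy of nodes without del_ids (hypothetical helper)."""
    to_delete = set(del_ids)
    missing = to_delete - set(nodes)
    if missing and not ignore_missing:
        raise KeyError(f"Nodes to delete are not in the network: {sorted(missing)}")
    # Ids that are not present are silently skipped once we get here.
    return {node_id: rec for node_id, rec in nodes.items() if node_id not in to_delete}


# Made-up node records, for illustration only.
nodes = {1: {"X": 0.0}, 2: {"X": 1.0}, 3: {"X": 2.0}}
remaining = drop_ids(nodes, [2, 99], ignore_missing=True)
print(sorted(remaining))
```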

Edits RoadNodesTable properties.

NOTE: Each public method will return a new, whole copy of the RoadNodesTable with associated edits. Private methods may return mutated originals.

network_wrangler.roadway.nodes.edit.NodeGeometryChange

Bases: RecordModel

Value for setting node geometry given a model_node_id.

Source code in network_wrangler/roadway/nodes/edit.py
class NodeGeometryChange(RecordModel):
    """Value for setting node geometry given a model_node_id."""

    model_config = ConfigDict(extra="ignore")
    X: float
    Y: float
    in_crs: Optional[int] = LAT_LON_CRS

network_wrangler.roadway.nodes.edit.NodeGeometryChangeTable

Bases: DataFrameModel

DataFrameModel for setting node geometry given a model_node_id.

Source code in network_wrangler/roadway/nodes/edit.py
class NodeGeometryChangeTable(DataFrameModel):
    """DataFrameModel for setting node geometry given a model_node_id."""

    model_node_id: Series[int]
    X: Series[float] = Field(coerce=True)
    Y: Series[float] = Field(coerce=True)
    in_crs: Series[int] = Field(default=LAT_LON_CRS)

    class Config:
        """Config for NodeGeometryChangeTable."""

        add_missing_columns = True

network_wrangler.roadway.nodes.edit.NodeGeometryChangeTable.Config

Config for NodeGeometryChangeTable.

Source code in network_wrangler/roadway/nodes/edit.py
class Config:
    """Config for NodeGeometryChangeTable."""

    add_missing_columns = True

network_wrangler.roadway.nodes.edit.edit_node_geometry

edit_node_geometry(nodes_df, node_geometry_change_table)

Returns copied nodes table with geometry edited.

Should be called from network so that accompanying links and shapes are also updated.

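Parameters:

  • nodes_df (DataFrame[RoadNodesTable]) –

    RoadNodesTable to edit

  • node_geometry_change_table (DataFrame[NodeGeometryChangeTable]) –

    NodeGeometryChangeTable with geometry changes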

Source code in network_wrangler/roadway/nodes/edit.py
@validate_call_pyd
def edit_node_geometry(
    nodes_df: DataFrame[RoadNodesTable],
    node_geometry_change_table: DataFrame[NodeGeometryChangeTable],
) -> DataFrame[RoadNodesTable]:
    """Returns copied nodes table with geometry edited.

    Should be called from network so that accompanying links and shapes are also updated.

    Args:
        nodes_df: RoadNodesTable to edit
        node_geometry_change_table: NodeGeometryChangeTable with geometry changes
    """
    # TODO write wrapper on validate call so don't have to do this
    nodes_df.attrs.update(RoadNodesAttrs)
    WranglerLogger.debug(f"Updating node geometry for {len(node_geometry_change_table)} nodes.")
    WranglerLogger.debug(f"Original nodes_df: \n{nodes_df.head()}")
    # for now, require in_crs is the same for whole column
    if node_geometry_change_table.in_crs.nunique() != 1:
        msg = f"in_crs must be the same for all nodes. Got: {node_geometry_change_table.in_crs}"
        WranglerLogger.error(msg)
        raise NodeChangeError(msg)

    in_crs = node_geometry_change_table.loc[0, "in_crs"]

    # Create a table with all the new node geometry
    geo_s = gpd.points_from_xy(node_geometry_change_table.X, node_geometry_change_table.Y)
    geo_df = gpd.GeoDataFrame(node_geometry_change_table, geometry=geo_s, crs=in_crs)
    geo_df = geo_df.to_crs(LAT_LON_CRS)
    WranglerLogger.debug(f"Updated geometry geo_df: \n{geo_df}")

    # Update the nodes table with the new geometry
    nodes_df = update_df_by_col_value(
        nodes_df, geo_df, "model_node_id", properties=["X", "Y", "geometry"]
    )
    nodes_df = validate_df_to_model(nodes_df, RoadNodesTable)

    WranglerLogger.debug(f"Updated nodes_df: \n{nodes_df.head()}")

    return nodes_df

network_wrangler.roadway.nodes.edit.edit_node_property

edit_node_property(nodes_df, node_idx, prop_name, prop_change, project_name=None, config=DefaultConfig, _geometry_ok=False)

Return copied nodes table with node property edited.

Parameters:

  • nodes_df (DataFrame[RoadNodesTable]) –

    RoadNodesTable to edit

  • node_idx (list[int]) –

    list of node indices to change

  • prop_name (str) –

    property name to change

  • prop_change (Union[dict, RoadPropertyChange]) –

    dictionary of value from project_card

  • project_name (Optional[str], default: None ) –

    optional name of the project to be applied

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance.

  • _geometry_ok (bool, default: False ) –

    if False, will not let you change geometry-related fields. Should only be changed to True by internal processes that know that geometry is changing and will update it in appropriate places in network. Defaults to False. GENERALLY DO NOT TURN THIS ON.

Source code in network_wrangler/roadway/nodes/edit.py
def edit_node_property(
    nodes_df: DataFrame[RoadNodesTable],
    node_idx: list[int],
    prop_name: str,
    prop_change: Union[dict, RoadPropertyChange],
    project_name: Optional[str] = None,
    config: WranglerConfig = DefaultConfig,
    _geometry_ok: bool = False,
) -> DataFrame[RoadNodesTable]:
    """Return copied nodes table with node property edited.

    Args:
        nodes_df: RoadNodesTable to edit
        node_idx: list of node indices to change
        prop_name: property name to change
        prop_change: dictionary of value from project_card
        project_name: optional name of the project to be applied
        config: WranglerConfig instance.
        _geometry_ok: if False, will not let you change geometry-related fields. Should
            only be changed to True by internal processes that know that geometry is changing
            and will update it in appropriate places in network. Defaults to False.
            GENERALLY DO NOT TURN THIS ON.
    """
    if not isinstance(prop_change, RoadPropertyChange):
        prop_change = RoadPropertyChange(**prop_change)
    prop_dict = prop_change.model_dump(exclude_none=True, by_alias=True)

    # Allow the project card to override the default behavior of raising an error
    existing_value_conflict = prop_change.get(
        "existing_value_conflict", config.EDITS.EXISTING_VALUE_CONFLICT
    )

    # Should not be used to update node geometry fields unless explicitly set to OK:
    if prop_name in nodes_df.attrs["geometry_props"] and not _geometry_ok:
        msg = "Cannot unilaterally change geometry property."
        raise NodeChangeError(msg)

    # check existing if necessary
    if not _check_existing_value_conflict(
        nodes_df, node_idx, prop_name, prop_dict, existing_value_conflict
    ):
        return nodes_df

    nodes_df = copy.deepcopy(nodes_df)

    # if it is a new attribute then initialize with NaN values
    if prop_name not in nodes_df:
        nodes_df[prop_name] = None

    # `set` and `change` just affect the simple property
    if "set" in prop_dict:
        nodes_df.loc[node_idx, prop_name] = prop_dict["set"]
    elif "change" in prop_dict:
        nodes_df.loc[node_idx, prop_name] = nodes_df.loc[node_idx, prop_name].apply(
            lambda x: x + prop_dict["change"]
        )
    else:
        msg = f"Couldn't find correct node change spec in: {prop_dict}"
        raise NodeChangeError(msg)

    if project_name is not None:
        nodes_df.loc[node_idx, "projects"] += f"{project_name},"

    nodes_df = validate_df_to_model(nodes_df, RoadNodesTable)
    return nodes_df
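
The difference between a `set` and a `change` specification can be sketched on a plain mapping of node id to property value. The helper below is hypothetical and deliberately leaves out the existing-value conflict check, the geometry guard, and schema validation done by the function above.

```python
def apply_property_change(
    values: dict[int, float], node_idx: list[int], prop_change: dict
) -> dict[int, float]:
    """Return a copy of values with prop_change applied to node_idx (hypothetical helper)."""
    updated = dict(values)  # work on a copy so the input is left untouched
    for node_id in node_idx:
        if "set" in prop_change:
            # `set` overwrites the value outright.
            updated[node_id] = prop_change["set"]
        elif "change" in prop_change:
            # `change` is added to the existing value.
            updated[node_id] = updated[node_id] + prop_change["change"]
        else:
            raise ValueError(f"Couldn't find correct node change spec in: {prop_change}")
    return updated


# Made-up property values keyed by node id, for illustration only.
values = {1: 10.0, 2: 20.0, 3: 30.0}
after_set = apply_property_change(values, [1], {"set": 5.0})
after_change = apply_property_change(values, [1, 2], {"change": 2.5})
print(after_set, after_change)
```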

Functions to filter nodes dataframe.

network_wrangler.roadway.nodes.filters.filter_nodes_to_ids

filter_nodes_to_ids(nodes_df, node_ids)

Filters nodes dataframe by node_ids.

Parameters:

  • nodes_df (DataFrame) –

    nodes dataframe

  • node_ids (List[int]) –

    list of node_ids to filter by.

Returns:

  • DataFrame[RoadNodesTable]

    pd.DataFrame: filtered nodes dataframe

Source code in network_wrangler/roadway/nodes/filters.py
def filter_nodes_to_ids(
    nodes_df: DataFrame[RoadNodesTable], node_ids: list[int]
) -> DataFrame[RoadNodesTable]:
    """Filters nodes dataframe by node_ids.

    Args:
        nodes_df (pd.DataFrame): nodes dataframe
        node_ids (List[int]): list of node_ids to filter by.

    Returns:
        pd.DataFrame: filtered nodes dataframe
    """
    return nodes_df.loc[nodes_df["model_node_id"].isin(node_ids)]

network_wrangler.roadway.nodes.filters.filter_nodes_to_link_ids

filter_nodes_to_link_ids(link_ids, links_df, nodes_df=None)

Filters nodes dataframe to those used by given link_ids.

Parameters:

  • link_ids (List[int]) –

    list of link_ids

  • links_df (RoadLinksTable) –

    links dataframe

  • nodes_df (RoadNodesTable, default: None ) –

    nodes dataframe

Returns:

  • DataFrame[RoadNodesTable]

    pd.DataFrame: nodes dataframe

Source code in network_wrangler/roadway/nodes/filters.py
def filter_nodes_to_link_ids(
    link_ids: list[int],
    links_df: DataFrame[RoadLinksTable],
    nodes_df: Optional[DataFrame[RoadNodesTable]] = None,
) -> DataFrame[RoadNodesTable]:
    """Filters nodes dataframe to those used by given link_ids.

    Args:
        link_ids (List[int]): list of link_ids
        links_df (RoadLinksTable): links dataframe
        nodes_df (RoadNodesTable): nodes dataframe

    Returns:
        pd.DataFrame: nodes dataframe
    """
    _node_ids = node_ids_in_link_ids(link_ids, links_df, nodes_df)
    return filter_nodes_to_ids(nodes_df, _node_ids)

network_wrangler.roadway.nodes.filters.filter_nodes_to_links

filter_nodes_to_links(links_df, nodes_df)

Filters RoadNodesTable to those used by given links dataframe.

Parameters:

  • links_df (RoadLinksTable) –

    links dataframe

  • nodes_df (RoadNodesTable) –

    nodes dataframe

Source code in network_wrangler/roadway/nodes/filters.py
def filter_nodes_to_links(
    links_df: DataFrame[RoadLinksTable], nodes_df: DataFrame[RoadNodesTable]
) -> DataFrame[RoadNodesTable]:
    """Filters RoadNodesTable to those used by given links dataframe.

    Args:
        links_df (RoadLinksTable): links dataframe
        nodes_df (RoadNodesTable): nodes dataframe
    """
    _node_ids = node_ids_in_links(links_df, nodes_df)
    nodes_in_links = nodes_df.loc[nodes_df.index.isin(_node_ids)]
    WranglerLogger.debug(f"Selected {len(nodes_in_links)} of {len(nodes_df)} nodes.")
    return nodes_in_links
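
Filtering nodes to those used by a set of links amounts to collecting every link's `A` and `B` node ids and keeping the matching nodes. A minimal sketch with plain Python containers, where `node_ids_used_by_links` and `filter_nodes_to_used` are hypothetical stand-ins for the dataframe-based helpers:

```python
def node_ids_used_by_links(links: list[dict]) -> set[int]:
    """Collect the start (A) and end (B) node ids of every link."""
    return {link["A"] for link in links} | {link["B"] for link in links}


def filter_nodes_to_used(nodes: dict[int, dict], links: list[dict]) -> dict[int, dict]:
    """Keep only nodes referenced by at least one link (hypothetical helper)."""
    used = node_ids_used_by_links(links)
    return {node_id: rec for node_id, rec in nodes.items() if node_id in used}


# Made-up links and nodes, for illustration only.
links = [{"model_link_id": 10, "A": 1, "B": 2}, {"model_link_id": 11, "A": 2, "B": 3}]
nodes = {1: {}, 2: {}, 3: {}, 4: {}}
used_nodes = filter_nodes_to_used(nodes, links)
print(sorted(used_nodes))
```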

Nodes submodule for creating, editing, and filtering RoadNodesTable.

Roadway Shapes

Functions to read and write RoadShapesTable.

network_wrangler.roadway.shapes.io.read_shapes

read_shapes(filename, in_crs=LAT_LON_CRS, boundary_gdf=None, boundary_geocode=None, boundary_file=None, filter_to_shape_ids=None, config=DefaultConfig)

Reads shapes and returns a geodataframe of shapes if filename is found.

Otherwise, returns empty GeoDataFrame conforming to ShapesSchema.

Sets index to be a copy of the primary key. Validates output dataframe using ShapesSchema.

Parameters:

  • filename (str) –

    file to read shapes in from.

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system number file is in. Defaults to LAT_LON_CRS.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • filter_to_shape_ids (Optional[list], default: None ) –

    List of shape_ids to filter the input data to. Defaults to None.

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig.

Source code in network_wrangler/roadway/shapes/io.py
@validate_call_pyd
def read_shapes(
    filename: Path,
    in_crs: int = LAT_LON_CRS,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    filter_to_shape_ids: Optional[list] = None,
    config: WranglerConfig = DefaultConfig,
) -> DataFrame[RoadShapesTable]:
    """Reads shapes and returns a geodataframe of shapes if filename is found.

    Otherwise, returns empty GeoDataFrame conforming to ShapesSchema.

    Sets index to be a copy of the primary key.
    Validates output dataframe using ShapesSchema.

    Args:
        filename (str): file to read shapes in from.
        in_crs: coordinate reference system number file is in. Defaults to LAT_LON_CRS.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        filter_to_shape_ids: List of shape_ids to filter the input data to. Defaults to None.
        config: WranglerConfig instance. Defaults to DefaultConfig.
    """
    if not Path(filename).exists():
        WranglerLogger.warning(
            f"Shapes file {filename} not found, but is optional. \
                               Returning empty shapes dataframe."
        )
        # set_index(..., inplace=True) returns None, so return the new indexed frame instead
        return empty_df_from_datamodel(RoadShapesTable, crs=LAT_LON_CRS).set_index(
            "shape_id_idx"
        )

    start_time = time.time()
    WranglerLogger.debug(f"Reading shapes from {filename}.")

    shapes_df = read_table(
        filename,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
        read_speed=config.CPU.EST_PD_READ_SPEED,
    )
    if filter_to_shape_ids:
        shapes_df = shapes_df[shapes_df["shape_id"].isin(filter_to_shape_ids)]
    WranglerLogger.debug(
        f"Read {len(shapes_df)} shapes from file in {round(time.time() - start_time, 2)}."
    )
    shapes_df = df_to_shapes_df(shapes_df, in_crs=in_crs)
    shapes_df.attrs["source_file"] = filename
    WranglerLogger.info(
        f"Read {len(shapes_df)} shapes from {filename} in {round(time.time() - start_time, 2)}."
    )
    shapes_df = validate_df_to_model(shapes_df, RoadShapesTable)
    return shapes_df

network_wrangler.roadway.shapes.io.write_shapes

write_shapes(shapes_df, out_dir, prefix, format, overwrite)

Writes shapes to file.

Parameters:

  • shapes_df (DataFrame[RoadShapesTable]) –

    DataFrame of shapes to write.

  • out_dir (Union[str, Path]) –

    directory to write shapes to.

  • prefix (str) –

    prefix to add to file name.

  • format (str) –

    format to write shapes in.

  • overwrite (bool) –

    whether to overwrite file if it exists.
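
As a rough illustration of how the output file name is assembled from `out_dir`, `prefix`, and `format`, the sketch below mirrors the path expression used in the source listing for `write_shapes`. The helper name `shapes_file_path` is hypothetical and not part of the library.

```python
from pathlib import Path


def shapes_file_path(out_dir, prefix: str, file_format: str) -> Path:
    """Hypothetical helper: build the shapes file path the way write_shapes does."""
    return Path(out_dir) / f"{prefix}shape.{file_format}"


# The prefix is prepended verbatim, so include any separator yourself.
example = shapes_file_path("out", "scenario_", "parquet")
print(example.name)
```

Because no separator is inserted between the prefix and `shape`, a prefix such as `scenario_` is likely what most callers want.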

Source code in network_wrangler/roadway/shapes/io.py
@validate_call_pyd
def write_shapes(
    shapes_df: DataFrame[RoadShapesTable],
    out_dir: Union[str, Path],
    prefix: str,
    format: str,
    overwrite: bool,
) -> None:
    """Writes shapes to file.

    Args:
        shapes_df: DataFrame of shapes to write.
        out_dir: directory to write shapes to.
        prefix: prefix to add to file name.
        format: format to write shapes in.
        overwrite: whether to overwrite file if it exists.
    """
    shapes_file = Path(out_dir) / f"{prefix}shape.{format}"
    shapes_df = order_fields_from_data_model(shapes_df, RoadShapesTable)
    write_table(shapes_df, shapes_file, overwrite=overwrite)

Functions to create RoadShapesTable from various data.

network_wrangler.roadway.shapes.create.add_offset_shapes

add_offset_shapes(shapes_df, shape_ids, offset_dist_meters=10, id_scalar=DefaultConfig.IDS.ROAD_SHAPE_ID_SCALAR)

Appends a RoadShapesTable with new shape records for shape_ids which are offset from orig.

Parameters:

  • shapes_df (RoadShapesTable) –

    Original RoadShapesTable to add on to.

  • shape_ids (list) –

    Shape_ids to create offsets for.

  • offset_dist_meters (float, default: 10 ) –

    Distance in meters to offset by. Defaults to 10.

  • id_scalar (int, default: ROAD_SHAPE_ID_SCALAR ) –

    Increment to add to shape_id. Defaults to ROAD_SHAPE_ID_SCALAR.

Returns:

  • RoadShapesTable ( DataFrame[RoadShapesTable] ) –

    with added offset shape_ids and a column ref_shape_id which references the shape_id which was offset to create it.

Source code in network_wrangler/roadway/shapes/create.py
def add_offset_shapes(
    shapes_df: DataFrame[RoadShapesTable],
    shape_ids: list,
    offset_dist_meters: float = 10,
    id_scalar: int = DefaultConfig.IDS.ROAD_SHAPE_ID_SCALAR,
) -> DataFrame[RoadShapesTable]:
    """Appends a RoadShapesTable with new shape records for shape_ids which are offset from orig.

    Args:
        shapes_df (RoadShapesTable): Original RoadShapesTable to add on to.
        shape_ids (list): Shape_ids to create offsets for.
        offset_dist_meters (float, optional): Distance in meters to offset by. Defaults to 10.
        id_scalar (int, optional): Increment to add to shape_id. Defaults to ROAD_SHAPE_ID_SCALAR.

    Returns:
        RoadShapesTable: with added offset shape_ids and a column `ref_shape_id` which references
            the shape_id which was offset to create it.
    """
    offset_shapes_df = create_offset_shapes(shapes_df, shape_ids, offset_dist_meters, id_scalar)
    shapes_df = concat_with_attr([shapes_df, offset_shapes_df])
    shapes_df = validate_df_to_model(shapes_df, RoadShapesTable)
    return shapes_df

network_wrangler.roadway.shapes.create.create_offset_shapes

create_offset_shapes(shapes_df, shape_ids, offset_dist_meters=10, id_scalar=DefaultConfig.IDS.ROAD_SHAPE_ID_SCALAR)

Create a RoadShapesTable of new shape records for shape_ids which are offset.

Parameters:

  • shapes_df (RoadShapesTable) –

    Original RoadShapesTable to add on to.

  • shape_ids (list) –

    Shape_ids to create offsets for.

  • offset_dist_meters (float, default: 10 ) –

    Distance in meters to offset by. Defaults to 10.

  • id_scalar (int, default: ROAD_SHAPE_ID_SCALAR ) –

    Increment to add to shape_id. Defaults to ROAD_SHAPE_ID_SCALAR.

Returns:

  • RoadShapesTable ( DataFrame[RoadShapesTable] ) –

    of offset shapes and a column ref_shape_id which references the shape_id which was offset to create it.
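
The relationship between `shape_id`, `ref_shape_id`, and `id_scalar` can be sketched in plain Python. This is only one plausible reading: the function `offset_ids` is a hypothetical stand-in, and the actual behavior of `generate_list_of_new_ids_from_existing` is not shown on this page and may differ.

```python
def offset_ids(ref_ids: list[int], existing_ids: list[int], id_scalar: int) -> list[int]:
    """Hypothetical stand-in: derive one new, unused id per reference id."""
    taken = set(existing_ids)
    new_ids = []
    for ref_id in ref_ids:
        candidate = ref_id + id_scalar
        # Keep stepping by the scalar until the candidate id is unused.
        while candidate in taken:
            candidate += id_scalar
        taken.add(candidate)
        new_ids.append(candidate)
    return new_ids


existing = [1, 2, 3, 1001]
refs = [1, 2]
new_ids = offset_ids(refs, existing, id_scalar=1000)
# Each offset record keeps a pointer back to the shape it was derived from.
records = [{"shape_id": n, "ref_shape_id": r} for n, r in zip(new_ids, refs)]
```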

Source code in network_wrangler/roadway/shapes/create.py
def create_offset_shapes(
    shapes_df: DataFrame[RoadShapesTable],
    shape_ids: list,
    offset_dist_meters: float = 10,
    id_scalar: int = DefaultConfig.IDS.ROAD_SHAPE_ID_SCALAR,
) -> DataFrame[RoadShapesTable]:
    """Create a RoadShapesTable of new shape records for shape_ids which are offset.

    Args:
        shapes_df (RoadShapesTable): Original RoadShapesTable to add on to.
        shape_ids (list): Shape_ids to create offsets for.
        offset_dist_meters (float, optional): Distance in meters to offset by. Defaults to 10.
        id_scalar (int, optional): Increment to add to shape_id. Defaults to ROAD_SHAPE_ID_SCALAR.

    Returns:
      RoadShapesTable: of offset shapes and a column `ref_shape_id` which references
            the shape_id which was offset to create it.
    """
    # Work on a copy of only the shapes that are being offset.
    ref_shapes_df = copy.deepcopy(shapes_df[shapes_df["shape_id"].isin(shape_ids)])

    # Generate a new id for each reference shape; to_list must be called, not passed.
    ref_shapes_df["offset_shape_id"] = generate_list_of_new_ids_from_existing(
        ref_shapes_df.shape_id.to_list(), shapes_df.shape_id.to_list(), id_scalar
    )

    ref_shapes_df["geometry"] = offset_geometry_meters(ref_shapes_df.geometry, offset_dist_meters)

    offset_shapes_df = ref_shapes_df.rename(
        columns={
            "shape_id": "ref_shape_id",
            "offset_shape_id": "shape_id",
        }
    )

    offset_shapes_gdf = gpd.GeoDataFrame(offset_shapes_df, geometry="geometry", crs=shapes_df.crs)

    offset_shapes_gdf = validate_df_to_model(offset_shapes_gdf, RoadShapesTable)

    return offset_shapes_gdf

network_wrangler.roadway.shapes.create.df_to_shapes_df

df_to_shapes_df(shapes_df, in_crs=LAT_LON_CRS, config=DefaultConfig)

Sets index to be a copy of the primary key, validates to RoadShapesTable and aligns CRS.

Parameters:

  • shapes_df (GeoDataFrame) –

    GeoDataFrame (or DataFrame that can be coerced to one) of shapes to convert.

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system number of incoming df. ONLY used if shapes_df is not already a GeoDataFrame. Defaults to LAT_LON_CRS.

  • config (WranglerConfig, default: DefaultConfig ) –

    WranglerConfig instance. Defaults to DefaultConfig. NOTE: Not currently used.

Returns:

  • DataFrame[RoadShapesTable]

Source code in network_wrangler/roadway/shapes/create.py
def df_to_shapes_df(
    shapes_df: gpd.GeoDataFrame,
    in_crs: int = LAT_LON_CRS,
    config: WranglerConfig = DefaultConfig,  # noqa: ARG001
) -> DataFrame[RoadShapesTable]:
    """Sets index to be a copy of the primary key, validates to RoadShapesTable and aligns CRS.

    Args:
        shapes_df (gpd.GeoDataFrame): GeoDataFrame (or DataFrame that can be coerced to one)
            of shapes to convert.
        in_crs: coordinate reference system number of incoming df. ONLY used if shapes_df is not
            already a GeoDataFrame. Defaults to LAT_LON_CRS.
        config: WranglerConfig instance. Defaults to DefaultConfig. NOTE: Not currently used.

    Returns:
        DataFrame[RoadShapesTable]
    """
    WranglerLogger.debug(f"Creating {len(shapes_df)} shapes.")
    if not isinstance(shapes_df, gpd.GeoDataFrame):
        shapes_df = coerce_gdf(shapes_df, in_crs=in_crs)

    if shapes_df.crs != LAT_LON_CRS:
        shapes_df = shapes_df.to_crs(LAT_LON_CRS)

    shapes_df = _check_rename_old_column_aliases(shapes_df)

    shapes_df.attrs.update(RoadShapesAttrs)
    shapes_df = set_df_index_to_pk(shapes_df)
    shapes_df.gdf_name = shapes_df.attrs["name"]
    shapes_df = validate_df_to_model(shapes_df, RoadShapesTable)

    return shapes_df

Edits RoadShapesTable properties.

NOTE: Each public method will return a whole copy of the RoadShapesTable with associated edits. Private methods may return mutated originals.

network_wrangler.roadway.shapes.edit.edit_shape_geometry_from_nodes

edit_shape_geometry_from_nodes(shapes_df, links_df, nodes_df, node_ids)

Updates the geometry for shapes for a given list of nodes.

Should be called by any function that changes a node location.

NOTE: This will update the geometry of a shape at its start and end nodes, but not at the points in between.

Parameters:

  • shapes_df (DataFrame[RoadShapesTable]) –

    RoadShapesTable

  • links_df (DataFrame[RoadLinksTable]) –

    RoadLinksTable

  • nodes_df (DataFrame[RoadNodesTable]) –

    RoadNodesTable

  • node_ids (list[int]) –

    list of node PKs with updated geometry
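
To make the start/end behavior concrete, here is a minimal sketch that treats a shape as a plain list of coordinate pairs. The helper `replace_endpoint` is hypothetical; it only illustrates the idea of positions `0` and `-1` used in the source listing and is not the library's `update_nodes_in_linestring_geometry`.

```python
Coord = tuple[float, float]


def replace_endpoint(coords: list[Coord], new_point: Coord, position: int) -> list[Coord]:
    """Hypothetical helper: return a copy of coords with one endpoint replaced.

    position is 0 for the start (A) node and -1 for the end (B) node.
    """
    updated = list(coords)
    updated[position] = new_point
    return updated


shape = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
moved_a = replace_endpoint(shape, (0.1, 0.1), 0)
moved_b = replace_endpoint(moved_a, (2.1, 0.1), -1)
```

Only the first and last coordinates change; the intermediate point `(1.0, 0.5)` is left as it was.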

Source code in network_wrangler/roadway/shapes/edit.py
def edit_shape_geometry_from_nodes(
    shapes_df: DataFrame[RoadShapesTable],
    links_df: DataFrame[RoadLinksTable],
    nodes_df: DataFrame[RoadNodesTable],
    node_ids: list[int],
) -> DataFrame[RoadShapesTable]:
    """Updates the geometry for shapes for a given list of nodes.

    Should be called by any function that changes a node location.

    NOTE: This will mutate the geometry of a shape in place for the start and end node
            ...but not the nodes in-between.  Something to consider.

    Args:
        shapes_df: RoadShapesTable
        links_df: RoadLinksTable
        nodes_df: RoadNodesTable
        node_ids: list of node PKs with updated geometry
    """
    shapes_df = copy.deepcopy(shapes_df)
    links_A_df = links_df.loc[links_df.A.isin(node_ids)]
    _tempshape_A_df = shapes_df[["shape_id", "geometry"]].merge(
        links_A_df[["shape_id", "A"]], on="shape_id", how="inner"
    )
    _shape_ids_A = _tempshape_A_df.shape_id.unique().tolist()
    if _shape_ids_A:
        shapes_df.loc[_shape_ids_A, "geometry"] = update_nodes_in_linestring_geometry(
            _tempshape_A_df, nodes_df, 0
        )

    links_B_df = links_df.loc[links_df.B.isin(node_ids)]
    _tempshape_B_df = shapes_df[["shape_id", "geometry"]].merge(
        links_B_df[["shape_id", "B"]], on="shape_id", how="inner"
    )
    _shape_ids_B = _tempshape_B_df.shape_id.unique().tolist()
    if _shape_ids_B:
        shapes_df.loc[_shape_ids_B, "geometry"] = update_nodes_in_linestring_geometry(
            _tempshape_B_df, nodes_df, -1
        )
    return shapes_df

Functions to delete shapes from RoadShapesTable.

network_wrangler.roadway.shapes.delete.delete_shapes_by_ids

delete_shapes_by_ids(shapes_df, del_shape_ids, ignore_missing=False)

Deletes shapes from shapes_df by shape_id.

Parameters:

  • shapes_df (DataFrame[RoadShapesTable]) –

    RoadShapesTable

  • del_shape_ids (list[int]) –

    list of shape_ids to delete

  • ignore_missing (bool, default: False ) –

    if True, will not raise an error if shape_id is not found in shapes_df

Returns:

  • DataFrame[RoadShapesTable]

    a copy of shapes_df with shapes removed
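
The effect of `ignore_missing` can be sketched with a plain dictionary standing in for the shapes table. The function `delete_by_ids` and the local `ShapeDeletionError` class are illustrative stand-ins, assuming the behavior described above.

```python
class ShapeDeletionError(Exception):
    """Local stand-in for the library's exception of the same name."""


def delete_by_ids(shapes: dict, del_ids: list, ignore_missing: bool = False) -> dict:
    """Return a new mapping without del_ids; raise on unknown ids unless ignored."""
    to_delete = set(del_ids)
    missing = to_delete - set(shapes)
    if missing and not ignore_missing:
        raise ShapeDeletionError(f"Shapes to delete are not in the network: {missing}")
    # Build a new mapping so the input is left untouched.
    return {k: v for k, v in shapes.items() if k not in to_delete}


shapes = {1: "a", 2: "b"}
kept = delete_by_ids(shapes, [2, 9], ignore_missing=True)
```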

Source code in network_wrangler/roadway/shapes/delete.py
def delete_shapes_by_ids(
    shapes_df: DataFrame[RoadShapesTable], del_shape_ids: list[int], ignore_missing: bool = False
) -> DataFrame[RoadShapesTable]:
    """Deletes shapes from shapes_df by shape_id.

    Args:
        shapes_df: RoadShapesTable
        del_shape_ids: list of shape_ids to delete
        ignore_missing: if True, will not raise an error if shape_id is not found in shapes_df

    Returns:
        DataFrame[RoadShapesTable]: a copy of shapes_df with shapes removed
    """
    WranglerLogger.debug(f"Deleting shapes with ids: \n{del_shape_ids}")

    _missing = set(del_shape_ids) - set(shapes_df.index)
    if _missing:
        WranglerLogger.warning(f"Shapes in network not there to delete: \n{_missing}")
        if not ignore_missing:
            msg = "Shapes to delete are not in the network."
            raise ShapeDeletionError(msg)
    return shapes_df.drop(labels=del_shape_ids, errors="ignore")

Helper functions which filter a RoadShapesTable.

filter_shapes_to_links(shapes_df, links_df)

Shapes which are referenced in RoadLinksTable.

Source code in network_wrangler/roadway/shapes/filters.py
def filter_shapes_to_links(
    shapes_df: DataFrame[RoadShapesTable], links_df: DataFrame[RoadLinksTable]
) -> DataFrame[RoadShapesTable]:
    """Shapes which are referenced in RoadLinksTable."""
    return shapes_df.loc[shapes_df.shape_id.isin(links_df.shape_id)]

Functions that query RoadShapesTable.

shape_ids_without_links(shapes_df, links_df)

List of shape ids that don’t have associated links.
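
Both `filter_shapes_to_links` and `shape_ids_without_links` come down to set membership on shape ids. A minimal sketch with plain lists in place of the dataframes (the sample ids are made up for illustration):

```python
shape_ids = [10, 11, 12, 13]
link_shape_ids = [10, 12, 12]  # several links may share one shape

linked = set(link_shape_ids)
# Shapes referenced by at least one link (what a filter would keep).
referenced = [s for s in shape_ids if s in linked]
# Shapes with no associated link.
unreferenced = sorted(set(shape_ids) - linked)
```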

Source code in network_wrangler/roadway/shapes/shapes.py
def shape_ids_without_links(
    shapes_df: DataFrame[RoadShapesTable], links_df: DataFrame[RoadLinksTable]
) -> list[int]:
    """List of shape ids that don't have associated links."""
    return list(set(shapes_df.index) - set(links_df.shape_id.to_list()))

Roadway Projects

Functions for applying roadway link or node addition project cards to the roadway network.

network_wrangler.roadway.projects.add.apply_new_roadway

apply_new_roadway(roadway_net, roadway_addition, project_name=None)

Add the new roadway features defined in the project card.

New nodes are added first so that links can refer to any added nodes.

Parameters:

  • roadway_net (RoadwayNetwork) –

    input RoadwayNetwork to apply change to

  • roadway_addition (dict) –

    dictionary conforming to RoadwayAddition model, with optional "links" and "nodes" lists of records.

  • project_name (Optional[str], default: None ) –

    optional name of the project to be applied
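
For illustration, a `roadway_addition` dictionary might look like the one below, together with a sketch of the check that every new link refers to a known node. The helper `missing_link_nodes` is hypothetical and only approximates what the source listing does with `_node_ids_from_set_links`.

```python
roadway_addition = {
    "links": [{"model_link_id": 1000, "A": 100, "B": 101, "lanes": 2, "name": "Main St"}],
    "nodes": [
        {"model_node_id": 100, "X": 0, "Y": 0},
        {"model_node_id": 101, "X": 0, "Y": 100},
    ],
}


def missing_link_nodes(addition: dict, network_node_ids: set) -> set:
    """Hypothetical check: node ids used by new links but absent from the network."""
    added = {node["model_node_id"] for node in addition.get("nodes", [])}
    used = {link[end] for link in addition.get("links", []) for end in ("A", "B")}
    # Nodes are added before links, so newly added nodes count as present.
    return used - (set(network_node_ids) | added)


no_gaps = missing_link_nodes(roadway_addition, set())
gaps = missing_link_nodes({"links": roadway_addition["links"]}, {100})
```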

Source code in network_wrangler/roadway/projects/add.py
def apply_new_roadway(
    roadway_net: RoadwayNetwork,
    roadway_addition: dict,
    project_name: Optional[str] = None,
) -> RoadwayNetwork:
    """Add the new roadway features defined in the project card.

    New nodes are added first so that links can refer to any added nodes.

    Args:
        roadway_net: input RoadwayNetwork to apply change to
        roadway_addition: dictionary conforming to RoadwayAddition model such as:

        ```json
            {
                "links": [
                    {
                        "model_link_id": 1000,
                        "A": 100,
                        "B": 101,
                        "lanes": 2,
                        "name": "Main St"
                    }
                ],
                "nodes": [
                    {
                        "model_node_id": 100,
                        "X": 0,
                        "Y": 0
                    },
                    {
                        "model_node_id": 101,
                        "X": 0,
                        "Y": 100
                    }
                ]
            }
        ```
        project_name: optional name of the project to be applied

    Returns:
        RoadwayNetwork: updated network with new links and nodes and associated geometries
    """
    add_links, add_nodes = roadway_addition.get("links", []), roadway_addition.get("nodes", [])
    if not add_links and not add_nodes:
        msg = "No links or nodes given to add."
        raise NewRoadwayError(msg)

    WranglerLogger.debug(
        f"Adding New Roadway Features: \n-Links: \n{add_links}\n-Nodes: \n{add_nodes}"
    )
    if add_nodes:
        _new_nodes_df = data_to_nodes_df(pd.DataFrame(add_nodes), config=roadway_net.config)
        if project_name:
            _new_nodes_df["projects"] = f"{project_name},"
        roadway_net.add_nodes(_new_nodes_df)

    if add_links:
        # make sure links refer to nodes in network
        _missing_nodes = _node_ids_from_set_links(add_links) - set(roadway_net.nodes_df.index)
        if _missing_nodes:
            msg = "Link additions use nodes not found in network."
            WranglerLogger.error(msg + f" Missing nodes for new links: {_missing_nodes}")
            raise NewRoadwayError(msg)
        _new_links_df = data_to_links_df(
            add_links,
            nodes_df=roadway_net.nodes_df,
        )
        if project_name:
            _new_links_df["projects"] = f"{project_name},"

        roadway_net.add_links(_new_links_df)

    return roadway_net

Wrapper function for applying code to change roadway network.

network_wrangler.roadway.projects.calculate.apply_calculated_roadway

apply_calculated_roadway(roadway_net, pycode)

Changes roadway network object by executing pycode.

Parameters:

  • roadway_net (RoadwayNetwork) –

    network to manipulate

  • pycode (str) –

    python code which changes values in the roadway network object
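
The `pycode` string is expected to refer to the network as `self`. The sketch below shows that idea with a `SimpleNamespace` standing in for a RoadwayNetwork. It passes an explicit namespace to `exec`, which is an assumption of this sketch; the source listing binds a local `self` and calls `exec(pycode)` directly.

```python
from types import SimpleNamespace


def apply_pycode(net, pycode: str):
    """Sketch only: run pycode with the network exposed under the name `self`."""
    exec(pycode, {"self": net})  # never pass untrusted code to exec
    return net


# Stand-in object; a real call would pass a RoadwayNetwork.
net = SimpleNamespace(links={"lanes": 2})
apply_pycode(net, "self.links['lanes'] = self.links['lanes'] + 1")
```

Since the code is executed as-is, project cards that carry `pycode` should come only from trusted sources.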

Source code in network_wrangler/roadway/projects/calculate.py
def apply_calculated_roadway(
    roadway_net: RoadwayNetwork,
    pycode: str,
) -> RoadwayNetwork:
    """Changes roadway network object by executing pycode.

    Args:
        roadway_net: network to manipulate
        pycode: python code which changes values in the roadway network object
    """
    WranglerLogger.debug("Applying calculated roadway project.")
    self = roadway_net
    exec(pycode)

    return roadway_net

Wrapper function for applying roadway deletion project card to RoadwayNetwork.

network_wrangler.roadway.projects.delete.apply_roadway_deletion

apply_roadway_deletion(roadway_net, roadway_deletion, transit_net=None)

Delete the roadway links or nodes defined in the project card.

When deleting links, the shapes and nodes used by those links are also cleaned up if clean_shapes or clean_nodes is set in RoadwayDeletion. By default, neither shapes nor nodes are cleaned up.

Parameters:

  • roadway_net (RoadwayNetwork) –

    input RoadwayNetwork to apply change to

  • roadway_deletion (Union[dict, RoadwayDeletion]) –

    dictionary conforming to RoadwayDeletion

  • transit_net (Optional[TransitNetwork], default: None ) –

    input TransitNetwork which will be used to check if deletion breaks transit shapes. If None, will not check for broken shapes.

Source code in network_wrangler/roadway/projects/delete.py
def apply_roadway_deletion(
    roadway_net: RoadwayNetwork,
    roadway_deletion: Union[dict, RoadwayDeletion],
    transit_net: Optional[TransitNetwork] = None,
) -> RoadwayNetwork:
    """Delete the roadway links or nodes defined in the project card.

    When deleting links, the shapes and nodes used by those links are also cleaned up if
    clean_shapes or clean_nodes is set in RoadwayDeletion. By default, neither is cleaned up.

    Args:
        roadway_net: input RoadwayNetwork to apply change to
        roadway_deletion: dictionary conforming to RoadwayDeletion
        transit_net: input TransitNetwork which will be used to check if deletion breaks transit
            shapes. If None, will not check for broken shapes.
    """
    if not isinstance(roadway_deletion, RoadwayDeletion):
        roadway_deletion = RoadwayDeletion(**roadway_deletion)

    WranglerLogger.debug(f"Deleting Roadway Features: \n{roadway_deletion}")

    if roadway_deletion.links:
        roadway_net.delete_links(
            roadway_deletion.links.model_dump(exclude_none=True, by_alias=True),
            clean_shapes=roadway_deletion.clean_shapes,
            clean_nodes=roadway_deletion.clean_nodes,
            transit_net=transit_net,
        )

    if roadway_deletion.nodes:
        roadway_net.delete_nodes(
            roadway_deletion.nodes.model_dump(exclude_none=True, by_alias=True),
        )

    return roadway_net

Functions for applying roadway property change project cards to the roadway network.

network_wrangler.roadway.projects.edit_property.apply_roadway_property_change

apply_roadway_property_change(roadway_net, selection, property_changes, project_name=None)

Changes roadway properties for the selected features based on the project card.

Parameters:

  • roadway_net (RoadwayNetwork) –

    input RoadwayNetwork to apply change to

  • selection

    roadway selection object

  • property_changes

    dictionary of roadway properties to change. e.g.

    #changes number of lanes 3 to 2 (reduction of 1) and adds a bicycle lane
    lanes:
        existing: 3
        change: -1
    bicycle_facility:
        set: 2
    
  • project_name (Optional[str], default: None ) –

    optional name of the project to be applied
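
One way to read the `existing` / `change` / `set` keys in the example above is sketched below. The function `apply_change` is hypothetical, and how the library handles an `existing` value that does not match is not shown on this page; this sketch merely prints a warning.

```python
def apply_change(current, change: dict):
    """Hypothetical reading of a single property change entry."""
    if "set" in change:
        # `set` replaces the value outright.
        return change["set"]
    if "existing" in change and current != change["existing"]:
        print(f"warning: expected {change['existing']}, found {current}")
    # `change` is applied relative to the current value.
    return current + change.get("change", 0)


lanes = apply_change(3, {"existing": 3, "change": -1})
bicycle_facility = apply_change(0, {"set": 2})
```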

Source code in network_wrangler/roadway/projects/edit_property.py
def apply_roadway_property_change(
    roadway_net: RoadwayNetwork,
    selection: Union[RoadwayNodeSelection, RoadwayLinkSelection],
    property_changes: dict[str, RoadPropertyChange],
    project_name: Optional[str] = None,
) -> RoadwayNetwork:
    """Changes roadway properties for the selected features based on the project card.

    Args:
        roadway_net: input RoadwayNetwork to apply change to
        selection : roadway selection object
        property_changes : dictionary of roadway properties to change.
            e.g.

            ```yml
            #changes number of lanes 3 to 2 (reduction of 1) and adds a bicycle lane
            lanes:
                existing: 3
                change: -1
            bicycle_facility:
                set: 2
            ```
        project_name: optional name of the project to be applied
    """
    WranglerLogger.debug("Applying roadway property change project.")

    if isinstance(selection, RoadwayLinkSelection):
        roadway_net.links_df = edit_link_properties(
            roadway_net.links_df,
            selection.selected_links,
            property_changes,
            project_name=project_name,
        )

    elif isinstance(selection, RoadwayNodeSelection):
        non_geo_changes = {
            k: v for k, v in property_changes.items() if k not in NodeGeometryChange.model_fields
        }
        for property, property_dict in non_geo_changes.items():
            prop_change = RoadPropertyChange(**property_dict)
            roadway_net.nodes_df = edit_node_property(
                roadway_net.nodes_df,
                selection.selected_nodes,
                property,
                prop_change,
                project_name=project_name,
            )

        geo_changes_df = _node_geo_change_from_property_changes(
            property_changes, selection.selected_nodes
        )
        if geo_changes_df is not None:
            roadway_net.move_nodes(geo_changes_df)

    else:
        msg = "geometry_type must be either 'links' or 'nodes'"
        raise RoadwayPropertyChangeError(msg)

    return roadway_net

Roadway Supporting Modules

Functions for reading and writing roadway networks.

network_wrangler.roadway.io.convert_roadway_file_serialization

convert_roadway_file_serialization(in_path, in_format='geojson', out_dir=Path(), out_format='parquet', out_prefix='', overwrite=True, boundary_gdf=None, boundary_geocode=None, boundary_file=None, chunk_size=None)

Converts the files of a roadway network from one serialization format to another without parsing.

Does not do any validation.

Parameters:

  • in_path (Path) –

    the path to the input directory.

  • in_format (RoadwayFileTypes, default: 'geojson' ) –

    the file format of the input files. Defaults to “geojson”.

  • out_dir (Path, default: Path() ) –

    the path where the output will be saved.

  • out_format (RoadwayFileTypes, default: 'parquet' ) –

    the format of the output files. Defaults to “parquet”.

  • out_prefix (str, default: '' ) –

    the name prefix of the roadway files that will be generated. Defaults to “”.

  • overwrite (bool, default: True ) –

    if True, will overwrite the files if they already exist. Defaults to True.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • chunk_size (Optional[int], default: None ) –

    Size of chunk to process if want to force chunking. Defaults to None. Chunking will only apply to converting from json to parquet files.
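
The output file names follow from `out_dir`, `out_prefix`, and `out_format`. The sketch below mirrors the naming visible in the source listing for this function; the helper `output_paths` itself is hypothetical.

```python
from pathlib import Path


def output_paths(out_dir: Path, out_prefix: str, out_format: str) -> dict:
    """Hypothetical helper mirroring the naming used in the source listing."""
    paths = {
        kind: out_dir / f"{out_prefix}_{kind}.{out_format}"
        for kind in ("nodes", "links", "shapes")
    }
    if out_format == "geojson":
        # The links file gets a .json suffix rather than .geojson.
        paths["links"] = paths["links"].with_suffix(".json")
    return paths


geojson_paths = output_paths(Path("out"), "base", "geojson")
parquet_paths = output_paths(Path("out"), "", "parquet")
```

Note that an underscore always follows the prefix, so the default empty prefix yields names such as `_nodes.parquet`.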

Source code in network_wrangler/roadway/io.py
def convert_roadway_file_serialization(
    in_path: Path,
    in_format: RoadwayFileTypes = "geojson",
    out_dir: Path = Path(),
    out_format: RoadwayFileTypes = "parquet",
    out_prefix: str = "",
    overwrite: bool = True,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    chunk_size: Optional[int] = None,
):
    """Converts the files of a roadway network from one serialization format to another without parsing.

    Does not do any validation.

    Args:
        in_path: the path to the input directory.
        in_format: the file format of the input files. Defaults to "geojson".
        out_dir: the path where the output will be saved.
        out_format: the format of the output files. Defaults to "parquet".
        out_prefix: the name prefix of the roadway files that will be generated. Defaults to "".
        overwrite: if True, will overwrite the files if they already exist. Defaults to True.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        chunk_size: Size of chunk to process if want to force chunking. Defaults to None.
            Chunking will only apply to converting from json to parquet files.
    """
    links_in_file, nodes_in_file, shapes_in_file = id_roadway_file_paths_in_dir(in_path, in_format)
    from ..utils.io_table import convert_file_serialization  # noqa: PLC0415

    nodes_out_file = Path(out_dir / f"{out_prefix}_nodes.{out_format}")
    convert_file_serialization(
        nodes_in_file,
        nodes_out_file,
        overwrite=overwrite,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
        chunk_size=chunk_size,
    )

    if any([boundary_file, boundary_geocode, boundary_gdf is not None]):
        node_filter_s = read_table(nodes_out_file).model_node_id
    else:
        node_filter_s = None

    links_out_file = Path(out_dir / f"{out_prefix}_links.{out_format}")
    if out_format == "geojson":
        links_out_file = links_out_file.with_suffix(".json")

    convert_file_serialization(
        links_in_file,
        links_out_file,
        overwrite=overwrite,
        node_filter_s=node_filter_s,
        chunk_size=chunk_size,
    )

    if shapes_in_file:
        shapes_out_file = Path(out_dir / f"{out_prefix}_shapes.{out_format}")
        convert_file_serialization(
            shapes_in_file,
            shapes_out_file,
            overwrite=overwrite,
            boundary_gdf=boundary_gdf,
            boundary_geocode=boundary_geocode,
            boundary_file=boundary_file,
            chunk_size=chunk_size,
        )

network_wrangler.roadway.io.convert_roadway_network_serialization

convert_roadway_network_serialization(input_path, output_format='geojson', out_dir='.', input_file_format='geojson', out_prefix='', overwrite=True, boundary_gdf=None, boundary_geocode=None, boundary_file=None, filter_links_to_nodes=False)

Converts a roadway network from one serialization format to another with parsing.

Performs validation and parsing.

Parameters:

  • input_path (Union[str, Path]) –

    the path to the input directory.

  • output_format (RoadwayFileTypes, default: 'geojson' ) –

    the format of the output files. Defaults to “geojson”.

  • out_dir (Union[str, Path], default: '.' ) –

    the path where the output will be saved.

  • input_file_format (RoadwayFileTypes, default: 'geojson' ) –

    the format of the input files. Defaults to “geojson”.

  • out_prefix (str, default: '' ) –

    the name prefix of the roadway files that will be generated. Defaults to “”.

  • overwrite (bool, default: True ) –

    if True, will overwrite the files if they already exist. Defaults to True.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • filter_links_to_nodes (bool, default: False ) –

    if True, will filter the links to only those that have nodes. Defaults to False unless boundary_gdf, boundary_geocode, or boundary_file are provided.

Source code in network_wrangler/roadway/io.py
def convert_roadway_network_serialization(
    input_path: Union[str, Path],
    output_format: RoadwayFileTypes = "geojson",
    out_dir: Union[str, Path] = ".",
    input_file_format: RoadwayFileTypes = "geojson",
    out_prefix: str = "",
    overwrite: bool = True,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    filter_links_to_nodes: bool = False,
):
    """Converts a roadway network from one serialization format to another with parsing.

    Performs validation and parsing.

    Args:
        input_path: the path to the input directory.
        output_format: the format of the output files. Defaults to "geojson".
        out_dir: the path where the output will be saved.
        input_file_format: the format of the input files. Defaults to "geojson".
        out_prefix: the name prefix of the roadway files that will be generated. Defaults to "".
        overwrite: if True, will overwrite the files if they already exist. Defaults to True.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        filter_links_to_nodes: if True, will filter the links to only those that have nodes.
            Defaults to False unless boundary_gdf, boundary_geocode, or boundary_file are provided.
    """
    if input_file_format is None:
        input_file_format = "geojson"
    WranglerLogger.info(
        f"Loading roadway network from {input_path} with format {input_file_format}"
    )
    net = load_roadway_from_dir(
        input_path,
        file_format=input_file_format,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
        filter_links_to_nodes=filter_links_to_nodes,
    )
    WranglerLogger.info(f"Writing roadway network to {out_dir} in {output_format} format.")
    write_roadway(
        net,
        prefix=out_prefix,
        out_dir=out_dir,
        file_format=output_format,
        overwrite=overwrite,
    )

network_wrangler.roadway.io.id_roadway_file_paths_in_dir

id_roadway_file_paths_in_dir(dir, file_format='geojson')

Identifies the paths to the links, nodes, and shapes files in a directory.
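
The lookup relies on glob patterns such as `*link*json` and `*node*geojson`. A small self-contained sketch of that matching, using a temporary directory and made-up file names (the `next(..., None)` default is a convenience of this sketch, not what the source listing does):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    net_dir = Path(tmp)
    for name in ("link.json", "node.geojson", "shape.geojson"):
        (net_dir / name).touch()

    # For the "geojson" format, links are looked up with a "json" suffix.
    links_file = next(net_dir.glob("*link*json"))
    nodes_file = next(net_dir.glob("*node*geojson"))
    # A default of None avoids StopIteration when the optional file is absent.
    shapes_file = next(net_dir.glob("*shape*geojson"), None)
    found = (links_file.name, nodes_file.name, shapes_file.name)
    absent = next(net_dir.glob("*shape*parquet"), None)
```

Only the first match of each pattern is used, so a directory holding several candidate files for one table may give an unexpected result.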

Source code in network_wrangler/roadway/io.py
def id_roadway_file_paths_in_dir(
    dir: Union[Path, str], file_format: RoadwayFileTypes = "geojson"
) -> tuple[Path, Path, Union[None, Path]]:
    """Identifies the paths to the links, nodes, and shapes files in a directory."""
    network_path = Path(dir)
    if not network_path.is_dir():
        msg = f"Directory {network_path} does not exist"
        raise FileNotFoundError(msg)

    _link_file_format = file_format
    if "geojson" in file_format:
        _link_file_format = "json"

    try:
        links_file = next(network_path.glob(f"*link*{_link_file_format}"))
    except StopIteration as err:
        msg = f"No links file with {_link_file_format} file format found in {network_path}"
        raise FileNotFoundError(msg) from err

    try:
        nodes_file = next(network_path.glob(f"*node*{file_format}"))
    except StopIteration as err:
        msg = f"No nodes file with {file_format} file format found in {network_path}"
        raise FileNotFoundError(msg) from err

    try:
        shapes_file = next(network_path.glob(f"*shape*{file_format}"))
    except StopIteration:
        # The shapes file is optional, so a missing file is not an error.
        shapes_file = None

    return links_file, nodes_file, shapes_file
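
The lookup above relies on `Path.glob` patterns, and a "geojson" network is searched for a links file ending in "json". The following is a minimal, standard-library sketch of that matching behaviour. The helper name `find_network_files` and the file names `demo_link.json` and `demo_node.geojson` are made up for illustration; unlike the library function, this sketch returns None instead of raising when links or nodes are missing.

```python
import tempfile
from pathlib import Path


def find_network_files(directory, file_format="geojson"):
    """Illustrative re-statement of the glob lookup; not the library function."""
    network_path = Path(directory)
    # A "geojson" network is searched for a links file ending in "json".
    link_format = "json" if "geojson" in file_format else file_format
    links_file = next(network_path.glob(f"*link*{link_format}"), None)
    nodes_file = next(network_path.glob(f"*node*{file_format}"), None)
    # Shapes are optional, so None is an acceptable result here.
    shapes_file = next(network_path.glob(f"*shape*{file_format}"), None)
    return links_file, nodes_file, shapes_file


with tempfile.TemporaryDirectory() as tmp:
    for name in ("demo_link.json", "demo_node.geojson"):
        (Path(tmp) / name).touch()
    links_file, nodes_file, shapes_file = find_network_files(tmp)
    found = (links_file.name, nodes_file.name, shapes_file)

print(found)
```

Because the links pattern ends in `json` with no leading dot, it also matches a file whose name ends in `geojson`, and `next()` returns whichever match the directory listing yields first.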

network_wrangler.roadway.io.load_roadway

load_roadway(links_file, nodes_file, shapes_file=None, in_crs=LAT_LON_CRS, read_in_shapes=False, boundary_gdf=None, boundary_geocode=None, boundary_file=None, filter_links_to_nodes=None, config=DefaultConfig)

Reads a network from the roadway network standard.

Validates that it conforms to the schema.

Parameters:

  • links_file (Path) –

    full path to the link file

  • nodes_file (Path) –

    full path to the node file

  • shapes_file (Optional[Path], default: None ) –

    full path to the shape file. NOTE: if the file does not exist, no shapes are loaded and no error is raised.

  • in_crs (int, default: LAT_LON_CRS ) –

    coordinate reference system that the network is in. Defaults to LAT_LON_CRS, which is 4326 (WGS84 lat/long).

  • read_in_shapes (bool, default: False ) –

    if True, will read shapes into network instead of only lazily reading them when they are called. Defaults to False.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • filter_links_to_nodes (Optional[bool], default: None ) –

    if True, will filter the links to only those whose nodes are present in the nodes table. Defaults to None, which resolves to True when boundary_gdf, boundary_geocode, or boundary_file is provided and to False otherwise.

  • config (ConfigInputTypes, default: DefaultConfig ) –

    configuration for the network. Can be a dictionary, a path to a file, a list of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

Returns:

  • RoadwayNetwork –

    instance of RoadwayNetwork

Source code in network_wrangler/roadway/io.py
def load_roadway(
    links_file: Path,
    nodes_file: Path,
    shapes_file: Optional[Path] = None,
    in_crs: int = LAT_LON_CRS,
    read_in_shapes: bool = False,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    filter_links_to_nodes: Optional[bool] = None,
    config: ConfigInputTypes = DefaultConfig,
) -> RoadwayNetwork:
    """Reads a network from the roadway network standard.

    Validates that it conforms to the schema.

    Args:
        links_file: full path to the link file
        nodes_file: full path to the node file
        shapes_file: full path to the shape file. NOTE: if the file does not exist, no shapes
            are loaded and no error is raised.
        in_crs: coordinate reference system that the network is in. Defaults to LAT_LON_CRS,
            which is 4326 (WGS84 lat/long).
        read_in_shapes: if True, will read shapes into network instead of only lazily
            reading them when they are called. Defaults to False.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        filter_links_to_nodes: if True, will filter the links to only those whose nodes are
            present in the nodes table. Defaults to None, which resolves to True when
            boundary_gdf, boundary_geocode, or boundary_file is provided and to False otherwise.
        config: configuration for the network. Can be a dictionary, a path to a file, a list
            of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

    Returns:
        (RoadwayNetwork) instance of RoadwayNetwork
    """
    from .network import RoadwayNetwork  # noqa: PLC0415

    if not isinstance(config, WranglerConfig):
        config = load_wrangler_config(config)

    nodes_file = Path(nodes_file)
    links_file = Path(links_file)
    shapes_file = Path(shapes_file) if shapes_file else None
    if read_in_shapes and shapes_file is not None and shapes_file.exists():
        shapes_df = read_shapes(
            shapes_file,
            in_crs=in_crs,
            config=config,
            boundary_gdf=boundary_gdf,
            boundary_geocode=boundary_geocode,
            boundary_file=boundary_file,
        )
    else:
        shapes_df = None
    nodes_df = read_nodes(
        nodes_file,
        in_crs=in_crs,
        config=config,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
    )

    if filter_links_to_nodes is None and any(
        [boundary_file, boundary_geocode, boundary_gdf is not None]
    ):
        filter_links_to_nodes = True
    elif filter_links_to_nodes is None:
        filter_links_to_nodes = False

    links_df = read_links(
        links_file,
        in_crs=in_crs,
        config=config,
        nodes_df=nodes_df,
        filter_to_nodes=filter_links_to_nodes,
    )

    roadway_network = RoadwayNetwork(
        links_df=links_df,
        nodes_df=nodes_df,
        shapes_df=shapes_df,
        config=config,
    )
    if shapes_file and shapes_file.exists():
        roadway_network._shapes_file = shapes_file
    roadway_network._links_file = links_file
    roadway_network._nodes_file = nodes_file

    return roadway_network
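
The way `filter_links_to_nodes` resolves its default can be restated as a small standalone function. This is a sketch of the branching shown above, not library code; the function name `resolve_filter_links_to_nodes` is hypothetical.

```python
def resolve_filter_links_to_nodes(
    filter_links_to_nodes=None,
    boundary_gdf=None,
    boundary_geocode=None,
    boundary_file=None,
):
    """Return the effective filter flag (illustrative restatement)."""
    # An explicit True or False from the caller always wins.
    if filter_links_to_nodes is not None:
        return filter_links_to_nodes
    # Otherwise filter only when some boundary was supplied.
    # boundary_gdf is compared to None because a DataFrame has no single truth value.
    return any([boundary_file, boundary_geocode, boundary_gdf is not None])


no_boundary = resolve_filter_links_to_nodes()
with_boundary = resolve_filter_links_to_nodes(boundary_file="ecolab.geojson")
explicit_off = resolve_filter_links_to_nodes(False, boundary_file="ecolab.geojson")
print(no_boundary, with_boundary, explicit_off)
```

Passing an explicit False keeps every link even when a boundary trims the nodes, so links may then refer to nodes that were filtered out.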

network_wrangler.roadway.io.load_roadway_from_dataframes

load_roadway_from_dataframes(links_df, nodes_df, shapes_df=None, config=DefaultConfig)

Creates a RoadwayNetwork from DataFrames with validation.

Validates the DataFrames against their respective Pandera schemas before creating the network instance. This method is useful if the user is already working with networks in DataFrames and doesn’t want to write them to disk just to read them again.

Parameters:

  • links_df (DataFrame) –

    DataFrame containing roadway links data

  • nodes_df (DataFrame) –

    DataFrame containing roadway nodes data

  • shapes_df (Optional[GeoDataFrame], default: None ) –

    Optional GeoDataFrame containing roadway shapes data

  • config (ConfigInputTypes, default: DefaultConfig ) –

    configuration for the network. Can be a dictionary, a path to a file, a list of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

Returns:

  • RoadwayNetwork –

    instance with validated data

Source code in network_wrangler/roadway/io.py
def load_roadway_from_dataframes(
    links_df: DataFrame,
    nodes_df: DataFrame,
    shapes_df: Optional[GeoDataFrame] = None,
    config: ConfigInputTypes = DefaultConfig,
) -> RoadwayNetwork:
    """Creates a RoadwayNetwork from DataFrames with validation.

    Validates the DataFrames against their respective Pandera schemas before
    creating the network instance. This method is useful if the user is already working with
    networks in DataFrames and doesn't want to write them to disk just to read them again.

    Args:
        links_df: DataFrame containing roadway links data
        nodes_df: DataFrame containing roadway nodes data
        shapes_df: Optional GeoDataFrame containing roadway shapes data
        config: configuration for the network. Can be a dictionary, a path to a file, a list
            of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

    Returns:
        (RoadwayNetwork) instance with validated data
    """
    from ..models.roadway.tables import (  # noqa: PLC0415
        RoadLinksTable,
        RoadNodesTable,
        RoadShapesTable,
    )
    from ..utils.models import validate_df_to_model  # noqa: PLC0415
    from .network import RoadwayNetwork  # noqa: PLC0415

    if not isinstance(config, WranglerConfig):
        config = load_wrangler_config(config)

    # Validate DataFrames against Pandera schemas
    WranglerLogger.debug("Validating nodes_df against RoadNodesTable schema")
    validated_nodes_df = validate_df_to_model(nodes_df, RoadNodesTable)

    WranglerLogger.debug("Validating links_df against RoadLinksTable schema")
    validated_links_df = validate_df_to_model(links_df, RoadLinksTable)

    validated_shapes_df = None
    if shapes_df is not None:
        WranglerLogger.debug("Validating shapes_df against RoadShapesTable schema")
        validated_shapes_df = validate_df_to_model(shapes_df, RoadShapesTable)

    # Create RoadwayNetwork with validated DataFrames
    roadway_network = RoadwayNetwork(
        links_df=validated_links_df,
        nodes_df=validated_nodes_df,
        shapes_df=validated_shapes_df,
        config=config,
    )

    return roadway_network

network_wrangler.roadway.io.load_roadway_from_dir

load_roadway_from_dir(dir, file_format='geojson', read_in_shapes=False, boundary_gdf=None, boundary_geocode=None, boundary_file=None, filter_links_to_nodes=None, config=DefaultConfig)

Reads a network from the roadway network standard.

Validates that it conforms to the schema.

Parameters:

  • dir (Union[Path, str]) –

    the directory where the network files are located

  • file_format (RoadwayFileTypes, default: 'geojson' ) –

    the file format of the files. Defaults to “geojson”

  • read_in_shapes (bool, default: False ) –

    if True, will read shapes into network instead of only lazily reading them when they are called. Defaults to False.

  • boundary_gdf (Optional[GeoDataFrame], default: None ) –

    GeoDataFrame to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_geocode (Optional[str], default: None ) –

    Geocode to filter the input data to. Only used for geographic data. Defaults to None.

  • boundary_file (Optional[Path], default: None ) –

    File to load as a boundary to filter the input data to. Only used for geographic data. Defaults to None.

  • filter_links_to_nodes (Optional[bool], default: None ) –

    if True, will filter the links to only those whose nodes are present in the nodes table. Defaults to None, which resolves to True when boundary_gdf, boundary_geocode, or boundary_file is provided and to False otherwise.

  • config (ConfigInputTypes, default: DefaultConfig ) –

    configuration for the network. Can be a dictionary, a path to a file, a list of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

Returns:

  • RoadwayNetwork –

    instance of RoadwayNetwork

Source code in network_wrangler/roadway/io.py
def load_roadway_from_dir(
    dir: Union[Path, str],
    file_format: RoadwayFileTypes = "geojson",
    read_in_shapes: bool = False,
    boundary_gdf: Optional[GeoDataFrame] = None,
    boundary_geocode: Optional[str] = None,
    boundary_file: Optional[Path] = None,
    filter_links_to_nodes: Optional[bool] = None,
    config: ConfigInputTypes = DefaultConfig,
) -> RoadwayNetwork:
    """Reads a network from the roadway network standard.

    Validates that it conforms to the schema.

    Args:
        dir: the directory where the network files are located
        file_format: the file format of the files. Defaults to "geojson"
        read_in_shapes: if True, will read shapes into network instead of only lazily
            reading them when they are called. Defaults to False.
        boundary_gdf: GeoDataFrame to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_geocode: Geocode to filter the input data to. Only used for geographic data.
            Defaults to None.
        boundary_file: File to load as a boundary to filter the input data to. Only used for
            geographic data. Defaults to None.
        filter_links_to_nodes: if True, will filter the links to only those whose nodes are
            present in the nodes table. Defaults to None, which resolves to True when
            boundary_gdf, boundary_geocode, or boundary_file is provided and to False otherwise.
        config: configuration for the network. Can be a dictionary, a path to a file, a list
            of paths to files, or a WranglerConfig instance. Defaults to DefaultConfig.

    Returns:
        (RoadwayNetwork) instance of RoadwayNetwork
    """
    links_file, nodes_file, shapes_file = id_roadway_file_paths_in_dir(dir, file_format)

    return load_roadway(
        links_file=links_file,
        nodes_file=nodes_file,
        shapes_file=shapes_file,
        read_in_shapes=read_in_shapes,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
        filter_links_to_nodes=filter_links_to_nodes,
        config=config,
    )

network_wrangler.roadway.io.write_roadway

write_roadway(net, out_dir='.', convert_complex_link_properties_to_single_field=False, prefix='', file_format='geojson', overwrite=True, true_shape=False)

Writes a network in the roadway network standard.

Parameters:

  • net (Union[RoadwayNetwork, ModelRoadwayNetwork]) –

    RoadwayNetwork or ModelRoadwayNetwork instance to write out.

  • out_dir (Union[Path, str], default: '.' ) –

    the path where the output will be saved. Defaults to “.”.

  • prefix (str, default: '' ) –

    the name prefix of the roadway files that will be generated.

  • file_format (RoadwayFileTypes, default: 'geojson' ) –

    the format of the output files. Defaults to “geojson”.

  • convert_complex_link_properties_to_single_field (bool, default: False ) –

    if True, will convert complex link properties to a single column consistent with v0 format. This format is NOT valid with parquet and many other software tools. Defaults to False.

  • overwrite (bool, default: True ) –

    if True, will overwrite the files if they already exist. Defaults to True.

  • true_shape (bool, default: False ) –

    if True, will write the true shape of the links as found from shapes. Defaults to False.

Source code in network_wrangler/roadway/io.py
def write_roadway(
    net: Union[RoadwayNetwork, ModelRoadwayNetwork],
    out_dir: Union[Path, str] = ".",
    convert_complex_link_properties_to_single_field: bool = False,
    prefix: str = "",
    file_format: RoadwayFileTypes = "geojson",
    overwrite: bool = True,
    true_shape: bool = False,
) -> None:
    """Writes a network in the roadway network standard.

    Args:
        net: RoadwayNetwork or ModelRoadwayNetwork instance to write out.
        out_dir: the path where the output will be saved. Defaults to ".".
        prefix: the name prefix of the roadway files that will be generated.
        file_format: the format of the output files. Defaults to "geojson".
        convert_complex_link_properties_to_single_field: if True, will convert complex link
            properties to a single column consistent with v0 format. This format is NOT valid
            with parquet and many other software tools. Defaults to False.
        overwrite: if True, will overwrite the files if they already exist. Defaults to True.
        true_shape: if True, will write the true shape of the links as found from shapes.
            Defaults to False.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    prefix = f"{prefix}_" if prefix else ""

    links_df = net.links_df
    if true_shape:
        links_df = links_df.true_shape(net.shapes_df)

    write_links(
        links_df,  # true-shape links when true_shape is True, otherwise net.links_df
        convert_complex_properties_to_single_field=convert_complex_link_properties_to_single_field,
        out_dir=out_dir,
        prefix=prefix,
        file_format=file_format,
        overwrite=overwrite,
        include_geometry=true_shape,
    )
    write_nodes(net.nodes_df, out_dir, prefix, file_format, overwrite)

    if not true_shape and not net.shapes_df.empty:
        write_shapes(net.shapes_df, out_dir, prefix, file_format, overwrite)
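
The prefix handling and the decision about which tables get written can be summarised in a short sketch. It only mirrors the control flow shown above and writes nothing; the helper name `plan_roadway_write` and the `shapes_empty` flag are hypothetical stand-ins for `net.shapes_df.empty`.

```python
def plan_roadway_write(prefix="", true_shape=False, shapes_empty=False):
    """Return the effective file prefix and the tables that would be written."""
    # A non-empty prefix gets a trailing underscore; an empty one stays empty.
    file_prefix = f"{prefix}_" if prefix else ""
    # Links and nodes are always written.
    tables = ["links", "nodes"]
    # Shapes are written separately only when they are not already folded
    # into the links as true shapes and there is at least one shape.
    if not true_shape and not shapes_empty:
        tables.append("shapes")
    return file_prefix, tables


default_plan = plan_roadway_write()
true_shape_plan = plan_roadway_write(prefix="ecolab", true_shape=True)
print(default_plan)
print(true_shape_plan)
```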

Functions to clip a RoadwayNetwork object to a boundary.

Clipped roadway is an independent roadway network that is a subset of the original roadway network.

Unlike a Subnet, it is a geographic selection defined by a boundary rather than a logical selection defined by a graph.

Example usage:

from network_wrangler.roadway import load_roadway_from_dir, write_roadway
from network_wrangler.roadway.clip import clip_roadway

stpaul_net = load_roadway_from_dir(example_dir / "stpaul")
boundary_file = test_dir / "data" / "ecolab.geojson"
clipped_network = clip_roadway(stpaul_net, boundary_file=boundary_file)
write_roadway(clipped_network, out_dir, prefix="ecolab", file_format="geojson", true_shape=True)

network_wrangler.roadway.clip.clip_roadway

clip_roadway(network, boundary_gdf=None, boundary_geocode=None, boundary_file=None)

Clip a RoadwayNetwork object to a boundary.

Retains only the links within or crossing the boundary and all the nodes that those links connect to. At least one of boundary_gdf, boundary_geocode, or boundary_file must be provided.

Parameters:

  • network (RoadwayNetwork) –

    RoadwayNetwork object to be clipped.

  • boundary_gdf (GeoDataFrame, default: None ) –

    GeoDataframe of one or more polygons which define the boundary to clip to. Defaults to None.

  • boundary_geocode (Union[str, dict], default: None ) –

    Place name to clip data to, as resolved by OpenStreetMap’s Nominatim API (e.g. “Hennepin County, MN, USA”). Defaults to None.

  • boundary_file (Union[str, Path], default: None ) –

    Geographic data file that can be read by GeoPandas (e.g. geojson, parquet, shp) that defines a geographic polygon area to clip to. Defaults to None.

Source code in network_wrangler/roadway/clip.py
def clip_roadway(
    network: RoadwayNetwork,
    boundary_gdf: gpd.GeoDataFrame = None,
    boundary_geocode: Optional[Union[str, dict]] = None,
    boundary_file: Optional[Union[str, Path]] = None,
) -> RoadwayNetwork:
    """Clip a RoadwayNetwork object to a boundary.

    Retains only the links within or crossing the boundary and all the nodes that those links
    connect to.  At least one of boundary_gdf, boundary_geocode, or boundary_file must be provided.

    Args:
        network (RoadwayNetwork): RoadwayNetwork object to be clipped.
        boundary_gdf (gpd.GeoDataFrame, optional): GeoDataframe of one or more polygons which
            define the boundary to clip to. Defaults to None.
        boundary_geocode (Union[str,dict], optional): Place name to clip data to, as resolved
            by OpenStreetMap's Nominatim API (e.g. "Hennepin County, MN, USA").
            Defaults to None.
        boundary_file (Union[str,Path], optional): Geographic data file that can be read by
            GeoPandas (e.g. geojson, parquet, shp) that defines a geographic polygon area to clip
            to. Defaults to None.

    Returns: RoadwayNetwork clipped to the defined boundary.
    """
    trimmed_links_df, trimmed_nodes_df, trimmed_shapes_df = clip_roadway_to_dfs(
        network=network,
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
    )
    from .network import RoadwayNetwork  # noqa: PLC0415

    trimmed_net = RoadwayNetwork(
        links_df=trimmed_links_df,
        nodes_df=trimmed_nodes_df,
        _shapes_df=trimmed_shapes_df,
    )
    return trimmed_net

network_wrangler.roadway.clip.clip_roadway_to_dfs

clip_roadway_to_dfs(network, boundary_gdf=None, boundary_geocode=None, boundary_file=None)

Clips a RoadwayNetwork object to a boundary and returns the resulting GeoDataFrames.

Retains only the links within or crossing the boundary and all the nodes that those links connect to.

Parameters:

  • network (RoadwayNetwork) –

    RoadwayNetwork object to be clipped.

  • boundary_gdf (GeoDataFrame, default: None ) –

    GeoDataframe of one or more polygons which define the boundary to clip to. Defaults to None.

  • boundary_geocode (Union[str, dict], default: None ) –

    Place name to clip data to, as resolved by OpenStreetMap’s Nominatim API (e.g. “Hennepin County, MN, USA”). Defaults to None.

  • boundary_file (Union[str, Path], default: None ) –

    Geographic data file that can be read by GeoPandas (e.g. geojson, parquet, shp) that defines a geographic polygon area to clip to. Defaults to None.

Source code in network_wrangler/roadway/clip.py
def clip_roadway_to_dfs(
    network: RoadwayNetwork,
    boundary_gdf: gpd.GeoDataFrame = None,
    boundary_geocode: Optional[Union[str, dict]] = None,
    boundary_file: Optional[Union[str, Path]] = None,
) -> tuple:
    """Clips a RoadwayNetwork object to a boundary and returns the resulting GeoDataFrames.

    Retains only the links within or crossing the boundary and all the nodes that those links
    connect to.

    Args:
        network (RoadwayNetwork): RoadwayNetwork object to be clipped.
        boundary_gdf (gpd.GeoDataFrame, optional): GeoDataframe of one or more polygons which
            define the boundary to clip to. Defaults to None.
        boundary_geocode (Union[str,dict], optional): Place name to clip data to, as resolved
            by OpenStreetMap's Nominatim API (e.g. "Hennepin County, MN, USA").
            Defaults to None.
        boundary_file (Union[str,Path], optional): Geographic data file that can be read by
            GeoPandas (e.g. geojson, parquet, shp) that defines a geographic polygon area to clip
            to. Defaults to None.

    Returns: tuple of GeoDataFrames trimmed_links_df, trimmed_nodes_df, trimmed_shapes_df

    """
    boundary_gdf = get_bounding_polygon(
        boundary_gdf=boundary_gdf,
        boundary_geocode=boundary_geocode,
        boundary_file=boundary_file,
    )

    # make sure boundary_gdf.crs == LAT_LON_CRS
    if boundary_gdf.crs != LAT_LON_CRS:
        WranglerLogger.debug(f"Making boundary CRS consistent with network CRS: {LAT_LON_CRS}")
        boundary_gdf = boundary_gdf.to_crs(LAT_LON_CRS)
    # get the boundary as a single polygon
    boundary = boundary_gdf.geometry.union_all()
    # get the links that intersect the boundary
    WranglerLogger.debug("Finding roadway links that intersect boundary (spatial join).")
    filtered_links_df = network.links_df[network.links_df.geometry.intersects(boundary)]
    WranglerLogger.debug(f"filtered_links_df: \n{filtered_links_df.head()}")
    # get the nodes that the links connect to
    # WranglerLogger.debug("Finding roadway nodes that clipped links connect to.")
    filtered_node_ids = node_ids_in_links(filtered_links_df, network.nodes_df)
    filtered_nodes_df = network.nodes_df[network.nodes_df.index.isin(filtered_node_ids)]
    # WranglerLogger.debug(f"filtered_nodes_df:\n{filtered_nodes_df.head()}")
    # get shapes the links use
    WranglerLogger.debug("Finding roadway shapes that clipped links connect to.")
    filtered_shapes_df = network.shapes_df[
        network.shapes_df.index.isin(filtered_links_df["shape_id"])
    ]
    trimmed_links_df = copy.deepcopy(filtered_links_df)
    trimmed_nodes_df = copy.deepcopy(filtered_nodes_df)
    trimmed_shapes_df = copy.deepcopy(filtered_shapes_df)
    return trimmed_links_df, trimmed_nodes_df, trimmed_shapes_df
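
The clipping cascade above (links first, then the nodes and shapes those links reference) can be illustrated without GeoPandas. In this sketch an axis-aligned rectangle stands in for the boundary polygon and straight A→B segments stand in for link geometries; the library itself uses `geometry.intersects` against the unioned boundary. All names and data here are hypothetical.

```python
def segment_intersects_box(p0, p1, box):
    """Liang-Barsky test: does the segment p0 -> p1 touch the axis-aligned box?"""
    (x0, y0), (x1, y1) = p0, p1
    xmin, ymin, xmax, ymax = box
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return False  # parallel to this edge and entirely outside it
            continue
        t = q / p
        if p < 0:
            t0 = max(t0, t)  # segment enters the box at t
        else:
            t1 = min(t1, t)  # segment leaves the box at t
        if t0 > t1:
            return False
    return True


def clip_tables(links, nodes, shapes, box):
    """Keep links touching the box, then the nodes and shapes those links use."""
    kept_links = {
        link_id: link
        for link_id, link in links.items()
        if segment_intersects_box(nodes[link["A"]], nodes[link["B"]], box)
    }
    kept_node_ids = {n for link in kept_links.values() for n in (link["A"], link["B"])}
    kept_nodes = {n: xy for n, xy in nodes.items() if n in kept_node_ids}
    kept_shape_ids = {link["shape_id"] for link in kept_links.values()}
    kept_shapes = {s: geom for s, geom in shapes.items() if s in kept_shape_ids}
    return kept_links, kept_nodes, kept_shapes


nodes = {1: (0.5, 0.5), 2: (2.0, 0.5), 3: (-1.0, 0.5), 4: (5.0, 5.0), 5: (6.0, 6.0)}
links = {
    10: {"A": 1, "B": 2, "shape_id": "s10"},  # starts inside the box
    11: {"A": 3, "B": 2, "shape_id": "s11"},  # crosses the box, both ends outside
    12: {"A": 4, "B": 5, "shape_id": "s12"},  # entirely outside
}
shapes = {"s10": "geom10", "s11": "geom11", "s12": "geom12"}

kept_links, kept_nodes, kept_shapes = clip_tables(links, nodes, shapes, (0.0, 0.0, 1.0, 1.0))
print(sorted(kept_links), sorted(kept_nodes), sorted(kept_shapes))
```

Nodes 2 and 3 lie outside the box but are retained because kept links connect to them, which matches the "all the nodes that those links connect to" behaviour described above.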

Functions to create a model roadway network from a roadway network.

network_wrangler.roadway.model_roadway.COPY_FROM_GP_TO_ML module-attribute

COPY_FROM_GP_TO_ML = ['ref', 'roadway', 'access', 'distance', 'bike_access', 'drive_access', 'walk_access', 'bus_only', 'rail_only']

List of attributes to copy from a general purpose lane to access and egress dummy links.

network_wrangler.roadway.model_roadway.COPY_TO_ACCESS_EGRESS module-attribute

COPY_TO_ACCESS_EGRESS = ['ref', 'ML_access', 'ML_drive_access', 'ML_bus_only', 'ML_rail_only']

List of attributes that must be provided in managed lanes.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork

Roadway Network Object compatible with travel modeling.

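Compatibility includes: (1) separation of managed lane facilities and their connection to general purpose lanes using dummy links.

Attributes:

  • net –

    associated RoadwayNetwork object

  • links_df –

    dataframe of model-compatible links

  • nodes_df –

    dataframe of model-compatible nodes

  • ml_link_id_lookup –

    lookup from general purpose link ids to link ids of their managed lane counterparts.

  • ml_node_id_lookup –

    lookup from general purpose node ids to node ids of their managed lane counterparts.

  • _net_hash –

    hash of the input links and nodes in order to detect changes.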
Source code in network_wrangler/roadway/model_roadway.py
class ModelRoadwayNetwork:
    """Roadway Network Object compatible with travel modeling.

    Compatibility includes:
    (1) separation of managed lane facilities and their connection to general purpose lanes
        using dummy links.

    Attributes:
        net: associated RoadwayNetwork object
        links_df: dataframe of model-compatible links
        nodes_df: dataframe of model-compatible nodes
        ml_link_id_lookup: lookup from general purpose link ids to link ids of their
            managed lane counterparts.
        ml_node_id_lookup: lookup from general purpose node ids to node ids of their
            managed lane counterparts.
        _net_hash: hash of the input links and nodes in order to detect changes.

    """

    def __init__(
        self,
        net,
        ml_link_id_lookup: Optional[dict[int, int]] = None,
        ml_node_id_lookup: Optional[dict[int, int]] = None,
    ):
        """Constructor for ModelRoadwayNetwork.

        NOTE: in order to be associated with the RoadwayNetwork, this should be called from
        RoadwayNetwork.model_net which will lazily construct it.

        Args:
            net: Associated roadway network.
            ml_link_id_lookup (dict[int, int]): lookup from general purpose link ids to link ids
                of their managed lane counterparts. Defaults to None which will generate a new one
                using the provided method.
            ml_node_id_lookup (dict[int, int]): lookup from general purpose node ids to node ids
                of their managed lane counterparts. Defaults to None which will generate a new one
                using the provided method.
        """
        self.net = net

        if ml_link_id_lookup is None:
            if self.net.config.IDS.ML_LINK_ID_METHOD == "range":
                self.ml_link_id_lookup = _generate_ml_link_id_lookup_from_range(
                    self.net.links_df, self.net.config.IDS.ML_LINK_ID_RANGE
                )
            elif self.net.config.IDS.ML_LINK_ID_METHOD == "scalar":
                self.ml_link_id_lookup = _generate_ml_link_id_lookup_from_scalar(
                    self.net.links_df, self.net.config.IDS.ML_LINK_ID_SCALAR
                )
            else:
                msg = "ml_link_id_method must be 'range' or 'scalar'."
                WranglerLogger.error(msg + f" Got {self.net.config.IDS.ML_LINK_ID_METHOD}")
                raise ValueError(msg)
        else:
            self.ml_link_id_lookup = ml_link_id_lookup

        if ml_node_id_lookup is None:
            if self.net.config.IDS.ML_NODE_ID_METHOD == "range":
                self.ml_node_id_lookup = _generate_ml_node_id_from_range(
                    self.net.nodes_df, self.net.links_df, self.net.config.IDS.ML_NODE_ID_RANGE
                )
            elif self.net.config.IDS.ML_NODE_ID_METHOD == "scalar":
                self.ml_node_id_lookup = _generate_ml_node_id_lookup_from_scalar(
                    self.net.nodes_df, self.net.links_df, self.net.config.IDS.ML_NODE_ID_SCALAR
                )
            else:
                msg = "ml_node_id_method must be 'range' or 'scalar'."
                WranglerLogger.error(msg + f" Got {self.net.config.IDS.ML_NODE_ID_METHOD}")
                raise ValueError(msg)
        else:
            self.ml_node_id_lookup = ml_node_id_lookup

        if len(self.net.links_df.of_type.managed) == 0:
            self.links_df, self.nodes_df = self.net.links_df, self.net.nodes_df
        else:
            self.links_df, self.nodes_df = model_links_nodes_from_net(
                self.net, self.ml_link_id_lookup, self.ml_node_id_lookup
            )
        self._net_hash = copy.deepcopy(net.network_hash)

    @property
    def ml_config(self) -> dict:
        """Convenience property for managed lane configuration."""
        return self.net.config.MODEL_ROADWAY

    @property
    def shapes_df(self) -> DataFrame[RoadShapesTable]:
        """Shapes dataframe."""
        return self.net.shapes_df

    @property
    def ml_links_df(self) -> pd.DataFrame:
        """Managed lanes links."""
        return self.links_df.of_type.managed

    @property
    def gp_links_df(self) -> pd.DataFrame:
        """GP lanes on links that have managed lanes next to them."""
        return self.links_df.of_type.parallel_general_purpose

    @property
    def dummy_links_df(self) -> pd.DataFrame:
        """Dummy links that connect managed lanes to general purpose lanes."""
        return self.links_df.of_type.dummy

    @property
    def summary(self) -> dict:
        """Quick summary dictionary of number of links, nodes."""
        d = {"links": len(self.links_df), "nodes": len(self.nodes_df)}
        return d

    @property
    def compare_links_df(self) -> pd.DataFrame:
        """Comparison of the original network and the model network."""
        return compare_links([self.net.links_df, self.links_df], names=["Roadway", "ModelRoadway"])

    @property
    def compare_net_df(self) -> pd.DataFrame:
        """Comparison of the original network and the model network."""
        return compare_networks([self.net, self], names=["Roadway", "ModelRoadway"])

    def write(
        self,
        out_dir: Path = Path(),
        convert_complex_link_properties_to_single_field: bool = False,
        prefix: str = "",
        file_format: RoadwayFileTypes = "geojson",
        overwrite: bool = True,
        true_shape: bool = False,
    ) -> None:
        """Writes a network in the roadway network standard.

        Args:
            out_dir: the path where the output will be saved.
            convert_complex_link_properties_to_single_field: if True, will convert complex
                properties to a single column consistent with v0 format. This format is NOT
                valid with parquet and many other software tools. Defaults to False.
            prefix: the name prefix of the roadway files that will be generated.
            file_format: the format of the output files. Defaults to "geojson".
            overwrite: if True, will overwrite the files if they already exist. Defaults to True.
            true_shape: if True, will write the true shape of the links as found from shapes.
                Defaults to False.
        """
        write_roadway(
            self,
            out_dir=out_dir,
            convert_complex_link_properties_to_single_field=convert_complex_link_properties_to_single_field,
            prefix=prefix,
            file_format=file_format,
            overwrite=overwrite,
            true_shape=true_shape,
        )

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.compare_links_df property
compare_links_df

Comparison of the original network and the model network.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.compare_net_df property
compare_net_df

Comparison of the original network and the model network.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.dummy_links_df property
dummy_links_df

Dummy links that connect managed lanes to general purpose lanes.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.gp_links_df property
gp_links_df

GP lanes on links that have managed lanes next to them.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.ml_config property
ml_config

Convenience property for managed lane configuration.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.ml_links_df property
ml_links_df

Managed lanes links.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.shapes_df property
shapes_df

Shapes dataframe.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.summary property
summary

Quick summary dictionary of number of links, nodes.

network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.__init__
__init__(net, ml_link_id_lookup=None, ml_node_id_lookup=None)

Constructor for ModelRoadwayNetwork.

NOTE: in order to be associated with the RoadwayNetwork, this should be called from RoadwayNetwork.model_net which will lazily construct it.

Parameters:

  • net

    Associated roadway network.

  • ml_link_id_lookup (dict[int, int], default: None ) –

    lookup from general purpose link ids to link ids of their managed lane counterparts. Defaults to None, which generates a new lookup using the method set in net.config.IDS.ML_LINK_ID_METHOD (“range” or “scalar”).

  • ml_node_id_lookup (dict[int, int], default: None ) –

    lookup from general purpose node ids to node ids of their managed lane counterparts. Defaults to None, which generates a new lookup using the method set in net.config.IDS.ML_NODE_ID_METHOD (“range” or “scalar”).

Source code in network_wrangler/roadway/model_roadway.py
def __init__(
    self,
    net,
    ml_link_id_lookup: Optional[dict[int, int]] = None,
    ml_node_id_lookup: Optional[dict[int, int]] = None,
):
    """Constructor for ModelRoadwayNetwork.

    NOTE: in order to be associated with the RoadwayNetwork, this should be called from
    RoadwayNetwork.model_net which will lazily construct it.

    Args:
        net: Associated roadway network.
        ml_link_id_lookup (dict[int, int]): lookup from general purpose link ids to link ids
            of their managed lane counterparts. Defaults to None which will generate a new one
            using the provided method.
        ml_node_id_lookup (dict[int, int]): lookup from general purpose node ids to node ids
            of their managed lane counterparts. Defaults to None which will generate a new one
            using the provided method.
    """
    self.net = net

    if ml_link_id_lookup is None:
        if self.net.config.IDS.ML_LINK_ID_METHOD == "range":
            self.ml_link_id_lookup = _generate_ml_link_id_lookup_from_range(
                self.net.links_df, self.net.config.IDS.ML_LINK_ID_RANGE
            )
        elif self.net.config.IDS.ML_LINK_ID_METHOD == "scalar":
            self.ml_link_id_lookup = _generate_ml_link_id_lookup_from_scalar(
                self.net.links_df, self.net.config.IDS.ML_LINK_ID_SCALAR
            )
        else:
            msg = "ml_link_id_method must be 'range' or 'scalar'."
            WranglerLogger.error(msg + f" Got {self.net.config.IDS.ML_LINK_ID_METHOD}")
            raise ValueError(msg)
    else:
        self.ml_link_id_lookup = ml_link_id_lookup

    if ml_node_id_lookup is None:
        if self.net.config.IDS.ML_NODE_ID_METHOD == "range":
            self.ml_node_id_lookup = _generate_ml_node_id_from_range(
                self.net.nodes_df, self.net.links_df, self.net.config.IDS.ML_NODE_ID_RANGE
            )
        elif self.net.config.IDS.ML_NODE_ID_METHOD == "scalar":
            self.ml_node_id_lookup = _generate_ml_node_id_lookup_from_scalar(
                self.net.nodes_df, self.net.links_df, self.net.config.IDS.ML_NODE_ID_SCALAR
            )
        else:
            msg = "ml_node_id_method must be 'range' or 'scalar'."
            WranglerLogger.error(msg + f" Got {self.net.config.IDS.ML_NODE_ID_METHOD}")
            raise ValueError(msg)
    else:
        self.ml_node_id_lookup = ml_node_id_lookup

    if len(self.net.links_df.of_type.managed) == 0:
        self.links_df, self.nodes_df = self.net.links_df, self.net.nodes_df
    else:
        self.links_df, self.nodes_df = model_links_nodes_from_net(
            self.net, self.ml_link_id_lookup, self.ml_node_id_lookup
        )
    self._net_hash = copy.deepcopy(net.network_hash)
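
The two ID methods selected in the constructor can be pictured roughly as follows. This is a minimal sketch, not the library's implementation: the helper names `lookup_from_scalar` and `lookup_from_range` are hypothetical, and the assumed behavior (the scalar method adds a fixed offset to each general purpose id; the range method hands out unused ids from an inclusive range in ascending order) is inferred only from the configuration names.

```python
def lookup_from_scalar(gp_ids: list[int], scalar: int) -> dict[int, int]:
    """Map each general purpose id to gp_id + scalar (assumed behavior)."""
    return {gp_id: gp_id + scalar for gp_id in gp_ids}


def lookup_from_range(
    gp_ids: list[int], used_ids: set[int], id_range: tuple[int, int]
) -> dict[int, int]:
    """Assign unused ids from an inclusive range in ascending order (assumed behavior)."""
    low, high = id_range
    available = [i for i in range(low, high + 1) if i not in used_ids]
    if len(available) < len(gp_ids):
        msg = f"Range {id_range} has {len(available)} free ids; need {len(gp_ids)}."
        raise ValueError(msg)
    return dict(zip(gp_ids, available))


# Hypothetical general purpose link ids that have managed lanes.
gp_link_ids = [10, 11, 12]
scalar_lookup = lookup_from_scalar(gp_link_ids, scalar=1000)
range_lookup = lookup_from_range(
    gp_link_ids, used_ids={10, 11, 12, 501}, id_range=(500, 510)
)
print(scalar_lookup)
print(range_lookup)
```

Under these assumptions, a scalar lookup keeps the managed lane id recognizably tied to its general purpose id, while a range lookup keeps all managed lane ids inside a reserved block and has to skip ids that are already in use.
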
network_wrangler.roadway.model_roadway.ModelRoadwayNetwork.write
write(out_dir=Path(), convert_complex_link_properties_to_single_field=False, prefix='', file_format='geojson', overwrite=True, true_shape=False)

Writes a network in the roadway network standard.

Parameters:

  • out_dir (Path, default: Path() ) –

    the path where the output will be saved.

  • convert_complex_link_properties_to_single_field (bool, default: False ) –

    if True, will convert complex properties to a single column consistent with v0 format. This format is NOT valid with parquet and many other software packages. Defaults to False.

  • prefix (str, default: '' ) –

    the name prefix of the roadway files that will be generated.

  • file_format (RoadwayFileTypes, default: 'geojson' ) –

    the format of the output files. Defaults to “geojson”.

  • overwrite (bool, default: True ) –

    if True, will overwrite the files if they already exist. Defaults to True.

  • true_shape (bool, default: False ) –

    if True, will write the true shape of the links as found from shapes. Defaults to False.

Source code in network_wrangler/roadway/model_roadway.py
def write(
    self,
    out_dir: Path = Path(),
    convert_complex_link_properties_to_single_field: bool = False,
    prefix: str = "",
    file_format: RoadwayFileTypes = "geojson",
    overwrite: bool = True,
    true_shape: bool = False,
) -> None:
    """Writes a network in the roadway network standard.

    Args:
        out_dir: the path where the output will be saved.
        convert_complex_link_properties_to_single_field: if True, will convert complex properties to a
            single column consistent with v0 format.  This format is NOT valid
            with parquet and many other software packages. Defaults to False.
        prefix: the name prefix of the roadway files that will be generated.
        file_format: the format of the output files. Defaults to "geojson".
        overwrite: if True, will overwrite the files if they already exist. Defaults to True.
        true_shape: if True, will write the true shape of the links as found from shapes.
            Defaults to False.
    """
    write_roadway(
        self,
        out_dir=out_dir,
        convert_complex_link_properties_to_single_field=convert_complex_link_properties_to_single_field,
        prefix=prefix,
        file_format=file_format,
        overwrite=overwrite,
        true_shape=true_shape,
    )

network_wrangler.roadway.model_roadway.model_links_nodes_from_net

model_links_nodes_from_net(net, ml_link_id_lookup, ml_node_id_lookup)

Create a roadway network with managed lanes links separated out.

Add new parallel managed lane links, access/egress links, and add shapes corresponding to the new links.

Parameters:

  • net (RoadwayNetwork) –

    RoadwayNetwork instance

  • ml_link_id_lookup (dict[int, int]) –

    lookup table for managed lane link ids to their general purpose lane counterparts.

  • ml_node_id_lookup (dict[int, int]) –

    lookup table for managed lane node ids to their general purpose lane counterparts.

Source code in network_wrangler/roadway/model_roadway.py
def model_links_nodes_from_net(
    net: RoadwayNetwork, ml_link_id_lookup: dict[int, int], ml_node_id_lookup: dict[int, int]
) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Create a roadway network with managed lanes links separated out.

    Add new parallel managed lane links, access/egress links,
    and add shapes corresponding to the new links

    Args:
        net: RoadwayNetwork instance
        ml_link_id_lookup: lookup table for managed lane link ids to their general purpose lane
            counterparts.
        ml_node_id_lookup: lookup table for managed lane node ids to their general purpose lane
            counterparts.

    returns: tuple of links and nodes dataframes with managed lanes separated out
    """
    WranglerLogger.info("Separating managed lane links from general purpose links")

    copy_cols_gp_ml = list(
        set(COPY_FROM_GP_TO_ML + net.config.MODEL_ROADWAY.ADDITIONAL_COPY_FROM_GP_TO_ML)
    )
    _m_links_df = _separate_ml_links(
        net.links_df,
        ml_link_id_lookup,
        ml_node_id_lookup,
        offset_meters=net.config.MODEL_ROADWAY.ML_OFFSET_METERS,
        copy_from_gp_to_ml=copy_cols_gp_ml,
    )
    _m_nodes_df = _create_ml_nodes_from_links(_m_links_df, ml_node_id_lookup)
    m_nodes_df = concat_with_attr([net.nodes_df, _m_nodes_df])

    copy_ae_fields = list(
        set(COPY_TO_ACCESS_EGRESS + net.config.MODEL_ROADWAY.ADDITIONAL_COPY_TO_ACCESS_EGRESS)
    )
    _access_egress_links_df = _create_dummy_connector_links(
        net.links_df,
        m_nodes_df,
        ml_link_id_lookup,
        ml_node_id_lookup,
        copy_fields=copy_ae_fields,
    )
    m_links_df = concat_with_attr([_m_links_df, _access_egress_links_df])
    return m_links_df, m_nodes_df

network_wrangler.roadway.model_roadway.strip_ML_from_prop_list

strip_ML_from_prop_list(property_list)

Strips ‘ML_’ from property list but keeps necessary access/egress point cols.

Source code in network_wrangler/roadway/model_roadway.py
def strip_ML_from_prop_list(property_list: list[str]) -> list[str]:
    """Strips 'ML_' from property list but keeps necessary access/egress point cols."""
    keep_same = ["ML_access_point", "ML_egress_point"]
    pl = [p.removeprefix("ML_") if p not in keep_same else p for p in property_list]
    pl = [p.replace("_ML_", "_") if p not in keep_same else p for p in pl]
    return pl
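
A short usage sketch may make the two-step stripping easier to see. The function body is copied from the listing above so that the example runs on its own; the property names in `props` are hypothetical. Note that `str.removeprefix` requires Python 3.9 or later.

```python
def strip_ML_from_prop_list(property_list: list[str]) -> list[str]:
    """Copy of the function shown above, repeated so the example is self-contained."""
    keep_same = ["ML_access_point", "ML_egress_point"]
    # Step 1: drop a leading "ML_" (e.g. "ML_lanes" becomes "lanes").
    pl = [p.removeprefix("ML_") if p not in keep_same else p for p in property_list]
    # Step 2: collapse an embedded "_ML_" (e.g. "sc_ML_price" becomes "sc_price").
    pl = [p.replace("_ML_", "_") if p not in keep_same else p for p in pl]
    return pl


# Hypothetical property names, chosen to exercise each branch.
props = ["ML_lanes", "sc_ML_price", "ML_access_point", "name"]
stripped = strip_ML_from_prop_list(props)
print(stripped)
```

The access and egress point columns pass through unchanged because they appear in `keep_same`; properties without any `ML_` marker are also left as they are.
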

Utility functions for RoadwayNetwork and ModelRoadwayNetwork classes.

network_wrangler.roadway.utils.compare_links

compare_links(links, names=None)

Compare the summary of links in a list of dataframes.

Parameters:

  • links (list[DataFrame]) –

    list of dataframes

  • names (Optional[list[str]], default: None ) –

    list of names for the dataframes

Source code in network_wrangler/roadway/utils.py
def compare_links(
    links: list[pd.DataFrame],
    names: Optional[list[str]] = None,
) -> pd.DataFrame:
    """Compare the summary of links in a list of dataframes.

    Args:
        links: list of dataframes
        names: list of names for the dataframes
    """
    if names is None:
        names = ["links" + str(i) for i in range(1, len(links) + 1)]
    df = pd.DataFrame({name: link.of_type.summary for name, link in zip(names, links)})
    return df

network_wrangler.roadway.utils.compare_networks

compare_networks(nets, names=None)

Compare the summary of networks in a list of networks.

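Parameters:

  • nets (list[Union[RoadwayNetwork, ModelRoadwayNetwork]]) –

    list of networks

  • names (Optional[list[str]], default: None ) –

    list of names for the networks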

Source code in network_wrangler/roadway/utils.py
def compare_networks(
    nets: list[Union[RoadwayNetwork, ModelRoadwayNetwork]],
    names: Optional[list[str]] = None,
) -> pd.DataFrame:
    """Compare the summary of networks in a list of networks.

    Args:
        nets: list of networks
        names: list of names for the networks
    """
    if names is None:
        names = ["net" + str(i) for i in range(1, len(nets) + 1)]
    df = pd.DataFrame({name: net.summary for name, net in zip(names, nets)})
    return df
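
The default naming and the column-per-network layout can be sketched without pandas. In this illustration `FakeNet` is a hypothetical stand-in that exposes only a `summary` dictionary, the summary keys are made up, and `compare_summaries` returns a plain dict of columns rather than the DataFrame that `compare_networks` builds.

```python
from dataclasses import dataclass


@dataclass
class FakeNet:
    """Hypothetical stand-in for a network; only `summary` is modeled."""

    summary: dict


def compare_summaries(nets, names=None):
    """Mirror the naming logic of compare_networks, returning {name: summary}."""
    if names is None:
        # Default names are net1, net2, ... in the order given.
        names = ["net" + str(i) for i in range(1, len(nets) + 1)]
    return {name: net.summary for name, net in zip(names, nets)}


base = FakeNet({"links": 100, "nodes": 80})
build = FakeNet({"links": 104, "nodes": 82})
table = compare_summaries([base, build])
print(table)
```

Because the names and networks are paired with `zip`, passing fewer names than networks silently drops the unnamed networks, so it is safest to supply one name per network or none at all.
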

network_wrangler.roadway.utils.create_unique_shape_id

create_unique_shape_id(line_string)

A unique hash id using the coordinates of the geometry using first and last locations.

Parameters:

  • line_string (LineString) –

    Line Geometry as a LineString

Returns:

  • string

Source code in network_wrangler/roadway/utils.py
def create_unique_shape_id(line_string: LineString):
    """A unique hash id using the coordinates of the geometry using first and last locations.

    Args:
        line_string: Line Geometry as a LineString

    Returns: string
    """
    x1, y1 = line_string.coords[0]  # first coordinate (A node)
    x2, y2 = line_string.coords[-1]  # last coordinate (B node)

    message = f"Geometry {x1} {y1} {x2} {y2}"
    unhashed = message.encode("utf-8")
    hash = hashlib.md5(unhashed).hexdigest()

    return hash
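
Because only the first and last coordinates are hashed, two geometries with the same endpoints receive the same id regardless of their intermediate points. The sketch below illustrates this using plain coordinate lists in place of a shapely LineString; `shape_id_from_coords` is a hypothetical helper that repeats the hashing steps shown above.

```python
import hashlib


def shape_id_from_coords(coords: list[tuple[float, float]]) -> str:
    """Hash only the first and last coordinate, as create_unique_shape_id does."""
    x1, y1 = coords[0]  # first coordinate (A node)
    x2, y2 = coords[-1]  # last coordinate (B node)
    message = f"Geometry {x1} {y1} {x2} {y2}"
    return hashlib.md5(message.encode("utf-8")).hexdigest()


straight = [(0.0, 0.0), (1.0, 1.0)]
curved = [(0.0, 0.0), (0.2, 0.9), (1.0, 1.0)]  # same endpoints, extra vertex
reverse = list(reversed(straight))  # same line, opposite direction

id_straight = shape_id_from_coords(straight)
id_curved = shape_id_from_coords(curved)
id_reverse = shape_id_from_coords(reverse)

print(id_straight == id_curved)  # endpoints match, so the ids match
print(id_straight == id_reverse)  # direction changes the message, so the ids differ
```

In practice this means the id distinguishes direction but not curvature, which is worth keeping in mind when two distinct shapes connect the same pair of nodes.
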

network_wrangler.roadway.utils.diff_nets

diff_nets(net1, net2)

Diff two RoadwayNetworks and return True if they are different.

Ignore locationReferences as they are not used in the network.

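Parameters:

  • net1 (RoadwayNetwork) –

    First network to compare

  • net2 (RoadwayNetwork) –

    Second network to compare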

Source code in network_wrangler/roadway/utils.py
def diff_nets(net1: RoadwayNetwork, net2: RoadwayNetwork) -> bool:
    """Diff two RoadwayNetworks and return True if they are different.

    Ignore locationReferences as they are not used in the network.

    Args:
        net1 (RoadwayNetwork): First network to compare
        net2 (RoadwayNetwork): Second network to compare
    """
    # Need to ignore b/c there are tiny differences in how this complex time is serialized and
    # in order to evaluate if they are equivalent you need to do an element by element comparison
    # which takes forever.
    IGNORE_COLS = ["locationReferences"]
    WranglerLogger.debug("Comparing networks.")
    WranglerLogger.info("----Comparing links----")
    diff_links = diff_dfs(net1.links_df, net2.links_df, ignore=IGNORE_COLS)
    WranglerLogger.info("----Comparing nodes----")
    diff_nodes = diff_dfs(net1.nodes_df, net2.nodes_df, ignore=IGNORE_COLS)
    WranglerLogger.info("----Comparing shapes----")
    if net1.shapes_df is None or net1.shapes_df.empty:
        diff_shapes = False
    else:
        diff_shapes = diff_dfs(net1.shapes_df, net2.shapes_df, ignore=IGNORE_COLS)
    diff = any([diff_links, diff_nodes, diff_shapes])
    if diff:
        WranglerLogger.error("!!! Differences in networks.")
    else:
        WranglerLogger.info("Networks same for properties in common")
    return diff

network_wrangler.roadway.utils.set_df_index_to_pk

set_df_index_to_pk(df)

Sets the index of the dataframe to be a copy of the primary key.

Parameters:

  • df (DataFrame) –

    data frame to set the index of

Source code in network_wrangler/roadway/utils.py
def set_df_index_to_pk(df: pd.DataFrame) -> pd.DataFrame:
    """Sets the index of the dataframe to be a copy of the primary key.

    Args:
        df (pd.DataFrame): data frame to set the index of
    """
    if df.index.name != df.attrs["idx_col"]:
        df[df.attrs["idx_col"]] = df[df.attrs["primary_key"]]
        df = df.set_index(df.attrs["idx_col"])
    return df

Validates a roadway network to the wrangler data model specifications.

network_wrangler.roadway.validate.validate_roadway_files

validate_roadway_files(links_file, nodes_file, shapes_file=None, strict=False, output_dir=Path())

Validates the roadway network files to the wrangler data model specifications.

Parameters:

  • links_file (str) –

    The path to the links file.

  • nodes_file (str) –

    The path to the nodes file.

  • shapes_file (str, default: None ) –

    The path to the shapes file.

  • strict (bool, default: False ) –

    If True, will validate the roadway network strictly without parsing and filling in data.

  • output_dir (str, default: Path() ) –

    The output directory for the validation report. Defaults to “.”.

Source code in network_wrangler/roadway/validate.py
def validate_roadway_files(
    links_file: Path,
    nodes_file: Path,
    shapes_file: Optional[Path] = None,
    strict: bool = False,
    output_dir: Path = Path(),
):
    """Validates the roadway network files to the wrangler data model specifications.

    Args:
        links_file (str): The path to the links file.
        nodes_file (str): The path to the nodes file.
        shapes_file (str): The path to the shapes file.
        strict (bool): If True, will validate the roadway network strictly without
            parsing and filling in data.
        output_dir (str): The output directory for the validation report. Defaults to ".".
    """
    valid = {"net": True, "links": True, "nodes": True}

    nodes_df = read_table(nodes_file)
    valid["nodes"] = validate_nodes_df(
        nodes_df, strict=strict, errors_filename=Path(output_dir) / "node_errors.csv"
    )

    links_df = read_table(links_file)
    valid["links"] = validate_links_df(
        links_df,
        nodes_df=nodes_df,
        strict=strict,
        errors_filename=Path(output_dir) / "link_errors.csv",
    )

    if shapes_file:
        valid["shapes"] = True
        shapes_df = read_table(shapes_file)
        valid["shapes"] = validate_shapes_df(
            shapes_df, strict=strict, errors_filename=Path(output_dir) / "shape_errors.csv"
        )

    try:
        RoadwayNetwork(links_df=links_df, nodes_df=nodes_df, _shapes_df=shapes_df)
    except Exception as e:
        WranglerLogger.error(f"!!! [Network invalid] - Failed Loading to object\n{e}")
        valid["net"] = False

network_wrangler.roadway.validate.validate_roadway_in_dir

validate_roadway_in_dir(directory, file_format='geojson', strict=False, output_dir=Path())

Validates a roadway network in a directory to the wrangler data model specifications.

Parameters:

  • directory (str) –

    The roadway network file directory.

  • file_format (str, default: 'geojson' ) –

    The format of the roadway network files.

  • strict (bool, default: False ) –

    If True, will validate the roadway network strictly without parsing and filling in data.

  • output_dir (str, default: Path() ) –

    The output directory for the validation report. Defaults to “.”.

Source code in network_wrangler/roadway/validate.py
def validate_roadway_in_dir(
    directory: Path,
    file_format: RoadwayFileTypes = "geojson",
    strict: bool = False,
    output_dir: Path = Path(),
):
    """Validates a roadway network in a directory to the wrangler data model specifications.

    Args:
        directory (str): The roadway network file directory.
        file_format (str): The format of the roadway network files.
        strict (bool): If True, will validate the roadway network strictly without
            parsing and filling in data.
        output_dir (str): The output directory for the validation report. Defaults to ".".
    """
    links_file, nodes_file, shapes_file = id_roadway_file_paths_in_dir(directory, file_format)
    validate_roadway_files(
        links_file, nodes_file, shapes_file, strict=strict, output_dir=output_dir
    )

Segment class and related functions for working with segments of a RoadwayNetwork.

A segment is a contiguous length of RoadwayNetwork defined by start/end nodes + link selections.

Segments are defined by a selection dictionary and then searched for on the network using a shortest path graph search.

Usage:

selection_dict = {
    "links": {"name": ["6th", "Sixth", "sixth"]},
    "from": {"osm_node_id": "187899923"},
    "to": {"osm_node_id": "187865924"},
}

# selection: a RoadwayLinkSelection of type `segment` built from selection_dict
segment = Segment(net, selection)
segment.segment_links_df
segment.segment_nodes

network_wrangler.roadway.segment.DEFAULT_MAX_SEARCH_BREADTH module-attribute

DEFAULT_MAX_SEARCH_BREADTH = 10

Maximum number of nodes to search for in connected_path_search.

network_wrangler.roadway.segment.DEFAULT_SUBNET_SP_WEIGHT_FACTOR module-attribute

DEFAULT_SUBNET_SP_WEIGHT_FACTOR = 100

Factor to multiply sp_weight_col by to use for weights in shortest path.

network_wrangler.roadway.segment.Segment

A contiguous length of RoadwayNetwork defined by start/end nodes + link selections.

Segments are defined by a selection dictionary and then searched for on the network using a shortest path graph search.

Usage:

selection_dict = {
    "links": {"name":['6th','Sixth','sixth']},
    "from": {"osm_node_id": '187899923'},
    "to": {"osm_node_id": '187865924'}
}

net = RoadwayNetwork(...)

# selection: a RoadwayLinkSelection of type `segment` built from selection_dict
segment = Segment(net=net, selection=selection)

# lazily evaluated dataframe of links in segment (if found) from segment.net
segment.segment_links_df

# lazily evaluated list of nodes primary keys that are in segment (if found)
segment.segment_nodes
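
Attributes:

  • net (RoadwayNetwork) –

    Associated RoadwayNetwork object

  • selection (RoadwayLinkSelection) –

    RoadwayLinkSelection

  • from_node_id (int) –

    value of the primary key (usually model_node_id) for segment start node

  • to_node_id (int) –

    value of the primary key (usually model_node_id) for segment end node

  • subnet (Subnet) –

    Subnet object (and associated graph) on which to do shortest path search

  • segment_nodes (list[int]) –

    list of primary keys of nodes within the selected segment. Will be lazily evaluated as the result of connected_path_search().

  • segment_nodes_df (DataFrame[RoadNodesTable]) –

    dataframe selection from net.nodes_df for segment_nodes. Lazily evaluated based on segment_nodes.

  • segment_links (list[int]) –

    list of primary keys of links which connect together segment_nodes. Lazily evaluated based on segment_links_df.

  • segment_links_df (DataFrame[RoadLinksTable]) –

    dataframe selection from net.links_df for segment_links. Lazily evaluated based on segment_nodes.

  • max_search_breadth (int) –

    maximum number of nodes to search for in connected_path_search. Defaults to DEFAULT_MAX_SEARCH_BREADTH which is 10.
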
Source code in network_wrangler/roadway/segment.py
class Segment:
    """A contiguous length of RoadwayNetwork defined by start/end nodes + link selections.

    Segments are defined by a selection dictionary and then searched for on the network using
    a shortest path graph search.

    Usage:

    ```
    selection_dict = {
        "links": {"name":['6th','Sixth','sixth']},
        "from": {"osm_node_id": '187899923'},
        "to": {"osm_node_id": '187865924'}
    }

    net = RoadwayNetwork(...)

    # selection: a RoadwayLinkSelection of type `segment` built from selection_dict
    segment = Segment(net=net, selection=selection)

    # lazily evaluated dataframe of links in segment (if found) from segment.net
    segment.segment_links_df

    # lazily evaluated list of nodes primary keys that are in segment (if found)
    segment.segment_nodes
    ```

    attr:
        net: Associated RoadwayNetwork object
        selection: RoadwayLinkSelection
        from_node_id: value of the primary key (usually model_node_id) for segment start node
        to_node_id: value of the primary key (usually model_node_id) for segment end node
        subnet: Subnet object (and associated graph) on which to do shortest path search
        segment_nodes: list of primary keys of nodes within the selected segment. Will be lazily
            evaluated as the result of connected_path_search().
        segment_nodes_df: dataframe selection from net.nodes_df for segment_nodes. Lazily evaluated
            based on segment_nodes.
        segment_links: list of primary keys of links which connect together segment_nodes. Lazily
            evaluated based on segment_links_df.
        segment_links_df: dataframe selection from net.links_df for segment_links. Lazily
            evaluated based on segment_nodes.
        max_search_breadth: maximum number of nodes to search for in connected_path_search.
            Defaults to DEFAULT_MAX_SEARCH_BREADTH which is 10.
    """

    def __init__(
        self,
        net: RoadwayNetwork,
        selection: RoadwayLinkSelection,
        max_search_breadth: int = DEFAULT_MAX_SEARCH_BREADTH,
    ):
        """Initialize a roadway segment object.

        Args:
            net (RoadwayNetwork): Associated RoadwayNetwork object
            selection (RoadwayLinkSelection): Selection of type `segment`.
            max_search_breadth (int, optional): Maximum number of nodes to search for in
                connected_path_search. Defaults to DEFAULT_MAX_SEARCH_BREADTH.
        """
        self.net = net
        self.max_search_breadth = max_search_breadth
        if selection.selection_method != "segment":
            msg = "Selection object passed to Segment must be of type `segment`"
            raise SegmentFormatError(msg)
        self.selection = selection

        # segment members are identified by storing nodes along a route
        self._segment_nodes: Union[list, None] = None

        # Initialize calculated, read-only attr.
        self._from_node_id: Union[int, None] = None
        self._to_node_id: Union[int, None] = None

        self.subnet = self._generate_subnet(self.segment_sel_dict)

        WranglerLogger.debug(f"Segment created: {self}")

    @property
    def modes(self) -> list[str]:
        """List of modes in the selection."""
        return self.selection.modes if self.selection.modes else DEFAULT_SEARCH_MODES

    @property
    def segment_sel_dict(self) -> dict:
        """Selection dictionary which only has keys related to initial segment link selection."""
        return self.selection.segment_selection_dict

    @property
    def from_node_id(self) -> int:
        """Find start node in selection dict and return its primary key."""
        if self._from_node_id is not None:
            return self._from_node_id
        self._from_node_id = self.get_node_id(self.selection.selection_data.from_)
        return self._from_node_id

    @property
    def to_node_id(self) -> int:
        """Find end node in selection dict and return its primary key."""
        if self._to_node_id is not None:
            return self._to_node_id
        self._to_node_id = self.get_node_id(self.selection.selection_data.to)
        return self._to_node_id

    @property
    def segment_nodes(self) -> list[int]:
        """Primary keys of nodes in segment."""
        if self._segment_nodes is None:
            WranglerLogger.debug("Segment not found yet so conducting connected_path_search.")
            self.connected_path_search()
        if self._segment_nodes is None:
            msg = "No segment nodes found."
            raise SegmentSelectionError(msg)
        return self._segment_nodes

    @property
    def segment_nodes_df(self) -> DataFrame[RoadNodesTable]:
        """Roadway network nodes filtered to nodes in segment."""
        return self.net.nodes_df.loc[self.segment_nodes]

    @property
    def segment_from_node_s(self) -> DataFrame[RoadNodesTable]:
        """Roadway network nodes filtered to segment start node."""
        return self.segment_nodes_df.loc[self.from_node_id]

    @property
    def segment_to_node_s(self) -> DataFrame[RoadNodesTable]:
        """Roadway network nodes filtered to segment end node."""
        return self.segment_nodes_df.loc[self.to_node_id]

    @property
    def segment_links_df(self) -> DataFrame[RoadLinksTable]:
        """Roadway network links filtered to segment links."""
        modal_links_df = self.net.links_df.mode_query(self.modes)
        segment_links_df = filter_links_to_path(modal_links_df, self.segment_nodes)
        return segment_links_df

    @property
    def segment_links(self) -> list[int]:
        """Primary keys of links in segment."""
        return self.segment_links_df.index.tolist()

    def get_node_id(self, node_selection_data: SelectNodeDict) -> int:
        """Get the primary key of a node based on the selection data."""
        node = self.get_node(node_selection_data)
        return node["model_node_id"].values[0]

    def get_node(self, node_selection_data: SelectNodeDict):
        """Get single node based on the selection data."""
        node_selection_dict = {
            k: v
            for k, v in node_selection_data.asdict.items()
            if k in self.selection.node_query_fields
        }
        node_df = self.net.nodes_df.isin_dict(node_selection_dict)
        if len(node_df) != 1:
            msg = f"Node selection not unique. Found {len(node_df)} nodes."
            raise SegmentSelectionError(msg)
        return node_df

    def connected_path_search(
        self,
    ) -> None:
        """Finds a path from from_node_id to to_node_id based on the weight col value/factor."""
        WranglerLogger.debug("Calculating shortest path from graph")
        _found = False
        _found = self._find_subnet_shortest_path()

        while not _found and self.subnet._i <= self.max_search_breadth:
            self.subnet._expand_subnet_breadth()
            _found = self._find_subnet_shortest_path()

        if not _found:
            msg = f"No connected path found from {self.from_node_id} to {self.to_node_id}"
            WranglerLogger.debug(msg)
            raise SegmentSelectionError(msg)

    def _generate_subnet(self, selection_dict: dict) -> Subnet:
        """Generate a subnet of the roadway network on which to search for connected segment.

        Args:
            selection_dict: selection dictionary to use for generating subnet
        """
        if not selection_dict:
            msg = "No selection provided to generate subnet from."
            raise SegmentFormatError(msg)

        WranglerLogger.debug(f"Creating subnet from dictionary: {selection_dict}")
        subnet = generate_subnet_from_link_selection_dict(
            self.net,
            modes=self.modes,
            link_selection_dict=selection_dict,
        )
        # expand network to find at least the origin and destination nodes
        subnet.expand_to_nodes(
            [self.from_node_id, self.to_node_id], max_search_breadth=self.max_search_breadth
        )
        return subnet

    def _find_subnet_shortest_path(
        self,
    ) -> bool:
        """Finds shortest path from from_node_id to to_node_id using self.subnet.graph.

        Sets self._segment_nodes to resulting path nodes

        Returns:
            bool: True if shortest path was found
        """
        WranglerLogger.debug(
            f"Calculating shortest path from {self.from_node_id} to {self.to_node_id} using\
        {self.subnet._sp_weight_col} as weight with a factor of {self.subnet._sp_weight_factor}"
        )

        self._segment_nodes = shortest_path(self.subnet.graph, self.from_node_id, self.to_node_id)

        if not self._segment_nodes:
            WranglerLogger.debug(f"No SP from {self.from_node_id} to {self.to_node_id} Found.")
            return False

        return True
network_wrangler.roadway.segment.Segment.from_node_id property
from_node_id

Find start node in selection dict and return its primary key.

network_wrangler.roadway.segment.Segment.modes property
modes

List of modes in the selection.

network_wrangler.roadway.segment.Segment.segment_from_node_s property
segment_from_node_s

Roadway network nodes filtered to segment start node.

network_wrangler.roadway.segment.Segment.segment_links property
segment_links

Primary keys of links in segment.

network_wrangler.roadway.segment.Segment.segment_links_df property
segment_links_df

Roadway network links filtered to segment links.

network_wrangler.roadway.segment.Segment.segment_nodes property
segment_nodes

Primary keys of nodes in segment.

network_wrangler.roadway.segment.Segment.segment_nodes_df property
segment_nodes_df

Roadway network nodes filtered to nodes in segment.

network_wrangler.roadway.segment.Segment.segment_sel_dict property
segment_sel_dict

Selection dictionary which only has keys related to initial segment link selection.

network_wrangler.roadway.segment.Segment.segment_to_node_s property
segment_to_node_s

Roadway network nodes filtered to segment end node.

network_wrangler.roadway.segment.Segment.to_node_id property
to_node_id

Find end node in selection dict and return its primary key.

network_wrangler.roadway.segment.Segment.__init__
__init__(net, selection, max_search_breadth=DEFAULT_MAX_SEARCH_BREADTH)

Initialize a roadway segment object.

Parameters:

  • net (RoadwayNetwork) –

    Associated RoadwayNetwork object

  • selection (RoadwayLinkSelection) –

    Selection of type segment.

  • max_search_breadth (int, default: DEFAULT_MAX_SEARCH_BREADTH ) –

    Maximum number of nodes to search for in connected_path_search. Defaults to DEFAULT_MAX_SEARCH_BREADTH.

Source code in network_wrangler/roadway/segment.py
def __init__(
    self,
    net: RoadwayNetwork,
    selection: RoadwayLinkSelection,
    max_search_breadth: int = DEFAULT_MAX_SEARCH_BREADTH,
):
    """Initialize a roadway segment object.

    Args:
        net (RoadwayNetwork): Associated RoadwayNetwork object
        selection (RoadwayLinkSelection): Selection of type `segment`.
        max_search_breadth (int, optional): Maximum number of nodes to search for in
            connected_path_search. Defaults to DEFAULT_MAX_SEARCH_BREADTH.
    """
    self.net = net
    self.max_search_breadth = max_search_breadth
    if selection.selection_method != "segment":
        msg = "Selection object passed to Segment must be of type `segment`"
        raise SegmentFormatError(msg)
    self.selection = selection

    # segment members are identified by storing nodes along a route
    self._segment_nodes: Union[list, None] = None

    # Initialize calculated, read-only attr.
    self._from_node_id: Union[int, None] = None
    self._to_node_id: Union[int, None] = None

    self.subnet = self._generate_subnet(self.segment_sel_dict)

    WranglerLogger.debug(f"Segment created: {self}")

network_wrangler.roadway.segment.Segment.connected_path_search
connected_path_search()

Finds a path from from_node_id to to_node_id based on the weight col value/factor.

Source code in network_wrangler/roadway/segment.py
def connected_path_search(
    self,
) -> None:
    """Finds a path from from_node_id to to_node_id based on the weight col value/factor."""
    WranglerLogger.debug("Calculating shortest path from graph")
    _found = False
    _found = self._find_subnet_shortest_path()

    while not _found and self.subnet._i <= self.max_search_breadth:
        self.subnet._expand_subnet_breadth()
        _found = self._find_subnet_shortest_path()

    if not _found:
        msg = f"No connected path found from {self.from_node_id} to {self.to_node_id}"
        WranglerLogger.debug(msg)
        raise SegmentSelectionError(msg)
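
The search-then-expand loop above can be illustrated on a toy graph. This is a sketch under simplifying assumptions rather than the library's algorithm: it uses an unweighted breadth-first search in place of the weighted shortest path, it assumes that expanding the subnet means adding every node one step away from the current subnet, and the function names `bfs_path` and `expanding_path_search` are hypothetical.

```python
from collections import deque


def bfs_path(adj, allowed, start, goal):
    """Shortest path by hop count, visiting only nodes in `allowed`."""
    if start not in allowed or goal not in allowed:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt in allowed and nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def expanding_path_search(adj, subnet_nodes, start, goal, max_breadth=10):
    """Search the subnet; on failure grow it by one ring of neighbors and retry."""
    allowed = set(subnet_nodes) | {start, goal}
    path = bfs_path(adj, allowed, start, goal)
    breadth = 0
    while path is None and breadth < max_breadth:
        # Add every node reachable in one step from the current subnet.
        allowed |= {nxt for node in allowed for nxt in adj.get(node, [])}
        breadth += 1
        path = bfs_path(adj, allowed, start, goal)
    if path is None:
        raise ValueError(f"No connected path found from {start} to {goal}")
    return path


# Directed toy network 1 -> 2 -> 3 -> 4; node 3 is missing from the initial selection.
adjacency = {1: [2], 2: [3], 3: [4], 4: []}
found = expanding_path_search(adjacency, subnet_nodes={1, 2, 4}, start=1, goal=4)
print(found)
```

The first search fails because node 3 is outside the selected subnet; one expansion brings it in and the second search succeeds. Capping the number of expansions keeps a poorly specified selection from growing the subnet to the whole network.
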
network_wrangler.roadway.segment.Segment.get_node
get_node(node_selection_data)

Get single node based on the selection data.

Source code in network_wrangler/roadway/segment.py
def get_node(self, node_selection_data: SelectNodeDict):
    """Get single node based on the selection data."""
    node_selection_dict = {
        k: v
        for k, v in node_selection_data.asdict.items()
        if k in self.selection.node_query_fields
    }
    node_df = self.net.nodes_df.isin_dict(node_selection_dict)
    if len(node_df) != 1:
        msg = f"Node selection not unique. Found {len(node_df)} nodes."
        raise SegmentSelectionError(msg)
    return node_df
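
The exactly-one-match rule in get_node can be sketched with plain dictionaries. This is an illustration only: `select_single_node` is a hypothetical helper, nodes are modeled as a list of dicts instead of a dataframe, matching is done by simple equality (the behavior of `isin_dict` is not reproduced), and `ValueError` stands in for `SegmentSelectionError`.

```python
def select_single_node(nodes: list[dict], selection: dict, query_fields: set[str]) -> dict:
    """Return the one node matching every queried field, or raise if not unique."""
    # Keep only the selection keys that are valid node query fields.
    criteria = {k: v for k, v in selection.items() if k in query_fields}
    matches = [n for n in nodes if all(n.get(k) == v for k, v in criteria.items())]
    if len(matches) != 1:
        raise ValueError(f"Node selection not unique. Found {len(matches)} nodes.")
    return matches[0]


nodes = [
    {"model_node_id": 1, "osm_node_id": "187899923"},
    {"model_node_id": 2, "osm_node_id": "187865924"},
]
fields = {"osm_node_id", "model_node_id"}

# The "note" key is not a query field, so it is ignored when filtering.
start = select_single_node(nodes, {"osm_node_id": "187899923", "note": "ignored"}, fields)
print(start["model_node_id"])
```

Both zero matches and multiple matches are treated as errors, since a segment needs one unambiguous start node and one unambiguous end node.
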
network_wrangler.roadway.segment.Segment.get_node_id
get_node_id(node_selection_data)

Get the primary key of a node based on the selection data.

Source code in network_wrangler/roadway/segment.py
def get_node_id(self, node_selection_data: SelectNodeDict) -> int:
    """Get the primary key of a node based on the selection data."""
    node = self.get_node(node_selection_data)
    return node["model_node_id"].values[0]
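
As a rough illustration of the uniqueness check in `get_node` and the primary-key lookup in `get_node_id`, the sketch below uses a plain list of dictionaries in place of `net.nodes_df`. The `NODES` records and `NODE_QUERY_FIELDS` tuple are hypothetical, and `LookupError` stands in for `SegmentSelectionError`.

```python
from typing import Any

# Hypothetical node records standing in for net.nodes_df.
NODES = [
    {"model_node_id": 1, "osm_node_id": "187899923"},
    {"model_node_id": 2, "osm_node_id": "187865924"},
    {"model_node_id": 3, "osm_node_id": "187865924"},
]
# Hypothetical stand-in for selection.node_query_fields.
NODE_QUERY_FIELDS = ("model_node_id", "osm_node_id")


def get_node(selection: dict[str, Any]) -> dict[str, Any]:
    """Return the single node matching every queryable field in the selection."""
    query = {k: v for k, v in selection.items() if k in NODE_QUERY_FIELDS}
    matches = [n for n in NODES if all(n[k] == v for k, v in query.items())]
    if len(matches) != 1:
        raise LookupError(f"Node selection not unique. Found {len(matches)} nodes.")
    return matches[0]


def get_node_id(selection: dict[str, Any]) -> int:
    """Return the primary key of the uniquely selected node."""
    return get_node(selection)["model_node_id"]
```

Fields that are not query fields are ignored, and a selection that matches zero or several nodes raises instead of returning an arbitrary node.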
generate_subnet_from_link_selection_dict(net, link_selection_dict, modes=DEFAULT_SEARCH_MODES, sp_weight_col=SUBNET_SP_WEIGHT_COL, sp_weight_factor=DEFAULT_SUBNET_SP_WEIGHT_FACTOR, **kwargs)

Generates a Subnet object from a link selection dictionary.

Searches on “name” in the selection dictionary first; if no links are found, searches on the “ref” field instead.

Parameters:

  • net (RoadwayNetwork) –

    RoadwayNetwork object.

  • link_selection_dict (dict) –

    dictionary of attributes to search for.

  • modes (list[str], default: DEFAULT_SEARCH_MODES ) –

    List of modes to limit subnet to. Defaults to DEFAULT_SEARCH_MODES.

  • sp_weight_col (str, default: SUBNET_SP_WEIGHT_COL ) –

    Column to use for weights in shortest path. Defaults to SUBNET_SP_WEIGHT_COL.

  • sp_weight_factor (float, default: DEFAULT_SUBNET_SP_WEIGHT_FACTOR ) –

    Factor to multiply sp_weight_col by to use for weights in shortest path. Defaults to DEFAULT_SUBNET_SP_WEIGHT_FACTOR.

  • kwargs

    Other keyword arguments to pass to the Subnet initializer.

Returns:

  • Subnet ( Subnet ) –

    Subnet object.

Source code in network_wrangler/roadway/segment.py
def generate_subnet_from_link_selection_dict(
    net,
    link_selection_dict: dict,
    modes: list[str] = DEFAULT_SEARCH_MODES,
    sp_weight_col: str = SUBNET_SP_WEIGHT_COL,
    sp_weight_factor: float = DEFAULT_SUBNET_SP_WEIGHT_FACTOR,
    **kwargs,
) -> Subnet:
    """Generates a Subnet object from a link selection dictionary.

    Searches on "name" in the selection dictionary first; if no links are found, searches
    on the "ref" field instead.

    Args:
        net (RoadwayNetwork): RoadwayNetwork object.
        link_selection_dict: dictionary of attributes to search for.
        modes: List of modes to limit subnet to. Defaults to DEFAULT_SEARCH_MODES.
        sp_weight_col: Column to use for weights in shortest path.  Defaults to SUBNET_SP_WEIGHT_COL.
        sp_weight_factor: Factor to multiply sp_weight_col by to use for weights in shortest path.
            Defaults to DEFAULT_SUBNET_SP_WEIGHT_FACTOR.
        kwargs: other keyword arguments to pass to the Subnet initializer

    Returns:
        Subnet: Subnet object.
    """
    link_sd_options = _generate_subnet_link_selection_dict_options(link_selection_dict)
    for sd in link_sd_options:
        WranglerLogger.debug(f"Trying link selection:\n{sd}")
        subnet_links_df = copy.deepcopy(net.links_df.mode_query(modes))
        subnet_links_df = subnet_links_df.dict_query(sd)
        if len(subnet_links_df) > 0:
            break
    if len(subnet_links_df) == 0:
        WranglerLogger.error(f"Selection didn't return subnet links: {link_selection_dict}")
        msg = "No links found with selection."
        raise SubnetCreationError(msg)

    subnet_links_df["i"] = 0
    subnet = Subnet(
        net=net,
        subnet_links_df=subnet_links_df,
        modes=modes,
        sp_weight_col=sp_weight_col,
        sp_weight_factor=sp_weight_factor,
        **kwargs,
    )

    WranglerLogger.debug(
        f"Found subnet from link selection with {len(subnet.subnet_links_df)} links."
    )
    return subnet
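
The name-then-ref fallback can be sketched as follows. The helper `_generate_subnet_link_selection_dict_options` is not shown on this page, so `selection_options` is only an assumed reading of it: try the selection as given, then retry the same values against `ref`. The `LINKS` records are hypothetical, and `LookupError` stands in for `SubnetCreationError`.

```python
# Hypothetical link records standing in for net.links_df.
LINKS = [
    {"model_link_id": 10, "name": "Main St", "ref": "US 1"},
    {"model_link_id": 11, "name": "", "ref": "I 35E"},
    {"model_link_id": 12, "name": "", "ref": "I 35E"},
]


def selection_options(selection: dict) -> list[dict]:
    """Assumed option order: the selection as given, then its "name" values as "ref"."""
    options = [selection]
    if "name" in selection:
        options.append({"ref": selection["name"]})
    return options


def select_links(selection: dict) -> list[dict]:
    """Return the links matched by the first option that yields any result."""
    for option in selection_options(selection):
        found = [
            link
            for link in LINKS
            if all(link[field] in values for field, values in option.items())
        ]
        if found:
            return found
    raise LookupError("No links found with selection.")
```

With these records, `{"name": ["I 35E"]}` matches nothing by name, so the links are found through the `ref` option instead.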

network_wrangler.roadway.segment.identify_segment_endpoints

identify_segment_endpoints(net, mode='drive', min_connecting_links=10, max_link_deviation=2)

This has not been revisited or refactored and may or may not contain useful code.

Parameters:

  • net

    RoadwayNetwork to find segments for

  • mode (str, default: 'drive' ) –

    Mode of the network, one of drive, transit, walk, bike. Defaults to “drive”.

  • min_connecting_links (int, default: 10 ) –

    number of links that should be connected with same name or ref to be considered a segment (minus max_link_deviation). Defaults to 10.

  • max_link_deviation (int, default: 2 ) –

    maximum links that don’t have the same name or ref to still be considered a segment. Defaults to 2.

Source code in network_wrangler/roadway/segment.py
def identify_segment_endpoints(
    net,
    mode: str = "drive",
    min_connecting_links: int = 10,
    max_link_deviation: int = 2,
) -> pd.DataFrame:
    """This has not been revisited or refactored and may or may not contain useful code.

    Args:
        net: RoadwayNetwork to find segments for
        mode: mode of the network, one of `drive`, `transit`,
            `walk`, `bike`. Defaults to "drive".
        min_connecting_links: number of links that should be connected with same name or ref
            to be considered a segment (minus max_link_deviation). Defaults to 10.
        max_link_deviation: maximum links that don't have the same name or ref to still be
            considered a segment. Defaults to 2.

    """
    msg = "This function has not been revisited or refactored to work."
    raise NotImplementedError(msg)
    SEGMENT_IDENTIFIERS = ["name", "ref"]

    NAME_PER_NODE = 4
    REF_PER_NODE = 2

    # make a copy so it is a full dataframe rather than a slice.
    _links_df = copy.deepcopy(net.links_df.mode_query(mode))

    _nodes_df = copy.deepcopy(
        net.nodes_in_links(
            _links_df,
        )
    )
    from .network import add_incident_link_data_to_nodes  # noqa: PLC0415

    _nodes_df = add_incident_link_data_to_nodes(
        links_df=_links_df,
        nodes_df=_nodes_df,
        link_variables=[*SEGMENT_IDENTIFIERS, "distance"],
    )

    # WranglerLogger.debug(f"Node/Link table elements: {len(_nodes_df)}"")

    # Screen out segments that have blank name AND refs
    _nodes_df = _nodes_df.replace(r"^\s*$", np.nan, regex=True).dropna(subset=["name", "ref"])

    # WranglerLogger.debug(f"Node/Link recs after dropping empty name AND ref : {len(_nodes_df)}")

    # Screen out segments that aren't likely to be long enough
    # Minus 1 in case ref or name is missing on an intermediate link
    _min_ref_in_table = REF_PER_NODE * (min_connecting_links - max_link_deviation)
    _min_name_in_table = NAME_PER_NODE * (min_connecting_links - max_link_deviation)

    _nodes_df["ref_freq"] = _nodes_df["ref"].map(_nodes_df["ref"].value_counts())
    _nodes_df["name_freq"] = _nodes_df["name"].map(_nodes_df["name"].value_counts())

    _nodes_df = _nodes_df.loc[
        (_nodes_df["ref_freq"] >= _min_ref_in_table)
        & (_nodes_df["name_freq"] >= _min_name_in_table)
    ]

    _display_cols = [
        net.nodes_df.model_node_id,
        "name",
        "ref",
        "distance",
        "ref_freq",
        "name_freq",
    ]
    msg = f"Node/Link table has n = {len(_nodes_df)} after screening segments for min length: \n\
        {_nodes_df[_display_cols]}"
    WranglerLogger.debug(msg)

    # ----------------------------------------
    # Find nodes that are likely endpoints
    # ----------------------------------------

    # - Likely have one incident link and one outgoing link
    _max_ref_endpoints = REF_PER_NODE / 2
    _max_name_endpoints = NAME_PER_NODE / 2
    # - Attach frequency  of node/ref
    _nodes_df = _nodes_df.merge(
        _nodes_df.groupby(by=[net.nodes_df.model_node_id, "ref"]).size().rename("ref_N_freq"),
        on=[net.nodes_df.model_node_id, "ref"],
    )

    _display_cols = ["model_node_id", "ref", "name", "ref_N_freq"]
    # WranglerLogger.debug(f"_ref_count+_nodes:\n{_nodes_df[_display_cols]})
    # - Attach frequency  of node/name
    _nodes_df = _nodes_df.merge(
        _nodes_df.groupby(by=[net.nodes_df.model_node_id, "name"]).size().rename("name_N_freq"),
        on=[net.nodes_df.model_node_id, "name"],
    )
    _display_cols = ["model_node_id", "ref", "name", "name_N_freq"]
    # WranglerLogger.debug(f"_name_count+_nodes:\n{_nodes_df[_display_cols]}")

    _display_cols = [
        net.nodes_df.model_node_id,
        "name",
        "ref",
        "distance",
        "ref_N_freq",
        "name_N_freq",
    ]
    # WranglerLogger.debug(f"Possible segment endpoints:\n{_nodes_df[_display_cols]}")
    # - Filter possible endpoint list based on node/name node/ref frequency
    _nodes_df = _nodes_df.loc[
        (_nodes_df["ref_N_freq"] <= _max_ref_endpoints)
        | (_nodes_df["name_N_freq"] <= _max_name_endpoints)
    ]
    _gb_cols = [
        net.nodes_df.model_node_id,
        "name",
        "ref",
        "ref_N_freq",
        "name_N_freq",
    ]

    msg = f"{len(_nodes_df)} Likely segment endpoints with req_ref<= {_max_ref_endpoints} or\
            freq_name<={_max_name_endpoints}\n{_nodes_df.groupby(_gb_cols)}"
    # WranglerLogger.debug(msg)
    # ----------------------------------------
    # Assign a segment id
    # ----------------------------------------
    _nodes_df["segment_id"], _segments = pd.factorize(_nodes_df.name + _nodes_df.ref)

    WranglerLogger.debug(f"{len(_segments)} Segments: \n{chr(10).join(_segments.tolist())}")

    # ----------------------------------------
    # Drop segments without at least two nodes
    # ----------------------------------------

    # https://stackoverflow.com/questions/13446480/python-pandas-remove-entries-based-on-the-number-of-occurrences
    _min_nodes = 2
    _nodes_df = _nodes_df[
        _nodes_df.groupby(["segment_id", net.nodes_df.model_node_id])[
            net.nodes_df.model_node_id
        ].transform(len)
        >= _min_nodes
    ]

    msg = f"{len(_nodes_df)} segments with at least {_min_nodes} nodes: \n\
        {_nodes_df.groupby(['segment_id'])}"
    # WranglerLogger.debug(msg)

    # ----------------------------------------
    # For segments with more than two nodes, find farthest apart pairs
    # ----------------------------------------

    def _max_segment_distance(row):
        _segment_nodes = _nodes_df.loc[_nodes_df["segment_id"] == row["segment_id"]]
        dist = _segment_nodes.geometry.distance(row.geometry)
        return max(dist.dropna())

    _nodes_df["seg_distance"] = _nodes_df.apply(_max_segment_distance, axis=1)
    _nodes_df = _nodes_df.merge(
        _nodes_df.groupby("segment_id").seg_distance.agg(max).rename("max_seg_distance"),
        on="segment_id",
    )

    _nodes_df = _nodes_df.loc[
        (_nodes_df["max_seg_distance"] == _nodes_df["seg_distance"])
        & (_nodes_df["seg_distance"] > 0)
    ].drop_duplicates(subset=[net.nodes_df.model_node_id, "segment_id"])

    # ----------------------------------------
    # Reassign segment id for final segments
    # ----------------------------------------
    _nodes_df["segment_id"], _segments = pd.factorize(_nodes_df.name + _nodes_df.ref)

    _display_cols = [
        net.nodes_df.model_node_id,
        "name",
        "ref",
        "segment_id",
        "seg_distance",
    ]

    WranglerLogger.debug(
        f"Start and end of {len(_segments)} Segments: \n{_nodes_df[_display_cols]}"
    )

    _return_cols = [
        "segment_id",
        net.nodes_df.model_node_id,
        "geometry",
        "name",
        "ref",
    ]
    return _nodes_df[_return_cols]

Subnet class for RoadwayNetwork object.

network_wrangler.roadway.subnet.DEFAULT_SUBNET_MAX_SEARCH_BREADTH module-attribute

DEFAULT_SUBNET_MAX_SEARCH_BREADTH = 10

Maximum number of subnet expansions to make when searching for a shortest path.

network_wrangler.roadway.subnet.DEFAULT_SUBNET_SP_WEIGHT_FACTOR module-attribute

DEFAULT_SUBNET_SP_WEIGHT_FACTOR = 100

Factor to multiply sp_weight_col by to use for weights in shortest path.

network_wrangler.roadway.subnet.Subnet

Subnet is a connected selection of links/nodes from a RoadwayNetwork object.

Subnets are used for things like identifying Segments.

Usage:

selection_dict = {
    "links": {"name": ["6th", "Sixth", "sixth"]},
    "from": {"osm_node_id": "187899923"},
    "to": {"osm_node_id": "187865924"},
}

segment = Segment(net=RoadwayNetwork(...), selection_dict=selection_dict)
# used to store graph
self._segment_route_nodes = shortest_path(segment.subnet.graph, start_node_pk, end_node_pk)
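Attributes:

  • net

    Associated RoadwayNetwork object.

  • selection_dict

    Segment selection dictionary, which is used to create the initial subnet based on name and ref.

  • subnet_links_df

    Initial subnets can alternately be defined by a dataframe of links.

  • graph_hash

    Unique hash of subnet_links_df, _sp_weight_col and _sp_weight_factor. Used to identify if any of these have changed and thus if a new graph should be generated.

  • graph

    Returns the nx.MultiDiGraph of the subnet, which is stored in self._graph and lazily evaluated when called if graph_hash has changed, because it is an expensive operation.

  • num_links

    Number of links in the subnet.

  • subnet_nodes

    Lazily evaluated list of node primary keys based on subnet_links_df.

  • subnet_nodes_df

    Lazily evaluated selection of net.nodes_df based on subnet_links_df.
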
Source code in network_wrangler/roadway/subnet.py
class Subnet:
    """Subnet is a connected selection of links/nodes from a RoadwayNetwork object.

    Subnets are used for things like identifying Segments.

    Usage:

    ```
    selection_dict = {
        "links": {"name": ["6th", "Sixth", "sixth"]},
        "from": {"osm_node_id": "187899923"},
        "to": {"osm_node_id": "187865924"},
    }

    segment = Segment(net=RoadwayNetwork(...), selection_dict=selection_dict)
    # used to store graph
    self._segment_route_nodes = shortest_path(segment.subnet.graph, start_node_pk, end_node_pk)
    ```

    attr:
        net: Associated RoadwayNetwork object
        selection_dict: segment selection dictionary, which is used to create initial subnet
            based on name and ref
        subnet_links_df: initial subnets can alternately be defined by a dataframe of links.
        graph_hash: unique hash of subnet_links_df, _sp_weight_col and _sp_weight_factor. Used
            to identify if any of these have changed and thus if a new graph should be generated.
        graph: returns the nx.MultiDiGraph of the subnet, which is stored in self._graph and lazily
            evaluated when called if graph_hash has changed because it is an expensive operation.
        num_links: number of links in the subnet
        subnet_nodes: lazily evaluated list of node primary keys based on subnet_links_df
        subnet_nodes_df: lazily evaluated selection of net.nodes_df based on subnet_links_df

    """

    def __init__(
        self,
        net: RoadwayNetwork,
        modes: Optional[list] = DEFAULT_SEARCH_MODES,
        subnet_links_df: pd.DataFrame = None,
        i: int = 0,
        sp_weight_factor: float = DEFAULT_SUBNET_SP_WEIGHT_FACTOR,
        sp_weight_col: str = SUBNET_SP_WEIGHT_COL,
        max_search_breadth: int = DEFAULT_SUBNET_MAX_SEARCH_BREADTH,
    ):
        """Generates and returns a Subnet object.

        Args:
            net (RoadwayNetwork): Associated RoadwayNetwork object.
            modes: List of modes to limit subnet to. Defaults to DEFAULT_SEARCH_MODES.
            subnet_links_df (pd.DataFrame, optional): Initial links to include in subnet.
                Optional if a selection_dict is defined, in which case it defaults to the
                result of self.generate_subnet_from_selection_dict(selection_dict).
            i: Expansion iteration number. Shouldn't need to change this as it will be done
                internally. Defaults to 0.
            sp_weight_col: Column to use for weights in shortest path.  Will not
                likely need to be changed. Defaults to "i" which is the iteration #.
            sp_weight_factor: Factor to multiply sp_weight_col by to use for
                weights in shortest path.  Will not likely need to be changed.
                Defaults to DEFAULT_SUBNET_SP_WEIGHT_FACTOR.
            max_search_breadth: Maximum expansions of the subnet network to find
                the shortest path after the initial selection based on `name`. Will not likely
                need to be changed unless network contains a lot of ambiguity.
                Defaults to DEFAULT_SUBNET_MAX_SEARCH_BREADTH.
        """
        self.net = net
        self.modes = modes
        self._subnet_links_df = subnet_links_df
        self._i = i
        self._sp_weight_col = sp_weight_col
        self._sp_weight_factor = sp_weight_factor
        self._max_search_breadth = max_search_breadth
        self._graph = None
        self._graph_link_hash = None

    @property
    def exists(self) -> bool:
        """Returns True if subnet_links_df is not None and has at least one link."""
        return self.subnet_links_df is not None and len(self.subnet_links_df) > 0

    @property
    def subnet_links_df(self) -> DataFrame[RoadLinksTable]:
        """Links in the subnet."""
        return self._subnet_links_df

    @property
    def graph_hash(self) -> str:
        """Hash of the links in order to detect a network change from when graph created."""
        _value = [
            self.subnet_links_df.df_hash(),
            self._sp_weight_col,
            str(self._sp_weight_factor),
        ]
        _enc_value = str.encode("-".join(_value))
        _hash = hashlib.sha256(_enc_value).hexdigest()
        return _hash

    @property
    def graph(self) -> MultiDiGraph:
        """nx.MultiDiGraph of the subnet."""
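        if self.graph_hash != self._graph_link_hash:
            self._graph = links_nodes_to_ox_graph(
                self.subnet_links_df,
                self.subnet_nodes_df,
                sp_weight_col=self._sp_weight_col,
                sp_weight_factor=self._sp_weight_factor,
            )
            # record the hash the graph was built from so it is only rebuilt on change
            self._graph_link_hash = self.graph_hash
        return self._graph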

    @property
    def num_links(self):
        """Number of links in the subnet."""
        return len(self.subnet_links_df)

    @property
    def subnet_nodes(self) -> pd.Series:
        """List of node_ids in the subnet."""
        if self.subnet_links_df is None:
            msg = "Must set self.subnet_links_df before accessing subnet_nodes."
            raise ValueError(msg)
        return node_ids_in_links(self.subnet_links_df, self.net.nodes_df)

    @property
    def subnet_nodes_df(self) -> DataFrame[RoadNodesTable]:
        """Nodes filtered to subnet."""
        return self.net.nodes_df.loc[self.subnet_nodes]

    def expand_to_nodes(self, nodes_list: list, max_search_breadth: int) -> None:
        """Expand network to include list of nodes.

        Stops expanding and raises a SubnetExpansionError if max_search_breadth is reached
        before the nodes are found.

        Args:
            nodes_list: a list of node primary keys to expand subnet to include.
            max_search_breadth: maximum number of expansions to make before giving up.
        """
        WranglerLogger.debug(f"Expanding subnet to include nodes: {nodes_list}")

        # expand network to find nodes in the list
        while not set(nodes_list).issubset(self.subnet_nodes) and self._i <= max_search_breadth:
            self._expand_subnet_breadth()

        if not set(nodes_list).issubset(self.subnet_nodes):
            msg = (
                f"Can't find nodes {nodes_list} before achieving maximum "
                f"network expansion iterations of {max_search_breadth}"
            )
            raise SubnetExpansionError(msg)

    def _expand_subnet_breadth(self) -> None:
        """Add one degree of breadth to self.subnet_links_df and add property."""
        self._i += 1

        WranglerLogger.debug(f"Adding Breadth to Subnet: i={self._i}")

        _modal_links_df = self.net.links_df.mode_query(self.modes)
        # find links where A node is connected to subnet but not B node
        _outbound_links_df = _modal_links_df.loc[
            _modal_links_df.A.isin(self.subnet_nodes) & ~_modal_links_df.B.isin(self.subnet_nodes)
        ]

        WranglerLogger.debug(f"_outbound_links_df links: {len(_outbound_links_df)}")

        # find links where B node is connected to subnet but not A node
        _inbound_links_df = _modal_links_df.loc[
            _modal_links_df.B.isin(self.subnet_nodes) & ~_modal_links_df.A.isin(self.subnet_nodes)
        ]

        WranglerLogger.debug(f"_inbound_links_df links: {len(_inbound_links_df)}")

        # find links where A and B nodes are connected to subnet but not in subnet
        _both_AB_connected_links_df = _modal_links_df.loc[
            _modal_links_df.B.isin(self.subnet_nodes)
            & _modal_links_df.A.isin(self.subnet_nodes)
            & ~_modal_links_df.index.isin(self.subnet_links_df.index.tolist())
        ]

        WranglerLogger.debug(
            f"{len(_both_AB_connected_links_df)} links where both A and B are connected to "
            "subnet but aren't in subnet."
        )

        _add_links_df = concat_with_attr(
            [_both_AB_connected_links_df, _inbound_links_df, _outbound_links_df]
        )

        _add_links_df["i"] = self._i
        WranglerLogger.debug(f"Links to add: {len(_add_links_df)}")

        WranglerLogger.debug(f"{self.num_links} initial subnet links")

        self._subnet_links_df = concat_with_attr([self.subnet_links_df, _add_links_df])

        WranglerLogger.debug(f"{self.num_links} expanded subnet links")
network_wrangler.roadway.subnet.Subnet.exists property
exists

Returns True if subnet_links_df is not None and has at least one link.

network_wrangler.roadway.subnet.Subnet.graph property
graph

nx.MultiDiGraph of the subnet.

network_wrangler.roadway.subnet.Subnet.graph_hash property
graph_hash

Hash of the links in order to detect a network change from when graph created.
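
The idea of rebuilding the graph only when its inputs change can be sketched with a hash-keyed cache. This is a minimal illustration, not the library's code: the `CachedGraph` class, its adjacency-dict "graph", and the `builds` counter are hypothetical, and `repr` of the sorted links stands in for `df_hash()`.

```python
import hashlib


class CachedGraph:
    """Rebuilds an expensive object only when the hash of its inputs changes."""

    def __init__(self, links: list, weight_col: str, weight_factor: float):
        self.links = links
        self.weight_col = weight_col
        self.weight_factor = weight_factor
        self._graph = None
        self._graph_hash = None
        self.builds = 0  # counts rebuilds, for illustration only

    @property
    def graph_hash(self) -> str:
        """SHA-256 digest of the links plus the weight settings."""
        parts = [repr(sorted(self.links)), self.weight_col, str(self.weight_factor)]
        return hashlib.sha256("-".join(parts).encode()).hexdigest()

    @property
    def graph(self) -> dict:
        """Adjacency dict, regenerated only if graph_hash has changed."""
        current = self.graph_hash
        if current != self._graph_hash:
            self._graph = {}
            for a, b in self.links:
                self._graph.setdefault(a, []).append(b)
            self._graph_hash = current  # remember what the cache was built from
            self.builds += 1
        return self._graph
```

Storing the hash alongside the cached object is what makes the second access cheap; without that assignment, every access would rebuild the graph.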

num_links

Number of links in the subnet.

subnet_links_df

Links in the subnet.

network_wrangler.roadway.subnet.Subnet.subnet_nodes property
subnet_nodes

List of node_ids in the subnet.

network_wrangler.roadway.subnet.Subnet.subnet_nodes_df property
subnet_nodes_df

Nodes filtered to subnet.

network_wrangler.roadway.subnet.Subnet.__init__
__init__(net, modes=DEFAULT_SEARCH_MODES, subnet_links_df=None, i=0, sp_weight_factor=DEFAULT_SUBNET_SP_WEIGHT_FACTOR, sp_weight_col=SUBNET_SP_WEIGHT_COL, max_search_breadth=DEFAULT_SUBNET_MAX_SEARCH_BREADTH)

Generates and returns a Subnet object.

Parameters:

  • net (RoadwayNetwork) –

    Associated RoadwayNetwork object.

  • modes (Optional[list], default: DEFAULT_SEARCH_MODES ) –

    List of modes to limit subnet to. Defaults to DEFAULT_SEARCH_MODES.

  • subnet_links_df (DataFrame, default: None ) –

    Initial links to include in subnet. Optional if a selection_dict is defined, in which case it defaults to the result of self.generate_subnet_from_selection_dict(selection_dict).

  • i (int, default: 0 ) –

    Expansion iteration number. Shouldn’t need to change this as it will be done internally. Defaults to 0.

  • sp_weight_col (str, default: SUBNET_SP_WEIGHT_COL ) –

    Column to use for weights in shortest path. Will not likely need to be changed. Defaults to “i” which is the iteration #.

  • sp_weight_factor (float, default: DEFAULT_SUBNET_SP_WEIGHT_FACTOR ) –

    Factor to multiply sp_weight_col by to use for weights in shortest path. Will not likely need to be changed. Defaults to DEFAULT_SUBNET_SP_WEIGHT_FACTOR.

  • max_search_breadth (int, default: DEFAULT_SUBNET_MAX_SEARCH_BREADTH ) –

    Maximum expansions of the subnet network to find the shortest path after the initial selection based on name. Will not likely need to be changed unless network contains a lot of ambiguity. Defaults to DEFAULT_SUBNET_MAX_SEARCH_BREADTH.

Source code in network_wrangler/roadway/subnet.py
def __init__(
    self,
    net: RoadwayNetwork,
    modes: Optional[list] = DEFAULT_SEARCH_MODES,
    subnet_links_df: pd.DataFrame = None,
    i: int = 0,
    sp_weight_factor: float = DEFAULT_SUBNET_SP_WEIGHT_FACTOR,
    sp_weight_col: str = SUBNET_SP_WEIGHT_COL,
    max_search_breadth: int = DEFAULT_SUBNET_MAX_SEARCH_BREADTH,
):
    """Generates and returns a Subnet object.

    Args:
        net (RoadwayNetwork): Associated RoadwayNetwork object.
        modes: List of modes to limit subnet to. Defaults to DEFAULT_SEARCH_MODES.
        subnet_links_df (pd.DataFrame, optional): Initial links to include in subnet.
            Optional if a selection_dict is defined, in which case it defaults to the
            result of self.generate_subnet_from_selection_dict(selection_dict).
        i: Expansion iteration number. Shouldn't need to change this as it will be done
            internally. Defaults to 0.
        sp_weight_col: Column to use for weights in shortest path.  Will not
            likely need to be changed. Defaults to "i" which is the iteration #.
        sp_weight_factor: Factor to multiply sp_weight_col by to use for
            weights in shortest path.  Will not likely need to be changed.
            Defaults to DEFAULT_SUBNET_SP_WEIGHT_FACTOR.
        max_search_breadth: Maximum expansions of the subnet network to find
            the shortest path after the initial selection based on `name`. Will not likely
            need to be changed unless network contains a lot of ambiguity.
            Defaults to DEFAULT_SUBNET_MAX_SEARCH_BREADTH.
    """
    self.net = net
    self.modes = modes
    self._subnet_links_df = subnet_links_df
    self._i = i
    self._sp_weight_col = sp_weight_col
    self._sp_weight_factor = sp_weight_factor
    self._max_search_breadth = max_search_breadth
    self._graph = None
    self._graph_link_hash = None
network_wrangler.roadway.subnet.Subnet.expand_to_nodes
expand_to_nodes(nodes_list, max_search_breadth)

Expand network to include list of nodes.

Stops expanding and raises a SubnetExpansionError if max_search_breadth is reached before the nodes are found.

Parameters:

  • nodes_list (list) –

    a list of node primary keys to expand subnet to include.

  • max_search_breadth (int) –

    maximum number of expansions to make before giving up.

Source code in network_wrangler/roadway/subnet.py
def expand_to_nodes(self, nodes_list: list, max_search_breadth: int) -> None:
    """Expand network to include list of nodes.

    Stops expanding and raises a SubnetExpansionError if max_search_breadth is reached
    before the nodes are found.

    Args:
        nodes_list: a list of node primary keys to expand subnet to include.
        max_search_breadth: maximum number of expansions to make before giving up.
    """
    WranglerLogger.debug(f"Expanding subnet to include nodes: {nodes_list}")

    # expand network to find nodes in the list
    while not set(nodes_list).issubset(self.subnet_nodes) and self._i <= max_search_breadth:
        self._expand_subnet_breadth()

    if not set(nodes_list).issubset(self.subnet_nodes):
        msg = (
            f"Can't find nodes {nodes_list} before achieving maximum "
            f"network expansion iterations of {max_search_breadth}"
        )
        raise SubnetExpansionError(msg)
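
A single expansion step and the surrounding loop can be sketched with plain Python collections. This is an illustration under stated assumptions, not the library's code: `ALL_LINKS` is a hypothetical stand-in for `net.links_df`, the subnet is a dict mapping each link to the iteration `i` at which it was added, and `LookupError` stands in for `SubnetExpansionError`.

```python
# Hypothetical directed links (A, B) standing in for net.links_df.
ALL_LINKS = [(1, 2), (2, 3), (3, 4), (5, 1), (4, 6)]


def nodes_of(links: dict) -> set:
    """All node ids referenced by the links in the subnet."""
    return {n for link in links for n in link}


def expand_once(subnet: dict, i: int) -> dict:
    """Return the subnet plus all links touching its nodes, tagged with iteration i."""
    nodes = nodes_of(subnet)
    outbound = [(a, b) for a, b in ALL_LINKS if a in nodes and b not in nodes]
    inbound = [(a, b) for a, b in ALL_LINKS if b in nodes and a not in nodes]
    internal = [
        (a, b) for a, b in ALL_LINKS if a in nodes and b in nodes and (a, b) not in subnet
    ]
    expanded = dict(subnet)
    for link in internal + inbound + outbound:
        expanded[link] = i
    return expanded


def expand_to_nodes(subnet: dict, targets: set, max_breadth: int) -> dict:
    """Expand until every target node is in the subnet or the limit is exceeded."""
    i = 0
    while not targets <= nodes_of(subnet) and i <= max_breadth:
        i += 1
        subnet = expand_once(subnet, i)
    if not targets <= nodes_of(subnet):
        raise LookupError(f"Can't find nodes {targets} within {max_breadth} expansions")
    return subnet
```

Links from the initial selection keep `i = 0`, while links added later carry a higher `i`, which is what lets a weight based on `i` favor the originally selected links.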

Functions to convert RoadwayNetwork to osmnx graph and perform graph operations.

network_wrangler.roadway.graph.DEFAULT_GRAPH_WEIGHT_COL module-attribute

DEFAULT_GRAPH_WEIGHT_COL = 'distance'

Column to use for weights in shortest path.

network_wrangler.roadway.graph.ox_major_version module-attribute

ox_major_version = int(split('.')[0])

Major version number of the installed osmnx package.

network_wrangler.roadway.graph.assess_connectivity

assess_connectivity(net, mode='', ignore_end_nodes=True)

Network graph and list of disconnected subgraphs described by a list of their member nodes.

Parameters:

  • net (RoadwayNetwork) –

    RoadwayNetwork object

  • mode (str, default: '' ) –

    Mode of the network, one of drive, transit, walk, bike.

  • ignore_end_nodes (bool, default: True ) –

    if True, ignores stray singleton nodes

Returns a tuple of:

  • Network graph (osmnx-flavored networkX DiGraph)

  • List of disconnected subgraphs, each described by the list of its member nodes (identified by their model_node_id)

Source code in network_wrangler/roadway/graph.py
def assess_connectivity(
    net: RoadwayNetwork,
    mode: str = "",
    ignore_end_nodes: bool = True,
):
    """Network graph and list of disconnected subgraphs described by a list of their member nodes.

    Args:
        net: RoadwayNetwork object
        mode:  mode of the network, one of `drive`,`transit`,
            `walk`, `bike`
        ignore_end_nodes: if True, ignores stray singleton nodes

    Returns: Tuple of
        Network Graph (osmnx flavored networkX DiGraph)
        List of disconnected subgraphs described by the list of their
            member nodes (as described by their `model_node_id`)
    """
    WranglerLogger.debug(f"Assessing network connectivity for mode: {mode}")

    G = net.get_modal_graph(mode)

    sub_graph_nodes = [
        list(s) for s in sorted(nx.strongly_connected_components(G), key=len, reverse=True)
    ]

    # sorted on decreasing length, dropping the main sub-graph
    disconnected_sub_graph_nodes = sub_graph_nodes[1:]

    # dropping the sub-graphs with only 1 node
    if ignore_end_nodes:
        disconnected_sub_graph_nodes = [
            list(s) for s in disconnected_sub_graph_nodes if len(s) > 1
        ]

    WranglerLogger.info(
        f"model_node_id values for disconnected networks for mode = {mode}:\n"
        + "\n".join(list(map(str, disconnected_sub_graph_nodes))),
    )
    return G, disconnected_sub_graph_nodes
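
To show what "disconnected subgraphs" means here without depending on networkx, the sketch below finds strongly connected components with Kosaraju's algorithm, swapped in for `nx.strongly_connected_components`, and then applies the same filtering steps as the function above. The adjacency-dict input and `TOY_GRAPH` are hypothetical.

```python
# Hypothetical toy graph as {node: [successors]}.
TOY_GRAPH = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}


def strongly_connected_components(graph: dict) -> list:
    """Kosaraju's algorithm on an adjacency dict {node: [successors]}."""
    nodes = set(graph) | {n for succ in graph.values() for n in succ}
    order, seen = [], set()

    def visit(node, adjacency, out):
        # Iterative depth-first search that appends nodes in post-order.
        stack = [(node, iter(adjacency.get(node, [])))]
        seen.add(node)
        while stack:
            current, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adjacency.get(nxt, []))))
                    break
            else:
                stack.pop()
                out.append(current)

    # First pass: record finishing order on the original graph.
    for node in sorted(nodes):
        if node not in seen:
            visit(node, graph, order)

    # Second pass: walk the reversed graph in decreasing finishing order.
    reverse = {}
    for a, succ in graph.items():
        for b in succ:
            reverse.setdefault(b, []).append(a)
    seen.clear()
    components = []
    for node in reversed(order):
        if node not in seen:
            component = []
            visit(node, reverse, component)
            components.append(sorted(component))
    return components


def disconnected_subgraphs(graph: dict, ignore_end_nodes: bool = True) -> list:
    """Drop the largest component; optionally drop single-node components too."""
    components = sorted(strongly_connected_components(graph), key=len, reverse=True)
    rest = components[1:]
    return [c for c in rest if len(c) > 1] if ignore_end_nodes else rest
```

In `TOY_GRAPH`, nodes 1 to 3 form the main component, nodes 4 and 5 form a second component that cannot reach back to the first, and node 6 is a stray singleton.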
links_nodes_to_ox_graph(links_df, nodes_df, sp_weight_col='distance', sp_weight_factor=1)

Create an osmnx-flavored network graph from nodes and links dfs.

osmnx doesn’t like values that are arrays, so remove the variables that have arrays. osmnx also requires that certain variables be filled in, so do that too.

Parameters:

  • links_df (GeoDataFrame) –

    links_df from RoadwayNetwork

  • nodes_df (GeoDataFrame) –

    nodes_df from RoadwayNetwork

  • sp_weight_col (str, default: 'distance' ) –

    column to use for weights. Defaults to distance.

  • sp_weight_factor (float, default: 1 ) –

    multiple to apply to the weights. Defaults to 1.

Source code in network_wrangler/roadway/graph.py
def links_nodes_to_ox_graph(
    links_df: GeoDataFrame,
    nodes_df: GeoDataFrame,
    sp_weight_col: str = "distance",
    sp_weight_factor: float = 1,
):
    """Create an osmnx-flavored network graph from nodes and links dfs.

    osmnx doesn't like values that are arrays, so remove the variables
    that have arrays.  osmnx also requires that certain variables
    be filled in, so do that too.

    Args:
        links_df: links_df from RoadwayNetwork
        nodes_df: nodes_df from RoadwayNetwork
        sp_weight_col: column to use for weights. Defaults to `distance`.
        sp_weight_factor: multiple to apply to the weights. Defaults to 1.

    Returns: a networkx multidigraph
    """
    WranglerLogger.debug("starting ox_graph()")
    graph_nodes_df = _nodes_to_graph_nodes(nodes_df)
    graph_links_df = _links_to_graph_links(
        links_df,
        sp_weight_col=sp_weight_col,
        sp_weight_factor=sp_weight_factor,
    )

    try:
        WranglerLogger.debug("starting ox.graph_from_gdfs()")
        G = ox.graph_from_gdfs(graph_nodes_df, graph_links_df)

    except AttributeError as attr_error:
        if attr_error.args[0] == "module 'osmnx' has no attribute 'graph_from_gdfs'":
            # This is the only exception for which we have a workaround
            # Does this still work given the u,v,key multi-indexing?
            WranglerLogger.warning(
                "Please upgrade your OSMNX package. For now, using deprecated "
                "osmnx.gdfs_to_graph because osmnx.graph_from_gdfs not found"
            )
            G = ox.gdfs_to_graph(graph_nodes_df, graph_links_df)
        else:
            # for other AttributeErrors, raise further
            raise

    WranglerLogger.debug("Created osmnx graph from RoadwayNetwork")
    return G
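
The preparation step described above (dropping array-valued variables and attaching a scaled weight) can be sketched on plain dictionaries. The private helpers `_nodes_to_graph_nodes` and `_links_to_graph_links` are not shown on this page, so this is only an assumed reading of their effect. The function name `to_graph_links` is hypothetical, and the `weight` key is an assumption based on the default `sp_weight_property='weight'` of `shortest_path`.

```python
def to_graph_links(links: list, weight_col: str = "distance", weight_factor: float = 1) -> list:
    """Drop list-valued attributes and add a scaled "weight" to each link record."""
    # Any attribute holding an array-like value in at least one record is dropped everywhere.
    array_cols = {
        key
        for link in links
        for key, value in link.items()
        if isinstance(value, (list, tuple, set))
    }
    cleaned = []
    for link in links:
        record = {k: v for k, v in link.items() if k not in array_cols}
        record["weight"] = link[weight_col] * weight_factor
        cleaned.append(record)
    return cleaned
```

For example, a `lanes` attribute that holds a list on any link is removed from every record, while `distance` is kept and multiplied by the factor to produce `weight`.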

network_wrangler.roadway.graph.net_to_graph

net_to_graph(net, mode=None)

Converts a network to a MultiDiGraph.

Parameters:

  • net (RoadwayNetwork) –

    RoadwayNetwork object

  • mode (Optional[str], default: None ) –

    Mode of the network, one of drive, transit, walk, bike.

Source code in network_wrangler/roadway/graph.py
def net_to_graph(net: RoadwayNetwork, mode: Optional[str] = None) -> nx.MultiDiGraph:
    """Converts a network to a MultiDiGraph.

    Args:
        net: RoadwayNetwork object
        mode: mode of the network, one of `drive`,`transit`,
            `walk`, `bike`

    Returns: osmnx-flavored networkx MultiDiGraph of the network
    """
    _links_df = net.links_df.mode_query(mode)

    _nodes_df = net.nodes_in_links()

    G = links_nodes_to_ox_graph(_links_df, _nodes_df)

    return G

network_wrangler.roadway.graph.shortest_path

shortest_path(G, O_id, D_id, sp_weight_property='weight')

Calculates the shortest path between two nodes in a network.

Parameters:

  • G (MultiDiGraph) –

    osmnx MultiDiGraph, created using links_nodes_to_ox_graph

  • O_id

    primary key for start node

  • D_id

    primary key for end node

  • sp_weight_property

    link property to use as weight in finding shortest path. Defaults to “weight”.

Returns:

  • Union[list, None] –

    Route of shortest path nodes as a list, or None if no path is found.

Source code in network_wrangler/roadway/graph.py
def shortest_path(
    G: nx.MultiDiGraph, O_id, D_id, sp_weight_property="weight"
) -> Union[list, None]:
    """Calculates the shortest path between two nodes in a network.

    Args:
        G: osmnx MultiDiGraph, created using links_nodes_to_ox_graph
        O_id: primary key for start node
        D_id: primary key for end node
        sp_weight_property: link property to use as weight in finding shortest path.
            Defaults to "weight".

    Returns:
        Route of shortest path nodes as a list, or None if no path is found.
    """
    try:
        sp_route = nx.shortest_path(G, O_id, D_id, weight=sp_weight_property)
        WranglerLogger.debug("Shortest path successfully routed")
    except nx.NetworkXNoPath:
        WranglerLogger.debug(f"No SP from {O_id} to {D_id} Found.")
        return None

    return sp_route
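
The return contract (a node list, or `None` when no path exists) can be illustrated with a dependency-free Dijkstra search, swapped in here for `nx.shortest_path`. The nested-dict graph format and `TOY_GRAPH` are hypothetical. The weights assume a link's weight is its expansion iteration `i` times a factor of 100, which is consistent with the documented defaults but not confirmed by the code shown on this page.

```python
import heapq
from typing import Optional

# Hypothetical weighted graph {node: {neighbor: weight}}.
# Links from the original selection (i=0) cost 0; links added in expansion 1 cost 100.
TOY_GRAPH = {1: {2: 0, 4: 100}, 2: {3: 0}, 4: {3: 100}, 3: {}}


def shortest_path(graph: dict, origin, dest) -> Optional[list]:
    """Dijkstra's algorithm; returns the node list of the cheapest path, or None."""
    best = {origin: 0}
    previous = {}
    heap = [(0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dest:
            # Walk the predecessor chain back to the origin.
            path = [dest]
            while path[-1] != origin:
                path.append(previous[path[-1]])
            return path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return None
```

Because the expansion links are more expensive, the route through the originally selected links (1, 2, 3) wins over the detour through node 4, and an unreachable destination yields `None` rather than an exception.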