Preprocessing
OSRM requires OSM data to be preprocessed before routing. Two algorithms are supported:
- CH (Contraction Hierarchies): fastest query times; use extract → contract
- MLD (Multi-Level Dijkstra): better for large networks and dynamic weights; use extract → partition → customize
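As a quick illustration of the two pipelines, the step order can be written down as a small lookup. This is only a sketch: `PIPELINES` and `pipeline_steps` are hypothetical helper names and are not part of the osrm package.

```python
from typing import Dict, List

# Step order per algorithm, taken from the list above.
# PIPELINES and pipeline_steps are hypothetical names, not part of the osrm package.
PIPELINES: Dict[str, List[str]] = {
    "CH": ["extract", "contract"],
    "MLD": ["extract", "partition", "customize"],
}


def pipeline_steps(algorithm: str) -> List[str]:
    """Return the preprocessing step names for 'CH' or 'MLD' (case-insensitive)."""
    try:
        return list(PIPELINES[algorithm.upper()])
    except KeyError:
        raise ValueError(
            f"Unknown algorithm {algorithm!r}; expected 'CH' or 'MLD'"
        ) from None


print(pipeline_steps("MLD"))
```

Each name corresponds to one of the functions documented below.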
extract
extract(input_path, profile='car', output_path=None, threads=None, verbosity='INFO', progress_callback=None, capture_output=False, **kwargs)
Extract OSM data into OSRM format.
This is the first step in the OSRM preprocessing pipeline. It parses OSM data (from .osm, .osm.bz2, or .osm.pbf files) and applies a Lua routing profile to generate graph data for routing.
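A caller may want to check the input file name before starting a long extraction. The sketch below tests only for the three suffixes listed above; `is_supported_osm_input` is a hypothetical helper, and the library itself may accept or reject files by different rules.

```python
from pathlib import Path

# Suffixes named in the description above.
SUPPORTED_SUFFIXES = (".osm", ".osm.bz2", ".osm.pbf")


def is_supported_osm_input(path: str) -> bool:
    """Return True if the file name ends with one of the documented suffixes.

    Hypothetical pre-check; it looks only at the name, not the file contents.
    """
    name = Path(path).name.lower()
    return name.endswith(SUPPORTED_SUFFIXES)


print(is_supported_osm_input("data.osm.pbf"))
```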
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_path` | `str` | Path to input OSM file (.osm, .osm.bz2, or .osm.pbf) | required |
| `profile` | `str` | Profile name (`'car'`, `'bicycle'`, `'foot'`) or path to a custom .lua file | `'car'` |
| `output_path` | `Optional[str]` | Base path for output files. If None, uses the input_path base name. | `None` |
| `threads` | `Optional[int]` | Number of threads to use. If None, uses all available CPU cores. | `None` |
| `verbosity` | `str` | Log level; one of: "NONE", "ERROR", "WARNING", "INFO", "DEBUG" | `'INFO'` |
| `progress_callback` | `Optional[Callable[[str], None]]` | Optional callback function(line: str) for progress updates. Called with each log line. Keep it lightweight to avoid slowing down the extraction process. | `None` |
| `capture_output` | `bool` | If True, capture stdout/stderr and return them in the result dict. If False and no progress_callback is set, output goes directly to the console. | `False` |
| `**kwargs` | | Additional ExtractorConfig parameters (small_component_size, use_metadata, parse_conditionals, etc.) | `{}` |
Returns:
| Type | Description |
|---|---|
| `Dict[str, Any]` | Dict with the keys listed below |

- success: bool - Whether extraction succeeded
- duration: float - Time taken in seconds
- stdout: str - Captured stdout (if capture_output=True or progress_callback set)
- stderr: str - Captured stderr (if capture_output=True or progress_callback set)
- error: str - Error message (if success=False)
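Because failures are reported through the `success` and `error` keys, callers may prefer to turn a failed result into an exception. The following is a minimal sketch; `PreprocessingError` and `ensure_success` are hypothetical names, and the demonstration uses a hand-written dict rather than a real extraction result.

```python
from typing import Any, Dict


class PreprocessingError(RuntimeError):
    """Hypothetical exception type; not part of the osrm package."""


def ensure_success(result: Dict[str, Any], step: str = "extract") -> Dict[str, Any]:
    """Return the result unchanged, or raise if its 'success' flag is false."""
    if not result.get("success", False):
        message = result.get("error") or "no error message provided"
        raise PreprocessingError(f"{step} failed: {message}")
    return result


# Hand-written stand-in for a result dict with the documented keys.
ok = ensure_success({"success": True, "duration": 1.5})
print(f"extract finished in {ok['duration']:.2f}s")
```

With the real package, the call would presumably look like `ensure_success(osrm.extract('data.osm.pbf'))`.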
Example

    import osrm

    # Simple extraction with bundled profile
    result = osrm.extract('data.osm.pbf', profile='car')

    # With different profile
    result = osrm.extract('data.osm.pbf', profile='bicycle')

    # With custom profile
    result = osrm.extract('data.osm.pbf', profile='path/to/custom.lua')

    # With progress callback
    def show_progress(line):
        print(f"Progress: {line}")

    result = osrm.extract('data.osm.pbf', progress_callback=show_progress)
    print(f"Completed in {result['duration']:.2f}s")

    # With custom settings
    result = osrm.extract(
        'data.osm.pbf',
        profile='bicycle',
        threads=4,
        small_component_size=500,
        use_metadata=True
    )
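Since the progress callback runs for every log line, one lightweight option is to keep only the most recent lines in a bounded buffer instead of printing each one. This is a sketch under that assumption; `make_tail_collector` is a hypothetical helper, and the loop at the end simulates log lines rather than running an extraction.

```python
from collections import deque
from typing import Callable, Deque, Tuple


def make_tail_collector(max_lines: int = 50) -> Tuple[Callable[[str], None], Deque[str]]:
    """Build a callback that keeps only the last `max_lines` log lines."""
    tail: Deque[str] = deque(maxlen=max_lines)

    def on_line(line: str) -> None:
        # Appending to a bounded deque is O(1) and discards the oldest entry.
        tail.append(line.rstrip("\n"))

    return on_line, tail


# Simulated log lines; with the real package the callback would be passed as
# progress_callback=on_line to osrm.extract(...).
on_line, tail = make_tail_collector(max_lines=3)
for i in range(5):
    on_line(f"log line {i}\n")
print(list(tail))
```

The retained lines can be inspected after the call, for example to show the final messages when a step fails.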
Source code in osrm/preprocessing.py
contract
contract(input_path, threads=None, verbosity='INFO', progress_callback=None, capture_output=False, **kwargs)
Run the contraction hierarchy computation for the CH algorithm.
This step is used for the CH (Contraction Hierarchies) routing algorithm. Run it after extract() if you plan to use CH routing.
For the MLD algorithm, use partition() and customize() instead.
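Because each step depends on the output of the previous one, it can help to stop the pipeline as soon as a step reports failure. The sketch below assumes that steps signal failure through `success=False` in the returned dict, as the Returns sections describe. `run_steps` is a hypothetical helper, and the demonstration uses stand-in functions instead of real osrm calls.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], Dict[str, Any]]]


def run_steps(steps: List[Step]) -> List[Tuple[str, Dict[str, Any]]]:
    """Run steps in order; stop after the first whose result has success=False."""
    results: List[Tuple[str, Dict[str, Any]]] = []
    for name, run in steps:
        result = run()
        results.append((name, result))
        if not result.get("success", False):
            break  # later steps need this step's output, so do not continue
    return results


# Stand-ins for demonstration. With the real package one would pass e.g.
#   ("extract", lambda: osrm.extract("data.osm.pbf", profile="car")),
#   ("contract", lambda: osrm.contract("data.osrm")),
demo = run_steps([
    ("extract", lambda: {"success": False, "duration": 0.1, "error": "demo failure"}),
    ("contract", lambda: {"success": True, "duration": 0.2}),
])
print([name for name, _ in demo])  # contract is skipped because extract failed
```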
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_path` | `str` | Base path to .osrm files (output from extract) | required |
| `threads` | `Optional[int]` | Number of threads to use. If None, uses all available CPU cores. | `None` |
| `verbosity` | `str` | Log level; one of: "NONE", "ERROR", "WARNING", "INFO", "DEBUG" | `'INFO'` |
| `progress_callback` | `Optional[Callable[[str], None]]` | Optional callback function(line: str) for progress updates | `None` |
| `capture_output` | `bool` | If True, capture and return stdout/stderr | `False` |
| `**kwargs` | | Additional ContractorConfig parameters | `{}` |
Returns:
| Type | Description |
|---|---|
| `Dict[str, Any]` | Dict with success, duration, stdout, stderr, and optionally error keys |
Example

    import osrm

    # First extract (the documented keyword for the profile is `profile`)
    osrm.extract('data.osm.pbf', profile='profiles/car.lua')

    # Then contract
    result = osrm.contract('data.osrm')
    print(f"Contraction completed in {result['duration']:.2f}s")
Source code in osrm/preprocessing.py
partition
partition(input_path, threads=None, verbosity='INFO', progress_callback=None, capture_output=False, **kwargs)
Partition graph for MLD (Multi-Level Dijkstra) algorithm.
This is the second step for MLD routing (after extract). Follow with customize() to complete the MLD preprocessing pipeline.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_path` | `str` | Base path to .osrm files (output from extract) | required |
| `threads` | `Optional[int]` | Number of threads to use. If None, uses all available CPU cores. | `None` |
| `verbosity` | `str` | Log level; one of: "NONE", "ERROR", "WARNING", "INFO", "DEBUG" | `'INFO'` |
| `progress_callback` | `Optional[Callable[[str], None]]` | Optional callback function(line: str) for progress updates | `None` |
| `capture_output` | `bool` | If True, capture and return stdout/stderr | `False` |
| `**kwargs` | | Additional PartitionerConfig parameters (balance, boundary_factor, num_optimizing_cuts, small_component_size, max_cell_sizes) | `{}` |
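Since the extra options are passed as free-form keyword arguments, a typo in an option name is easy to miss. The sketch below checks option names against the five listed in the table before they are forwarded. `partition_kwargs` is a hypothetical helper, the list may not be exhaustive, and the value shown is purely illustrative.

```python
from typing import Any, Dict

# Option names taken from the **kwargs row above; the real config may accept more.
PARTITION_OPTIONS = frozenset({
    "balance",
    "boundary_factor",
    "num_optimizing_cuts",
    "small_component_size",
    "max_cell_sizes",
})


def partition_kwargs(options: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `options`, raising if any name is not in the documented list."""
    unknown = sorted(set(options) - PARTITION_OPTIONS)
    if unknown:
        raise ValueError(f"Unknown partition option(s): {', '.join(unknown)}")
    return dict(options)


# Illustrative value only; it is not a recommended setting.
checked = partition_kwargs({"num_optimizing_cuts": 10})
print(checked)
```

The checked dict could then be unpacked into the call, for example `osrm.partition('data.osrm', **checked)`.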
Returns:
| Type | Description |
|---|---|
| `Dict[str, Any]` | Dict with success, duration, stdout, stderr, and optionally error keys |
Example

    import osrm

    # First extract (the documented keyword for the profile is `profile`)
    osrm.extract('data.osm.pbf', profile='profiles/car.lua')

    # Then partition for MLD
    result = osrm.partition('data.osrm')

    # Finally customize
    osrm.customize('data.osrm')
Source code in osrm/preprocessing.py
customize
customize(input_path, threads=None, verbosity='INFO', progress_callback=None, capture_output=False, **kwargs)
Customize partitioned graph for MLD (Multi-Level Dijkstra) algorithm.
This is the final step for MLD routing (after extract and partition).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_path` | `str` | Base path to .osrm files (output from partition) | required |
| `threads` | `Optional[int]` | Number of threads to use. If None, uses all available CPU cores. | `None` |
| `verbosity` | `str` | Log level; one of: "NONE", "ERROR", "WARNING", "INFO", "DEBUG" | `'INFO'` |
| `progress_callback` | `Optional[Callable[[str], None]]` | Optional callback function(line: str) for progress updates | `None` |
| `capture_output` | `bool` | If True, capture and return stdout/stderr | `False` |
| `**kwargs` | | Additional CustomizationConfig parameters | `{}` |
Returns:
| Type | Description |
|---|---|
| `Dict[str, Any]` | Dict with success, duration, stdout, stderr, and optionally error keys |
Example

    import osrm

    # Complete MLD pipeline (the documented keyword for the profile is `profile`)
    osrm.extract('data.osm.pbf', profile='profiles/car.lua')
    osrm.partition('data.osrm')
    result = osrm.customize('data.osrm')

    # Now ready for routing with MLD
    engine = osrm.OSRM('data.osrm', algorithm='MLD')
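Each step returns its own `duration`, so the total preprocessing time can be reported by summing over the collected results. This is a small sketch with made-up numbers; `total_duration` is a hypothetical helper, and the dicts stand in for real return values.

```python
from typing import Any, Dict, Mapping


def total_duration(results: Mapping[str, Dict[str, Any]]) -> float:
    """Sum the 'duration' values (seconds) of the given step results."""
    return sum(float(r.get("duration", 0.0)) for r in results.values())


# Made-up durations standing in for the results of extract, partition, customize.
demo_results = {
    "extract": {"success": True, "duration": 12.5},
    "partition": {"success": True, "duration": 3.25},
    "customize": {"success": True, "duration": 1.25},
}
print(f"MLD preprocessing took {total_duration(demo_results):.2f}s")
```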