canml.canmlio Module
This module provides the core APIs for decoding BLF files:
canmlio: Enhanced CAN BLF processing toolkit for production use.
This module provides end-to-end functionality for decoding CAN bus logs in BLF format into pandas DataFrames, handling DBC file loading and merging, streaming large logs, full-file loading with filtering, timing alignment, missing-signal injection, and exporting to CSV or Parquet with accompanying metadata. It also supports enums and custom signal attributes, all configurable via a single CanmlConfig object.
- Dependencies:
numpy
pandas
cantools
python-can
tqdm
pyarrow (for Parquet export)
Example
from canml.canmlio import load_dbc_files, load_blf, to_csv, CanmlConfig
# 1. Load DBC with safe prefixing
db = load_dbc_files("vehicle.dbc", prefix_signals=True)
# 2. Configure BLF loading
cfg = CanmlConfig(
    chunk_size=5000, progress_bar=True, sort_timestamps=True,
    force_uniform_timing=True, interval_seconds=0.02,
    interpolate_missing=True, dtype_map={"Engine_RPM": "int32"},
)
# 3. Load BLF file into DataFrame
df = load_blf(
    blf_path="drive.blf", db=db, config=cfg,
    message_ids={0x100, 0x200},
    expected_signals=["Engine_RPM", "Brake_Active"],
)
# 4. Export results
to_csv(df, "drive.csv", metadata_path="drive_meta.json")
- class canml.canmlio.CanmlConfig(chunk_size: int = 10000, progress_bar: bool = True, dtype_map: Dict[str, Any] | None = None, sort_timestamps: bool = False, force_uniform_timing: bool = False, interval_seconds: float = 0.01, interpolate_missing: bool = False)
Bases: object
Configuration options for BLF processing.
- Parameters:
chunk_size (int) – Number of messages per chunk. Defaults to 10000.
progress_bar (bool) – Show tqdm bar if True. Defaults to True.
dtype_map (Optional[Dict[str, Any]]) – Signal-to-dtype map. Defaults to None.
sort_timestamps (bool) – Sort by timestamp. Defaults to False.
force_uniform_timing (bool) – Uniform spacing of timestamps. Defaults to False.
interval_seconds (float) – Uniform interval seconds. Defaults to 0.01.
interpolate_missing (bool) – Interpolate missing signals. Defaults to False.
- Raises:
ValueError – If chunk_size or interval_seconds is less than or equal to 0.
- chunk_size: int = 10000
- dtype_map: Dict[str, Any] | None = None
- force_uniform_timing: bool = False
- interpolate_missing: bool = False
- interval_seconds: float = 0.01
- progress_bar: bool = True
- sort_timestamps: bool = False
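The validation rule listed under Raises can be pictured with a small stand-in dataclass. This is a minimal sketch and not the library's source: `ConfigSketch` is a hypothetical name, and only the field names, defaults, and the ValueError condition are taken from the reference above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in mirroring the documented CanmlConfig fields."""

    chunk_size: int = 10000
    progress_bar: bool = True
    dtype_map: Optional[Dict[str, Any]] = None
    sort_timestamps: bool = False
    force_uniform_timing: bool = False
    interval_seconds: float = 0.01
    interpolate_missing: bool = False

    def __post_init__(self) -> None:
        # Documented rule: both values must be strictly positive.
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be > 0")
        if self.interval_seconds <= 0:
            raise ValueError("interval_seconds must be > 0")
```

Constructing `ConfigSketch(chunk_size=0)` raises ValueError immediately, so a bad value is caught before any file is opened.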
- canml.canmlio.iter_blf_chunks(blf_path: str, db: Database, config: CanmlConfig, filter_ids: Set[int] | None = None, filter_signals: Iterable[Any] | None = None) -> Iterator[DataFrame]
Stream-decode a BLF file into pandas DataFrame chunks.
Logs the total and dropped message counts.
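The chunked streaming pattern can be illustrated in plain Python, without a BLF file or the library itself. The sketch below is an assumption-laden stand-in: `iter_chunks_sketch` and the `Frame` alias are hypothetical, chunks are lists of dicts rather than DataFrames, and "dropped" here means frames excluded by the ID filter, which may differ from how the library counts them.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple

# (arbitration_id, decoded_signals) pairs stand in for decoded CAN frames.
Frame = Tuple[int, Dict[str, float]]


def iter_chunks_sketch(
    frames: Iterable[Frame],
    chunk_size: int,
    filter_ids: Optional[Set[int]] = None,
) -> Iterator[List[Dict[str, float]]]:
    """Yield lists of at most chunk_size rows, skipping filtered-out IDs."""
    total = dropped = 0
    chunk: List[Dict[str, float]] = []
    for arb_id, signals in frames:
        total += 1
        if filter_ids is not None and arb_id not in filter_ids:
            dropped += 1  # assumption: filtered frames count as dropped
            continue
        chunk.append(signals)
        if len(chunk) >= chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly short, chunk
    print(f"total={total} dropped={dropped}")
```

Because the function is a generator, only one chunk is held in memory at a time, which is the point of streaming a large log instead of loading it whole.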
- canml.canmlio.load_blf(blf_path: str, db: Database | str | List[str], config: CanmlConfig | None = None, message_ids: Set[int] | None = None, expected_signals: Iterable[Any] | None = None) -> DataFrame
Load an entire BLF file into a DataFrame.
- Supports:
ID and signal filtering
Timestamp sorting and uniform spacing
Missing signal injection with dtype preservation
Metadata attributes and enum conversion
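Two of the features listed above, uniform timestamp spacing and missing-signal injection, can be sketched on plain dict rows. This is only an illustration of the ideas: `uniform_timestamps` and `inject_missing` are hypothetical helpers, the first timestamp is assumed to be the start of the uniform grid, and missing values are filled with None rather than a dtype-preserving placeholder.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def uniform_timestamps(rows: List[Row], interval_seconds: float) -> List[Row]:
    """Sort rows by timestamp, then respace them at a fixed interval."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    if not ordered:
        return []
    start = ordered[0]["timestamp"]  # assumption: grid starts at first sample
    return [
        {**row, "timestamp": start + i * interval_seconds}
        for i, row in enumerate(ordered)
    ]


def inject_missing(rows: List[Row], expected_signals: List[str]) -> List[Row]:
    """Add any expected signal absent from a row, filled with None."""
    return [
        {**{name: None for name in expected_signals}, **row}
        for row in rows
    ]
```

After both steps every row carries the same set of keys on an evenly spaced time axis, which is convenient for downstream code that expects a fixed schema.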
- canml.canmlio.load_dbc_files(dbc_paths: str | List[str], prefix_signals: bool = False) -> Database
Load and merge DBC files with optional prefixing.
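Prefixing matters when two messages define a signal with the same name. The sketch below shows the idea on plain dictionaries; it does not call the library. `merge_signal_names` is a hypothetical helper, the "<message>_<signal>" naming scheme is an assumption, and raising on a collision is a choice made for this sketch rather than documented behavior.

```python
from typing import Dict, List


def merge_signal_names(
    messages: Dict[str, List[str]], prefix_signals: bool = False
) -> List[str]:
    """Flatten message -> signals into one list of column names.

    Assumed naming scheme: "<message>_<signal>" when prefixing is on.
    Raises ValueError if two messages would produce the same column name.
    """
    names: List[str] = []
    seen = set()
    for message, signals in messages.items():
        for signal in signals:
            name = f"{message}_{signal}" if prefix_signals else signal
            if name in seen:
                raise ValueError(f"duplicate signal name: {name}")
            seen.add(name)
            names.append(name)
    return names
```

With `{"Engine": ["RPM", "Status"], "Brake": ["Status"]}`, the unprefixed merge fails on the shared `Status` name, while the prefixed merge yields three distinct columns.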
- canml.canmlio.to_csv(df_or_iter: DataFrame | Iterable[DataFrame], output_path: str, mode: str = 'w', header: bool = True, pandas_kwargs: Dict[str, Any] | None = None, columns: List[str] | None = None, metadata_path: str | None = None) -> None
Write DataFrame or chunks to CSV with side-car metadata JSON.
- Parameters:
df_or_iter (DataFrame or iterable) – Data to write.
output_path (str) – Destination CSV file path.
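mode (str) – Write mode: 'w' (overwrite) or 'a' (append). Defaults to 'w'.
header (bool) – Include the header row in the CSV. Defaults to True.
pandas_kwargs (dict) – Extra keyword arguments for pandas DataFrame.to_csv.
columns (list) – Subset of columns to write.
metadata_path (str) – Path of the side-car JSON file for signal_attributes.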
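The combination of chunked CSV output and a side-car metadata file can be sketched with the standard library alone. This is not how to_csv is implemented: `write_chunks_csv` is a hypothetical helper, rows are plain dicts instead of DataFrames, and the shape of the metadata dict is an assumption.

```python
import csv
import json
from typing import Any, Dict, Iterable, List


def write_chunks_csv(
    chunks: Iterable[List[Dict[str, Any]]],
    output_path: str,
    columns: List[str],
    metadata: Dict[str, Any],
    metadata_path: str,
) -> None:
    """Write row chunks to one CSV (header once) plus a side-car JSON file."""
    with open(output_path, "w", newline="") as fh:
        # extrasaction="ignore" drops keys that are not in `columns`.
        writer = csv.DictWriter(fh, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for chunk in chunks:
            writer.writerows(chunk)
    with open(metadata_path, "w") as fh:
        json.dump(metadata, fh, indent=2)
```

Writing the header once and then appending each chunk keeps memory use bounded, and storing the attributes in a separate JSON file leaves the CSV readable by any tool.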
- canml.canmlio.to_parquet(df: DataFrame, output_path: str, compression: str = 'snappy', pandas_kwargs: Dict[str, Any] | None = None, metadata_path: str | None = None) -> None
Write DataFrame to Parquet with side-car metadata JSON.
- Parameters:
df (DataFrame) – Data to write.
output_path (str) – Destination .parquet file path.
compression (str) – Parquet compression codec. Defaults to 'snappy'.
pandas_kwargs (dict) – Extra keyword arguments for pandas DataFrame.to_parquet.
metadata_path (str) – Path of the side-car JSON file for signal_attributes.