canml.canmlio Module

This module provides the core APIs for decoding BLF files:

canmlio: Enhanced CAN BLF processing toolkit for production use. Module: canml/canmlio.py

Features:
  • Merge multiple DBCs with namespace collision avoidance.

  • Stream-decode large BLF files into pandas DataFrame chunks.

  • Full-file loading with uniform timestamp spacing and interpolation.

  • Signal/message filtering by ID or signal name.

  • Automatic injection of expected signals with dtype preservation.

  • Incremental CSV/Parquet export with metadata support.

  • Generic handling for enums and custom signal attributes.

  • Progress bars via tqdm and caching for DBC loading.

Dependencies:

numpy, pandas, cantools, python-can, tqdm, pyarrow

Usage:

from canml.canmlio import load_dbc_files, iter_blf_chunks, load_blf, to_csv, to_parquet, CanmlConfig
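
The imports above can be combined into a single conversion step. The following is a minimal sketch rather than a canonical recipe: the helper names `parquet_path_for` and `blf_to_parquet`, the chosen `chunk_size`, and the output layout are illustrative assumptions, while the `canml.canmlio` calls follow the signatures documented on this page.

```python
from pathlib import Path


def parquet_path_for(blf_path: str, out_dir: str) -> Path:
    """Map e.g. 'logs/drive.blf' and 'out' to 'out/drive.parquet' (hypothetical layout)."""
    return Path(out_dir) / (Path(blf_path).stem + ".parquet")


def blf_to_parquet(blf_path: str, dbc_path: str, out_dir: str) -> Path:
    """Decode one BLF log with one DBC file and write the result as Parquet."""
    # Imported lazily so the path helper above stays usable without canml installed.
    from canml.canmlio import CanmlConfig, load_blf, load_dbc_files, to_parquet

    db = load_dbc_files(dbc_path)
    config = CanmlConfig(chunk_size=50_000, progress_bar=False)
    df = load_blf(blf_path, db, config=config)

    out_path = parquet_path_for(blf_path, out_dir)
    # Assumption: to_parquet does not create missing directories, so do it here.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    to_parquet(df, str(out_path))
    return out_path
```

A call such as `blf_to_parquet("logs/drive.blf", "vehicle.dbc", "out")` would then be expected to produce `out/drive.parquet`, provided both input files exist.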

class canml.canmlio.CanmlConfig(chunk_size: int = 10000, progress_bar: bool = True, dtype_map: Dict[str, Any] | None = None, sort_timestamps: bool = False, force_uniform_timing: bool = False, interval_seconds: float = 0.01, interpolate_missing: bool = False)

Bases: object

Configuration for BLF processing.

chunk_size

number of messages per DataFrame chunk.

Type:

int

progress_bar

show tqdm progress bar if True.

Type:

bool

dtype_map

mapping from signal name to desired pandas dtype.

Type:

Dict[str, Any] | None

sort_timestamps

sort final DataFrame by timestamp if True.

Type:

bool

force_uniform_timing

override timestamps with uniform spacing if True.

Type:

bool

interval_seconds

interval in seconds between consecutive timestamps when uniform timing is enabled.

Type:

float

interpolate_missing

interpolate missing signals if True.

Type:

bool

chunk_size: int = 10000
dtype_map: Dict[str, Any] | None = None
force_uniform_timing: bool = False
interpolate_missing: bool = False
interval_seconds: float = 0.01
progress_bar: bool = True
sort_timestamps: bool = False
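
To make the interaction between force_uniform_timing and interval_seconds concrete, here is a small sketch of what uniformly spaced timestamps look like. The helper `uniform_timestamps`, its zero start value, and the `uniform_config` wrapper are assumptions made for illustration; how canml computes the replacement timestamps internally is not stated on this page.

```python
from typing import List


def uniform_timestamps(n_rows: int, interval_seconds: float = 0.01,
                       start: float = 0.0) -> List[float]:
    """Return n_rows timestamps spaced exactly interval_seconds apart.

    Assumption: the series starts at `start`; canml's actual origin may differ.
    """
    return [start + i * interval_seconds for i in range(n_rows)]


def uniform_config(interval_seconds: float = 0.01):
    """Build a CanmlConfig that requests uniform timing and interpolation."""
    # Lazy import: the arithmetic helper above does not depend on canml.
    from canml.canmlio import CanmlConfig

    return CanmlConfig(
        force_uniform_timing=True,
        interval_seconds=interval_seconds,
        interpolate_missing=True,
    )
```

With the default interval_seconds of 0.01, this corresponds to a 100 Hz time base.
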
canml.canmlio.iter_blf_chunks(blf_path: str, db: Database, config: CanmlConfig, filter_ids: Set[int] | None = None, filter_signals: Set[str] | None = None) → Iterator[DataFrame]

Stream-decode BLF file into DataFrame chunks.

Parameters:
  • blf_path – .blf file path.

  • db – loaded CantoolsDatabase.

  • config – CanmlConfig instance.

  • filter_ids – set of arbitration IDs to include.

  • filter_signals – set of signal names to include.

Yields:

pandas.DataFrame chunks of decoded messages.
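
Because the chunks are yielded lazily, they can be passed straight to to_csv, which accepts an iterable of DataFrames. The sketch below assumes that arbitration IDs arrive as strings (for example from a command line); `parse_arbitration_ids` and `stream_blf_to_csv` are hypothetical helper names, not part of canml.

```python
from typing import Iterable, Set


def parse_arbitration_ids(raw_ids: Iterable[str]) -> Set[int]:
    """Turn strings such as '0x100' or '512' into a set of integer CAN IDs."""
    # int(text, 0) infers the base from the prefix, so hex and decimal both work.
    return {int(text, 0) for text in raw_ids}


def stream_blf_to_csv(blf_path: str, dbc_path: str, csv_path: str,
                      raw_ids: Iterable[str]) -> None:
    """Decode a BLF file chunk by chunk and write the chunks to one CSV file."""
    # Lazy import keeps the ID parser usable without canml installed.
    from canml.canmlio import CanmlConfig, iter_blf_chunks, load_dbc_files, to_csv

    db = load_dbc_files(dbc_path)
    config = CanmlConfig(chunk_size=20_000)
    chunks = iter_blf_chunks(
        blf_path, db, config, filter_ids=parse_arbitration_ids(raw_ids)
    )
    # The generator is consumed by to_csv, so only one chunk needs to be in memory.
    to_csv(chunks, csv_path)
```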

canml.canmlio.load_blf(blf_path: str, db: Database | str | List[str], config: CanmlConfig | None = None, message_ids: Set[int] | None = None, expected_signals: Iterable[str] | None = None) → DataFrame

Load an entire BLF file into a pandas DataFrame, with robust decoding, filtering, timing normalization, missing‐signal injection, and metadata.

Parameters:
  • blf_path – Path to the .blf log file.

  • db – Either a CantoolsDatabase instance or path(s) to DBC file(s).

  • config – CanmlConfig instance controlling chunking, timing, dtypes, etc.

  • message_ids – Optional set of CAN arbitration IDs to include (None = all).

  • expected_signals – Optional iterable of signal names to include (None = all signals in DBC).

Returns:

DataFrame with columns [“timestamp”, …signals], enriched with:

  • a raw_timestamp column (if uniform timing is enabled)

  • df.attrs[“signal_attributes”] mapping signal→custom attributes

  • enum signals as pandas.Categorical

Return type:

DataFrame
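
The enrichments of the returned DataFrame can be inspected with a few attribute lookups. This is a sketch under the assumption that the frame behaves as described on this page; `describe_loaded_frame` is a hypothetical helper, and it only relies on the `columns`, `dtypes`, and `attrs` attributes that pandas DataFrames provide.

```python
from typing import Any, Dict


def describe_loaded_frame(df: Any) -> Dict[str, Any]:
    """Summarize the extras that load_blf is documented to attach to its result."""
    signal_attrs = df.attrs.get("signal_attributes", {})
    return {
        # Present only when uniform timing was requested, per the docs above.
        "has_raw_timestamp": "raw_timestamp" in df.columns,
        # pandas reports categorical columns with the dtype string "category".
        "enum_signals": sorted(
            name for name, dtype in df.dtypes.items() if str(dtype) == "category"
        ),
        "signals_with_attributes": sorted(signal_attrs),
    }
```

Typical use would be `describe_loaded_frame(load_blf("drive.blf", "vehicle.dbc"))`, where both file names are placeholders.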

canml.canmlio.load_dbc_files(dbc_paths: str | List[str], prefix_signals: bool = False) → Database

Load and optionally prefix one or more DBC files into a Cantools database.

Parameters:
  • dbc_paths – path or list of paths to .dbc files.

  • prefix_signals – if True, prefix signal names with their message name.

Returns:

Cached CantoolsDatabase instance.
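
Whether prefix_signals is needed depends on whether the same signal name occurs in more than one message. The following sketch checks for such collisions on a plain mapping of message name to signal names; `find_colliding_signals` is a hypothetical helper and not part of canml.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_colliding_signals(
    signals_by_message: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Return each signal name used by more than one message, with its messages."""
    owners = defaultdict(set)
    for message, signals in signals_by_message.items():
        for signal in signals:
            owners[signal].add(message)
    # Keep only names claimed by at least two different messages.
    return {
        signal: sorted(messages)
        for signal, messages in owners.items()
        if len(messages) > 1
    }
```

For a database loaded with cantools, the input mapping could be built as `{m.name: [s.name for s in m.signals] for m in db.messages}`. A non-empty result suggests calling `load_dbc_files(paths, prefix_signals=True)`.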

canml.canmlio.to_csv(df_or_iter: DataFrame | Iterable[DataFrame], output_path: str, mode: str = 'w', header: bool = True, pandas_kwargs: Dict[str, Any] | None = None, columns: List[str] | None = None, metadata_path: str | None = None) → None

Write a DataFrame or an iterable of DataFrames to CSV, optionally exporting metadata.

Parameters:
  • df_or_iter – single DataFrame or iterable of chunks.

  • output_path – CSV file path.

  • mode – ‘w’ or ‘a’.

  • header – write header row.

  • pandas_kwargs – extra pandas.to_csv kwargs.

  • columns – subset/order of columns to write.

  • metadata_path – JSON file path to save df.attrs[“signal_attributes”].
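
The mode and header parameters make it possible to collect several logs in one CSV file: the first write creates the file with a header row, and later writes append without one. The sketch below assumes that every log decodes to the same columns; `csv_write_args` and `blf_logs_to_one_csv` are hypothetical helper names.

```python
from typing import Any, Dict, Sequence


def csv_write_args(index: int) -> Dict[str, Any]:
    """Pick mode/header for the index-th write to a shared CSV file."""
    first = index == 0
    # First write truncates and emits the header; later writes append rows only.
    return {"mode": "w" if first else "a", "header": first}


def blf_logs_to_one_csv(blf_paths: Sequence[str], dbc_path: str,
                        csv_path: str) -> None:
    """Decode several BLF logs with one DBC and append them to one CSV file."""
    # Lazy import keeps csv_write_args usable without canml installed.
    from canml.canmlio import load_blf, load_dbc_files, to_csv

    db = load_dbc_files(dbc_path)
    for index, blf_path in enumerate(blf_paths):
        df = load_blf(blf_path, db)
        to_csv(df, csv_path, **csv_write_args(index))
```

If the logs may contain different signals, passing the same expected_signals to every load_blf call, or the same columns list to every to_csv call, is one way to keep the column layout stable across appends.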

canml.canmlio.to_parquet(df: DataFrame, output_path: str, compression: str = 'snappy', pandas_kwargs: Dict[str, Any] | None = None, metadata_path: str | None = None) → None

Write DataFrame to Parquet with optional metadata export.

Parameters:
  • df – DataFrame to write.

  • output_path – .parquet file path.

  • compression – Parquet codec.

  • pandas_kwargs – kwargs for pandas.to_parquet.

  • metadata_path – JSON path for signal_attributes metadata.
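
A metadata file written through metadata_path can be read back with the standard library. This sketch assumes the JSON document is the signal→attributes mapping itself at the top level; the exact file layout is not specified on this page, and `load_signal_metadata` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def load_signal_metadata(metadata_path: str) -> Dict[str, Dict[str, Any]]:
    """Read a signal_attributes JSON file back into a dict."""
    with open(metadata_path, "r", encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: the top-level JSON value is an object keyed by signal name.
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object in {metadata_path!r}")
    return data
```

The result could be assigned to `df.attrs["signal_attributes"]` after reading the Parquet file with pandas, restoring the metadata that load_blf originally attached.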