canml.canmlio Module
This module provides the core APIs for decoding BLF files.
canmlio: Enhanced CAN BLF processing toolkit for production use.
Module: canml/canmlio.py
Features:
Merge multiple DBCs with namespace collision avoidance (optional prefixing).
Stream‐decode large BLF files in pandas DataFrame chunks.
Full‐file loading with optional uniform timestamp spacing.
Signal‐ and message‐level filtering.
Automatic injection of expected signals (NaN‐filled if missing).
Incremental CSV export and Parquet export.
Progress bars via tqdm.
- canml.canmlio.iter_blf_chunks(blf_path: str, db: Database, chunk_size: int = 10000, filter_ids: Set[int] | None = None, progress_bar: bool = True) → Iterator[DataFrame]
Stream-decode a BLF file in pandas DataFrame chunks.
- Parameters:
blf_path – Path to the BLF log.
db – cantools Database with message definitions.
chunk_size – Rows per DataFrame chunk.
filter_ids – If set, only decode messages with these arbitration IDs.
progress_bar – If True, show a tqdm progress bar.
- Yields:
DataFrame chunks containing the decoded signals plus a timestamp column.
- Raises:
FileNotFoundError – If the BLF file is not found.
ValueError – If chunk_size is invalid.
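The chunking behavior described above can be pictured with a small standard-library sketch. This is not canml's implementation: `iter_row_chunks` and the `(timestamp, arbitration_id, signals)` tuple layout are hypothetical, and plain lists of dicts stand in for DataFrame chunks. Two further assumptions are made: a non-positive chunk_size is what counts as invalid, and the final chunk may be shorter than chunk_size.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple

# Hypothetical stand-in for one decoded CAN frame:
# (timestamp, arbitration_id, decoded signal values)
Frame = Tuple[float, int, Dict[str, float]]
Row = Dict[str, float]


def iter_row_chunks(
    frames: Iterable[Frame],
    chunk_size: int = 10000,
    filter_ids: Optional[Set[int]] = None,
) -> Iterator[List[Row]]:
    """Yield lists of at most chunk_size rows, skipping unwanted IDs."""
    # Assumption: "invalid" means not a positive integer.
    # Because this is a generator, the check runs on first iteration.
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")

    buffer: List[Row] = []
    for timestamp, arbitration_id, signals in frames:
        # Drop frames whose ID is not in the filter set, if one is given.
        if filter_ids is not None and arbitration_id not in filter_ids:
            continue
        buffer.append({"timestamp": timestamp, **signals})
        if len(buffer) >= chunk_size:
            yield buffer
            buffer = []
    # Emit whatever is left over as a final, possibly shorter, chunk.
    if buffer:
        yield buffer
```

Only one chunk is held in memory at a time, which is the point of streaming a large log rather than loading it whole.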
- canml.canmlio.load_blf(blf_path: str, db: Database | str | List[str], message_ids: Set[int] | None = None, expected_signals: List[str] | None = None, force_uniform_timing: bool = False, interval_seconds: float = 0.01, dtype_map: Dict[str, str | dtype] | None = None, sort_timestamps: bool = False) → DataFrame
Load an entire BLF file into a DataFrame, with optional filters, signal injection, and dtype control for injected signals.
Notes
If force_uniform_timing=True, the original timestamps are saved in “raw_timestamp”.
Concatenates chunks iteratively to reduce memory usage.
- Parameters:
blf_path – Path to the BLF log.
db – Database instance or DBC path(s).
message_ids – Set of arbitration IDs to include (default all).
expected_signals – List of signal names to ensure exist.
force_uniform_timing – If True, override timestamps with uniform spacing.
interval_seconds – Interval for uniform timing.
dtype_map – Optional mapping from signal name to dtype for injected columns.
sort_timestamps – If True, sort by timestamp before processing.
- Returns:
A DataFrame with a ‘timestamp’ column plus the decoded signal columns.
- Raises:
FileNotFoundError – If any input file is missing.
ValueError – For invalid parameters or processing errors.
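The effect of force_uniform_timing and expected_signals can be illustrated with a short standard-library sketch. It is not canml's code: the helper names `apply_uniform_timing` and `inject_expected_signals` are hypothetical, rows are plain dicts rather than a DataFrame, and starting the uniform time axis at 0.0 is an assumption, since the reference above does not state the origin.

```python
import math
from typing import Dict, List, Sequence

Row = Dict[str, float]


def apply_uniform_timing(rows: List[Row], interval_seconds: float = 0.01) -> List[Row]:
    """Keep the logged time in raw_timestamp and space timestamp evenly."""
    result: List[Row] = []
    for index, row in enumerate(rows):
        updated = dict(row)
        updated["raw_timestamp"] = row["timestamp"]
        # Assumption: the uniform axis starts at 0.0.
        updated["timestamp"] = index * interval_seconds
        result.append(updated)
    return result


def inject_expected_signals(rows: List[Row], expected_signals: Sequence[str]) -> List[Row]:
    """Add every expected signal that is missing, filled with NaN."""
    result: List[Row] = []
    for row in rows:
        updated = dict(row)
        for name in expected_signals:
            updated.setdefault(name, math.nan)
        result.append(updated)
    return result


# Made-up sample rows; "EngineSpeed" and "OilTemp" are illustrative names.
rows = [
    {"timestamp": 12.504, "EngineSpeed": 900.0},
    {"timestamp": 12.517, "EngineSpeed": 905.0},
]
rows = inject_expected_signals(apply_uniform_timing(rows), ["EngineSpeed", "OilTemp"])
```

After this runs, each row carries the logged time in raw_timestamp, an evenly spaced timestamp, and an OilTemp entry set to NaN because that signal never appeared in the sample data.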
- canml.canmlio.load_dbc_files(dbc_paths: str | List[str], prefix_signals: bool = False) → Database
Load and merge one or more DBC files into a single Database. Optionally prefix signal names with message names to avoid collisions.
- Parameters:
dbc_paths – Path or list of paths to DBC files.
prefix_signals – If True, rename signals to “<MessageName>_<SignalName>”.
- Returns:
A cantools Database with all definitions loaded.
- Raises:
FileNotFoundError – If any DBC file is missing.
ValueError – If loading fails or duplicate names are detected.
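To see why prefixing avoids collisions, consider a minimal sketch in which two messages define a signal with the same name. The function `merged_signal_names` and the message and signal names are hypothetical, and the dict-of-lists input stands in for parsed DBC definitions. How canml itself detects duplicates is not specified here.

```python
from typing import Dict, List


def merged_signal_names(
    messages: Dict[str, List[str]], prefix_signals: bool = False
) -> List[str]:
    """Return the merged signal names, raising on any duplicate."""
    names: List[str] = []
    seen = set()
    for message_name, signal_names in messages.items():
        for signal_name in signal_names:
            # With prefixing, names take the "<MessageName>_<SignalName>" form.
            name = f"{message_name}_{signal_name}" if prefix_signals else signal_name
            if name in seen:
                raise ValueError(f"duplicate signal name: {name}")
            seen.add(name)
            names.append(name)
    return names


# Two messages that both define a signal called "Speed".
messages = {"EngineData": ["Speed", "Temp"], "GearboxData": ["Speed"]}
prefixed = merged_signal_names(messages, prefix_signals=True)
```

With prefix_signals=False the same input raises ValueError, because "Speed" appears under both messages.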
- canml.canmlio.to_csv(df_or_iter: DataFrame | Iterable[DataFrame], output_path: str, mode: str = 'w', header: bool = True, pandas_kwargs: Dict[str, Any] | None = None, columns: List[str] | None = None) → None
Write a DataFrame or iterable of DataFrames to CSV incrementally, enforcing a canonical column order if provided.
- Parameters:
df_or_iter – DataFrame or iterable of DataFrames.
output_path – Destination CSV file.
mode – Write mode (‘w’ or ‘a’).
header – If True, write the header row for the first block.
pandas_kwargs – Additional kwargs for pandas.to_csv.
columns – Optional canonical column list; each chunk will be reindexed to this.
- Raises:
ValueError – If columns are invalid.
TypeError – If input is not a DataFrame or iterable.
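The idea of incremental export with a canonical column order can be sketched with the standard csv module. This is an illustration rather than canml's implementation: `write_chunks` is a hypothetical helper, chunks are lists of dicts instead of DataFrames, and the output goes to an in-memory buffer. Leaving missing cells empty and dropping unlisted columns are assumptions modeled on how reindexing to a fixed column list behaves.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def write_chunks(
    chunks: Iterable[List[Dict[str, object]]],
    handle: TextIO,
    columns: List[str],
    header: bool = True,
) -> None:
    """Write chunks one at a time, forcing every row into the same column order."""
    writer = csv.DictWriter(
        handle,
        fieldnames=columns,
        restval="",             # missing columns become empty cells
        extrasaction="ignore",  # columns outside the canonical list are dropped
        lineterminator="\n",
    )
    for index, chunk in enumerate(chunks):
        # The header is written once, before the first chunk only.
        if header and index == 0:
            writer.writeheader()
        writer.writerows(chunk)


# Two chunks with different columns; "Unlisted" is not in the canonical list.
buffer = io.StringIO()
chunks = [
    [{"timestamp": 0.0, "Speed": 900.0}],
    [{"timestamp": 0.01, "Temp": 81.5, "Unlisted": 1}],
]
write_chunks(chunks, buffer, columns=["timestamp", "Speed", "Temp"])
csv_text = buffer.getvalue()
```

Fixing the column list up front matters when streaming, since later chunks may contain signals that earlier chunks did not, and a CSV header cannot be revised once written.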
- canml.canmlio.to_parquet(df: DataFrame, output_path: str, compression: str = 'snappy', pandas_kwargs: Dict[str, Any] | None = None) → None
Write a DataFrame to Parquet.
- Parameters:
df – pandas DataFrame.
output_path – ‘.parquet’ file path.
compression – Parquet codec.
pandas_kwargs – Additional kwargs for pandas.to_parquet.
- Raises:
ValueError – If write fails.
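Taken together, the functions documented on this page suggest a pipeline along the following lines. It is a sketch that uses only the signatures listed above and has not been verified against a particular canml release. The wrapper name `export_blf` and all file paths passed to it are hypothetical.

```python
from typing import List, Union


def export_blf(
    blf_path: str,
    dbc_paths: Union[str, List[str]],
    csv_path: str,
    parquet_path: str,
) -> None:
    """Decode one BLF log and export it as both CSV and Parquet."""
    # Imported here so this sketch can be loaded without canml installed.
    from canml.canmlio import (
        iter_blf_chunks,
        load_blf,
        load_dbc_files,
        to_csv,
        to_parquet,
    )

    # Merge the DBC files, prefixing signal names to avoid collisions.
    db = load_dbc_files(dbc_paths, prefix_signals=True)

    # Streaming path: decode in chunks and append them to one CSV file.
    # Pass columns=[...] to to_csv if chunks may differ in their columns.
    chunks = iter_blf_chunks(blf_path, db, chunk_size=50000, progress_bar=False)
    to_csv(chunks, csv_path)

    # Full-file path: load everything, sorted by timestamp, then write Parquet.
    df = load_blf(blf_path, db, sort_timestamps=True)
    to_parquet(df, parquet_path, compression="snappy")
```

The streaming path suits logs too large to hold in memory, while the full-file path is convenient when the whole log is needed as one DataFrame.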