Performance Evaluation

Track and estimation performance metrics.

Performance evaluation module.

This module provides metrics for evaluating tracking and estimation performance, including:

  • Track metrics: OSPA, MOTA/MOTP, track purity, fragmentation

  • Estimation metrics: RMSE, NEES, NIS, consistency tests

Examples

>>> from pytcl.performance_evaluation import ospa, rmse, nees
>>> import numpy as np
>>> # OSPA between two point sets
>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([9, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> print(f"OSPA: {result.ospa:.2f}")
OSPA: 1.22
>>> # RMSE between true and estimated states
>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.1, 0.9], [2.0, 2.1]])
>>> print(f"RMSE: {rmse(true, est):.3f}")
RMSE: 0.091

class pytcl.performance_evaluation.OSPAResult(ospa, localization, cardinality)

Bases: NamedTuple

Result of OSPA metric computation.

ospa

Total OSPA distance.

Type:

float

localization

Localization component.

Type:

float

cardinality

Cardinality component.

Type:

float

class pytcl.performance_evaluation.MOTMetrics(mota, motp, num_switches, num_fragmentations, num_false_positives, num_misses)

Bases: NamedTuple

Multiple Object Tracking (MOT) metrics.

mota

Multiple Object Tracking Accuracy.

Type:

float

motp

Multiple Object Tracking Precision.

Type:

float

num_switches

Number of identity switches.

Type:

int

num_fragmentations

Number of track fragmentations.

Type:

int

num_false_positives

Number of false positive detections.

Type:

int

num_misses

Number of missed detections.

Type:

int

pytcl.performance_evaluation.ospa(X, Y, c=100.0, p=2.0)

Compute Optimal Sub-Pattern Assignment (OSPA) metric.

The OSPA metric provides a mathematically consistent measure of distance between two sets of points, accounting for both localization error and cardinality mismatch.

Parameters:
  • X (list of ndarray) – First set of points (e.g., ground truth).

  • Y (list of ndarray) – Second set of points (e.g., estimated tracks).

  • c (float, optional) – Cutoff parameter for localization error (default: 100.0).

  • p (float, optional) – Order parameter for the metric (default: 2.0).

Returns:

Named tuple containing:

  • ospa: Total OSPA distance

  • localization: Localization component

  • cardinality: Cardinality component

Return type:

OSPAResult

Notes

  • If both sets are empty, OSPA is 0.

  • The metric is symmetric: ospa(X, Y) = ospa(Y, X).

  • OSPA is a proper metric for any order p ≥ 1; with p=2 (default), per-point errors are combined in the L2 (Euclidean) sense.

Examples

>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([10, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> result.ospa
1.0
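
The definition behind these notes can be sketched in plain Python. This is a minimal illustration, not the library's implementation: ospa_sketch is a hypothetical name, the optimal assignment is found by brute force over permutations (practical only for a handful of points), and the split into localization and cardinality parts follows one common convention that may differ from what OSPAResult reports.

```python
import itertools
import math


def ospa_sketch(X, Y, c=100.0, p=2.0):
    """Return (ospa, localization, cardinality) for two lists of points.

    Illustrative only: brute-force assignment search, suitable for tiny sets.
    """
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0, 0.0, 0.0
    # Make X the smaller set; OSPA is symmetric, so swapping is harmless.
    if m > n:
        X, Y, m, n = Y, X, n, m

    def cut(x, y):
        # Euclidean distance, capped at the cutoff c.
        return min(c, math.dist(x, y))

    # Cheapest assignment of every point in X to a distinct point in Y.
    best = min(
        sum(cut(x, Y[j]) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    loc_p = best / n                 # localization term, still raised to p
    card_p = (c ** p) * (n - m) / n  # penalty for the n - m unmatched points
    return (loc_p + card_p) ** (1 / p), loc_p ** (1 / p), card_p ** (1 / p)


# One estimate for two true targets: one matched pair (distance 3)
# plus one cardinality penalty (c = 5).
total, loc, card = ospa_sketch([(0.0, 0.0), (10.0, 10.0)], [(0.0, 3.0)], c=5.0, p=2.0)
print(round(total, 4))  # 4.1231, i.e. sqrt((3**2 + 5**2) / 2)
```

For realistic set sizes, an assignment solver would replace the permutation search.
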

pytcl.performance_evaluation.ospa_over_time(X_sequence, Y_sequence, c=100.0, p=2.0)

Compute OSPA metric over a time sequence.

Parameters:
  • X_sequence (list of list of ndarray) – Sequence of ground truth point sets.

  • Y_sequence (list of list of ndarray) – Sequence of estimated point sets.

  • c (float, optional) – Cutoff parameter (default: 100.0).

  • p (float, optional) – Order parameter (default: 2.0).

Returns:

OSPA values at each time step.

Return type:

ndarray

Raises:

ValueError – If sequences have different lengths.

Examples

>>> # Two time steps with ground truth and estimates
>>> X_seq = [[np.array([0, 0]), np.array([10, 10])],
...          [np.array([1, 0]), np.array([11, 10])]]
>>> Y_seq = [[np.array([0.5, 0]), np.array([10, 10.5])],
...          [np.array([1.5, 0]), np.array([11, 10.5])]]
>>> ospa_vals = ospa_over_time(X_seq, Y_seq, c=100, p=2)
>>> len(ospa_vals)
2

pytcl.performance_evaluation.track_purity(true_labels, estimated_labels)

Compute track purity metric.

Track purity measures how well estimated tracks correspond to single ground truth targets. A purity of 1.0 means each estimated track contains observations from only one true target.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

Returns:

Track purity score in [0, 1].

Return type:

float

Examples

>>> true_labels = np.array([0, 0, 0, 1, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 1, 1, 1])  # Perfect
>>> track_purity(true_labels, estimated_labels)
1.0
>>> estimated_labels = np.array([0, 0, 1, 1, 1, 1])  # Mixed
>>> track_purity(true_labels, estimated_labels)
0.833...
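
One way to read this score is as majority-label purity: each estimated track is credited with the observations that carry its most frequent true label. The sketch below assumes that definition (it reproduces the 5/6 of the mixed example above); track_purity_sketch is a hypothetical name and the value returned for empty input is an arbitrary choice.

```python
from collections import Counter, defaultdict


def track_purity_sketch(true_labels, estimated_labels):
    """Fraction of observations that carry their track's majority true label."""
    per_track = defaultdict(Counter)
    for target, track in zip(true_labels, estimated_labels):
        per_track[track][target] += 1
    total = sum(sum(counts.values()) for counts in per_track.values())
    if total == 0:
        return 0.0  # assumption: empty input scores 0
    majority = sum(max(counts.values()) for counts in per_track.values())
    return majority / total


# Track 0 holds two observations of target 0; track 1 holds one of
# target 0 and three of target 1, so 2 + 3 of 6 observations count.
purity = track_purity_sketch([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1])
print(round(purity, 3))  # 0.833
```
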

pytcl.performance_evaluation.track_fragmentation(true_labels, estimated_labels, time_indices=None)

Count number of track fragmentations.

A fragmentation occurs when observations from a single ground truth target are split across multiple estimated tracks.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

  • time_indices (ndarray, optional) – Time indices for each observation (for temporal ordering).

Returns:

Number of fragmentations.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 0, 0])
>>> estimated_labels = np.array([0, 0, 1, 1])  # One fragmentation
>>> track_fragmentation(true_labels, estimated_labels)
1

pytcl.performance_evaluation.identity_switches(true_labels, estimated_labels, time_indices=None)

Count number of identity switches.

An identity switch occurs when an estimated track changes which ground truth target it is associated with.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

  • time_indices (ndarray, optional) – Time indices for each observation.

Returns:

Number of identity switches.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 0])  # Track 0 switches targets
>>> identity_switches(true_labels, estimated_labels)
1
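
The counting rule described above can be sketched as follows, assuming a switch is registered whenever consecutive observations of the same estimated track (in time order) carry different true labels. identity_switches_sketch is a hypothetical name, and the library may resolve ties or gaps differently.

```python
def identity_switches_sketch(true_labels, estimated_labels, time_indices=None):
    """Count changes of true label along each estimated track, in time order."""
    n = len(true_labels)
    times = list(time_indices) if time_indices is not None else list(range(n))
    # Stable sort: observations with equal time keep their input order.
    order = sorted(range(n), key=lambda i: times[i])

    last_seen = {}  # estimated track -> true label at its previous observation
    switches = 0
    for i in order:
        track, target = estimated_labels[i], true_labels[i]
        if track in last_seen and last_seen[track] != target:
            switches += 1
        last_seen[track] = target
    return switches


# Track 0 follows target 0 for two steps, then target 1.
n_switches = identity_switches_sketch([0, 0, 1, 1], [0, 0, 0, 0])
print(n_switches)  # 1
```

Swapping the two label arguments counts, for each true target, how often its estimated track label changes, which is one plausible reading of a fragmentation count.
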

pytcl.performance_evaluation.mot_metrics(ground_truth, estimates, threshold=10.0)

Compute CLEAR MOT metrics.

Parameters:
  • ground_truth (list of list of ndarray) – Ground truth positions at each time step.

  • estimates (list of list of ndarray) – Estimated positions at each time step.

  • threshold (float, optional) – Distance threshold for valid associations (default: 10.0).

Returns:

Named tuple containing MOTA, MOTP, and counts.

Return type:

MOTMetrics

Notes

MOTA (Multiple Object Tracking Accuracy) accounts for false positives, misses, and identity switches. MOTP (Precision) measures localization accuracy for correctly matched pairs.

Examples

>>> gt = [[np.array([0, 0]), np.array([10, 10])],
...       [np.array([1, 0]), np.array([11, 10])]]
>>> est = [[np.array([0.5, 0]), np.array([10.5, 10])],
...        [np.array([1.5, 0]), np.array([11.5, 10])]]
>>> result = mot_metrics(gt, est, threshold=5.0)
>>> result.mota  # High accuracy with small errors
1.0
>>> result.motp < 1.0  # Some localization error
True
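
Under the usual CLEAR MOT definitions, which this page does not spell out and which are therefore assumed here, both scores follow directly from per-frame bookkeeping: MOTA is one minus the ratio of all errors to the number of ground-truth objects, and MOTP is the mean distance over matched pairs. The helper names are hypothetical and the handling of empty inputs is an arbitrary choice.

```python
def mota_from_counts(num_misses, num_false_positives, num_switches, num_gt):
    """MOTA = 1 - (misses + false positives + switches) / ground-truth objects."""
    if num_gt == 0:
        return 0.0  # assumption: no ground truth scores 0
    errors = num_misses + num_false_positives + num_switches
    return 1.0 - errors / num_gt


def motp_from_matches(matched_distances):
    """MOTP as the mean distance over all matched (truth, estimate) pairs."""
    if not matched_distances:
        return 0.0  # assumption: no matches scores 0
    return sum(matched_distances) / len(matched_distances)


# 20 ground-truth objects summed over all frames, 4 errors in total.
mota = mota_from_counts(num_misses=2, num_false_positives=1, num_switches=1, num_gt=20)
# Four matched pairs, each 0.5 apart.
motp = motp_from_matches([0.5, 0.5, 0.5, 0.5])
print(mota, motp)  # 0.8 0.5
```

With this convention MOTA can become negative when the errors outnumber the ground-truth objects, and a smaller MOTP means better localization.
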

class pytcl.performance_evaluation.ConsistencyResult(is_consistent, statistic, lower_bound, upper_bound, mean_value)

Bases: NamedTuple

Result of consistency test.

is_consistent

Whether the estimator is consistent.

Type:

bool

statistic

Test statistic value.

Type:

float

lower_bound

Lower confidence bound.

Type:

float

upper_bound

Upper confidence bound.

Type:

float

mean_value

Mean of the test statistic.

Type:

float

pytcl.performance_evaluation.rmse(true_states, estimated_states, axis=None)

Compute Root Mean Square Error.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim) or (N,).

  • estimated_states (ndarray) – Estimated state values, same shape as true_states.

  • axis (int, optional) – Axis over which to compute RMSE:

      - None: RMSE over all elements (scalar result)
      - 0: RMSE for each state component (vector result)
      - 1: RMSE for each time step (vector result)

Returns:

Root mean square error.

Return type:

ndarray or float

Examples

>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.2, 0.9], [1.8, 2.1]])
>>> rmse(true, est)  # Scalar RMSE
0.141...
>>> rmse(true, est, axis=0)  # Per-component RMSE
array([0.173..., 0.1...])
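
The three axis settings can be written out explicitly. The formulas below assume the conventional mean-over-axis reading of the parameter description, with errors e_{kj} = \hat{x}_{kj} - x_{kj} for time step k = 1, ..., N and state component j = 1, ..., d; the symbols N, d and e are introduced here for illustration.

```latex
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{N d} \sum_{k=1}^{N} \sum_{j=1}^{d} e_{kj}^{2}}
  && \text{(axis=None, scalar)} \\
\mathrm{RMSE}_{j} &= \sqrt{\frac{1}{N} \sum_{k=1}^{N} e_{kj}^{2}}
  && \text{(axis=0, one value per component)} \\
\mathrm{RMSE}_{k} &= \sqrt{\frac{1}{d} \sum_{j=1}^{d} e_{kj}^{2}}
  && \text{(axis=1, one value per time step)}
\end{aligned}
```
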

pytcl.performance_evaluation.position_rmse(true_states, estimated_states, position_indices)

Compute RMSE for position components only.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated state values, shape (N, state_dim).

  • position_indices (list of int) – Indices of position components in state vector.

Returns:

Position RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], positions are indices [0, 2]
>>> true = np.array([[0, 1, 0, 1], [1, 1, 1, 1]])
>>> est = np.array([[0.1, 1, -0.1, 1], [1.2, 1, 0.9, 1]])
>>> position_rmse(true, est, [0, 2])
0.141...

pytcl.performance_evaluation.velocity_rmse(true_states, estimated_states, velocity_indices)

Compute RMSE for velocity components only.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated state values, shape (N, state_dim).

  • velocity_indices (list of int) – Indices of velocity components in state vector.

Returns:

Velocity RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], velocities are indices [1, 3]
>>> true = np.array([[0, 10, 0, 5], [1, 10, 0.5, 5]])
>>> est = np.array([[0, 9.5, 0, 5.2], [1, 10.2, 0.5, 4.9]])
>>> velocity_rmse(true, est, [1, 3])
0.316...

pytcl.performance_evaluation.nees(true_state, estimated_state, covariance)

Compute Normalized Estimation Error Squared (NEES).

NEES is a measure of filter consistency. For a properly tuned filter, the average NEES should be close to the state dimension.

Parameters:
  • true_state (ndarray) – True state vector, shape (state_dim,).

  • estimated_state (ndarray) – Estimated state vector, shape (state_dim,).

  • covariance (ndarray) – Estimation covariance, shape (state_dim, state_dim).

Returns:

NEES value (chi-squared distributed with df=state_dim).

Return type:

float

Notes

NEES = (x_true - x_est)^T P^{-1} (x_true - x_est)

For a consistent filter, NEES should follow a chi-squared distribution with degrees of freedom equal to the state dimension.

Examples

>>> true = np.array([1.0, 2.0])
>>> est = np.array([1.1, 1.9])
>>> P = np.eye(2) * 0.1
>>> nees(true, est, P)
0.2
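
Worked by hand, the example above evaluates the quadratic form from the Notes as follows (e is shorthand introduced here for the error vector):

```latex
e = x_{\text{true}} - x_{\text{est}} = \begin{bmatrix} -0.1 \\ 0.1 \end{bmatrix},
\qquad
P^{-1} = \begin{bmatrix} 10 & 0 \\ 0 & 10 \end{bmatrix},
\qquad
\mathrm{NEES} = e^{\top} P^{-1} e = 10\,(0.01 + 0.01) = 0.2
```
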

pytcl.performance_evaluation.nees_sequence(true_states, estimated_states, covariances)

Compute NEES for a sequence of estimates.

Parameters:
  • true_states (ndarray) – True states, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated states, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

NEES values for each time step, shape (N,).

Return type:

ndarray

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4]])
>>> P = np.array([np.eye(2) * 0.1, np.eye(2) * 0.1])
>>> nees_vals = nees_sequence(true, est, P)
>>> len(nees_vals)
2

pytcl.performance_evaluation.average_nees(true_states, estimated_states, covariances)

Compute average NEES over a sequence.

Parameters:
  • true_states (ndarray) – True states, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated states, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

Average NEES (should be close to state_dim for consistent filter).

Return type:

float

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4], [2.1, 2.9]])
>>> P = np.array([np.eye(2) * 0.1] * 3)
>>> avg = average_nees(true, est, P)
>>> avg  # Well below state_dim=2: P overstates the actual errors here
0.2

pytcl.performance_evaluation.nis(innovation, innovation_covariance)

Compute Normalized Innovation Squared (NIS).

NIS is similar to NEES but is computed in measurement space. It is used to verify that the measurement model is consistent.

Parameters:
  • innovation (ndarray) – Innovation (measurement residual) vector.

  • innovation_covariance (ndarray) – Innovation covariance matrix S.

Returns:

NIS value (chi-squared distributed with df=meas_dim).

Return type:

float

Notes

NIS = nu^T S^{-1} nu

where nu = z - H*x_pred is the innovation and S is the innovation covariance.

Examples

>>> nu = np.array([0.5, -0.3])  # Innovation vector
>>> S = np.eye(2) * 0.25  # Innovation covariance
>>> nis(nu, S)
1.36

pytcl.performance_evaluation.nis_sequence(innovations, innovation_covariances)

Compute NIS for a sequence of innovations.

Parameters:
  • innovations (ndarray) – Innovation vectors, shape (N, meas_dim).

  • innovation_covariances (ndarray) – Innovation covariances, shape (N, meas_dim, meas_dim).

Returns:

NIS values for each time step.

Return type:

ndarray

Examples

>>> innovations = np.array([[0.5, -0.3], [0.2, 0.1]])
>>> S = np.array([np.eye(2) * 0.25, np.eye(2) * 0.25])
>>> nis_vals = nis_sequence(innovations, S)
>>> len(nis_vals)
2

pytcl.performance_evaluation.consistency_test(nees_or_nis_values, df, confidence=0.95)

Perform chi-squared consistency test on NEES or NIS values.

Tests whether the average NEES/NIS falls within expected confidence bounds for a consistent estimator.

Parameters:
  • nees_or_nis_values (ndarray) – NEES or NIS values from multiple time steps or Monte Carlo runs.

  • df (int) – Degrees of freedom (state_dim for NEES, meas_dim for NIS).

  • confidence (float, optional) – Confidence level (default: 0.95).

Returns:

Named tuple with test results.

Return type:

ConsistencyResult

Examples

>>> np.random.seed(42)
>>> # Simulate NEES from chi-squared (consistent filter)
>>> nees_vals = np.random.chisquare(df=4, size=100)
>>> result = consistency_test(nees_vals, df=4)
>>> result.is_consistent
True
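
The bounds of such a test can be sketched with the standard library alone. This assumes the textbook form of the test: the sum of N independent chi-squared(df) samples is chi-squared with N*df degrees of freedom, so the acceptance interval for the average is that distribution's two-sided quantiles divided by N. Whether consistency_test computes its bounds this way is not stated on this page. The quantile uses the Wilson-Hilferty approximation in place of an exact inverse CDF, and both function names are hypothetical.

```python
from statistics import NormalDist


def chi2_quantile_wh(prob, k):
    """Approximate chi-squared quantile (Wilson-Hilferty); accurate for large k."""
    z = NormalDist().inv_cdf(prob)
    a = 2.0 / (9.0 * k)
    return k * (1.0 - a + z * a ** 0.5) ** 3


def average_bounds(n_samples, df, confidence=0.95):
    """Two-sided bounds for the mean of n_samples chi-squared(df) values."""
    alpha = 1.0 - confidence
    k = n_samples * df  # degrees of freedom of the sum of all samples
    lower = chi2_quantile_wh(alpha / 2.0, k) / n_samples
    upper = chi2_quantile_wh(1.0 - alpha / 2.0, k) / n_samples
    return lower, upper


# 100 NEES samples from a 4-dimensional state.
lower, upper = average_bounds(n_samples=100, df=4)
print(round(lower, 2), round(upper, 2))  # 3.46 4.57
```

An average NEES below the lower bound suggests an overly pessimistic covariance; one above the upper bound suggests an overconfident filter.
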

pytcl.performance_evaluation.credibility_interval(errors, covariances, interval=0.95)

Compute fraction of errors within credibility interval.

For a consistent estimator, approximately a fraction interval of the errors (95% by default) should fall within the corresponding credibility region.

Parameters:
  • errors (ndarray) – Estimation errors, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

  • interval (float, optional) – Credibility interval (default: 0.95).

Returns:

Fraction of errors within the interval.

Return type:

float

Examples

>>> rng = np.random.default_rng(42)
>>> errors = rng.normal(0, 0.1, (100, 2))  # Errors with standard deviation 0.1 (variance 0.01)
>>> P = np.array([np.eye(2) * 0.1] * 100)  # Conservative covariance (variance 0.1)
>>> frac = credibility_interval(errors, P, interval=0.95)
>>> frac > 0.9  # Most errors within interval
True

pytcl.performance_evaluation.monte_carlo_rmse(errors, axis=0)

Compute RMSE from Monte Carlo simulation errors.

Parameters:
  • errors (ndarray) – Estimation errors from multiple runs, shape (N_runs, N_time, state_dim) or (N_runs, state_dim).

  • axis (int, optional) – Axis representing Monte Carlo runs (default: 0).

Returns:

RMSE values.

Return type:

ndarray

Examples

>>> # 3 Monte Carlo runs, 2 time steps, 2 state components
>>> errors = np.array([[[0.1, 0.2], [0.15, 0.1]],
...                    [[0.05, 0.1], [0.2, 0.15]],
...                    [[0.15, 0.05], [0.1, 0.2]]])
>>> rmse_per_time = monte_carlo_rmse(errors, axis=0)
>>> rmse_per_time.shape
(2, 2)

pytcl.performance_evaluation.estimation_error_bounds(covariances, sigma=2.0)

Compute estimation error bounds from covariances.

Parameters:
  • covariances (ndarray) – Covariance matrices, shape (N, state_dim, state_dim).

  • sigma (float, optional) – Number of standard deviations for bounds (default: 2.0).

Returns:

Error bounds (sigma times the standard deviation of each component), shape (N, state_dim).

Return type:

ndarray

Examples

>>> P = np.array([[[1.0, 0], [0, 4.0]],
...               [[0.25, 0], [0, 1.0]]])
>>> bounds = estimation_error_bounds(P, sigma=2.0)
>>> bounds[0]  # 2-sigma bounds: 2*sqrt(1), 2*sqrt(4)
array([2., 4.])
>>> bounds[1]  # 2-sigma bounds: 2*sqrt(0.25), 2*sqrt(1)
array([1., 2.])
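
The comments in the example suggest each bound is sigma times the square root of the matching diagonal entry of the covariance. A dependency-free sketch of that reading follows; error_bounds_sketch is a hypothetical name, and off-diagonal terms are ignored by construction.

```python
import math


def error_bounds_sketch(covariances, sigma=2.0):
    """Return sigma * sqrt(diag(P)) for each covariance matrix P."""
    return [
        [sigma * math.sqrt(P[i][i]) for i in range(len(P))]
        for P in covariances
    ]


P_seq = [
    [[1.0, 0.0], [0.0, 4.0]],
    [[0.25, 0.0], [0.0, 1.0]],
]
bounds = error_bounds_sketch(P_seq, sigma=2.0)
print(bounds)  # [[2.0, 4.0], [1.0, 2.0]]
```
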

Track Metrics

Multi-target tracking performance metrics (OSPA, CLEAR MOT).

Track performance metrics for multi-target tracking evaluation.

This module provides metrics for evaluating multi-target tracker performance, including OSPA (Optimal Sub-Pattern Assignment), track purity, and fragmentation.

class pytcl.performance_evaluation.track_metrics.OSPAResult(ospa, localization, cardinality)

Bases: NamedTuple

Result of OSPA metric computation.

ospa

Total OSPA distance.

Type:

float

localization

Localization component.

Type:

float

cardinality

Cardinality component.

Type:

float

class pytcl.performance_evaluation.track_metrics.MOTMetrics(mota, motp, num_switches, num_fragmentations, num_false_positives, num_misses)

Bases: NamedTuple

Multiple Object Tracking (MOT) metrics.

mota

Multiple Object Tracking Accuracy.

Type:

float

motp

Multiple Object Tracking Precision.

Type:

float

num_switches

Number of identity switches.

Type:

int

num_fragmentations

Number of track fragmentations.

Type:

int

num_false_positives

Number of false positive detections.

Type:

int

num_misses

Number of missed detections.

Type:

int

pytcl.performance_evaluation.track_metrics.ospa(X, Y, c=100.0, p=2.0)

Compute Optimal Sub-Pattern Assignment (OSPA) metric.

The OSPA metric provides a mathematically consistent measure of distance between two sets of points, accounting for both localization error and cardinality mismatch.

Parameters:
  • X (list of ndarray) – First set of points (e.g., ground truth).

  • Y (list of ndarray) – Second set of points (e.g., estimated tracks).

  • c (float, optional) – Cutoff parameter for localization error (default: 100.0).

  • p (float, optional) – Order parameter for the metric (default: 2.0).

Returns:

Named tuple containing:

  • ospa: Total OSPA distance

  • localization: Localization component

  • cardinality: Cardinality component

Return type:

OSPAResult

Notes

  • If both sets are empty, OSPA is 0.

  • The metric is symmetric: ospa(X, Y) = ospa(Y, X).

  • OSPA is a proper metric for any order p ≥ 1; with p=2 (default), per-point errors are combined in the L2 (Euclidean) sense.

Examples

>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([10, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> result.ospa
1.0

pytcl.performance_evaluation.track_metrics.ospa_over_time(X_sequence, Y_sequence, c=100.0, p=2.0)

Compute OSPA metric over a time sequence.

Parameters:
  • X_sequence (list of list of ndarray) – Sequence of ground truth point sets.

  • Y_sequence (list of list of ndarray) – Sequence of estimated point sets.

  • c (float, optional) – Cutoff parameter (default: 100.0).

  • p (float, optional) – Order parameter (default: 2.0).

Returns:

OSPA values at each time step.

Return type:

ndarray

Raises:

ValueError – If sequences have different lengths.

Examples

>>> # Two time steps with ground truth and estimates
>>> X_seq = [[np.array([0, 0]), np.array([10, 10])],
...          [np.array([1, 0]), np.array([11, 10])]]
>>> Y_seq = [[np.array([0.5, 0]), np.array([10, 10.5])],
...          [np.array([1.5, 0]), np.array([11, 10.5])]]
>>> ospa_vals = ospa_over_time(X_seq, Y_seq, c=100, p=2)
>>> len(ospa_vals)
2

pytcl.performance_evaluation.track_metrics.track_purity(true_labels, estimated_labels)

Compute track purity metric.

Track purity measures how well estimated tracks correspond to single ground truth targets. A purity of 1.0 means each estimated track contains observations from only one true target.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

Returns:

Track purity score in [0, 1].

Return type:

float

Examples

>>> true_labels = np.array([0, 0, 0, 1, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 1, 1, 1])  # Perfect
>>> track_purity(true_labels, estimated_labels)
1.0
>>> estimated_labels = np.array([0, 0, 1, 1, 1, 1])  # Mixed
>>> track_purity(true_labels, estimated_labels)
0.833...

pytcl.performance_evaluation.track_metrics.track_fragmentation(true_labels, estimated_labels, time_indices=None)

Count number of track fragmentations.

A fragmentation occurs when observations from a single ground truth target are split across multiple estimated tracks.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

  • time_indices (ndarray, optional) – Time indices for each observation (for temporal ordering).

Returns:

Number of fragmentations.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 0, 0])
>>> estimated_labels = np.array([0, 0, 1, 1])  # One fragmentation
>>> track_fragmentation(true_labels, estimated_labels)
1

pytcl.performance_evaluation.track_metrics.identity_switches(true_labels, estimated_labels, time_indices=None)

Count number of identity switches.

An identity switch occurs when an estimated track changes which ground truth target it is associated with.

Parameters:
  • true_labels (ndarray) – Ground truth target labels for each observation.

  • estimated_labels (ndarray) – Estimated track labels for each observation.

  • time_indices (ndarray, optional) – Time indices for each observation.

Returns:

Number of identity switches.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 0])  # Track 0 switches targets
>>> identity_switches(true_labels, estimated_labels)
1

pytcl.performance_evaluation.track_metrics.mot_metrics(ground_truth, estimates, threshold=10.0)

Compute CLEAR MOT metrics.

Parameters:
  • ground_truth (list of list of ndarray) – Ground truth positions at each time step.

  • estimates (list of list of ndarray) – Estimated positions at each time step.

  • threshold (float, optional) – Distance threshold for valid associations (default: 10.0).

Returns:

Named tuple containing MOTA, MOTP, and counts.

Return type:

MOTMetrics

Notes

MOTA (Multiple Object Tracking Accuracy) accounts for false positives, misses, and identity switches. MOTP (Precision) measures localization accuracy for correctly matched pairs.

Examples

>>> gt = [[np.array([0, 0]), np.array([10, 10])],
...       [np.array([1, 0]), np.array([11, 10])]]
>>> est = [[np.array([0.5, 0]), np.array([10.5, 10])],
...        [np.array([1.5, 0]), np.array([11.5, 10])]]
>>> result = mot_metrics(gt, est, threshold=5.0)
>>> result.mota  # High accuracy with small errors
1.0
>>> result.motp < 1.0  # Some localization error
True

Estimation Metrics

State estimation performance metrics (NEES, NIS).

Estimation performance metrics.

This module provides metrics for evaluating state estimation performance, including RMSE, NEES, NIS, and consistency tests.

class pytcl.performance_evaluation.estimation_metrics.ConsistencyResult(is_consistent, statistic, lower_bound, upper_bound, mean_value)

Bases: NamedTuple

Result of consistency test.

is_consistent

Whether the estimator is consistent.

Type:

bool

statistic

Test statistic value.

Type:

float

lower_bound

Lower confidence bound.

Type:

float

upper_bound

Upper confidence bound.

Type:

float

mean_value

Mean of the test statistic.

Type:

float

pytcl.performance_evaluation.estimation_metrics.rmse(true_states, estimated_states, axis=None)

Compute Root Mean Square Error.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim) or (N,).

  • estimated_states (ndarray) – Estimated state values, same shape as true_states.

  • axis (int, optional) – Axis over which to compute RMSE:

      - None: RMSE over all elements (scalar result)
      - 0: RMSE for each state component (vector result)
      - 1: RMSE for each time step (vector result)

Returns:

Root mean square error.

Return type:

ndarray or float

Examples

>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.2, 0.9], [1.8, 2.1]])
>>> rmse(true, est)  # Scalar RMSE
0.141...
>>> rmse(true, est, axis=0)  # Per-component RMSE
array([0.173..., 0.1...])

pytcl.performance_evaluation.estimation_metrics.position_rmse(true_states, estimated_states, position_indices)

Compute RMSE for position components only.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated state values, shape (N, state_dim).

  • position_indices (list of int) – Indices of position components in state vector.

Returns:

Position RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], positions are indices [0, 2]
>>> true = np.array([[0, 1, 0, 1], [1, 1, 1, 1]])
>>> est = np.array([[0.1, 1, -0.1, 1], [1.2, 1, 0.9, 1]])
>>> position_rmse(true, est, [0, 2])
0.141...

pytcl.performance_evaluation.estimation_metrics.velocity_rmse(true_states, estimated_states, velocity_indices)

Compute RMSE for velocity components only.

Parameters:
  • true_states (ndarray) – True state values, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated state values, shape (N, state_dim).

  • velocity_indices (list of int) – Indices of velocity components in state vector.

Returns:

Velocity RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], velocities are indices [1, 3]
>>> true = np.array([[0, 10, 0, 5], [1, 10, 0.5, 5]])
>>> est = np.array([[0, 9.5, 0, 5.2], [1, 10.2, 0.5, 4.9]])
>>> velocity_rmse(true, est, [1, 3])
0.316...

pytcl.performance_evaluation.estimation_metrics.nees(true_state, estimated_state, covariance)

Compute Normalized Estimation Error Squared (NEES).

NEES is a measure of filter consistency. For a properly tuned filter, the average NEES should be close to the state dimension.

Parameters:
  • true_state (ndarray) – True state vector, shape (state_dim,).

  • estimated_state (ndarray) – Estimated state vector, shape (state_dim,).

  • covariance (ndarray) – Estimation covariance, shape (state_dim, state_dim).

Returns:

NEES value (chi-squared distributed with df=state_dim).

Return type:

float

Notes

NEES = (x_true - x_est)^T P^{-1} (x_true - x_est)

For a consistent filter, NEES should follow a chi-squared distribution with degrees of freedom equal to the state dimension.

Examples

>>> true = np.array([1.0, 2.0])
>>> est = np.array([1.1, 1.9])
>>> P = np.eye(2) * 0.1
>>> nees(true, est, P)
0.2

pytcl.performance_evaluation.estimation_metrics.nees_sequence(true_states, estimated_states, covariances)

Compute NEES for a sequence of estimates.

Parameters:
  • true_states (ndarray) – True states, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated states, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

NEES values for each time step, shape (N,).

Return type:

ndarray

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4]])
>>> P = np.array([np.eye(2) * 0.1, np.eye(2) * 0.1])
>>> nees_vals = nees_sequence(true, est, P)
>>> len(nees_vals)
2

pytcl.performance_evaluation.estimation_metrics.average_nees(true_states, estimated_states, covariances)

Compute average NEES over a sequence.

Parameters:
  • true_states (ndarray) – True states, shape (N, state_dim).

  • estimated_states (ndarray) – Estimated states, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

Average NEES (should be close to state_dim for consistent filter).

Return type:

float

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4], [2.1, 2.9]])
>>> P = np.array([np.eye(2) * 0.1] * 3)
>>> avg = average_nees(true, est, P)
>>> avg  # Well below state_dim=2: P overstates the actual errors here
0.2

pytcl.performance_evaluation.estimation_metrics.nis(innovation, innovation_covariance)

Compute Normalized Innovation Squared (NIS).

NIS is similar to NEES but is computed in measurement space. It is used to verify that the measurement model is consistent.

Parameters:
  • innovation (ndarray) – Innovation (measurement residual) vector.

  • innovation_covariance (ndarray) – Innovation covariance matrix S.

Returns:

NIS value (chi-squared distributed with df=meas_dim).

Return type:

float

Notes

NIS = nu^T S^{-1} nu

where nu = z - H*x_pred is the innovation and S is the innovation covariance.

Examples

>>> nu = np.array([0.5, -0.3])  # Innovation vector
>>> S = np.eye(2) * 0.25  # Innovation covariance
>>> nis(nu, S)
1.36
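
For a full (non-diagonal) innovation covariance, the quadratic form in the Notes can be evaluated without forming an explicit inverse. The sketch below is one way to do that in plain Python and is not taken from the library: it factors S = L L^T by Cholesky decomposition and uses nu^T S^{-1} nu = ||L^{-1} nu||^2. The function names are hypothetical, and S is assumed symmetric positive definite.

```python
import math


def cholesky_lower(S):
    """Lower-triangular L with L L^T = S, for symmetric positive-definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def nis_sketch(innovation, innovation_covariance):
    """Compute nu^T S^{-1} nu as ||L^{-1} nu||^2 via forward substitution."""
    L = cholesky_lower(innovation_covariance)
    y = []  # solution of L y = nu, built one component at a time
    for i, nu_i in enumerate(innovation):
        y.append((nu_i - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return sum(v * v for v in y)


# Same inputs as the example above.
value = nis_sketch([0.5, -0.3], [[0.25, 0.0], [0.0, 0.25]])
print(round(value, 2))  # 1.36

# Correlated measurement noise.
correlated = nis_sketch([1.0, 1.0], [[2.0, 1.0], [1.0, 2.0]])
print(round(correlated, 4))  # 0.6667
```
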

pytcl.performance_evaluation.estimation_metrics.nis_sequence(innovations, innovation_covariances)

Compute NIS for a sequence of innovations.

Parameters:
  • innovations (ndarray) – Innovation vectors, shape (N, meas_dim).

  • innovation_covariances (ndarray) – Innovation covariances, shape (N, meas_dim, meas_dim).

Returns:

NIS values for each time step.

Return type:

ndarray

Examples

>>> innovations = np.array([[0.5, -0.3], [0.2, 0.1]])
>>> S = np.array([np.eye(2) * 0.25, np.eye(2) * 0.25])
>>> nis_vals = nis_sequence(innovations, S)
>>> len(nis_vals)
2

pytcl.performance_evaluation.estimation_metrics.consistency_test(nees_or_nis_values, df, confidence=0.95)

Perform chi-squared consistency test on NEES or NIS values.

Tests whether the average NEES/NIS falls within expected confidence bounds for a consistent estimator.

Parameters:
  • nees_or_nis_values (ndarray) – NEES or NIS values from multiple time steps or Monte Carlo runs.

  • df (int) – Degrees of freedom (state_dim for NEES, meas_dim for NIS).

  • confidence (float, optional) – Confidence level (default: 0.95).

Returns:

Named tuple with test results.

Return type:

ConsistencyResult

Examples

>>> np.random.seed(42)
>>> # Simulate NEES from chi-squared (consistent filter)
>>> nees_vals = np.random.chisquare(df=4, size=100)
>>> result = consistency_test(nees_vals, df=4)
>>> result.is_consistent
True

pytcl.performance_evaluation.estimation_metrics.credibility_interval(errors, covariances, interval=0.95)

Compute fraction of errors within credibility interval.

For a consistent estimator, approximately a fraction interval of the errors (95% by default) should fall within the corresponding credibility region.

Parameters:
  • errors (ndarray) – Estimation errors, shape (N, state_dim).

  • covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

  • interval (float, optional) – Credibility interval (default: 0.95).

Returns:

Fraction of errors within the interval.

Return type:

float

Examples

>>> rng = np.random.default_rng(42)
>>> errors = rng.normal(0, 0.1, (100, 2))  # Errors with standard deviation 0.1 (variance 0.01)
>>> P = np.array([np.eye(2) * 0.1] * 100)  # Conservative covariance (variance 0.1)
>>> frac = credibility_interval(errors, P, interval=0.95)
>>> frac > 0.9  # Most errors within interval
True

pytcl.performance_evaluation.estimation_metrics.monte_carlo_rmse(errors, axis=0)

Compute RMSE from Monte Carlo simulation errors.

Parameters:
  • errors (ndarray) – Estimation errors from multiple runs, shape (N_runs, N_time, state_dim) or (N_runs, state_dim).

  • axis (int, optional) – Axis representing Monte Carlo runs (default: 0).

Returns:

RMSE values.

Return type:

ndarray

Examples

>>> # 3 Monte Carlo runs, 2 time steps, 2 state components
>>> errors = np.array([[[0.1, 0.2], [0.15, 0.1]],
...                    [[0.05, 0.1], [0.2, 0.15]],
...                    [[0.15, 0.05], [0.1, 0.2]]])
>>> rmse_per_time = monte_carlo_rmse(errors, axis=0)
>>> rmse_per_time.shape
(2, 2)

pytcl.performance_evaluation.estimation_metrics.estimation_error_bounds(covariances, sigma=2.0)

Compute estimation error bounds from covariances.

Parameters:
  • covariances (ndarray) – Covariance matrices, shape (N, state_dim, state_dim).

  • sigma (float, optional) – Number of standard deviations for bounds (default: 2.0).

Returns:

Error bounds (sigma times the standard deviation of each component), shape (N, state_dim).

Return type:

ndarray

Examples

>>> P = np.array([[[1.0, 0], [0, 4.0]],
...               [[0.25, 0], [0, 1.0]]])
>>> bounds = estimation_error_bounds(P, sigma=2.0)
>>> bounds[0]  # 2-sigma bounds: 2*sqrt(1), 2*sqrt(4)
array([2., 4.])
>>> bounds[1]  # 2-sigma bounds: 2*sqrt(0.25), 2*sqrt(1)
array([1., 2.])