Performance Evaluation

Track and estimation performance metrics.

Performance evaluation module.

This module provides metrics for evaluating tracking and estimation performance, including:

Track metrics: OSPA, MOTA/MOTP, track purity, fragmentation
Estimation metrics: RMSE, NEES, NIS, consistency tests

Examples

>>> from pytcl.performance_evaluation import ospa, rmse, nees
>>> import numpy as np

>>> # OSPA between two point sets
>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([9, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> print(f"OSPA: {result.ospa:.2f}")
OSPA: 1.22

>>> # RMSE between true and estimated states
>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.1, 0.9], [2.0, 2.1]])
>>> print(f"RMSE: {rmse(true, est):.3f}")
RMSE: 0.091

class pytcl.performance_evaluation.OSPAResult(ospa, localization, cardinality)[source]

Bases: NamedTuple

Result of OSPA metric computation.

ospa

Total OSPA distance.

Type: float

localization

Localization component.

Type: float

cardinality

Cardinality component.

Type: float

ospa: float: Alias for field number 0

localization: float: Alias for field number 1

cardinality: float: Alias for field number 2

class pytcl.performance_evaluation.MOTMetrics(mota, motp, num_switches, num_fragmentations, num_false_positives, num_misses)[source]

Bases: NamedTuple

Multiple Object Tracking (MOT) metrics.

mota

Multiple Object Tracking Accuracy.

Type: float

motp

Multiple Object Tracking Precision.

Type: float

num_switches

Number of identity switches.

Type: int

num_fragmentations

Number of track fragmentations.

Type: int

num_false_positives

Number of false positive detections.

Type: int

num_misses

Number of missed detections.

Type: int

mota: float: Alias for field number 0

motp: float: Alias for field number 1

num_switches: int: Alias for field number 2

num_fragmentations: int: Alias for field number 3

num_false_positives: int: Alias for field number 4

num_misses: int: Alias for field number 5

pytcl.performance_evaluation.ospa(X, Y, c=100.0, p=2.0)[source]

Compute Optimal Sub-Pattern Assignment (OSPA) metric.

The OSPA metric provides a mathematically consistent measure of distance between two sets of points, accounting for both localization error and cardinality mismatch.

Parameters:

X (list of ndarray) – First set of points (e.g., ground truth).
Y (list of ndarray) – Second set of points (e.g., estimated tracks).
c (float, optional) – Cutoff parameter for localization error (default: 100.0).
p (float, optional) – Order parameter for the metric (default: 2.0).

Returns:

Named tuple containing:

ospa: Total OSPA distance.
localization: Localization component.
cardinality: Cardinality component.

Return type:

OSPAResult

Notes

If both sets are empty, OSPA is 0.
The metric is symmetric: ospa(X, Y) = ospa(Y, X).
OSPA is a proper metric for any p >= 1; with p=2 (default), per-point errors are combined in the root-mean-square (L2) sense.

Examples

>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([10, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> result.ospa
1.0
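The definition above can be illustrated with a small brute-force sketch. It is not the library's implementation: the name ospa_sketch is hypothetical, every assignment is enumerated (so it is only practical for a handful of points), and the split into localization and cardinality components is one common convention that is assumed here rather than taken from pytcl.

```python
from itertools import permutations

import numpy as np


def ospa_sketch(X, Y, c=100.0, p=2.0):
    """Brute-force OSPA for small sets: (total, localization, cardinality)."""
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0, 0.0, 0.0
    if m > n:
        # The metric is symmetric, so treat the smaller set as X.
        X, Y, m, n = Y, X, n, m
    # Pairwise Euclidean distances, saturated at the cutoff c.
    D = np.zeros((m, n))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
            D[i, j] = min(np.linalg.norm(diff), c)
    # Best assignment of the m points of X to distinct points of Y.
    loc_sum = min(
        sum(D[i, j] ** p for i, j in enumerate(perm))
        for perm in permutations(range(n), m)
    )
    # Each unmatched point costs the full cutoff c.
    card_sum = (c ** p) * (n - m)
    total = ((loc_sum + card_sum) / n) ** (1.0 / p)
    localization = (loc_sum / n) ** (1.0 / p)
    cardinality = (card_sum / n) ** (1.0 / p)
    return float(total), float(localization), float(cardinality)


# One exact match plus one missed target, with cutoff c=10.
total, loc, card = ospa_sketch(
    [np.array([0.0, 0.0])],
    [np.array([0.0, 0.0]), np.array([5.0, 5.0])],
    c=10.0,
    p=2.0,
)
print(f"{total:.3f}")  # 7.071
```

For larger sets, an assignment solver such as scipy.optimize.linear_sum_assignment is the usual replacement for the permutation loop.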

pytcl.performance_evaluation.ospa_over_time(X_sequence, Y_sequence, c=100.0, p=2.0)[source]

Compute OSPA metric over a time sequence.

Parameters:

X_sequence (list of list of ndarray) – Sequence of ground truth point sets.
Y_sequence (list of list of ndarray) – Sequence of estimated point sets.
c (float, optional) – Cutoff parameter (default: 100.0).
p (float, optional) – Order parameter (default: 2.0).

Returns:

OSPA values at each time step.

Return type:

ndarray

Raises:

ValueError – If sequences have different lengths.

Examples

>>> # Two time steps with ground truth and estimates
>>> X_seq = [[np.array([0, 0]), np.array([10, 10])],
...          [np.array([1, 0]), np.array([11, 10])]]
>>> Y_seq = [[np.array([0.5, 0]), np.array([10, 10.5])],
...          [np.array([1.5, 0]), np.array([11, 10.5])]]
>>> ospa_vals = ospa_over_time(X_seq, Y_seq, c=100, p=2)
>>> len(ospa_vals)
2

pytcl.performance_evaluation.track_purity(true_labels, estimated_labels)[source]

Compute track purity metric.

Track purity measures how well estimated tracks correspond to single ground truth targets. A purity of 1.0 means each estimated track contains observations from only one true target.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.

Returns:

Track purity score in [0, 1].

Return type:

float

Examples

>>> true_labels = np.array([0, 0, 0, 1, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 1, 1, 1])  # Perfect
>>> track_purity(true_labels, estimated_labels)
1.0
>>> estimated_labels = np.array([0, 0, 1, 1, 1, 1])  # Mixed
>>> track_purity(true_labels, estimated_labels)
0.833...
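One way to arrive at the 0.833 above is to credit each estimated track with its most frequent true label and divide by the number of observations. The sketch below assumes that definition; track_purity_sketch is a hypothetical name, and the library may handle edge cases (such as empty input) differently.

```python
import numpy as np


def track_purity_sketch(true_labels, estimated_labels):
    """Fraction of observations that carry their track's majority true label."""
    true_labels = np.asarray(true_labels)
    estimated_labels = np.asarray(estimated_labels)
    majority_total = 0
    for track in np.unique(estimated_labels):
        # True labels of the observations assigned to this estimated track.
        members = true_labels[estimated_labels == track]
        _, counts = np.unique(members, return_counts=True)
        majority_total += int(counts.max())
    # Assumes at least one observation.
    return majority_total / true_labels.size


purity = track_purity_sketch([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1])
print(f"{purity:.3f}")  # 0.833
```

Track 0 contributes 2 of 2 observations and track 1 contributes 3 of 4, giving 5/6.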

pytcl.performance_evaluation.track_fragmentation(true_labels, estimated_labels, time_indices=None)[source]

Count number of track fragmentations.

A fragmentation occurs when observations from a single ground truth target are split across multiple estimated tracks.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.
time_indices (ndarray, optional) – Time indices for each observation (for temporal ordering).

Returns:

Number of fragmentations.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 0, 0])
>>> estimated_labels = np.array([0, 0, 1, 1])  # One fragmentation
>>> track_fragmentation(true_labels, estimated_labels)
1

pytcl.performance_evaluation.identity_switches(true_labels, estimated_labels, time_indices=None)[source]

Count number of identity switches.

An identity switch occurs when an estimated track changes which ground truth target it is associated with.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.
time_indices (ndarray, optional) – Time indices for each observation.

Returns:

Number of identity switches.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 0])  # Track 0 switches targets
>>> identity_switches(true_labels, estimated_labels)
1
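Fragmentations and identity switches can both be viewed as label changes along time: a fragmentation is a change of estimated label within one true target, and an identity switch is a change of true label within one estimated track. The sketch below counts them that way. The helper name count_label_changes is hypothetical, and the counting rule is an assumption; the library may use a different rule (for example, the number of distinct tracks minus one).

```python
import numpy as np


def count_label_changes(group_labels, other_labels, time_indices=None):
    """Within each group, count how often the other label changes over time."""
    group_labels = np.asarray(group_labels)
    other_labels = np.asarray(other_labels)
    if time_indices is not None:
        # Stable sort keeps the input order for equal time stamps.
        order = np.argsort(np.asarray(time_indices), kind="stable")
        group_labels = group_labels[order]
        other_labels = other_labels[order]
    changes = 0
    for group in np.unique(group_labels):
        seq = other_labels[group_labels == group]
        changes += int(np.count_nonzero(seq[1:] != seq[:-1]))
    return changes


# Fragmentation: group by true target, watch the estimated label.
print(count_label_changes([0, 0, 0, 0], [0, 0, 1, 1]))  # 1

# Identity switch: group by estimated track, watch the true label.
true_labels = [0, 0, 1, 1]
estimated_labels = [0, 0, 0, 0]
print(count_label_changes(estimated_labels, true_labels))  # 1
```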

pytcl.performance_evaluation.mot_metrics(ground_truth, estimates, threshold=10.0)[source]

Compute CLEAR MOT metrics.

Parameters:

ground_truth (list of list of ndarray) – Ground truth positions at each time step.
estimates (list of list of ndarray) – Estimated positions at each time step.
threshold (float, optional) – Distance threshold for valid associations (default: 10.0).

Returns:

Named tuple containing MOTA, MOTP, and counts.

Return type:

MOTMetrics

Notes

MOTA (Multiple Object Tracking Accuracy) accounts for false positives, misses, and identity switches. MOTP (Precision) measures localization accuracy for correctly matched pairs.

Examples

>>> gt = [[np.array([0, 0]), np.array([10, 10])],
...       [np.array([1, 0]), np.array([11, 10])]]
>>> est = [[np.array([0.5, 0]), np.array([10.5, 10])],
...        [np.array([1.5, 0]), np.array([11.5, 10])]]
>>> result = mot_metrics(gt, est, threshold=5.0)
>>> result.mota  # High accuracy with small errors
1.0
>>> result.motp < 1.0  # Some localization error
True
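The relationship between the counts and the two scores can be sketched from the usual CLEAR MOT definitions: MOTA is one minus the total error count divided by the number of ground truth objects over all time steps, and MOTP is the mean distance over matched pairs. The function name and argument list below are hypothetical, and the handling of the no-match case is an assumption.

```python
def clear_mot_from_counts(num_misses, num_false_positives, num_switches,
                          num_gt_objects, matched_distances):
    """Return (MOTA, MOTP) from accumulated counts and matched-pair distances."""
    if num_gt_objects <= 0:
        raise ValueError("num_gt_objects must be positive")
    errors = num_misses + num_false_positives + num_switches
    mota = 1.0 - errors / num_gt_objects
    # Assumption: report 0.0 when nothing was matched.
    if len(matched_distances) == 0:
        motp = 0.0
    else:
        motp = sum(matched_distances) / len(matched_distances)
    return mota, motp


# Two time steps, two targets each, every estimate 0.5 away from its target.
mota, motp = clear_mot_from_counts(0, 0, 0, 4, [0.5, 0.5, 0.5, 0.5])
print(mota, motp)  # 1.0 0.5
```

With this distance-based MOTP, lower values are better; some toolkits instead report an overlap-based MOTP where higher is better.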

class pytcl.performance_evaluation.ConsistencyResult(is_consistent, statistic, lower_bound, upper_bound, mean_value)[source]

Bases: NamedTuple

Result of consistency test.

is_consistent

Whether the estimator is consistent.

Type: bool

statistic

Test statistic value.

Type: float

lower_bound

Lower confidence bound.

Type: float

upper_bound

Upper confidence bound.

Type: float

mean_value

Mean of the test statistic.

Type: float

is_consistent: bool: Alias for field number 0

statistic: float: Alias for field number 1

lower_bound: float: Alias for field number 2

upper_bound: float: Alias for field number 3

mean_value: float: Alias for field number 4

pytcl.performance_evaluation.rmse(true_states, estimated_states, axis=None)[source]

Compute Root Mean Square Error.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim) or (N,).
estimated_states (ndarray) – Estimated state values, same shape as true_states.
axis (int, optional) – Axis over which to compute RMSE:
    None: RMSE over all elements (scalar result).
    0: RMSE for each state component (vector result).
    1: RMSE for each time step (vector result).

Returns:

Root mean square error.

Return type:

ndarray or float

Examples

>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.2, 0.9], [1.8, 2.1]])
>>> rmse(true, est)  # Scalar RMSE
0.141...
>>> rmse(true, est, axis=0)  # Per-component RMSE
array([0.173..., 0.1...])
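The effect of the axis argument can be reproduced with plain NumPy. This is a minimal sketch of the textbook formula, assuming errors are squared, averaged along the chosen axis, and square-rooted; rmse_sketch is a hypothetical name, not the library function.

```python
import numpy as np


def rmse_sketch(true_states, estimated_states, axis=None):
    """Square root of the mean squared error along the given axis."""
    err = np.asarray(estimated_states, dtype=float) - np.asarray(true_states, dtype=float)
    return np.sqrt(np.mean(err ** 2, axis=axis))


true = np.zeros((2, 2))                   # 2 time steps, 2 state components
est = np.array([[3.0, 0.0], [4.0, 0.0]])

print(rmse_sketch(true, est))             # 2.5 (all four elements)
print(rmse_sketch(true, est, axis=0))     # one value per state component
print(rmse_sketch(true, est, axis=1))     # one value per time step
```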

pytcl.performance_evaluation.position_rmse(true_states, estimated_states, position_indices)[source]

Compute RMSE for position components only.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim).
estimated_states (ndarray) – Estimated state values, shape (N, state_dim).
position_indices (list of int) – Indices of position components in state vector.

Returns:

Position RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], positions are indices [0, 2]
>>> true = np.array([[0, 1, 0, 1], [1, 1, 1, 1]])
>>> est = np.array([[0.1, 1, -0.1, 1], [1.2, 1, 0.9, 1]])
>>> position_rmse(true, est, [0, 2])
0.141...

pytcl.performance_evaluation.velocity_rmse(true_states, estimated_states, velocity_indices)[source]

Compute RMSE for velocity components only.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim).
estimated_states (ndarray) – Estimated state values, shape (N, state_dim).
velocity_indices (list of int) – Indices of velocity components in state vector.

Returns:

Velocity RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], velocities are indices [1, 3]
>>> true = np.array([[0, 10, 0, 5], [1, 10, 0.5, 5]])
>>> est = np.array([[0, 9.5, 0, 5.2], [1, 10.2, 0.5, 4.9]])
>>> velocity_rmse(true, est, [1, 3])
0.316...

pytcl.performance_evaluation.nees(true_state, estimated_state, covariance)[source]

Compute Normalized Estimation Error Squared (NEES).

NEES is a measure of filter consistency. For a properly tuned filter, the average NEES should be close to the state dimension.

Parameters:

true_state (ndarray) – True state vector, shape (state_dim,).
estimated_state (ndarray) – Estimated state vector, shape (state_dim,).
covariance (ndarray) – Estimation covariance, shape (state_dim, state_dim).

Returns:

NEES value (chi-squared distributed with df=state_dim).

Return type:

float

Notes

NEES = (x_true - x_est)^T * P^{-1} * (x_true - x_est)

For a consistent filter, NEES should follow a chi-squared distribution with degrees of freedom equal to the state dimension.

Examples

>>> true = np.array([1.0, 2.0])
>>> est = np.array([1.1, 1.9])
>>> P = np.eye(2) * 0.1
>>> nees(true, est, P)
0.2
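The quadratic form in the Notes can be evaluated without forming the inverse of P explicitly. The sketch below is one way to do that with numpy.linalg.solve; nees_sketch is a hypothetical name, and the library may compute the value differently.

```python
import numpy as np


def nees_sketch(true_state, estimated_state, covariance):
    """Evaluate e^T P^{-1} e by solving P w = e instead of inverting P."""
    err = np.asarray(true_state, dtype=float) - np.asarray(estimated_state, dtype=float)
    weighted = np.linalg.solve(np.asarray(covariance, dtype=float), err)
    return float(err @ weighted)


# A correlated covariance: e = [1, 1], P = [[2, 1], [1, 2]] gives 2/3.
value = nees_sketch([1.0, 1.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])
print(f"{value:.4f}")  # 0.6667
```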

pytcl.performance_evaluation.nees_sequence(true_states, estimated_states, covariances)[source]

Compute NEES for a sequence of estimates.

Parameters:

true_states (ndarray) – True states, shape (N, state_dim).
estimated_states (ndarray) – Estimated states, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

NEES values for each time step, shape (N,).

Return type:

ndarray

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4]])
>>> P = np.array([np.eye(2) * 0.1, np.eye(2) * 0.1])
>>> nees_vals = nees_sequence(true, est, P)
>>> len(nees_vals)
2
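For a whole sequence, the same quadratic form can be evaluated in one batched call. This is a sketch under the documented shapes, (N, state_dim) for the states and (N, state_dim, state_dim) for the covariances; nees_sequence_sketch is a hypothetical name.

```python
import numpy as np


def nees_sequence_sketch(true_states, estimated_states, covariances):
    """Per-step NEES for arrays shaped (N, d), (N, d) and (N, d, d)."""
    err = np.asarray(true_states, dtype=float) - np.asarray(estimated_states, dtype=float)
    cov = np.asarray(covariances, dtype=float)
    # Solve P_k w_k = e_k for every step k; the trailing axis makes e_k a column.
    weighted = np.linalg.solve(cov, err[..., None])[..., 0]
    # Row-wise dot product e_k . w_k.
    return np.einsum("ni,ni->n", err, weighted)


true = np.array([[1.0, 2.0], [1.5, 2.5]])
est = np.array([[1.1, 1.9], [1.6, 2.4]])
P = np.array([np.eye(2) * 0.1, np.eye(2) * 0.1])
nees_vals = nees_sequence_sketch(true, est, P)
print(nees_vals.shape)  # (2,)
```

Averaging the returned array gives the quantity that is compared against state_dim when checking consistency.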

pytcl.performance_evaluation.average_nees(true_states, estimated_states, covariances)[source]

Compute average NEES over a sequence.

Parameters:

true_states (ndarray) – True states, shape (N, state_dim).
estimated_states (ndarray) – Estimated states, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

Average NEES (should be close to state_dim for consistent filter).

Return type:

float

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4], [2.1, 2.9]])
>>> P = np.array([np.eye(2) * 0.1] * 3)
>>> avg = average_nees(true, est, P)
>>> avg  # A consistent filter averages near state_dim=2; 0.2 means P is conservative here
0.2

pytcl.performance_evaluation.nis(innovation, innovation_covariance)[source]

Compute Normalized Innovation Squared (NIS).

NIS is similar to NEES but computed in measurement space. Used to verify measurement model consistency.

Parameters:

innovation (ndarray) – Innovation (measurement residual) vector.
innovation_covariance (ndarray) – Innovation covariance matrix S.

Returns:

NIS value (chi-squared distributed with df=meas_dim).

Return type:

float

Notes

NIS = nu^T * S^{-1} * nu

where nu = z - H*x_pred is the innovation and S is the innovation covariance.

Examples

>>> nu = np.array([0.5, -0.3])  # Innovation vector
>>> S = np.eye(2) * 0.25  # Innovation covariance
>>> nis(nu, S)
1.36
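In a linear Kalman filter the two inputs come from the predicted state and its covariance. The sketch below assumes that setting, with S = H P_pred H^T + R; the function name and all argument names other than the documented symbols nu, S, H, z and x_pred are hypothetical.

```python
import numpy as np


def nis_from_filter_quantities(z, H, x_pred, P_pred, R):
    """Return (NIS, innovation, innovation covariance) for a linear measurement model."""
    z = np.asarray(z, dtype=float)
    H = np.asarray(H, dtype=float)
    nu = z - H @ np.asarray(x_pred, dtype=float)         # innovation
    S = H @ np.asarray(P_pred, dtype=float) @ H.T + np.asarray(R, dtype=float)
    return float(nu @ np.linalg.solve(S, nu)), nu, S


# State = [position, velocity]; only position is measured.
H = np.array([[1.0, 0.0]])
x_pred = np.array([1.0, 0.5])
P_pred = np.diag([0.2, 0.1])
R = np.array([[0.05]])
z = np.array([1.5])

nis_value, nu, S = nis_from_filter_quantities(z, H, x_pred, P_pred, R)
print(f"{nis_value:.2f}")  # 1.00
```

Here the innovation is 0.5 and S is 0.25, so the innovation is exactly one predicted standard deviation and the NIS is 1.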

pytcl.performance_evaluation.nis_sequence(innovations, innovation_covariances)[source]

Compute NIS for a sequence of innovations.

Parameters:

innovations (ndarray) – Innovation vectors, shape (N, meas_dim).
innovation_covariances (ndarray) – Innovation covariances, shape (N, meas_dim, meas_dim).

Returns:

NIS values for each time step.

Return type:

ndarray

Examples

>>> innovations = np.array([[0.5, -0.3], [0.2, 0.1]])
>>> S = np.array([np.eye(2) * 0.25, np.eye(2) * 0.25])
>>> nis_vals = nis_sequence(innovations, S)
>>> len(nis_vals)
2

pytcl.performance_evaluation.consistency_test(nees_or_nis_values, df, confidence=0.95)[source]

Perform chi-squared consistency test on NEES or NIS values.

Tests whether the average NEES/NIS falls within expected confidence bounds for a consistent estimator.

Parameters:

nees_or_nis_values (ndarray) – NEES or NIS values from multiple time steps or Monte Carlo runs.
df (int) – Degrees of freedom (state_dim for NEES, meas_dim for NIS).
confidence (float, optional) – Confidence level (default: 0.95).

Returns:

Named tuple with test results.

Return type:

ConsistencyResult

Examples

>>> np.random.seed(42)
>>> # Simulate NEES from chi-squared (consistent filter)
>>> nees_vals = np.random.chisquare(df=4, size=100)
>>> result = consistency_test(nees_vals, df=4)
>>> result.is_consistent
True
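The idea behind the test can be sketched as follows. If the N values are independent chi-squared variables with df degrees of freedom, their sum is chi-squared with N*df degrees of freedom, which gives two-sided bounds for the mean. Whether the library forms its bounds exactly this way is an assumption. The sketch also swaps in the Wilson-Hilferty approximation for the chi-squared quantile so that it needs only the standard library and NumPy; both function names are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def chi2_quantile_wh(prob, k):
    """Wilson-Hilferty approximation to the chi-squared quantile (k degrees of freedom)."""
    z = NormalDist().inv_cdf(prob)
    a = 2.0 / (9.0 * k)
    return k * (1.0 - a + z * a ** 0.5) ** 3


def average_consistency_bounds(values, df, confidence=0.95):
    """Two-sided bounds for the mean of N chi-squared(df) values."""
    values = np.asarray(values, dtype=float)
    n = values.size
    alpha = 1.0 - confidence
    k = n * df  # degrees of freedom of the sum
    lower = chi2_quantile_wh(alpha / 2.0, k) / n
    upper = chi2_quantile_wh(1.0 - alpha / 2.0, k) / n
    mean_value = float(values.mean())
    return lower <= mean_value <= upper, lower, upper, mean_value


ok, lower, upper, mean_value = average_consistency_bounds(np.full(100, 4.0), df=4)
print(ok, round(lower, 2), round(upper, 2))
```

The approximation is accurate for large N*df; for small samples an exact quantile such as scipy.stats.chi2.ppf is preferable.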

pytcl.performance_evaluation.credibility_interval(errors, covariances, interval=0.95)[source]

Compute fraction of errors within credibility interval.

For a consistent estimator, a fraction of approximately interval (e.g., 0.95) of the errors should fall within the corresponding credibility region.

Parameters:

errors (ndarray) – Estimation errors, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).
interval (float, optional) – Credibility interval (default: 0.95).

Returns:

Fraction of errors within the interval.

Return type:

float

Examples

>>> rng = np.random.default_rng(42)
>>> errors = rng.normal(0, 0.1, (100, 2))  # Small errors
>>> P = np.array([np.eye(2) * 0.1] * 100)  # Conservative: variance 0.1 vs. true 0.01
>>> frac = credibility_interval(errors, P, interval=0.95)
>>> frac > 0.9  # Most errors within interval
True
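A sketch of the underlying check for two-dimensional states is shown below. It assumes the credibility region is the chi-squared ellipsoid of the squared Mahalanobis distance, which for two degrees of freedom has the closed-form threshold -2 ln(1 - interval). The function name is hypothetical and the sketch is limited to state_dim = 2.

```python
import numpy as np


def credibility_fraction_2d(errors, covariances, interval=0.95):
    """Fraction of 2-D errors inside the chi-squared credibility ellipse."""
    errors = np.asarray(errors, dtype=float)
    covariances = np.asarray(covariances, dtype=float)
    # Squared Mahalanobis distance e_k^T P_k^{-1} e_k for every sample.
    weighted = np.linalg.solve(covariances, errors[..., None])[..., 0]
    d2 = np.einsum("ni,ni->n", errors, weighted)
    # Chi-squared quantile for 2 degrees of freedom (exact closed form).
    threshold = -2.0 * np.log(1.0 - interval)
    return float(np.mean(d2 <= threshold))


rng = np.random.default_rng(0)
n_samples = 10_000
# Errors drawn with variance 0.1, matching the covariance below.
errors = rng.normal(0.0, np.sqrt(0.1), size=(n_samples, 2))
P = np.tile(np.eye(2) * 0.1, (n_samples, 1, 1))
frac = credibility_fraction_2d(errors, P, interval=0.95)
print(0.93 < frac < 0.97)  # True
```

When the covariance is larger than the actual error spread, the fraction approaches 1; when it is too small, the fraction drops below the nominal interval.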

pytcl.performance_evaluation.monte_carlo_rmse(errors, axis=0)[source]

Compute RMSE from Monte Carlo simulation errors.

Parameters:

errors (ndarray) – Estimation errors from multiple runs, shape (N_runs, N_time, state_dim) or (N_runs, state_dim).
axis (int, optional) – Axis representing Monte Carlo runs (default: 0).

Returns:

RMSE values.

Return type:

ndarray

Examples

>>> # 3 Monte Carlo runs, 2 time steps, 2 state components
>>> errors = np.array([[[0.1, 0.2], [0.15, 0.1]],
...                    [[0.05, 0.1], [0.2, 0.15]],
...                    [[0.15, 0.05], [0.1, 0.2]]])
>>> rmse_per_time = monte_carlo_rmse(errors, axis=0)
>>> rmse_per_time.shape
(2, 2)

pytcl.performance_evaluation.estimation_error_bounds(covariances, sigma=2.0)[source]

Compute estimation error bounds from covariances.

Parameters:

covariances (ndarray) – Covariance matrices, shape (N, state_dim, state_dim).
sigma (float, optional) – Number of standard deviations for bounds (default: 2.0).

Returns:

Error bounds (sigma times the standard deviation) for each component, shape (N, state_dim).

Return type:

ndarray

Examples

>>> P = np.array([[[1.0, 0], [0, 4.0]],
...               [[0.25, 0], [0, 1.0]]])
>>> bounds = estimation_error_bounds(P, sigma=2.0)
>>> bounds[0]  # 2-sigma bounds: 2*sqrt(1), 2*sqrt(4)
array([2., 4.])
>>> bounds[1]  # 2-sigma bounds: 2*sqrt(0.25), 2*sqrt(1)
array([1., 2.])
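The values above follow from taking the square root of each covariance diagonal and scaling by sigma. A minimal NumPy sketch of that computation, with the hypothetical name error_bounds_sketch:

```python
import numpy as np


def error_bounds_sketch(covariances, sigma=2.0):
    """sigma * sqrt(diag(P_k)) for every covariance in an (N, d, d) stack."""
    cov = np.asarray(covariances, dtype=float)
    # Extract the diagonal of each matrix: shape (N, d).
    variances = np.diagonal(cov, axis1=-2, axis2=-1)
    return sigma * np.sqrt(variances)


P = np.array([[[1.0, 0.0], [0.0, 4.0]],
              [[0.25, 0.0], [0.0, 1.0]]])
bounds = error_bounds_sketch(P, sigma=2.0)
print(bounds.shape)  # (2, 2)
```

Off-diagonal terms are ignored, so the bounds describe each component on its own rather than a joint confidence region.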

Track Metrics

Multi-target tracking performance metrics (OSPA, GOSPA).

Track performance metrics for multi-target tracking evaluation.

This module provides metrics for evaluating multi-target tracker performance, including OSPA (Optimal Sub-Pattern Assignment), track purity, and fragmentation.

class pytcl.performance_evaluation.track_metrics.OSPAResult(ospa, localization, cardinality)[source]

Bases: NamedTuple

Result of OSPA metric computation.

ospa

Total OSPA distance.

Type: float

localization

Localization component.

Type: float

cardinality

Cardinality component.

Type: float

ospa: float: Alias for field number 0

localization: float: Alias for field number 1

cardinality: float: Alias for field number 2

class pytcl.performance_evaluation.track_metrics.MOTMetrics(mota, motp, num_switches, num_fragmentations, num_false_positives, num_misses)[source]

Bases: NamedTuple

Multiple Object Tracking (MOT) metrics.

mota

Multiple Object Tracking Accuracy.

Type: float

motp

Multiple Object Tracking Precision.

Type: float

num_switches

Number of identity switches.

Type: int

num_fragmentations

Number of track fragmentations.

Type: int

num_false_positives

Number of false positive detections.

Type: int

num_misses

Number of missed detections.

Type: int

mota: float: Alias for field number 0

motp: float: Alias for field number 1

num_switches: int: Alias for field number 2

num_fragmentations: int: Alias for field number 3

num_false_positives: int: Alias for field number 4

num_misses: int: Alias for field number 5

pytcl.performance_evaluation.track_metrics.ospa(X, Y, c=100.0, p=2.0)[source]

Compute Optimal Sub-Pattern Assignment (OSPA) metric.

The OSPA metric provides a mathematically consistent measure of distance between two sets of points, accounting for both localization error and cardinality mismatch.

Parameters:

X (list of ndarray) – First set of points (e.g., ground truth).
Y (list of ndarray) – Second set of points (e.g., estimated tracks).
c (float, optional) – Cutoff parameter for localization error (default: 100.0).
p (float, optional) – Order parameter for the metric (default: 2.0).

Returns:

Named tuple containing:

ospa: Total OSPA distance.
localization: Localization component.
cardinality: Cardinality component.

Return type:

OSPAResult

Notes

If both sets are empty, OSPA is 0.
The metric is symmetric: ospa(X, Y) = ospa(Y, X).
OSPA is a proper metric for any p >= 1; with p=2 (default), per-point errors are combined in the root-mean-square (L2) sense.

Examples

>>> X = [np.array([0, 0]), np.array([10, 10])]
>>> Y = [np.array([1, 0]), np.array([10, 11])]
>>> result = ospa(X, Y, c=100, p=2)
>>> result.ospa
1.0

pytcl.performance_evaluation.track_metrics.ospa_over_time(X_sequence, Y_sequence, c=100.0, p=2.0)[source]

Compute OSPA metric over a time sequence.

Parameters:

X_sequence (list of list of ndarray) – Sequence of ground truth point sets.
Y_sequence (list of list of ndarray) – Sequence of estimated point sets.
c (float, optional) – Cutoff parameter (default: 100.0).
p (float, optional) – Order parameter (default: 2.0).

Returns:

OSPA values at each time step.

Return type:

ndarray

Raises:

ValueError – If sequences have different lengths.

Examples

>>> # Two time steps with ground truth and estimates
>>> X_seq = [[np.array([0, 0]), np.array([10, 10])],
...          [np.array([1, 0]), np.array([11, 10])]]
>>> Y_seq = [[np.array([0.5, 0]), np.array([10, 10.5])],
...          [np.array([1.5, 0]), np.array([11, 10.5])]]
>>> ospa_vals = ospa_over_time(X_seq, Y_seq, c=100, p=2)
>>> len(ospa_vals)
2

pytcl.performance_evaluation.track_metrics.track_purity(true_labels, estimated_labels)[source]

Compute track purity metric.

Track purity measures how well estimated tracks correspond to single ground truth targets. A purity of 1.0 means each estimated track contains observations from only one true target.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.

Returns:

Track purity score in [0, 1].

Return type:

float

Examples

>>> true_labels = np.array([0, 0, 0, 1, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 1, 1, 1])  # Perfect
>>> track_purity(true_labels, estimated_labels)
1.0
>>> estimated_labels = np.array([0, 0, 1, 1, 1, 1])  # Mixed
>>> track_purity(true_labels, estimated_labels)
0.833...

pytcl.performance_evaluation.track_metrics.track_fragmentation(true_labels, estimated_labels, time_indices=None)[source]

Count number of track fragmentations.

A fragmentation occurs when observations from a single ground truth target are split across multiple estimated tracks.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.
time_indices (ndarray, optional) – Time indices for each observation (for temporal ordering).

Returns:

Number of fragmentations.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 0, 0])
>>> estimated_labels = np.array([0, 0, 1, 1])  # One fragmentation
>>> track_fragmentation(true_labels, estimated_labels)
1

pytcl.performance_evaluation.track_metrics.identity_switches(true_labels, estimated_labels, time_indices=None)[source]

Count number of identity switches.

An identity switch occurs when an estimated track changes which ground truth target it is associated with.

Parameters:

true_labels (ndarray) – Ground truth target labels for each observation.
estimated_labels (ndarray) – Estimated track labels for each observation.
time_indices (ndarray, optional) – Time indices for each observation.

Returns:

Number of identity switches.

Return type:

int

Examples

>>> true_labels = np.array([0, 0, 1, 1])
>>> estimated_labels = np.array([0, 0, 0, 0])  # Track 0 switches targets
>>> identity_switches(true_labels, estimated_labels)
1

pytcl.performance_evaluation.track_metrics.mot_metrics(ground_truth, estimates, threshold=10.0)[source]

Compute CLEAR MOT metrics.

Parameters:

ground_truth (list of list of ndarray) – Ground truth positions at each time step.
estimates (list of list of ndarray) – Estimated positions at each time step.
threshold (float, optional) – Distance threshold for valid associations (default: 10.0).

Returns:

Named tuple containing MOTA, MOTP, and counts.

Return type:

MOTMetrics

Notes

MOTA (Multiple Object Tracking Accuracy) accounts for false positives, misses, and identity switches. MOTP (Precision) measures localization accuracy for correctly matched pairs.

Examples

>>> gt = [[np.array([0, 0]), np.array([10, 10])],
...       [np.array([1, 0]), np.array([11, 10])]]
>>> est = [[np.array([0.5, 0]), np.array([10.5, 10])],
...        [np.array([1.5, 0]), np.array([11.5, 10])]]
>>> result = mot_metrics(gt, est, threshold=5.0)
>>> result.mota  # High accuracy with small errors
1.0
>>> result.motp < 1.0  # Some localization error
True

Estimation Metrics

State estimation performance metrics (NEES, NIS).

Estimation performance metrics.

This module provides metrics for evaluating state estimation performance, including RMSE, NEES, NIS, and consistency tests.

class pytcl.performance_evaluation.estimation_metrics.ConsistencyResult(is_consistent, statistic, lower_bound, upper_bound, mean_value)[source]

Bases: NamedTuple

Result of consistency test.

is_consistent

Whether the estimator is consistent.

Type: bool

statistic

Test statistic value.

Type: float

lower_bound

Lower confidence bound.

Type: float

upper_bound

Upper confidence bound.

Type: float

mean_value

Mean of the test statistic.

Type: float

is_consistent: bool: Alias for field number 0

statistic: float: Alias for field number 1

lower_bound: float: Alias for field number 2

upper_bound: float: Alias for field number 3

mean_value: float: Alias for field number 4

pytcl.performance_evaluation.estimation_metrics.rmse(true_states, estimated_states, axis=None)[source]

Compute Root Mean Square Error.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim) or (N,).
estimated_states (ndarray) – Estimated state values, same shape as true_states.
axis (int, optional) – Axis over which to compute RMSE:
    None: RMSE over all elements (scalar result).
    0: RMSE for each state component (vector result).
    1: RMSE for each time step (vector result).

Returns:

Root mean square error.

Return type:

ndarray or float

Examples

>>> true = np.array([[0, 0], [1, 1], [2, 2]])
>>> est = np.array([[0.1, -0.1], [1.2, 0.9], [1.8, 2.1]])
>>> rmse(true, est)  # Scalar RMSE
0.141...
>>> rmse(true, est, axis=0)  # Per-component RMSE
array([0.173..., 0.1...])

pytcl.performance_evaluation.estimation_metrics.position_rmse(true_states, estimated_states, position_indices)[source]

Compute RMSE for position components only.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim).
estimated_states (ndarray) – Estimated state values, shape (N, state_dim).
position_indices (list of int) – Indices of position components in state vector.

Returns:

Position RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], positions are indices [0, 2]
>>> true = np.array([[0, 1, 0, 1], [1, 1, 1, 1]])
>>> est = np.array([[0.1, 1, -0.1, 1], [1.2, 1, 0.9, 1]])
>>> position_rmse(true, est, [0, 2])
0.141...

pytcl.performance_evaluation.estimation_metrics.velocity_rmse(true_states, estimated_states, velocity_indices)[source]

Compute RMSE for velocity components only.

Parameters:

true_states (ndarray) – True state values, shape (N, state_dim).
estimated_states (ndarray) – Estimated state values, shape (N, state_dim).
velocity_indices (list of int) – Indices of velocity components in state vector.

Returns:

Velocity RMSE.

Return type:

float

Examples

>>> # State = [x, vx, y, vy], velocities are indices [1, 3]
>>> true = np.array([[0, 10, 0, 5], [1, 10, 0.5, 5]])
>>> est = np.array([[0, 9.5, 0, 5.2], [1, 10.2, 0.5, 4.9]])
>>> velocity_rmse(true, est, [1, 3])
0.316...

pytcl.performance_evaluation.estimation_metrics.nees(true_state, estimated_state, covariance)[source]

Compute Normalized Estimation Error Squared (NEES).

NEES is a measure of filter consistency. For a properly tuned filter, the average NEES should be close to the state dimension.

Parameters:

true_state (ndarray) – True state vector, shape (state_dim,).
estimated_state (ndarray) – Estimated state vector, shape (state_dim,).
covariance (ndarray) – Estimation covariance, shape (state_dim, state_dim).

Returns:

NEES value (chi-squared distributed with df=state_dim).

Return type:

float

Notes

NEES = (x_true - x_est)^T * P^{-1} * (x_true - x_est)

For a consistent filter, NEES should follow a chi-squared distribution with degrees of freedom equal to the state dimension.

Examples

>>> true = np.array([1.0, 2.0])
>>> est = np.array([1.1, 1.9])
>>> P = np.eye(2) * 0.1
>>> nees(true, est, P)
0.2

pytcl.performance_evaluation.estimation_metrics.nees_sequence(true_states, estimated_states, covariances)[source]

Compute NEES for a sequence of estimates.

Parameters:

true_states (ndarray) – True states, shape (N, state_dim).
estimated_states (ndarray) – Estimated states, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

NEES values for each time step, shape (N,).

Return type:

ndarray

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4]])
>>> P = np.array([np.eye(2) * 0.1, np.eye(2) * 0.1])
>>> nees_vals = nees_sequence(true, est, P)
>>> len(nees_vals)
2

pytcl.performance_evaluation.estimation_metrics.average_nees(true_states, estimated_states, covariances)[source]

Compute average NEES over a sequence.

Parameters:

true_states (ndarray) – True states, shape (N, state_dim).
estimated_states (ndarray) – Estimated states, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).

Returns:

Average NEES (should be close to state_dim for consistent filter).

Return type:

float

Examples

>>> true = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])
>>> est = np.array([[1.1, 1.9], [1.6, 2.4], [2.1, 2.9]])
>>> P = np.array([np.eye(2) * 0.1] * 3)
>>> avg = average_nees(true, est, P)
>>> avg  # A consistent filter averages near state_dim=2; 0.2 means P is conservative here
0.2

pytcl.performance_evaluation.estimation_metrics.nis(innovation, innovation_covariance)[source]

Compute Normalized Innovation Squared (NIS).

NIS is similar to NEES but computed in measurement space. Used to verify measurement model consistency.

Parameters:

innovation (ndarray) – Innovation (measurement residual) vector.
innovation_covariance (ndarray) – Innovation covariance matrix S.

Returns:

NIS value (chi-squared distributed with df=meas_dim).

Return type:

float

Notes

NIS = nu^T * S^{-1} * nu

where nu = z - H*x_pred is the innovation and S is the innovation covariance.

Examples

>>> nu = np.array([0.5, -0.3])  # Innovation vector
>>> S = np.eye(2) * 0.25  # Innovation covariance
>>> nis(nu, S)
1.36

pytcl.performance_evaluation.estimation_metrics.nis_sequence(innovations, innovation_covariances)[source]

Compute NIS for a sequence of innovations.

Parameters:

innovations (ndarray) – Innovation vectors, shape (N, meas_dim).
innovation_covariances (ndarray) – Innovation covariances, shape (N, meas_dim, meas_dim).

Returns:

NIS values for each time step.

Return type:

ndarray

Examples

>>> innovations = np.array([[0.5, -0.3], [0.2, 0.1]])
>>> S = np.array([np.eye(2) * 0.25, np.eye(2) * 0.25])
>>> nis_vals = nis_sequence(innovations, S)
>>> len(nis_vals)
2

pytcl.performance_evaluation.estimation_metrics.consistency_test(nees_or_nis_values, df, confidence=0.95)[source]

Perform chi-squared consistency test on NEES or NIS values.

Tests whether the average NEES/NIS falls within expected confidence bounds for a consistent estimator.

Parameters:

nees_or_nis_values (ndarray) – NEES or NIS values from multiple time steps or Monte Carlo runs.
df (int) – Degrees of freedom (state_dim for NEES, meas_dim for NIS).
confidence (float, optional) – Confidence level (default: 0.95).

Returns:

Named tuple with test results.

Return type:

ConsistencyResult

Examples

>>> np.random.seed(42)
>>> # Simulate NEES from chi-squared (consistent filter)
>>> nees_vals = np.random.chisquare(df=4, size=100)
>>> result = consistency_test(nees_vals, df=4)
>>> result.is_consistent
True

pytcl.performance_evaluation.estimation_metrics.credibility_interval(errors, covariances, interval=0.95)[source]

Compute fraction of errors within credibility interval.

For a consistent estimator, a fraction of approximately interval (e.g., 0.95) of the errors should fall within the corresponding credibility region.

Parameters:

errors (ndarray) – Estimation errors, shape (N, state_dim).
covariances (ndarray) – Covariances, shape (N, state_dim, state_dim).
interval (float, optional) – Credibility interval (default: 0.95).

Returns:

Fraction of errors within the interval.

Return type:

float

Examples

>>> rng = np.random.default_rng(42)
>>> errors = rng.normal(0, 0.1, (100, 2))  # Small errors
>>> P = np.array([np.eye(2) * 0.1] * 100)  # Conservative: variance 0.1 vs. true 0.01
>>> frac = credibility_interval(errors, P, interval=0.95)
>>> frac > 0.9  # Most errors within interval
True

pytcl.performance_evaluation.estimation_metrics.monte_carlo_rmse(errors, axis=0)[source]

Compute RMSE from Monte Carlo simulation errors.

Parameters:

errors (ndarray) – Estimation errors from multiple runs, shape (N_runs, N_time, state_dim) or (N_runs, state_dim).
axis (int, optional) – Axis representing Monte Carlo runs (default: 0).

Returns:

RMSE values.

Return type:

ndarray

Examples

>>> # 3 Monte Carlo runs, 2 time steps, 2 state components
>>> errors = np.array([[[0.1, 0.2], [0.15, 0.1]],
...                    [[0.05, 0.1], [0.2, 0.15]],
...                    [[0.15, 0.05], [0.1, 0.2]]])
>>> rmse_per_time = monte_carlo_rmse(errors, axis=0)
>>> rmse_per_time.shape
(2, 2)

pytcl.performance_evaluation.estimation_metrics.estimation_error_bounds(covariances, sigma=2.0)[source]

Compute estimation error bounds from covariances.

Parameters:

covariances (ndarray) – Covariance matrices, shape (N, state_dim, state_dim).
sigma (float, optional) – Number of standard deviations for bounds (default: 2.0).

Returns:

Error bounds (sigma times the standard deviation) for each component, shape (N, state_dim).

Return type:

ndarray

Examples

>>> P = np.array([[[1.0, 0], [0, 4.0]],
...               [[0.25, 0], [0, 1.0]]])
>>> bounds = estimation_error_bounds(P, sigma=2.0)
>>> bounds[0]  # 2-sigma bounds: 2*sqrt(1), 2*sqrt(4)
array([2., 4.])
>>> bounds[1]  # 2-sigma bounds: 2*sqrt(0.25), 2*sqrt(1)
array([1., 2.])