Adaptive Filtering ================== Adaptive filtering techniques allow Kalman and particle filters to automatically adjust their parameters in response to changing system dynamics and measurement conditions. This guide covers divergence detection, noise covariance estimation, and adaptive Kalman gain strategies for real-time filter tuning. .. contents:: Contents :local: :depth: 3 --- When to Use Adaptive Filtering =============================== Standard Kalman filters assume **known and constant** process and measurement noise covariances ($Q$ and $R$). However, real systems often violate this assumption: **Problem Scenarios:** - **Unknown noise statistics**: Initial $Q$ and $R$ estimates are inaccurate - **Time-varying systems**: Process noise changes due to environmental conditions - **Sensor degradation**: Measurement noise increases over time - **Model mismatch**: System dynamics drift from the assumed model - **Multi-sensor fusion**: Different sensors have varying noise characteristics **Filter Divergence Symptoms:** 1. **Residual saturation**: Innovation sequence shows bias or unusual statistics 2. **Filter overfitting**: Confidence bounds contract despite poor estimate quality 3. **Estimate instability**: State estimates diverge from measurements unexpectedly 4. **Consistency tests fail**: Normalized innovations exceed $3\sigma$ bounds repeatedly **When NOT to adapt:** If your system parameters are well-characterized and constant, standard tuning (static $Q$ and $R$) is more robust. --- Divergence Detection Techniques ================================ Before adapting filter parameters, detect when divergence is occurring. Normalized Innovation Squared Test ---------------------------------- The normalized innovation (residual) should follow $\chi^2(m)$ distribution if filter is consistent: $$\gamma_k = \mathbf{y}_k^T \mathbf{S}_k^{-1} \mathbf{y}_k \sim \chi^2(m)$$ where $\mathbf{y}_k$ is the innovation and $\mathbf{S}_k$ is the innovation covariance. **Algorithm:** .. code-block:: python import numpy as np from scipy.stats import chi2 from pytcl.dynamic_estimation.kalman import ExtendedKalmanFilter class DivergenceDetector: """Detects filter divergence using statistical tests.""" def __init__(self, measurement_dim: int, window_size: int = 100): self.m = measurement_dim self.window_size = window_size self.innovations = [] self.normalized_sq = [] def update(self, innovation: np.ndarray, innovation_cov: np.ndarray) -> dict: """ Check divergence at current timestep. Parameters ---------- innovation : (m,) array Measurement residual y_k innovation_cov : (m, m) array Innovation covariance S_k Returns ------- dict with keys: - normalized_sq: Normalized innovation squared γ_k - mahal_distance: Mahalanobis distance (sqrt of γ_k) - p_value: χ²(m) tail probability - is_outlier: True if γ_k > chi2.ppf(0.99, m) - is_diverging: True if recent mean γ_k exceeds threshold """ # Normalized innovation squared try: S_inv = np.linalg.inv(innovation_cov) gamma = float(innovation.T @ S_inv @ innovation) except np.linalg.LinAlgError: return {'error': 'Singular innovation covariance'} self.normalized_sq.append(gamma) self.innovations.append(innovation.copy()) # Keep window if len(self.normalized_sq) > self.window_size: self.normalized_sq.pop(0) self.innovations.pop(0) # Statistical test p_value = 1.0 - chi2.cdf(gamma, self.m) critical_value = chi2.ppf(0.99, self.m) # 99% quantile is_outlier = gamma > critical_value # Divergence test: mean γ_k should be ≈ m recent_mean = np.mean(self.normalized_sq) divergence_threshold = self.m * 1.5 # Allow 50% excess is_diverging = recent_mean > divergence_threshold return { 'normalized_sq': gamma, 'mahal_distance': np.sqrt(gamma), 'p_value': p_value, 'is_outlier': is_outlier, 'recent_mean': recent_mean, 'is_diverging': is_diverging, } # Example usage detector = DivergenceDetector(measurement_dim=3) # Simulate measurements from a simple filter for k in range(200): # Your filter update produces innovation and innovation_cov innovation = np.random.randn(3) # Simulated innovation_cov = np.eye(3) status = detector.update(innovation, innovation_cov) if status.get('is_diverging'): print(f"Divergence detected at step {k}") print(f" Mean normalized_sq: {status['recent_mean']:.2f} (expected ~{detector.m})") Autocorrelation Test -------------------- White residuals should have zero autocorrelation. Significant autocorrelation indicates model mismatch or parameter errors. .. code-block:: python def autocorrelation_test(innovations: np.ndarray, lag: int = 10) -> dict: """ Test residual autocorrelation. Parameters ---------- innovations : (N, m) array Time series of innovations lag : int Maximum lag to examine Returns ------- dict with autocorrelation values and significance tests """ N, m = innovations.shape acf_values = {} for component in range(m): signal = innovations[:, component] mean = np.mean(signal) var = np.var(signal) acf = np.zeros(lag + 1) for k in range(lag + 1): if k == 0: acf[k] = 1.0 else: acf[k] = np.mean((signal[:-k] - mean) * (signal[k:] - mean)) / var acf_values[f'component_{component}'] = acf # Ljung-Box test Q_stat = N * (N + 2) * sum(acf[k]**2 / (N - k) for k in range(1, lag + 1)) from scipy.stats import chi2 p_value = 1.0 - chi2.cdf(Q_stat, lag - 1) return { 'acf': acf_values, 'ljung_box_statistic': Q_stat, 'p_value': p_value, 'is_white': p_value > 0.05 } --- Noise Covariance Estimation ============================= Estimate $Q$ and $R$ from batch or online data. Batch Estimation: Maximum Likelihood ------------------------------------- Given a batch of filter residuals, estimate noise covariances via maximum likelihood. $$\hat{\mathbf{Q}} = \frac{1}{N} \sum_{k=1}^{N} \mathbf{x}_k \mathbf{x}_k^T$$ $$\hat{\mathbf{R}} = \frac{1}{N} \sum_{k=1}^{N} \mathbf{y}_k \mathbf{y}_k^T$$ .. code-block:: python class NoiseEstimator: """Batch and online noise covariance estimation.""" @staticmethod def estimate_from_residuals(innovations: np.ndarray, state_predictions: np.ndarray) -> dict: """ Estimate Q and R from filter residuals and predictions. Parameters ---------- innovations : (N, m) array Measurement residuals state_predictions : (N, n) array State prediction errors (x_true - x_pred) Returns ------- dict with estimated R and Q matrices """ N = innovations.shape[0] # Measurement noise covariance R_est = innovations.T @ innovations / N # Process noise covariance (approximation) Q_est = state_predictions.T @ state_predictions / N return { 'R': R_est, 'Q': Q_est, 'R_condition_number': np.linalg.cond(R_est), 'Q_condition_number': np.linalg.cond(Q_est), } @staticmethod def estimate_diagonal(innovations: np.ndarray) -> np.ndarray: """ Quick estimate: diagonal covariance (assumes independent noise). More numerically stable for ill-conditioned problems. """ return np.diag(np.var(innovations, axis=0)) Adaptive Q Estimation (Online) ------------------------------- Track time-varying process noise online. Use exponential weighting to emphasize recent data: $$\hat{\mathbf{Q}}_k = \alpha \hat{\mathbf{Q}}_{k-1} + (1-\alpha) (\mathbf{y}_{k} - \mathbf{H} \hat{\mathbf{x}}_{k|k-1})(\cdot)^T$$ .. code-block:: python class AdaptiveNoiseEstimator: """Online adaptive noise covariance estimation.""" def __init__(self, state_dim: int, meas_dim: int, forgetting_factor: float = 0.95): """ Parameters ---------- state_dim : int State dimension meas_dim : int Measurement dimension forgetting_factor : float Exponential weighting (0.9-0.99 typical) - 1.0: infinite memory (pure averaging) - 0.9: recent ~10 steps emphasized """ self.n = state_dim self.m = meas_dim self.alpha = forgetting_factor self.R_hat = np.eye(meas_dim) self.Q_hat = np.eye(state_dim) self.update_count = 0 def update_R(self, innovation: np.ndarray) -> None: """Update measurement noise estimate.""" self.update_count += 1 # Exponentially weighted update self.R_hat = (self.alpha * self.R_hat + (1 - self.alpha) * np.outer(innovation, innovation)) def update_Q(self, state_residual: np.ndarray) -> None: """Update process noise estimate.""" self.Q_hat = (self.alpha * self.Q_hat + (1 - self.alpha) * np.outer(state_residual, state_residual)) def get_estimates(self) -> dict: """Return current noise estimates.""" return { 'R': self.R_hat.copy(), 'Q': self.Q_hat.copy(), 'updates': self.update_count } **Best Practices:** - Start with `forgetting_factor = 0.95` (effective window ≈ 20 steps) - Increase forgetting factor if system is slowly time-varying - Decrease (e.g., 0.90) for rapidly changing conditions - Regularize: Add small $\epsilon I$ to prevent singularity - Monitor condition numbers to detect ill-posedness --- Adaptive Kalman Filtering ========================== Dynamically adjust Kalman filter gains and covariances. Gain Adaptation Strategy ------------------------ Instead of adapting noise covariances, directly scale the Kalman gain: $$\mathbf{K}_k^{adaptive} = \beta_k \mathbf{K}_k$$ where $\beta_k$ controls filter responsiveness based on residual statistics. .. code-block:: python class AdaptiveKalmanFilter: """ Adaptive Kalman filter with automatic gain tuning. Adjusts gain based on: 1. Innovation magnitude (residual statistics) 2. Consistency tests (are we under/overconfident?) 3. Divergence indicators """ def __init__(self, kf, initial_beta: float = 1.0): """ Parameters ---------- kf : KalmanFilter or ExtendedKalmanFilter Base filter (we'll wrap it) initial_beta : float Initial gain scaling factor (typically 1.0) """ self.kf = kf self.beta = initial_beta self.R_est = kf.R.copy() self.innovation_history = [] self.beta_history = [initial_beta] self.divergence_state = 'normal' def predict(self, dt: float) -> np.ndarray: """Prediction step (standard KF).""" return self.kf.predict(dt) def update(self, z: np.ndarray, dt: float) -> tuple: """ Update step with adaptive gain. Returns ------- (state, covariance, diagnostics) """ # Compute innovation with predicted state H = self.kf.H x_pred = self.kf.x y = z - H @ x_pred # Innovation # Innovation covariance S = H @ self.kf.P @ H.T + self.kf.R # Normalize innovation try: S_inv = np.linalg.inv(S) gamma = float(y.T @ S_inv @ y) except np.linalg.LinAlgError: gamma = 1.0 # Compute scaling factor β_k based on innovation magnitude m = len(z) expected_gamma = m # Expected value of χ²(m) beta = self._compute_adaptive_gain(gamma, expected_gamma) # Scale Kalman gain K = self.kf.P @ H.T @ np.linalg.inv(S) K_adapted = beta * K # Update with adapted gain x_new = x_pred + K_adapted @ y P_new = (np.eye(len(x_pred)) - K_adapted @ H) @ self.kf.P # Update filter state self.kf.x = x_new self.kf.P = P_new # Record history self.innovation_history.append(gamma) self.beta_history.append(beta) return x_new, P_new, { 'innovation_norm_sq': gamma, 'adaptive_beta': beta, 'divergence_state': self.divergence_state } def _compute_adaptive_gain(self, gamma: float, expected_gamma: float) -> float: """ Compute adaptive gain scaling β based on normalized innovation. Strategy: - If γ ≈ expected: No scaling (β = 1) - If γ >> expected: Innovation is large → reduce gain (β < 1) (System is noisier than assumed, filter is overconfident) - If γ << expected: Innovation is small → increase gain (β > 1) (Filter is underconfident, not tracking changes) """ # Clip to prevent extreme swings ratio = gamma / expected_gamma # Use smooth function: β = 1 / sqrt(ratio) # This is the "optimal" scaling for Gaussian errors beta = 1.0 / np.sqrt(max(ratio, 0.5)) # Limit 0.5 ≤ β ≤ 1.4 beta = np.clip(beta, 0.5, 1.4) # Divergence detection: if repeated large innovations if len(self.innovation_history) >= 10: recent_mean = np.mean(self.innovation_history[-10:]) if recent_mean > expected_gamma * 2.0: self.divergence_state = 'diverging' beta *= 0.8 # Reduce gain further else: self.divergence_state = 'normal' return beta **Example: GPS/INS Navigation** .. code-block:: python # Simulate adaptive filtering for GPS/INS fusion from pytcl.dynamic_estimation.kalman import ExtendedKalmanFilter # Create base EKF ekf = ExtendedKalmanFilter( x=np.array([0, 0, 0]), # pos_x, pos_y, vel_x P=np.eye(3), Q=np.diag([0.01, 0.01, 0.001]), R=np.diag([1.0, 1.0]) # GPS measurement noise ) # Wrap with adaptive filtering adaptive_ekf = AdaptiveKalmanFilter(ekf, initial_beta=1.0) # Simulate GPS signal degradation for k in range(100): # Predict adaptive_ekf.predict(dt=0.1) # GPS measurement with time-varying noise if k < 50: gps_noise_std = 1.0 else: gps_noise_std = 5.0 # GPS signal degrades (e.g., urban canyon) z = np.array([0.1*k, 0.05*k**2]) + gps_noise_std * np.random.randn(2) x_est, P_est, diagnostics = adaptive_ekf.update(z, dt=0.1) if k % 20 == 0: print(f"Step {k}: β = {diagnostics['adaptive_beta']:.2f}, " f"γ = {diagnostics['innovation_norm_sq']:.1f}") --- Least Mean Squares (LMS) Adaptation ==================================== For problems where you want to minimize prediction error directly, use adaptive LMS filtering. Algorithm --------- At each step, update filter weights to minimize squared error: $$\mathbf{w}_{k+1} = \mathbf{w}_k + \mu \mathbf{x}_k (z_k - \mathbf{w}_k^T \mathbf{x}_k)$$ where $\mu$ is the step size (learning rate). .. code-block:: python class LMSFilter: """ Least Mean Squares adaptive filter. Minimizes MSE directly using gradient descent on filter weights. """ def __init__(self, filter_order: int, step_size: float = 0.01): """ Parameters ---------- filter_order : int Number of filter taps (order) step_size : float Learning rate μ (typically 0 < μ < 1/(3*power)) """ self.order = filter_order self.mu = step_size self.weights = np.zeros(filter_order) self.buffer = np.zeros(filter_order) self.error_history = [] def update(self, measurement: float) -> tuple: """ Update filter with new measurement. Parameters ---------- measurement : float New input sample Returns ------- (estimate, error, normalized_error_sq) """ # Shift buffer and insert new measurement self.buffer = np.roll(self.buffer, 1) self.buffer[0] = measurement # Filtering (estimate via convolution with weights) estimate = np.dot(self.weights, self.buffer) # Error error = measurement - estimate # Gradient descent update # ∂MSE/∂w = -2 * error * x (x is buffer) self.weights += self.mu * error * self.buffer error_sq = error ** 2 self.error_history.append(error_sq) return estimate, error, error_sq def set_step_size(self, mu: float) -> None: """ Dynamically adjust learning rate. Larger μ: Faster adaptation, but less stable Smaller μ: Slower adaptation, but more stability """ self.mu = mu def get_mse(self, window: int = 100) -> float: """Get recent Mean Squared Error.""" if len(self.error_history) < window: return np.mean(self.error_history) return np.mean(self.error_history[-window:]) **Example: Adaptive System Identification** .. code-block:: python # Adaptive filter to identify unknown system response # True system has impulse response [1.0, 0.5, -0.2] true_system = np.array([1.0, 0.5, -0.2]) # Adaptive LMS filter lms = LMSFilter(filter_order=3, step_size=0.05) for k in range(500): # Input signal (white noise) x_input = np.random.randn() # True system output y_true = np.dot(true_system, np.array([x_input, 0.5 * np.random.randn(), 0.1 * np.random.randn()])) # Measurement with noise z = y_true + 0.1 * np.random.randn() # LMS update y_est, error, mse = lms.update(z) if k % 100 == 0: print(f"Step {k}: Estimated weights = {lms.weights}, " f"MSE = {lms.get_mse():.4f}") --- Recursive Least Squares (RLS) Adaptation ======================================== RLS is more sophisticated than LMS: it maintains a covariance matrix of the problem and achieves faster convergence. Algorithm --------- $$\mathbf{w}_{k+1} = \mathbf{w}_k + \mathbf{K}_k (z_k - \mathbf{x}_k^T \mathbf{w}_k)$$ $$\mathbf{K}_k = \frac{\mathbf{P}_{k-1} \mathbf{x}_k}{\lambda + \mathbf{x}_k^T \mathbf{P}_{k-1} \mathbf{x}_k}$$ $$\mathbf{P}_k = \frac{1}{\lambda}\left(\mathbf{P}_{k-1} - \frac{\mathbf{P}_{k-1} \mathbf{x}_k \mathbf{x}_k^T \mathbf{P}_{k-1}}{\lambda + \mathbf{x}_k^T \mathbf{P}_{k-1} \mathbf{x}_k}\right)$$ where $\lambda$ is the forgetting factor (0.95-0.99 typical). .. code-block:: python class RLSFilter: """ Recursive Least Squares adaptive filter. Faster convergence than LMS at cost of higher computational load. """ def __init__(self, filter_order: int, forgetting_factor: float = 0.95, regularization: float = 1e-6): """ Parameters ---------- filter_order : int Number of filter coefficients forgetting_factor : float λ ∈ (0, 1]. Smaller = faster fade of old data - 1.0: full recursive LS (no forgetting) - 0.95: ~20 step effective window - 0.90: ~10 step effective window regularization : float Small regularization to prevent ill-conditioning """ self.order = filter_order self.lamda = forgetting_factor self.reg = regularization self.weights = np.zeros(filter_order) self.P = (1.0 / regularization) * np.eye(filter_order) self.buffer = np.zeros(filter_order) self.error_history = [] def update(self, measurement: float) -> tuple: """ RLS update step. Returns ------- (estimate, error, condition_number) """ # Shift buffer self.buffer = np.roll(self.buffer, 1) self.buffer[0] = measurement # Prediction error y_est = np.dot(self.weights, self.buffer) error = measurement - y_est # Compute gain matrix K_k denom = self.lamda + self.buffer @ self.P @ self.buffer K = (self.P @ self.buffer) / denom # Update weights self.weights += K * error # Update covariance matrix (Joseph form for stability) P_new = (1.0 / self.lamda) * (self.P - np.outer(K, self.buffer @ self.P)) # Regularization to maintain numerical stability eigenvalues = np.linalg.eigvals(P_new) if np.min(eigenvalues) < 1e-10: P_new += self.reg * np.eye(self.order) self.P = P_new self.error_history.append(error ** 2) cond_number = np.linalg.cond(self.P) return y_est, error, cond_number def get_mse(self, window: int = 50) -> float: """Get recent MSE.""" if len(self.error_history) < window: return np.mean(self.error_history) return np.mean(self.error_history[-window:]) **Comparison: LMS vs RLS** .. code-block:: python # Compare convergence speed true_weights = np.array([0.8, -0.3, 0.2]) lms = LMSFilter(filter_order=3, step_size=0.1) rls = RLSFilter(filter_order=3, forgetting_factor=0.95) lms_errors = [] rls_errors = [] for k in range(200): # Input x = np.random.randn() y = np.random.randn() z = 0.8*x - 0.3*y + 0.2*np.random.randn() + 0.05*np.random.randn() # LMS y_lms, err_lms, _ = lms.update(z) lms_errors.append(np.mean(lms.error_history[-10:]) if lms.error_history else 0) # RLS y_rls, err_rls, _ = rls.update(z) rls_errors.append(rls.get_mse(window=10)) import matplotlib.pyplot as plt plt.figure(figsize=(10, 4)) plt.semilogy(lms_errors, label='LMS', alpha=0.7) plt.semilogy(rls_errors, label='RLS', alpha=0.7) plt.xlabel('Iteration') plt.ylabel('MSE (log scale)') plt.legend() plt.grid(True, alpha=0.3) plt.title('Convergence Speed: LMS vs RLS') plt.tight_layout() plt.show() print(f"LMS final MSE: {lms_errors[-1]:.4f}") print(f"RLS final MSE: {rls_errors[-1]:.4f}") --- Practical Adaptive Filter Systems ================================== Multi-Sensor Adaptive Fusion ---------------------------- Combine multiple sensors with unknown noise characteristics. .. code-block:: python class AdaptiveMultiSensorFusion: """ Fuse multiple sensors with adaptive noise estimation. """ def __init__(self, num_sensors: int, state_dim: int): self.num_sensors = num_sensors self.state_dim = state_dim # Adaptive noise estimates for each sensor self.R_estimates = [np.eye(1) for _ in range(num_sensors)] self.measurement_history = [[] for _ in range(num_sensors)] # State estimate self.x = np.zeros(state_dim) self.P = np.eye(state_dim) def update_sensor(self, sensor_id: int, measurement: float, H: np.ndarray, alpha: float = 0.95) -> None: """ Update with single sensor measurement. Parameters ---------- sensor_id : int Which sensor (0 to num_sensors-1) measurement : float Scalar measurement H : (1, state_dim) array Measurement matrix alpha : float Exponential weighting for R estimation """ z = np.array([measurement]) # Predicted measurement z_pred = H @ self.x innovation = z - z_pred # Update R estimate for this sensor R_old = self.R_estimates[sensor_id] self.R_estimates[sensor_id] = (alpha * R_old + (1 - alpha) * innovation @ innovation.T) # Standard Kalman update with estimated R S = H @ self.P @ H.T + self.R_estimates[sensor_id] K = self.P @ H.T / S self.x = self.x + K @ innovation self.P = (np.eye(self.state_dim) - K @ H) @ self.P self.measurement_history[sensor_id].append(measurement) def get_sensor_reliability(self, sensor_id: int) -> dict: """Rate each sensor based on noise estimates.""" R = self.R_estimates[sensor_id] noise_std = np.sqrt(R[0, 0]) # Lower noise = more reliable reliability = 1.0 / (1.0 + noise_std) return { 'sensor_id': sensor_id, 'noise_estimate': noise_std, 'reliability_score': reliability, 'measurements': len(self.measurement_history[sensor_id]) } GPS/INS Adaptive Integration ----------------------------- Real-world example: Adaptive GPS/INS navigation with sensor degradation detection. .. code-block:: python class AdaptiveGPSINS: """ Adaptive GPS/INS integration. - INS provides high-rate state propagation - GPS provides low-rate absolute measurements - Adapts GPS measurement noise when signal degrades """ def __init__(self): # INS state: [pos_x, pos_y, vel_x, vel_y] self.x = np.array([0.0, 0.0, 1.0, 0.0]) self.P = np.eye(4) * 0.1 # Process noise (INS integration) self.Q = np.diag([0.01, 0.01, 0.001, 0.001]) ** 2 # Measurement noise estimates (GPS) self.R_gps = np.eye(2) * 1.0 # Will adapt # Tuning parameters self.R_max = 100.0 # Max allowed GPS noise self.R_min = 0.1 self.forgetting_factor = 0.95 self.gps_measurement_count = 0 def ins_predict(self, dt: float, accel: np.ndarray) -> None: """Propagate INS.""" F = np.eye(4) F[0, 2] = dt F[1, 3] = dt # State transition self.x = F @ self.x # Simplified (IRL use 9-state IMU model) # Covariance propagation self.P = F @ self.P @ F.T + self.Q def gps_update(self, gps_pos: np.ndarray) -> dict: """ GPS measurement update with adaptive noise. Detects multipath/signal degradation. """ H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]]) z = gps_pos z_pred = H @ self.x innovation = z - z_pred # Innovation covariance S = H @ self.P @ H.T + self.R_gps # Check if innovation is unusually large (signal loss) innovation_norm_sq = innovation @ np.linalg.inv(S) @ innovation expected_norm = 2.0 # Expected χ²(2) is_degraded = innovation_norm_sq > 2.0 * expected_norm # Adaptive R: Increase if GPS is degraded, decrease if good if is_degraded: # GPS signal degraded → reduce gain self.R_gps = np.minimum(self.R_gps * 1.2, self.R_max * np.eye(2)) else: # GPS signal good → normal noise self.R_gps = (self.forgetting_factor * self.R_gps + (1 - self.forgetting_factor) * np.outer(innovation, innovation)) self.R_gps = np.clip(self.R_gps[0, 0], self.R_min, self.R_max) * np.eye(2) # Standard Kalman update K = self.P @ H.T @ np.linalg.inv(S) self.x = self.x + K @ innovation self.P = (np.eye(4) - K @ H) @ self.P self.gps_measurement_count += 1 return { 'is_degraded': is_degraded, 'innovation_norm_sq': innovation_norm_sq, 'gps_noise_std': np.sqrt(self.R_gps[0, 0]), 'gps_count': self.gps_measurement_count } --- Diagnostic Tools ================ Monitor and debug adaptive filters. .. code-block:: python class AdaptiveFilterDiagnostics: """Comprehensive diagnostics for adaptive filters.""" def __init__(self, filter_object): self.filter = filter_object self.history = { 'innovations': [], 'residual_stats': [], 'covariance_norms': [], 'condition_numbers': [], 'divergence_indicators': [] } def log_step(self, innovation: np.ndarray, covariance: np.ndarray) -> None: """Log diagnostic data after each filter step.""" self.history['innovations'].append(innovation.copy()) self.history['covariance_norms'].append(np.linalg.norm(covariance)) self.history['condition_numbers'].append(np.linalg.cond(covariance)) # Innovation stats innov_stats = { 'mean': np.mean(innovation), 'std': np.std(innovation), 'norm': np.linalg.norm(innovation), } self.history['residual_stats'].append(innov_stats) def plot_diagnostics(self) -> None: """Generate diagnostic plots.""" import matplotlib.pyplot as plt fig, axes = plt.subplots(2, 2, figsize=(12, 8)) # Innovation norms innovation_norms = [np.linalg.norm(y) for y in self.history['innovations']] axes[0, 0].plot(innovation_norms, alpha=0.7) axes[0, 0].set_title('Innovation Magnitude Over Time') axes[0, 0].set_ylabel('||innovation||') axes[0, 0].grid(True, alpha=0.3) # Condition number axes[0, 1].semilogy(self.history['condition_numbers'], alpha=0.7) axes[0, 1].set_title('Covariance Matrix Condition Number') axes[0, 1].set_ylabel('cond(P)') axes[0, 1].grid(True, alpha=0.3) # Covariance trace axes[1, 0].plot(self.history['covariance_norms'], alpha=0.7) axes[1, 0].set_title('State Uncertainty (||P||)') axes[1, 0].set_ylabel('||P||_F') axes[1, 0].grid(True, alpha=0.3) # Innovation mean trend means = [s['mean'] for s in self.history['residual_stats']] axes[1, 1].plot(means, alpha=0.7, label='Mean') axes[1, 1].axhline(y=0, color='r', linestyle='--', alpha=0.5) axes[1, 1].set_title('Innovation Mean (Should Be ≈ 0)') axes[1, 1].set_ylabel('Mean innovation') axes[1, 1].grid(True, alpha=0.3) axes[1, 1].legend() plt.tight_layout() return fig --- Tuning Guidelines ================= **Choose Adaptation Strategy:** +---------------------------+---------------+----------+------------------+ | Method | Convergence | Robustness | Computational | +===========================+===============+==========+==================+ | Static tuning | N/A (fixed) | High | Low | +---------------------------+---------------+----------+------------------+ | Gain adaptation (β scaling)| Medium | High | Very Low | +---------------------------+---------------+----------+------------------+ | R estimation (adaptive) | Medium | Medium | Low | +---------------------------+---------------+----------+------------------+ | LMS | Slow | Low | Very Low | +---------------------------+---------------+----------+------------------+ | RLS | Fast | Medium | Medium | +---------------------------+---------------+----------+------------------+ **Parameter Selection:** - **Forgetting factor** ($\alpha$ or $\lambda$): - Fast-changing system: $\alpha = 0.90$ (short memory ~10 steps) - Medium dynamics: $\alpha = 0.95$ (medium memory ~20 steps) - Slow dynamics: $\alpha = 0.99$ (long memory ~100 steps) - **LMS step size** ($\mu$): - Too large: Instability, oscillation - Too small: Slow convergence - Typical: $\mu = 0.01$ to $0.1$ - Rule: $\mu < \frac{1}{3 \times \text{signal power}}$ - **Gain scaling limits**: - Prevent extreme swings: $0.5 \leq \beta \leq 1.5$ - More conservative: $0.7 \leq \beta \leq 1.3$ --- Common Pitfalls =============== 1. **Over-adaptation**: Tracking noise instead of signal - *Fix*: Increase forgetting factor, add regularization 2. **Singularity/ill-conditioning**: Covariance matrix becomes rank-deficient - *Fix*: Add regularization, limit condition numbers, use Joseph form 3. **Divergence undetected**: Filter fails silently - *Fix*: Monitor normalized innovation statistics continuously 4. **Poor initialization**: Adaptive phase leads to bad estimates - *Fix*: Start with conservative (high) noise estimates 5. **Sensor bias**: Adaptive filter cannot correct for systematic bias - *Fix*: Detect and remove bias separately (e.g., least-squares batch estimation) --- See Also ~~~~~~~~ - [Kalman Filter Tuning Guide](kalman_filter_tuning.rst) — Static parameter selection - [Particle Filters](particle_filters.rst) — Non-adaptive but handles non-Gaussian distributions - [Smoothing Algorithms](smoothing.rst) — Batch refinement after filtering - [Troubleshooting Guide](troubleshooting.rst) — Filter divergence diagnosis **References:** - Bar-Shalom, Li & Kirubarajan (2001) — *Estimation with Applications to Tracking and Navigation* - Grewal & Andrews (2015) — *Kalman Filtering Theory and Practice* - Haykin (2002) — *Adaptive Filter Theory* — Comprehensive LMS/RLS theory