Skip to content

[v0.5.0] Implement Out-of-Bag (OOB) Error Estimation for Random Forests #31

@noahgift

Description

@noahgift

Overview

Implement Out-of-Bag (OOB) error estimation for RandomForestClassifier and RandomForestRegressor. OOB provides a validation estimate without needing a separate test set.

Background

When training Random Forests with bootstrap sampling, approximately 37% of samples are "out-of-bag" (not included) for each tree. These OOB samples can be used for validation, providing an unbiased estimate of test error.

Implementation Details

Core Algorithm

For each sample in training data:

  1. Find all trees where this sample was OOB (not in bootstrap sample)
  2. Aggregate predictions from those trees only
  3. Compare OOB prediction with true label/value
  4. Compute OOB score across all samples

API Design

For RandomForestClassifier:

impl RandomForestClassifier {
    // Existing methods...
    
    /// Computes out-of-bag score (accuracy for classification)
    ///
    /// Returns None if bootstrap=false or no OOB samples available
    pub fn oob_score(&self) -> Option<f32>;
    
    /// Gets out-of-bag predictions for training samples
    ///
    /// Returns None if bootstrap=false
    pub fn oob_prediction(&self) -> Option<Vec<usize>>;
    
    /// Internal: track which samples were OOB for each tree
    oob_indices: Vec<Vec<usize>>,  // Per-tree OOB sample indices
}

For RandomForestRegressor:

impl RandomForestRegressor {
    // Existing methods...
    
    /// Computes out-of-bag R² score
    ///
    /// Returns None if bootstrap=false or no OOB samples available
    pub fn oob_score(&self) -> Option<f32>;
    
    /// Gets out-of-bag predictions for training samples
    ///
    /// Returns None if bootstrap=false
    pub fn oob_prediction(&self) -> Option<Vector<f32>>;
    
    /// Internal: track which samples were OOB for each tree
    oob_indices: Vec<Vec<usize>>,  // Per-tree OOB sample indices
}

Key Features

  • Automatic during fit(): OOB indices tracked during bootstrap sampling
  • Free validation: No separate test set needed
  • Unbiased estimate: OOB score ≈ test set accuracy/R²
  • Early stopping: Can use OOB score to determine optimal n_estimators

Acceptance Criteria (EXTREME TDD)

Tests (12+ comprehensive tests)

RandomForestClassifier OOB:

  • oob_score() returns accuracy on OOB samples
  • oob_prediction() returns correct length
  • OOB score close to test set accuracy (within 5%)
  • oob_score() returns None before fit()
  • All samples have at least one OOB tree
  • OOB predictions use only OOB trees (not all trees)

RandomForestRegressor OOB:

  • oob_score() returns R² on OOB samples
  • oob_prediction() returns Vector with correct length
  • OOB R² close to test set R² (within 0.1)
  • oob_score() returns None before fit()
  • Handles edge case: sample never OOB (use all trees)
  • Reproducible with random_state

Examples

  • Update examples/random_forest_iris.rs to show OOB score
  • Update examples/random_forest_regression.rs to show OOB score
  • Demonstrate OOB vs test set comparison

Documentation

  • Update book/src/ml-fundamentals/ensemble-methods.md
    • Add OOB error estimation section
    • Mathematical explanation of why OOB works
    • When to use OOB vs cross-validation
  • Update case studies with OOB usage
    • book/src/examples/random-forest.md
    • book/src/examples/random-forest-regression.md

Technical Design

OOB Sample Tracking

During fit():

for i in 0..n_estimators {
    let bootstrap_indices = _bootstrap_sample(n_samples, seed);
    
    // Track which samples are OOB (not in bootstrap sample)
    let mut oob_mask = vec![true; n_samples];
    for &idx in &bootstrap_indices {
        oob_mask[idx] = false;
    }
    let oob_indices: Vec<usize> = oob_mask.iter()
        .enumerate()
        .filter_map(|(i, &is_oob)| if is_oob { Some(i) } else { None })
        .collect();
    
    self.oob_indices.push(oob_indices);
    
    // Train tree as usual...
}

OOB Prediction

For each training sample:

pub fn oob_prediction(&self) -> Option<Vec<usize>> {
    if self.oob_indices.is_empty() {
        return None;
    }
    
    let n_samples = /* training sample count */;
    let mut predictions = Vec::with_capacity(n_samples);
    
    for sample_idx in 0..n_samples {
        // Find trees where this sample was OOB
        let mut votes = HashMap::new();
        
        for (tree_idx, oob_set) in self.oob_indices.iter().enumerate() {
            if oob_set.contains(&sample_idx) {
                let pred = self.trees[tree_idx].predict_single(sample_idx);
                *votes.entry(pred).or_insert(0) += 1;
            }
        }
        
        // Majority vote (or mean for regression)
        let prediction = /* aggregate votes */;
        predictions.push(prediction);
    }
    
    Some(predictions)
}

OOB Score Computation

Classification:

pub fn oob_score(&self) -> Option<f32> {
    let oob_preds = self.oob_prediction()?;
    
    // Compare with training labels (stored during fit)
    let correct = oob_preds.iter()
        .zip(self.y_train.iter())
        .filter(|(pred, true_label)| pred == true_label)
        .count();
    
    Some(correct as f32 / oob_preds.len() as f32)
}

Regression:

pub fn oob_score(&self) -> Option<f32> {
    let oob_preds = self.oob_prediction()?;
    
    // Compute R² with training targets
    Some(crate::metrics::r_squared(&self.y_train, &oob_preds))
}

Complexity Analysis

  • Time: O(n · n_trees) - iterate through all samples and trees
  • Space: O(n · n_trees) - store OOB indices for each tree
  • Marginal cost: Minimal overhead during fit(), one-time cost for oob_score()

Benefits

  1. Free validation: No need for separate test set
  2. Unbiased estimate: OOB error ≈ test error
  3. Model selection: Choose n_estimators using OOB score
  4. Early stopping: Stop when OOB score plateaus
  5. Confidence intervals: Bootstrap-based uncertainty estimates

Testing Strategy

  • Unit tests for OOB index tracking
  • Integration tests comparing OOB score to test set score
  • Property tests: OOB score should be within reasonable range of test score
  • Edge case: All samples in bootstrap (rare but possible)

Related Work

  • Existing: RandomForestClassifier, RandomForestRegressor
  • Requires: Store training labels/targets during fit()
  • Future: Feature importance using OOB samples (Issue #TBD)

Priority

High - Core feature for v0.5.0, enables model selection without test set

References

  • Breiman, L. (1996). "Out-of-Bag Estimation"
  • Breiman, L. (2001). "Random Forests" (Section 3.1)

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions