🧮 The ISBL Master Equation
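The core logic of the sampler is governed by the following dynamic probability distribution:

$$
P(x_i \mid c, t) = \frac{L_i(t)^{\alpha} + \epsilon}{\sum_{x_j \in c} \left( L_j(t)^{\alpha} + \epsilon \right)}
$$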
🔍 Nomenclature & Parameter Breakdown
- $P(x_i \mid c, t)$: The conditional probability of selecting a specific data sample $x_i$ from class pool $c$ at training step $t$.
- $x_i, x_j$: Individual data samples (e.g., specific audio clips) within the dataset. Here, $x_i$ is the target sample being evaluated, while $x_j$ ranges over every sample in the same pool (including $x_i$ itself) during summation.
- $c$ (Class/Category Pool): A distinct subset or category of data (e.g., `targets` or `negatives`), isolated via the dataset's index pools.
- $t$ (Training Step / Time): The current iteration of the training loop, which fixes the state of the sampling probabilities at that moment.
- $L_i(t)$ (Loss/Hardness Score): The individual loss value computed for sample $x_i$ during its most recent forward pass as of step $t$. Higher loss signifies higher "hardness". (Note: At $t = 0$, before any training occurs, all scores are initialized to the same value, so every sample in a pool starts out equally likely.)
- $\alpha$ (Smoothing Factor): A hyperparameter set to `0.75`. It acts as a contrast control that dampens extreme loss values. This prevents unlearnable, corrupted, or heavily noisy audio clips from dominating the batch gradients and causing model collapse.
- $\epsilon$ (Epsilon / Stability Constant): A tiny positive constant set to `1e-6`, serving a dual purpose:
  - Mathematical Safety: Prevents division-by-zero errors or absolute zero probabilities when a sample is perfectly learned.
  - Catastrophic Forgetting Prevention: As the model converges and all individual losses drop near zero ($L_i(t) \to 0$), the equation naturally transitions into a uniform random sampler ($P(x_i \mid c, t) \to 1/N_c$, where $N_c$ is the number of samples in pool $c$), ensuring balanced baseline revision in later training stages.
- $\sum_{x_j \in c}$ (Summation Over Class): The summation operator (Sigma) that aggregates the computed scores of all individual samples belonging to class $c$. Dividing the single sample's score by this total normalizes the output into a valid probability distribution: every value lies between $0$ and $1$, and the values within a pool sum to $1$.
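
The parameters above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual sampler: the function names `sampling_probabilities` and `draw_batch`, the clip ids, and the loss values are all hypothetical. It also assumes that a sample's score is its loss raised to the smoothing factor, plus the stability constant, which is one reading of the definitions above.

```python
import random

ALPHA = 0.75    # smoothing factor, value taken from the text
EPSILON = 1e-6  # stability constant, value taken from the text


def sampling_probabilities(losses, alpha=ALPHA, epsilon=EPSILON):
    """Return one selection probability per sample in a single class pool.

    Assumption of this sketch: score = loss ** alpha + epsilon.
    """
    if not losses:
        raise ValueError("pool must contain at least one sample")
    if any(loss < 0 for loss in losses):
        raise ValueError("losses must be non-negative")
    scores = [loss ** alpha + epsilon for loss in losses]
    total = sum(scores)  # the summation over the class pool
    return [score / total for score in scores]


def draw_batch(pool_ids, losses, batch_size, rng=random):
    """Draw sample ids from one pool, with replacement, weighted by hardness."""
    probs = sampling_probabilities(losses)
    return rng.choices(pool_ids, weights=probs, k=batch_size)


if __name__ == "__main__":
    # Hypothetical pool: two easy clips and one clip with a very large loss.
    pool_ids = ["clip_a", "clip_b", "clip_c"]
    losses = [0.1, 0.2, 5.0]

    smoothed = sampling_probabilities(losses)          # alpha = 0.75
    raw = sampling_probabilities(losses, alpha=1.0)    # no smoothing
    print("alpha=0.75:", [round(p, 3) for p in smoothed])
    print("alpha=1.00:", [round(p, 3) for p in raw])

    # Fully converged pool: every loss is zero, so every score equals epsilon
    # and the probabilities come out (numerically) uniform.
    print("converged:", sampling_probabilities([0.0, 0.0, 0.0, 0.0]))

    print("batch:", draw_batch(pool_ids, losses, batch_size=4))
```

Comparing the two printed rows shows the contrast-control effect: with the smoothing factor at `0.75`, the high-loss clip still gets the largest share, but a smaller one than it would with raw losses. The converged pool illustrates the fallback to uniform sampling.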