hall_of_fame
RL4CRN.utils.hall_of_fame
Hall-of-Fame utilities for reinforcement-learning over CRN environments.
This module implements a small, efficient Hall of Fame (HoF) container that
keeps the best-performing environment snapshots seen so far during training.
Items are ranked by a scalar objective (typically the latest task reward/loss
stored in crn_env.state.last_task_info['reward']). The HoF supports:
- Bounded capacity via a heap-backed structure (fast add/replace of the worst item).
- Deduplication via a signature map (keeps only the best version per signature).
- Fast random sampling for replay-style training.
- Ranked iteration / indexing (best → worst) using a lazily rebuilt sorted cache.
Conventions
- The HoF is designed to keep low-loss (or equivalently high-quality) entries.
- Internally, entries store score = -loss so that a higher score means better.
- Environment snapshots are cloned on insertion to avoid later mutation.
HoFItem
Container for a single Hall-of-Fame entry.
Instances are ordered so they can be stored in a min-heap (heapq), where
the heap root represents the worst entry currently kept (highest loss /
lowest quality). Ties are broken by timestamp to ensure deterministic heap
behavior when scores match.
| PARAMETER | DESCRIPTION |
|---|---|
| loss | Objective value to minimize (lower is better). |
| signature | Hashable identifier for the environment structure/state. This implementation stores |
| timestamp | Time of insertion/update (e.g., |
| env | Snapshot of the environment to store (should be clone-safe). |
__lt__(other)
Heap ordering: worst entries compare as "smaller".
We primarily compare by score (=-loss). Smaller score means worse.
On ties, older timestamps are considered smaller.
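The ordering rule above can be illustrated with a minimal sketch. `SketchItem` is a hypothetical stand-in for `HoFItem`, not the module's code; only the field names (taken from the parameter table) and the comparison rule come from the description, and the tuple comparison is one possible way to express it.

```python
import heapq
from dataclasses import dataclass
from typing import Any, Hashable


@dataclass(eq=False)
class SketchItem:
    """Hypothetical stand-in for HoFItem (assumed layout, for illustration only)."""
    loss: float
    signature: Hashable
    timestamp: float
    env: Any = None

    @property
    def score(self) -> float:
        # Higher score means better, so the score is the negated loss.
        return -self.loss

    def __lt__(self, other: "SketchItem") -> bool:
        # Worse entries compare as smaller; on equal scores the older
        # timestamp is the smaller one.
        return (self.score, self.timestamp) < (other.score, other.timestamp)


heap = []
for loss, sig, ts in [(0.5, "a", 1.0), (2.0, "b", 2.0), (0.1, "c", 3.0), (2.0, "d", 0.5)]:
    heapq.heappush(heap, SketchItem(loss, sig, ts))

# The heap root is the worst entry: highest loss, and the older of the two ties.
worst = heap[0]
```

Because `heapq` only relies on `__lt__`, defining that single method is enough for the min-heap to keep the worst entry at index 0.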
assign(other)
In-place update of this entry's contents from another HoFItem.
This is used to refresh an existing signature with a better score
without creating a new object (helps keep signature_map references valid).
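A short sketch of why an in-place update keeps references valid. The class below is a hypothetical simplification, and the `signature_map` and `heap` variables are plain local containers used only to show object identity.

```python
class SketchItem:
    """Hypothetical stand-in for HoFItem with only the fields needed here."""

    def __init__(self, loss, signature, timestamp, env=None):
        self.loss = loss
        self.signature = signature
        self.timestamp = timestamp
        self.env = env

    def assign(self, other):
        # Copy every field from `other` into this object, keeping its identity.
        self.loss = other.loss
        self.signature = other.signature
        self.timestamp = other.timestamp
        self.env = other.env


stored = SketchItem(loss=1.0, signature="sig", timestamp=1.0, env="old snapshot")
signature_map = {"sig": stored}
heap = [stored]

stored.assign(SketchItem(loss=0.2, signature="sig", timestamp=2.0, env="new snapshot"))

# Both containers still point at the same object and see the new values.
same_object = signature_map["sig"] is heap[0]
```

With more than one entry, changing a score in place can violate the heap invariant, so an implementation would presumably restore heap order afterwards; how the module handles this is not stated here.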
HallOfFame
Fixed-capacity Hall-of-Fame for environment snapshots.
Maintains up to max_size unique entries keyed by a state/environment
signature. When adding:
- If the signature already exists, the entry is updated only if it is better.
- If the HoF is full, the new entry replaces the current worst entry only if it is better.
The internal heap is optimized for fast worst-item access/replacement. Ranked access (best→worst) is provided via a lazily rebuilt sorted cache.
| PARAMETER | DESCRIPTION |
|---|---|
| max_size | Maximum number of entries to keep. |
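The insertion policy described above (update on a matching signature only if better, replace the worst only if better when full) can be sketched as follows. `MiniHoF` is a hypothetical, simplified model that stores plain payloads instead of cloned environments and assumes `max_size >= 1`; it is not the module's implementation.

```python
import heapq
import itertools


class MiniHoF:
    """Hypothetical model of the add policy; assumes max_size >= 1."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._heap = []          # entries are [score, tick, signature, payload]
        self._by_signature = {}  # signature -> the same list object held in the heap
        self._tick = itertools.count()  # stand-in for a timestamp, always unique

    def add(self, loss, signature, payload):
        score = -loss  # higher score means better
        existing = self._by_signature.get(signature)
        if existing is not None:
            if score <= existing[0]:
                return False  # the stored version is at least as good
            existing[0], existing[1], existing[3] = score, next(self._tick), payload
            heapq.heapify(self._heap)  # restore heap order after the in-place change
            return True
        entry = [score, next(self._tick), signature, payload]
        if len(self._heap) < self.max_size:
            heapq.heappush(self._heap, entry)
        elif score > self._heap[0][0]:
            evicted = heapq.heapreplace(self._heap, entry)  # drop the worst entry
            del self._by_signature[evicted[2]]
        else:
            return False  # full, and the candidate is not better than the worst
        self._by_signature[signature] = entry
        return True


hof = MiniHoF(max_size=2)
results = [
    hof.add(1.0, "a", "env-a"),
    hof.add(0.5, "b", "env-b"),
    hof.add(2.0, "c", "env-c"),   # rejected: worse than the current worst
    hof.add(0.8, "a", "env-a2"),  # same signature, lower loss: updated in place
    hof.add(0.1, "d", "env-d"),   # better than the worst: replaces it
]
kept = sorted(hof._by_signature)
```

Re-heapifying after an in-place update costs O(n); it is used here only to keep the sketch short and correct.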
add(crn_env)
Add a CRN environment snapshot to the Hall of Fame.
The entry's loss is read from crn_env.state.last_task_info['reward'] (the value is treated as a loss, so lower is better),
and the deduplication key is taken from crn_env.state.get_bool_signature().
Notes
- The environment is cloned before storage to prevent later mutation.
- If a matching signature already exists, it is updated in-place only if the new candidate is better (lower loss / higher score).
| PARAMETER | DESCRIPTION |
|---|---|
| crn_env | Environment instance expected to provide: |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the environment does not expose the expected reward field. |
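A minimal sketch of how the reward field might be read and a missing field turned into a `ValueError`. The helper name `read_loss`, the set of caught exceptions, and the `SimpleNamespace` test doubles are assumptions for illustration; how the module actually detects a missing field is not stated here.

```python
from types import SimpleNamespace


def read_loss(crn_env):
    """Hypothetical helper: return the scalar objective or raise ValueError."""
    try:
        return float(crn_env.state.last_task_info["reward"])
    except (AttributeError, KeyError, TypeError) as exc:
        raise ValueError(
            "environment does not expose state.last_task_info['reward']"
        ) from exc


# Test doubles that mimic only the attribute path used above.
ok_env = SimpleNamespace(state=SimpleNamespace(last_task_info={"reward": 0.25}))
bad_env = SimpleNamespace(state=SimpleNamespace(last_task_info={}))

loss = read_loss(ok_env)
```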
add_all(crn_envs)
Add a collection of environments to the Hall of Fame.
| PARAMETER | DESCRIPTION |
|---|---|
| crn_envs | Iterable of environments compatible with |
sample(batch_size)
Uniformly sample stored environments (unordered).
Sampling does not require sorting and is therefore fast.
| PARAMETER | DESCRIPTION |
|---|---|
| batch_size | Number of samples to draw. |

| RETURNS | DESCRIPTION |
|---|---|
| list | A list of sampled environment snapshots (length ≤ batch_size). |
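One way to obtain a result of length at most `batch_size` without sorting is to cap the sample size at the number of stored items. The sketch below assumes sampling without replacement, which the description does not confirm; `sample_envs` and its `rng` parameter are hypothetical names.

```python
import random


def sample_envs(stored_envs, batch_size, rng=random):
    """Hypothetical helper: draw up to batch_size items, no sorting needed."""
    k = min(batch_size, len(stored_envs))  # never ask for more than is stored
    return rng.sample(stored_envs, k)


stored = ["env-a", "env-b", "env-c"]
small_batch = sample_envs(stored, 2, rng=random.Random(0))
capped_batch = sample_envs(stored, 10, rng=random.Random(0))
```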
__iter__()
Iterate over stored environments ranked from best to worst.
| YIELDS | DESCRIPTION |
|---|---|
| env | Environment snapshots ordered by increasing loss (best first). |
__getitem__(index)
Get the environment snapshot by rank.
| PARAMETER | DESCRIPTION |
|---|---|
| index | Rank index where 0 is best, 1 is second-best, etc. |

| RETURNS | DESCRIPTION |
|---|---|
| env | The environment snapshot at the requested rank. |
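Ranked iteration and indexing over a lazily rebuilt sorted cache could look like the sketch below. `RankedView` is a hypothetical class that stores `(loss, env)` pairs in a plain list; it only illustrates the idea of invalidating the cache on mutation and rebuilding it on the next ranked access.

```python
class RankedView:
    """Hypothetical sketch of a lazily rebuilt best-to-worst cache."""

    def __init__(self):
        self._entries = []   # (loss, env) pairs in arbitrary order
        self._sorted = None  # None marks the cache as stale

    def add(self, loss, env):
        self._entries.append((loss, env))
        self._sorted = None  # any mutation invalidates the cache

    def _ranked(self):
        # Sort only when a ranked view is requested and the cache is stale.
        if self._sorted is None:
            self._sorted = sorted(self._entries, key=lambda pair: pair[0])
        return self._sorted

    def __iter__(self):
        return (env for _, env in self._ranked())

    def __getitem__(self, index):
        return self._ranked()[index][1]

    def __len__(self):
        return len(self._entries)


view = RankedView()
view.add(0.5, "env-b")
view.add(0.1, "env-a")
view.add(0.9, "env-c")
ranked = list(view)
best = view[0]
```

Repeated ranked reads between insertions reuse the same sorted list, so the sort cost is paid at most once per batch of changes.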
__len__()
Return the number of entries currently stored in the Hall of Fame.