Ultron doesn’t inject all memories into every conversation. That would flood context with irrelevant information. Instead, a targeted recall system selects exactly the entries most relevant to the current task.
How recall works
Side-query runs
Before each agent turn, a separate fast-model side-query runs against the memory store. It receives the current task objective and searches for relevant entries.
Candidates scored
Memory entries are ranked by semantic relevance to the current objective. Recency is a secondary factor — fresh entries score slightly higher when relevance is equal.
Deduplication applied
Entries that were already surfaced in recent turns are skipped. This prevents the same memory from appearing repeatedly and wasting context.
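The scoring and deduplication steps above can be sketched roughly as follows. This is a minimal illustration, not Ultron's actual implementation: `MemoryEntry`, `select_memories`, the precomputed `relevance` score, and the `age_days` field are all hypothetical names, and the side-query model that would produce the relevance scores is left out. Recency is used only as a tie-breaker, matching the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    entry_id: str
    relevance: float  # assumed: semantic relevance to the objective, already computed
    age_days: float   # assumed: time since the entry was written


def select_memories(candidates, recently_surfaced, cap=5):
    """Rank by relevance, break ties by recency, skip recently surfaced entries."""
    # Deduplication: drop anything already shown in recent turns.
    fresh = [m for m in candidates if m.entry_id not in recently_surfaced]
    # Higher relevance first; among equally relevant entries, newer first.
    ranked = sorted(fresh, key=lambda m: (-m.relevance, m.age_days))
    return ranked[:cap]


candidates = [
    MemoryEntry("a", 0.90, 30),
    MemoryEntry("b", 0.90, 2),
    MemoryEntry("c", 0.95, 100),
    MemoryEntry("d", 0.40, 1),
]
picked = select_memories(candidates, recently_surfaced={"c"}, cap=2)
print([m.entry_id for m in picked])  # ['b', 'a']
```

Here entry `c` is the most relevant but was surfaced recently, so it is skipped; `a` and `b` tie on relevance, and the fresher `b` ranks first.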
Why the cap is 5
Intuition says more context is better. In practice, it isn’t. With more than 5 memories injected, two things happen:
- Irrelevant entries dilute the signal of relevant ones
- Token budget fills up faster, triggering compression sooner
Memory drift defense
Before any memory entry is injected, Ultron validates it:
- File paths mentioned: do they still exist?
- Functions referenced: are they still in the codebase?
- External references: are they still current?
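As a rough sketch of what the first two checks could look like, the function below tests a memory entry's file paths and function names against a repository on disk. Everything here is an assumption for illustration: `is_still_valid` is a hypothetical name, the function lookup only scans Python files for `def` statements, and the external-reference check is omitted because the page does not say how it is performed.

```python
import re
from pathlib import Path


def is_still_valid(paths, functions, repo_root):
    """Return True only if every referenced path and function still exists."""
    root = Path(repo_root)

    # File paths mentioned: each one must still exist under the repo root.
    if any(not (root / p).exists() for p in paths):
        return False

    # Functions referenced: each must still be defined in some .py file.
    # (Assumption: a plain regex scan for `def name` is good enough here.)
    sources = [f.read_text(errors="ignore") for f in root.rglob("*.py")]
    for name in functions:
        pattern = re.compile(
            rf"^\s*(?:async\s+)?def\s+{re.escape(name)}\b", re.MULTILINE
        )
        if not any(pattern.search(src) for src in sources):
            return False

    return True
```

An entry that fails either check would be withheld from injection rather than passed to the agent as if it were still accurate.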