Building effective AI agents means more than simply choosing a powerful language model. As the Manus project has discovered, the way you design and manage the "context" – the information the AI processes to make decisions – is essential. This "context engineering" directly impacts an agent's speed, cost, reliability, and intelligence.
Early on, the choice was clear: bet on the in-context learning of frontier models rather than fine-tuning. This allows rapid improvements, shipping changes in hours instead of weeks, keeping the product adaptable as AI capabilities evolve. This path turned out to be far from simple, however, leading to several framework rebuilds through what they affectionately call "Stochastic Graduate Descent" – a process of experimental guesswork.
Here are the critical lessons learned at Manus for effective context engineering:
1. Design around the KV-Cache
The KV-cache is vital for agent performance, directly affecting latency and cost. Agents continuously append actions and observations to their context, which makes the input far longer than the output. The KV-cache reuses identical context prefixes, dramatically reducing processing time and cost (for example, a 10x cost difference between cached and uncached tokens with Claude Sonnet).
To maximize KV-cache hits:
- Stable prompt prefixes: Even a single-token change at the start of your system prompt can invalidate the cache. Avoid dynamic elements such as precise timestamps.
- Append-only context: Never modify past actions or observations. Ensure deterministic serialization of data (such as JSON with stable key ordering) to avoid subtle cache breaks, as in the sketch after this list.
- Explicit cache breakpoints: Some frameworks require manually inserting cache breakpoints, ideally placed right after the system prompt.
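To make these rules concrete, here is a minimal Python sketch combining a static prompt prefix, deterministic serialization, and an append-only history. The names are hypothetical; Manus has not published its implementation.

```python
import json

SYSTEM_PROMPT = "You are a helpful agent."  # static: no timestamps or random IDs

def serialize_observation(obs: dict) -> str:
    # sort_keys=True makes the output byte-identical for identical data,
    # so re-serialization never silently invalidates the cached prefix.
    return json.dumps(obs, sort_keys=True, ensure_ascii=False)

def build_context(history: list[dict], new_event: dict) -> list[dict]:
    # Append-only: past entries are never rewritten or reordered.
    history.append({"role": "user", "content": serialize_observation(new_event)})
    return [{"role": "system", "content": SYSTEM_PROMPT}, *history]
```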
2. Mask, do not remove
As agents gain more tools, their action space grows complex, potentially making the agent "dumber" because it struggles to choose correctly. Although dynamically loading tools may seem intuitive, it invalidates the KV-cache and confuses the model when past context refers to tools that are no longer defined.
Manus uses a context-aware state machine to manage tool availability by masking token logits during decoding. This prevents the model from selecting unavailable or inappropriate actions without modifying the underlying tool definitions, keeping the context stable and the agent focused.
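A rough sketch of the idea, assuming hypothetical token IDs and a simple allowed-tools set standing in for a real state machine and decoder:

```python
import math

# Map each tool name to the token ID that begins its function-call output.
# These IDs are illustrative, not from any real tokenizer.
TOOL_NAME_TOKEN = {"browser_open": 1001, "shell_exec": 1002, "file_write": 1003}

def mask_logits(logits: list[float], allowed_tools: set[str]) -> list[float]:
    # Tools stay defined in the (cached) context; we only make their
    # opening tokens unsampleable when the current state forbids them.
    blocked = {tok for name, tok in TOOL_NAME_TOKEN.items()
               if name not in allowed_tools}
    return [-math.inf if token_id in blocked else score
            for token_id, score in enumerate(logits)]

# e.g. masked = mask_logits(raw_logits, allowed_tools={"browser_open"})
```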
3. Use the file system as context
Even with large context windows (128K+ tokens), real-world agent observations (such as web pages or PDFs) can easily exceed the limits, degrade performance, and incur high costs. Irreversible compression risks losing crucial information needed in future steps.
Manus treats the file system as the ultimate, unlimited context. The agent learns to read and write files on demand, using the file system as externalized, structured memory. Compression strategies are always designed to be restorable (for example, keeping a URL while dropping the page content), effectively shrinking context length without permanent data loss.
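One way restorable compression could look, as a sketch with an assumed scratch directory and a read-file tool available to the agent:

```python
import hashlib
from pathlib import Path

WORKDIR = Path("agent_workspace")  # hypothetical scratch directory

def compress_web_observation(url: str, page_content: str) -> dict:
    """Offload the bulky content to a file and keep only a pointer in
    context. Nothing is lost, only moved out of the window."""
    WORKDIR.mkdir(exist_ok=True)
    path = WORKDIR / (hashlib.sha256(url.encode()).hexdigest()[:12] + ".html")
    path.write_text(page_content, encoding="utf-8")
    # The agent can restore the full content later via a read-file tool call.
    return {"url": url, "saved_to": str(path)}
```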
4. Manipulate attention by recitation
Agents can lose focus or forget long-term objectives in complex, multi-step tasks. Manus tackles this by having the agent constantly rewrite a todo.md file. By reciting its objectives and progress at the end of the context, the model's attention is biased toward its global plan, mitigating "lost-in-the-middle" problems and reducing goal misalignment. This exploits natural language to bias the AI's focus without architectural changes.
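A minimal sketch of the recitation pattern; the todo format and helper name are assumptions, not the exact Manus implementation:

```python
def recite_plan(context: list[dict], todo: list[tuple[str, bool]]) -> list[dict]:
    # Re-append the refreshed plan at the END of the context so it sits in
    # the model's most recent attention span, countering lost-in-the-middle.
    lines = [f"- [{'x' if done else ' '}] {task}" for task, done in todo]
    return [*context, {"role": "user", "content": "todo.md:\n" + "\n".join(lines)}]

# e.g. context = recite_plan(context, [("fetch data", True), ("write report", False)])
```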
5. Keep bad things in
Agents will make mistakes – hallucinating, hitting errors, misbehaving. The natural impulse is to clean up these failures. However, Manus found that leaving failed actions and observations in the context implicitly updates the model's internal beliefs. Seeing its own errors helps the agent learn and reduces the risk of repeating the same mistake, making error recovery a key indicator of true agentic behavior.
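In loop form, this amounts to recording failures as ordinary observations rather than stripping them out; a sketch under assumed role names:

```python
def run_tool(tool_call, context: list[dict]) -> list[dict]:
    try:
        result = tool_call()  # any zero-argument callable wrapping a real tool
        context.append({"role": "tool", "content": f"OK: {result}"})
    except Exception as exc:
        # Keep the failure visible: the error text becomes evidence that
        # shifts the model's beliefs away from repeating the same action.
        context.append({"role": "tool", "content": f"ERROR: {exc!r}"})
    return context
```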
6. Do not get few-shotted
Although few-shot prompting is powerful for LLMs, it can backfire in agents by encouraging mimicry and suboptimal, repetitive behavior. When the context is too uniform, full of similar action-observation pairs, the agent can fall into a rut, causing drift or hallucination.
The solution is controlled diversity. Manus introduces small variations in serialization templates, phrasing, or formatting within the context. This structured "noise" helps break repetitive patterns and shifts the model's attention, preventing it from getting stuck rigidly imitating its past actions.
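One simple way such variation might be introduced, sketched with made-up templates – the wording varies per step, the facts never do:

```python
import random

# Several semantically equivalent templates for the same observation.
TEMPLATES = [
    "Observation: {obs}",
    "Result of the last action:\n{obs}",
    "The environment returned: {obs}",
]

def render_observation(obs: str, rng: random.Random) -> str:
    # Vary the form, never the content.
    return rng.choice(TEMPLATES).format(obs=obs)

# e.g. render_observation("file saved", random.Random(42))
```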
In conclusion, context engineering is a young but critical field for AI agents. It goes beyond raw model power, dictating how an agent manages memory, interacts with its environment, and learns from feedback. Mastering these principles is essential for building robust, scalable, and intelligent AI agents.
