
Why We Engineer AI Accuracy Without “Dynamic Exemplar” Libraries

A Perspectis AI perspective for leaders: accuracy as a platform discipline of tenant-aware grounding, structured capabilities, and honesty about the limits of similarity-Q&A retrieval, not hype about prompts alone.

A plain-language perspective for leaders, clients, and teams (April 2026)


The short answer

We treat reliable AI assistance as a platform discipline: clear roles for the model, tenant-aware handling, grounded context from each customer’s permitted data, structured hand-offs where automation must not drift, and deliberate routing between internal answers, optional live research, and deeper reasoning when complexity warrants it.

We deliberately do not rely on a fashionable pattern sometimes called dynamic exemplar retrieval—maintaining a large bank of historical question-and-answer pairs and injecting the “closest” examples into every prompt. That pattern can look clever in demos; we prefer an approach that stays explainable, isolated per organisation, and aligned with duty-of-care expectations in professional services.


Why this matters in the market conversation

Headlines often reduce “better AI” to bigger models or cleverer prompts. In regulated and reputation-sensitive industries, leaders rightly ask a different question: what exactly is the system allowed to see, cite, and do—and how do we keep that stable as models and vendors change?

This note is our plain-language answer to one slice of that question: how we think about accuracy and prompt preparation inside Perspectis AI, including the Personal Agent Representative path that supports ChatWindow and related surfaces.


What “prompt engineering” means here (without hype)

Prompt engineering simply means everything we deliberately place in front of the model before it answers: instructions, permitted context, output shape, and guardrails. It is not a magic spell; it is operational briefing—the same idea as giving a senior colleague a tight mandate before they speak on behalf of the firm.


The pattern we avoid: dynamic exemplar retrieval (explained fairly)

Some systems maintain a library of example questions and answers—sometimes drawn from broad datasets or pooled histories. On each new question, they search for similar past Q&As and paste those examples into the prompt so the model can mimic tone and structure.

That can improve fluency in narrow benchmarks. It also introduces risks we care about in enterprise settings: cross-customer leakage if libraries are shared, stale or wrong “authority by similarity,” and opacity (“why did the model lean that way?”) that is hard to defend under audit.

We do not use that global exemplar-Q&A-bank approach for Perspectis AI.

Two clarifications (so nobody confuses our approach with “just another retrieval demo”)

  1. Conversation continuity — We include the current thread (recent turns and, when needed, summaries of longer histories) so the assistant stays coherent. That is the customer’s own conversation, not a retrieved set of strangers’ canned Q&A exemplars.

  2. Organisation documents — Where integrations allow, we may retrieve that organisation’s own documents (for example from a connected document system). That is permitted customer content, not a public similarity library of unrelated Q&As.


How we pursue accuracy instead (structural, not decorative)

1) Security-first staging and tenancy

Before a model produces polished language, we route requests through security-aware, organisation-scoped handling. Not every message follows one undifferentiated “chat only” path: we can branch for voice-oriented flows, specialised product areas, or patterns that warrant a structured context shortcut.

Why it matters: Accuracy starts with the right boundary—who the assistant is acting for, and which data and tools are in scope.

2) Clear instructions and honest classification

We give the model stable role instructions and classify whether a question is primarily about time tracking and billing versus more general assistance—then we align the brief accordingly. Separately, for “behind the scenes” jobs (for example routing to a capability family or choosing research sources), we often require strict machine-readable outputs so downstream logic can trust the result.

Why it matters: The model is less likely to freestyle critical routing decisions without constraints.
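For technically minded readers, the "strict machine-readable outputs" idea can be sketched in a few lines of illustrative Python. This is not Perspectis code; the label names and the `parse_intent` function are hypothetical. The point is only that downstream logic accepts a closed set of labels and rejects anything else, rather than trusting free prose.

```python
# Illustrative sketch: a routing classification is only accepted when it
# matches an exact, known label -- anything else is rejected loudly.
ALLOWED_INTENTS = {"time_tracking_billing", "general_assistance"}

def parse_intent(raw_model_output: str) -> str:
    """Normalise the model's answer and accept only a registered label,
    so routing decisions are never improvised downstream."""
    label = raw_model_output.strip().lower()
    if label not in ALLOWED_INTENTS:
        raise ValueError(f"unrecognised intent label: {label!r}")
    return label
```

A rejected label can then fall back to a safe default path instead of silently steering automation.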

3) Grounding in customer operational data—not strangers’ examples

For work-specific questions, we pull in relevant internal context (for example time entries, calendar-related signals, billing-related records, client and project context where applicable) using relevance and recency thinking—not “find the most similar historical chat from the internet.”

Why it matters: Answers become defensible because they trace to permitted operational truth, not to anonymous exemplars.
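The "relevance and recency thinking" above can be illustrated with a toy scorer. The specific formula (keyword overlap multiplied by an exponential recency decay) is an assumption for illustration only, not a description of our production ranking.

```python
def context_score(keyword_overlap: float, age_days: float,
                  half_life_days: float = 30.0) -> float:
    """Combine topical relevance with a recency decay: a record loses
    half its weight every `half_life_days` days."""
    recency = 0.5 ** (age_days / half_life_days)
    return keyword_overlap * recency

def top_context(records, query_terms, k=2):
    """records: list of (text, age_in_days) pairs from the customer's own
    operational data; returns the k best-grounded snippets."""
    def score(rec):
        text, age = rec
        terms = set(text.lower().split())
        overlap = len(terms & query_terms) / max(len(query_terms), 1)
        return context_score(overlap, age)
    return sorted(records, key=score, reverse=True)[:k]
```

Note that every input here is the customer's own record, never a pooled library of other people's Q&As.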

4) Rule-based template and policy selection where templates exist

Where we offer structured wording options (for example around time-entry descriptions), we choose among them with transparent rules (industry fit, activity type, detail level)—not by similarity search over a global Q&A museum.

Why it matters: Predictable behaviour beats surprising “creative” substitutions in compliance-adjacent workflows.
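Transparent rule-based selection looks roughly like the sketch below. The industries, activities, and template names are invented for illustration; the real point is that every choice can be explained by pointing at the branch that fired.

```python
def select_template(industry: str, activity: str, detail: str) -> str:
    """Pick a wording template with explicit, auditable rules instead of
    similarity search -- the same inputs always yield the same template."""
    if industry == "legal" and activity == "research":
        return "legal_research_detailed" if detail == "high" else "legal_research_brief"
    if activity == "meeting":
        return "meeting_summary"
    return "generic_entry"
```

Under audit, "rule 2 fired because the activity was a meeting" is a far easier conversation than "the model felt it resembled past cases".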

5) Structured outputs and registered capabilities

When automation must act, we define output shapes the platform can parse, and we connect action paths to registered capabilities exposed through application programming interfaces—so “helpful prose” and “safe execution” stay aligned.

Why it matters: Fewer mismatches between what a human read and what the system did.
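A minimal sketch of the "registered capabilities" idea, with hypothetical names throughout: structured output from the model is parsed, and only actions present in an explicit registry can execute. Unknown actions fail loudly instead of being improvised.

```python
import json

CAPABILITIES = {}

def capability(name):
    """Decorator that registers a function as an executable capability."""
    def register(fn):
        CAPABILITIES[name] = fn
        return fn
    return register

@capability("create_time_entry")
def create_time_entry(minutes: int, description: str) -> dict:
    # Illustrative stand-in for a real integration call.
    return {"status": "created", "minutes": minutes, "description": description}

def execute(structured_output: str) -> dict:
    """Parse the model's structured output and dispatch to a registered
    capability; 'helpful prose' never reaches this path."""
    payload = json.loads(structured_output)
    action = payload["action"]
    if action not in CAPABILITIES:
        raise KeyError(f"no registered capability: {action}")
    return CAPABILITIES[action](**payload["arguments"])
```

The registry is the contract: what a human read in the answer and what the system did both trace back to the same named capability.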

6) Intelligent routing: internal answers, optional research, proportionate depth

We do not treat every question identically after the first checks.

  • Internal answers come from permitted customer data when the question is about that customer’s work.

  • General knowledge may still apply when the question is not tenant-specific.

  • When fresh external facts are needed, we can run research paths that select sources (for example web search where configured, internal knowledge, context-style sources, connected documents, or a hybrid mix)—rather than always browsing the open web.

  • Deployment constraints can narrow or remove external sources so behaviour stays appropriate for locked-down environments.

  • Complexity signals can route harder questions toward deeper reasoning configurations while keeping routine traffic efficient.

Why it matters: The right kind of evidence is used for the kind of question—without defaulting to a single blunt instrument.
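The routing bullets above reduce to a small decision function. This sketch is deliberately simplified and the signal names are hypothetical, but it shows the shape: tenant-specific questions stay internal, deployment policy can veto external research, and complexity controls reasoning depth.

```python
def route(tenant_specific: bool, needs_fresh_facts: bool,
          complexity: float, external_sources_allowed: bool) -> dict:
    """Choose an evidence source and reasoning depth from simple signals.
    Policy constraints always win over the desire for fresh facts."""
    if tenant_specific:
        source = "internal_customer_data"
    elif needs_fresh_facts and external_sources_allowed:
        source = "research_path"
    else:
        source = "general_knowledge"
    depth = "deep_reasoning" if complexity > 0.7 else "standard"
    return {"source": source, "depth": depth}
```

In a locked-down deployment (`external_sources_allowed=False`), a question needing fresh facts degrades gracefully to general knowledge rather than browsing the open web.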

7) Engineering discipline: scenarios, references, and graded natural-language suites

We maintain automated checks that compare structured assistant outputs and tool usage to reference application programming interface outcomes for important integration paths, alongside broader natural-language regression suites with grading rubrics for assistant quality at scale. Browser-only risks (sessions, streaming layouts) sit in separate frontend tests where that separation matters.

Why it matters: Accuracy is treated as an ongoing property of the system—not a one-time model pick.
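One simplified way to picture "compare structured assistant outputs and tool usage to reference outcomes": diff the tool calls the assistant actually made against a reference set and emit a graded result rather than a bare pass/fail. The function below is a toy, not our test harness.

```python
def grade_tool_usage(actual_calls, reference_calls) -> dict:
    """Compare actual tool calls against a reference outcome; report a
    score plus exactly which calls were missing or unexpected."""
    expected = set(reference_calls)
    got = set(actual_calls)
    missing = expected - got
    extra = got - expected
    score = len(expected & got) / max(len(expected), 1)
    return {"score": score, "missing": sorted(missing),
            "extra": sorted(extra), "passed": not missing and not extra}
```

Graded results make regressions legible: a release that drops from 1.0 to 0.5 on an integration path names the exact missing call.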


Comparison at a glance

We intend this table for stakeholder conversations. Wording is intentionally non-technical.

| Topic | Similarity-to-example-Q&A pattern (common in some demos) | How Perspectis AI approaches the same need |
| --- | --- | --- |
| Primary grounding | Retrieved “nearest neighbour” Q&A exemplars | Permitted customer data, conversation thread, and registered capabilities |
| Personalisation mechanism | Often pooled or anonymised example banks | Tenant-scoped context and organisation-owned documents where enabled |
| Explainability | “It looked like these past cases” | Pipeline stages, classification, and reference-backed checks where applicable |
| Risk posture | Higher sensitivity to library composition and leakage | Isolation-by-design themes in our security posture; conservative source selection |
| Automation hand-off | Sometimes loose prose | Structured outputs where machines must consume the result |
| Fresh facts | Not guaranteed | Optional research paths with explicit source choices (where policy allows) |

Legend: directional comparison for positioning, not a weekly feature scorecard.


How this connects to our demo and product story

The Perspectis AI Demo Environment is where we make the abstract concrete: end-to-end professional scenarios (billing, walls, outside counsel guidelines, messaging, orchestration, and more) that only work when accuracy, separation, and accountability are treated as platform properties—not as a single prompt appended to a raw model.


Sources (external, for further reading)

  • OWASP: Top 10 for Large Language Model Applications — our industry peers increasingly use this framing for prompt injection, excessive agency, and related risks that “clever context stuffing” does not solve by itself.

  • Anthropic (context for enterprise builders): Claude Managed Agents overview — illustrates how adopting teams often still own policy around a managed agent harness, which aligns with our emphasis on the application plane.


This document is written for external, non-technical readers. Authoritative security assessments and implementation detail live in our internal security and engineering documentation.