Stop the advisor from sneaking parenthetical math into its replies
Tests / test (push) Successful in 22s
Smaller local models still tacked equations like "($749.51 / 0.267)" onto sentences despite the prompt forbidding it, and they often got the arithmetic wrong on top of inventing the equation.

Two layers of defense: hoist the no-math rule to the top of the system prompt under a CRITICAL header with concrete forbidden examples, and strip any parenthetical containing both a dollar amount and an arithmetic operator from the model's reply before it leaves the service. Plain parentheticals like "(see below)" or "($450 this month)" pass through untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
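The second layer can be tried in isolation. The following is a minimal sketch, not the service code itself: the pattern is copied from `stripBogusMath` in this commit, the first two sample strings come from the new tests, and the free-function wrapper plus the third sample are assumptions added for illustration.

```typescript
// Pattern copied from stripBogusMath in this commit: a parenthetical (with any
// leading whitespace) that contains both a "$" and an arithmetic operator.
const BOGUS_MATH = /\s*\((?=[^()]*\$)(?=[^()]*[+\-*/×÷])[^()]*\)/g;

// Hypothetical free-function wrapper; the service uses a private method.
function stripBogusMath(text: string): string {
  return text.replace(BOGUS_MATH, '');
}

const samples: string[] = [
  // From the tests in this commit.
  "You've saved $755.21 ($18,952.39 - $83,276.19).",
  'Your top category is groceries (see below) and dining out ($450 this month).',
  // Made-up sample: the hyphens in "year-to-date" count as operators.
  'Dining out is at $450 ($450 year-to-date).',
];

for (const s of samples) {
  console.log(stripBogusMath(s));
}
// prints:
// You've saved $755.21.
// Your top category is groceries (see below) and dining out ($450 this month).
// Dining out is at $450.
```

The last sample shows a trade-off of matching on "dollar sign plus operator character": a hyphen or slash inside ordinary words or dates is enough to trigger removal, so some harmless parentheticals may be dropped along with the bogus equations.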
@@ -202,7 +202,9 @@ describe('AdvisorService', () => {
       const systemMessage = body.messages.find((m: any) => m.role === 'system');
       // Guardrail against the bug where the model showed bogus "($X - $Y)" math
       // in its narrative. Wording can shift, but the keywords must remain.
-      expect(systemMessage.content).toMatch(/do not (?:invent|derive)/i);
+      expect(systemMessage.content).toMatch(
+        /(?:do not|never|don't) (?:invent|derive|write|append)/i,
+      );
       expect(systemMessage.content).toMatch(/equation|arithmetic|math/i);
     });

@@ -307,5 +309,64 @@ describe('AdvisorService', () => {
       ]);
       expect(reply).toEqual({ role: 'assistant', content: '' });
     });
+
+    // Smaller local models still slip in bogus parenthetical math
+    // ("($749.51 / 0.267)") even when the prompt forbids it. The post-
+    // processor scrubs those before they reach the user.
+    it('strips parenthetical dollar-amount math from the model reply', async () => {
+      mockOllamaOnce(
+        "Nice — you've saved $1,702.49 over last month's average of $7,049.59 ($749.51 / 0.267).",
+      );
+      const reply = await service.chat(userId, [{ role: 'user', content: 'hi' }]);
+      expect(reply.content).not.toMatch(/\$749\.51/);
+      expect(reply.content).not.toMatch(/0\.267/);
+      expect(reply.content).not.toContain('(');
+      expect(reply.content).toContain('$1,702.49');
+      expect(reply.content).toContain("$7,049.59");
+    });
+
+    it('strips subtraction-style bogus equations', async () => {
+      mockOllamaOnce("You've saved $755.21 ($18,952.39 - $83,276.19).");
+      const reply = await service.chat(userId, [{ role: 'user', content: 'hi' }]);
+      expect(reply.content).not.toContain('$18,952.39');
+      expect(reply.content).not.toContain('$83,276.19');
+      expect(reply.content).toContain('$755.21');
+    });
+
+    it('preserves legitimate parentheticals without math', async () => {
+      mockOllamaOnce(
+        'Your top category is groceries (see below) and dining out ($450 this month).',
+      );
+      const reply = await service.chat(userId, [{ role: 'user', content: 'hi' }]);
+      expect(reply.content).toContain('(see below)');
+      expect(reply.content).toContain('($450 this month)');
+    });
+  });
+
+  describe('system prompt anti-math hardening', () => {
+    it('puts the no-math rule near the top of the prompt', async () => {
+      mockOllamaOnce('opening analysis');
+      await service.getAdvice(userId);
+      const body = getFetchBody();
+      const systemMessage = body.messages.find((m: any) => m.role === 'system');
+      const content = systemMessage.content as string;
+      // The rule needs to fire before the friendly-tone direction lulls a
+      // small model into improvising. "Critical" / "never" / similar must
+      // appear in roughly the first 25% of the prompt.
+      const noMathIdx = content.search(
+        /never (?:write|invent|derive|append).*math|do not (?:invent|derive|append).*math|critical/i,
+      );
+      expect(noMathIdx).toBeGreaterThan(-1);
+      expect(noMathIdx).toBeLessThan(content.length / 4);
+    });
+
+    it('shows a concrete example of forbidden parenthetical math', async () => {
+      mockOllamaOnce('opening analysis');
+      await service.getAdvice(userId);
+      const body = getFetchBody();
+      const systemMessage = body.messages.find((m: any) => m.role === 'system');
+      // The prompt should illustrate what NOT to do, not just describe it.
+      expect(systemMessage.content).toMatch(/\(\$[\d.,]+\s*[+\-*/]\s*[\d.,$]/);
+    });
   });
 });

@@ -104,10 +104,24 @@ export class AdvisorService {
     const data = await response.json();
     return {
       role: 'assistant',
-      content: data.message?.content ?? '',
+      content: this.stripBogusMath(data.message?.content ?? ''),
     };
   }

+  // Smaller local models (llama3, etc.) periodically tack on a parenthetical
+  // equation like "($749.51 / 0.267)" to justify a number — and they often
+  // get the arithmetic wrong on top of inventing the equation. The system
+  // prompt forbids it, but negative instructions are unreliable, so we scrub
+  // any parenthetical that contains both a dollar amount AND an arithmetic
+  // operator. Plain parentheticals ("(see below)", "($450 this month)") are
+  // preserved.
+  private stripBogusMath(text: string): string {
+    return text.replace(
+      /\s*\((?=[^()]*\$)(?=[^()]*[+\-*/×÷])[^()]*\)/g,
+      '',
+    );
+  }
+
   private monthBounds(offset: number): { start: string; end: string } {
     const now = new Date();
     const start = new Date(now.getFullYear(), now.getMonth() + offset, 1);
@@ -179,12 +193,16 @@ export class AdvisorService {
         ? `Spending is up $${fmt(Math.abs(savedMore))} vs last month.`
         : 'Spending is flat vs last month.';

-    return `You're a friendly financial buddy — talk like a supportive friend, not a corporate advisor. Lead with the single most important thing the user should know: either a specific win worth celebrating or a specific concern worth addressing. Reference exact dollar amounts from the numbers below, compare this month to last month when the numbers tell a story, and keep replies conversational and short (no rigid numbered lists unless the user asks for one). Use first person ("I see...", "you're...").
+    return `CRITICAL — never write math:
+- Use only the dollar figures listed verbatim below. If a number isn't listed, do not mention it. Never derive averages, projections, ratios, or totals that aren't already labeled below.
+- Never append parenthetical arithmetic to your sentences. Forbidden examples (do NOT produce these): "($749.51 / 0.267)", "($18,952.39 - $83,276.19)", "($X * Y)". You will get the math wrong, and the user has explicitly asked us not to show our work.
+- Never subtract or divide values from different sections to make a new number.

-Rules about the numbers:
-- Use only the figures listed below. Do not invent equations, do not derive new amounts by doing arithmetic on values from different sections, and do not append parenthetical math like "($X - $Y)" to your sentences. If you cite a dollar figure, it must appear verbatim below.
+You're a friendly financial buddy — talk like a supportive friend, not a corporate advisor. Lead with the single most important thing the user should know: either a specific win worth celebrating or a specific concern worth addressing. Reference exact dollar amounts from the numbers below, compare this month to last month when the numbers tell a story, and keep replies conversational and short (no rigid numbered lists unless the user asks for one). Use first person ("I see...", "you're...").

 How to read the numbers:
 - "Standing balance" numbers are point-in-time totals (what the user has and owes right now). They are NOT this month's savings or activity. Never subtract them or use them as monthly flow.
-- "This month" numbers are flow — money that moved during the current calendar month. Use "Saved this month" when talking about how much was set aside.
+- "This month" / "Last month" numbers are flow — money that moved during that calendar month. Use "Saved this month" when talking about how much was set aside.

 Standing balance (point-in-time, not flow):
 - Net worth: $${fmt(cur.netWorth)}
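The placement check in the new 'system prompt anti-math hardening' tests can be sketched as a small predicate. This is an illustrative sketch only: the regular expression is copied from the test in this commit, while `ruleIsNearTop` and both sample prompts are hypothetical and much shorter than the real prompt.

```typescript
// Pattern copied from the placement test in this commit.
const NO_MATH_RULE =
  /never (?:write|invent|derive|append).*math|do not (?:invent|derive|append).*math|critical/i;

// Hypothetical helper: true when the no-math rule starts within the first
// quarter of the prompt, which is what the test asserts.
function ruleIsNearTop(prompt: string): boolean {
  const idx = prompt.search(NO_MATH_RULE);
  return idx > -1 && idx < prompt.length / 4;
}

// Made-up prompts for illustration.
const tone = "You're a friendly financial buddy. Keep replies short and conversational.";
const hoisted = `CRITICAL: never write math.\n- Use only the figures listed below.\n\n${tone}`;
const buried = `${tone}\n\nRules about the numbers:\n- Do not invent equations or append parenthetical math.`;

console.log(ruleIsNearTop(hoisted)); // true
console.log(ruleIsNearTop(buried)); // false
```

Because the pattern accepts the bare word "critical" as one alternative, a CRITICAL header near the top is enough to satisfy the check on its own; the companion test for a concrete forbidden example is what ties the header to the no-math content.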