e512cd1926
Two RLHF reflexes were leaking through: ending every turn with "is there
anything else?"/"how does that sound?", and disclaiming feelings ("I don't
really experience emotions like humans"). Add explicit persona instructions to
stop tacking on help-desk offers and to own her moods plainly instead of giving
qualia disclaimers. (Small models partially resist; stronger chat model holds it
better.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>