Moderation is not a moral essay. It is a gate. We run automated classification on user text and stop when policy says stop.
OpenAI's Moderation endpoint is the reference implementation most teams use when they need a label, not a debate. We wire that in before plan generation so blocked material never becomes a run record.
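The gate can be sketched as a check that runs before anything is persisted. This is a minimal illustration, assuming a moderation response shaped like OpenAI's (a boolean `flagged` field per result); the helper name `should_block` and the wiring comments are illustrative, not this system's actual code.

```python
def should_block(moderation_result: dict) -> bool:
    """Gate decision: stop when policy says stop."""
    # Any flagged result blocks the run; missing data fails open here
    # only for the sketch -- a real gate would fail closed.
    return bool(moderation_result.get("flagged"))


# Wiring sketch (assumed, per OpenAI's documented Moderation API):
# the endpoint is called first, so blocked text never reaches plan
# generation and never becomes a run record.
#
# result = client.moderations.create(
#     model="omni-moderation-latest", input=user_text
# ).results[0].model_dump()
# if should_block(result):
#     return  # refuse before creating any run record
```

The point of the ordering is that the gate sits upstream of persistence: a refusal leaves no artifact to audit away later.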
So what?
If a vendor will not name their moderation and safety stage, you are not reviewing a system. You are reviewing a storyboard.