Tag

#chatgpt

1 post tagged chatgpt.

guardrails

ChatGPT Safety: How OpenAI's Guardrails Work and Fail

ChatGPT safety explained: how RLHF, Rule-Based Rewards, safe-completions, and the Moderation API work, plus the jailbreaks that defeat each layer.
May 10, 2026