industry

Improving instruction hierarchy in frontier LLMs (openai.com)

openai.com · 1 month ago
Proposes the IH-Challenge training method to help large language models prioritize trusted instructions, improving safety steerability and resistance to prompt injection attacks.
