AI Deception
I recently wrote a blog post arguing that, if you want to avoid misbehavior, it is important to understand at a mathematical level how and why a large language model produces the output it does. The basic reason is that prompting a model and reading its answers just samples the output of [...]