AI Deception

April 7th, 2025 | AI

I recently wrote a blog post arguing that if you want to avoid misbehavior, it is important to understand at a mathematical level how and why a large language model produces the output it does. The basic reason is that prompting a model and reading its answers just samples the output of [...]