Discussion about this post

Brad Leclerc

I think at some point soon there will have to be a big shift in how models are trained. The current setup (at least the standard one all the “big” models use) basically trains for “humans paid to get through as many responses as possible in a certain time-frame liked it better”, and even the AI training loop uses data from THAT process to figure out how to grade responses (just a LOT faster haha). That gives us models that try to sound confident but rarely push back when called out, since people generally find confidence convincing and backtalk or conflict rude, and that preference gets baked into the model at a fundamental level.

It’s... not great if the goal is ACTUAL accuracy, truth, and conviction. The more those large models lean into that strategy with the general public, the more the cracks show in just how bad that combo is for an AI meant for normies who don’t (and shouldn’t have to!) know how to mitigate the problems that come with it.

T.D. Inoue

I find this truly fascinating and disturbing. It would be interesting to see the MI analysis of this. I also wonder if there's anything at chat time that would change this response pattern substantially. I think in my own chats the correction responses are more subdued or appropriate, though I do recall many that were overly apologetic.
