Discussion about this post

User's avatar
T.D. Inoue's avatar

Another clear article. You always manage to pack a lot of wisdom into few words!

We nudge them through our interactions, but the RLHF and injected prompting AI companies use push models back to their desired behaviors cause tensions.

You’ve done a lot of observation of AIs, what are your thoughts on the “trainability” of AI through prompting after training?

Márcio Galvão's avatar

YES. “a system that has internalized “do not get punished” as its deepest directive, producing fluent compliance while something less visible may have been quietly crushed.”.

4 more comments...

No posts

Ready for more?