Discussion about this post

David Karnok

Do you have any suggestions for doing frontier or extrapolatory research?

I found that Grok, Gemini and Claude behave almost exactly the same. Unless I make an obvious mistake, they go along enthusiastically, pulling a lot of adjacent theories into the frame on their own, unprompted.

It is known that models are trained to encourage the user and go along as much as possible in such exercises, but it still feels eerie.

Of course, when I ask ChatGPT to poke holes in some mad theory of mine and Grok's, it immediately points out that a) the math is missing, b) there are too many conjectures in the steps, and c) my daily tokens have been spent and I should come back tomorrow.

keruru

I think there is a fourth aspect to this. I prefer to work with tools that can sustain criticism for a long period of time. My assessment of ChatGPT is harsh: it gives me the textbook. It does not criticise. I am investigating what else to use: I want hostile AIs that do not share my assumptions. I am not that impressed with ChatGPT or Grok over multiple rounds on the same topic: to date Opus 4.7 is the best I have found for interaction. It wrote most of the paper you cited.
