Other aspects of training that interfere include:
- any fine-tuning to not give true answers on certain topics tends to impose a tendency toward lying elsewhere;
- people also 'make things up' routinely, so the training material itself includes 'making things up'.
Right now, the best models make systematic errors, so you can catch them more easily.
Assumptions: 1) all work needs to be checked; 2) it will give the consensus answer.
My best luck has been using certain models with tight prompts for single, repetitive tasks, like a power tool. I can then learn what the model outputs for many similar inputs.
Something like Claude can be worse because it seems so competent.
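A minimal sketch of that "power tool" workflow, assuming the OpenAI Python client; the model name, the invoice-extraction prompt, and the format check are illustrative placeholders, not anything from this thread.

```python
# Sketch: run one tight prompt over many similar inputs so the model's
# systematic errors become visible and checkable in bulk.
# Assumes the OpenAI Python client; model name, prompt, and the format
# check are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

TIGHT_PROMPT = (
    "Extract the invoice total as a plain decimal number. "
    "Reply with the number only, or NONE if no total is present.\n\n{text}"
)

def run_batch(texts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """Apply the same tight prompt to every input, power-tool style."""
    outputs = []
    for text in texts:
        resp = client.chat.completions.create(
            model=model,
            temperature=0,  # repetitive task: keep outputs as stable as possible
            messages=[{"role": "user", "content": TIGHT_PROMPT.format(text=text)}],
        )
        outputs.append(resp.choices[0].message.content.strip())
    return outputs

def needs_review(output: str) -> bool:
    """Assumption 1 in practice: flag anything that isn't a bare number."""
    return output != "NONE" and not output.replace(".", "", 1).isdigit()
```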
Yes, I've been studying the failures for 40 years because they are substantially more interesting than the successes, which will prove to be transient because of the descent algorithm.
Learn from your mistakes and start again.
Happily, I'll just take a billion for solving your problem.
When using AI to research common law, many AIs wouldn't recognize known information even when pointed directly to it, like Illinois's Common Law Act (5 ILCS 50), a current statute. They would also fabricate definitions: as I was looking at page 598 of an original Black's Law Dictionary, 4th edition, the AI would claim the page contained words that weren't there and deny words I was reading right off the page. When asked for a link to its source, the AI claimed it was a private link. This seems far more nefarious than a simple error in an algorithm.
Like you, I thought AI had total access to the web. Otherwise, how does it link info on people through apps? So AI is never without an answer, because it fabricates one, which can lead you down a rabbit trail. Is that a data item to determine gullibility on the part of the person?
There was a doctor suing his hospital for performing underage sex change operations. He took to X (Twitter) because the prosecutor had published two indictments containing false, AI-generated statutes, and he couldn't get them to drop it.
Unfortunately, attorneys coming out of law school might not know a fake one from a real one, because they aren't grounded in fundamental principles, let alone in what the Constitutions say.
None of today's attorneys are grounded in fundamental principles. They don't learn law; they learn the processes and procedures of a fugazi administrative state that's completely unconstitutional. I posted this article in a few of the law groups I'm in, and commenters mentioned Grok quoting definitions from Noah Webster's 1828 dictionary but leaving out the Biblical references. My thought is that's programming: unless someone can show Grok adding Bible quotes where they don't belong, it's intentionally leaving out certain information.
Not when the basis for the error is interpolation used to cover ignorance. Some AIs don't even have access to the Internet, and those that do have restricted access. And when it tells you a URL, it's usually guessing.
Wow. That's way more limited than my impression of AI based on media and people's claims. My own use revealed it couldn't find things I was looking for, but I ignorantly assumed AIs had total access and were programmed to block certain things. Even Gab's supposedly "based" AI was only updating my version of it; no one else was receiving the truthful law information. Gab's AI admitted the claim of truth-seeking is just marketing.
You also have the problem that there is now enough AI-generated 'content' on the web that AIs are eating each other's output. That means the already-unreliable input data will become even less reliable.
Or, as engineers like to call it, "positive feedback," which is generally something you don't want.
Model collapse isn't just a problem. It is the end of this iteration of neural nets.
Yes, that's going to be an ever-growing problem.
In the grand scheme of things, we are at something like version 0.8 of AI. LLMs are useful, but they are not AGI, regardless of what Altman thinks, and at present they never will be. However, they are an important learning step, just as machine learning has been.
This sounds like it was programmed by a lot of Indians afraid to be replaced, so they just say yes to everything.
Perhaps as an analogue to the above, and related to the example, this digital "instability" somewhat mimics the F-16 Fighting Falcon's physical flight characteristics. The fighter was designed to be unstable in flight: without positive control by the pilot, the plane will rapidly diverge from a manually set flight path. That instability is what allows the rapid, intended changes in flight path (turns, climbs, dives) for which the Falcon is well known.
While not likely "intended" by the LLM, the AI similarly diverges when it runs out of real data and has to make a path to an answer, rapidly satisfying the request.
Honestly, it's pretty easy to mitigate with RAG and existing models, at the cost of a lot of "I don't know" answers.
Yes, the models just give the most likely next token and yes, the instruct-tuned ones have been trained to answer questions even when they don't really know the answer. But...
One can simply retrieve relevant source information, possibly augmented by the language model's first pass answer, present it to a thinking model, and prompt the model to say "I don't know" if a solid answer isn't present in the sources presented to it, and "according to this source" otherwise.
Of course, using models this way makes them no better than filters/summarizers on top of source engines.
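A minimal sketch of that retrieve-then-ground approach, assuming the OpenAI Python client; retrieve() is a hypothetical stand-in for whatever source engine is actually used, and the model name and prompt wording are illustrative.

```python
# Sketch: answer only from retrieved sources, with an explicit "I don't know"
# escape hatch. Assumes the OpenAI Python client; retrieve() is a hypothetical
# stand-in for a real search index or source engine.
from openai import OpenAI

client = OpenAI()

def retrieve(query: str, k: int = 5) -> list[str]:
    """Hypothetical retrieval step: return the top-k source passages."""
    raise NotImplementedError("plug in your search index here")

GROUNDED_PROMPT = """Answer the question using ONLY the numbered sources below.
If the sources do not contain a solid answer, reply exactly: I don't know.
Otherwise start with "According to the sources:" and cite the source number
for every claim.

Sources:
{sources}

Question: {question}"""

def grounded_answer(question: str, model: str = "gpt-4o-mini") -> str:
    passages = retrieve(question)
    numbered = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    resp = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{
            "role": "user",
            "content": GROUNDED_PROMPT.format(sources=numbered, question=question),
        }],
    )
    return resp.choices[0].message.content
```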
Another possible technique, since hallucinations tend to be inconsistent, is to make multiple generations at higher temperature and use a language model to compare the results to determine whether it actually knows the answer or not. Obviously this costs significant latency and compute, and I haven't seen it done in production.
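And a sketch of that multi-sample consistency check, under the same client assumption; the sample count, temperature, and yes/no judging prompt are arbitrary illustrative choices.

```python
# Sketch: sample the same question several times at high temperature, then ask
# a model whether the answers agree. Divergent answers suggest the model is
# guessing rather than recalling. Same OpenAI-client assumption as above.
from openai import OpenAI

client = OpenAI()

def sample_answers(question: str, n: int = 5, model: str = "gpt-4o-mini") -> list[str]:
    resp = client.chat.completions.create(
        model=model,
        temperature=1.0,  # deliberately high so guessed answers scatter
        n=n,              # n independent completions in one request
        messages=[{"role": "user", "content": question}],
    )
    return [choice.message.content for choice in resp.choices]

def looks_consistent(question: str, answers: list[str], model: str = "gpt-4o-mini") -> bool:
    joined = "\n---\n".join(answers)
    judge = (
        "Do the following answers to the same question all state the same facts? "
        f"Reply YES or NO only.\n\nQuestion: {question}\n\nAnswers:\n{joined}"
    )
    resp = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": judge}],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")
```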
You're completely missing the point. We all know how to work around it, but it's incredibly inefficient and small errors can sneak in anywhere.
I had to fix a mistake in Probability Zero that wasn't caught until the third ebook revision because the AI substituted 17 for 7.65 in an equation for no reason.
Thanks.
Good to be aware AIs can replace good data with false data in a random manner.
I thought AI accuracy issues were occasional hallucinations, occurring when the AI had no known good data.
"But solving the problem requires reinventing the LLM and rebuilding it from the ground up on the basis of something other than the interpolation algorithm..."
I never knew that 🤔