Led by Harvard medical student Arya Rao, a research team this week published in JAMA Network Open the results of a study that tested 21 leading off-the-shelf AI models on 29 standardized clinical vignettes. The bots all did fairly well when provided with a full portfolio of medical information and asked to make a final diagnosis, with leading models correct 91 percent of the time. Early differential diagnosis, where clinicians try to rule out certain conditions while weighing various possibilities, is where that more-than-80-percent failure rate comes in.

“Every model we tested failed on the vast majority of cases,” Rao told The Register in an email. “That’s the stage where uncertainty matters most, and it’s where these systems are weakest.”

In other words, it’s the midnight anxiety-fueled WebMD rabbit hole of yesterday all over again, just supercharged with AI that’s probably even more likely to get things wrong than you are without it.

  • mindbleach@sh.itjust.works · 4 days ago

    Jesus fuck, of course they do! Don’t use a chatbot for medical advice!

    There’s neural networks specifically for diagnosis. Pattern recognition is kinda their whole thing. But a model trained on the whole internet has worryingly high odds of saying ‘you have cancer of the butthole, LOL.’ The correct incidence rate for that description in a medical context is never.

  • lakemalcom@sh.itjust.works · 4 days ago

    From the study: “For all models, optional real-time web search, browsing, and retrieval features were explicitly disabled when available. Each vignette was evaluated in triplicate. All replicates and vignettes were parsed independently.”

    Hm. Seems like kinda hamstringing things
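
    For concreteness, “evaluated in triplicate” with optional features disabled might look something like this against a chat-style API. This is a sketch, not the paper’s actual harness; the model name, prompt, and vignette are all placeholders:

    ```python
    # Sketch: run one vignette three times with no tools/browsing enabled.
    # Model name, vignette, and prompt are placeholders, not the study's setup.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    VIGNETTE = "58-year-old with acute chest pain radiating to the left arm..."
    PROMPT = f"Give an early differential diagnosis for this case:\n\n{VIGNETTE}"

    replicates = []
    for _ in range(3):  # "each vignette was evaluated in triplicate"
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{"role": "user", "content": PROMPT}],
            # no `tools` argument is passed, so no search/browsing is available
        )
        replicates.append(resp.choices[0].message.content)

    # each replicate is then parsed and scored independently
    for i, answer in enumerate(replicates, 1):
        print(f"--- replicate {i} ---\n{answer}\n")
    ```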

    • Meron35@lemmy.world · 4 days ago

      This is commonly done for the purposes of replicability, but is not at all how these models are deployed in practice.

      Larger institutions, especially those with strict data privacy requirements, are deploying locally hosted models permanently RAGed to their own internally vetted documentation.

      It would’ve been much more interesting to see how often RAG setups fail, contrary to their marketed promises.

      From experience, RAGs do help reduce hallucinations, but LLMs still do dumb things, like jumbling up numbers. There were many cases where the LLM confidently presented a numerical result, but the number actually came from somewhere else entirely, like a footnote on the same page.
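
      That failure mode is cheap to screen for after the fact: pull every number out of the model’s answer and check that each one actually occurs somewhere in the retrieved passages. A rough sketch (the regex and the exact-match rule are my own assumptions, not any particular RAG stack’s API):

      ```python
      import re

      NUM_RE = re.compile(r"-?\d+(?:\.\d+)?")  # crude: integers and decimals

      def ungrounded_numbers(answer: str, passages: list[str]) -> list[str]:
          """Numbers cited in `answer` that appear in none of the retrieved passages."""
          source_nums = set()
          for passage in passages:
              source_nums.update(NUM_RE.findall(passage))
          return [n for n in NUM_RE.findall(answer) if n not in source_nums]

      # The jumbled-numbers case: the answer quotes 12.5, but the retrieved
      # text only supports 14.2 -- the 12.5 came from somewhere else entirely.
      answer = "The trial reported a 12.5% response rate."
      passages = ["The primary endpoint showed a 14.2% response rate (n=310)."]
      print(ungrounded_numbers(answer, passages))  # -> ['12.5']
      ```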

    • CorrectAlias@piefed.blahaj.zone · edited · 4 days ago

      Maybe, but I think it’s important to note that LLMs can hallucinate web results just the same. You can give them a specific web page and they’ll sometimes spit out things that don’t exist on the page, especially if you’re doing it to correct a mistake the model made.
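
      One cheap defense when a model “quotes” a page is to fetch the page yourself and confirm the quoted text actually occurs in it. A naive sketch with `requests` (real pages need proper HTML-to-text extraction first, so treat this as the idea, not a robust check):

      ```python
      import re
      import requests

      def quote_on_page(url: str, quoted: str) -> bool:
          """True if the model's quoted snippet appears in the raw page text.
          Naive: compares against raw HTML with collapsed whitespace, so it
          can miss quotes split across markup."""
          html = requests.get(url, timeout=10).text
          norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
          return norm(quoted) in norm(html)

      # Reject or re-prompt whenever the "quote" isn't actually on the page.
      if not quote_on_page("https://example.com/article", "text the model cited"):
          print("Quoted text not found on the page.")
      ```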