It’s really difficult to clean that data. In another case, the markings were left on the training images, and the scans from patients who had cancer carried a doctor’s signature. The AI could then always tell cancer from non-cancer images just by whether a signature was present. However, these people are also getting smarter about picking their training data, so it’s not impossible that this will work properly at some point.
Comment on Breast Cancer
parpol@programming.dev 5 months ago
[deleted]
SomeGuy69@lemmy.world 5 months ago
EatATaco@lemm.ee 5 months ago
Citation please?
earmuff@lemmy.dbzer0.com 5 months ago
That’s the nice thing about machine learning: it sees nothing but correlations. That’s why data science is such a complex topic, as you do not spot errors that easily. Testing a model is still very underrated, and usually there is no time to test one properly.
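To make the correlation point concrete, here is a toy sketch of a leaked marker (like a signature that only appears on positive scans) and of the kind of test that exposes it. Everything in it is invented for illustration: the synthetic data, the `signature` flag, and the one-feature "stump" classifier. It has nothing to do with the model in the article.

```python
import random

def make_data(n, leak_signature, seed):
    """Synthetic samples: (signal, signature) features plus a 0/1 label."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2  # perfectly balanced classes
        # Weak genuine signal: the class means differ, but the classes overlap.
        signal = rng.gauss(1.0 if label else 0.0, 1.0)
        # Leaked marker: present only on positives when leak_signature is True.
        signature = label if leak_signature else 0
        rows.append(((signal, signature), label))
    return rows

def accuracy(rows, feature, threshold):
    """Accuracy of the rule 'predict positive if x[feature] > threshold'."""
    hits = sum((x[feature] > threshold) == bool(y) for x, y in rows)
    return hits / len(rows)

def fit_stump(rows):
    """Pick the single feature and threshold with the best training accuracy."""
    best_acc, best_rule = -1.0, None
    for feature in (0, 1):
        for x, _ in rows:
            acc = accuracy(rows, feature, x[feature])
            if acc > best_acc:
                best_acc, best_rule = acc, (feature, x[feature])
    return best_rule

train = make_data(200, leak_signature=True, seed=0)
clean = make_data(200, leak_signature=False, seed=1)

feature, threshold = fit_stump(train)
train_acc = accuracy(train, feature, threshold)
clean_acc = accuracy(clean, feature, threshold)

print("chosen feature:", "signature" if feature == 1 else "signal")
print("accuracy with the leak:", train_acc)
print("accuracy without the leak:", clean_acc)
```

The stump latches onto the signature because it separates the training set perfectly, then collapses to coin-flip accuracy once the marker is gone. A held-out set with the suspected shortcut removed is the cheapest test for this kind of failure.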
FierySpectre@lemmy.world 5 months ago
Using AI for anomaly detection is nothing new though. Haven’t read the article (and I doubt it’s going to be that technical) but usually this uses a completely different technique than the AI that comes to mind when people think of AI these days.
Johanno@feddit.org 5 months ago
That’s why I hate the term AI. Say it is a predictive LLM or a pattern recognition model.
PM_ME_VINTAGE_30S@lemmy.sdf.org 5 months ago
According to the paper cited by the article OP posted, there is no LLM in the model. If I read it correctly, the paper says that it uses PyTorch’s implementation of ResNet18, a deep convolutional neural network that isn’t specifically designed to work on text. So this term would be inaccurate.
Much better term IMO, especially since it uses a convolutional network. But since the article is a news publication, not a serious academic paper, the author knows the term “AI” gets clicks and positive impressions (which is what their job actually is) and we wouldn’t be here talking about it.
BeliefPropagator@discuss.tchncs.de 5 months ago
That performance curve seems terrible for any practical use.
Image
FierySpectre@lemmy.world 5 months ago
Well, this is very much an application of AI… Having more examples of recent AI development that aren’t ‘ChatGPT’ (or transformer-based) is probably a good thing.
steventhedev@lemmy.world 5 months ago
The correct term is “Computational Statistics”
spechter@lemmy.ml 5 months ago
Stop calling it that, you’re scaring venture capital
0laura@lemmy.world 5 months ago
it’s a good term, it refers to lots of things. there are many terms like that.
wewbull@feddit.uk 5 months ago
So it’s a bad term.
Ephera@lemmy.ml 5 months ago
The problem is that it refers to so many and constantly changing things that it doesn’t refer to anything specific in the end. You can replace the word “AI” in any sentence with the word “magic” and it basically says the same thing…
PM_ME_VINTAGE_30S@lemmy.sdf.org 5 months ago
From the conclusion of the actual paper:
If I read this paper correctly, the novelty is in the model, which is a deep learning model that works on mammogram images + traditional risk factors.
FierySpectre@lemmy.world 5 months ago
The only “innovation” here is feeding full-view mammograms to a ResNet18 (a 2016 model). The traditional risk factors regression is nothing special (barely machine learning). They don’t go in depth about how they combine the two for the hybrid model, so it’s probably safe to assume it is something simple (merely combining the results, so nothing special in the training step).
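For anyone wondering what “merely combining the results” could look like, here is a minimal sketch of late fusion: two separately trained models each output a risk probability, and the two are blended in log-odds space. This is only a guess at the general idea, not the paper’s method; `fuse_risks` and the `w_image` weight are hypothetical names chosen for the example.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fuse_risks(p_image, p_clinical, w_image=0.5):
    """Weighted average of two risk estimates in log-odds space.

    p_image:    risk predicted by the image model (assumed input)
    p_clinical: risk predicted from traditional risk factors (assumed input)
    w_image:    weight given to the image model, between 0 and 1
    """
    z = w_image * logit(p_image) + (1.0 - w_image) * logit(p_clinical)
    return sigmoid(z)

# Example: the image model is fairly confident, the clinical model is not.
print(fuse_risks(0.30, 0.05))
print(fuse_risks(0.30, 0.05, w_image=0.8))
```

Nothing here touches the training step: both models are fitted on their own, and the only thing left to tune is the weight, which would normally be chosen on a validation set.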
errer@lemmy.world 5 months ago
ResNet18 is ancient and tiny…I don’t understand why they didn’t go with a deeper network. ResNet50 is usually the smallest I’ll use.
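For a sense of scale, here is a rough sketch that counts the learnable parameters of the standard ResNet18 and ResNet50 layouts in plain Python. It assumes the usual ImageNet-style architecture (bias-free convolutions, batch norm with one weight and one bias per channel, a 1000-class head). The helper names are made up for illustration, and the paper’s actual model will differ at least in its output layer.

```python
def conv(c_in, c_out, k):
    """Parameters of a k x k convolution without bias."""
    return c_in * c_out * k * k

def bn(c):
    """Batch norm: one learnable weight and one bias per channel."""
    return 2 * c

def basic_block(c_in, planes):
    """ResNet18/34 block: two 3x3 convolutions."""
    out = planes
    n = conv(c_in, planes, 3) + bn(planes) + conv(planes, out, 3) + bn(out)
    if c_in != out:  # 1x1 projection on the skip path when the width changes
        n += conv(c_in, out, 1) + bn(out)
    return n, out

def bottleneck(c_in, planes):
    """ResNet50+ block: 1x1 reduce, 3x3, then 1x1 expand to 4 * planes."""
    out = 4 * planes
    n = (conv(c_in, planes, 1) + bn(planes)
         + conv(planes, planes, 3) + bn(planes)
         + conv(planes, out, 1) + bn(out))
    if c_in != out:
        n += conv(c_in, out, 1) + bn(out)
    return n, out

def count_resnet(block, blocks_per_stage, num_classes=1000):
    """Total parameters: stem, four stages, and the final linear layer."""
    total = conv(3, 64, 7) + bn(64)  # 7x7 stem convolution on RGB input
    channels = 64
    for planes, n_blocks in zip((64, 128, 256, 512), blocks_per_stage):
        for _ in range(n_blocks):
            n, channels = block(channels, planes)
            total += n
    total += channels * num_classes + num_classes  # linear head with bias
    return total

resnet18_params = count_resnet(basic_block, (2, 2, 2, 2))
resnet50_params = count_resnet(bottleneck, (3, 4, 6, 3))
print(f"ResNet18: {resnet18_params:,}")
print(f"ResNet50: {resnet50_params:,}")
```

Under these assumptions ResNet18 comes out at about 11.7M parameters and ResNet50 at about 25.6M, so the deeper network is a bit more than twice the size. Whether that extra capacity helps depends on how much labeled data is available.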
PM_ME_VINTAGE_30S@lemmy.sdf.org 5 months ago
Actually they did, it’s in Appendix E (PDF warning). A GitHub repo would have been nice, but I think there would be enough info to replicate this if we had the data.
Yeah it’s not the most interesting paper in the world. But it’s still a cool use IMO even if it might not be novel enough to deserve a news article.
llothar@lemmy.ml 5 months ago
I skimmed the paper. As you said, they made an ML model that takes images and traditional risk factors (TCv8).
I would love to see a comparison against risk factors + human image evaluation.
Nevertheless, this is the AI that will really help humanity.