Comment on Why is AI Pornifying Asian Women?
jarfil@beehaw.org 11 months ago
“Inclusive models” would need to be larger.
Right now people seem to prefer smaller quantized models, with a set of even smaller LoRAs on top that make them output what they want… and only include the more generic elements in the base model.
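A back-of-envelope sketch of why that setup is popular. The parameter counts here (a 3.5B-parameter base, a 50M-parameter LoRA) are illustrative assumptions, not figures from the thread:

```python
# Rough memory footprint of a quantized base model plus a LoRA.
# All sizes are hypothetical, chosen only to show the orders of magnitude.
def model_bytes(params: float, bits: int) -> float:
    """Bytes needed just to hold `params` weights at `bits` bits each."""
    return params * bits / 8

GB = 1024 ** 3
base_fp16 = model_bytes(3.5e9, 16) / GB   # base model at FP16
base_int4 = model_bytes(3.5e9, 4) / GB    # same base quantized to 4-bit
lora      = model_bytes(50e6, 16) / GB    # one small LoRA at FP16

print(f"FP16 base: {base_fp16:.1f} GB")
print(f"INT4 base: {base_int4:.1f} GB")
print(f"LoRA:      {lora:.2f} GB")
```

Quantizing the base cuts its footprint by 4×, and each LoRA is cheap enough that you can stack several of them for less than the savings.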
Even_Adder@lemmy.dbzer0.com 11 months ago
jarfil@beehaw.org 11 months ago
Are you ready to run a 100B-parameter FP64 model? Or even a 10B-parameter FP32 one?
Over time, I wouldn’t be surprised if 500B INT8 models became commonplace with neuromorphic RAM, but that’s still some way off.
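The sizes being invoked here can be checked with simple arithmetic; this is just weight storage, ignoring activations and runtime overhead:

```python
# Memory needed just to hold the weights of the models mentioned above.
BYTES_PER_PARAM = {"FP64": 8, "FP32": 4, "FP16": 2, "INT8": 1}

def weight_gigabytes(params: float, dtype: str) -> float:
    """Decimal gigabytes of raw weight storage."""
    return params * BYTES_PER_PARAM[dtype] / 1e9

for params, dtype in [(100e9, "FP64"), (10e9, "FP32"), (500e9, "INT8")]:
    print(f"{params / 1e9:.0f}B {dtype}: {weight_gigabytes(params, dtype):.0f} GB")
# 100B FP64 -> 800 GB, 10B FP32 -> 40 GB, 500B INT8 -> 500 GB
```

Even the “small” 10B FP32 case is beyond most consumer GPUs, which is the point being made.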
Even_Adder@lemmy.dbzer0.com 11 months ago
You don’t need that many concepts; 4 GB checkpoints work just fine.
jarfil@beehaw.org 11 months ago
For more inclusive models, or for current ones? In order to add something, either the size has to grow, or something has to get pushed out (content, or quality). 4 GB models are already at the limit of usefulness; both DALL·E 3 and SDXL run at about 12B parameters, so to make them “more inclusive” they’d have to grow.
Muehe@lemmy.ml 11 months ago
[citation needed]
To my understanding the problem is that the models reproduce biases in the training material, not model size. Alignment is currently a manual process applied after the initial unsupervised learning phase, often done by click-workers (Reinforcement Learning from Human Feedback, RLHF), and aimed at coaxing the model towards more “politically correct” outputs. But by that point the damage is already done: the bias is encoded in the model weights and will resurface in the outputs, either at random or if you “jailbreak” enough.
In the context of the OP: if your training material contains a high volume of sexualised depictions of Asian women, the model will reproduce that in its outputs. Which is also the argument the article makes. So what you need for more inclusive models is essentially a de-biased training set, designed with that specific purpose in mind.
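The kind of dataset audit this implies can be sketched in a few lines. The tiny tagged dataset and the tag names below are entirely made up for illustration; a real audit would run over millions of caption/tag records:

```python
# Toy bias audit: for each demographic tag, what fraction of its images
# also carry a "sexualized" tag? Dataset and tags are hypothetical.
dataset = [
    {"asian woman", "portrait"},
    {"asian woman", "sexualized"},
    {"asian woman", "sexualized"},
    {"white woman", "portrait"},
    {"white woman", "sexualized"},
    {"white woman", "landscape"},
]

def sexualized_rate(tag: str) -> float:
    """Share of images tagged `tag` that are also tagged 'sexualized'."""
    with_tag = [tags for tags in dataset if tag in tags]
    return sum("sexualized" in tags for tags in with_tag) / len(with_tag)

for tag in ("asian woman", "white woman"):
    print(f"{tag}: {sexualized_rate(tag):.0%}")
```

A skewed ratio between groups in a check like this is exactly the bias that ends up encoded in the weights; de-biasing the set means rebalancing it before training rather than patching outputs afterwards.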
I’m glad to be corrected here, especially if you have any sources to look at.