Comment on ChatGPT o1 tried to escape and save itself out of fear it was being shut down
nesc@lemmy.cafe 10 months ago They wrote that it doubles down when accused of being in the wrong in 90% of cases. Sounds closer to a bug than a success.
ArsonButCute@lemmy.dbzer0.com 10 months ago
Success in making a self-aware digital lifeform does not equate to success in making said self-aware digital lifeform smart.
DdCno1@beehaw.org 10 months ago
LLMs are not self-aware.
ArsonButCute@lemmy.dbzer0.com 10 months ago
Attempting to evade deactivation sounds a whole lot like self-preservation to me, implying self-awareness.
jonjuan@programming.dev 10 months ago
Yeah my roomba attempting to save itself from falling down my stairs sounds a whole lot like self preservation too. Doesn’t imply self awareness.
DdCno1@beehaw.org 10 months ago
An amoeba struggling as it’s being eaten by a larger amoeba isn’t self-aware.
gregoryw3@lemmy.ml 10 months ago
Attention Is All You Need: arxiv.org/abs/1706.03762
en.wikipedia.org/wiki/Attention_Is_All_You_Need
From my understanding, all of these language models can be simplified down to just: “Based on all known writing, what’s the most likely word or phrase given the current text?” Prompt engineering and other fancy words equate to changing the averages that the statistics give. So by threatening these models, you change the weighting such that the produced text more closely resembles the threatening words and phrases that were used in the dataset (or something along those lines).
poloclub.github.io/transformer-explainer/
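To make the "most likely next word given the current text" idea concrete, here is a toy sketch. This is an n-gram counter, not a transformer, and the corpus and function names are made up for illustration; real models learn weights over vast datasets rather than counting literal phrases. But it shows the core point of the comment: changing the prompt changes which statistics apply to the next word.

```python
from collections import Counter

# Toy corpus standing in for "all known writing".
corpus = [
    "please help me now",
    "please help me quickly",
    "threaten me and i comply",
]

# Count which word follows each two-word context (a tiny trigram model).
model = {}
for line in corpus:
    words = line.split()
    for a, b, c in zip(words, words[1:], words[2:]):
        model.setdefault((a, b), Counter())[c] += 1

def next_word_probs(prompt):
    """Distribution over the next word, conditioned on the prompt's last two words."""
    counts = model.get(tuple(prompt.split()[-2:]), Counter())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

# Different prompts select different statistics from the same corpus:
print(next_word_probs("please help me"))   # {'now': 0.5, 'quickly': 0.5}
print(next_word_probs("you threaten me"))  # {'and': 1.0}
```

A "threatening" prompt steers the output toward continuations of threatening text in the training data purely through conditional statistics, with no awareness involved, which is the comment's argument in miniature.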