Comment on I'm gonna die on this hill or die trying

SGforce@lemmy.ca 4 days ago

Tokenization can make it difficult for them.

Image

The word chunks often include a leading space because it’s efficient, so I would think an extra space would stand out. Writing it back should be easier, assuming there is a dedicated “space” token like the other punctuation tokens, and there must be one.
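
To make that concrete, here is a rough sketch of the idea. The vocabulary is made up, and the greedy longest-match loop is a simplification I'm assuming for illustration; real tokenizers use learned merge rules and far larger vocabularies.

```python
# Toy vocabulary (made up for illustration, not a real model's token list).
# Note that the word chunks after the first carry their leading space.
VOCAB = ["Hello", " world", " there", " ", "!"]

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenizer; unknown characters become 1-char tokens."""
    by_length = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary entry that matches at position i,
        # falling back to the single character if nothing matches.
        match = next((t for t in by_length if text.startswith(t, i)), text[i])
        tokens.append(match)
        i += len(match)
    return tokens

normal = tokenize("Hello world!")
doubled = tokenize("Hello  world!")
print(normal)   # ['Hello', ' world', '!']
print(doubled)  # ['Hello', ' ', ' world', '!']
```

In this toy setup the extra space shows up as its own token between the two word chunks, which is why I'd expect it to stand out.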

Hard mode would be asking it how many spaces there are in your sentence. I don’t think they’d figure it out unless their own token list, with a description of each token, is specifically trained into them.
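
A small sketch of why that's hard, again with a made-up table: the model receives IDs, not characters, so counting spaces means knowing what text each ID stands for. The IDs and the `count_spaces` helper here are hypothetical.

```python
# Hypothetical ID-to-text table; real models have tens of thousands of entries.
ID_TO_TEXT = {0: "Hello", 1: " world", 2: " ", 3: "!", 4: " big"}

def count_spaces(token_ids, table=ID_TO_TEXT):
    """Count spaces by decoding each ID back to its text chunk."""
    return sum(table[i].count(" ") for i in token_ids)

ids = [0, 2, 4, 1, 3]         # stands for "Hello  big world!"
print(ids)                    # roughly what the model is given
print(count_spaces(ids))      # 3
```

Without the table, `[0, 2, 4, 1, 3]` says nothing about which entries hide a space inside them, and that's the part I doubt they pick up on their own.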
