UTF-8 isn’t ASCII. It takes up more space.
Comment on My password is not accepted because it is too long
Kissaki@feddit.org 1 week ago
Do you have a source for the 24?
I can find a 72-byte limit (Wikipedia, article). That’s three times as many [ASCII] UTF-8 characters as I could use.
possiblylinux127@lemmy.zip 1 week ago
Kissaki@feddit.org 1 week ago
No, it does not take up more space for ASCII characters.
If you want a source, Wikipedia:
“the first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded using a single byte with the same binary value as ASCII”
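A quick way to check the quoted claim yourself, as a minimal sketch in Python (the sample strings are arbitrary placeholders, not anything from the thread):

```python
# Encode a pure-ASCII string both ways and compare the raw bytes.
ascii_text = "hunter2"
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")

# For pure ASCII input the two encodings produce identical bytes.
same_bytes = ascii_bytes == utf8_bytes

# Only characters outside ASCII cost extra bytes in UTF-8.
non_ascii_len = len("\u00e9".encode("utf-8"))  # U+00E9, "e" with acute accent

print(same_bytes, len(utf8_bytes), non_ascii_len)
```

So UTF-8 only takes more space than ASCII once the text contains characters that ASCII cannot represent at all.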
possiblylinux127@lemmy.zip 1 week ago
Good to know
I was just speculating
skullgiver@popplesburger.hilciferous.nl 1 week ago
The specification of the algorithm allows keys of up to 56 bytes, including a null terminator. If you’re using UTF-16 (2 or 4 bytes per character, as Windows, Java, JavaScript, and many other languages and platforms do), that’s 27 characters (the one leftover byte can’t hold half a code unit). Add some margin for extended characters (emoji and such, which take 4 bytes each) and you’ll end up just above or below 24. With UTF-8 you can end up doing much better (exclusively ASCII, at 1 byte per character) or much worse (exclusively non-Latin character sets, often 3 bytes per character). Verifying that on the frontend is a massive pain (string length in JS counts UTF-16 code units, not characters or bytes) and dynamically switching codecs is a recipe for bugs and security leaks.
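The arithmetic above can be sketched in Python. This is only an illustration: `max_two_byte_chars` and `sizes` are made-up helper names, and the assumption that exactly one byte is reserved for the null terminator comes from the comment above.

```python
NULL_TERMINATOR = 1  # assumed: one byte of the limit is reserved for "\0"

def max_two_byte_chars(limit_bytes: int) -> int:
    """Whole 2-byte characters that fit once the terminator is reserved."""
    return (limit_bytes - NULL_TERMINATOR) // 2

def sizes(password: str) -> dict:
    """Report how 'long' a password is under different measures."""
    utf16 = password.encode("utf-16-le")  # "-le" avoids the 2-byte BOM
    return {
        "code_points": len(password),   # what Python's len() counts
        "utf16_units": len(utf16) // 2, # what a JS string's .length counts
        "utf16_bytes": len(utf16),
        "utf8_bytes": len(password.encode("utf-8")),
    }

print(max_two_byte_chars(56))   # 55 usable bytes, two per character
print(sizes("abc"))
print(sizes("\U0001F511"))      # a single emoji (key symbol)
```

The emoji case shows why a character count on the frontend tells you little about the byte count the hash function will see.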
The 72-byte limit is the result of the internal workings of most bcrypt implementations, but if you rely on details like that and ever switch implementations, you need to make sure the new one behaves the same way internally. If the stars align you can use 71 characters (72 if you use Pascal strings, which need no null terminator), but that’s far from a given.
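If you do want to validate against that limit, check encoded bytes and not characters. A minimal sketch, assuming UTF-8 input and a hypothetical helper `fits_bcrypt_limit` (it does not call any real bcrypt library; whether a given implementation reserves a terminator byte is something you would have to confirm for that implementation):

```python
BCRYPT_INPUT_LIMIT = 72  # bytes, per the limit discussed above

def usable_bytes(null_terminated: bool = True) -> int:
    """Bytes left for the password itself."""
    return BCRYPT_INPUT_LIMIT - 1 if null_terminated else BCRYPT_INPUT_LIMIT

def fits_bcrypt_limit(password: str, null_terminated: bool = True) -> bool:
    """True if the UTF-8 encoding of the password fits in the usable bytes."""
    return len(password.encode("utf-8")) <= usable_bytes(null_terminated)

print(fits_bcrypt_limit("a" * 71))                         # 71 ASCII bytes
print(fits_bcrypt_limit("a" * 72))                         # one byte too many
print(fits_bcrypt_limit("a" * 72, null_terminated=False))  # Pascal-style string
print(fits_bcrypt_limit("\u00e4" * 36))                    # 36 chars, 72 bytes
```

The last call is the interesting one: 36 two-byte characters already exceed the 71 usable bytes, even though the string is only half as long as the limit suggests.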