phr@discuss.tchncs.de 1 week ago
are you asking why google cannot distinguish? that idk. it should be possible to tell whether a text input is japanese or chinese just by looking at the character set, as long as it is not written exclusively in the unified han characters that unicode encodes the same way for both languages.
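a rough sketch of that idea in python, just to show what "looking at the character set" could mean. the function name and the return labels are made up for illustration, and the ranges only cover the basic hiragana and katakana letters plus the main cjk unified ideographs block, so treat it as a heuristic and not as how google or anyone else actually does it:

```python
def guess_cjk_language(text: str) -> str:
    """Very rough heuristic: kana implies Japanese, Han alone stays undecided."""
    # Hiragana letters U+3041..U+3096, katakana letters U+30A1..U+30FA
    has_kana = any(
        "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa" for ch in text
    )
    # Main CJK Unified Ideographs block (shared by Chinese and Japanese)
    has_han = any("\u4e00" <= ch <= "\u9fff" for ch in text)

    if has_kana:
        return "japanese"
    if has_han:
        # Only unified characters: could be Chinese or kana-free Japanese
        return "han only"
    return "unknown"
```

the "han only" case is exactly the one where the character set alone does not settle it, and you would need to know which character forms or words are typical for each language.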
for me, as a writing stan, it is possible to guess what language is written. the presence of kana makes it, indeed, trivially easy. but i have learned both chinese and japanese a bit (only in writing though, lol), so i might be lucky enough to find a character where i can say for sure “that’s written differently in chinese”. people who don’t know anything just see complex, chinese-ish characters and say chinese (even with kana present). the same ignorance is at work when western people call a farsi or urdu text arabic, or call anything written in cyrillic russian.
for languages written in latin script, i usually have to look twice to see if, e.g., a text is danish, swedish, or norwegian, since i never learned any of these properly and need to find the distinctive features.
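one example of such a distinctive feature, as a small python sketch: swedish spells its extra vowels as ä and ö, while danish and norwegian use æ and ø (å is shared by all three, so it does not help). the function name and labels are made up, and this assumes you already know the text is one of these three languages, since ä and ö also show up in german, finnish, and others:

```python
import unicodedata


def guess_scandinavian(text: str) -> str:
    """Rough guess based only on the vowel letters ä/ö versus æ/ø."""
    # Normalize so that 'a' + combining diaeresis becomes a single 'ä'
    lowered = unicodedata.normalize("NFC", text).lower()

    if any(ch in "äö" for ch in lowered):
        return "probably swedish"
    if any(ch in "æø" for ch in lowered):
        # These letters alone cannot separate Danish from Norwegian
        return "danish or norwegian"
    return "unknown"
```

telling danish from norwegian takes more than the alphabet, which is why i still have to look twice.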