Comment on Publishers Always Innovating

keepthepace@slrpnk.net 2 months ago

Yes, PDFs are much more permissive and may not have any semantic information at all. Hell, some old publications are just scanned images!

Going from PDF to a semantic format seems to be a hard problem that basically requires OCR, like these people are doing.
