I love that PDFs are so difficult to transform into HTML, too
FYI, if that’s relevant to your field, every new article published on arxiv.org now has an HTML rendering as well.
And for many older publications, changing “arxiv.org” to “ar5iv.org” in the URL leads to an HTML rendering, the result of a best-effort experiment they ran for a while.
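If you want to script that URL swap, here is a minimal sketch in Python. The helper name `to_ar5iv` and the example paper ID are just illustrations, and it assumes the ar5iv path layout mirrors arXiv's, which I haven't checked for every kind of link.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ar5iv(url: str) -> str:
    """Rewrite an arxiv.org URL so it points at the same path on ar5iv.org.

    Hypothetical helper: URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() in ("arxiv.org", "www.arxiv.org"):
        # Swap only the host; keep scheme, path, query, and fragment as-is.
        return urlunsplit(parts._replace(netloc="ar5iv.org"))
    return url


# Example paper ID chosen purely for illustration.
print(to_ar5iv("https://arxiv.org/abs/1706.03762"))
# → https://ar5iv.org/abs/1706.03762
```

Whether a given paper actually has a rendering there is a separate question, so treat a missing page as normal rather than as a bug in the script.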
brianary@startrek.website 12 hours ago
I’ve always called Word documents and PDFs “dead-end formats” (DEF). Once you export your data to them, there’s no reliable way to retrieve your data from them for further transformation like you can for YAML, JSON, XML, HTML, Markdown, &c.