Comment on Elsevier

Diplomjodler3@lemmy.world ⁨1⁩ ⁨year⁩ ago

Just print it to a PDF printer.


unexposedhazard@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
This feels like it should be a browser plugin that automatically anonymizes anything you download.

NeatNit@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
I feel like this will cause quality degradation, like repeatedly re-compressing a JPEG. Relevant xkcd
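
To make the JPEG analogy concrete, here is a toy Python sketch (not a model of any real PDF printer). It assumes a hypothetical `lossy_round_trip` that snaps values to a coarse grid, standing in for lossy recompression, and a hypothetical `lossless_round_trip` that writes values out as text and parses them back, standing in for a pipeline that copies the data unchanged. Whether printing to PDF behaves like the first or the second depends on the tools involved.

```python
def lossy_round_trip(samples, step):
    """Snap each sample to the nearest multiple of `step` (stand-in for lossy recompression)."""
    return [round(s / step) * step for s in samples]


def lossless_round_trip(samples):
    """Write samples out as text and parse them back (stand-in for copying data unchanged)."""
    text = " ".join(str(s) for s in samples)
    return [int(token) for token in text.split()]


def max_error(original, copy):
    """Largest absolute difference between the original and the copy."""
    return max(abs(a - b) for a, b in zip(original, copy))


def repeat(samples, generations, lossy):
    """Push the samples through `generations` round trips of the chosen kind."""
    current = list(samples)
    for generation in range(generations):
        if lossy:
            # Alternate the grid size, as if each tool used different settings.
            step = 7 if generation % 2 == 0 else 5
            current = lossy_round_trip(current, step)
        else:
            current = lossless_round_trip(current)
    return current


original = list(range(0, 256, 3))  # made-up 8-bit sample values
print("lossy error:   ", max_error(original, repeat(original, 6, lossy=True)))
print("lossless error:", max_error(original, repeat(original, 6, lossy=False)))
```

The lossy path ends up away from the original values, while the lossless path returns them exactly, so the analogy only holds if some step in the chain actually throws information away.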

- Zorsith@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
  I feel like it would be negligible degradation for this purpose. Still might not anonymize whoever shares it, though: it could be watermarked with the same metadata (en.m.wikipedia.org/…/Machine_Identification_Code) without being noticeable to the naked eye.
  
- Passerby6497@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Why would it cause degradation? You’re not recompressing anything, you’re taking the visible content and writing it to a new PDF file.
  
  - NeatNit@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    You’re pushing it through one system that converts a PDF file into printer instructions, and then through another system that converts printer instructions into a PDF file. Each step probably has to make adjustments to the data it’s pushing through.
    
    Without looking deeply into the systems involved, I have to assume it’s not a lossless process.
    
    4am@lemm.ee ⁨1⁩ ⁨year⁩ ago
    Those printer instructions are called PostScript and they’re the basis of PDF.
    
    You are thinking that the printing process will rasterize the PDF and then essentially OCR/vector map it back. It’s (usually) not that complicated.
    
    TomSelleck@lemm.ee ⁨1⁩ ⁨year⁩ ago
    You should maybe look a bit more into it. How do you think commercial printers or even hobbyists maintain fidelity in their images? Most images pass through multiple programs during the printing process and still maintain the quality. It’s not just copy/paste.
    
- Diplomjodler3@lemmy.world ⁨1⁩ ⁨year⁩ ago
  That’s not how PDF works at all.
  
  - NeatNit@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    See my reply to another comment
    
    Diplomjodler3@lemmy.world ⁨1⁩ ⁨year⁩ ago
    You’re still wrong. The only place where it could cause quality loss is if embedded bitmap images are recompressed with lower quality settings (which you can adjust). PDF is a vector format, i.e. a mathematical description of what is to be rendered on screen. It was explicitly designed to be scalable, transmittable and renderable on a wide variety of devices without quality loss.
    
- onion@feddit.de ⁨1⁩ ⁨year⁩ ago
  You can ask ChatGPT to spit out the LaTeX code
  
  - NeatNit@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    What
    
- Turun@feddit.de ⁨1⁩ ⁨year⁩ ago
  I don’t understand the “that’s not how PDFs work” criticism.
  
  Removing data from the original file is the whole point of the exercise! Of course unique tokens can be hidden in plain sight in images, letter spacing, etc. If we want to be sure of removing those, we need to degrade the quality of the PDF so that the hidden information is lost in the lossy conversion.
  
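
A toy sketch of the letter-spacing point above, in Python. Everything here is an assumption made up for illustration: the names `embed`, `extract` and `snap`, the nominal spacing, and the size of the nudge. Real watermarking schemes are likely more robust than this. It only shows the general idea that a copy which keeps exact spacing keeps the hidden bits, while a conversion that snaps spacing to a coarse grid drops them.

```python
# Spacing values are in hundredths of a point, kept as integers to avoid rounding noise.
NOMINAL = 50  # hypothetical normal letter spacing
DELTA = 2     # hypothetical nudge that encodes a 1 bit


def embed(bits):
    """Hide one bit per gap by nudging the spacing slightly for each 1 bit."""
    return [NOMINAL + DELTA * bit for bit in bits]


def extract(spacings):
    """Read the bits back by checking which gaps are wider than nominal."""
    return [1 if spacing - NOMINAL > DELTA // 2 else 0 for spacing in spacings]


def snap(spacings, grid=10):
    """Lossy conversion stand-in: round every gap to the nearest multiple of `grid`."""
    return [round(spacing / grid) * grid for spacing in spacings]


token = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up identifying bits
marked = embed(token)

print("exact copy keeps token:   ", extract(list(marked)) == token)
print("snapped copy keeps token: ", extract(snap(marked)) == token)
```

With these made-up numbers the nudge is smaller than the grid, so snapping flattens every gap back to the nominal value. A nudge larger than the grid would survive, which is why degrading the file is not a guarantee either.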