Comment by rosquillas

rosquillas Apr 30, 2025 parent

Why not simply upload the PDF version of the scanned book or document? Extracting the text from a scanned document via the GCP Document AI API sounds like an unnecessary use of resources.

ljoshua May 1, 2025

I was running into context window issues doing this. I could have gone in and split up the scanned book into chapters or something to get around this, and did that for a couple of subjects. But it wasn't too much work (and literally cost me pennies, like six of them) to get the pure text extract, and it's pretty easy to work with now. (Besides, which random dev doesn't love a little side challenge to explore new APIs at home every now and then? ;) )
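For anyone curious, the "split it up to fit the context window" workaround is easier once you have plain text. Below is a minimal sketch, not the exact code used here: the function name `split_into_chunks` and the `max_chars` budget are hypothetical, and character count is only a rough stand-in for a model's real token limit.

```python
def split_into_chunks(text: str, max_chars: int = 8000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Assumption: paragraphs are separated by blank lines, and a character
    budget is an acceptable proxy for the model's token limit.
    """
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # A single paragraph longer than the budget is hard-split.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this paragraph would overflow, so start a new chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request. Splitting on paragraph boundaries keeps sentences intact, which a fixed-width slice of a scanned PDF cannot do.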
