Try with cpdf (disclaimer, wrote it):
Then you can play around with the JSON, and turn it back to PDF with
No live back-and-forth though.
The live back-and-forth is the main point of what I'm asking for — I tried your cpdf (thanks for the mention; will add it to my list) and it too doesn't help; all it does is, somewhere 9000-odd lines into the JSON file, turn the part of the content stream corresponding to what I mentioned in the earlier comment into:
[
[ { "F": 0.0 }, "g" ],
[ { "F": 0.0 }, "G" ],
[ { "F": 0.0 }, "g" ],
[ { "F": 0.0 }, "G" ],
[ "BT" ],
[ "/F19", { "F": 10.9091 }, "Tf" ],
[ { "F": 88.93600000000001 }, { "F": 709.0410000000001 }, "Td" ],
[
[
"Subsequen",
{ "F": 28.0 },
"t",
{ "F": -374.0 },
"to",
{ "F": -373.0 },
"the",
{ "F": -373.0 },
"p",
{ "F": -28.0 },
"erio",
{ "F": -28.0 },
"d",
{ "F": -373.0 },
"analyzed",
{ "F": -373.0 },
"in",
{ "F": -374.0 },
"our",
{ "F": -373.0 },
"study",
{ "F": 83.0 },
",",
{ "F": -383.0 },
"Bridge's",
{ "F": -373.0 },
"paren",
{ "F": 27.0 },
"t",
{ "F": -373.0 },
"compan",
{ "F": 28.0 },
"y",
{ "F": -373.0 },
"Ne",
{ "F": -1.0 },
"wGlob",
{ "F": -27.0 },
"e",
{ "F": -374.0 },
"reduced"
],
"TJ"
],
[ { "F": -16.936 }, { "F": -21.922 }, "Td" ],
This is just a more verbose restatement of what's in the PDF file; the real questions I'm asking are:
- How can a user get to this part, from viewing the PDF file? (Note that the PDF page objects are not necessarily a flat list; they are often nested at different levels of “kids”.)
- How can a user understand these instructions, and “see” how they correspond to what is visually displayed on the PDF file?
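To make the second question concrete, here is a minimal Python sketch of reading the operand array of that `TJ` operator back as text. In a `TJ` array the strings are the glyphs shown and the numbers are position adjustments in thousandths of a text-space unit, subtracted from the current position, so the values near -373 open up a word gap while small ones like 28 or -28 are just kerning. The function name `tj_to_text` and the -200 cutoff are my own assumptions for illustration, not part of cpdf or the PDF spec, and the sample is a shortened copy of the array above.

```python
import json


def tj_to_text(operands, space_threshold=-200.0):
    """Rebuild readable text from a TJ operand array in the JSON shape above.

    Strings are kept as-is. A number at or below space_threshold
    (an assumed cutoff) is treated as a word space; other numbers are
    treated as kerning and ignored.
    """
    parts = []
    for item in operands:
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, dict) and "F" in item:
            if item["F"] <= space_threshold:
                parts.append(" ")
    return "".join(parts)


# Shortened copy of the TJ operand array from the JSON dump.
sample = json.loads("""
[ "Subsequen", {"F": 28.0}, "t", {"F": -374.0}, "to", {"F": -373.0},
  "the", {"F": -373.0}, "p", {"F": -28.0}, "erio", {"F": -28.0}, "d" ]
""")

print(tj_to_text(sample))  # prints "Subsequent to the period"
```

This only explains one operator; it does nothing for locating the right content stream from the page you are looking at, which is the harder half of the problem.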
This might actually be something very valuable to me.
I have a bunch of documents right now that are annual statutory and financial disclosures of a large institute, and they are organized just differently enough from one year to the next that cross-comparing them manually is too tedious. I've been looking around for a tool that could break out the content and let me reorder it so that the same section is on the same page in every report.
This might be it.