This idea is very closely related to what I did for my PhD dissertation. Our analysis of Darwin's readings was published in Cognition and is available at https://arxiv.org/abs/1509.07175.
Summary: Darwin kept a series of reading notebooks recording everything he read over a very convenient time span: from his return aboard the Beagle until a few months after The Origin of Species was published (CUL-DAR119, CUL-DAR128, http://darwin-online.org.uk/EditorialIntroductions/vanWyhe_n...). I used the bibliographic entries to look up 97% of the English, non-fiction works in his library and trained an LDA topic model over them. We observed clear behavioral shifts in his reading patterns over time, which corresponded with other biographical events.
We tied this into a theory of "information foraging" - that as people are researching they are either exploiting existing topics or exploring new areas. For example, early in the reading history, one reading largely followed from the previous one - there were few jumps in subject between readings. As he was writing the Origin and synthesizing resources, the topic shifts became much more exploratory.
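To make the exploit/explore distinction concrete, here is a minimal sketch, not the code from the paper. It assumes each reading has already been reduced to a topic mixture (for example by an LDA model) and uses KL divergence as the measure of a "jump"; the function names and the toy numbers are hypothetical, chosen only for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits between two discrete distributions of equal length."""
    return sum(
        pi * math.log2((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def consecutive_jumps(topic_mixtures):
    """Divergence of each reading from the one before it, in reading order."""
    return [
        kl_divergence(nxt, prev)
        for prev, nxt in zip(topic_mixtures, topic_mixtures[1:])
    ]

# Hypothetical toy data: each row is one book's mixture over 3 topics,
# listed in the order the books were read.
readings = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],  # close to the previous book: "exploitation"
    [0.1, 0.1, 0.8],  # far from the previous book: "exploration"
]

jumps = consecutive_jumps(readings)
for i, jump in enumerate(jumps, start=1):
    print(f"reading {i} -> {i + 1}: {jump:.3f} bits")
```

A run of small values reads as staying within a subject, and a run of large values reads as moving between subjects. Any threshold separating the two would be a further modelling choice.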
This original model had no goal-directed behavior. It was only a description of what he read, but it still found these behavioral shifts at the point where he recorded that he was beginning the Origin. In a later study, we began to look at the "zeitgeist" question you mention and how Darwin's drafts of the Origin diverged from the culture, as represented by the books he was reading (https://arxiv.org/abs/1802.09944). In this second study, we used the reading model and sampled his writings into that model space.
Both this library and his "to be read" notebooks define a nice "adjacent possible" - could he have read other things that may have been closer to his thoughts in the Origin? We use permutations of what he DID read as a null for studying his reading behavior - could he have read them in a more "optimal" manner? (With a lot of qualification on different "optimal" learning strategies...)
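The permutation null can be sketched roughly as follows. This is an illustration under stated assumptions, not the published analysis: it assumes topic mixtures per reading, uses the mean KL divergence between consecutive readings as the test statistic, and runs on made-up two-topic data. All names (`mean_jump`, `permutation_null`, `readings`) are hypothetical.

```python
import math
import random

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits between two discrete distributions of equal length."""
    return sum(
        pi * math.log2((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def mean_jump(mixtures):
    """Average divergence between consecutive readings in the given order."""
    jumps = [kl_divergence(nxt, prev) for prev, nxt in zip(mixtures, mixtures[1:])]
    return sum(jumps) / len(jumps)

def permutation_null(mixtures, n_perm=1000, seed=0):
    """Compare the observed reading order against shuffled orders of the same books."""
    rng = random.Random(seed)
    observed = mean_jump(mixtures)
    null = []
    for _ in range(n_perm):
        shuffled = list(mixtures)
        rng.shuffle(shuffled)
        null.append(mean_jump(shuffled))
    # Fraction of shuffled orders whose mean jump is at most the observed one.
    # A small value means the real order has fewer jumps than chance would give.
    p_low = sum(m <= observed for m in null) / n_perm
    return observed, null, p_low

# Hypothetical toy data: four books on one topic, then four on another.
readings = [[0.9, 0.1]] * 4 + [[0.1, 0.9]] * 4

observed, null, p_low = permutation_null(readings)
print(f"observed mean jump: {observed:.3f} bits")
print(f"null mean:          {sum(null) / len(null):.3f} bits")
print(f"share of shuffles at or below observed: {p_low:.3f}")
```

Because every shuffle contains exactly the same books, any difference between the observed statistic and the null comes from the order of reading alone, which is the point of using permutations as the baseline.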
Final note: the time scales of his record keeping were not rare for the time - many people of letters kept "commonplace books" that journaled these reading histories. There was once a project to collect these reading histories, but I don't have the link at hand now.