Lusky, J. The Index Catalog and Historical Shifts in Medical Knowledge, & Word Use Patterns. 2004. (this is a horrible citation, but I couldn’t find more info… the link to the abstract is below, you can follow a link from there to the full article)
This article is one of my favorite types, where the researcher has studied something quantitatively in order to draw conclusions about correlation patterns (and has made assumptions about what the results will indicate). Here, the author examined entries returned from the Index Catalogue of the Library of the Surgeon General’s Office (published from 1880 to 1961) in response to words searched through the title search and subject search functions. She looked specifically at words that reflected differing thought patterns relating to the same disease or condition. For example, she searched the term “germ” together with “syphilis”, “chorea”, and “beriberi” (each separately), then the term “hereditary” with each of the disease terms. Her presumption was that searching “germ” plus a disease would return items that supported the germ theory of that disease’s origin, and that articles returned from the “hereditary” plus disease search would support the heredity theory. She examined each over a time span from the 1860s to the 1920s. She asserts that changes in title headings reflect changing assumptions of “acceptable medical knowledge”, and her results do show that search term combinations correlate with changing medical theories over time (as germ theory grew in popularity, the percentage of titles containing each disease with “germ” increased and the percentage of titles containing each disease with “hereditary” decreased).
The research clearly demonstrates correlation, and it is difficult not to assume causation. How interesting to look at changing medical theories (at least those held by the publishing scientific community) this way!
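For anyone curious what that kind of tally looks like in practice, here is a minimal Python sketch of the percentage-by-decade idea. Everything in it is an assumption for illustration: the records are invented, the function name `share_by_decade` is hypothetical, and it only mimics the general approach described above rather than the author's actual procedure or the Index Catalogue's search behavior.

```python
from collections import defaultdict

# Invented (year, title) records, for illustration only.
# These are NOT real entries from the Index Catalogue.
RECORDS = [
    (1865, "Hereditary syphilis in infants"),
    (1868, "On hereditary chorea"),
    (1895, "The germ of syphilis"),
    (1898, "Hereditary chorea"),
    (1921, "Germ theory and beriberi"),
    (1924, "The germ of syphilis revisited"),
]

def share_by_decade(records, term):
    """Return {decade: fraction of that decade's titles containing `term` as a whole word}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, title in records:
        decade = year // 10 * 10          # e.g. 1865 -> 1860
        totals[decade] += 1
        if term in title.lower().split():  # simple whole-word match
            hits[decade] += 1
    return {decade: hits[decade] / totals[decade] for decade in sorted(totals)}

print(share_by_decade(RECORDS, "germ"))
print(share_by_decade(RECORDS, "hereditary"))
```

With these made-up records the “germ” share climbs across decades while the “hereditary” share falls, which is the shape of the trend the article reports; the numbers themselves mean nothing.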
Another interesting tidbit: the author used two search tactics I was unfamiliar with (and I have no idea if they are limited to the search engine used). First, she used truncation in a new (to me) way. By entering “germ*, germ *” (the space between the m and the * is intentional in the second term), the return would include “germ”, “germs”, and “germ theory” but would not return “Germans”. I’m not sure why that’s true, but it’s the example she listed. I understand that a word followed by a space and * would allow qualifier words or additional terms to be added (as in “germ *” matching “germ theory”), but I don’t understand why the initial “germ*” wouldn’t return “Germans”. Is the engine somehow using the two terms together?
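To make my confusion concrete, here is a small Python sketch of how the two patterns behave under the most common reading of truncation, where * stands for any run of letters (including none). That reading is my assumption; the helper `matches` is hypothetical, the titles are made up, and nothing here is known to reflect how the Index Catalogue search engine actually works.

```python
import re

def matches(term: str, title: str) -> bool:
    """Check a title against a search term, assuming * means 'any run of letters, possibly empty'."""
    # Escape the literal pieces and put a letters-only wildcard where each * was.
    pieces = [re.escape(piece) for piece in term.split("*")]
    body = "[a-z]*".join(pieces)
    # \b keeps the match aligned to whole words at both ends.
    pattern = re.compile(r"\b" + body + r"\b", re.IGNORECASE)
    return pattern.search(title) is not None

# Made-up titles, for illustration only.
titles = [
    "The germ theory of syphilis",
    "Germs and chorea",
    "Germans in medicine",
]

for title in titles:
    print(title, "| germ*:", matches("germ*", title), "| germ *:", matches("germ *", title))
```

Under this reading, “germ *” only matches “germ” followed by another word, so it skips “Germans”, but plain “germ*” does match “Germans”. So whatever rule excludes “Germans” in the author’s example, it is something this simple interpretation does not capture.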
She also “lemmatized” her search terms. According to Wikipedia, lemmatization is “the process of grouping together the different inflected forms of a word so they can be analyzed as a single item”. In her case, words like “treating” and “treated” were lemmatized to the noun form “treatment”. What?!?! Who knew about that? Here is an article that discusses lemmatizing and its particular importance in searching biomedical literature.
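Here is a toy Python sketch of what that grouping amounts to. It uses a hand-written lookup table (the `LEMMAS` dictionary and both function names are hypothetical), which is a stand-in for a real lemmatizer and not the tool the author used; the titles are invented.

```python
from collections import Counter

# Hypothetical lookup table mapping inflected forms to one shared form.
# A real lemmatizer would rely on a full dictionary and grammar rules.
LEMMAS = {
    "treating": "treatment",
    "treated": "treatment",
    "treatments": "treatment",
    "germs": "germ",
}

def lemmatize(word: str) -> str:
    """Return the grouped form of a word, or the lowercased word if it is not in the table."""
    word = word.lower()
    return LEMMAS.get(word, word)

def count_lemmas(titles):
    """Count words across titles after grouping inflected forms together."""
    counts = Counter()
    for title in titles:
        for word in title.split():
            counts[lemmatize(word.strip(".,;:"))] += 1
    return counts

# Made-up titles, for illustration only.
counts = count_lemmas(["Treating syphilis", "Syphilis treated with mercury"])
print(counts["treatment"])
```

The point is that “treating” and “treated” end up counted under one heading instead of two, so a trend in how often a concept appears is not split across spelling variants.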