The Practice of Science: Broken
I’m going to make what is likely to be taken as an extreme criticism. I hope to start some discussion with it, and perhaps get people to think about a problem in a somewhat different way. So here goes… Back in graduate school, Diane and I would attend conferences with our advisor, Bill Evans. At several conferences, we would look at posters or sit together to listen to presentations. It was a common occurrence that Bill would tell us that some method or result being reported as new was, in fact, something that had been done and duly reported in the literature decades prior. Bill could point us to a citation where we could check that, indeed, a chunk of “new” research was (in all likelihood unknowingly) a repetition of what had been done before. And that leads me to the main claim I’m going to make here:
A critical part of scientific practice is identification of prior work and putting current research in context of it. However, this part of scientific practice is not just weak or problematic. It is broken.
Let me explain. First, I will quote a definition of an effective method:
The Turing-Church thesis concerns the notion of an effective or mechanical method in logic and mathematics. ‘Effective’ and its synonym ‘mechanical’ are terms of art in these disciplines: they do not carry their everyday meaning. A method, or procedure, M, for achieving some desired result is called ‘effective’ or ‘mechanical’ just in case
1. M is set out in terms of a finite number of exact instructions (each instruction being expressed by means of a finite number of symbols);
2. M will, if carried out without error, always produce the desired result in a finite number of steps;
3. M can (in practice or in principle) be carried out by a human being unaided by any machinery save paper and pencil;
4. M demands no insight or ingenuity on the part of the human being carrying it out.
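To make the four criteria concrete, here is a minimal example of my own (not part of the quoted definition): Euclid’s algorithm for the greatest common divisor, which satisfies all four.

```python
# Euclid's algorithm: a textbook example of an effective method.
# (1) Finitely many exact instructions; (2) if carried out without
# error, it always halts with the correct answer; (3) it could be
# done with pencil and paper; (4) it demands no insight or
# ingenuity from whoever executes it.

def gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of two positive integers."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(1071, 462))  # -> 21
```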
Now, consider the problem at issue, that of locating prior scientific work on some topic. We’ll need to establish some points.
A. Except for the most narrowly restricted of topics, the scientific literature is too voluminous to be retained in detail in the memory of any one person. (Even someone with an eidetic memory can only flip pages so fast. Using a repository like a library is thus required.)
B. No repository, collection, or database of scientific literature offers a complete index of, or complete coverage of, that literature. (One cannot rely upon any single repository, collection, or database to survey all the work on a chosen topic. Comparisons of results from separate databases show that upwards of a third of the entries present in one may be absent from another; a toy sketch following this list illustrates the point.)
C. Systems for search based upon less than the full content are inadequate. No known information retrieval system can entirely make up for a failure to supply appropriate keywords, and keyword identification for indexing can itself be incomplete or erroneous.
D. Insignificant or negative results are commonly left unreported.
E. Methods that are tried, but fail to meet desired criteria for producing reliable results, are almost never reported.
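As a toy illustration of point (B), with invented identifiers rather than real database records, consider deduplicating two partial indexes and measuring what fraction of their union each one misses:

```python
# Toy illustration of point (B): no single database covers the union.
# The record identifiers below are invented for the example.

db_one = {"doi:10.1000/a1", "doi:10.1000/a2", "doi:10.1000/a3",
          "doi:10.1000/a4"}
db_two = {"doi:10.1000/a3", "doi:10.1000/a4", "doi:10.1000/a5",
          "doi:10.1000/a6"}

union = db_one | db_two
for name, db in (("db_one", db_one), ("db_two", db_two)):
    missing = len(union - db) / len(union)
    print(f"{name} misses {missing:.0%} of the combined records")
# Each database here misses a third of the union, mirroring the
# "upwards of a third" figure cited above.
```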
What I assert here is that there is no effective method that will, for each and every choice of scientific topic, produce a complete list of the existing publications that bear upon that topic, much less the complete range of work that has been attempted on the topic. Where one may occasionally obtain a complete result is likely to be where the topic is both recent in origin and where there are as yet few relevant published papers.
Certainly, one can apply certain steps and often obtain a partial list of existing publications. Consulting this specific library or that specific database, where each is noted for extensive and general holdings or entries, will likely provide one with some relevant references. But this does not alter the fact that a complete review of the relevant literature is highly unlikely for any particular topic, and it is less likely still that, across any series of unrelated searches, one would obtain a complete review in every case.
Why should completeness be desirable or desired for this problem? Certainly, completeness has been unobtainable for well over a century, and yet scientific practice continues and has accomplished amazing things. I would argue that scientific practice has succeeded in spite of, rather than because of, existing methods of reviewing prior work. There are a number of inefficiencies that result from our current methods.

First, researchers may unknowingly replicate work. “Reinventing the wheel” has entered popular language as the extreme example of this outcome. This problem is not even limited to the natural sciences; it is also endemic to the practice of mathematics, as illustrated by the case of “intelligent design” advocate William A. Dembski re-inventing the Rényi information measure of order α = 2 in a self-web-published article (see this page for details).

Second, researchers are denied the benefit of prior thought and effort in a field of inquiry. Even where this does not involve outright “re-invention”, keeping prior work obscure handicaps synthetic thinking.

Third, incompleteness implies that the most successful modern researchers are likely to be those who not only have a primary aptitude in their field of specialization, but who also have a high aptitude for mining the current system of literature retrieval for the best available approximation to complete results on specific inquiries. These are the people who will least often find themselves getting negative comments in peer review for having overlooked prior work. This secondary aptitude is largely a matter of talent and art rather than the application of a straightforward set of techniques, which means that scientific success is a composite of work done using the scientific method and of work done in the tradition of scholarship in the humanities, which is in various aspects not well systematized. This violates item (4) in the list of attributes of an effective method given above.
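For reference, the standard formula behind that anecdote (my addition, not from the original article): the Rényi entropy of order α for a distribution p_1, …, p_n, of which Shannon entropy is the α → 1 limit, is

```latex
H_\alpha(X) = \frac{1}{1-\alpha}\,\log\!\left(\sum_{i=1}^{n} p_i^{\alpha}\right),
\qquad
H_2(X) = -\log \sum_{i=1}^{n} p_i^{2}.
```

The α = 2 case, sometimes called the collision entropy, is the special case at issue above.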
There are various other drawbacks to incompleteness that could be explicated, but I think that these are enough to show that the problem is real and worthy of our attention. OK, at this point I’ve either convinced you that there is a problem worth considering, or that I’ve gone completely loopy. If the latter, you can drift off to doing something else, and take it as read that I already know that a substantial number of readers will disagree with me on this stuff.
Please note that my list of issues that lead to incompleteness is more inclusive than simply an accusation that the library sciences have not delivered a solution. Part of the problem involves what is deemed worth remembering and preserving about effort in the sciences that is not directly fruitful. Just as “hunting” does not mean “catching”, so too does “research” often fail to obtain a desired result. Knowledge about methods that are ineffective or not well-suited to particular problems is useful in the sense that, if it is published and attended to, it can reduce wasted effort on the part of other researchers. There is a lot of jocularity concerning the titles of the humor magazines, the “Journal of Irreproducible Results” and the “Annals of Improbable Research”, but various researchers in conversation will admit that there is something to the notion that our current emphasis on telegraphic reportage of results overlooks the utility of laying out not only what worked in a research project, but also the various problems that a research effort encounters and sometimes overcomes.
Given that I have advocated for the existence of a problem, a reasonable concern is whether I have any constructive notion about what to do about it. I think that the problem is certainly difficult, but I do have hopes that it may actually be soluble. What remains to be seen is whether the level of sustained effort that would be necessary to get to a solution that meets several of the properties of an effective method would be worth the very real costs associated with it.
The first cost would be freeing scientific knowledge from the grip of commercial interests. The publication of scientific research is largely done in such a way that the publishing entities expect to retain rights to the work published and be compensated in some way for access to that work over a significant period of time (copyright currently granting something like most of a century of protection). This is a large problem in itself, as current research appears in a bewildering array of thousands of research journals, each applying somewhat differing standards and procedures to the peer-review process. Indexing services themselves add another layer of costs, and any solution to the problem is going to have the side effect of putting these services out of business.
Another cost would be in establishing a comprehensive means of surveying past work. While one may argue that current research is of most value and that a system might be contemplated that would deliver comprehensive and complete results concerning work more recent than some arbitrary date nnnn, this is a bit short-sighted. Once one has the procedural and other concerns taken care of for solving the case of the scientific literature more recent than nnnn, one simply has the finite and diminishing body of surviving work as one expands the system to include earlier values of nnnn. We may as well make a determined effort to incorporate the corpus of human knowledge in the natural sciences in the system.
A further cost lies in making any such system universally accessible. One could argue that, having spent the time, effort, and money to develop a comprehensive system for retrieval of knowledge in the natural sciences that one should attempt to recover part of that cost in direct fees from users. I think that, too, is short-sighted. The benefit of open access to this knowledge lies in fostering work that would in turn be added to the system. Along the way, one would expect that applied results will generate economic benefits that should make the costs discussed so far trivial in comparison. That, though, is simply my expectation and could be a point of argument.
There are the costs of overcoming the technical hurdles in search. I think the progress made from 1994 to the present in returning relevant results for natural-language queries over the unorganized content of the World Wide Web indicates that this problem, too, should yield to concentrated effort, or at least give an approximation to complete results that significantly advances the progress of scientific research.
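To gesture at why I think the retrieval side is tractable, here is a deliberately minimal sketch of full-text search: an inverted index over an invented three-document corpus. A production system would add TF-IDF weighting, stemming, and phrase queries, but the core mechanism is no more than this:

```python
# Minimal full-text search sketch: build an inverted index over a
# toy corpus, then rank documents by how many query terms they
# contain. The "abstracts" below are invented for illustration.

from collections import defaultdict

corpus = {
    "paper-1": "renyi entropy generalizes shannon information measures",
    "paper-2": "full text retrieval of the scientific literature",
    "paper-3": "negative results in entropy estimation are unreported",
}

index = defaultdict(set)
for doc_id, text in corpus.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query: str) -> list[str]:
    """Rank documents by the number of matching query terms."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("entropy information"))  # -> ['paper-1', 'paper-3']
```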
There are the costs of changing what gets reported as a scientific result. We need to compensate for the time needed not only to report briefly what eventually worked in research (as is done now), but also to report, in sufficient detail to give others a basis to benefit without having to travel down the same false paths themselves, what problems were encountered in technique and methods. This also includes the cost of expanding the volume of reported results to accommodate the extra words needed to do this.
Would a trillion dollars solve this problem? I think so, handily. I suspect that the actual cost would be a small fraction of that figure. (The previous post reports on how the Encyclopedia of Life project has, as a secondary effort, already digitized about 1% of the available literature on taxonomy for some fraction of its $25 million operating budget.) There are things costing us a chunk of a trillion dollars that don’t have anywhere near the potential for economic benefit that solving this problem in scientific practice might offer. In fact, it might be just the thing to do to prepare to pay that outstanding bill.
Update 2023-01-15:
I’ve been interacting with generative AI models, mostly OpenAI’s ChatGPT, for a little over a week, and I think the solution to the technical problem here has arrived. One of these generative AI models premised on GPT-x or some other similar technology should be able to provide the fix we need to level the “literature search” playing field. There appear to have been some specific blocks put in place in ChatGPT to make it nearly unusable for academic literature search, but given how ChatGPT handles the minutiae of coding, I’m confident that is a purposeful hobbling rather than a deficiency in capability. And, as I said, it took nowhere near a trillion dollars to develop the tech. It may cost that much to buy back human knowledge from its commercial gatekeepers, though.
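For concreteness, the kind of use I have in mind might look like the sketch below, written against OpenAI’s REST chat-completions endpoint. The model name, prompt framing, and helper function are my own choices, and any citations a model returns would need checking against a real index before being trusted.

```python
# Hedged sketch: asking a chat model for literature leads via
# OpenAI's chat-completions REST endpoint. Requires an API key in
# the OPENAI_API_KEY environment variable. Model output is a
# starting point only; every reference it names must be verified,
# since these models can fabricate plausible-looking citations.

import os
import requests

def literature_leads(topic: str) -> str:
    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-3.5-turbo",  # assumed model choice
            "messages": [{
                "role": "user",
                "content": f"List prior published work on: {topic}. "
                           "Give authors, venues, and years.",
            }],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(literature_leads("Renyi entropy of order 2"))
```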
You are absolutely right, and I have been worrying about this problem for years (perhaps decades).
Telegraphically, the emphasis in research on “novelty” is supposed to take care of “scholarship”, but as you point out, it has not (and the problem is worsening with the rapid growth of knowledge).
So far, scientists have been successful in part because of (poorly understood) mechanisms for social thinking. We need more understanding, here.
Nathan Myhrvold seems to be doing something along these lines, but I’m not sure just what.
The game of science has really just evolved to its current state, and I’m not sure that the opposing forces and compromises of this game have served the ideal (“Go and find out”) as well as they have served other interests.
I am an applied mathematician who has spent considerable time and effort in broadening my areas of activity, and though exhilarating it is bloody hard (I have most recently been trying to learn about evolution, and honestly I find computational mathematics restful in comparison).
What has made this past year’s effort possible is that I was awarded a nice no-strings research prize, and I have been able to really stretch, essentially risk-free for my career, to try to acquire some expertise and scholarship without worrying about publication (this year, at least).
To make such opportunities more widely available and encourage people to take them up might be of great benefit to the whole enterprise of science. Done right, it might speed science up by removing some of the inefficiencies you mention.
But I have a feeling that my own dilettantish experiment isn’t the right model; maybe nearly, but more structure might be better.
Better interdisciplinary tools… more speed in learning, searching, reading. More time, too, of course.
Sigh.
Some tangentially related thoughts on another blog I read occasionally.
http://tremont.typepad.com/technical_work/2006/08/a_review_of_hor.html
I was discussing this with a biology grad student a few months ago while he was prepping for comps. I wonder how much difference there is between disciplines. We discussed that in biology there are usually key seminal works that everyone will cite, whereas in the social sciences we tend to include pages and pages of citations, so I think that makes it easier to locate a large portion of the material that has been published. If you keep going through all the other citations, eventually you hit redundancy, and you’ve collected the majority of the material. That being said, we still have the problem of grey literature and unpublished theses.
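The “keep following citations until you hit redundancy” strategy is essentially a breadth-first traversal of the citation graph; a sketch, with an invented graph standing in for a real citation database:

```python
# Citation "snowballing" as breadth-first search over a citation
# graph. The graph below is invented; in practice the edges would
# come from a citation database.

from collections import deque

cites = {
    "new-paper": ["seminal-1", "review-2"],
    "review-2": ["seminal-1", "thesis-3"],
    "seminal-1": [],
    "thesis-3": ["seminal-1"],
}

def snowball(start: str) -> set[str]:
    """Collect every work reachable by following reference lists."""
    seen, queue = {start}, deque([start])
    while queue:
        for ref in cites.get(queue.popleft(), ()):
            if ref not in seen:  # redundancy: already collected
                seen.add(ref)
                queue.append(ref)
    return seen

print(snowball("new-paper"))
# Grey literature and unpublished theses are exactly the nodes
# such a graph tends to be missing.
```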
First, note that I think the conclusions you are trying to draw are correct (i.e., a complete literature survey is impossible, and often even a cursory literature survey is impossible).
That said, if you are using the definition that underlies the Turing machine, “finite” can be very, very large; “finite” is a fairly weak condition in this context. For instance, there is nothing non-finite about the procedure:
– go through every piece of literature in turn
if there are finitely many pieces of literature (which indeed there are). Large numbers (e.g. 10^10000000000) are still finite.
I think you want to define “effective” as “can be performed by a human in 3 months”, rather than “can be performed by a Turing Machine given any finite amount of time”.
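A back-of-the-envelope calculation, with an assumed corpus size and reading rate, shows why “finite” and “effective in practice” come apart here:

```python
# "Finite" versus "effective in practice": exhaustively reading the
# literature is a finite procedure, just not a feasible one. Both
# figures below are rough assumptions for illustration.

papers = 50_000_000          # order-of-magnitude guess at the corpus
minutes_per_paper = 10       # a generous skimming rate
minutes_per_year = 60 * 24 * 365

years = papers * minutes_per_paper / minutes_per_year
print(f"{years:,.0f} person-years")  # ~951 person-years
```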
I think that we already have existence proofs for the effectiveness of full-text searches on very large datasets; if you’ve used the Google, AltaVista, or Yahoo search engines lately, you’ll have seen this. Adding the full-text content of the scientific literature to the available text dataset wouldn’t make any of those take anywhere near three months to get back to you.
I was thinking, though, that a search engine specifically aimed at the scientific literature would likely be more suitable than simply loading up some existing search engine.