
Is AI Indexing Nearly Here?

No surprise, publishing continues to react and interact with artificial intelligence. Colleagues recently raised the topic of AI on a couple of indexing email lists. I get the sense that many indexers are concerned about the potential for AI to replace us, or at least that publishers will believe that AI can replace a human-written, thoughtfully constructed index. I have to admit I also feel uncertain about what the future holds. I wrote about AI and indexing last year, and I think it is worth considering again.

Is Indexing by AI Nearly Here?

One colleague flagged this article from The Scholarly Kitchen, “AI-Enabled Transformation of Information Objects Into Learning Objects,” by Lisa Janicke Hinchliffe. Hinchliffe reviews three new AI tools which purport to help readers access and understand academic writings. Of particular interest to indexers is what Hinchliffe writes about Papers AI Assistant: 

When exploring the functionality as a beta tester, I was curious how the results compared to my pre-AI tool practice of making heavy use of CTRL-F to locate keywords in lengthy texts. I found that, not only did the Papers AI save me a great deal of time by providing me with an overview annotated with links to specific sections of the text, it also often alerted me to places in the text where my topic of interest was conceptually discussed without the use of the specific keywords I would have searched.

Did you catch that last bit? Papers AI Assistant can apparently identify discussions of interest without the use of a keyword search. That is what a good index is supposed to do. Is this the beginning of an AI that can replace indexers? Hinchliffe also writes, “I am excited by the possibilities these AI tools offer for moving the focus from access to information to comprehension of it.”

A few thoughts: I have to admit that I am skeptical of the claim or hope that Papers AI and similar tools will help readers comprehend information. My sense is that AI works best as a tool, with the user clearly understanding its strengths and limitations, and with the user making the final decision on the quality of the results and how best to use them. That is similar to how I use the search function when indexing. Search is useful for double-checking facts and mentions, but I know that it doesn’t catch everything and isn’t good at providing context; I still need to read and understand the book. My fear is that many users will uncritically accept whatever the AI tool tells them, turning a program like Papers AI into glorified CliffsNotes and enabling an even shallower engagement with the text.

I think it is also worth pointing out that what is described here is not an index. An index is a static document that is browsable. That is very different from an AI highlighting a handful of potentially relevant passages. Browsability is key to an index because it allows the user to serendipitously find information they didn’t know they wanted to find. Being handed a few options leaves the rest of the text opaque and inaccessible. I imagine a user can keep asking the AI new questions, but that puts the onus on the user to know what they are searching for and how to ask relevant questions.

Of course, if an AI can identify concepts and discussions in the absence of clear keywords, then a logical next step could be to ask that AI to generate an index. I can see value in the ability to create an index on the fly, for any document. I don’t know how much I would trust such an index, though. Hallucinations are one issue. Another is that AI, essentially, is built upon algorithms. Answers are always going to follow a certain pattern. While indexing is built upon rules and conventions, the indexer also plays a key decision-making role in shaping the contents, phrasing, and structure. These judgement calls extend beyond the formal rules of indexing to take into account elements such as the audience and usability. I am skeptical that an AI would be able to understand and produce these nuances.

Another issue is that these AI tools are entirely digital. They will not work on a print book, though, of course, an AI-generated index could be published in print. Is the future of publishing and of engagement with texts entirely digital? Perhaps in academia and other specialized fields, in which there is so much information to access and consume. Print sales remain strong, however, and I am hopeful that there will continue to be a place for print indexes. Perhaps the future—finally arrived?—is what embedded indexing has long promised: one index capable of being used in multiple formats.

Besides AI replacing indexers, I think it is also worth considering how we as indexers can use AI in our own work. I am aware of one colleague who uses ChatGPT to summarize complicated books and to answer queries about the text, which helps that indexer comprehend the book more quickly. That sounds very similar to what Papers AI claims to do, and I think it is a legitimate use of AI. So long as the indexer is in control—using the AI as a tool, understanding both indexing best practices and the contents of the book, and actively shaping the index—then why not use AI? I’m also open to having AI index elements which are time-consuming to pick up, such as scientific names, so long as the indexer is providing quality control. What I don’t want to see are indexers—or anyone else—passively accepting an AI-generated index, assuming that it is accurate and functional when it is actually not. That is my worst nightmare about AI: that we abdicate our critical thinking and decision-making skills, potentially leading to errors and disasters because we have lost the ability to assess what AI is telling us.

Author Pushback

In contrast to the gold rush to embed AI into publishing, another colleague pointed out that some books are beginning to be published with prohibitions against AI and machine learning listed on the copyright page. I also recently noticed this in a book I am indexing.

I’ve also heard from a trade client that their authors are starting to insist that book contracts include a clause that their books will not be uploaded or otherwise used to train AI. By extension, this means that all freelancers hired by this press, including myself, are not allowed to use AI tools while working on their manuscripts and proofs (which isn’t a problem for me, since I wasn’t doing so anyway). 

Will authors and publishers win against AI? Will publishers find ways to enforce their contracts and prohibitions? Will publishers change their minds, or will AI developers sufficiently address the fears that authors have? I suspect this may be an area where the publishing industry goes in two different directions: some segments, such as academic publishing, which prize easy access to information (provided you can get behind the paywall), will embrace AI, while other segments, which care more about the author and which sell directly to readers, will reject AI. 

Or, maybe AI in publishing is a bubble and these new applications will fail to live up to their hype. 

I still think that someone will try to develop an AI capable of writing an index. Some publishers will probably adopt it for the sake of saving time and money, even if the resulting indexes are useless. I am also hopeful that the value of the human touch will remain. Even if AI is incorporated into our work, I think there is still a place for human guidance and discernment. Machines may be capable of generating an approximation, but only humans can create what is truly useful for other humans.
