
Is AI Indexing Nearly Here?

No surprise, publishing continues to react and interact with artificial intelligence. A couple of colleagues recently raised AI on a couple of indexing email lists. I get the sense that many indexers are concerned about the potential for AI to replace us, or at least that publishers will believe that AI can replace a human-written, thoughtfully constructed index. I have to admit I also feel uncertain about what the future holds. I wrote about AI and indexing last year, and I think it is worth considering again. 

Is Indexing by AI Nearly Here?

One colleague flagged this article from The Scholarly Kitchen, “AI-Enabled Transformation of Information Objects Into Learning Objects,” by Lisa Janicke Hinchliffe. Hinchliffe reviews three new AI tools which purport to help readers access and understand academic writings. Of particular interest to indexers is what Hinchliffe writes about Papers AI Assistant: 

When exploring the functionality as a beta tester, I was curious how the results compared to my pre-AI tool practice of making heavy use of CTRL-F to locate keywords in lengthy texts. I found that, not only did the Papers AI save me a great deal of time by providing me with an overview annotated with links to specific sections of the text, it also often alerted me to places in the text where my topic of interest was conceptually discussed without the use of the specific keywords I would have searched.

Did you catch that last bit? Papers AI Assistant can apparently identify discussions of interest without the use of a keyword search. That is what a good index is supposed to do. Is this the beginning of an AI that can replace indexers? Hinchliffe also writes that, “I am excited by the possibilities these AI tools offer for moving the focus from access to information to comprehension of it.”

A few thoughts: I have to admit that I am skeptical of the claim or hope that Papers AI and similar tools will help readers comprehend information. My sense is that AI works best as a tool, with the user clearly understanding its strengths and limitations, and with the user making the final decision on the quality of results and how best to use the results. That is similar to how I use the search function when indexing. Search is useful for double-checking facts and mentions, but I know that it doesn’t catch everything and isn’t good at providing context; I still need to read and understand the book. My fear is that many users will uncritically accept whatever the AI tool tells them, turning a program like Papers AI into glorified CliffsNotes and enabling an even shallower engagement with the text.

I think it is also worth pointing out that what is described here is not an index. An index is a static document that is browsable. That is very different from an AI highlighting a handful of potentially relevant passages. Browsability is key to an index because it allows the user to serendipitously find information they didn’t know they wanted to find. Being handed a few options leaves the rest of the text opaque and inaccessible. I imagine a user can keep asking the AI new questions, but that puts the onus on the user to know what they are searching for and how to ask relevant questions.

Of course, if an AI can identify concepts and discussions in the absence of clear keywords, then a logical next step could be to ask that AI to generate an index. I can see value in the ability to create an index on the fly, for any document. I don’t know how much I would trust such an index, though. Hallucinations are one issue. Another is that AI, essentially, is built upon algorithms. Answers are always going to follow a certain pattern. While indexing is built upon rules and conventions, the indexer also plays a key decision-making role as they shape the contents, phrasing, and structure. These judgement calls extend beyond the formal rules of indexing to take into account elements such as the audience and usability. I am skeptical that an AI would be able to understand and produce these nuances.

Another issue is that these AI tools are entirely digital. They will not work on a print book, though, of course, an AI-generated index could be published in print. Is the future of publishing and of engagement with texts entirely digital? Perhaps in academia and other specialized fields, in which there is so much information to access and consume. Print sales remain strong, however, and I am hopeful that there will continue to be a place for print indexes. Perhaps the future—finally arrived?—is what embedded indexing has long promised, which is one index capable of being used in multiple formats. 

Besides AI replacing indexers, I think it is also worth considering how we as indexers can use AI in our own work. I am aware of one colleague who uses ChatGPT to summarize complicated books and to answer queries about the text, which helps that indexer comprehend the book more quickly. That sounds very similar to what Papers AI claims to do. I think that is a legitimate use of AI. So long as the indexer is in control (using the AI as a tool, understanding both indexing best practices and the contents of the book, and actively shaping the index), then why not use AI? I’m also open to having AI index elements which are time-consuming to pick up, such as scientific names, so long as the indexer is providing quality control. What I don’t want to see are indexers, or anyone else, passively accepting an AI-generated index, assuming that it is accurate and functional when it is actually not. That is my worst nightmare about AI: that we abdicate our critical thinking and decision-making skills, potentially leading to errors and disasters because we have lost the ability to assess what AI is telling us.

Author Pushback

In contrast to the gold rush to embed AI into publishing, another colleague pointed out that some books are beginning to be published with prohibitions against AI and machine learning listed on the copyright page. I also recently noticed this in a book I am indexing.

I’ve also heard from a trade client that their authors are starting to insist that book contracts include a clause that their books will not be uploaded or otherwise used to train AI. By extension, this means that all freelancers hired by this press, including myself, are not allowed to use AI tools while working on their manuscripts and proofs (which isn’t a problem for me, since I wasn’t doing so anyway). 

Will authors and publishers win against AI? Will publishers find ways to enforce their contracts and prohibitions? Will publishers change their minds, or will AI developers sufficiently address the fears that authors have? I suspect this may be an area where the publishing industry goes in two different directions: some segments, such as academic publishing, which prize easy access to information (provided you can get behind the paywall), will embrace AI, while other segments, which care more about the author and which sell directly to readers, will reject AI. 

Or, maybe AI in publishing is a bubble and these new applications will fail to live up to their hype. 

I still think that someone will try to develop an AI capable of writing an index. Some publishers will probably adopt it for the sake of saving time and money, even if the resulting indexes are useless. I am also hopeful that the value of the human touch will remain. Even if AI is incorporated into our work, I think there is still a place for human guidance and discernment. Machines may be capable of generating an approximation, but only humans can create what is truly useful for other humans.


Finding Your Indexing Niche

Last month was very busy for me, culminating in the Indexing Society of Canada’s virtual conference, where I co-presented with Enid Zafran on the current state and future of embedded indexing. I may write more later about embedded indexing, but in the meantime, our findings reminded me of how segmented publishing is.

Publishing houses range from small regional or literary presses that only publish ten or twenty books per year to behemoths, such as HarperCollins or Penguin Random House, with their dozens of imprints. Or, from a small university press that specializes in a handful of subjects and, again, maybe only publishes ten books per year, to the massive scholarly presses like Oxford UP or Palgrave Macmillan. There is also now the distinction between traditional publishers, who buy book rights, and hybrid publishers, who give authors both more responsibility and more control. Self-publishing is also an increasingly viable option.

Some publishers manage production in-house and want to be in direct communication with their freelancers while other publishers prefer to make indexing the author’s responsibility and/or work through third-party production companies. Some publishers prefer embedded indexes while others want a separate back-of-the-book index. Some publishers care about the quality of their books and are willing to pay their freelancers a fair price while other publishers only seem to care about volume and spending as little as they can. 

Then, of course, there are the countless subjects that books are published in. Some publishers are very specialized, while others—especially large publishers—publish across a wide range of subjects.

What this means for you, as a freelance indexer (or editor, or designer, or project manager), is that the type of work you get, the type of clients you work with, and possibly even your income, can vary considerably depending on how you position yourself within these submarkets. 

Do you want to exclusively write embedded indexes? You can do that, and probably receive more offers for work than you can accept. Do you want to specialize in science and engineering texts? You can do that too. Work only with authors? Or only with publishers? You can market yourself to get those results. 

Being a freelancer within an industry as vast as publishing is both an advantage and a challenge.

The advantage is that you can’t possibly work for everyone. This gives you the freedom to pick and choose. Be competitive by choosing a segment or two that is interesting to you and that other indexers are maybe less active in. Only market towards the clients you want to work with and ignore the rest. Find a way to differentiate yourself.

The challenge is that it can be difficult to break into a niche. It takes time to build a reputation and for your name to be passed around by word of mouth. It can be difficult to identify and contact the gatekeepers who hire or refer freelancers. I am currently trying to shift towards indexing more Asian studies and religious studies books, and even I am finding that to be a slow process. It can also be a challenge to know which niches to pursue.

But even if you experiment with a few niches to see which sticks (which is certainly fair to do as you get started), I still encourage you to try and narrow your focus. It is easier to build expertise in a subset of subjects or with a subset of clients than to be an expert at everything. And while it takes time to break in, once established I think you will find that you have more than enough work.

As you think about which niches to pursue and how to differentiate yourself, consider some of these questions:

  • What subjects do you enjoy? What subjects do you already have some expertise in?
  • Do you have a preference for trade books or scholarly books? What about other areas, such as journals, databases, and websites?
  • Do you enjoy embedded indexing? Are you willing to learn?
  • Do you prefer working with authors or publishers?
  • How much do you want to earn? Which clients are more likely to pay what you want?
  • How many projects do you want per month or per year?

Many indexers, including myself, work within a few niches. Having variety is an insurance policy against one niche or client disappearing, and switching back and forth between different subjects or types of projects can also be more enjoyable. But I think most long-time indexers would also agree that they don’t try to serve everyone. That is simply too much to ask of one person.

Have a focus, or two or three. Become a recognized expert in those areas. That will serve you better in the long run. To be different is to be competitive.