
Pointing Readers in the Right Direction

Welcome back to this mini-series on the basic elements of an index.

I’m currently looking at what makes up an entry, which I described as “what this thing is + where to find it.” In my previous post I discussed main headings and subheadings, which form the first part of that equation. Today, I’m writing about the second part, “where to find it,” also known as locators.

Locators are the portion of the entry which tells readers where to find information about the main heading and subheading. They are like directions for the reader to follow. To be effective, locators need to be clear, specific, and accurate. 

There are three points that I want to make about locators.

The first is that a locator can be anything. Page numbers are usually the default locator, especially when indexing books, but other forms of locators can also be used, such as paragraph or policy numbers. The only criterion is that the locator is appropriate for the material being indexed.

My second point is that the locator needs to be clear and specific. Readers should understand how the locator relates to the text, and should be able to easily use the locator to find the desired information.

To give a few examples, page numbers are often augmented when referring to figures and tables or to footnotes and endnotes. For figures and tables, the page number may be placed in italics or bold, or have a fig. or t appended. For notes, the note number is usually appended to the page number, as in 253n43 or 265nn14-15. These allow the reader to more quickly pinpoint the information on the page.

For a cumulative index for a journal or multivolume series, the locator should probably include a volume or issue number alongside the page number, as in VII.343. For a policy document that I update every couple of years, I use unique policy numbers instead of page numbers, as in 20.2.1.2. These policy numbers both direct readers to the specific policy more quickly (especially if there are ten or more policies listed per page) and make updating the index a whole lot easier, as I don’t need to worry about the pagination shifting as policies are added, removed, and revised.

When augmenting page numbers or using something different for a locator, it may be helpful to explain your choice in a headnote so readers understand how to interpret the locator. Page numbers, being the default, don’t need to be explained.

To give a few examples of locators:

conduction, 83, 84fig. 

convection, 85, 84t, 234n43

safety protocols, 2.1.13.4, 3.3.1.12, 3.4.3.6

thermodynamics, VI.343, IX.23
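If you ever need to handle locators programmatically, for instance when checking an exported index, the formats above can be told apart with a few patterns. The following is a minimal Python sketch, not a feature of any indexing software. The function name `classify_locator` and the category labels are made up for illustration, and the patterns only cover the locator styles shown in this post.

```python
import re

# Hypothetical patterns, checked in order; each covers one locator style from the post.
LOCATOR_PATTERNS = [
    ("note", re.compile(r"^\d+nn?\d+(?:-\d+)?$")),          # 253n43, 265nn14-15
    ("figure", re.compile(r"^\d+fig\.?$")),                 # 84fig.
    ("table", re.compile(r"^\d+t$")),                       # 84t
    ("volume_page", re.compile(r"^[IVXLCDM]+\.\d+$")),      # VII.343
    ("policy_number", re.compile(r"^\d+(?:\.\d+)+$")),      # 20.2.1.2
    ("page_range", re.compile(r"^\d+-\d+$")),               # 51-52
    ("page", re.compile(r"^\d+$")),                         # 83
]


def classify_locator(locator: str) -> str:
    """Return the label of the first pattern that matches, or 'unknown'."""
    text = locator.strip()
    for label, pattern in LOCATOR_PATTERNS:
        if pattern.match(text):
            return label
    return "unknown"


for sample in ["83", "84fig.", "84t", "234n43", "2.1.13.4", "VI.343"]:
    print(sample, classify_locator(sample))
```

Anything the patterns do not recognize comes back as `unknown`, which is a useful prompt to check whether the locator needs explaining in a headnote.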

The last point I want to make is that the number of locators in an array should be reasonable for readers to search. As a general rule of thumb, an array should contain no more than 6-10 undifferentiated locators. Larger arrays, with more than ten locators, should have subheadings to sort the references and make it easier for the reader to identify the relevant aspects of the subject. The concern is that searching through a long string of undifferentiated locators is time-consuming and may discourage the reader from finding what they need. Better to set the reader up for success by presenting locators in smaller chunks.
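This rule of thumb is easy to check mechanically. Here is a small Python sketch, assuming the index is available as a dictionary that maps each main heading to its list of undifferentiated locators. The function name, the sample data, and the default limit of ten (taken from the upper end of the range above) are all illustrative assumptions.

```python
def arrays_needing_subheadings(arrays: dict, limit: int = 10) -> list:
    """Return the headings whose undifferentiated locators exceed the limit."""
    return sorted(
        heading for heading, locators in arrays.items() if len(locators) > limit
    )


# Hypothetical sample data, for illustration only.
sample_index = {
    "conduction": ["83", "84fig."],
    "heat transfer": [str(page) for page in range(80, 92)],  # 12 locators
}

print(arrays_needing_subheadings(sample_index))
```

A check like this only flags candidates. Whether and how to break an array into subheadings is still a judgment call.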

 

When writing an entry, the main heading and, possibly, subheadings, tell readers what the subject is, while the locator tells readers where to find information about that subject. Locators can be anything, and should be clear and specific. The number of locators should also be reasonable for readers to search. Do this, and you are setting your reader up for success.


Telling Readers What This Thing Is

In my last post I wrote about entries and arrays, which I described as the building blocks of an index. I defined an entry as “what this thing is + where to find it.” Today I’m going to expand on “what this thing is,” also known as main headings and subheadings.

The fundamental purpose of an index is to guide readers to the information they are searching for. To do that, the index needs to be clearly written, which begins with the first words that the reader sees.

Main Headings

The main heading, also known simply as the heading, kicks off the entry. This is the first word or phrase that you see in an entry and array. It is typically a noun, and should be clear and concise. If a longer phrase is needed, the main heading should lead with the most important element. The main heading should also match how the term is used in the text, such as using the same spelling and capitalization. 

The heading should be reflective of both the text and the audience. Is the book discussing cars more generally or electric vehicles specifically? Or both? Will readers be reading this book to learn about the auto industry, new innovations, or specific brands and models? Or all of the above? To give another example, biblical characters such as Matthew, Silas, and Timothy probably don’t need a gloss clarifying their identity in a work of biblical studies, but these names may be less familiar to readers if they appear in other disciplines.

Another consideration is whether or not to pluralize main headings. Should it be dog or dogs? Cantaloupe or cantaloupes? To start, be mindful of differences in nuance. Freedom is somewhat different from freedoms, for example. Otherwise, I tend to follow common usage. If a term is commonly pluralized, then I go ahead and make it plural in the index, which I think reads more naturally. 

To give a few examples of main headings:

Acts (biblical book)

Cleveland, Grover

electric vehicles

heat transfer

London (ON)

trade wars, retaliatory

Subheadings

For short arrays containing a handful of locators, a main heading is usually sufficient to specify what this thing is. But more specificity is often needed for topics with extensive discussion (usually when there are more than 6-10 locators) or if there are different aspects that readers would appreciate seeing differentiated.

The subheading is placed after the main heading. Its purpose is to further clarify what this thing is. Because subheadings often differentiate references from one another, there are usually multiple subheadings per array.

Since the subheading is appended, there is more flexibility in how it can be phrased. Depending on the context, the subheading can be either a short word or phrase, or it can be longer and more descriptive.  In all cases, the relationship between the main heading and subheading should be clear. If possible, I try to also lead with the key word, which both affects how the subheadings are alphabetically sorted and, I think, makes it easier for readers to find the subheading as they scan the array. 

For example,

Acts (biblical book): authorship; within biblical canon; commentaries on; Paul within

Cleveland, Grover: first presidency; free-silver issue; legislative achievements; private life between presidential terms; second presidency

heat transfer: conduction; convection

Effective headings and subheadings connect readers to the text. Main headings are the point at which readers encounter the index, and readers should not need to guess what this thing is. The same is true for subheadings, if the reader decides to read further into the array. 

Tell the reader what they need to know. Be specific and concise. Do this, and your index will be well on its way to being excellent.


The Building Blocks of an Index

An index is a document that is scanned to find information. It usually spans several pages. But if you had to break an index down into its smallest part, what would that be?

An index is not like most books or documents in that it does not contain a narrative. It cannot be reduced to plot points or the components of an argument. An index doesn’t even contain proper sentences. Instead, an index is a compilation of references. Broken down, the smallest unit within an index is an entry. 

An entry has two components. Basically, “what this thing is + where to find it.” Using indexing terminology, this is “main heading + locator.” Or, to add another level of specificity, “main heading + subheading + locator.” From the entry, the reader can identify what they are looking at and where to find it in the text. For example,

Foxconn, 45

semiconductors: geopolitics of, 67

The second building block is an array. I like to think of an array as containing everything that an index—and by extension the book or document—has to say about a particular subject. If you want to learn about Foxconn, you search for the Foxconn array. Want to learn about semiconductors, you search for the semiconductors array.

If there is only one mention, then a single entry can serve as a single array. But more often, there are multiple discussions throughout the book, which lead to the creation of multiple entries. Combined together, the entries create an array.

Foxconn, 45, 49, 51-52

semiconductors: fabrication techniques, 54-57; geopolitics of, 67; history of, 23-25; properties of, 34, 44
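For readers who think in data structures, the relationship between entries and arrays can be sketched in a few lines of Python. This is only an illustration of the idea, not how any indexing software stores its records. The names `Entry`, `build_arrays`, and `render_array` are made up, and the sketch assumes entries are added in page order and that an array either has subheadings throughout or has none.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One entry: what this thing is (heading, optional subheading) + where to find it."""
    heading: str
    locator: str
    subheading: str = ""


def build_arrays(entries):
    """Group entries by main heading, then by subheading, keeping locator order."""
    arrays = defaultdict(lambda: defaultdict(list))
    for entry in entries:
        arrays[entry.heading][entry.subheading].append(entry.locator)
    return {heading: dict(subs) for heading, subs in arrays.items()}


def render_array(heading, subs):
    """Format one array in run-in style, with subheadings sorted alphabetically."""
    general = subs.get("", [])
    specific = sorted((sub, locs) for sub, locs in subs.items() if sub)
    if general and specific:
        raise ValueError("mixed arrays are outside the scope of this sketch")
    if not specific:
        return f"{heading}, {', '.join(general)}"
    body = "; ".join(f"{sub}, {', '.join(locs)}" for sub, locs in specific)
    return f"{heading}: {body}"


entries = [
    Entry("Foxconn", "45"),
    Entry("semiconductors", "23-25", "history of"),
    Entry("Foxconn", "49"),
    Entry("semiconductors", "67", "geopolitics of"),
]
for heading, subs in build_arrays(entries).items():
    print(render_array(heading, subs))
```

Each `Entry` is the smallest unit, and an array is simply everything that shares a main heading.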

Why are entries and arrays so important? No one writes an index composed of a single entry.

But every index begins with an entry, and as the index is written, the entries and arrays accumulate. It is through knitting the entries and arrays together that an index emerges.

Step one to writing an index is to write clear, concise, and specific entries, so that “what this thing is” is clear to the reader. Step two is to combine entries into arrays which are clearly organized and easy to scan. Step three is to sort and organize the arrays—creating the structure of the index—so that the index as a whole is easy to navigate. 

Together, the entry and the array fit like interlocking pieces to create a coherent whole.

A Note about Terminology 

I’ve noticed that not every indexer, nor every book about indexing, distinguishes between entries and arrays. I’m guilty myself of using the terms interchangeably, though I try to be clear when I’m writing.

But while terminology varies, I do think that the distinction is important. Because an index is composed of hundreds or thousands of pieces of information, it helps to know what these pieces are and how they interact with each other. An index is also easier to edit and organize if these building blocks are clearly written and well thought out.

As you index, how are these building blocks fitting together? How can you be more mindful of each piece of information and how it interacts with the entries and arrays around it? Does it make a difference to think about indexing as building up from the smallest unit to the larger whole?


Making the Index Invisible

So the 18th edition of the Chicago Manual of Style dropped in September. I have to admit I have not bought a copy. While I think their recommendations are solid, I find I don’t use it very much, since I only index and do not edit. But I do know some editors who are very excited about the new edition, and there has been chatter among indexers as well on the changes to the chapter on indexing.

The main change with regard to indexing is 15.66, which states:

Chicago now prefers the word-by-word system of alphabetization over the letter-by-letter system (but will accept either in a well-prepared index).

 

I think this change makes sense.

I personally most notice the difference in sorting when indexing Asian studies books, where I tend to see a lot of surnames like Chen, Kim, and Liu. Being so short, these names often get mixed up with other headings when sorted letter-by-letter, whereas I think the index is easier to scan if all of the surnames are sorted together. I’ve also received instructions from a scholarly press to sort the index letter-by-letter except for the names, which the press wants force-sorted word-by-word. Which raises the question: why not sort the entire index word-by-word?

For example, here is a comparison of letter-by-letter and word-by-word sorting.

Letter-by-letter sorting
Liang Ji
Liang Qichao
Libailiu (Saturday)
Li Boyuan
Li Chen
Life Weekly
Lin Meijing
Li Shirui
List, Friedrich
Liu Denghan
Liu Jiang
Liu, Jianmei
Liu, Lydia
Liushou nüshi (Those Left Behind; film)
Liuxuesheng (overseas Chinese students)
Liu Yiqing
Li Yuanhong
 
Word-by-word sorting
Li Boyuan
Li Chen
Li Shirui
Li Yuanhong
Liang Ji
Liang Qichao
Libailiu (Saturday)
Life Weekly
Lin Meijing
List, Friedrich
Liu Denghan
Liu Jiang
Liu, Jianmei
Liu, Lydia
Liu Yiqing
Liushou nüshi (Those Left Behind; film)
Liuxuesheng (overseas Chinese students)
 

The word-by-word sorting, for me, is a lot easier to scan and parse when like surnames are grouped together, and when names are sorted together above other terms. It makes me confident that I am seeing all of the names present, rather than being concerned that I am missing a name that is buried below.
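To make the difference concrete, the two systems can be expressed as sort keys. What follows is a simplified Python sketch, not the algorithm used by Cindex or any other indexing software. The function names are made up, parenthetical glosses and commas are simply ignored (which matches the lists above but is not a full implementation of the CMOS rules), and accents are folded so that "nüshi" sorts as "nushi".

```python
import re
import unicodedata


def _normalize(heading: str) -> str:
    """Lowercase a heading, drop parenthetical glosses, and fold accents."""
    without_gloss = re.sub(r"\(.*?\)", "", heading)
    decomposed = unicodedata.normalize("NFKD", without_gloss)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unaccented.lower()


def letter_by_letter_key(heading: str) -> str:
    # Spaces and punctuation are ignored, so the heading sorts as one long string.
    return re.sub(r"[^a-z0-9]", "", _normalize(heading))


def word_by_word_key(heading: str) -> tuple:
    # Words are compared one at a time, so a short word such as "Li" sorts
    # ahead of every longer word that merely begins with "li".
    return tuple(re.findall(r"[a-z0-9]+", _normalize(heading)))


headings = [
    "Liang Ji", "Li Boyuan", "Life Weekly",
    "Liu, Lydia", "Liu Jiang", "Libailiu (Saturday)",
]
print(sorted(headings, key=letter_by_letter_key))
print(sorted(headings, key=word_by_word_key))
```

The only real difference between the two keys is whether the space between words counts. Letter-by-letter throws the spaces away, while word-by-word treats each word as its own unit of comparison.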

Also note that the Liu names are sorted according to the clarified 15.85, which states:

When the same family name is inverted for one person but not for another (e.g., “Li Jinghan” and “Li, Lillian”), the names may be listed together and alphabetized by first names regardless of the comma.

 

This also makes a lot of sense and has been my practice for a long time. By ignoring the comma, the second portion of the name is treated equally for all names, whereas if the comma is taken into account, all the names with commas sort to the top and may cause some names to appear out of order. For example,

Liu, Jianmei, 48
Liu, Lydia, 91
Liu Denghan, 148n6
Liu Jiang, 105
Liu Yiqing, 27, 144n13
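The effect of the comma can also be shown as a pair of sort keys. This is a rough Python sketch under stated assumptions: `comma_first_key` is my own model of a sort that treats the text before a comma as a complete element and so places inverted names first, and `comma_ignored_key` models the 15.85 approach. Neither function comes from CMOS or from any indexing software.

```python
def comma_first_key(name: str) -> tuple:
    """Model a sort in which inverted names (with a comma) come before the rest."""
    surname, comma, rest = name.partition(",")
    if comma:
        return (surname.strip().lower(), 0, rest.strip().lower())
    words = name.lower().split()
    return (words[0], 1, " ".join(words[1:]))


def comma_ignored_key(name: str) -> tuple:
    """Treat the comma as if it were a space, so all given names sort together."""
    return tuple(name.replace(",", " ").lower().split())


names = ["Liu Denghan", "Liu, Lydia", "Liu Yiqing", "Liu, Jianmei", "Liu Jiang"]
print(sorted(names, key=comma_first_key))
print(sorted(names, key=comma_ignored_key))
```

With the comma ignored, the given names fall into a single alphabetical run, which is the order shown in the word-by-word list earlier in this post.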
 

For another interesting comparison, as a colleague pointed out, try looking for the sorting differences between the indexes of the 17th and 18th editions of CMOS. And if you’d like to see a full list of the changes to the indexing chapter in CMOS 18, see here.

So will I now unilaterally switch to word-by-word sorting for all of my clients who request that the index follow CMOS? I don’t think so, unless I think that the index will really benefit. I think it is better if I first ask my clients if they want to change, so we are both on the same page and I am not springing a surprise on them. And, to be honest, for most indexes I don’t think that the difference between word-by-word and letter-by-letter sorting will be that noticeable.

This brings me to my larger point, which is that the mechanics of a well-written index should be invisible to the user. I doubt that any reader will browse the index and think, “I wonder what the alphabetical sort is?” That is not the reader’s concern. What the reader cares about is quickly finding information.

To facilitate finding information, every aspect of the index needs to work together. This includes the sorting, the structure, term selection, phrasing, and cross-references. When it works, the reader shouldn’t notice how the index works because the reader is too busy digging into the book. When the index does not work, that is the point when the reader is pulled out of the index and is frustrated at their inability to access the information they want. The reader may not be able to articulate why the index is not working, but something about the contents and mechanics of the index is wrong.

Bringing this back to sorting, for many indexes the difference will be negligible between letter-by-letter and word-by-word sorting. As CMOS states, they will accept either in a well-prepared index. For other books, like for me with Asian studies titles, the difference will be more pronounced.

When indexing, pay attention to when the difference matters. Make decisions based on what will make the user experience the most seamless. Pay attention to how the different elements of the index fit together. Striving to make the index invisible may be an odd way to think about indexing, but to be invisible means that the index works, which is what we ultimately want for our readers.


Paying Attention to Terminology

I am writing today about some decisions that I needed to make on a recent index. In the grand scheme of the index, these decisions only affected a few entries. I am tempted to brush these off as not very important and not worth discussing. Yet much of indexing is about paying attention to the details without getting lost in the details. And I think this is a unique situation that illustrates an important point about term selection. At least, it made me sit up and think carefully as I was working.

A good index encapsulates two different goals, which can sometimes seem like they are in opposition to each other. The index needs to be both a reflection of what the author has written and an attempt to communicate clearly with the reader. Lose one of these aspects, and the index ceases to function.

Term selection is key to achieving both of these goals. The terms used in the index need to match both the text and how the reader is likely to search. Ideally the author and the reader are in alignment, but sometimes the author uses different language than what the reader might expect. In those situations, the index may need to bridge the gap.

I recently ran into this issue when writing the index for Saint Paul the Pharisee: Jewish Apostle to All Nations, by Father Stephen De Young (Ancient Faith Publishing, 2024).

If you are familiar with Christianity, the title may be a hint that the author is taking a different tack with terminology. While Paul was a Pharisee prior to his conversion, he is now more commonly known as the Apostle Paul, or Paul the Apostle. Yet here Fr. Stephen is emphasizing Paul’s Jewishness.

In the book’s Introduction, Fr. Stephen addresses this question of terminology:

Throughout this book, I have deliberately eschewed certain language. This language is certainly acceptable and has become the usual language of the Church. However, familiar terminology can sometimes be misleading. By using the word Messiah instead of Christ, community instead of church, or Torah instead of law, I hope to unsettle commonly held notions and help the reader reassess Paul in his historical context, rather than project the experience of present-day Christians into the past.

 

This shift in terminology also extends to names, which is where I noticed the biggest difference with regard to the index.

In addition to “Paul the Pharisee,” Fr. Stephen also frequently refers to Paul by his former name, Saul of Tarsus. Jesus is referred to as “Jesus of Nazareth,” rather than Jesus Christ. A figure such as the Apostle John, also known as John the Evangelist, John the Theologian, or John the Divine, is here referred to as John, the son of Zebedee. None of these names are incorrect, but they are names that are less commonly used. They support the author’s focus on Paul and the early Church’s Jewish context and alert readers that the author is taking a different approach.

From an indexing standpoint, do I follow Fr. Stephen’s lead? By using these names, I would provide continuity with the text and reinforce the point that Fr. Stephen is trying to make. But will readers still recognize these names in the index, outside of the context of the text? I am not helping anyone if I include names and terms that readers are unlikely to recognize.

In the end, I decided to lean into the author’s terminology. Christians form the primary audience for this book and, I assume, are familiar enough with these Biblical figures, even if these are not the names typically used.

Paul I simply indexed as “Paul.” As the subject of the book, I decided a gloss was unnecessary. I also included a See cross-reference from Saul of Tarsus, for any readers looking under Saul and to keep all discussions of Saul/Paul in a single array.

I indexed Jesus as “Jesus of Nazareth,” with a subheading for “as Messiah,” to reflect how the author discusses Jesus. I indexed the other Biblical figures as is (“Peter,” “Silas,” “Timothy,” etc.) except for when a gloss or tag was needed to disambiguate (for example, “James, brother of the Lord” and “James, son of Zebedee”). This is again following the author’s approach and trusting that readers will recognize these names.

I did, however, include glosses for several of the provinces and cities discussed, such as “Achaia (province)” and “Perge (city),” especially the less well-known places (I didn’t include glosses for cities like Athens and Rome). This may not have been necessary, but I personally like knowing where things are and what things are, so as a reader I would have appreciated the differentiation.

As I wrote at the beginning, these names form a small proportion of the overall index. Was it really worth spending time considering how best to balance the author’s approach versus reader expectations? There are plenty of other discussions in the book, such as discussions about Paul’s missionary journeys, the history of the early Church, and theological issues that Paul addresses in his epistles, that I also wanted to get right.

And yet names matter and terminology matters. The index would have presented a different message if I had used more conventional names for these figures and the index would have appeared disjointed from the text. Writing a good index is often about paying attention to the details so that the entire index works together as a whole and in conjunction with both the text and readers. The trick is to see both the details and the whole. It can be easy to lose sight of the big picture.

For this book, while the author opted to shift the terminology to make a point, I decided that most readers would still be able to follow along in the index. I didn’t need to include much in the way of signposts and clarifications. But for other books, extensive use of cross-references and glosses may be necessary. While reflecting the text and the author’s intentions, the index also needs to be responsive to readers. Thankfully, we have tools to bridge that gap.

The first step, though, is paying attention to the language used by the author. The next step is considering the audience. Do the two match? From here you can select terms and write an index that is clear and recognizable to all.


My Index Editing Process

Last time I wrote about reading like an indexer and what it is I do and look for when reading a text and writing the rough draft of an index. Today I’d like to reflect on my editing process.

A few months ago I started tracking my time when I index. I had previously done so, but not effectively and I eventually gave up. This time, I’ve created a new system and a new spreadsheet that is much easier to use, and I am a lot happier with the results.

One of my insights so far is that I spend about an equal amount of time drafting and editing. I have to admit that this surprised me. I knew that editing took up a fair amount of time, but I didn’t realize that the time spent is often about 50/50. For some indexes, I actually spend a little more time editing, making the time split closer to 45/55 or even 40/60.

Reflecting further on my process, I tend to spread drafting the index over 3-6 days, depending on the length of the book, whereas I tend to edit within 2-3 days. When drafting, I am learning what the book is about. When editing, I am fully immersed in the index and I treat it more like a sprint. It probably also helps that by the time I get to editing, the deadline is looming.

I’m realizing that I also tend to draft quickly. I do try to write a fairly clean draft, taking into account context, clarity, and relevance, as I previously discussed. I believe in trying to set myself up for an easier edit. But I also know that this is not my final draft and that some things won’t become clear until I’ve read the whole book, and so I also try to keep moving.

Editing an index, for me, is both seeing the index as a whole and going through the index line by line. I like to give myself space between drafting and editing, which usually means sleeping on the draft and beginning to edit the next day. This helps to give me some distance so I can more clearly see the whole index with fresh eyes.

I usually begin by skimming the index, making note of the larger arrays for the metatopic and supermain discussions. This reminds me of the structure I am aiming for, and is a chance to consider if I want to make any major changes. I then start at the top of the index and work my way down, line by line. I know some indexers edit using multiple passes, each pass looking at a different element. I think I would go utterly cross-eyed and unable to make sense of the index if I tried multiple passes. Instead, my goal is to fully edit the array in front of me before I move on to the next. This may mean jumping around the index to also edit related arrays, and sometimes I will go back to re-edit an array if I change my approach, but generally speaking, I systematically move through the index.

With each array, I am first of all looking for clarity. Do the main heading and any subheadings make sense? If there are subheadings, I look to see if any can be combined or reworded, or if subheadings need to be added for unruly locators. I consider if anything needs to be double posted, and check to make sure that is done properly. I consider and check cross-references. I investigate any notes I may have left for myself. I also spot-check a few locators to make sure I understood the text properly. I may also run a quick search of the PDF to see if I missed any references. I don’t check every locator, which I think would be very time-consuming (to a certain extent, I need to trust that my drafting process was thorough and accurate), but these spot checks do provide peace of mind and I do sometimes find errors.

Reviewing arrays with no subheadings is usually quick, unless I’ve left a note for myself or I decide to spot check. Arrays with subheadings take more time. If an array has 20+ subheadings, I may spend twenty minutes or more making sure that the array is in order. I often find the larger the book, the larger the index, the more subheadings there will be, and the longer editing will take.

Considering my process, I do wonder if I can shave off time. I could spot check a little less, especially for simple arrays with no subheadings, trusting that I picked up what was necessary. I can also pay more attention, when drafting, to larger arrays, so that editing them isn’t so onerous. I could also explore using more macros and patterns for batching tasks such as double-posting or removing subheadings. What I like about my process, though, is that it is thorough and I can clearly see what is completed and what is still to come. Editing line by line helps to keep my thoughts in order.

Other Approaches to Editing

My approach to editing is not the only approach, of course. I’ve mentioned making multiple passes. I also know of indexers who do a quick edit at the end of each day, while drafting, so that the draft is cleaner. I’ve also heard from indexers who say that they do such a thorough job drafting that the editing process only takes them a couple of hours. I don’t know how that works for them. I seem to need a lengthier editing process for the index to gel and come together. And that’s okay. We are all different. What matters is that you find a process that works for you.

I find it interesting to hear how others index, even if it is not something I would do myself. I hope this glimpse into my process gives you something to think about.


Reading Like an Indexer

So you are sitting down to write an index. You scroll to the first page in the PDF, or, if you’ve printed out the proofs, you place the first page on the desk in front of you, and then…what? What is your thought process? How do you decide what entries to extract? How do you read?

Reading to index is different than reading to edit, reading to learn, or reading for pleasure. I think of reading to index as a process of disassembly. I try to identify how the author has written and structured the text, and I then pull apart all of those pieces, big or small, and reassemble them into the form of an index. This is very much an active reading, in which I am identifying, analyzing, and making decisions. 

I generally look for two types of information when I draft an index.

  • Specific details. These are names, places, companies, concepts, etc… that are explicitly mentioned and discussed. These are usually fairly obvious. If there are a lot of names or other such details, I may index a few pages, pick up these details, and then go back and re-read to make sure I also understand the larger discussion.
  • Broader topics. These range from the metatopic—what the whole book is about—to supermain and regular discussions—both themes spanning the book and what specific chapters or sections are about. It is important to have index entries which correspond to these broader discussions, and so in addition to picking up specific details, I try to also understand the big picture. These broader topics are also tied to the structure of the index, as I consider how best to reflect the book’s structure in the index, and as I anticipate that these large discussions will become large arrays, anchoring the index. Depending on the book, as mentioned, I may need to read a section two or more times to properly mine all relevant entries. 

Once I have identified the large and small pieces that the book is made of, I need to decide how to translate that into the index. Here are a few tips I find helpful to keep in mind.

  • Understand what you are reading. This may seem obvious, but I think it is worth stating. The temptation, at least for me, is to guess if I am unsure and to create an entry anyway. And sometimes guessing is the best I can do in that moment. I flag the entry for revisiting later and I move on. What can be more effective, though, is to read ahead a few pages until I do understand, and then go back and create the entry. It’s okay to be patient. Taking the time to understand can pay off later with better understanding of what comes next in the text and with less editing due to a stronger draft. 
  • Place the information in context. Are you looking at a specific detail or a broader topic? How does the detail or topic relate to other details or topics? Can this be turned into a subheading? Should it be double-posted? Is a cross-reference necessary? What other entries does this suggest? While subheadings, cross-references, and double-posts can all be revisited later, when editing the index, I like to start thinking about them while writing the rough draft. The information in the book is an interconnected web, which the index should reflect. So as part of your thought process, get in the habit of looking for these connections. 
  • Filter for relevance. In addition to understanding the larger context, also pay attention to relevance. Think about the audience before you begin writing the index. Consider how much space is available for the index. What should the index focus on? Sometimes I am not sure if an entry is relevant and so I pick it up anyway, labeling it for possible deletion later. But the more I can filter out now, the less I need to cut later. 
  • Communicate with clarity. This is especially true for subheadings. Make sure that readers understand what this entry means. Be concrete and, where relevant, link back to the larger context. You don’t want to leave readers guessing, nor do you want to leave yourself guessing when you come around again to edit.

All combined, this is a lot to do while reading and indexing. It can be difficult to identify both specific details and larger discussions, while also weighing relevance, and paying attention to the context, and thinking about related entries, and thinking about how best to phrase for clarity. Reading to index is a skill that takes practice.

Remember too that the rough draft does not need to be perfect. My drafts are certainly not perfect, and while I am thinking about all of this while drafting, I spend about an equal amount of time editing. 

How you read is up to you. I tend to start reading and I type entries into Cindex, the indexing software that I use, as the entries come to mind. Other indexers prefer to first mark up the proofs, identifying what is indexable and making notes for themselves, before they go back and type up the entries. There is no right or wrong approach, so long as you are paying attention to all aspects of the text, both big and small.

If you are newer to indexing, you may find marking up the proofs to be a good way to visualize or make concrete this thought process. I marked up proofs the first 3-4 years that I indexed, which in hindsight was necessary for me to engrain this way of reading. Once indexing started to become habit, I stopped marking up, though I still read ahead sometimes to better understand what the text is about. 

Writing an index is a unique way to interact with the text. It does require a shift in how you read and see the text. Once you make that shift, indexing becomes easier. 


Is AI Indexing Nearly Here?

No surprise, publishing continues to react to and interact with artificial intelligence. A couple of colleagues recently raised AI on a couple of indexing email lists. I get the sense that many indexers are concerned about the potential for AI to replace us, or at least that publishers will believe that AI can replace a human-written, thoughtfully constructed index. I have to admit I also feel uncertain about what the future holds. I wrote about AI and indexing last year, and I think it is worth considering again. 

Is Indexing by AI Nearly Here?

One colleague flagged this article from The Scholarly Kitchen, “AI-Enabled Transformation of Information Objects Into Learning Objects,” by Lisa Janicke Hinchliffe. Hinchliffe reviews three new AI tools which purport to help readers access and understand academic writings. Of particular interest to indexers is what Hinchliffe writes about Papers AI Assistant: 

When exploring the functionality as a beta tester, I was curious how the results compared to my pre-AI tool practice of making heavy use of CTRL-F to locate keywords in lengthy texts. I found that, not only did the Papers AI save me a great deal of time by providing me with an overview annotated with links to specific sections of the text, it also often alerted me to places in the text where my topic of interest was conceptually discussed without the use of the specific keywords I would have searched.

Did you catch that last bit? Papers AI Assistant can apparently identify discussions of interest without the use of a keyword search. That is what a good index is supposed to do. Is this the beginning of an AI that can replace indexers? Hinchliffe also writes, “I am excited by the possibilities these AI tools offer for moving the focus from access to information to comprehension of it.”

A few thoughts: I have to admit that I am skeptical of the claim or hope that Papers AI and similar tools will help readers comprehend information. My sense is that AI works best as a tool, with the user clearly understanding its strengths and limitations, and with the user making the final decision on the quality of results and how best to use the results. That is similar to how I use the search function when indexing. Search is useful for double-checking facts and mentions, but I know that it doesn’t catch everything and isn’t good at providing context; I still need to read and understand the book. My fear is that many users will uncritically accept whatever the AI tool tells them, turning a program like Papers AI into glorified CliffsNotes and enabling an even shallower engagement with the text. 

I think it is also worth pointing out that what is described here is not an index. An index is a static document that is browsable. That is very different from an AI highlighting a handful of potentially relevant passages. Browsability is key to an index because it allows the user to serendipitously find information they didn’t know they wanted to find. Being handed a few options leaves the rest of the text opaque and inaccessible. I imagine a user can keep asking the AI new questions, but that puts the onus on the user to know what they are searching for and how to ask relevant questions. 

Of course, if an AI can identify concepts and discussions in the absence of clear keywords, then a logical next step could be to ask that AI to generate an index. I can see value in the ability to create an index on the fly, for any document. I don’t know how much I would trust such an index, though. Hallucinations are one issue. Another is that AI, essentially, is built upon algorithms. Answers are always going to follow a certain pattern. While indexing is built upon rules and conventions, the indexer also plays a key decision-making role as they shape the contents, phrasing, and structure. These judgement calls extend beyond the formal rules of indexing to take into account elements such as the audience and usability. I am skeptical that an AI would be able to understand and produce these nuances.

Another issue is that these AI tools are entirely digital. They will not work on a print book, though, of course, an AI-generated index could be published in print. Is the future of publishing and of engagement with texts entirely digital? Perhaps in academia and other specialized fields, in which there is so much information to access and consume. Print sales remain strong, however, and I am hopeful that there will continue to be a place for print indexes. Perhaps the future—finally arrived?—is what embedded indexing has long promised, which is one index capable of being used in multiple formats. 

Besides AI replacing indexers, I think it is also worth considering how we as indexers can use AI in our own work. I am aware of one colleague who uses ChatGPT to summarize complicated books and to answer queries about the text, which helps that indexer comprehend the book more quickly. That sounds very similar to what Papers AI claims to do, and I think it is a legitimate use of AI. So long as the indexer is in control (using the AI as a tool, understanding both indexing best practices and the contents of the book, and actively shaping the index), then why not use AI? I’m also open to having AI index elements which are time-consuming to pick up, such as scientific names, so long as the indexer is providing quality control. What I don’t want to see are indexers, or anyone else, passively accepting an AI-generated index, assuming that it is accurate and functional when it is actually not. That is my worst nightmare about AI: that we abdicate our critical thinking and decision-making skills, potentially leading to errors and disasters because we have lost the ability to assess what AI is telling us.

Author Pushback

In contrast to the gold rush to embed AI into publishing, another colleague pointed out that some books are beginning to be published with prohibitions against AI and machine learning listed on the copyright page. I also recently noticed this in a book I am indexing.

I’ve also heard from a trade client that their authors are starting to insist that book contracts include a clause that their books will not be uploaded or otherwise used to train AI. By extension, this means that all freelancers hired by this press, including myself, are not allowed to use AI tools while working on their manuscripts and proofs (which isn’t a problem for me, since I wasn’t doing so anyway). 

Will authors and publishers win against AI? Will publishers find ways to enforce their contracts and prohibitions? Will publishers change their minds, or will AI developers sufficiently address the fears that authors have? I suspect this may be an area where the publishing industry goes in two different directions: some segments, such as academic publishing, which prize easy access to information (provided you can get behind the paywall), will embrace AI, while other segments, which care more about the author and which sell directly to readers, will reject AI. 

Or, maybe AI in publishing is a bubble and these new applications will fail to live up to their hype. 

I still think that someone will try to develop an AI capable of writing an index. Some publishers will probably adopt it for the sake of saving time and money, even if the resulting indexes are useless. I am also hopeful that the value of the human touch will remain. Even if AI is incorporated into our work, I think there is still a place for human guidance and discernment. Machines may be capable of generating an approximation, but only humans can create what is truly useful for other humans.


When Subheadings Are Not So Useful

I love subheadings. They add so much to an index, breaking down long strings of locators into smaller chunks, highlighting meaningful distinctions, and gathering related entries into lists so readers only need to search in one place. As I discussed in my last reflection, subheadings can also reflect the story that the text is telling. Well-written subheadings are clear, specific, and meaningful.

But…in indexing there is always a but. Occasionally, a project comes along that proves the exception. 

This happened with a recent index I wrote, for To See What He Saw: J.E.H. MacDonald and the O’Hara Years, 1924-1932, by Stanley Munn and Patricia Cucman (Figure 1 Publishing, 2024). J.E.H. MacDonald was a Canadian painter and a member of the Group of Seven. He fell in love with the landscape around Lake O’Hara, in the Rocky Mountains, and spent several summers there painting. This book takes an interesting approach to MacDonald. Over the course of almost twenty years, the authors sought to identify the exact locations where MacDonald painted. The bulk of the book is composed of a brief discussion of each of the O’Hara paintings, alongside a photograph of what the scene looks like today. The rest of the book is composed of an introduction, an overview of each of MacDonald’s eight trips, and excerpts from MacDonald’s diaries and other writings. The result is a beautifully illustrated coffee-table book. 

The instructions from the press were to only index the paintings, people, and places. While narrow in scope, there isn’t too much else discussed, and these are what readers are most likely to want to find, so I thought the instructions reasonable. Figure 1 Publishing is also very good at providing clear specifications for how long the index can be. For this book, the specs were 55-60 characters per line, for 675 lines total. 

I quickly realized that the book mentions a lot of paintings and places. The book discusses 226 paintings, almost all of them by MacDonald. With each painting taking up at least a line, some of them more, the paintings alone fill up about a third of the index. The rest of the index is mostly places—mountains, lakes, creeks, trails, huts—in and around Lake O’Hara that MacDonald either painted or visited. In comparison, only a few people are mentioned.
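The arithmetic behind that estimate can be sketched in a few lines of Python. The figures come from the specs and counts above; the one-line-per-painting minimum is a simplifying assumption (some paintings take more than a line), and the function and variable names are only illustrative.

```python
def painting_share(paintings: int, total_lines: int, lines_per_painting: int = 1) -> float:
    """Return the fraction of the index taken up by painting entries."""
    return paintings * lines_per_painting / total_lines


TOTAL_LINES = 675   # line limit from the press's specs
PAINTINGS = 226     # paintings discussed in the book

share = painting_share(PAINTINGS, TOTAL_LINES)
remaining = TOTAL_LINES - PAINTINGS  # lines left for places and people, at most

print(f"Paintings need at least {PAINTINGS} lines, or {share:.0%} of the index.")
print(f"That leaves at most {remaining} lines for everything else.")
```

Running the numbers early like this makes it easier to see how little room is left for subheadings before a single place or person has been indexed.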

I also realized that the book contains a lot of repetition. For example, the same mountain may appear in a couple dozen different paintings. That mountain is mentioned again in the overviews of MacDonald’s trips, and then again in MacDonald’s diaries. This kind of repetition makes sense given how the book describes the same events and paintings from different angles, but it does mean that the mentions add up. Arrays with especially long strings of locators include Cathedral Mountain (49 page references), Hungabee Mountain (39 references), Odaray Bench (34 references), Lake McArthur (32 references), and Lake Oesa (27 references).

Normally, I would add subheadings to these arrays. Asking readers to look up each page reference is a big ask. But for this index, I left those strings, for paintings and places, intact. 

Not using subheadings was a conscious decision, and one I didn’t make lightly. My initial instinct was to find subheadings. But as I indexed and considered the entries, I also realized that subheadings would not be so useful in this particular index. Wanting a second opinion and to avoid surprising the press with a departure from my usual approach, I also queried the editor I was working with and got their approval.

I decided to not use subheadings for two reasons. One, I realized that too many subheadings would quickly make the index too long. Unfortunately, space constraints can sometimes mean putting aside the index that you want to write for the index that fits. In these situations, I need to be strategic about picking and choosing the subheadings that will have the biggest impact, while also being okay with other arrays not having subheadings. 

More importantly, though, for this book, I couldn’t think of subheadings that I was satisfied with. For subheadings to be effective, they need to clearly articulate additional information that readers can use to narrow their search. But what if there are no clear distinctions between locators? In that case, I think the long strings of locators should be left alone. It is not helpful to introduce artificial distinctions or to get so granular that context is lost. 

As I mentioned, this book contains a lot of repetition. Places either appear in MacDonald’s paintings, are places that MacDonald visited, or both. This doesn’t provide much on which to hang a wide range of subheadings. 

I briefly considered listing all of the paintings that each mountain or other feature appears in, along with a subheading for MacDonald’s presence there. For example, 

Cathedral Mountain: MacDonald at; in painting 1; in painting 2; in painting 3; in painting 4; in painting 5; etc…

But this approach presents a few problems. Some arrays would have been enormous, with a dozen or two subheadings for each of the paintings. Besides the space issue, I’m not convinced that listing each painting would have been meaningful to readers. Would readers remember the titles of individual paintings? In many cases, multiple paintings shared the same title. Thankfully, the authors give each painting a unique alphanumeric code, which I included in the index to differentiate. For example, “Lake O’Hara (25-1.3(S))” and “Lake O’Hara (30-3.1).” But I imagine it would still be difficult to remember which is which. Alternatively, I could have created a subheading for “in paintings,” but that would have still resulted in a long string of locators, as would the subheading “MacDonald at.” “MacDonald at” also isn’t very useful since readers can presumably assume that MacDonald was there, as that is the focus of the book. 

Given the space constraint and that either way (with a couple of generic subheadings or without subheadings) the arrays would have long strings of locators, I decided it was best to keep the arrays simple and to forgo subheadings. This does mean that readers will need to search through each locator, though readers should also quickly notice the repetition, and it is all there for the dedicated searcher. 

This isn’t to say that I avoided subheadings entirely. I did use them in a few places, mostly for people, though even with people I found it difficult to avoid longer strings of locators. Many of these references are brief mentions and again reflect the repetition throughout the book. For example, here are two arrays for MacDonald’s friend, George Link, and wife, Joan.

Link, George K.K.
     about, 234, 343n83
     Lake O’Hara Trails Club and, 340n25
     MacDonald and, 82, 91, 104, 107, 113, 114, 143, 215, 224, 233, 234, 239, 243, 249, 252, 253, 254, 256, 258, 264, 294, 301, 307, 308, 310
     photographs, 246, 260

MacDonald, Joan
     encouragement from to travel west, 13, 202, 205
     letters to, 93, 96, 115, 120, 121, 122, 131, 167, 175, 200, 203, 204, 205–6, 211–12, 229, 230–31, 236, 240, 256, 265
     Links and, 341n47
     MacDonald’s departure west and, 250
     mentions in MacDonald’s diary, 304, 308
     O’Hara trip with MacDonald, 36, 123, 191, 217, 221, 222, 223–24, 259
     photo album, 224
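To see how unevenly the locators fall across these subheadings, a quick count can help. Below is a minimal sketch in Python using the George Link array above. The helper `count_locators` and the threshold of seven are my own assumptions for illustration, not a rule from any indexing standard or a feature of any indexing software.

```python
def count_locators(locators: str) -> int:
    """Count comma-separated locators; a range such as 205–6 counts once."""
    return len([part for part in locators.split(",") if part.strip()])


# Subheadings and locator strings copied from the Link array above.
link_array = {
    "about": "234, 343n83",
    "Lake O’Hara Trails Club and": "340n25",
    "MacDonald and": (
        "82, 91, 104, 107, 113, 114, 143, 215, 224, 233, 234, 239, 243, "
        "249, 252, 253, 254, 256, 258, 264, 294, 301, 307, 308, 310"
    ),
    "photographs": "246, 260",
}

THRESHOLD = 7  # hypothetical cut-off for "a long string of locators"

long_strings = {
    subheading: count_locators(locators)
    for subheading, locators in link_array.items()
    if count_locators(locators) > THRESHOLD
}

for subheading, count in long_strings.items():
    print(f"{subheading}: {count} locators")
```

A count like this only flags where the long strings are. Whether a meaningful distinction exists to break them up is still a judgement call.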

While I highly encourage you to include subheadings and to make sure that subheadings are clear, specific, and meaningful, I think it is also worthwhile considering the exceptions to the rule. I hope that my approach to the index for To See What He Saw, about J.E.H. MacDonald’s paintings in and around Lake O’Hara, is helpful for considering when subheadings may not be useful. If there is a lot of repetition in the text, if it is difficult to find meaningful distinctions, and if there is a hard space constraint, then it is okay to have long strings of undifferentiated locators. It is not ideal, but it may still be the best solution for that particular text and index.


Indexing Local History: Stories I’ve Been Told Series

Local history can be both a joy and a challenge to index. It can be deeply personal, both for those telling and writing the stories, and for those reading, as it reaffirms our bonds with each other and with the places we belong. The index is often the first point of contact with the text, as readers search for the people, places, and memories that they hold dear, or even search for themselves.

Over the last few years, I’ve had the honour and pleasure of working with Elaine Thomas, an author and storyteller. Elaine is an Albertan transplanted to Fayette County, Texas. She regularly writes for a local newspaper, The Fayette County Record, including, for several years, a column profiling a wide array of locals, especially senior citizens who reminisce about their lives. Elaine is now collecting and self-publishing these columns in the Stories I’ve Been Told series. I have indexed all three volumes published so far, as well as three other books of local and family history that Elaine has written. Today, I want to discuss how I approach indexing the Stories I’ve Been Told series, drawing examples from the third volume that was launched last November.

I love Elaine’s work for a couple of reasons.

Elaine has a knack for finding and telling incredible stories of everyday life. These are ordinary, everyday people, and yet dig beneath the surface, as Elaine does, and extraordinary accomplishments and joy shine forth. It is an excellent reminder of the wealth of knowledge and experience that surrounds all of us, if we only pay attention and listen. Elaine’s books epitomize the value and importance of preserving our local history. 

I also love reading these profiles of people in Fayette County, Texas, because it is a completely different world than what I am familiar with. Add in the fact that many of the people profiled are reminiscing about life during the Depression, World War II, and postwar, and it is a whole other world yet again. Rural farm life among Czech and German immigrants is about as far away as you can get from the concrete Taiwanese city of a million people that I grew up in, except, maybe, for the shared summer heat. I get to explore a different perspective and way of life as I index these books, a way of life that is slowly becoming more familiar with each book.

From an indexing standpoint, there are a couple of challenges which I find are common to indexing local history. The first is, what is indexable?

Details

Local history books often contain a lot of detail. Because there is so much that could potentially be picked up, it is a good idea to decide ahead of time, as much as possible, what is indexable and what can be left aside. I find my approach often evolves as I work and better understand the text. The plan does not need to be rigid. But starting with a plan does help to avoid being overwhelmed by the sheer number of potential entries, and to avoid adding, and then later deleting, irrelevant entries.

For the Stories I’ve Been Told series, I make a distinction between Fayette County and the rest of the country and world. The people profiled are all from and live within Fayette County. I assume that readers of the book also have a connection to the area. So information about Fayette County forms the bulk of what I pick up. I index somewhat less detail about the rest of Texas, with most such entries being about neighbouring counties and cities that readers are likely to be familiar with, and where the people profiled may have studied and worked. I index the least amount of detail about the rest of the country and world. If someone spent part of their career in Virginia, for example, I will likely include that as a subheading, but I probably will not create main headings for places and businesses within Virginia, as I don’t think that readers will be searching for Virginia-related details.

I index all of the local people. This can lead to long lists of family members, if a person profiled mentions all of their grandparents, parents, siblings, spouse, children, and other relatives. But since this is intended for a local readership who may be searching for their families and friends, I think it is important to pick up all of the names, even minor mentions. This can also mean double-checking surnames with Elaine, to make sure I am properly identifying people.

I also pick up places. This includes all of the cities and towns within Fayette County. I also pick up churches, schools, local businesses, significant geographic features, and any other place that seems important. These are often minor mentions, but again, this is a book for local readers. There are memories attached to these places, and local histories can be an aid for people to access their own memories.

I also pick up details for various activities. These can be memories about Christmas or attending dances, childhood memories of working on the farm, or about people’s careers, such as delivering mail or running a flooring business. I also create arrays for local events, like the Fayette County Fair.

Several of the people profiled are veterans who reminisce about their wartime experiences. I’ve learned that honouring vets is important, in a way that seems more strongly emphasized than in Canada. I include several arrays for the different branches of the military; the military bases where these veterans served, especially those nearby within Texas; and the wars, mostly WWII, along with a few mentions of the Korean and Vietnam wars. 

Basically, if someone or something happened or existed within Fayette County, I index it. Elaine and I want local readers to be able to find their family and friends, and places and events, that are significant to them.

Structure

With so many details, structuring the index is also important, to ensure that the index is easy to search.

I build the index structure around the book’s structure. Stories I’ve Been Told, Vol. 3 contains 30 profiles, with each profile about 8 pages, give or take. This includes 1-3 pages of photographs. Because it is these people who form the core of the book, I use subheadings for each person, focusing on what that person chooses to discuss. I also include a range, at the top of the array, for the whole profile. Photographs are indicated in italics. 

Kea, Arleas Upton, 1–9
     career with FDIC, 1, 7
     childhood, 2–3
     education and desegregation, 3–4, 9
     family, 3, 7, 8, 9
     photographs, 7–9
     prayer and worship, 3
     reflections on life and success, 6
     at University of Texas, 5–6

I also do a lot of double-posting. All of the churches, for example, are both indexed as standalone entries and are gathered together in a single array. Gathering together does mean that the index will be longer, but I think it is helpful to provide a place for readers to scan if they can’t remember the name of a specific church or if they want to see which churches are mentioned. It is also generally a good practice to provide multiple access points, if there is space, to accommodate how different readers choose to search. If the community that the church is in is not obvious from the church’s name, I also include that detail in parentheses. (I also include the community in parentheses for main headings if the community is not obvious from the name of the church, school, or business.)

churches
     Bethlehem Lutheran Church (Round Top), 189
     Big Spring Hill Baptist Church, 79
     Elm Creek Baptist Church (Seguin), 179
     Holy Cross Lutheran Church (Warda), 143, 147, 149
     Prairie Valley Lutheran Church, 181
     prayer and worship, 3
     Queen of the Holy Rosary Catholic Church (Hostyn), 89, 92, 217
     Sacred Heart Catholic Church (La Grange), 36, 91
     St. James Missionary Baptist Church (Plum), 79
     St. James Missionary Baptist Church (Schulenburg), 3
     St. John the Baptist Catholic Church (Ammannsville), 83
     St. Mary Catholic Church (High Hill), 40
     St. Mary’s Catholic Church (Ellinger), 125
     St. Paul Lutheran Church (La Grange), 27, 220
     St. Paul Lutheran Church (Serbin), 97–98, 99
     St. Rose of Lima Catholic Church (Schulenburg), 54, 56–57, 101
     Sts. Peter and Paul Catholic Church (Plum), 32
     Swiss Alp Lutheran Church, 112
     Trinity Lutheran Church (Black Jack Springs), 12, 154
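For anyone who keeps index entries in a spreadsheet or script, double-posting can be pictured as writing the same locators into two records. The following is a minimal sketch in Python, assuming each record is a simple (main heading, subheading, locators) tuple. The `double_post` helper is hypothetical, the sort is a plain case-insensitive one rather than a proper word-by-word or letter-by-letter alphabetization, and dedicated indexing software handles all of this differently.

```python
def double_post(name: str, locators: str, group: str) -> list[tuple[str, str, str]]:
    """Return two records: a standalone main heading and a subheading under `group`."""
    return [
        (name, "", locators),      # standalone entry
        (group, name, locators),   # gathered under the group heading
    ]


# Three churches and their locators, taken from the array above.
churches = {
    "Big Spring Hill Baptist Church": "79",
    "Holy Cross Lutheran Church (Warda)": "143, 147, 149",
    "Swiss Alp Lutheran Church": "112",
}

records = []
for name, locators in churches.items():
    records.extend(double_post(name, locators, group="churches"))

# Simplified sort: case-insensitive, by main heading and then subheading.
records.sort(key=lambda r: (r[0].casefold(), r[1].casefold()))

for main, sub, locators in records:
    if sub:
        print(f"{main} / {sub}, {locators}")
    else:
        print(f"{main}, {locators}")
```

Each church costs two lines instead of one, which is the space trade-off mentioned above.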

I also double-post for significant events and memories that have enough entries to warrant subheadings, such as Christmas:

Christmas
     Alvin J. Anders’ memories, 101–2
     Christmas trees, 54, 57, 138, 143, 153
     Frances Pietsch Schumann’s memories, 143–44
     gifts for WWII soldiers, 51
     Gracie Loessin Taylor’s memories, 153
     Kahlich family traditions, 54–57
     mail delivery and, 134
     Santa Claus, 54–55
     St. Nicholas (St. Nicholas Day), 53–54, 58

For military arrays, such as the wars and branches of the military, I suspect that some readers would like to see who else served, and so I double-post names in these arrays as well, in addition to double-posting military bases. For example,

U.S. Air Force
     Bien Hoa Air Force Base (Vietnam), 225, 227–28
     Eugene J. “Gene” Wessels, 177, 178–79, 183
     Fort Francis E. Warren Air Force Base, 178
     Harry Richard “Dick” Peck, 223–28, 231, 232
     Lackland Air Force Base, 61
     Laredo Air Force Base, 223–24

For Stories I’ve Been Told, Vol. 3, thanks to a suggestion from Elaine, I’ve also included cross-references from the various towns to the people profiled who are from those towns, so that readers can more easily see the connections between people and places. I should have thought of this for the earlier volumes, and I’m glad Elaine noticed this possibility. For example,

Rutersville (TX), 69, 89–90. See also Dixon, Richard; Fietsam, Lydia Eberenz

Working with the Author

Especially if you are not familiar with the area or history, take advantage of the author’s knowledge. When I first started indexing this series, Elaine’s insights were invaluable as I made my plan for how to index, as was her feedback on the draft index. This can be a fruitful collaboration to serve the readers.

Indexing local history can often be more work than it initially appears. All of those details and entries can add up, and then you need to decide how best to organize them. Indexing local history can also be satisfying, as it helps readers remain engaged with their history and community, and reminds us that each of us lives an extraordinary life, if only we can see ourselves, and each other, from the right angle. 

If you would like to see the full index for Stories I’ve Been Told, Vol. 3, you can find it on Amazon, using the “Read Sample” feature. The indexes for the first two volumes are also available for viewing. If you would like to buy a copy, proceeds are being donated to assist struggling students at Blinn College, Schulenburg campus. Elaine also writes a lovely blog, Stories From the Slow Lane, where you can enjoy more stories about the past.