out of the land of the dex
Oct. 17th, 2006 08:55 am
I finally tidied up and submitted the index yesterday - the index that has been hanging over my head since Dr. Academic first asked me last year to compile it, and has been weighing far more heavily on my time for the last six weeks since the proofs arrived. The indexable part of the text is about 250 pages, but it's a mightily fact- and name-filled book. The index has about 750 main entries, and a total of about 5500 page references. I hope the publisher doesn't want to cut it. But at least for the moment I'm done, which means it's time for my indexing rant:
I took a course in indexing in library school. It was by far the most useless course I took there. Neither the professor, nor the textbook, had any real interest in teaching us how to index. The emphasis was all on the arrangement of the index after it's compiled: the differences between word-by-word and letter-by-letter alphabetization, and the exact rules for each; the advantages and disadvantages of run-in versus indented sub-entries; varying systems for inverting foreign names (a topic already covered in the cataloging rules); how far to indent the run-on lines; and so forth. All things an indexer needs to consider, to be sure, but covered in oddly minute detail in the weird absence of the main topic. It reminded me - and I said so at the time - of a cookbook that begins, "First, cook the meal. Now, we will discuss the finer points of setting the table." Or, more relevantly for some of you, a writing class that's all about spelling corrections, formatting the document, and whether to submit it on paper or electronically.
I've remained interested in indexing - it bears some similarities to cataloging - and I've indexed one commercially published book before, a collection of reviews. But just about everything I've read on indexing has had the same weird absence of advice on how to index, omitted in favor of discussion of how to arrange the index. Perhaps, at most, there will be advice on the color highlighter to use when marking up the proofs.
The publisher informed me that they follow the Chicago Manual of Style, so I read its chapter on indexing. Sure enough, it's mostly about arranging the index. But buried in all that are a few nuggets that help with the actual indexing. Chicago defines light and heavy indexing and tells you how many references per text page, on average, are deemed to constitute each. This was a useful guideline. It specifically warns against the most sure-fire sign of a badly compiled index: long strings of undifferentiated page references. I set myself a limit of seven at the maximum, and often split into sub-entries at much smaller numbers than that.
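A ceiling like that seven-reference limit is easy to check mechanically once the entries are in a spreadsheet or similar. Here is a minimal Python sketch, assuming the index is held as a dictionary mapping each heading to its list of page references. The function name, the dictionary layout, and the sample headings are all invented for illustration; none of it comes from the actual index.

```python
# Hypothetical helper: flag headings whose undifferentiated page
# references exceed a chosen ceiling (seven, per the limit above).
MAX_UNDIFFERENTIATED = 7

def overlong_entries(entries, limit=MAX_UNDIFFERENTIATED):
    """Return the headings that carry more than `limit` references.

    `entries` maps a heading to a list of locator strings. A page
    range such as "194-207" counts as a single reference.
    """
    return sorted(
        heading for heading, locators in entries.items()
        if len(locators) > limit
    )

# Made-up sample data, for illustration only.
sample_entries = {
    "long entry": ["3", "9", "14-16", "22", "31", "40", "57", "63"],
    "short entry": ["5", "18", "44"],
}

needs_split = overlong_entries(sample_entries)
print(needs_split)
```

A check like this only says where sub-entries are probably needed. Deciding what those sub-entries should be remains a matter of judgment.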
One might think that indexing is simply a matter of going through the book looking for names. Some indexers certainly do; I've seen indexes compiled that way. They are called "bad indexes." But good indexing requires thought. In the middle of discussing something else, Chicago informs the reader that "names or terms that occur in passing references and scene-setting elements that are not essential to the theme of a work need not be indexed," and kindly but parsimoniously gives one example. That's a start, though there's much more that could be said on the subject of proper names. But what about other topics? When a book's topic is large and complex but not clearly differentiated into parts, how do you decide which terms to use and which pages to list it for, when it permeates the whole book? Even a proper name: C.S. Lewis is named on literally 4/5ths of the pages in this book, so how does one construct sub-entries?
There's no advice or guidance on such questions. It helps to know the text well, and one of the reasons indexing is time-consuming is the necessity for the indexer to become familiar with the book. I had a head start here, as I'd read the text five or six times through already during its long gestation, making comments and looking for glitches. But this only helped me backtrack more, as the way to index something became clear halfway through, and I could say, "I know there were other instances of this earlier." Having a searchable PDF of the whole text was a great help, though I actually did the indexing from a printout in a three-ring binder. My highlighting pen, as it happens, was pink, this being the most common color after yellow at the office supply store, and much easier to see.
Indexers are legendarily supposed to be bad. There's some truth to this. I tried to get a sense of how to index by deconstructing the indexes to two of my own articles, as those were texts I know well. These were published in books that were indexed by their editors. I went through the indexes looking for the page numbers that my article covered, typed the numbers and entries into a spreadsheet, and rearranged it by page number, to see what topics were chosen for indexing. One of the indexes was pretty good; the other was haphazard at best, picking random names irrelevant to the topic, leaving out basic concepts, and so on.
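The reverse-engineering step described above (pull out the entries that fall within one article's pages, then rearrange them by page number) can also be sketched in a few lines of Python instead of a spreadsheet. This is only an illustration under assumptions: the function name, the data layout, and the sample headings and page numbers are all hypothetical.

```python
from collections import defaultdict

def entries_by_page(index, first_page, last_page):
    """Invert an index: map each page in a range to its headings.

    `index` maps a heading to a list of integer page numbers.
    Only pages from first_page to last_page inclusive are kept.
    """
    by_page = defaultdict(list)
    for heading, pages in index.items():
        for page in pages:
            if first_page <= page <= last_page:
                by_page[page].append(heading)
    # Pages in ascending order, headings alphabetical within a page.
    return {page: sorted(by_page[page]) for page in sorted(by_page)}

# Made-up index fragment, for illustration only.
sample_index = {
    "alphabetization": [12, 40, 41],
    "cataloging": [40],
    "proofs": [7, 42],
}

# Suppose the article of interest runs from page 40 to page 42.
article_view = entries_by_page(sample_index, 40, 42)
print(article_view)
```

Reading the result page by page shows at a glance which topics the indexer chose to pick up on each page and which were passed over.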
Isaac Asimov tells in his autobiography of having hired a professional indexer for one of his first non-fiction books, and being so appalled at the results that he vowed to index all his further books himself. But I can't tell you where in his two-volume autobiography he says this, for although the books are indexed, Asimov himself was a terrible indexer. Each has two separate indexes (itself against the advice of Chicago and most other index-arrangement authorities). One is the "Name Index," by which Asimov means persons, though he doesn't say this. No places or institutions need apply. (Can you find from the index when he started working for the Navy Yard or for Boston University, say? No.) All the names are given, however irrelevant to Asimov's life story ("Calvin, John, 290") or however unlikely that readers will look them up ("Edith, 534; Eileen, 165"). And of course they're totally undifferentiated: "Campbell, John W., Jr., 189, 192, 194-207, 212, 219, 223-25 ..." and so on in a regular stream up to "650, 661, 669, 677, 687, 699, 701," and that's just the first volume. The other is the "Title Index," by which Asimov means - but again doesn't bother to explain specifically - the titles of his own stories and books. The clumsy part here comes when dealing with a story whose title changed somewhere in the writing or editing or publishing process. Asimov mindlessly indexes it just under whatever title it's referred to by on the page. No cross-references. You have to look at all the pages referred to and find the comment about the title change to discover that there's another title with a whole bunch more references.
I tried to avoid such pitfalls. My assignment is full of names and titles, but I tried to be intelligent about them. There are many long lists of names. If the individual names seemed relevant, I indexed them; if not, not. But I always indexed the list as a unit for whatever point it was there to make. Similarly with book titles. If a book by an Inkling was mentioned, I indexed it, directly under title. Almost always I made an entry for the author as well, sub-entered not under book title - that's already covered - but by topic. Why is the book mentioned at this point? Is it as an example of a work written in collaboration with, or with feedback from, other Inklings? Was it reviewed by another Inkling, or did somebody mention it favorably, or for that matter criticize it? Into the proper subcategory of the author entry it goes. If the book's title entry gets long, make sub-entries for that too.
One wants to be consistent in sub-entry terminology, but I felt no need to be consistent in depth of indexing. If a minor entry has just 4 or 5 references, don't bother to make sub-entries unless there's an important point to be brought out. If, among the many sub-entries on a major topic, there are 3 or 4 hair-splitting distinctions with only one or two references each, bring 'em together. But if they have 5 references each, keep them split, or split them if they haven't been already.
Nor is there any need to be reciprocal. The subtopic is the subtopic of that particular main entry. If A is quoted describing B's personality, the sub-entry under A should say, "on B" unless this needs to be further divided. But the sub-entry under B should say "personality," not "A on." If a reader wants to know what A said about B, look under A, not B. And so on.
This index was done in three stages. First, marking the proofs. Second, typing the entries and page numbers into a spreadsheet, which I then had the computer alphabetize (computers alphabetize word-by-word, so as the publisher requested letter-by-letter I manually rearranged it in final copy - not a difficult task). After completing the spreadsheet I thought I was mostly done, and had only to cut-and-paste into a Word document with some minor editing. Wrong. Stage three, the actual writing of the index, was the hardest and took by far the longest. Some entries were short and simple, but I always searched automatically through the PDF to see if I'd missed an obscure name reference or distinctive use of a word. Sometimes I had. But with the sub-entries, all my notes in the database were ad hoc. Constant rethinking was required to sort them out, and often I had to go back and make changes. There was nothing mechanical about this. It took thought and time.
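The difference between the two alphabetization schemes in stage two can be shown with a pair of sort keys. What follows is a simplified Python sketch, not the publisher's or Chicago's full rules: it ignores refinements such as the special handling of commas in inverted names, and the function names and sample headings are assumptions made for illustration.

```python
import re

def word_by_word_key(heading):
    """Compare one whole word at a time, so a shorter first word
    ("New") files before a longer one that starts the same way ("Newark")."""
    return re.findall(r"[a-z0-9]+", heading.lower())

def letter_by_letter_key(heading):
    """Ignore spaces and punctuation and compare the remaining
    characters as one unbroken string."""
    return re.sub(r"[^a-z0-9]", "", heading.lower())

# Sample headings chosen only to show the two orders diverging.
headings = ["Newark", "New York", "Newton", "New Zealand"]

word_order = sorted(headings, key=word_by_word_key)
letter_order = sorted(headings, key=letter_by_letter_key)

print(word_order)    # New York, New Zealand, Newark, Newton
print(letter_order)  # Newark, Newton, New York, New Zealand
```

With a key function like the second one, the rearrangement into letter-by-letter order could be done by the computer rather than by hand, though any real index would still need its special cases checked by eye.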
I did some proofreading against the spreadsheet and found a few other errors when checking the printout for something. I'm sure there are others left. But I hope the index will be useful. This book is going to remake Inklings studies, and I get to guide people through its pages.
no subject
Date: 2006-10-17 06:58 pm (UTC)
Perhaps *you* should write the text on how to index.
Got here via