Written by Allen Wyatt (last updated April 15, 2023)
This tip applies to Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021
Isao wonders if there is a way to easily construct a list of all the unique words in a document. He doesn't need to know how many times each word appears; he just needs the list of unique words. In addition, uppercase and lowercase variations on the same word should count as the same word.
There is no built-in Word function or tool to do this. However, in VBA you can access the Words collection, which includes all the words in the document. With this in mind, you can create a macro that builds a sorted list of unique words in the document and then adds those words to the end of the document.
Sub UniqueWordList()
    Dim wList As New Collection
    Dim wrd
    Dim chkwrd
    Dim sTemp As String
    Dim k As Long
    For Each wrd In ActiveDocument.Range.Words
        sTemp = Trim(LCase(wrd))
        ' Keep only words that begin with a letter (a to z)
        If Left(sTemp, 1) Like "[a-z]" Then
            k = 0
            ' Walk the sorted list to find a duplicate or the insertion point
            For Each chkwrd In wList
                k = k + 1
                If chkwrd = sTemp Then GoTo nw
                If chkwrd > sTemp Then
                    wList.Add Item:=sTemp, Before:=k
                    GoTo nw
                End If
            Next chkwrd
            ' Larger than everything already stored, so append
            wList.Add Item:=sTemp
        End If
nw:
    Next wrd
    sTemp = "There are " & ActiveDocument.Range.Words.Count & " words "
    sTemp = sTemp & "in the document, before this summary, but there "
    sTemp = sTemp & "are only " & wList.Count & " unique words."
    ActiveDocument.Range.Select
    Selection.Collapse Direction:=wdCollapseEnd
    Selection.TypeText vbCrLf & sTemp & vbCrLf
    For Each chkwrd In wList
        Selection.TypeText chkwrd & vbCrLf
    Next chkwrd
End Sub
Note that each word in the document is extracted, converted to lowercase, and then added to the wList collection in sorted order. Words are only added if they begin with a letter (thus, numbers are excluded, as is punctuation), and the macro pays no attention to the case of the words. You should also be aware that the macro only looks at words in the main body of the document. It does not include any words in places such as headers, footers, text boxes, or shapes.
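If words outside the main body matter, one possible approach is to walk the document's StoryRanges collection and follow each story's NextStoryRange chain. The following is only a sketch under a few assumptions: the macro name UniqueWordsAllStories and the variable names are hypothetical, it uses a late-bound Scripting.Dictionary (available on Windows) rather than the Collection used above, and it may still miss text in some shapes, such as text boxes anchored inside headers.

```vba
' Sketch: count unique words across every story, not just the main body.
' UniqueWordsAllStories and its variable names are illustrative only.
Sub UniqueWordsAllStories()
    Dim seen As Object          ' late-bound Scripting.Dictionary
    Dim story As Range
    Dim part As Range
    Dim wrd As Range
    Dim sTemp As String

    Set seen = CreateObject("Scripting.Dictionary")

    For Each story In ActiveDocument.StoryRanges
        ' A story type can be a chain of ranges (for example, one header
        ' per section), so follow NextStoryRange until it runs out.
        Set part = story
        Do While Not part Is Nothing
            For Each wrd In part.Words
                sTemp = Trim(LCase(wrd.Text))
                ' Same filter as the main macro: must start with a letter
                If Left(sTemp, 1) Like "[a-z]" Then
                    If Not seen.Exists(sTemp) Then seen.Add sTemp, True
                End If
            Next wrd
            Set part = part.NextStoryRange
        Loop
    Next story

    Debug.Print seen.Count & " unique words across all stories"
End Sub
```

The result is written to the Immediate window so the sketch does not alter the document; the keys of the dictionary hold the words themselves if you want to insert them somewhere.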
The macro could easily be changed to allow for varying needs. For instance, you could have the macro place the word list into a separate document instead of at the end of the current document. All you would need to do is replace the last part of the macro, beginning with the lines that build the summary sentence, with the following:
    sTemp = "There are " & ActiveDocument.Range.Words.Count & " words "
    sTemp = sTemp & "in " & ActiveDocument.Name & ", but there "
    sTemp = sTemp & "are only " & wList.Count & " unique words."
    Documents.Add
    ActiveDocument.Range.Select
    Selection.Collapse Direction:=wdCollapseEnd
    Selection.TypeText vbCrLf & sTemp & vbCrLf
    For Each chkwrd In wList
        Selection.TypeText chkwrd & vbCrLf
    Next chkwrd
End Sub
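Note that there is only one substantive change in the macro: the addition of the Documents.Add method to create the new document for the summary. The summary sentence is built before that call, so the word count and the document name it reports still refer to the original document.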
For some other ideas on getting words out of a document, including macros that tally word frequency, you may want to refer to this tip: Generating a Count of Word Occurrences.
This tip (7697) applies to Microsoft Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021.
2023-04-17 12:40:13
Andrew
Rereading this, I realize there is one obscurity that ought to be explained. The statement WordList(s) = WordList(s) + 1, when operating on a Dictionary in which the key s is not already defined, will add that key to the dictionary with an Empty item (essentially an uninitialized Variant, which converts to 0 when subjected to the + operator), so the first occurrence of a word leaves its count at 1.
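The behavior described in this comment can be seen in isolation with a tiny test. This is only a sketch: the macro name DictionaryAutoAddDemo is hypothetical, and it assumes Windows, where the Scripting runtime is available, using late binding so that no reference needs to be set.

```vba
' Sketch: reading a missing key from a Scripting.Dictionary creates it.
' DictionaryAutoAddDemo is an illustrative name, not part of the tip.
Sub DictionaryAutoAddDemo()
    Dim d As Object
    Set d = CreateObject("Scripting.Dictionary")

    Debug.Print d.Exists("apple")   ' False: the key is not present yet
    ' Reading d("apple") adds the key with an Empty item; Empty + 1 is 1
    d("apple") = d("apple") + 1
    Debug.Print d.Exists("apple")   ' True: the key now exists
    Debug.Print d("apple")          ' 1: the count after the first increment
End Sub
```

One consequence worth keeping in mind is that merely reading a key this way also adds it, so use the Exists method when you want to test for a word without changing the dictionary.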
2023-04-17 12:34:52
Andrew
I used to do this by converting all of the words in a document to single lines, sorting them, and using a wildcard search (replacing "(*^13)@" with "\1", a GREAT trick from the WordMVP site). I like this Tips.net approach better, but using a Scripting.Dictionary instead of a Collection makes the process MUCH simpler. My new implementation follows. It takes the Range to be operated on and the Scripting.Dictionary to use as parameters, so as to facilitate feeding all of the different stories of a document through it (and not just the main story). The dictionary is keyed by the words themselves and stores the number of occurrences of each, which is occasionally useful information. "Uniqueness" of words is as defined by the Scripting.Dictionary.Exists method and depends on the WordList's .CompareMode property. There is no need for LCase as long as .CompareMode is set to TextCompare before any words are added; leaving it at the default, BinaryCompare, differentiates capitalized words from non-capitalized words, which I often have to do.
' Requires a reference to Microsoft Scripting Runtime (Tools > References)
Sub UniqueWordListFromRange(WordList As Scripting.Dictionary, Rng As Range)
    Dim s As String, r As Range
    For Each r In Rng.Words
        s = RTrim(r.Text) ' Remove possible trailing spaces per https://learn.microsoft.com/en-us/office/vba/api/word.words
        ' Count only words that start with a letter or digit
        If Left(s, 1) Like "[A-Za-z0-9]" Then WordList(s) = WordList(s) + 1
    Next r
End Sub
Accessing the list of unique words to add to the end of the document is similarly greatly simplified (and it would be simple work to alphabetize the result):
Sub UniqueWordList2()
    Dim WordList As New Scripting.Dictionary
    WordList.CompareMode = TextCompare ' Treat "Word" and "word" as the same key
    UniqueWordListFromRange WordList, ActiveDocument.Content
    ActiveDocument.Content.InsertAfter vbCr & vbCr & "These " & WordList.Count & _
        " unique words precede this summary:" & vbCr & vbCr & Join(WordList.Keys, vbCr)
End Sub
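The comment mentions that alphabetizing the result would be simple work. One way it might be done, offered only as a sketch, is to copy the dictionary's keys into an array and run an insertion sort over it. The function name SortedKeys is hypothetical, and the case-insensitive ordering via StrComp with vbTextCompare is an assumption about what "alphabetize" should mean here.

```vba
' Sketch: return a dictionary's keys as an array sorted case-insensitively.
' SortedKeys is an illustrative helper, not part of the original macros.
Function SortedKeys(ByVal dict As Object) As Variant
    Dim arr As Variant
    Dim i As Long, j As Long
    Dim tmp As Variant

    arr = dict.Keys
    ' Plain insertion sort; an empty key array simply skips the loop
    For i = LBound(arr) + 1 To UBound(arr)
        tmp = arr(i)
        j = i - 1
        Do While j >= LBound(arr)
            ' VBA's And does not short-circuit, so test inside the loop
            If StrComp(arr(j), tmp, vbTextCompare) <= 0 Then Exit Do
            arr(j + 1) = arr(j)
            j = j - 1
        Loop
        arr(j + 1) = tmp
    Next i

    SortedKeys = arr
End Function
```

With this in place, Join(SortedKeys(WordList), vbCr) could stand in for Join(WordList.Keys, vbCr) in the macro above. Insertion sort is quadratic, so for very long word lists a different sorting method may be preferable.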
Copyright © 2024 Sharon Parq Associates, Inc.