Written by Allen Wyatt (last updated April 22, 2023)
This tip applies to Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021
Larry has a 200-page document, with each page containing a text box with text. He would like to copy the contents of all the text boxes to a new document without manually extracting the text from each box, one by one. He wonders if this can be done easily.
If the text boxes are in the main body of your document, you might try to use the searching capabilities of Word to extract the text. First, though, create a brand new document; this will be where you end up pasting the text.
Now, switch back to your original document and do a bit of analysis on the text within the text boxes. I find it helpful to figure out if the text is using a common style; in my case, I noticed that each paragraph within my text boxes used the Normal style.
Now, click somewhere in your document's main body, outside of any text boxes. Then, follow these steps:
If your text boxes don't use the Normal style for their text, all you need to do is figure out what common attribute the text does share and then specify that attribute in steps 4 through 6.
These steps, as mentioned, work great if your text boxes are in the main body of your document. It is possible for text boxes to also be in other places, such as headers, footers, or footnotes. In addition, if the text in your text boxes doesn't share some common attribute that you can discern, then the steps won't produce a satisfactory result.
In this case, the only real way that we've found to extract the text is to use a macro. The following is a rather simple one that adds a new document and then steps through each story in the source document. (A story is a portion of the document such as headers, footers, footnotes, endnotes, main body, etc. Since text boxes could be in each of these, it makes sense to process each story.) It then looks at all the shapes in the story and, if a shape is a text box, copies its text to the sText string. This is then "typed" into the new document.
Sub XferTextBoxContents()
    Dim Source As Document
    Dim stry As Range
    Dim shp As Shape
    Dim sText As String

    Set Source = ActiveDocument
    Documents.Add DocumentType:=wdNewBlankDocument
    ' The newly added document is now the ActiveDocument

    For Each stry In Source.StoryRanges
        For Each shp In stry.ShapeRange
            If shp.Type = msoTextBox Then
                ' Copy text to string, without last paragraph mark
                sText = Left(shp.TextFrame.TextRange.Text, _
                  shp.TextFrame.TextRange.Characters.Count - 1)
                If Len(sText) > 0 Then
                    Selection.TypeText Text:=sText
                    Selection.TypeParagraph
                End If
            End If
        Next shp
    Next stry
    Source.Activate
End Sub
The macro doesn't change the original document, and when it is completed, the new document will contain only the text that was in the original's text boxes.
There are a few things you should note about using a macro such as this. First, how text boxes appear within the original document doesn't reflect how they are actually stored and accessed within a macro. For instance, let's say that you have several different sections in your source document, and each has a header and footer, and each header and footer contains a text box. When you look at the document on the screen, the text boxes in the header may appear higher on the page than the text boxes in the footer, and there may be text boxes in the main body of the document that appear between those.
The macro, however, steps through each story in the document and then processes each text box within those stories. This means that the text boxes for all the headers may appear in the target document before all those from the footers, and they may be followed by the text boxes from the main body of the document. The bottom line is that you should not expect the "order" of the text in the target document to match the apparent order you may see in the source document.
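If you want to see the processing order before running the extraction, a small diagnostic macro along these lines may help. This is only a sketch that reuses the same two loops as the macro above; the name ListTextBoxesByStory is made up for illustration, and the output goes to the Immediate window of the VBA editor (press Ctrl+G there to display it).

```vba
Sub ListTextBoxesByStory()
    Dim stry As Range
    Dim shp As Shape
    Dim n As Long

    ' Walk the stories in the same order the extraction macro does
    For Each stry In ActiveDocument.StoryRanges
        n = 0
        For Each shp In stry.ShapeRange
            If shp.Type = msoTextBox Then n = n + 1
        Next shp
        ' StoryType is a WdStoryType value (1 = wdMainTextStory)
        Debug.Print "Story type " & stry.StoryType & _
          ": " & n & " text box(es)"
    Next stry
End Sub
```

The lines appear in the order in which the stories are visited, which is also the order in which the extracted text should land in the target document. The macro only counts text boxes; it doesn't modify either document.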
The upshot of this realization is that if your original document was created by a program—for instance, a PDF to Word document converter—that program could have tried to maintain the appearance of the original PDF document by sticking everything within a bunch of text boxes. I've seen some converter programs that place every line or even every word into a separate text box. Run the macro on such a document, and you may be dissatisfied with what is created in the new target document. If that is the case, the only potential solution is to grab the original PDF and use a different, higher-quality converter program that doesn't rely so heavily on text boxes.
This tip (9755) applies to Microsoft Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021.
2023-04-23 23:05:44
Tomek
Quite clever approach to extract content of text boxes. Also, it keeps the order of text from boxes almost consistent with their order in the original document. It may switch the order for the text boxes originating from the same page, if those were not created in sequence, but it seems to at least keep the boxes from the same page as consecutive ones.
I did not try boxes from headers and footers and the macro approach but I think this exceeded Larry's request.
Why am I commenting on this? Because I tried to provide help for this tip and failed. I tried to select all text boxes by macro (easy to do), copy them, and paste them into a new document. What I got was a tangled mess of overlapping text boxes, even though they had been set to disallow overlap. When I untangled them (by converting to in-line objects) they were in some illogical order, mixing boxes originating from different pages. The order did not follow the numeric ID of the boxes, nor was it sorted by their names.
Copyright © 2024 Sharon Parq Associates, Inc.