Written by Allen Wyatt (last updated May 22, 2023)
This tip applies to Word 2007, 2010, 2013, 2016, 2019, and Word in Microsoft 365
Aaron has a document that contains a number of HTML tags, and he would like to remove the tags but maintain the formatting they represent. For instance, if he has <i>a phrase</i> that appears this way, he would like to remove the tags (<i> and </i>) but have "a phrase" appear in italics. Aaron is pretty sure this can be done with Find and Replace, but he's not quite sure how to go about it.
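You are right, Aaron; you can use Find and Replace to accomplish the removal. Follow these steps:
1. Press Ctrl+H to display the Replace tab of the Find and Replace dialog box. (See Figure 1.)
2. Click the More button, if it is available.
3. Make sure the Use Wildcards check box is selected.
4. In the Find What box, enter the following: \<i\>([!<]@)\</i\>
5. In the Replace With box, enter the following: \1
6. With the insertion point still in the Replace With box, press Ctrl+I so that the replacement is formatted as italic.
7. Click Replace All.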
Figure 1. The Replace tab of the Find and Replace dialog box.
The code that you enter in the Find What box (step 4) may look a little daunting. All you are telling Word to do is to find the opening HTML tag (<i>), followed by one or more characters that are not a left angle bracket, and ending with the closing HTML tag (</i>). The very short entry in the Replace With box (step 5) simply says to replace whatever is found with the contents of the first element of the Find What box that is surrounded by parentheses, which just happens to be the text between the two HTML tags.
If you want to eliminate the need to remember (or look up) the contents of the Find What box all the time, you can place the Find and Replace operation into a macro:
Sub ConvertItalicTags()
    ' Clear any formatting left over from earlier searches
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    ' Format whatever is put back as italic
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        ' Wildcard pattern: <i>, one or more non-"<" characters, then </i>
        .Text = "\<i\>([!<]@)\</i\>"
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
Assign the macro to a shortcut key, and you can remove the italic HTML tags anytime you need. You could also expand the macro to make similar changes relative to other HTML tags you may need to remove. You may even want to make sure that alternate tags are dealt with. For instance, HTML uses both <i> and <em> tags to display information in italic, which means you should account for the possibility of both sets of tags in your macro.
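One way to account for both sets of tags is to loop over a list of tag names and run the same wildcard replacement for each. The following is only a minimal sketch under stated assumptions, not a tested drop-in: the names ConvertItalicTagVariants and ReplaceTagWithItalic are hypothetical, the tags are assumed to carry no attributes and not to be nested inside one another, and the search works on the main body of the active document rather than on the current selection.

```vba
Sub ConvertItalicTagVariants()
    ' Hypothetical wrapper: handle both tags that HTML uses for italic text
    Dim vTag As Variant
    For Each vTag In Array("i", "em")
        ReplaceTagWithItalic CStr(vTag)
    Next vTag
End Sub

Private Sub ReplaceTagWithItalic(ByVal sTag As String)
    ' Hypothetical helper: replace <tag>text</tag> with italic "text"
    ' throughout the main body of the active document
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Replacement.Font.Italic = True
        ' Same wildcard pattern as the tip, built for the given tag name
        .Text = "\<" & sTag & "\>([!<]@)\</" & sTag & "\>"
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Run ConvertItalicTagVariants from the Macros dialog box or from a shortcut key. Because wildcard searches in Word are case-sensitive, uppercase tags such as <I> would need their own entries in the list.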
Of course, there is an entirely different approach you could use to get rid of the HTML tags and still retain the formatting associated with those tags. That would be to save the HTML-encoded text into a text file with an .htm or .html extension (so that the browser renders the tags rather than displaying them), open it in your browser, copy the text within the browser window, and paste it directly into a Word document. If all goes well, you would have the desired formatted text in your finished document.
This tip (10308) applies to Microsoft Word 2007, 2010, 2013, 2016, 2019, and Word in Microsoft 365.
2021-01-02 20:11:45
\<i\>(<[A-z ]@>)(\</i\>)
is the correct function for removing text hypertext commands (specifically italics)
description:
"\<" text of "<" not wildcard
"i" text "i"
"\>" text of ">" not wildcard
"(" open parenthesis denoting beginning of text to format
"<" start with wildcard for text
"[A-z ]" characters that are included as start or part of the string affected by the formatting. (A, z, (space))
"@" include all consecutive occurrences of the prior character(s)
">" closing wildcard for text
")" closing parenthesis denoting end of text to format
"(\</i\>)" description of hypertext command to end italics
2019-08-14 02:39:19
Ken
Franci
You don't have a matching pair of round brackets.
2019-08-13 05:33:48
Franci
HI!
Using \1 in the Replace With box gives me a "Replace with text contains a group number which is out of range" Error Message. What am I doing wrong?
Copyright © 2024 Sharon Parq Associates, Inc.