Please Note: This article is written for users of the following Microsoft Word versions: 2007, 2010, 2013, and 2016. If you are using an earlier version (Word 2003 or earlier), this tip may not work for you. For a version of this tip written specifically for earlier versions of Word, click here: Understanding Unicode Characters.

Understanding Unicode Characters

Written by Allen Wyatt (last updated April 26, 2021)
This tip applies to Word 2007, 2010, 2013, and 2016



You may have heard the term Unicode before and wondered what it meant. Single-byte encoding schemes (such as the ANSI code pages; ASCII, being a 7-bit code, defines only 128 characters) allow at most 256 unique characters to be encoded and displayed on the computer. In the global computer community, where each member needs to work in their own language, this is a problem. There are far more than 256 characters in common use throughout the world.
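The 256-character ceiling can be seen directly in a short Python sketch. It assumes the Windows code page cp1252 as a stand-in for "ANSI"; the helper name encodable is made up for this illustration.

```python
# A single byte holds 8 bits, so it can take at most 2**8 distinct values.
SINGLE_BYTE_LIMIT = 2 ** 8


def encodable(text: str, encoding: str) -> bool:
    """Return True if every character in text can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print("Values available in one byte:", SINGLE_BYTE_LIMIT)

# Try a Latin letter, an accented letter, a Greek letter, and a Chinese character.
for ch in ["A", "é", "Ω", "中"]:
    print(ch, "ascii:", encodable(ch, "ascii"), "cp1252:", encodable(ch, "cp1252"))
```

Only the plain letter fits in ASCII, the accented letter also fits in cp1252, and the Greek and Chinese characters fit in neither.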

This is where Unicode comes into play.

Unicode assigns each character a number, called a code point, and the number of bytes needed to store that number depends on the encoding form being used: UTF-8 uses one to four bytes per character, UTF-16 uses two or four, and UTF-32 always uses four. As of this writing, the current Unicode standard is 9.0.0, which defines 128,172 characters. This standard, devised and promoted by the Unicode Consortium (http://www.unicode.org), allows for the display of virtually all the unique language characters in the world. A team of computer professionals, linguists, and scholars continues to work on the actual development of Unicode.
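How many bytes a character actually occupies depends on the encoding form chosen (UTF-8, UTF-16, or UTF-32). The following is a minimal sketch using Python's built-in codecs; the function name encoded_sizes is made up for this illustration.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes ch occupies in each Unicode encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-le" codecs omit the byte order mark, so only the character
        # itself is counted.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in ["A", "é", "中", "😀"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The same character never needs more than four bytes in any of the three forms; what changes is how compactly the common characters are stored.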

The use of multiple bytes to define each character means that Unicode can be used to encode most of the characters used in the world's major languages. There is an extension mechanism built into the standard, as well, which makes it possible to encode roughly a million more characters, if necessary. This capacity should be sufficient for all known language requirements, plus the encoding of all the historic scripts of the world. (This includes languages and symbols that are no longer in use.)
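The "million more characters" figure can be checked with a little arithmetic. The sketch below assumes the extension mechanism in question is the UTF-16 surrogate mechanism, which reaches 16 supplementary planes beyond the first 65,536 code points; the names used are made up for this illustration.

```python
PLANES = 17                       # plane 0 plus 16 supplementary planes
CODE_POINTS_PER_PLANE = 0x10000   # 65,536 code points in each plane

total_code_points = PLANES * CODE_POINTS_PER_PLANE
supplementary_code_points = (PLANES - 1) * CODE_POINTS_PER_PLANE


def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into two
    16-bit UTF-16 code units (high surrogate, low surrogate)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


print(f"{supplementary_code_points:,} supplementary code points")
print(f"{total_code_points:,} code points in total")
high, low = to_surrogate_pair(0x1F600)
print(f"U+1F600 -> {high:04X} {low:04X}")
```

Only a fraction of these code points have characters assigned to them, which is why there is still room for scripts that have not yet been encoded.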

As presently defined, Unicode 9.0.0 (the latest version, released in June 2016) includes codes for characters used in the major written languages of the world, including Arabic, Armenian, Balinese, Bengali, Bopomofo, Buhid, Canadian Syllabics, Cherokee, Chinese, Cyrillic, Deseret, Devanagari, Ethiopic, Georgian, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunóo, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Mongolian, Myanmar, Ogham, Old Italic (Etruscan), Oriya, Phoenician, Runic, Sinhala, Syriac, Tagalog, Tagbanwa, Tamil, Telugu, Thaana, Thai, Tibetan, and Yi. Work is progressing to add more characters from lesser-known languages.

In addition, Unicode includes many different symbols, including numbers, general diacritics, general punctuation, general symbols, dingbats, emojis, arrows, blocks, box drawing forms, geometric shapes, mathematical symbols, musical symbols (Western and Byzantine), technical symbols, braille patterns, and Kangxi radicals.
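Every one of these symbols has an official name in the standard. As a small sketch, Python's standard unicodedata module can report the name for a few samples; the helper name describe is made up for this illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# An arrow, a block element, a musical symbol, and a braille pattern.
for ch in ["→", "█", "♪", "⠃"]:
    print(ch, describe(ch))
```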

Unicode is supported in all modern versions of Windows and Word. Exactly which version of the Unicode standard is supported depends on the version of Windows and Word in question.
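The same version dependence applies to any software that works with Unicode, not just Windows and Word. As an illustration outside Word, this sketch asks a Python installation which version of the Unicode character database it ships with; the function names are made up for this example, and the result will differ from one installation to another.

```python
import unicodedata


def unicode_version() -> tuple:
    """Return the Unicode version of this Python's character database."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def is_assigned(ch: str) -> bool:
    """Return True if this runtime's database has the character assigned."""
    # General category "Cn" means the code point is unassigned in the
    # database that this Python installation ships with.
    return unicodedata.category(ch) != "Cn"


print("Unicode version:", ".".join(str(n) for n in unicode_version()))
print("U+0041 assigned:", is_assigned("A"))
```

A character added in a later version of the standard shows up as unassigned on software built against an earlier one, which is the usual reason a character displays as an empty box.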

This tip (11277) applies to Microsoft Word 2007, 2010, 2013, and 2016. You can find a version of this tip for the older menu interface of Word here: Understanding Unicode Characters.

Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company. ...


Comments


2017-04-18 13:31:43

Rod Grealish

James, open the web page https://unicode-table.com/en/. On the right, click "Open in separate page". This will display a list of Unicode blocks. Select the one in which you are interested. You can also enter a character in the search box near the top of the page to find its Unicode value.
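The same two lookups (character to code, and code back to character) can also be done in a few lines of code. This is a minimal Python sketch rather than a Word macro, and the function names are made up for this illustration; in VBA, the AscW and ChrW functions should play roughly the roles that ord and chr play here.

```python
def to_code_point(ch: str) -> str:
    """Return the U+XXXX notation for a single character."""
    return f"U+{ord(ch):04X}"


def from_code_point(notation: str) -> str:
    """Return the character for a string such as 'U+03A9' or '03A9'."""
    text = notation.strip().upper()
    if text.startswith("U+"):
        text = text[2:]
    return chr(int(text, 16))


print(to_code_point("Ω"))
print(from_code_point("U+20AC"))
```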


2017-04-17 13:50:11

James

Allen, Thank you for the theory, but practically, a few questions:

1) Where or how do we find the Unicode value of a character?
2) Or vice versa, how do we decode a Unicode value to find out what character it represents?
3) How can this be done with a macro?
Regards.

