Strings are ultimately stored as a sequence of bytes (a byte is 8 binary digits) that represent the characters the string contains. In Python 3, a string is a sequence of Unicode code points, and UTF-8 is the default encoding used to convert those code points to and from bytes. UTF-8 can represent 1,112,064 different code points (characters or symbols), which allows Python programs to process text in virtually every written language in the world.
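As a rough illustration of the figures above, the following sketch shows where the 1,112,064 count comes from and how many bytes a few sample characters take in UTF-8. The helper name utf8_length and the sample characters are chosen only for this example.

```python
# Unicode code points run from 0 to 0x10FFFF, which gives 0x110000 values.
# The 2,048 surrogate values (0xD800 to 0xDFFF) are not valid characters.
total_values = 0x110000
surrogates = 0xDFFF - 0xD800 + 1
valid_code_points = total_values - surrogates  # 1,112,064


def utf8_length(ch):
    """Return the number of bytes UTF-8 uses for one character (hypothetical helper)."""
    return len(ch.encode("utf-8"))


# One code point each, but between 1 and 4 bytes once encoded as UTF-8.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji
for ch in samples:
    print(ch, hex(ord(ch)), utf8_length(ch))
```

The characters are written as escape sequences so that the example does not depend on how the source file itself is saved.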

Objectives

Upon completion of this chapter’s exercises, you should be able to:

  • Use the ASCII character set to represent characters as numbers and to convert numbers back to their ASCII characters.
  • Define and apply the Unicode standard, which extends the ASCII set to represent a myriad of international characters and symbols.
  • Explain how the UTF-8 encoding represents Unicode characters as bytes.
  • Differentiate a byte array from a string and convert between the two.
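As a brief preview of these objectives, here is a minimal sketch using Python's built-in ord, chr, str.encode, and bytes.decode. The variable names and the sample word are assumptions made only for this illustration.

```python
# Character <-> number conversion (identical for ASCII and Unicode).
code = ord("A")        # 65
letter = chr(code)     # "A"

# A string counts characters; its UTF-8 encoding counts bytes.
text = "caf\u00e9"              # the word "cafe" with an accented final e
data = text.encode("utf-8")     # immutable bytes object
print(len(text), len(data))     # 4 characters, 5 bytes

# A bytearray is mutable, unlike str and bytes.
buffer = bytearray(data)
buffer[0] = ord("C")            # overwrite the first byte in place
restored = buffer.decode("utf-8")
print(restored)
```

Decoding with the same encoding that produced the bytes is what makes the round trip safe; decoding with a different one may raise an error or yield wrong characters.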
