HTML Character Sets

To display an HTML page correctly, the browser must know what character-set to use.

The character-set of the early World Wide Web was ASCII. ASCII supports the digits 0-9, the uppercase and lowercase letters of the English alphabet, and some special characters.
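
The limits of ASCII can be checked with a short script. The following is a minimal sketch in Python (chosen here only for illustration; the sample strings and the helper name `is_ascii` are made up for this example). It relies on the fact that ASCII defines 128 characters, so any character with a code point of 128 or above cannot be encoded.

```python
# ASCII defines 128 characters (code points 0-127).
# The sample strings below are made up for illustration.
samples = ["Hello, World! 123", "café", "Grüße"]


def is_ascii(text):
    """Return True if every character in text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character lies outside the ASCII range.
        return False


results = {s: is_ascii(s) for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {'ASCII' if ok else 'not ASCII'}")
```

Only the first sample passes, because the accented letters in the other two are not part of ASCII.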

Complete ASCII reference.

Because many languages use characters that are not part of ASCII, the default character-set for modern browsers is ISO-8859-1, which extends ASCII with additional characters for Western European languages.

Complete ISO-8859-1 reference.

If a web page uses a different character-set than ISO-8859-1, it should be specified in the <meta> tag.
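
As a rough illustration, the sketch below (in Python, used here only as a convenient way to build the page) generates a small HTML page that declares ISO-8859-2 in a <meta> tag. The page content, the choice of ISO-8859-2, and the HTML 4 style `http-equiv` form of the tag are assumptions made for this example, not taken from the text above.

```python
# Hypothetical charset and page content, chosen for illustration.
charset = "ISO-8859-2"
body_text = "Dzień dobry"  # Polish greeting; the letter ń is not in ISO-8859-1

page = (
    "<html>\n"
    "<head>\n"
    f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
    "<title>Example</title>\n"
    "</head>\n"
    f"<body><p>{body_text}</p></body>\n"
    "</html>\n"
)

# The bytes actually served must use the character set that the tag declares.
page_bytes = page.encode(charset)
print(page)
```

The declaration only helps if it is truthful: the file has to be saved or served in the same character-set that the <meta> tag names.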

ISO Character Sets

The International Organization for Standardization (ISO) defines the standard character-sets for different alphabets and languages.

Some of the character-sets in use around the world are listed below:

Character set | Description | Covers
ISO-8859-1 | Latin alphabet part 1 | North America, Western Europe, Latin America, the Caribbean, Canada, Africa
ISO-8859-2 | Latin alphabet part 2 | Eastern Europe
ISO-8859-3 | Latin alphabet part 3 | SE Europe, Esperanto, miscellaneous others
ISO-8859-4 | Latin alphabet part 4 | Scandinavia/Baltics (and others not in ISO-8859-1)
ISO-8859-5 | Latin/Cyrillic alphabet part 5 | The languages that use a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian and Macedonian
ISO-8859-6 | Latin/Arabic alphabet part 6 | The languages that use the Arabic alphabet
ISO-8859-7 | Latin/Greek alphabet part 7 | The modern Greek language as well as mathematical symbols derived from the Greek
ISO-8859-8 | Latin/Hebrew alphabet part 8 | The languages that use the Hebrew alphabet
ISO-8859-9 | Latin 5 alphabet part 9 | The Turkish language. Same as ISO-8859-1 except Turkish characters replace Icelandic ones
ISO-8859-10 | Latin 6 (Lappish, Nordic, Eskimo) | The Nordic languages
ISO-8859-15 | Latin 9 (aka Latin 0) | Similar to ISO-8859-1 but replaces some less common symbols with the euro sign and some other missing characters
ISO-2022-JP | Latin/Japanese alphabet part 1 | The Japanese language
ISO-2022-JP-2 | Latin/Japanese alphabet part 2 | The Japanese language
ISO-2022-KR | Latin/Korean alphabet part 1 | The Korean language
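
To see why the browser needs to know which of these character-sets a page uses, the following minimal Python sketch (an illustration only; the byte value 0xE6 and the four character-sets compared are arbitrary choices for this example) decodes the same single byte under several ISO-8859 parts.

```python
# One byte above the ASCII range; the value 0xE6 is an arbitrary choice.
raw = bytes([0xE6])

# A few of the character-sets from the table, picked for illustration.
charsets = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7"]

# Decode the same byte under each character-set.
decoded = {name: raw.decode(name) for name in charsets}
for name, char in decoded.items():
    print(f"{name}: U+{ord(char):04X} {char}")
```

The same byte comes out as a different letter under each character-set (a Latin ligature, an accented Latin letter, a Cyrillic letter, and a Greek letter), which is why a wrong or missing declaration produces garbled text.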

The Unicode Standard

Because the character-sets listed above are limited in size, and are not compatible in multilingual environments, the Unicode Consortium developed the Unicode Standard.

The Unicode Standard aims to cover all the characters, punctuation marks, and symbols used in the world's writing systems.

Unicode enables processing, storage and interchange of text data no matter what the platform, no matter what the program, no matter what the language.

The Unicode Consortium

The Unicode Consortium develops the Unicode Standard. Its goal is to replace the existing character-sets with its standard Unicode Transformation Formats (UTF).

The Unicode Standard has become a success and is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc. It is also supported in many operating systems and all modern browsers.

The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.

Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 and UTF-16:

Character-set | Description
UTF-8 | A character in UTF-8 can be from 1 to 4 bytes long. UTF-8 can represent any character in the Unicode Standard. UTF-8 is backwards compatible with ASCII. UTF-8 is the preferred encoding for e-mail and web pages.
UTF-16 | The 16-bit Unicode Transformation Format is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire. UTF-16 is used in major operating systems and environments, like Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET byte code environments.
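
The variable lengths described in the table can be observed directly. The sketch below is a small Python illustration (the four sample characters are arbitrary choices for this example) that measures how many bytes each character occupies in UTF-8 and in UTF-16, and checks that ASCII text is encoded identically in ASCII and UTF-8.

```python
# Sample characters chosen for illustration:
# a Latin letter, an accented letter, the euro sign, and an emoji (U+1F600).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character needs in UTF-8 (between 1 and 4).
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# Number of bytes in UTF-16 (2 or 4). "utf-16-le" is used so that
# no byte order mark is added to the result.
utf16_lengths = [len(ch.encode("utf-16-le")) for ch in samples]

# Backwards compatibility: pure ASCII text has the same bytes in UTF-8.
ascii_compatible = "Hello".encode("ascii") == "Hello".encode("utf-8")

for ch, n8, n16 in zip(samples, utf8_lengths, utf16_lengths):
    print(f"U+{ord(ch):04X}: {n8} byte(s) in UTF-8, {n16} byte(s) in UTF-16")
print("ASCII text unchanged in UTF-8:", ascii_compatible)
```

The four samples take 1, 2, 3 and 4 bytes in UTF-8, while in UTF-16 the first three take 2 bytes each and the last takes 4.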

Tip: The first 256 code points of Unicode correspond to the 256 characters of ISO-8859-1.
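
This correspondence can be verified with a few lines of Python (used here only for illustration; the variable names are made up for this sketch). Every possible byte value is decoded as ISO-8859-1, and each resulting character is compared with the Unicode code point of the same number.

```python
# All 256 possible byte values, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Decode them as ISO-8859-1; each byte becomes exactly one character.
text = all_bytes.decode("iso-8859-1")

# Check that byte value i maps to the Unicode code point i.
matches = all(ord(ch) == i for i, ch in enumerate(text))
print("ISO-8859-1 matches the first 256 Unicode code points:", matches)
```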

Tip: All HTML 4 processors already support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16!

Copyright 1999-2009 by Refsnes Data. All Rights Reserved.