
article

A character encoding or character set (sometimes referred to as a code page) is a code that pairs a sequence of characters from a given set with something else, such as a sequence of natural numbers, octets or electrical pulses, to facilitate the storage of text in computers and its transmission through telecommunication networks. Common examples include Morse code, which encodes letters of the Latin alphabet as series of long and short depressions of a telegraph key, and ASCII, which encodes letters, numerals and other symbols both as integers and as 7-bit binary versions of those integers, generally extended with an extra zero bit to facilitate storage in 8-bit bytes (octets).
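The ASCII case can be illustrated with a short Python sketch. It is only an illustration: the helper name ascii_rows is hypothetical and not taken from any standard, and the sample text is arbitrary. For each character it shows the integer ASCII assigns, the 7-bit binary form of that integer, and the same value padded with a leading zero bit to fill an octet.

```python
def ascii_rows(text):
    """Return (character, integer, 7-bit form, 8-bit form) for each character.

    ascii_rows is a hypothetical helper name used only for this sketch.
    """
    rows = []
    for ch in text:
        code = ord(ch)  # the integer assigned to the character
        if code > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        seven_bit = format(code, "07b")  # 7-bit binary version of the integer
        octet = format(code, "08b")      # padded with an extra zero bit for storage
        rows.append((ch, code, seven_bit, octet))
    return rows


if __name__ == "__main__":
    for row in ascii_rows("Hi!"):
        print(*row)
    # Python's built-in "ascii" codec stores one octet per character.
    print(list("Hi!".encode("ascii")))
```

Because every ASCII value is below 128, the leading bit of each stored octet is always zero.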

In the early days of computing, the introduction of character sets such as ASCII (1963) and EBCDIC (1964) began the process of standardisation. The limitations of such sets soon became apparent, and a number of ad hoc methods were developed to extend them. The need to support multiple writing systems, including the CJK family of East Asian scripts, required support for a far larger number of characters and demanded a systematic approach to character encoding rather than the previous ad hoc approaches.

Simple character sets


Conventionally, character set and character encoding were considered synonymous, as the same standard would specify both which characters were available and how they were to be encoded into a stream of code units (usually with a single character per code unit). For historical reasons, MIME and systems based on it use the term charset to refer to the complete system for encoding a sequence of characters into a sequence of octets.
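As a rough illustration of a charset in the MIME sense, the Python sketch below encodes the same sequence of characters into octets under two different charsets. The charset names iso-8859-1 and utf-8 and the sample word are choices made for this example, not taken from the article, and the helper name to_octets is hypothetical.

```python
def to_octets(text, charset):
    """Encode a sequence of characters into a sequence of octets.

    to_octets is a hypothetical helper; it wraps Python's built-in str.encode.
    """
    return text.encode(charset)


text = "caf\u00e9"  # four characters, the last one is e with an acute accent

latin1 = to_octets(text, "iso-8859-1")  # one octet per character
utf8 = to_octets(text, "utf-8")         # the accented letter takes two octets

if __name__ == "__main__":
    print(len(text), list(latin1))
    print(len(text), list(utf8))
    # Decoding with the wrong charset silently yields different characters.
    print(utf8.decode("iso-8859-1"))
```

The receiver has to know which charset was used: the octets produced for one charset decode to different characters when read under another, which is why MIME labels content with its charset.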



directory of related topics

Computer Data Formats
ASCII :: Text
Encoding :: XML
Fonts :: Globalization

 
Character_Encoding RSS feed
Character Encoding - Twitter Search

Firefox character encoding settings http://bit.ly/BAVLy
maheshspeaks (Mahesh Sabharwal) Fri, 06 Nov 2009 13:13:36 -0000
@sharonoday Thanks 4 that tweet! Nd new crayon color! Blew up my blog fixing character encoding prob w/ 2 preschoolrs underfoot ;(
MarianSparks (Marian Sparks!) Fri, 06 Nov 2009 05:02:19 -0000
round tripping, across the XML universe, always getting character encoding issues, coz we can't find reverse #startrek #indesign
zackster (zac spitzer) Fri, 06 Nov 2009 02:31:11 -0000
There is a special room in hell for the engineers that devised SMPPv3.4 and GSM03.38 character encoding and bit packing.
psidnell (Paul Sidnell) Thu, 05 Nov 2009 17:56:09 -0000
for #ruby #unicode experts. #google translate #webservice. foreign language character encoding is incorrect. any ideas? http://bit.ly/3Cn0Ev
sukhchander (sukhchander) Wed, 04 Nov 2009 16:20:32 -0000
Between linux, mac and windows boxes no wonder I end up with character encoding problems sometimes
sdempsey (Shane Dempsey) Wed, 04 Nov 2009 09:40:25 -0000

 

directory of related sites

A Brief History of Character Codes - A concise history of the development of character encoding in Western and East Asian languages, including ASCII, EBCDIC, Unicode and TRON.

Techniques for Foreign Content on the Web - Pennsylvania State University's guide to reading and publishing different languages on the web. Includes details of various encoding systems and links.
Meta Description: [ Some tips for typing and reading foreign languages in the Penn State computing environment ]

3rdpageSearch - Front end to several search engines and portals that allows you to enter queries in various character sets.
Meta Description: [ Offers a multilingual search tool for entering specific language characters in the search box from a table of special characters in various character sets. ]

An Early History of Character Set Standardization - Covers the beginnings of the ASCII standards from ASCII-1963 onwards and information on Cyrillic, Japanese, Korean, Thai and Vietnamese encoding systems, including various localized versions of EBCDIC. With tables and links to other resources.

ASCII and EBCDIC Compared - A comparison of these two basic encoding systems, with tables.
Meta Description: [ A technical comparison of the ASCII and EBCDIC character sets ]

Basis Technology: Presentations and Papers - A wide range of articles on Unicode, East Asian localization and internationalization issues.

Character Set Issues beyond HTML3.2 - Internationalization issues beyond HTML3.2 and ISO-8859-1. Includes information on Baltic encodings.

Characters and Encodings - A tutorial on character code issues in digital processing and transfer of text data, on the Internet or otherwise. Includes tables and a detailed listing of control codes. In English and Finnish.

Chilkat Charset Conversion Component - A character set conversion component for Unicode, Japanese, Chinese, Korean, Cyrillic, Arabic, Hebrew, Thai, Vietnamese and all Western languages.
Meta Description: [ Charset Conversion Component for Chinese, Japanese, Korean, Thai, Arabic, Hebrew, Vietnamese, Cyrillic, and all other languages. ]

Dan's Web Tips: Characters and Fonts - Hints and tips about character sets and fonts in web development. Includes links to related resources.

ECMA: Character Code Structure and Extension Techniques - ECMA-35, which specifies the structure of 8-bit and 7-bit codes that provide for the coding of character sets, with a detailed PDF document.

eGrannie: ASCII-EBCDIC chart - A side-by-side comparison of ASCII and EBCDIC encoding.
Meta Description: [ eGrannie cheat sheets and quick study references for IT professionals ]

EKI Letter Database - Query character sets, encoding, codepages and Unicode information in an easy-to-use web form. Held at the Institute of the Estonian Language.

GNU Aspell: Czyborra.com Mirror - Information on Latin and non-Latin encoding systems, codepages and character sets by Roman Czyborra.

HTML Document Representation - Chapter covering document character sets and encodings in HTML from the World Wide Web Consortium's HTML 4.0 Specification.

HTML Validation: Using Character Encodings - How to validate HTML documents in various character encodings.

IANA: Character Sets - The official names for character sets that may be used in the Internet and referred to in Internet documentation, held at the Internet Assigned Numbers Authority.

ISO 639 Language Names - The standard names for use in SGML and XML, including a complete list of language name codes.

LangBox International - Codetables for ISO 8859-6, ASMO 449 plus, ASMO 708 (Arabic) and ISO 8859-8 (Hebrew) and further information about the company's work in multilingual UNIX.
Meta Description: [ LangBox International is a company specialized in Internationalization and Localization of UNIX applications. Both Character (TTY) and Graphical (X11/Motif) interfaces and several languages (Arabic, Farsi, Hebrew, Greek, Cyrillic, Turkish, Thai) are supported. Main products are : LANGBOX... ]

MS Windows characters in HTML - A review of the HTML authoring problems caused by some special characters that belong to the MS Windows character set but not to ISO Latin 1. Includes technical details and substitution tables. In English and Finnish.
Meta Description: [ A review of the HTML authoring problems caused by some special characters which belong to MS Windows character set but not to ISO Latin 1, such as em dash, trademark symbol, and asymmetric quote characters. ]

ScientificPublications.com: Czyborra.com Mirror - Mirror of Roman Czyborra's work on character sets and encoding systems. In English and German.
Meta Description: [ In the area of scripts, Unix is, despite a variety of high-quality complex mechanisms ]

Tutorial: Shady Characters - A tutorial that explains HTML character sets, character encodings and character references from Webreference.com.

WhatAsciiCode.com - Quick reference and searchable ASCII code and conversion tables.
Meta Description: [ Contains ascii code tables and information. ]

World Wide Web Consortium - Covers code tables, Unicode, HTML and XML and links to other resources and discusses internationalization and localization issues relating to character sets.
Meta Description: [ How to declare the character encoding of a document in XML or HTML, and various useful links to related information. ]

Xceed Binary Encoding Library - A library for Windows developers that allows applications to encode binary data and files into text and vice versa.
Meta Description: [ Xceed Binary Encoding Library (ActiveX, .NET and COM component) - Xceed Software provides a comprehensive suite of software component products designed to greatly simplify the task of adding specific capabilities to Windows or web applications ]
