In computer science and information theory, data compression or source coding is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use through use of specific encoding schemes. For example, this article could be encoded with fewer bits if we accept the convention that the word "compression" be encoded as "comp".
One popular instance of compression that many computer users are familiar with is the ZIP file format, which, as well as providing compression, acts as an archiver, storing many files in a single output file.
As is the case with any form of communication, compressed data communication only works when both the sender and receiver of the information understand the encoding scheme. For example, this text makes sense only if the receiver understands that it is intended to be interpreted as characters representing the English language. Similarly, compressed data can only be understood if the decoding method is known by the receiver. Some compression algorithms exploit this property in order to encrypt data during the compression process so that decompression can only be achieved by an authorized party (eg. through the use of a password).
Compression is possible because most real-world data has statistical redundancy. For example, the letter 'e' is much more common in English text than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is rather small. Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely, but nevertheless perfectly.