Binary encode and decode


Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. Each Base64 digit represents exactly 6 bits of data.

Three 8-bit bytes (i.e., a total of 24 bits) can therefore be represented by four 6-bit Base64 digits. The particular set of 64 characters chosen to represent the 64 place-values for the base varies between implementations. The general strategy is to choose 64 characters that are both members of a subset common to most encodings, and also printable. This combination leaves the data unlikely to be modified in transit through information systems, such as email, that were traditionally not 8-bit clean. Other variations share this property but differ in the symbols chosen for the last two values; an example is UTF-7. Some older encodings made different choices: uuencode, for instance, uses uppercase letters, digits, and many punctuation characters, but no lowercase.

The example below uses ASCII text for simplicity, but this is not a typical use case, as ASCII text can already be safely transferred across all systems that can handle Base64. The more typical use is to encode binary data, such as an image; the resulting Base64 data will only contain 64 different ASCII characters, all of which can reliably be transferred across systems that may corrupt the raw source bytes.

The example is a quote from Thomas Hobbes' Leviathan (be aware of spaces between lines). In that quote, the encoded value of Man is TWFu. Encoded in ASCII, the characters M, a, and n are stored as the bytes 77, 97, and 110, which are the 8-bit binary values 01001101, 01100001, and 01101110. These three values are joined together into a 24-bit string, producing 010011010110000101101110. Groups of 6 bits are then read from left to right (010011, 010110, 000101, 101110, i.e. 19, 22, 5, 46) and converted into the corresponding Base64 characters T, W, F, and u. As this example illustrates, Base64 encoding converts three octets into four encoded characters.
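The arithmetic for Man can be checked with a short Python sketch. The helper name `encode_triplet` is made up for illustration, and the alphabet is assumed to be the standard one (A-Z, a-z, 0-9, then + and /).

```python
import string

# Standard Base64 alphabet, assumed here: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_triplet(three_bytes: bytes) -> str:
    """Encode exactly three octets into four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    # Join the three 8-bit values into one 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Read the 24 bits as four 6-bit groups, most significant first.
    indices = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

bit_string = "".join(f"{byte:08b}" for byte in b"Man")
print(list(b"Man"))            # [77, 97, 110]
print(bit_string)              # 010011010110000101101110
print(encode_triplet(b"Man"))  # TWFu
```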

When the last input group contains only one octet, the four least significant bits of the last content-bearing 6-bit block will turn out to be zero. And when the last input group contains two octets, the two least significant bits of the last content-bearing 6-bit block will turn out to be zero.
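One way to observe these zero bits is to look at the last non-padding digit produced by Python's standard `base64` module. The helper `last_digit_bits` is a hypothetical name used only for this sketch.

```python
import base64
import string

# Standard Base64 alphabet, assumed here: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def last_digit_bits(data: bytes) -> str:
    """Return the 6-bit pattern of the last content-bearing Base64 digit."""
    encoded = base64.b64encode(data).decode("ascii")
    last_digit = encoded.rstrip("=")[-1]
    return f"{ALPHABET.index(last_digit):06b}"

# One octet in the last group: the four least significant bits are zero.
print(base64.b64encode(b"M").decode("ascii"), last_digit_bits(b"M"))    # TQ== 010000
# Two octets in the last group: the two least significant bits are zero.
print(base64.b64encode(b"Ma").decode("ascii"), last_digit_bits(b"Ma"))  # TWE= 000100
```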

Truncating the input of the quote changes the output padding. The same characters will be encoded differently depending on their position within the three-octet group which is encoded to produce the four characters. The ratio of output bytes to input bytes is 4:3. In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits.
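The effect of truncation and of position can be tried out with short inputs rather than the full quote. This sketch uses Python's standard `base64` module; `encoded_length` is an illustrative helper that assumes padded output, which is then 4 characters for every started group of 3 input bytes.

```python
import base64

def encoded_length(n: int) -> int:
    """Padded Base64 output length for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

# Truncating the input changes the padding.
for text in (b"Man", b"Ma", b"M"):
    encoded = base64.b64encode(text).decode("ascii")
    print(text, "->", encoded, "length", len(encoded))

# The same characters encode differently when shifted by one position.
print(base64.b64encode(b"Man").decode("ascii"))   # TWFu
print(base64.b64encode(b" Man").decode("ascii"))  # IE1hbg==
```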

In some implementations, the padding character is mandatory, while for others it is not used. One case in which padding characters are required is concatenating multiple Base64 encoded files.

When decoding Base64 text, four characters are typically converted back to three bytes. The only exceptions are when padding characters exist. Without padding, after normal decoding of four characters to three bytes over and over again, fewer than four encoded characters may remain. In this situation only two or three characters can remain.

A single remaining encoded character is not possible because a single Base64 character only contains 6 bits, and 8 bits are required to create a byte, so a minimum of two Base64 characters are required: the first contributes 6 bits and the second contributes its first 2 bits.

Implementations may have some constraints on the alphabet used for representing some bit patterns.
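A decoder for unpadded input can be sketched by restoring the implied padding before calling Python's standard `base64.b64decode`. The function name `decode_unpadded` is hypothetical, and the sketch assumes the standard alphabet.

```python
import base64

def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding may have been omitted."""
    if len(text) % 4 == 1:
        # One leftover character carries only 6 bits, not enough for a byte.
        raise ValueError("a single trailing Base64 character cannot be decoded")
    # Two leftover characters imply "==", three imply "=".
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)

print(decode_unpadded("TWFu"))  # b'Man'
print(decode_unpadded("TWE"))   # b'Ma'
print(decode_unpadded("TQ"))    # b'M'
```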

This notably concerns the last two characters used in the index table for index 62 and 63, and the character used for padding, which may be mandatory in some protocols or removed in others.

PEM defines a "printable encoding" scheme that uses Base64 encoding to transform an arbitrary sequence of octets to a format that can be expressed in short lines of 6-bit characters, as required by transfer protocols such as SMTP.

To convert data to PEM printable encoding, the first byte is placed in the most significant eight bits of a 24-bit buffer, the next in the middle eight, and the third in the least significant eight bits. If there are fewer than three bytes left to encode (or in total), the remaining buffer bits will be zero. The buffer is then used, six bits at a time, most significant first, as indices into the string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", and the indicated character is output. The process is repeated on the remaining data until fewer than four octets remain.

If three octets remain, they are processed normally. If fewer than three octets (24 bits) are remaining to encode, the input data is right-padded with zero bits to form an integral multiple of six bits. After the non-padded data is encoded, one "=" character is appended if one octet of the 24-bit buffer was zero padding, and two "=" characters are appended if two octets were. This signals the decoder that the zero bits added due to padding should be excluded from the reconstructed data. This also guarantees that the encoded output length is a multiple of 4 bytes. PEM requires that all encoded lines consist of exactly 64 printable characters, with the exception of the last line, which may contain fewer printable characters.
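The buffer procedure described above can be sketched in Python as follows. This is a minimal illustration rather than a PEM implementation: the function name `encode_with_buffer` is made up, it does no line wrapping, and the final check against the standard `base64` module is there only as a sanity test.

```python
import base64
import string

# Standard Base64 alphabet, assumed here: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_with_buffer(data: bytes) -> str:
    """Encode bytes via a 24-bit buffer, padding the final group with '='."""
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        # First byte goes in the most significant eight bits; any missing
        # bytes leave the remaining buffer bits at zero.
        buffer = int.from_bytes(chunk.ljust(3, b"\x00"), "big")
        # Use the buffer six bits at a time, most significant first.
        digits = [ALPHABET[(buffer >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # n input bytes yield n + 1 content-bearing digits; the rest is padding.
        keep = len(chunk) + 1
        out.append("".join(digits[:keep]) + "=" * (4 - keep))
    return "".join(out)

for sample in (b"Man", b"Ma", b"M", b""):
    encoded = encode_with_buffer(sample)
    print(sample, "->", encoded, encoded == base64.b64encode(sample).decode("ascii"))
```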

Lines are delimited by whitespace characters according to local platform-specific conventions. MIME does not specify a fixed length for Base64-encoded lines, but it does specify a maximum line length of 76 characters.
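The two line-length rules can be compared with a small sketch. `wrap_base64` is a hypothetical helper that only splits the encoded text into fixed-width lines; it does not add the headers or delimiters that PEM or MIME messages carry.

```python
import base64

def wrap_base64(data: bytes, width: int) -> list[str]:
    """Base64-encode data and split the result into lines of at most width characters."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]

payload = bytes(range(100))  # 100 input bytes encode to 136 characters

pem_style = wrap_base64(payload, 64)   # every line but the last has exactly 64
mime_style = wrap_base64(payload, 76)  # no line exceeds 76

print([len(line) for line in pem_style])   # [64, 64, 8]
print([len(line) for line in mime_style])  # [76, 60]
```

Python's `base64.encodebytes` also emits lines of at most 76 characters, each terminated by a newline, which can be convenient for MIME-style output.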

Very roughly, the final size of Base64-encoded binary data is equal to 1.37 times the original data size, plus 814 bytes for headers. The size of the decoded data can be approximated with this formula: bytes = (string_length(encoded_string) - 814) / 1.37.

Modified Base64, as used by UTF-7, simply omits the padding and ends immediately after the last Base64 digit containing useful bits, leaving up to three unused bits in the last Base64 digit.

Unless implementations are written to a specification that refers to RFC 3548 and specifically requires otherwise, RFC 3548 forbids implementations from generating messages containing characters outside the encoding alphabet or without padding, and it also declares that decoder implementations must reject data that contain characters outside the encoding alphabet.
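As an illustration of the difference between strict and lenient decoders, Python's standard `base64.b64decode` discards characters outside the alphabet by default and rejects them only when `validate=True` is passed. The wrapper name `strict_decode` is hypothetical, and the sketch does not claim to cover every requirement of the RFC.

```python
import base64
import binascii

def strict_decode(text: str) -> bytes:
    """Decode Base64, rejecting any character outside the standard alphabet."""
    return base64.b64decode(text, validate=True)

print(strict_decode("TWFu"))  # b'Man'

# The default (lenient) mode silently drops the stray '!'.
print(base64.b64decode("TW!Fu"))  # b'Man'

try:
    strict_decode("TW!Fu")
except binascii.Error as error:
    print("rejected:", error)
```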

Base64 encoding can be helpful when fairly lengthy identifying information is used in an HTTP environment. Also, many applications need to encode binary data in a way that is convenient for inclusion in URLs, including in hidden web form fields, and Base64 is a convenient encoding to render them in a compact way.
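For URL use, one common approach is an alphabet variant that avoids the + and / characters. The sketch below uses Python's `base64.urlsafe_b64encode`, which substitutes - and _ for index 62 and 63; the three-byte `token` is an arbitrary value chosen so that both indices appear.

```python
import base64

# Bytes chosen so the output uses index 62 and index 63.
token = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(token).decode("ascii")
url_safe = base64.urlsafe_b64encode(token).decode("ascii")

print(standard)  # +//+
print(url_safe)  # -__-

# Both forms decode back to the same bytes with the matching decoder.
print(base64.urlsafe_b64decode(url_safe) == token)  # True
```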

XML identifiers and name tokens are encoded using two variants.

The atob and btoa JavaScript methods, defined in the HTML5 draft specification,[10] provide Base64 encoding and decoding functionality to web pages.

The btoa method outputs padding characters, but these are optional in the input of the atob method.

From Wikipedia, the free encyclopedia.

Message Encryption and Authentication Procedures.
Format of Internet Message Bodies.
World Wide Web Consortium.

This page was last edited on 3 April.