Let's consider a [[Markdown]] file[^1] whose contents consist entirely of:

```markdown
Hello world!
-Aaron-
```

This file is 20 characters long. Because the file uses [[UTF-8]] encoding, and every one of these characters is in the original [[ASCII Art|ASCII]] table, each character takes exactly one byte, so the file is also 20 bytes long.

| Count | Character  | Decimal Position in ASCII | Hexadecimal | [[Binary]] |
| ----- | ---------- | ------------------------- | ----------- | ---------- |
| 1     | H          | 72                        | 48          | 01001000   |
| 2     | e          | 101                       | 65          | 01100101   |
| 3     | l          | 108                       | 6c          | 01101100   |
| 4     | l          | 108                       | 6c          | 01101100   |
| 5     | o          | 111                       | 6f          | 01101111   |
| 6     | (space)    | 32                        | 20          | 00100000   |
| 7     | w          | 119                       | 77          | 01110111   |
| 8     | o          | 111                       | 6f          | 01101111   |
| 9     | r          | 114                       | 72          | 01110010   |
| 10    | l          | 108                       | 6c          | 01101100   |
| 11    | d          | 100                       | 64          | 01100100   |
| 12    | !          | 33                        | 21          | 00100001   |
| 13    | (linefeed) | 10                        | 0a          | 00001010   |
| 14    | -          | 45                        | 2d          | 00101101   |
| 15    | A          | 65                        | 41          | 01000001   |
| 16    | a          | 97                        | 61          | 01100001   |
| 17    | r          | 114                       | 72          | 01110010   |
| 18    | o          | 111                       | 6f          | 01101111   |
| 19    | n          | 110                       | 6e          | 01101110   |
| 20    | -          | 45                        | 2d          | 00101101   |

So the whole file[^2] is:

```plaintext
01001000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100 00100001 00001010 00101101 01000001 01100001 01110010 01101111 01101110 00101101
```

You can see the individual bytes of a file on Mac using the `xxd` command:

```bash
xxd examplefile.txt
```

> Fun fact: the `linefeed` marks the end of the line all on its own because I'm running a non-Windows system. Windows-based systems traditionally use two characters for this: a carriage return (decimal 13, hex 0d) followed by a linefeed.

## Unicode & UTF-8

Unicode is a standard maintained by the Unicode Consortium and kept in sync with the [[ISO]] standard ISO/IEC 10646, the "Universal Coded Character Set". It is essentially an ordered list of characters. It says "this character exists at this index number" (that index number is called a code point). UTF-8 is an encoding method that says "this sequence of bytes maps to one of those index numbers this way", using one to four bytes per character.
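
To make the difference between "index number" and "bytes" a bit more concrete, here is a minimal Python sketch. It rebuilds the table above from the example file's contents, then encodes a few characters from outside ASCII. The sample characters (`é`, `€`, `😀`) are my own picks for illustration, not anything from the sources.

```python
# The example file's contents: 20 characters, all of them within ASCII.
text = "Hello world!\n-Aaron-"
data = text.encode("utf-8")

# For ASCII-only text, character count and byte count match.
print(len(text), len(data))

# Decimal, hexadecimal, and binary for each byte, like the table above.
for count, byte in enumerate(data, start=1):
    print(count, byte, f"{byte:02x}", f"{byte:08b}")

# Outside ASCII, one Unicode index number (code point) needs several bytes.
# These sample characters are illustrative assumptions.
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", len(encoded), encoded.hex(" "))
```

The first loop should line up with the table row by row. The second loop shows why "characters" and "bytes" stop being interchangeable as soon as the text leaves ASCII: the same list of index numbers is still in play, but UTF-8 spends more bytes on the higher ones.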
****

# More

## Source

- Some self, for some
- https://en.wikipedia.org/wiki/Character_encoding
- https://www.youtube.com/watch?v=gd5uJ7Nlvvo&pp=2AbIGtIHCQmHCgGHKiGM7w%3D%3D

[^1]: Like this one you're reading, if you were reading this on your computer and not the published version.
[^2]: Technically, because of [[Mac File Storage|APFS]], the file takes up those 20 bytes plus the reserved space for the rest of the 4 KB block they're stored in.