keyboards: apple keyboard | boot camp | ios keyboard
character sets and encodings: ascii | control characters | ansi escapes | unicode
operating systems and applications: mac | windows | editors | emacs | html
non-ascii characters: english punctuation | latin accent | polytonic greek | mathematics | keyboard notation
Notes on how I use my keyboard
Apple Keyboard
I use the keyboards that come with my laptops. I don't plug in an external keyboard.
I use Apple laptops. The keyboards are almost identical to the keyboard pictured above. The differences are that there are brightness controls for the keyboard at F5 and F6 and the eject key has been replaced by a power key.
Boot Camp
I run Windows on an Apple laptop using Boot Camp. Here are the Boot Camp mappings:
PC key | AK key |
---|---|
Cmd (Ctrl+Esc) | ⌘ |
Backspace | delete |
Enter | return |
Alt | option |
AltGr (Ctrl+Alt) | ^⌥ |
Pause/Break | fn esc |
Insert | fn return |
Delete | fn delete |
Home | fn← |
End | fn→ |
PgUp | fn↑ |
PgDn | fn↓ |
Num Lock | fn F6 |
Print Screen | fn⇧F11 |
Print Active Window | fn⇧⌥F11 |
Scroll Lock | fn⇧F12 |
iOS Keyboard
The iOS software keyboard has three panels. They cover all of the printing ASCII characters except for the backquote. The backquote can be generated by holding down on the apostrophe and selecting from a pop-up list. Some of the other keys have pop-up lists as well.
These pop-ups were added to Mac OS X 10.7. They are available in some applications (Chrome). Others will repeat instead (Terminal.app).
letter panel pop-ups | |
---|---|
e | e è é ê ë ē ė ę |
y | ÿ y |
u | ū ú ù ü û u |
i | ì į ī í ï î i |
o | õ ō ø œ ó ò ö ô o |
a | a à á â ä æ ã å ā |
s | s ß ś š |
l | ł l |
z | z ž ź ż |
c | c ç ć č |
n | ń ñ n |
number and punctuation panel pop-ups | |
0 | ° 0 |
- | - – — • |
$ | ¥ € $ ¢ £ ₩ |
& | & § |
" | « » „ “ ” " |
. | . … |
? | ? ¿ |
! | ! ¡ |
' | ` ‘ ’ ' |
% | % ‰ |
I've successfully cut and pasted other Unicode characters on iOS using the Unicode Consortium Code Charts.
One can add other keyboards to iOS. Use Settings | General | Keyboard | Keyboards. This is also where to remove the Emoji keyboard. I added the Greek keyboard in addition to a U.S. keyboard. When multiple keyboards are chosen in Settings, there is a globe key on the keyboard to switch between them.
I turn off Auto-Capitalization and Auto-Correction.
- Auto-Capitalization
- Auto-Correction
- Check Spelling
- Enable Caps Lock
- "." Shortcut
Caps Lock is effected by double tapping the shift key. The "." shortcut is effected by double tapping the space bar. It inserts a period followed by a space.
There is an iOS app called Prompt which serves as an ssh client. The keyboard has an extra row on the top which is always visible. These are the keys:
ESC CTRL TAB / - | @ ↑ ↓ ← →
By holding the ESC key, one gets a pop-up with META on it.
Holding the arrow keys causes them to repeat.
The keys in the middle (/ - | @) can be changed by pressing and holding them.
Character Sets and Encodings
ASCII is a set of 128 printing and control characters. 8-bit ASCII is a way to represent ASCII characters with bytes. In 8-bit ASCII the most significant bit in a byte is zero. A string which is valid 8-bit ASCII is also valid UTF-8. See man iconv for instructions on how to convert a string in a different encoding to UTF-8.
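Here is a minimal Python sketch of the two claims above: that a byte string with every high bit clear is valid UTF-8, and that other encodings can be converted in the manner of iconv. The function names are my own and are not part of any library.

```python
def is_ascii(data: bytes) -> bool:
    """True if the most significant bit of every byte is zero."""
    return all(byte < 0x80 for byte in data)


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes as UTF-8, roughly what `iconv -f ENC -t UTF-8` does."""
    return data.decode(source_encoding).encode("utf-8")


sample = b"plain ASCII text"
assert is_ascii(sample)
# A valid 8-bit ASCII string decodes to the same text as UTF-8.
assert sample.decode("ascii") == sample.decode("utf-8")
# The ISO-8859-1 byte 0xE9 (e with acute) becomes two bytes in UTF-8.
print(to_utf8(b"caf\xe9", "iso-8859-1"))
```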
Printing characters are characters which when typed render a character. Control characters are characters which when typed instruct a device, operating system, or application to perform an action other than render a character. Few applications or devices support all control characters, so in practice many control characters are ignored or cause an error.
ASCII
zone | range | chars | range | chars | range | chars |
---|---|---|---|---|---|---|
numeric | 32-47 | SPACE ! " # $ % & ' ( ) * + , - . / | 48-57 | 0-9 | 58-63 | : ; < = > ? |
uppercase | 64 | @ | 65-90 | A-Z | 91-95 | [ \ ] ^ _ |
lowercase | 96 | ` | 97-122 | a-z | 123-126 | { | } ~ |
Most American keyboards since the DEC vt100 (1978) and the IBM PC (1981) have had all the non-lowercase printing ASCII characters explicitly depicted. Furthermore the characters have been on the same keys since the Mac (1984) and the IBM Model M (1985).
Keys for control characters are more complicated. Broadly, keyboards since the DEC vt100 have dedicated keys for TAB, ESC, LF, and DEL, with a control modifier key which can be used to enter the other control characters. One adds 64 modulo 128 to the control character value to get the printing character whose key is used in conjunction with the control modifier key.
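The add-64-modulo-128 rule can be sketched in a few lines of Python. The function names are illustrative only.

```python
def control_key(code: int) -> str:
    """Return the printing character typed with the control modifier
    to produce the control character with the given ASCII value."""
    if not (0 <= code < 32 or code == 127):
        raise ValueError(f"{code} is not an ASCII control character")
    return chr((code + 64) % 128)


def caret_notation(code: int) -> str:
    """Caret notation for a control character, e.g. 27 -> '^['."""
    return "^" + control_key(code)


for code in (0, 9, 27, 127):
    print(code, caret_notation(code))
```

DEL (127) is the case that needs the modulo: 127 + 64 = 191, and 191 modulo 128 is 63, the question mark.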
On DOS and Windows, the Enter key maps to a CR LF character sequence. The IBM PC had both a backspace (BS) and delete (DEL) key. The original ASCII interpretation for these characters was that BS was to be used to backup and overstrike, whereas DEL was used to remove the previous character. In IBM PC usage BS removes the previous character and DEL removes the following character.
On the original Mac there was no ESC or control key. There was, however, both a Return and an Enter key which mapped to CR and LF respectively. CR was used as newline on the original Mac OS, and LF sometimes had an application specific interpretation. Escape and control keys were added to the Mac by 1986.
Control Characters
Sometimes it is necessary to use printing characters to refer to control characters. There are several schemes for doing this. These schemes are ambiguous and rely on the reader to determine whether control characters or printing characters are the intended meaning. The space character is a printing character, but it is useful to have separate notation for it as if it were a control character. The ASCII standard provides codes of two to three printing characters for all of the control characters. The Unicode standard has special characters starting at U+2400 which combine the ASCII codes into a single character.
ASCII | Unix | Emacs | Microsoft | C string |
---|---|---|---|---|
NUL | ^@ | C-@ | | \000 |
SOH | ^A | C-a | CTRL+A | \001 |
STX | ^B | C-b | CTRL+B | \002 |
ETX | ^C | C-c | CTRL+C | \003 |
EOT | ^D | C-d | CTRL+D | \004 |
ENQ | ^E | C-e | CTRL+E | \005 |
ACK | ^F | C-f | CTRL+F | \006 |
BEL | ^G | C-g | CTRL+G | \a |
BS | ^H | C-h | BACKSPACE (CTRL+H) | \b |
TAB | ^I | TAB (C-i) | TAB (CTRL+I) | \t |
LF | ^J | C-j | ENTER (CTRL+J) | \n |
VT | ^K | C-k | CTRL+K | \v |
FF | ^L | C-l | CTRL+L | \f |
CR | ^M | RET (C-m) | CTRL+M | \r |
SO | ^N | C-n | CTRL+N | \016 |
SI | ^O | C-o | CTRL+O | \017 |
DLE | ^P | C-p | CTRL+P | \020 |
DC1 (XON) | ^Q | C-q | CTRL+Q | \021 |
DC2 | ^R | C-r | CTRL+R | \022 |
DC3 (XOFF) | ^S | C-s | CTRL+S | \023 |
DC4 | ^T | C-t | CTRL+T | \024 |
NAK | ^U | C-u | CTRL+U | \025 |
SYN | ^V | C-v | CTRL+V | \026 |
ETB | ^W | C-w | CTRL+W | \027 |
CAN | ^X | C-x | CTRL+X | \030 |
EM | ^Y | C-y | CTRL+Y | \031 |
SUB | ^Z | C-z | CTRL+Z | \032 |
ESC | ^[ | ESC (C-[) | ESC | \033 |
FS | ^\ | C-\ | | \034 |
GS | ^] | C-] | | \035 |
RS | ^^ | C-^ | | \036 |
US | ^_ | C-_ | | \037 |
SP | | SPC | SPACEBAR | |
DEL | ^? | DEL | DELETE | \177 |
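The Unicode control pictures starting at U+2400 give one more way to make control characters visible. Below is a small Python sketch; the helper names are my own. It relies on the block layout in which the picture for control character n is at U+2400 + n, with the symbol for space at U+2420 and the symbol for delete at U+2421.

```python
def control_picture(code: int) -> str:
    """Return the Unicode control picture for an ASCII control code or space."""
    if 0 <= code <= 32:
        return chr(0x2400 + code)  # U+2400..U+241F, plus U+2420 for space
    if code == 127:
        return "\u2421"  # SYMBOL FOR DELETE
    raise ValueError(f"{code} has no control picture")


def visualize(text: str) -> str:
    """Replace control characters and spaces with their control pictures."""
    return "".join(
        control_picture(ord(ch)) if ord(ch) <= 32 or ord(ch) == 127 else ch
        for ch in text
    )


print(visualize("a\tb c\n"))
```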
"Unix" notation predates Unix, since it was used in ITS operating system documentation in 1969. The Unix notation is used by Mac OS X and we regard it as the preferred notation. We use Emacs notation in the context of text editors and Microsoft notation in the context of Windows, however.
ANSI Escapes
ECMA-48 (pdf)
escape sequences for cursor movement and screen clearing | |
---|---|
sequence | rendering |
ESC [ A, ESC [ n A | move cursor up one or n rows |
ESC [ B, ESC [ n B | move cursor down one or n rows |
ESC [ C, ESC [ n C | move cursor forward one or n columns |
ESC [ D, ESC [ n D | move cursor back one or n columns |
ESC [ E, ESC [ n E | move cursor to beginning of line one or n rows down |
ESC [ F, ESC [ n F | move cursor to beginning of line one or n rows up |
ESC [ n G | move cursor to column n |
ESC [ n ; m H | move cursor to row n, column m |
ESC [ 0 J | clear screen from cursor to end |
ESC [ 1 J | clear screen from cursor to beginning |
ESC [ 2 J | clear entire screen |
ESC ] 0 ; name ^G | set terminal name |
To create an ESC when editing with emacs, type C-q ESC. To create an ESC when editing with vim, type Ctrl-V ESC in insert mode.
sequence | rendering |
---|---|
ESC [ 0 m, ESC [ m | normal |
ESC [ 1 m | bold |
ESC [ 3 m | italic |
ESC [ 4 m | underlined |
ESC [ 7 m | reverse video |
ESC [ 30 m | black text color |
ESC [ 31 m | red text color |
ESC [ 32 m | green text color |
ESC [ 33 m | yellow text color |
ESC [ 34 m | blue text color |
ESC [ 35 m | magenta text color |
ESC [ 36 m | cyan text color |
ESC [ 37 m | white text color |
ESC [ 40 m | black background color |
ESC [ 41 m | red background color |
ESC [ 42 m | green background color |
ESC [ 43 m | yellow background color |
ESC [ 44 m | blue background color |
ESC [ 45 m | magenta background color |
ESC [ 46 m | cyan background color |
ESC [ 47 m | white background color |
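A short Python sketch of how a program might build these sequences. The helper names (csi, move_to, sgr, styled) are hypothetical. Joining several attribute codes with semicolons in one sequence is standard ECMA-48 behavior, though not every terminal honors every attribute.

```python
ESC = "\x1b"


def csi(*params: int, final: str) -> str:
    """Build ESC [ params final, with parameters separated by semicolons."""
    return f"{ESC}[{';'.join(str(p) for p in params)}{final}"


def move_to(row: int, column: int) -> str:
    """Move the cursor to the given row and column (both 1-based)."""
    return csi(row, column, final="H")


def sgr(*codes: int) -> str:
    """Select graphic rendition, e.g. sgr(1, 31) for bold red text."""
    return csi(*codes, final="m")


def styled(text: str, *codes: int) -> str:
    """Wrap text in the given attributes and reset to normal afterwards."""
    return sgr(*codes) + text + sgr(0)


print(styled("error", 1, 31) + " and " + styled("note", 4))
```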
Unicode
Unicode characters have a Unicode point which is a number between 0 and 1114111 inclusive. Unicode characters with a point between 0 and 65535 are said to be in the Basic Multilingual Plane (BMP). Unicode points are usually written with hex notation, in which case the BMP points range from U+0000 to U+FFFF. In this notation the highest Unicode point is U+10FFFF.
To enter a Unicode character by point on Mac, switch to Unicode Hex Input. Hold down the option key and type in the four hex digit representation. How to enter characters outside the BMP?
Because it relies on the option key, Unicode Hex Input does not work with Emacs or Terminal with Use option as meta key set.
In Emacs use the keybinding C-x 8 RET or M-x ucs-insert to insert a Unicode character by point.
On Windows, use WordPad. Type the hex value for a point, then Alt+X, and the hex value will be converted to the character. Then copy-and-paste the text to another application.
Unicode 6.2 has 110,182 encoded characters, 137,468 code points reserved for private use, 2,048 code points for surrogates, and 66 code points for non-characters. Non-characters in the BMP are U+FFFE and U+FFFF and the range U+FDD0..U+FDEF.
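The surrogate, private use, and non-character counts can be checked with a few predicates. This is a sketch with function names of my own. It uses the Unicode standard's ranges: surrogates are U+D800..U+DFFF, private use is U+E000..U+F8FF plus planes 15 and 16 minus their last two code points, and non-characters are U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
MAX_POINT = 0x10FFFF


def is_bmp(point: int) -> bool:
    return 0 <= point <= 0xFFFF


def is_surrogate(point: int) -> bool:
    return 0xD800 <= point <= 0xDFFF


def is_noncharacter(point: int) -> bool:
    # U+FDD0..U+FDEF, plus U+nFFFE and U+nFFFF in every plane n.
    return 0xFDD0 <= point <= 0xFDEF or (point & 0xFFFE) == 0xFFFE


def is_private_use(point: int) -> bool:
    return (
        0xE000 <= point <= 0xF8FF
        or 0xF0000 <= point <= 0xFFFFD
        or 0x100000 <= point <= 0x10FFFD
    )


all_points = range(MAX_POINT + 1)
print(sum(map(is_surrogate, all_points)))
print(sum(map(is_private_use, all_points)))
print(sum(map(is_noncharacter, all_points)))
```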
The fields in UnicodeData.txt (ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt) are
- Point
- Name
- General_Category
- Canonical_Combining_Class
- Bidi_Class
- Decomposition_Type/Decomposition_Mapping
- Numeric_Type/Numeric_Value
- Bidi_Mirrored
- Unicode_1_Name (obsolete)
- ISO_Comment (obsolete)
- Simple_Uppercase_Mapping
- Simple_Lowercase_Mapping
- Simple_Titlecase_Mapping
If you need to handle combining characters or bidirectional text there are more properties to be aware of.
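Here is a sketch of reading one record from UnicodeData.txt in Python. Each line has 15 semicolon-separated fields; the numeric type and value occupy three of them (decimal, digit, numeric). The record class and its field names are my own, not an official API.

```python
from typing import NamedTuple


class UnicodeDataRecord(NamedTuple):
    point: int
    name: str
    general_category: str
    canonical_combining_class: int
    bidi_class: str
    decomposition: str
    numeric_decimal: str
    numeric_digit: str
    numeric_value: str
    bidi_mirrored: bool
    unicode_1_name: str
    iso_comment: str
    simple_uppercase: str
    simple_lowercase: str
    simple_titlecase: str


def parse_line(line: str) -> UnicodeDataRecord:
    """Parse one line of UnicodeData.txt into a record."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    return UnicodeDataRecord(
        int(fields[0], 16),   # code point is written in hex
        fields[1],
        fields[2],
        int(fields[3]),
        fields[4],
        fields[5],
        fields[6],
        fields[7],
        fields[8],
        fields[9] == "Y",     # Bidi_Mirrored is Y or N
        *fields[10:],
    )


record = parse_line("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
print(record.name, record.general_category, record.simple_lowercase)
```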
Operating Systems and Applications
Shortcuts I find useful:
Shortcuts by Operating System and Application
Mac
Custom key bindings and shortcuts hamper communication and cause disorientation when using other people's setups. That said, I map the caps lock key to the control key.
To map caps lock to control on Mac OS X go to:
System Preferences | Keyboard | Keyboard | Modifier Keys...
In Terminal.app on Mac OS X I check this checkbox:
Preferences... | Keyboard | Use option as meta key
This makes it easier to use Emacs in Terminal.app. The following meta keystrokes are useful for line-mode editing: M-b M-f M-d M-DEL M-l M-u. Also, less has the Emacs binding for M-v. However, using option as a meta key disables option shortcuts for non-ASCII characters. I would like a keystroke shortcut to toggle the Use option as meta key preference, but it doesn't appear to be exposed as a property to AppleScript.
I define the following custom shortcuts on Mac OS X:
keystroke | action | software |
---|---|---|
^⌥⌘M | maximize window | divvy |
^⌥⌘← | put window to left | divvy |
^⌥⌘→ | put window to right | divvy |
⌥⌘Space | switch input mode |
On Mac OS X I use these three input sources:
- ABC Extended
- Greek Polytonic
- LaTeX/APL
Differences between ABC Extended and U.S. International:
Windows
To map caps lock to control on Windows add the following to the Registry:
key | HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout |
---|---|
name | Scancode Map |
type | REG_BINARY |
data | 00,00,00,00,00,00,00,00,02,00,00,00,1d,00,3a,00,00,00,00,00 |
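The binary value is easier to read once its layout is known: two 4-byte zero header fields, a 4-byte count of entries including the terminator, one 4-byte entry per remapped key (the scancode to send, then the scancode of the physical key), and a 4-byte zero terminator, all little-endian. The following Python sketch rebuilds the value above. The function names are my own, and it only formats the bytes; it does not touch the Registry.

```python
import struct

LEFT_CTRL = 0x1D   # scancode to send
CAPS_LOCK = 0x3A   # scancode of the physical key


def scancode_map(mappings: list[tuple[int, int]]) -> bytes:
    """Build a Scancode Map value from (target, source) scancode pairs."""
    header = struct.pack("<II", 0, 0)                 # version and flags
    count = struct.pack("<I", len(mappings) + 1)      # entries plus terminator
    entries = b"".join(
        struct.pack("<HH", target, source) for target, source in mappings
    )
    terminator = struct.pack("<I", 0)
    return header + count + entries + terminator


def registry_hex(data: bytes) -> str:
    """Format bytes the way the table above shows them."""
    return ",".join(f"{byte:02x}" for byte in data)


print(registry_hex(scancode_map([(LEFT_CTRL, CAPS_LOCK)])))
```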
On Windows I use these three input languages and keyboards:
- ENG - US
- FRA - United States-International
- ΕΛ - Greek Polytonic
I run Windows under Boot Camp, but I don't use the Apple keyboards. Use Shift+Space to change the input method. The United States-International input uses combining keystrokes:
input | character |
---|---|
'a | á |
'c | ç |
`a | à |
"a | ä |
^a | â |
~a | ã |
The best way to get a combining character as a standalone seems to be to press it twice and then delete one.
The United States-International input also uses the Right-Alt key as a special modifier key:
Right-Alt+s | ß |
Right-Alt+1 | ¡ |
Right-Alt+/ | ¿ |
There are others, but I'm not aware of a way to preview them.
Editors
Emacs
I use the following custom key bindings in Emacs:
keystroke | binding | default binding |
---|---|---|
C-c b | M-x revert-buffer | |
C-c c | M-x clipboard-yank | |
C-c d | M-x ido-dired | |
C-c f | (defun display-buffer-file-name () (interactive) (message buffer-file-name)) | |
C-c r | M-x query-replace | |
C-c v | M-x clipboard-kill-ring-save | |
C-c x | M-x clipboard-kill-region | |
C-x b | M-x ido-switch-buffer | M-x switch-to-buffer |
C-x C-b | M-x ibuffer | M-x list-buffers |
C-x C-f | M-x ido-find-file | M-x find-file |
C-x C-i | M-x ido-insert-file | M-x insert-file |
C-x C-w | M-x ido-write-file | M-x write-file |
The Windows keybindings for copy Ctrl+C, paste Ctrl+V, and cut Ctrl+X conflict with the Emacs bindings. Hence the custom bindings C-c c, C-c v, and C-c x.
Use C-\ or C-x RET C-\ to enable or disable the Emacs input method. One can use M-x list-input-methods to see all the available input methods. I use these input methods:
- latex
- latin-prefix
- rfc1345
- greek
If an input method is currently enabled, use C-h I to see the documentation for it.
When I run Emacs on Mac, here are how the modifier keys are set:
variable | setting |
---|---|
mac-control-modifier | control |
mac-right-control-modifier | left |
mac-option-modifier | meta |
mac-right-option-modifier | left |
mac-command-modifier | super |
mac-right-command-modifier | left |
mac-function-modifier | none |
Because the option key is bound to meta, it is not possible to use the option key to enter Latin accent characters in the customary Mac manner. I never use the right option key as a meta key, however. Setting mac-right-option-modifier to nil means that the right option key can be used to enter Latin accent characters. The other option is to use the latin-prefix input method.
The input method rfc1345 can be used to put macrons on the vowels of Latin text. RFC 1345 is a scheme for representing a variety of non-ASCII characters, including Latin accents, Greek, Cyrillic, Hebrew, Arabic, Hiragana and Katakana, using two character ASCII sequences.
One can use greek for monotonic Greek and greek-babel for polytonic Greek. greek-babel doesn't use the standard Greek keyboard layout, however. θ is bound to J instead of U, for example. Hence I prefer to use the Mac or Windows input method.
Use the keybinding C-x 8 RET or M-x ucs-insert to insert a Unicode character by name or by point in hexadecimal.
HTML
We are told that an HTML document should always declare its character encoding. In HTML4 a document which does not permit deprecated tags starts with this:
<!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
In HTML5 a document starts with this:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
HTML document character encodings are seemingly paradoxical, since the reader must already know the character encoding to read the document in the first place. Some readers may try decoding an HTML document with multiple encodings. Such readers won't decode the entire document, so the declaration is supposed to be in the first 1024 bytes.
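The sniffing step can be sketched as follows: look only at the first 1024 bytes, which are readable as ASCII in most encodings, and search for a charset declaration. This is a simplified illustration with a function name of my own. Real browsers follow a more elaborate algorithm that also considers byte-order marks and HTTP headers.

```python
import re

# Matches both <meta charset="..."> and the HTML4 http-equiv form.
META_CHARSET = re.compile(
    rb'''<meta[^>]*?charset\s*=\s*["']?([\w.:-]+)''', re.IGNORECASE
)


def sniff_meta_charset(raw: bytes):
    """Return the charset declared in the first 1024 bytes, or None."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else None


html5 = b'<!DOCTYPE html>\n<html>\n<head>\n<meta charset="UTF-8">'
html4 = b'<meta http-equiv="Content-type" content="text/html;charset=UTF-8">'
print(sniff_meta_charset(html5), sniff_meta_charset(html4))
```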
HTML 1.0 had the character entity references &lt;, &gt;, and &amp; since the characters <, >, and & are interpreted as markup. HTML 2.0 added character entity references for the 96 upper 8-bit ISO-8859-1 characters. In addition to alphabetical codes, the ISO-8859-1 numeric code could be used: &#NUM; The numeric codes are decimal. There are 252 character entity references in HTML4, and its numeric codes are Unicode points. HTML5 defines many more named character references.
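Python's standard html module is a convenient way to experiment with character references. The sketch below escapes the three markup characters and decodes named, decimal, and hexadecimal references.

```python
import html

# The characters interpreted as markup are replaced by entity references.
print(html.escape("a < b && c > d"))

# Named, decimal, and hexadecimal references all decode to the same character.
for reference in ("&eacute;", "&#233;", "&#xE9;"):
    print(reference, "->", html.unescape(reference))
```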
XML documents do not have to have their encoding declared if they are UTF-8 or UTF-16. UTF-16 must have a byte-order mark. Here is how to declare the encoding:
<?xml version="1.0" encoding="UTF-8"?>
The default character encoding for HTTP 1.1 is ISO-8859-1. Servers which return Unicode documents must declare the character encoding in the response header:
Content-Type: text/html; charset=utf-8
English Punctuation
Non-ASCII punctuation necessary to correctly typeset English.
To enter these characters in Emacs or Windows, memorize the hex Unicode points. In Emacs use C-x 8 RET POINT RET. On Windows, use WordPad. Type the hex value, then use Alt+X. Then cut-and-paste the text to another application.
Windows traditionally uses Alt and the numeric keypad to enter these characters, but this is not possible when running Windows under Boot Camp.
english punctuation | ||||
---|---|---|---|---|
chr | mac os x (us extended) | unicode name | unicode point | html entity |
‘ | ⌥] | LEFT SINGLE QUOTATION MARK | 2018 | &lsquo; |
’ | ⇧⌥] | RIGHT SINGLE QUOTATION MARK | 2019 | &rsquo; |
“ | ⌥[ | LEFT DOUBLE QUOTATION MARK | 201C | &ldquo; |
” | ⇧⌥[ | RIGHT DOUBLE QUOTATION MARK | 201D | &rdquo; |
– | ⌥- | EN DASH | 2013 | &ndash; |
— | ⇧⌥- | EM DASH | 2014 | &mdash; |
… | ⌥; | HORIZONTAL ELLIPSIS | 2026 | &hellip; |
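The name, point, and entity columns of a table like this one can be looked up in Python rather than memorized. This is a small sketch using the standard unicodedata and html.entities modules; the describe function is my own.

```python
import unicodedata
from html.entities import codepoint2name


def describe(ch: str) -> tuple[str, str, str]:
    """Return (Unicode name, hex point, HTML4 entity or '') for a character."""
    point = ord(ch)
    entity = f"&{codepoint2name[point]};" if point in codepoint2name else ""
    return unicodedata.name(ch), f"{point:04X}", entity


for ch in "\u2018\u2019\u201c\u201d\u2013\u2014\u2026":
    print(ch, *describe(ch), sep=" | ")
```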
The Chicago Manual of Style, 16th Ed.:
Published works should use directional (or “smart”) quotation marks, sometimes called typographer’s or “curly” quotation marks.
All software also includes a “default” mark ("); in published prose this unidirectional mark, far more portable than typographer's marks, nonetheless signals a lack of typographical sophistication. Proper directional characters should also be used for single quotation marks (‘’).
Published works should use directional (or “smart”) apostrophes.
The apostrophe is the same character as the right single quotation mark. Thanks to the limitations of conventional keyboards and many software programs, the apostrophe has been one of the most abused marks in punctuation—especially in the last generation or so. There are two common pitfalls: using the “default” unidirectional mark ('), on the one hand, and using the left single quotation mark, on the other. The latter usage in particular should always be construed as an error.
I prefer software which does not replace unidirectional marks with directional marks automatically. When using Google Docs, ⌘Z (Ctrl-Z on Windows) will sometimes undo a smart quotes conversion.
Hyphens are used in:
- compound words
- separators in telephone numbers, social security numbers, ISBNs, and spelled out words
En dashes are used in:
- numeric, time, and date ranges, including unfinished ranges
Em dashes are used to:
- set off an amplifying or explanatory clause
- mark the end of an incomplete sentence
In source code the hyphen is used as a minus sign. In typeset mathematics the minus sign is a distinct character.
The Chicago Manual of Style, 16th Ed.:
An ellipsis is the omission of a word, phrase, line, paragraph, or more from a quoted passage. . . . Chicago style is to indicate such omissions by the use of three spaced periods rather than by another device such as asterisks.
Latin Accent
My default input source on Mac OS X is U.S. Extended.
In Emacs I use the input method latin-prefix to enter Latin letters with accents.
On Windows I use United States-International.
The Emacs and Windows input methods use modifying prefixes. To enter the prefix character literally, type Space after the character.
For characters which do not have Emacs or Windows prefix sequences, use C-x 8 RET POINT RET (Emacs) or POINT Alt+X (WordPad).
To get the Mac keybindings which are available using the option key, use the Keyboard Viewer, which I have bound to ^⌥⌘K.
In Emacs, browse the current input method bindings with C-h I.
vowels | ||||||
---|---|---|---|---|---|---|
chr | mac os x (us extended) | emacs (latin-prefix) | windows (us-intl) | unicode name | unicode point | html entity |
á | ⌥e a | 'a | 'a | LATIN SMALL LETTER A WITH ACUTE | 00E1 | &aacute; |
Á | ⌥e A | 'A | 'A | LATIN CAPITAL LETTER A WITH ACUTE | 00C1 | &Aacute; |
é | ⌥e e | 'e | 'e | LATIN SMALL LETTER E WITH ACUTE | 00E9 | &eacute; |
É | ⌥e E | 'E | 'E | LATIN CAPITAL LETTER E WITH ACUTE | 00C9 | &Eacute; |
í | ⌥e i | 'i | 'i | | | &iacute; |
Í | ⌥e I | 'I | 'I | | | &Iacute; |
ó | ⌥e o | 'o | 'o | | | &oacute; |
Ó | ⌥e O | 'O | 'O | | | &Oacute; |
ú | ⌥e u | 'u | 'u | | | &uacute; |
Ú | ⌥e U | 'U | 'U | | | &Uacute; |
à | ⌥` a | `a | `a | | | &agrave; |
À | ⌥` A | `A | `A | | | &Agrave; |
è | ⌥` e | `e | `e | | | &egrave; |
È | ⌥` E | `E | `E | | | &Egrave; |
ù | ⌥` u | `u | `u | | | &ugrave; |
Ù | ⌥` U | `U | `U | | | &Ugrave; |
â | ⌥^ a | ^a | ^a | | | &acirc; |
 | ⌥^ A | ^A | ^A | | | &Acirc; |
ê | ⌥^ e | ^e | ^e | | | &ecirc; |
Ê | ⌥^ E | ^E | ^E | | | &Ecirc; |
î | ⌥^ i | ^i | ^i | | | &icirc; |
Î | ⌥^ I | ^I | ^I | | | &Icirc; |
ô | ⌥^ o | ^o | ^o | | | &ocirc; |
Ô | ⌥^ O | ^O | ^O | | | &Ocirc; |
û | ⌥^ u | ^u | ^u | | | &ucirc; |
Û | ⌥^ U | ^U | ^U | | | &Ucirc; |
œ | ⌥q | /o2 | | LATIN SMALL LIGATURE OE | 0153 | &oelig; |
Œ | ⌥Q | /O2 | | LATIN CAPITAL LIGATURE OE | 0152 | &OElig; |
ä | ⌥u a | "a | "a | | | &auml; |
Ä | ⌥u A | "A | "A | | | &Auml; |
ë | ⌥u e | "e | "e | | | &euml; |
Ë | ⌥u E | "E | "E | | | &Euml; |
ï | ⌥u i | "i | "i | | | &iuml; |
Ï | ⌥u I | "I | "I | | | &Iuml; |
ö | ⌥u o | "o | "o | | | &ouml; |
Ö | ⌥u O | "O | "O | | | &Ouml; |
ü | ⌥u u | "u | "u | | | &uuml; |
Ü | ⌥u U | "U | "U | | | &Uuml; |
ÿ | ⌥u y | "y | "y | | | &yuml; |
Ÿ | ⌥u Y | "Y | | LATIN CAPITAL LETTER Y WITH DIAERESIS | 0178 | &Yuml; |
vowels w/ macrons | ||||||
chr | mac os x (us extended) | emacs (rfc1345) | windows (us-intl) | unicode name | unicode point | html entity |
ā | ⌥a a | &a- | | LATIN SMALL LETTER A WITH MACRON | 0101 | |
Ā | ⌥a A | &A- | | LATIN CAPITAL LETTER A WITH MACRON | 0100 | |
ē | ⌥a e | &e- | | LATIN SMALL LETTER E WITH MACRON | 0113 | |
Ē | ⌥a E | &E- | | LATIN CAPITAL LETTER E WITH MACRON | 0112 | |
ī | ⌥a i | &i- | | LATIN SMALL LETTER I WITH MACRON | 012B | |
Ī | ⌥a I | &I- | | LATIN CAPITAL LETTER I WITH MACRON | 012A | |
ō | ⌥a o | &o- | | LATIN SMALL LETTER O WITH MACRON | 014D | |
Ō | ⌥a O | &O- | | LATIN CAPITAL LETTER O WITH MACRON | 014C | |
ū | ⌥a u | &u- | | LATIN SMALL LETTER U WITH MACRON | 016B | |
Ū | ⌥a U | &U- | | LATIN CAPITAL LETTER U WITH MACRON | 016A | |
ȳ | ⌥a y | | | LATIN SMALL LETTER Y WITH MACRON | 0233 | |
Ȳ | ⌥a Y | | | LATIN CAPITAL LETTER Y WITH MACRON | 0232 | |
consonants | ||||||
chr | mac os x (us extended) | emacs (latin-prefix) | windows (us-intl) | unicode name | unicode point | html entity |
ç | ⌥c c | ~c | Alt+Ctrl+, | LATIN SMALL LETTER C WITH CEDILLA | 00E7 | &ccedil; |
Ç | ⌥c C | ~C | Alt+Ctrl+Shift+, | LATIN CAPITAL LETTER C WITH CEDILLA | 00C7 | &Ccedil; |
ñ | ⌥n n | ~n | ~n | LATIN SMALL LETTER N WITH TILDE | 00F1 | &ntilde; |
Ñ | ⌥n N | ~N | ~N | LATIN CAPITAL LETTER N WITH TILDE | 00D1 | &Ntilde; |
ß | ⌥s | "s | Alt+Ctrl+s | LATIN SMALL LETTER SHARP S | 00DF | &szlig; |
non-english punctuation | ||||||
chr | mac os x (us extended) | emacs (latin-prefix) | windows (us-intl) | unicode name | unicode point | html entity |
¡ | ⌥1 | ~! | Alt+Ctrl+1 | INVERTED EXCLAMATION MARK | 00A1 | &iexcl; |
« | ⌥\ | ~< | Alt+Ctrl+[ | LEFT-POINTING DOUBLE ANGLE QUOTATION MARK | 00AB | &laquo; |
» | ⇧⌥\ | ~> | Alt+Ctrl+] | RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK | 00BB | &raquo; |
¿ | ⇧⌥/ | ~? | Alt+Ctrl+/ | INVERTED QUESTION MARK | 00BF | &iquest; |
‚ | ⇧⌥0 | | | SINGLE LOW-9 QUOTATION MARK | 201A | &sbquo; |
‘ | ⌥] | | | LEFT SINGLE QUOTATION MARK | 2018 | &lsquo; |
„ | ⇧⌥, | | | DOUBLE LOW-9 QUOTATION MARK | 201E | &bdquo; |
“ | ⌥[ | | | LEFT DOUBLE QUOTATION MARK | 201C | &ldquo; |
ª | ⌥9 | _a | | FEMININE ORDINAL INDICATOR | 00AA | &ordf; |
º | ⌥0 | _o | | MASCULINE ORDINAL INDICATOR | 00BA | &ordm; |
French
The French ordinal adjectives are
- premier, première
- deuxième
- troisième
The abbreviated forms are 1er, 1re, 2e, and 3e. There are no Unicode characters for the superscripts.
Dieresis (Fr: tréma) is used in French to indicate two separate vowels that would otherwise be interpreted as a diphthong: Noël, naïf. Dieresis on a y is rare: L'Haÿ-les-Roses. The dieresis goes on the second vowel.
The letter g is a fricative before the letters e and i and a stop before the other vowels. A silent u is used to indicate when in fact the g is pronounced as a stop: givre (frost) and guivre (archaic term for a serpent). A dieresis is used if the u is not silent: aigüe (feminine form of acute). Before the spelling reform of 1990 the dieresis was written over the silent e: aiguë.
A few common French words use the œ digraph: mœurs, cœur, sœur, œuf, œuvre, œil. The œu vowel combination is a rounded front vowel like the German ö. œil is pronounced as a rounded front vowel followed by a palatal approximant and perhaps could be regarded as a diphthong.
The letter c is a velar stop before the vowels a, o, and u and an unvoiced dental sibilant before the vowels e and i. The c with cédille ç can be used before a, o, and u to represent an unvoiced dental sibilant. s is a voiced dental sibilant.
Accented letters, including c with cédille, appear at the same place as the unaccented letters in the collation order. If two words differ only in the presence of an accent, the accented word appears second in the collation order.
The French use guillemets « » to quote speech. The guillemets are separated from the interior text by space:
« Voulez-vous un sandwich, Henri ? »
They use the comma instead of the period as the decimal mark. The digits of large numbers can be set off in threes using spaces, e.g. 1 000 000.
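A quick Python sketch of this number format, built on the English-style formatting that Python provides by default. The function name is my own, and it uses an ordinary space as the group separator.

```python
def format_number_french(value: float, decimals: int = 0) -> str:
    """Format with spaces between groups of three digits and a decimal comma."""
    english = f"{value:,.{decimals}f}"      # e.g. 1,234.50
    return english.replace(",", " ").replace(".", ",")


print(format_number_french(1000000))
print(format_number_french(1234.5, 2))
```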
German
The German name for the letter ß is (das) Eszett. ß and ss are used for the same unvoiced sibilant (English s). ß is used after long vowels and diphthongs and ss is used after short vowels. Before the spelling reform of 1996 ß was always used word-finally.
A single s is used for the voiced sibilant (English z), but note that voiced consonants do not occur word-finally in German and the German article das is an orthographic exception.
The letter ß and the unvoiced sibilant it represents do not occur word-initially and there is no uppercase version of the letter. When a word is put in all caps it is replaced by SS. Before the spelling reform of 1996 SZ was used.
When German must be written in ASCII, ß is replaced by ss and ä, ö, and ü are replaced by ae, oe, and ue.
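The ASCII replacement can be done with a translation table in Python. The sketch below covers only the four lowercase letters named above; handling the uppercase umlauts the same way would be a natural extension, but that is my assumption and it is left out. The function name is my own.

```python
# Translation table for the four replacements described above.
GERMAN_TO_ASCII = str.maketrans({"ß": "ss", "ä": "ae", "ö": "oe", "ü": "ue"})


def german_to_ascii(text: str) -> str:
    """Replace ß, ä, ö, and ü with their ASCII spellings."""
    return text.translate(GERMAN_TO_ASCII)


print(german_to_ascii("Gänsefüßchen"))
```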
An example of how to use quotes in German:
Er fragt: „Wie sagt man ‚foobar‘ auf Deutsch?“
The quotes are called Gänsefüßchen. Germans use the English left quotes on the right. Some fonts such as Verdana have English left quotes which look incorrect when used on the right in German.
Germans write numbers the same way as the French.
Latin
For pedagogical purposes macrons are used to distinguish long vowels from short vowels.
The classical Romans used a mark called the apex for this purpose over A, E, O, and V. The apex is not dissimilar to an acute accent. A long I was indicated by making the letter taller.
Spanish
Spanish has masculine and feminine ordinal ending abbreviation characters.
E.g. primera edición can be abbreviated as 1ª edición.
In Spanish most words that end in vowels, s, or n have penultimate stress and most words that end in r, l, or d have ultimate stress.
When a word does not follow the above pattern the position of the stress is indicated with an acute accent above the vowel receiving stress.
The ñ is treated as a separate letter. The Spanish collation order puts it after n. There is a capitalized version Ñ, but only a handful of words borrowed from foreign or indigenous languages start with it.
Traditionally ch and ll were treated as separate letters that came after c and l in the collation order. Since the reform of 1994 they are treated as two separate letters for collation.
Most Spanish speaking countries write numbers in the French manner. Mexicans write numbers in the English manner.
Polytonic Greek
Polytonic Greek is used to write Classical Greek. It uses acute, grave, and circumflex accents, the iota subscript, and smooth and rough breathing marks.
Mac OS X comes with a Polytonic Greek input source. Here is how it maps Greek letters to the keyboard:
greek letter | mac (greek polytonic) |
---|---|
; : | q Q |
ς Σ | w W |
υ Υ | y Y |
θ Θ | u U |
η Η | h H |
ξ Ξ | j J |
χ Χ | x X |
ψ Ψ | c C |
ω Ω | v V |
The rest of the Greek letters are mapped to their phonetically matching Latin keys.
accented letter | mac (greek polytonic) |
---|---|
ὰ | ]a |
ά | ;a |
ᾶ | [a |
ἀ | 'a |
ἁ | "a |
ᾳ | ⌥ia |
ἆ | -a |
ἂ | =a |
ἄ | /a |
ἇ | _a |
ἃ | +a |
ἅ | ?a |
greek | mac (greek polytonic) | usage |
---|---|---|
. | . | as English period |
, | , | as English comma |
; | q | as English question mark |
· | ⌥9 | as English colon or semicolon |
The number keys and their shift punctuation are the same when the Greek Polytonic input source is in effect. The keys for these characters are the same: `~|\,.<
Mathematics
A few non-ASCII mathematical symbols are available in the Mac OS X U.S. Extended and the Emacs latin-prefix input methods:
chr | mac os x (us extended) | emacs (latin-prefix) | unicode name | unicode point | html entity |
---|---|---|---|---|---|
× | | /\ | MULTIPLICATION SIGN | 00D7 | &times; |
÷ | ⌥/ | /: | DIVISION SIGN | 00F7 | &divide; |
≠ | | | NOT EQUAL TO | 2260 | &ne; |
≤ | ⌥, | | LESS-THAN OR EQUAL TO | 2264 | &le; |
≥ | ⌥. | | GREATER-THAN OR EQUAL TO | 2265 | &ge; |
° | ⇧⌥8 | // | DEGREE SIGN | 00B0 | &deg; |
′ | | | PRIME | 2032 | &prime; |
″ | | | DOUBLE PRIME | 2033 | &Prime; |
For the remaining mathematical symbols I use input methods based on LaTeX.
chr | unicode name | unicode point | latex | html entity | alt + fn |
---|---|---|---|---|---|
logical operators | |||||
¬ | not sign | U+00AC | \neg | &not; | ! |
∧ | logical and | U+2227 | \wedge | &and; | & |
∨ | logical or | U+2228 | \vee | &or; | | |
∀ | for all | U+2200 | \forall | &forall; | A |
∃ | there exists | U+2203 | \exists | &exist; | E |
sets | |||||
∅ | empty set | U+2205 | \emptyset | &empty; | 0 |
∈ | element of | U+2208 | \in | &isin; | e |
∉ | not an element of | U+2209 | \notin | &notin; | n |
⊂ | subset of | U+2282 | \subset | &sub; | ( |
⊃ | superset of | U+2283 | \supset | &sup; | ) |
⊆ | subset of or equal to | U+2286 | \subseteq | &sube; | [ |
⊇ | superset of or equal to | U+2287 | \supseteq | &supe; | ] |
∩ | intersection | U+2229 | \cap | &cap; | I |
∪ | union | U+222A | \cup | &cup; | U |
relational operators | |||||
≤ | less-than or equal to | U+2264 | \le | &le; | < |
≥ | greater-than or equal to | U+2265 | \ge | &ge; | |
≠ | not equal to | U+2260 | \ne | &ne; | # |
≈ | almost equal to | U+2248 | \approx | &asymp; | ~ |
≡ | identical to | U+2261 | \equiv | &equiv; | = |
arithmetic operators | |||||
± | plus-minus sign | U+00B1 | \pm | &plusmn; | + |
÷ | division sign | U+00F7 | \div | &divide; | / |
× | multiplication sign | U+00D7 | \times | &times; | * |
relational algebra | |||||
π | project | ||||
σ | select | ||||
ρ | rename | ||||
⋈ | (natural) join | | \bowtie | | |
⋉ | left semijoin | | \ltimes | | |
⋊ | right semijoin | | \rtimes | | |
▷ | antijoin | ||||
⟕ | left outer join | ||||
⟖ | right outer join | ||||
⟗ | full outer join | ||||
other | |||||
∞ | infinity | U+221E | \infty | &infin; | i |
° | degree sign | U+00B0 | ^\circ | &deg; | d |
Keyboard Notation
keyboard notation | |||||
---|---|---|---|---|---|
chr | unicode name | unicode point | key | alt key | html entity |
← | leftwards arrow | U+2190 | left | | &larr; |
↑ | upwards arrow | U+2191 | up | | &uarr; |
→ | rightwards arrow | U+2192 | right | | &rarr; |
↓ | downwards arrow | U+2193 | down | | &darr; |
⌘ | place of interest sign | U+2318 | command | | |
⏎ | return symbol | U+23CE | return | ||
⌤ | up arrowhead between two horizontal bars | U+2324 | enter | ||
⇧ | upwards white arrow | U+21E7 | shift | ||
⇪ | upwards white arrow from bar | U+21EA | caps lock | ||
⇟ | downwards arrow with double stroke | U+21DF | page down | fn↓ | |
⇞ | upwards arrow with double stroke | U+21DE | page up | fn↑ | |
↖ | north west arrow | U+2196 | home | ⌘↑ | |
↘ | south east arrow | U+2198 | end | ⌘↓ | |
⌥ | option key | U+2325 | option | ||
⌫ | erase to left | U+232B | delete pc backspace | ||
⎋ | broken circle with northwest arrow | U+238B | esc | ||
⏏ | eject symbol | U+23CF | eject | ⌘E | |
⌦ | erase to right | U+2326 | pc delete | fn⌫ | |
none | none | power | |||
⇥ | rightwards arrow to bar | U+21E5 | tab | ||
⇤ | leftwards arrow to bar | U+21E4 | tab | ||
⚙ | gear | U+2699 |
pc delete
ctrl: various notation, multiple unicode characters
enter vs return
escape
eject and power symbol
Ability to enter the printable version of control characters?