users /etc/passwd
groups /etc/group
sudo /etc/sudoers
ldap /etc/nsswitch.conf
hosts /etc/hosts
ports /etc/services
ip protocols /etc/protocols
words /usr/share/dict/words
colors /usr/share/X11/rgb.txt
country codes /usr/share/zoneinfo/
languages ~/Local/etc/ISO-639-2.txt
time zones /etc/localtime
characters ~/Local/etc/UnicodeData.txt
locales /usr/share/i18n/locales
astronomy $PLAN9/sky/constelnames
literature ~/Local/etc/shakes.txt



/etc/passwd is a colon-delimited file with seven fields. Lines that start with # are comments. The fields are:

  • user
  • password
  • uid
  • gid
  • gecos
  • home
  • shell
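
Because the fields are colon-delimited, awk can pick them apart directly. The following is a minimal sketch rather than a standard tool: the function name passwd_summary is made up for illustration, and lines that do not have exactly seven fields are silently ignored.

```shell
# passwd_summary (hypothetical helper): read passwd-format lines on stdin
# and print "user uid shell" for each entry.
passwd_summary() {
    awk -F: '
        /^#/ || NF == 0 { next }        # skip comment lines and blank lines
        NF == 7 { print $1, $3, $7 }    # fields 1, 3, and 7: user, uid, shell
    '
}
```

One would run it as passwd_summary < /etc/passwd. Since it reads stdin, it should also accept the output of any command that emits lines in the same format.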

Traditionally the system administrator added accounts by editing /etc/passwd. A special command, vipw, was available for this: it locked the file before opening it in an editor, which avoided corruption from simultaneous edits.

A non-privileged user could use the passwd, chfn, and chsh commands to change the password, gecos, and shell fields, respectively.

When a user logs in, the LOGNAME and HOME environment variables are set from the first and sixth fields. HOME is also the initial working directory of the login shell, which is chosen using the seventh field.

/etc/passwd is readable by all users on the system, so it is perhaps surprising that passwords were stored in it. They were not stored in the clear, but as the output of a one-way hash function. See man 3 crypt for details of the function. A simple attack is to run crypt on a list of common passwords and see whether any of the results appear in /etc/passwd. The crypt function makes this approach somewhat harder by hashing the password together with one of 4096 salt values, so the attacker must hash each candidate once per salt in use. In the hashed password, the first two characters are the salt that was used. Most modern Linux systems no longer keep the hashes in /etc/passwd, and instead put them in /etc/shadow, which is readable only by root.

If a Linux machine is managed by LDAP, one can get the password entries with this command:

getent passwd

Perhaps it is possible for a user on an LDAP managed Linux machine to modify their /etc/passwd entry using ldapmodify.

Mac OS X systems have an /etc/passwd file, though a comment at the top says that it is only used in single user mode. One can enter single user mode by holding down ⌘S while rebooting. One can exit single user mode by typing reboot.

Otherwise, Mac OS X has an opendirectoryd process which is responsible for password information. One can get the equivalent of the /etc/passwd entry for LOGIN_NAME with this command:

dscl . -read /Users/LOGIN_NAME Password UniqueID PrimaryGroupID RealName NFSHomeDirectory UserShell

It is possible to use sudo dscl to create a user account.



/etc/group is also colon-delimited. The fields are:

  • name
  • password
  • gid
  • list (of user names, comma separated)
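
As a rough illustration of how the fourth field is used, the sketch below prints the groups whose member list names a given user. The function name groups_of is invented for this example. It looks only at the member list, so it will not report a user's primary group, which comes from the gid field of /etc/passwd.

```shell
# groups_of USER (hypothetical helper): read group-format lines on stdin and
# print the name of every group whose member list (field 4) contains USER.
groups_of() {
    awk -F: -v user="$1" '
        /^#/ { next }                           # skip comment lines
        {
            n = split($4, members, ",")         # member list is comma separated
            for (i = 1; i <= n; i++)
                if (members[i] == user) { print $1; break }
        }
    '
}
```

Usage would be something like groups_of alice < /etc/group.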

How groups work on Unix.

How is the password used?

What about ACLs?



RFC 4511: Lightweight Directory Access Protocol (LDAP)
RFC 2849: LDAP Data Interchange Format (LDIF)

The LDAP protocol defines a way for a client to connect via TCP to an LDAP server and perform CRUD operations on entries, which are stored in a directory structure.

dn: distinguished name
dc: domain component
ou: organizational unit
cn: common name

  • searches

command line tools

using LDAP for authentication

setting up an LDAP server





/etc/hosts is a mapping from IP addresses to host names.
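
A lookup in the other direction, from name to address, can be sketched with awk. This assumes the usual layout of a hosts file (an address, then one or more names separated by whitespace, with # starting a comment). The function name hosts_lookup is made up for illustration, and the sketch does not imitate what a real resolver does.

```shell
# hosts_lookup NAME (hypothetical helper): read hosts-format lines on stdin
# and print every address that lists NAME as a host name or alias.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                      # strip comments, including trailing ones
        {
            for (i = 2; i <= NF; i++)           # field 1 is the address, the rest are names
                if ($i == name) { print $1; break }
        }
    '
}
```

For example, hosts_lookup localhost < /etc/hosts.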

How do resolvers work?

On Windows the file is at:




The first two bytes of a TCP header are the source port, and the next two bytes are the destination port. Since a port is a 16-bit number, there are 65,536 (64k) possible TCP ports.

The socket system call creates a socket, and the bind system call ...

On Unix, a process must run as root to bind to ports 0 thru 1023.

well known ports 0 thru 1023
registered ports 1024 thru 49151
dynamic or private ports 49152 thru 65535
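
The three ranges above translate into a simple classifier. This is only a sketch; the function name port_range is invented for the example.

```shell
# port_range PORT (hypothetical helper): print which of the three ranges a
# port number falls in, or "invalid" (with a non-zero status) otherwise.
port_range() {
    case $1 in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;    # not a non-negative integer
    esac
    if [ "$1" -le 1023 ]; then
        echo "well known"
    elif [ "$1" -le 49151 ]; then
        echo "registered"
    elif [ "$1" -le 65535 ]; then
        echo "dynamic or private"
    else
        echo "invalid"; return 1                    # does not fit in 16 bits
    fi
}
```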

IP Protocols


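An IPv4 header is 20 to 60 bytes long: 20 bytes when there are no options, plus up to 40 bytes of options. The first byte specifies both the IP version (high four bits) and the length of the header in 32-bit words (low four bits).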

The tenth byte of an IPv4 header is called the Protocol field and is used to specify the protocol of the data section. Thus there can be at most 256 protocols. They are listed in /etc/protocols.
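
To make the byte positions concrete, here is a small sketch that decodes the first and tenth bytes of an IPv4 header supplied as a string of hex digits, two per byte. The function name ipv4_header_info and the hex-string input format are assumptions made for the example.

```shell
# ipv4_header_info HEX (hypothetical helper): decode the version, header
# length, and protocol number from an IPv4 header written as hex digits.
ipv4_header_info() {
    first=$(printf '%s\n' "$1" | cut -c1-2)      # byte 1: version and header length
    proto=$(printf '%s\n' "$1" | cut -c19-20)    # byte 10: the Protocol field
    version=$(( 0x$first >> 4 ))                 # high 4 bits of byte 1
    words=$(( 0x$first & 15 ))                   # low 4 bits: length in 32-bit words
    echo "version=$version header_bytes=$(( words * 4 )) protocol=$(( 0x$proto ))"
}
```

For the sample header 4500003c1c4640004006b1e6ac100a63ac100a0c this prints version=4 header_bytes=20 protocol=6. The number can then be looked up in /etc/protocols, for instance with awk '$2 == 6 { print $1 }' /etc/protocols, assuming the usual layout with the name in the first column and the number in the second.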

In an IPv6 header the protocol is stored at the seventh byte, which is called the Next Header field.



/usr/share/dict/words is a list of English words, one per line. It can be used for spell checking.
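
A crude spell checker along these lines can be sketched as follows. The function name misspelled is made up for illustration. The sketch assumes one word per line in the word list, compares case-insensitively, and treats any run of letters and apostrophes as a word, which is much simpler than what a real spell checker does.

```shell
# misspelled WORDLIST (hypothetical helper): read text on stdin and print,
# lowercased and once each, the words that are not found in WORDLIST.
misspelled() {
    tr -cs "A-Za-z'" '\n' |
        awk -v dict="$1" '
            # load the word list into a lookup table, ignoring case
            BEGIN { while ((getline w < dict) > 0) known[tolower(w)] = 1 }
            # print each unknown word the first time it is seen
            NF && !(tolower($0) in known) && !seen[tolower($0)]++ { print tolower($0) }
        '
}
```

Usage would be something like misspelled /usr/share/dict/words < letter.txt, where letter.txt stands for any text file.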

One can install foreign language dictionaries for aspell on Linux. See apt-cache search aspell. They probably get installed in /usr/lib/aspell, but a binary format is used which I don't know how to decipher.

How to get the list of English words from aspell:

$ aspell dump master

How to get the list of French words from aspell on Ubuntu Linux:

$ sudo apt-get install aspell-fr

$ aspell --lang=fr dump master



/usr/share/X11/rgb.txt contains named colors and their RGB values.
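
The sketch below converts entries to the #rrggbb notation used on the Web. It assumes the common layout of rgb.txt: three decimal values from 0 to 255, then the color name (which may contain spaces), with comment lines starting with !. The function name rgb_to_hex is invented for the example.

```shell
# rgb_to_hex (hypothetical helper): read rgb.txt-format lines on stdin and
# print each color as "#rrggbb name".
rgb_to_hex() {
    awk '
        /^!/ { next }                               # comment lines start with !
        NF >= 4 {
            name = $4
            for (i = 5; i <= NF; i++)               # rejoin multi-word names
                name = name " " $i
            printf "#%02x%02x%02x %s\n", $1, $2, $3, name
        }
    '
}
```

For example, rgb_to_hex < /usr/share/X11/rgb.txt | grep -i navy.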

The file is only present if X Windows is installed. However, it is not unknown for other applications to install a file named rgb.txt which can be used if X Windows isn't installed.

The significance of the X Windows color names is that they were supported by Mosaic and Netscape Navigator, and thus became de facto Web standards. They are now part of the CSS and SVG standards, though some changes were made.

Country Codes


A list of two character ISO 3166 country codes, followed by the full name of the country.

The file used to be distributed with the timezone info, but was dropped recently.


~/Local/etc/ISO-639-2.txt is pipe-delimited with five fields:

  • 3 letter "bibliographic" code
  • 3 letter "terminologic" code; sometimes absent, in which case it is the same as the "bibliographic" code; when the two differ, the "terminologic" code shares letters with the 2 letter code; see the ISO 639-2 FAQ.
  • 2 letter code; sometimes absent
  • English name for language
  • French name for language
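
A lookup by 2 letter code, which also applies the rule that an absent "terminologic" code means the same as the "bibliographic" one, might look like this. The function name lang_lookup is made up for illustration.

```shell
# lang_lookup CODE (hypothetical helper): read ISO-639-2.txt-format lines on
# stdin and print "bibliographic terminologic english-name" for the language
# whose 2 letter code is CODE.
lang_lookup() {
    awk -F'|' -v code="$1" '
        $3 == code {
            term = ($2 == "") ? $1 : $2     # absent means same as bibliographic
            print $1, term, $4
        }
    '
}
```

For example, lang_lookup fr < ~/Local/etc/ISO-639-2.txt.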

Time Zones


/etc/localtime is a symlink to a binary file in /usr/share/zoneinfo. The target of the symlink indicates the time zone the computer is in. Here is how to inspect the file from the command line:

$ zdump /etc/localtime
/etc/localtime  Tue Dec 30 10:03:27 2014 PST


The first column is the two character ISO 3166 country code and joins with /usr/share/zoneinfo/

The second column is the latitude and longitude of the "principal location" of the time zone. The coordinates are in DDMM or DDMMSS format. Positive latitude is north of the equator and positive longitude is east of the prime meridian.

The third column is the name of the time zone. There should be a file in /usr/share/zoneinfo matching this name; e.g. /usr/share/zoneinfo/America/Los_Angeles.
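
The coordinate column is compact but awkward to read. The sketch below converts it to decimal degrees, assuming tab-separated columns, # comment lines, and a three-digit degree part for longitude (that is, +DDMM+DDDMM or +DDMMSS+DDDMMSS). The function name zone_coords is invented for the example.

```shell
# zone_coords (hypothetical helper): read lines in the three-column format
# described above on stdin and print "zone latitude longitude" in decimal degrees.
zone_coords() {
    awk -F'\t' '
        # convert a signed DDMM[SS] (dlen=2) or DDDMM[SS] (dlen=3) string
        function deg(s, dlen,    sign, d, m, sec) {
            sign = (substr(s, 1, 1) == "-") ? -1 : 1
            d = substr(s, 2, dlen)
            m = substr(s, 2 + dlen, 2)
            sec = substr(s, 4 + dlen, 2)        # empty, hence 0, when seconds are absent
            return sign * (d + m / 60 + sec / 3600)
        }
        /^#/ { next }
        {
            # the longitude starts at the second sign character
            split_at = match(substr($2, 2), /[+-]/) + 1
            lat = substr($2, 1, split_at - 1)
            lon = substr($2, split_at)
            printf "%s %.4f %.4f\n", $3, deg(lat, 2), deg(lon, 3)
        }
    '
}
```

Given a line with coordinates +340308-1181434 and zone America/Los_Angeles, this prints America/Los_Angeles 34.0522 -118.2428.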

The file /usr/share/zoneinfo/America/Los_Angeles and similar files are in a binary format. To see a dump of the data:

$ zdump -v America/Los_Angeles

Mostly what one sees in the output are Daylight Saving Time changes. For each change there are UTC times for two successive seconds, with the corresponding local times. The output also contains the earliest and latest possible times, as well as one day after the earliest time and one day before the latest time.


UnicodeData.txt is not actually a standard Unix file, perhaps because it is 1.3M in size and grows with each revision to the Unicode standard. Each line contains information about a Unicode character. The fields are semicolon delimited:

  • Code point
  • Name
  • General_Category
  • Canonical_Combining_Class
  • Bidi_Class
  • Decomposition_Type/Decomposition_Mapping
  • Numeric_Type/Numeric_Value (three fields: decimal digit, digit, numeric)
  • Bidi_Mirrored
  • Unicode_1_Name (obsolete)
  • ISO_Comment (obsolete)
  • Simple_Uppercase_Mapping
  • Simple_Lowercase_Mapping
  • Simple_Titlecase_Mapping

The General Category Values are two letter values. The meaning of the first letter is:

C other
L letter
M combining mark
N number
P punctuation
S symbol
Z separator
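
Since the first letter carries the broad class, a tally by first letter alone gives a quick overview. This is a small sketch; the function name major_category_counts is made up for illustration.

```shell
# major_category_counts (hypothetical helper): read UnicodeData.txt-format
# lines on stdin and count characters by the first letter of field 3.
major_category_counts() {
    awk -F';' '
        { cnt[substr($3, 1, 1)]++ }                 # first letter of General_Category
        END { for (c in cnt) print c, cnt[c] }
    ' | sort
}
```

One would run it as major_category_counts < UnicodeData.txt.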

How to count the unicode characters by general category:

$ curl

$ awk 'BEGIN{FS=";"} {cnt[$3] += 1} END{for (i in cnt) print i, cnt[i];}'  UnicodeData.txt | sort

The file is ASCII, and doesn't show what the characters look like. One can display a Unicode code point in zsh with:

echo $'\u03bb'

For ASCII lookups, don't forget about man ascii.


One can run

$ locale

to see the current locale. The output is a list of environment variables. Usually one sets the LANG environment variable and allows that value to be inherited by the other environment variables.

One can run

$ locale -a

to see the list of supported locales.

On Ubuntu, the default locale is set in /etc/default/locale. Ubuntu ships with a sparse list of "compiled" locales, since each takes 50M of space. Use

$ sudo locale-gen

to "compile" a locale.


If Plan 9 from User Space is installed, one gets a command line ephemeris tool called astro. A common way to install the Plan 9 tools is by putting a script called 9 in PATH:

$ 9 man astro

$ 9 astro -klp

The script 9 defines an environment variable called PLAN9 which points to the directory containing the installed code. The file $PLAN9/sky/constelnames contains the 88 constellations with their abbreviations in the first column and their full names in the second column, separated by tabs.

The file $PLAN9/sky/estartab contains data about 2158 stars on the ecliptic. It is better to download the PPM star data, however.

The PPM Star Catalog contains information about 468,861 stars. HEASARC provides an ASCII version with pipe delimited fields. The tdat format is described at

Some of the interesting fields:

1 PPM serial number
3 apparent magnitude
4 spectral type
5 right ascension
6 declination
7 galactic longitude
8 galactic latitude
11 proper motion RA sec/yr
12 proper motion declination "/yr
20 SAO catalog designation
21 Henry Draper catalog designation
22 AGK3 catalog designation
23 Cape Photograph DM designation


$ wc ~/Local/etc/shakes.txt
  124787  904061 5589889 /Users/clark/Local/etc/shakes.txt