• users: /etc/passwd
• groups: /etc/group
• sudo: /etc/sudoers
• ldap: /etc/nsswitch.conf, /etc/pam.conf, /etc/openldap/ldap.conf, /etc/openldap/slapd.conf
• smtp
• http
• hosts: /etc/hosts
• ports: /etc/services
• ip protocols: /etc/protocols
• words: /usr/share/dict/words
• colors: /usr/share/X11/rgb.txt, /opt/X11/share/X11/rgb.txt
• country codes: /usr/share/zoneinfo/iso3166.tab, ~/Local/etc/iso3166.tab, http://www.ietf.org/timezones/data/iso3166.tab
• languages: ~/Local/etc/ISO-639-2.txt, http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
• time zones: /etc/localtime, /usr/share/zoneinfo/zone.tab, /usr/share/zoneinfo/America/Los_Angeles
• characters: ~/Local/etc/UnicodeData.txt, ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt, ~/Local/etc/Scripts.txt, http://unicode.org/Public/UNIDATA/Scripts.txt, ~/Local/etc/PropList.txt, ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt, ~/Local/etc/DerivedCoreProperties.txt, ftp://ftp.unicode.org/Public/UNIDATA/DerivedCoreProperties.txt
• locales: /usr/share/i18n/locales, /usr/share/locales
• astronomy: $PLAN9/sky/constelnames, ~/Local/etc/heasarc_ppm.tdat, ftp://legacy.gsfc.nasa.gov/heasarc/dbase/dump/heasarc_ppm.tdat.gz
• literature: ~/Local/etc/shakes.txt, http://www.gutenberg.org/ebooks/100.txt.utf-8, ~/Local/etc/king_james.txt, http://www.gutenberg.org/ebooks/10.txt.utf-8

# Users

/etc/passwd

A colon delimited file with seven fields. Lines which start with a # are comments. The fields are:

• user
• password
• uid
• gid
• gecos
• home
• shell

Traditionally the system administrator added accounts by editing /etc/passwd. A special version of the editor, vipw, was available which acquired a lock to avoid corrupting the file. A non-privileged user could use the passwd, chfn, and chsh commands to change the password, gecos, and shell fields.

When a user logs in, the LOGNAME and HOME environment variables are set according to the first and sixth fields. HOME is also used for the initial working directory of the shell, which is chosen using the seventh field.

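
The seven-field layout described above can be sketched in Python. This is a minimal illustration, not how login programs are implemented: the names PasswdEntry, parse_passwd_line, and login_environment are hypothetical, and the sample account is made up. On a real system the standard library pwd module is usually the better way to read account data.

```python
from typing import Dict, NamedTuple, Optional


class PasswdEntry(NamedTuple):
    """One /etc/passwd record, in field order."""
    user: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> Optional[PasswdEntry]:
    """Parse one line; return None for blank lines and # comments."""
    line = line.rstrip("\n")
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    fields = line.split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    user, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(user, password, int(uid), int(gid), gecos, home, shell)


def login_environment(entry: PasswdEntry) -> Dict[str, str]:
    """LOGNAME comes from the first field and HOME from the sixth."""
    return {"LOGNAME": entry.user, "HOME": entry.home}


if __name__ == "__main__":
    # Made-up sample line; "x" is a placeholder in the password field.
    sample = "alice:x:1000:1000:Alice Example:/home/alice:/bin/zsh"
    entry = parse_passwd_line(sample)
    print(entry)
    print(login_environment(entry))
```
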
/etc/passwd is readable by all users on the system, so it is perhaps surprising that passwords were stored in it. Passwords were stored in hashed form using a one-way hash function. See man 3 crypt for details of the function. A simple attack method would be to use crypt on a list of common passwords and see if any of the results are in /etc/passwd. The crypt function makes this approach somewhat harder by hashing the password with one of 4096 salt values. In the hashed password, the first two characters are the salt that was used.

Most modern Linux systems no longer keep the passwords in /etc/passwd, and instead put them in /etc/shadow which is only readable by root.

If a Linux machine is managed by LDAP, one can get the password entries with this command:

$ getent passwd

Perhaps it is possible for a user on an LDAP managed Linux machine to modify their /etc/passwd entry using ldapmodify.

Mac OS X systems have an /etc/passwd file, though a comment at the top says that it is only used in single user mode. One can enter single user mode by holding down ⌘S while rebooting. One can exit single user mode by typing reboot. Otherwise, Mac OS X has an opendirectoryd process which is responsible for password information. One can get the equivalent of the /etc/passwd entry for LOGIN_NAME with this command:

$ dscl . -read /Users/LOGIN_NAME Password UniqueID PrimaryGroupID RealName NFSHomeDirectory UserShell

It is possible to use sudo dscl to create a user account.

# Groups

/etc/group

The fields are

• name
• password
• gid
• list (of user names, comma separated)

How groups work on Unix. How is the password used? What about ACLs?

# Sudo

# LDAP

The LDAP protocol defines a way for a client to connect via TCP to an LDAP server and perform CRUD operations on entries which are stored in a directory structure.

dn: distinguished name
dc: domain component
ou: organizational unit
cn: common name

• searches

## command line tools

## using LDAP for authentication

## setting up an LDAP server

# SMTP

# HTTP

# Hosts

/etc/hosts

A mapping from IP addresses to host names.

How do resolvers work?

On Windows the file is at:

C:\Windows\System32\drivers\etc\hosts

# Ports

/etc/services

The first two bytes of a TCP header are the source port. The next two bytes are the destination port. Thus there are 65,536 (64k) possible TCP ports.

The socket system call creates a socket, and the bind system call ...

On Unix, a process must run as root to bind to ports 0 thru 1023.

well known ports             0 thru 1023
registered ports             1024 thru 49151
dynamic or private ports     49152 thru 65535

# IP Protocols

/etc/protocols

An IPv4 packet header is 20 to 60 bytes long. The first byte specifies both the IP version and the length of the header.

The tenth byte of an IPv4 header is called the Protocol field and is used to specify the protocol of the data section. Thus there can be at most 256 protocols. They are listed in /etc/protocols.

In an IPv6 header the protocol is stored at the seventh byte, which is called the Next Header field.

# Words

/usr/share/dict/words

A list of English words. Can be used for spell checking.

One can install foreign language dictionaries for aspell on Linux. See apt-cache search aspell. They probably get installed in /usr/lib/aspell, but a binary format is used which I don't know how to decipher.

How to get the list of English words from aspell:

$ aspell dump master


How to get the list of French words from aspell on Ubuntu Linux:

$ sudo apt-get install aspell-fr
$ aspell --lang=fr dump master


# Colors

/usr/share/X11/rgb.txt

Named colors and their RGB values.

The file is only present if X Windows is installed. However, it is not unknown for other applications to install a file named rgb.txt which can be used if X Windows isn't installed.

The significance of the X Windows color names is that they were supported by Mosaic and Netscape Navigator, and thus became de facto Web standards. They are now part of the CSS and SVG standards, though some changes were made.
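
A minimal sketch of reading the file in Python follows. It assumes the usual rgb.txt layout: three decimal values, whitespace, then a name that may itself contain spaces, with comment lines starting with "!". The function names parse_rgb_line and to_hex are hypothetical, and the layout should be checked against the local copy of the file.

```python
from typing import Optional, Tuple

RGB = Tuple[int, int, int]


def parse_rgb_line(line: str) -> Optional[Tuple[str, RGB]]:
    """Return (name, (r, g, b)), or None for blank and '!' comment lines."""
    line = line.strip()
    if not line or line.startswith("!"):
        return None
    # Split at most three times so names such as "ghost white" stay whole.
    red, green, blue, name = line.split(None, 3)
    return name, (int(red), int(green), int(blue))


def to_hex(rgb: RGB) -> str:
    """Format an RGB triple as a CSS-style #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


if __name__ == "__main__":
    # Sample lines in the assumed layout.
    for raw in ["! a comment", "255 250 250\t\tsnow", "248 248 255\t\tghost white"]:
        parsed = parse_rgb_line(raw)
        if parsed is not None:
            name, rgb = parsed
            print(name, to_hex(rgb))
```
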

# Country Codes

/usr/share/zoneinfo/iso3166.tab

A list of two character ISO 3166 country codes, followed by the full name of the country.

The file used to be distributed with the timezone info, but was dropped recently.

# Languages

http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt

The file is pipe-delimited with five fields:

• 3 letter "bibliographic" code
• 3 letter "terminologic" code; sometimes absent, in which case it is the same as the "bibliographic" code; when the two differ, the "terminologic" code shares letters with the 2 letter code; see the ISO 639-2 FAQ.
• 2 letter code; sometimes absent
• English name for language
• French name for language
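
The five pipe-delimited fields can be read with a short Python sketch. The names Language and parse_language_line are hypothetical, and the two sample rows are written in the format described above. The sketch applies the rule that an absent "terminologic" code means it equals the "bibliographic" one. When reading the real file, opening it with encoding="utf-8-sig" is a cautious choice because it tolerates a leading byte order mark.

```python
from typing import NamedTuple, Optional


class Language(NamedTuple):
    bibliographic: str
    terminologic: str      # falls back to the bibliographic code
    alpha2: Optional[str]  # 2 letter code, None when absent
    english: str
    french: str


def parse_language_line(line: str) -> Language:
    """Parse one pipe-delimited ISO 639-2 record."""
    fields = line.rstrip("\r\n").split("|")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    bib, term, alpha2, english, french = fields
    return Language(bib, term or bib, alpha2 or None, english, french)


if __name__ == "__main__":
    # Sample rows in the described format.
    for raw in ["fre|fra|fr|French|français", "ace|||Achinese|aceh"]:
        print(parse_language_line(raw))
```
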

# Time Zones

/etc/localtime

A symlink to a binary file in /usr/share/zoneinfo. The target of the symlink indicates the time zone the computer is in. Here is how to inspect the file from the command line:

$ zdump /etc/localtime
/etc/localtime  Tue Dec 30 10:03:27 2014 PST

/usr/share/zoneinfo/zone.tab

The first column is the two character ISO 3166 country code and joins with /usr/share/zoneinfo/iso3166.tab.

The second column is the latitude and longitude of the "principal location" of the time zone. The coordinates are in ±DDMM±DDDMM or ±DDMMSS±DDDMMSS format. Positive latitude is north of the equator and positive longitude is east of the prime meridian.

The third column is the name of the time zone. There should be a file in /usr/share/zoneinfo matching this name; e.g. /usr/share/zoneinfo/America/Los_Angeles.

The file /usr/share/zoneinfo/America/Los_Angeles and similar such files are in a binary format. To see a dump of the data:

$ zdump -v America/Los_Angeles


Mostly what one sees in the output are daylight saving time changes. For each change there will be UTC times for two successive seconds with the corresponding local times. The output also contains the earliest and latest possible times, as well as one day after the earliest time and one day before the latest time.
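
Returning to zone.tab, the coordinate column can be converted to decimal degrees with a short Python sketch. It assumes tab-separated columns, # comment lines, and the two layouts ±DDMM±DDDMM and ±DDMMSS±DDDMMSS. The names parse_coordinates, Zone, and parse_zone_line are hypothetical, and the sample row is illustrative rather than copied from a particular tzdata release.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Sign, degrees, minutes, optional seconds; latitude first, then longitude.
_COORD = re.compile(r"^([+-])(\d{2})(\d{2})(\d{2})?([+-])(\d{3})(\d{2})(\d{2})?$")


def _to_degrees(sign: str, deg: str, minutes: str, seconds: Optional[str]) -> float:
    value = int(deg) + int(minutes) / 60 + int(seconds or 0) / 3600
    return -value if sign == "-" else value


def parse_coordinates(text: str) -> Tuple[float, float]:
    """Convert a zone.tab coordinate string to (latitude, longitude)."""
    match = _COORD.match(text)
    if match is None:
        raise ValueError(f"unrecognized coordinates: {text!r}")
    g = match.groups()
    return _to_degrees(*g[0:4]), _to_degrees(*g[4:8])


class Zone(NamedTuple):
    country: str
    latitude: float
    longitude: float
    name: str


def parse_zone_line(line: str) -> Optional[Zone]:
    """Parse one zone.tab row; return None for blank and # comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns: {line!r}")
    latitude, longitude = parse_coordinates(fields[1])
    return Zone(fields[0], latitude, longitude, fields[2])


if __name__ == "__main__":
    # Illustrative row; a fourth comments column is allowed and ignored.
    print(parse_zone_line("US\t+340308-1181434\tAmerica/Los_Angeles\tPacific"))
```

The country field of each Zone can then be looked up in a table built from iso3166.tab, which is the join mentioned above.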

# Characters

ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt

Not actually a Unix file, perhaps because it is 1.3M in size and grows with each revision to the Unicode standard. Each line contains information about a Unicode character. The fields are semicolon delimited:

• Code point
• Name
• General_Category
• Canonical_Combining_Class
• Bidi_Class
• Decomposition_Type/Decomposition_Mapping
• Numeric_Type/Numeric_Value (spread over three fields: decimal digit value, digit value, numeric value)
• Bidi_Mirrored
• Unicode_1_Name (obsolete)
• ISO_Comment (obsolete)
• Simple_Uppercase_Mapping
• Simple_Lowercase_Mapping
• Simple_Titlecase_Mapping

The General Category Values are two letter values. The meaning of the first letter is:

 C  other
 L  letter
 M  combining mark
 N  number
 P  punctuation
 S  symbol
 Z  separator

How to count the Unicode characters by general category:

$ curl -O ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
$ awk 'BEGIN{FS=";"} {cnt[$3] += 1} END{for (i in cnt) print i, cnt[i];}' UnicodeData.txt | sort

The file is ASCII, and doesn't show what the characters look like. One can display a Unicode point in zsh with:

$ echo $'\u03bb'


For ASCII lookups, don't forget about man ascii.
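
The same per-category count can be sketched in Python, with a roll-up by the first letter of the category. The names count_categories and count_major are hypothetical; the sample lines follow the semicolon-delimited layout of UnicodeData.txt. One caveat: large blocks such as the CJK ideographs appear in the file only as a First/Last pair of lines, so a line count understates the number of assigned characters.

```python
from collections import Counter
from typing import Iterable

# Meaning of the first letter of a General_Category value.
MAJOR = {
    "C": "other",
    "L": "letter",
    "M": "combining mark",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "Z": "separator",
}


def count_categories(lines: Iterable[str]) -> Counter:
    """Count lines by the third semicolon-delimited field (General_Category)."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[2]] += 1
    return counts


def count_major(counts: Counter) -> Counter:
    """Roll two letter categories up to their first-letter class."""
    major: Counter = Counter()
    for category, n in counts.items():
        major[MAJOR.get(category[:1], "unknown")] += n
    return major


if __name__ == "__main__":
    sample = [
        "0020;SPACE;Zs;0;WS;;;;;N;;;;;",
        "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
        "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
        "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    ]
    counts = count_categories(sample)
    for category in sorted(counts):
        print(category, counts[category])
    print(dict(count_major(counts)))
```

To run it on the real file, pass an open file object, for example count_categories(open("UnicodeData.txt")).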

# Locales

One can run

$ locale

to see the current locale. The output is a list of environment variables. Usually one sets the LANG environment variable and allows that value to be inherited by the other environment variables. One can run

$ locale -a


to see the list of supported locales.
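
The way LANG is inherited by the other variables can be sketched as a lookup order. This is a simplified model of the POSIX rules, not the C library's implementation: LC_ALL overrides everything, then the category's own variable, then LANG. The function name effective_locale is hypothetical, and falling back to "C" when nothing is set is an assumption, since the default is implementation-defined.

```python
import os
from typing import Mapping


def effective_locale(category: str, env: Mapping[str, str]) -> str:
    """Resolve the locale for a category such as "LC_TIME".

    Empty values are treated as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"  # assumed default when nothing is set


if __name__ == "__main__":
    for category in ("LC_CTYPE", "LC_COLLATE", "LC_TIME", "LC_MESSAGES"):
        print(category, effective_locale(category, os.environ))
```
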

On Ubuntu, the default locale is set in /etc/default/locale. Ubuntu ships with a sparse list of "compiled" locales, since each takes 50M of space. Use

$ sudo locale-gen

to "compile" a locale.

# Astronomy

If Plan 9 from User Space is installed, one gets a command line ephemeris tool called astro. A common way to install the Plan 9 tools is by putting a script called 9 in PATH:

$ 9 man astro

$ 9 astro -klp

The script 9 defines an environment variable called PLAN9 which points to the directory containing the installed code.

The file $PLAN9/sky/constelnames contains the 88 constellations with their abbreviations in the first column and their full names in the second column, separated by tabs.

The file $PLAN9/sky/estartab contains data about 2158 stars on the ecliptic. It is better to download the PPM star data, however.

http://tdc-www.harvard.edu/catalogs/ppm.html

The PPM Star Catalog contains information about 468,861 stars. HEASARC provides an ASCII version with pipe delimited fields. The tdat format is described at

http://heasarc.gsfc.nasa.gov/docs/software/dbdocs/tdat.html

Some of the interesting fields:

  1  PPM serial number
  2
  3  apparent magnitude
  4  spectral type
  5  right ascension
  6  declination
  7  galactic longitude
  8  galactic latitude
  9
 10
 11  proper motion RA sec/yr
 12  proper motion declination "/yr
 13
 14
 15
 16
 17
 18
 19
 20  SAO catalog designation
 21  Henry Draper catalog designation
 22  AGK3 catalog designation
 23  Cape Photographic DM designation
 24
 25

# Shakespeare

$ wc ~/Local/etc/shakes.txt
124787  904061 5589889 /Users/clark/Local/etc/shakespeare.txt