a tour of command line tools not installed by default but available in the package manager

http://clarkgrubb.com/package-managers

general | editors | version control | programming | debugging | images | documents | scientific | web | cloud | cryptography | compression | databases | appendix

Installation on Mac

for MacOS 12 Monterey

Use the App Store to install Xcode.

Install Homebrew.

$ brew install awscli bash coreutils emacs htop jq node pandoc pari \
  pkg-config pup pyenv python3 r shellcheck tcpflow the_silver_searcher tmux tree

$ pip3 install pycodestyle pylint virtualenv mypy

Some Mac apps can be installed with brew:

$ brew install 1password adobe-creative-cloud affinity-publisher \
  google-chrome java microsoft-office rstudio slack sync

$ brew install --cask docker emacs

Installation on Ubuntu

for Ubuntu 20.04 LTS

$ sudo apt-get install -y awscli curl emacs gcc gdb git htop jq make nodejs npm \
  openjdk-17-jdk openssl pari-gp pkg-config python3.8-dev ruby shellcheck silversearcher-ag \
  ssh strace tcpflow tmux tree

$ pip3 install pycodestyle pylint virtualenv mypy

GUI apps managed by apt:

$ sudo apt-get install -y chromium-browser
$ sudo apt-get remove -y openoffice*.* thunderbird unity-webapps-common libreoffice* \
  software-center rhythmbox totem ubuntu-software

Servers managed by apt:

$ sudo apt-get install -y elasticsearch mariadb-server mongodb-server postgresql redis-server

Summary

brew package apt-get package pip package command line
ant ant ant
apache-spark pyspark
spark-shell
spark-submit
sparkr
apt-file apt-file
asciidoc asciidoc asciidoc
aspell aspell aspell -a < PATH
aspell -c PATH
automake automake autoconf
automake
awscli awscli aws
bash bash
bison bison bison
check check
chromedriver chromium-chromedriver chromedriver
cmake cmake cmake
checkstyle checkstyle checkstyle
colordiff colordiff colordiff
cppcheck cppcheck cppcheck
coreutils gfactor
gsort
cppunit libcppunit-dev
cssc sccs
ctags exuberant-ctags ctags
curl curl
cvs cvs cvs
dash dash dash
ditaa ditaa ditaa
docker-machine docker
elasticsearch
emacs emacs emacs
flex flex flex
ffmpeg ffmpeg ffmpeg
gawk gawk
gcc gcc
gdb gdb gdb
git git
global global global
gtags
glpk glpk-utils glpsol
gnu-sed gsed
gnupg gnupg gpg2
gnuplot gnuplot5 gnuplot
gradle gradle gradle
graphviz graphviz dot
hexedit hexedit hexedit
hive hive
htop htop htop
imagemagick imagemagick composite
convert
identify
montage
ivy ivy ivy
openjdk-8-jdk jar
java
javac
jenkins jenkins jenkins
jq jq jq
librsvg rsvg-convert
libtool libtool
m4 m4 m4
make make
makedepend xutils-dev makedepend
markdown markdown markdown
mathics mathics
maven maven mvn
mercurial mercurial hg
mariadb mariadb-client mysql
mongodb mongodb-clients mongoexport
mongoimport
mongo
nano nano
nmap nmap nmap
nmon nmon
node
npm
openssl openssl openssl
packer packer
pandoc pandoc pandoc
parallel parallel parallel
pari pari-gp gp
pig pig
pkg-config pkg-config pkg-config
postgresql postgresql-client psql
potrace potrace potrace
pmd pmd
presto presto
pssh pssh pssh
pup pup
pv pv pv
python pyenv
python
python3
python3.8-dev
virtualenv
pip
python
virtualenv
qpdf qpdf qpdf
rcs rcs rcs
redis redis-tools redis-cli
rlwrap rlwrap rlwrap
screenfetch screenfetch screenfetch
shellcheck shellcheck shellcheck
splint splint splint
sqlite sqlite3 sqlite3
strace strace
dtruss
subversion svn
tcpflow tcpflow tcpflow
the_silver_searcher silversearcher-ag ag
tig tig tig
tmux tmux tmux
tree tree tree
valgrind valgrind
vim vim
w3m w3m w3m
webp webp webp
xpdf xpdf pdftotext
xz xz-utils unxz
xz
xzcat

General

ag | bash | colordiff | coreutils | docker | emacs | git | jenkins | m4 | make | nano | parallel | pv | rlwrap | screenfetch | tree

ag

ag is a C implementation of the Perl script ack.

ag is similar to grep -r PATTERN $PWD. However, it skips files listed in any .gitignore or .hgignore files it finds. It also skips binary files and hidden files, i.e. files whose names start with a dot.

An .agignore file can be created to ignore additional files. Also, the --ignore flag can be given a file name pattern.

Use the -a flag to disable all of the above rules for ignoring files.

The ag flags -A, -B, -c, -C, -F, -i, -l, -L, -o, -v, -w have the same effect as the corresponding flags for grep.

coreutils

Many of the command line tools on Mac are BSD versions. If you are used to the GNU/Linux versions, some flags you expect might be missing.

Installing coreutils on Mac will install command line tools with a g prefix to avoid conflicts: e.g. gcut, gecho, gjoin, ...

The gsort command has some features which the sort installed on Mac does not. See the -R flag for randomly shuffling a file, and the --parallel flag for using multiple cores when sorting.

todo: what about gawk and gnu-sed

docker

http://clarkgrubb.com/docker

jenkins

m4

https://www.gnu.org/software/m4/manual/m4.html

m4 is a language neutral solution for substituting values into templates. Here is an example:

$ cat ls-home.m4
#!/bin/sh
cd HOME && ls

$ m4 -DHOME=$HOME ls-home.m4 > ls-home

$ chmod +x ls-home

You might think that sed could do the same thing. However this doesn't work:

$ sed s/HOME/$HOME/ < ls-home.m4

The problem is after the shell expands $HOME, it probably has slashes in it, which are special to sed.
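
A common workaround is to give sed a delimiter that does not occur in the value. The sketch below assumes the value contains no | character; the template text and the path are inlined example values. sed still treats & and backslashes in the replacement specially, so this is only as safe as those assumptions.

```shell
# Example template and value, inlined so the sketch is self-contained.
template='cd HOME && ls'
dir=/Users/clark   # assumed example path; note the slashes

# With | as the delimiter, the slashes in $dir no longer end the pattern early.
result=$(printf '%s\n' "$template" | sed "s|HOME|$dir|")
printf '%s\n' "$result"
```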

It is possible to prevent a string from getting expanded:

$ cat ls-home.m4
#!/bin/sh
echo `HOME' is HOME
cd HOME && ls

$ m4 -DHOME=$HOME ls-home.m4 > ls-home

$ chmod +x ls-home

You might be tempted to use the C preprocessor, i.e. cpp or gcc -E, but the C preprocessor tends to raise errors for text which doesn't conform to the rules of C code.

The ` and ' are the string delimiting characters in M4. You can escape them by doubling them, e.g.

``foo''

will expand to

`foo'

Escaping just one of the characters is harder. You must temporarily change the string delimiting characters:

`foo'changequote(`[',`]')`changequote([`],['])

will expand to

foo`

Note how "foo" must be quoted so the changequote builtin will be recognized.

Instead of passing a substitution value on the command line, one could define it in the file:

$ cat ls-clark.m4
#!/bin/sh
define(`dir', `/Users/clark')
ls dir

$ m4 ls-clark.m4 > ls-clark
$ chmod +x ls-clark
$ ./ls-clark

macros with arguments and recursion

Built-in macros:

  • define
  • eval
  • format
  • ifdef
  • ifelse
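
As a small sketch of a macro with an argument, the following defines a hypothetical macro named double and combines it with the eval and ifelse builtins of GNU m4. The template is written to a temporary file so that the shell does not interpret the M4 quote characters.

```shell
tmp=$(mktemp)
# The quoted heredoc delimiter keeps the shell from expanding $1 or the quotes.
cat > "$tmp" <<'EOF'
define(`double', `eval($1 * 2)')dnl
double(21)
ifelse(double(2), 4, `yes', `no')
EOF
# dnl discards the rest of the define line, so only two lines are printed.
out=$(m4 "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

The first output line is the value of double(21); the second is the branch chosen by ifelse after its arguments have been expanded.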

loops, including files, debugging

make

make is not installed by default on Ubuntu.

http://clarkgrubb.com/makefile-style-guide

The version of make that one gets on a Mac when installing the developer tools is 3.81, which is over ten years old. Installing make with brew puts an executable named gmake in the search path.

Version 3.82 of GNU make added the .SHELLFLAGS and .ONESHELL special variables.

Version 4.0 of GNU make added the $(guile ...) function for executing Guile code and the $(file ...) function for writing a value to a file. Guile support is a compile-time option, and Homebrew does not install a version of make with it enabled.

parallel

Generate 16 million random numbers:

$ yes | head -16000000 | awk '{print rand()}' > rand.txt

Generate 16 million numbers more quickly by forking off 16 separate pipelines:

$ for i in $(seq 0 15); do yes | head -1000000 | awk '{print rand()}' > rand.$i.txt & done

The example above executes a process for every word in a string. It is also possible to execute a process for every line in a file:

$ while read line; do echo "$line" & done < /etc/passwd

With xargs we can invoke 10 processes, but never execute more than 3 processes at the same time:

$ seq 11 20 | xargs -n 1 -P 3 sleep

The -P flag limits the number of concurrent processes. The -n flag controls the number of arguments passed to each command.

The -I flag can be used to define a string in the command which is replaced by the arguments. This can be used to illustrate the -n flag:

$ echo foo bar baz | xargs -n 1 -I {} echo wombat {} wumpus
wombat foo wumpus
wombat bar wumpus
wombat baz wumpus

$ echo foo bar baz | xargs -I {} echo wombat {} wumpus
wombat foo bar baz wumpus

If -n is not specified, xargs passes as many arguments to the command as it can without exceeding ARG_MAX.
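
The grouping done by -n can also be seen without -I. In this sketch, five input words are handed to echo two at a time, and a second pipeline counts the output of six one-argument invocations run at most three at a time.

```shell
# -n 2: each echo receives at most two arguments.
grouped=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
printf '%s\n' "$grouped"

# -n 1 -P 3: one argument per process, at most three processes at once.
# Output order is not guaranteed, so only the line count is examined.
count=$(printf '%s\n' a b c d e f | xargs -n 1 -P 3 echo | wc -l | tr -d ' ')
echo "$count invocations"
```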

parallel is a Perl script which acts as a drop-in replacement for xargs. It has many of the same flags as xargs, but with better default values. For example, the default value for -n is 1:

$ echo $'foo\nbar\nbaz' | parallel echo wombat {} wumpus
wombat foo wumpus
wombat bar wumpus
wombat baz wumpus

$ echo $'foo\nbar\nbaz' | parallel -n 3 echo wombat {} wumpus
wombat foo bar baz wumpus

Also, the default value for -P is the number of available cores, whereas the default value of -P for xargs is 1. Also, it is not necessary to use the -I flag with parallel to define {} as the argument replacement string.

parallel has many options and is perhaps a good choice for parallelizing work, but don't forget about make -j NUM which is analogous to xargs -P NUM and parallel -P NUM.

pv

A way to check up on the throughput one is seeing on a pipe:

$ yes | pv | cat > /dev/null

rlwrap

Here are a few tools without readline built in:

$ rlwrap ocaml
$ rlwrap racket
$ rlwrap sbcl

screenfetch

$ screenfetch
                 -/+:.          clark@ClarkMac
                :++++.          OS: 64bit Mac OS X 10.12.5 16F73
               /+++/.           Kernel: x86_64 Darwin 16.6.0
       .:-::- .+/:-``.::-       Uptime: 22d 19h 21m
    .:/++++++/::::/++++++/:`    Packages: 217
  .:///////////////////////:`   Shell: zsh 5.2
  ////////////////////////`     Resolution: 2880x1800
 -+++++++++++++++++++++++`      DE: Aqua
 /++++++++++++++++++++++/       WM: Quartz Compositor
 /sssssssssssssssssssssss.      WM Theme: Blue
 :ssssssssssssssssssssssss-     Font: SFMono-Regular
  osssssssssssssssssssssssso/`  CPU: Intel Core i7-6920HQ @ 2.90GHz
  `syyyyyyyyyyyyyyyyyyyyyyyy+`  GPU: AMD Radeon Pro 460 / Intel HD Graphics 530
   `ossssssssssssssssssssss/    RAM: 10092MiB / 16384MiB
     :ooooooooooooooooooo+.
      `:+oo+/:-..-:/+o+/-
$ screenfetch
                          ./+o+-       clark@ubuntu
                  yyyyy- -yyyyyy+      OS: Ubuntu 16.04 xenial
               ://+//////-yyyyyyo      Kernel: x86_64 Linux 4.8.0-51-generic
           .++ .:/++++++/-.+sss/`      Uptime: 13d 14h 51m
         .:++o:  /++++++++/:--:/-      Packages: 1855
        o:+o+:++.`..```.-/oo+++++/     Shell: bash 4.3.48
       .:+o:+o/.          `+sssoo+/    Resolution: 1680x1050
  .++/+:+oo+o:`             /sssooo.   DE: Unity 7.4.0
 /+++//+:`oo+o               /::--:.   WM: Compiz
 \+/+o+++`o++o               ++////.   WM Theme: Ambiance
  .++.o+++oo+:`             /dddhhh.   GTK Theme: Ambiance [GTK2/3]
       .+.o+oo:.          `oddhhhh+    Icon Theme: ubuntu-mono-dark
        \+.++o+o``-````.:ohdhhhhh+     Font: Ubuntu 11
         `:o+++ `ohhhhhhhhyo++os:      CPU: 2x Intel Core i7-6920HQ CPU @ 2.903GHz
           .o:`.syhhhhhhh/.oo++o`      RAM: 2316MiB / 3932MiB
               /osyyyyyyo++ooo+++/
                   ````` +oo+++o\:
                          `oo++.

tree

Try this to get an HTML page for browsing Python code:

$ tree -I '*.pyc' -H $(pwd) . > ~/python_file.html

Selected options:

$ tree --help
  ------- Listing options -------
  -d            List directories only.
  -l            Follow symbolic links like directories.
  -f            Print the full path prefix for each file.
  -x            Stay on current filesystem only.
  -L level      Descend only level directories deep.
  -I pattern    Do not list files that match the given pattern.
  -------- File options ---------
  -p            Print the protections for each file.
  -u            Displays file owner or UID number.
  -g            Displays file group owner or GID number.
  -s            Print the size in bytes of each file.
  -h            Print the size in a more human readable way.
  -D            Print the date of last modification or (-c) status change.
  -F            Appends '/', '=', '*', '@', '|' or '>' as per ls -F.
  --inodes      Print inode number of each file.
  --device      Print device ID number to which each file belongs.
  ------- Sorting options -------
  -t            Sort files by last modification time.
  -r            Reverse the order of the sort.
  ------- Graphics options ------
  -i            Don't print indentation lines.
  ------- XML/HTML/JSON options -------
  -X            Prints out an XML representation of the tree.
  -J            Prints out an JSON representation of the tree.
  -H baseHREF   Prints out HTML format with baseHREF as top directory.

Editors

emacs | nano | hexedit | vim

"text mode" editors which can be run inside a terminal or on the far side of an ssh connection

http://hyperpolyglot.org/text-mode-editors

emacs

Mac OS ships with Emacs, but it is an old version.

hexedit

A binary file editor.

F1 for help.

nano

Nano is easier to learn than emacs or vim because it displays the available commands at the bottom of the screen.

vim

A "multi-modal" editor.

Version Control

colordiff

A colorized wrapper to diff. It uses the same flags as diff.

$ sed 's/root/ROOT/' < /etc/passwd > /tmp/passwd
$ colordiff /etc/passwd /tmp/passwd

cvs

git

git is not installed by default on Ubuntu.

http://hyperpolyglot.org/version-control

hg

rcs

sccs

svn

tig

Programming

c/c++ | java | javascript | python | shell

c/c++

https://github.com/clarkgrubb/sample-c-project

  • autoconf
  • automake
  • bison
  • check
  • cmake
  • cppcheck
  • cppunit
  • ctags
  • flex
  • gcc
  • libtool
  • makedepend
  • pkg-config
  • splint
  • valgrind

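bison

global

Build a tag index for the C source files, then look up the definition of main and the references to main:

$ find . -name '*.[ch]' | gtags -f -
$ global -x main
$ global -xr main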

pkg-config is used like this:

$ gcc -o test test.c $(pkg-config --libs --cflags libpng)

The package manager is supposed to put .pc files in a location that pkg-config knows about, either /usr/lib/pkgconfig or /usr/local/lib/pkgconfig.

java

https://github.com/clarkgrubb/sample-java-project

javascript

https://github.com/clarkgrubb/sample-javascript-project

$ npm install -g grunt
$ npm install -g gulp

python

Interpreter version and package management tools:

$ pyenv install --list

$ pyenv install 3.7.7

$ virtualenv --python=/Users/clark/.pyenv/versions/3.7.7/bin/python ve

$ . ve/bin/activate

$ pip install click pytest requests

$ pip freeze > requirements.txt

$ pip install -r requirements.txt

Static analysis tools:

$ pycodestyle foo.py

$ pylint foo.py

$ mypy --disallow-untyped-defs foo.py

pylint uses ~/.pylintrc in the absence of a --rcfile command line option.

shell

shellcheck is a static analyzer for shell scripts, implemented in Haskell. It uses the shebang line to determine the shell flavor.

$ shellcheck foo.sh

The version of bash that ships with the Mac is old, so we use brew to install an up-to-date version and rely on the Homebrew bin directory (/usr/local/bin on Intel Macs, /opt/homebrew/bin on Apple Silicon) being ahead of /bin in the PATH. Bash scripts should have a

#!/usr/bin/env bash

shebang.
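
A script that depends on newer bash features can check which interpreter it got. The following is a minimal sketch; require_bash_major is a hypothetical helper name, and the optional second argument exists only so the check can be tried against an explicit version string.

```shell
#!/usr/bin/env bash
# require_bash_major REQUIRED [VERSION]
# Succeeds when the major version of VERSION (default: the running bash)
# is at least REQUIRED. Hypothetical helper, not part of bash.
require_bash_major() {
  required=$1
  version=${2:-${BASH_VERSION:-0}}
  major=${version%%.*}          # keep the text before the first dot
  [ "$major" -ge "$required" ]
}

if require_bash_major 4; then
  echo "bash $BASH_VERSION is new enough"
else
  echo "bash 4 or later is required" >&2
fi
```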

If you want a shell script to be portable, you can run bash with the --posix flag, but it is better to run the script with dash.

Only use external programs in this list:

https://www.gnu.org/software/make/manual/make.html#Utilities-in-Makefiles

Also, only use POSIX mandated flags for those external programs:

http://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html

Debugging

gdb | htop | nmon | strace

gdb

$ gcc -g -o foo foo.c

Three ways to start gdb:

$ gdb CMD
$ gdb CMD PID
$ gdb CMD CORE

On Mac OS X, use lldb instead of gdb.

command               gdb
help                  h
list                  l [first,last]
next statement        n
step into function    s
set breakpoint        b [file:]line
list breakpoints      i b
delete breakpoint     d num
continue              c
show backtrace        bt
move up stack         up
move down stack       do
print expression      p expr
(re)run               r [arg1 [arg2 ...]]
quit debugger         q

htop

Some features require htop to be started as root.

Config file at ~/.config/htop/htoprc.

                            keys               notes
help                        h ? F1
setup                       S
cursor control              ↓ ↑ → ← F / 0-9    F: stay on process; 0-9: search by pid
filter processes            \ u                \: by command name; u: by user
tag processes               <SPC> c U          c: tag process and children
view toggles                H t p + -          H: threads; p: path; t: process tree; + -: show/hide tree parts
sort processes by resource  P M T > I          P: CPU; M: Memory; T: Time; I: Invert
inspect process             e s l              e: environment; s: strace; l: lsof
control process             k [ ] i a          k: kill; [ ]: nice; i: i/o priority; a: cpu affinity

nmon

strace

Trace the syscalls of a command to be launched; of a running process identified by a PID; also trace child processes; set the maximum length of string arguments displayed:

$ strace ls
$ strace -p 7191
$ strace -f
$ strace -s120

Timestamp each syscall; give elapsed time for each syscall:

$ strace -ttt
$ strace -T

Summarize counts of syscalls at exit; summarize counts and also suppress normal syscall output:

$ strace -C ls
$ strace -c ls

There is a tool called dtruss, part of dtrace, which is similar in function to strace. To use it, System Integrity Protection must be disabled on the Mac: reboot in recovery mode (hold down ⌘R), run csrutil disable in the terminal, then reboot again.

Images

For examples of using the image processing tools, see http://clarkgrubb.com/image-tools

ffmpeg

A tool for editing video and audio.

imagemagick

The package provides composite, convert, montage, and identify, which do edits like what Photoshop can do.

potrace

A tool which can convert bitmap images to SVG.

rsvg-convert

A tool which can convert SVG images to a bitmap format.

webp

Convert a webp file to a png.

$ dwebp foo.webp -o foo.png

Documents

asciidoc | aspell | markdown | pandoc | xpdf

asciidoc

http://hyperpolyglot.org/lightweight-markup

A lightweight markup frontend for DocBook.

Create foo.html:

$ asciidoc foo.txt

aspell

Using aspell to interactively fix spelling errors. The original file is modified:

$ echo 'Four score and seven yeers ago' > /tmp/gettysburg.txt
$ aspell -c /tmp/gettysburg.txt

Using aspell non-interactively on a file stream:

$ echo 'Four score and seven yeers ago' | aspell list
yeers

$ echo 'Four score and seven yeers ago' | aspell -a
*
*
*
*
& yeers 85 21: years, yews, yes, year's, yew's, yeas, beers, jeers, leers, peers, seers, veers, yea's, yeahs, yes's, yours, eyes, yeses, Meyers, tees, Ayers, Byers, Myers, dyers, eye's, yaws, yer, queers, yrs, Yeats, yearns, beer's, deer's, jeer's, leer's, peer's, seer's, tee's, veer's, seer, yaw's, yeah's, year, yens, yeps, Yves, bees, ears, fees, gees, hers, lees, pees, sees, wees, yous, cheers, sheers, yeggs, yells, yen's, yep's, yetis, Lee's, bee's, fee's, lee's, pee's, see's, wee's, Meyer's, you's, Dyer's, Yeager's, dyer's, yore's, Er's, queer's, Meier's, ear's, her's, Cheer's, cheer's, sheer's, yegg's

markdown

http://hyperpolyglot.org/lightweight-markup

A Perl script which converts Markdown to HTML.

$ echo '* foo' | markdown
<ul>
<li>foo</li>
</ul>

pandoc

A tool implemented in Haskell for converting Markdown to a variety of formats, including HTML:

$ echo '* foo' | pandoc
<ul>
<li>foo</li>
</ul>

To PDF:

$ echo '* foo' | pandoc -o output.pdf

To man page:

$ cat > foo.1.md
% FOO(1) Foo User Manual
% John Smith
% November 8, 2012

# NAME

foo - a command line tool

# SYNOPSIS

foo [*file*]...

# DESCRIPTION

Foo performs an operation on files specified on the command line.  If no files
are specified the operation is performed on *stdin*.

# SEE ALSO

`bar` (1).

$ pandoc -s -w man foo.1.md -o foo.1

To HTML slideshow:

$ cat > gnomes.txt
# Business Plan

* collect underpants
* ?
* profit

# Phase 1: Collect Underpants

3:30 AM is the optimal time.

# Phase 2: ?

Details TBD

# Phase 3: Profit

Financial results will be discussed in the earnings call.

$ pandoc gnomes.txt -s --webtex -i -t slidy -o gnomes.html

qpdf

$ qpdf --help
$ qpdf --stream-data=uncompress input.pdf output.pdf
$ qpdf --linearize input.pdf output.pdf

xpdf

xpdf is an X Windows PDF viewer. Installing it also installs pdftotext which extracts plain text from a PDF.

$ pdftotext foo.pdf foo.txt

Scientific

ditaa

ditaa converts ASCII art to box-and-arrow diagrams such as produced by Visio and Omnigraffle.

http://clarkgrubb.com/image-tools#ditaa

glpk

glpsol solves linear programming problems.

http://hyperpolyglot.org/misc-math
http://clarkgrubb.com/optimization

gnuplot

gnuplot creates graphics such as histograms, box plots, scatterplots, and run charts.

http://hyperpolyglot.org/misc-math
http://clarkgrubb.com/image-tools#gnuplot

graphviz

dot converts DOT files, which are a notation for graphs, into PNG files.

http://clarkgrubb.com/image-tools#graphviz

mathics

mathematica compared

Mathics supports a subset of the Mathematica language.

pari

http://hyperpolyglot.org/more-computer-algebra

This package installs a command line calculator with arbitrary length integers, rationals, arbitrary precision decimals, complex numbers, vectors, matrices, and polynomials. It has many built-in functions for number theory.

To use it interactively:

$ gp

Non-interactively:

$ echo '1 + 1' | gp -q
2

Web

curl

curl -A USER_AGENT
curl -b COOKIE_NAME=DATA
curl -c COOKIE_JAR_FILE
curl -d KEY=VALUE [-d KEY=VALUE ...]
curl -d @DATA_FILE
curl -D HEADER_DUMP_FILE
curl -e REFERER
curl -F FORM_KEY=VALUE
curl -F FORM_KEY=@FILE
curl -G      (send data as GET)
curl -H HEADER [-H HEADER ...]
curl -i      (put HTTP header in output)
curl -I      (HEAD request)
curl -K CONFIG_FILE
curl -L      (follow redirects)
curl --limit-rate BPS
curl -o OUTPUT_FILE
curl -O      (name local file after path of remote file)
curl -s      (no progress bar or error messages)
curl -sS     (no progress bar, but error messages are shown)
curl -T FILE  (use PUT to upload FILE)
curl -u USER:PASSWORD
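
The flags are usually combined. The sketch below uses a file:// URL pointing at a temporary file so that it runs without network access; with an http or https URL the same flags apply. It assumes the installed curl has file:// support enabled.

```shell
src=$(mktemp)
dst=$(mktemp)
printf 'hello from a local file\n' > "$src"

# -sS: no progress bar, but error messages are still shown
# -L:  follow redirects (has no effect on file:// URLs; usually wanted for http)
# -o:  write the response body to the named file
curl -sS -L -o "$dst" "file://$src"

fetched=$(cat "$dst")
printf '%s\n' "$fetched"
rm -f "$src" "$dst"
```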

nmap

A port scanner.

pup

Use CSS selectors to extract elements from HTML. Implemented in Go.

https://github.com/EricChiang/pup

$ curl http://google.com | pup body

tcpflow

To dump the IP packets:

$ sudo tcpdump [-i INTERFACES] -w dump.pcap [PCAP_FILTER]
$ tcpdump [-v|-A] -r dump.pcap

The first command continues listening until interrupted with ^C.

Run ifconfig to see the available interfaces. INTERFACES can be a comma separated list. If none are specified, the default choice usually includes the main interface.

A filter can be used to reduce the number of packets captured. See man pcap-filter. The basic predicates of the filter language are:

tcp|udp|icmp
[dst|src] host IP_ADDRESS
[dst|src] port PORT
not PRED
PRED and PRED
PRED or PRED
( PRED )
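
The predicates compose into a single filter expression. In this sketch the address and ports are placeholder values, and the capture command is printed rather than executed because capturing needs root and a live interface.

```shell
# 192.0.2.10 is a documentation-only address used as a placeholder.
noisy_host=192.0.2.10
filter="tcp and ( dst port 80 or dst port 443 ) and not src host $noisy_host"

# Single-quote the filter on the command line so the shell
# does not interpret the parentheses.
capture_cmd="sudo tcpdump -w dump.pcap '$filter'"
printf '%s\n' "$capture_cmd"
```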

The file dump.pcap is in the pcap next generation capture file format which tcpflow can also read.

The -v flag prints two lines per packet instead of one and includes information about the protocol (UDP, TCP).

The -A flag prints the payload of the protocol in ASCII.

tcpflow does a better job of reassembling the data transmitted over the protocols. It is invoked as:

$ tcpflow [-r PCAP_FILE | -i INTERFACE] [-o DIR] [-a]

tcpflow can read from a PCAP_FILE or an INTERFACE. If no DIR is specified, it creates files in the working directory.

tcpflow creates files with names of the form

SRC_IP_ADDRESS.PORT-DST_IP_ADDRESS.PORT

identifying the TCP connection whose data it contains.

w3m

Start w3m:

$ w3m http://google.com

Use arrows to position the cursor. Press <ENTER> to follow the link below the cursor. If a form input element is below the cursor, press <ENTER> and then type text, which will appear at the bottom of the screen. Type q to exit w3m.

w3m can be used to render an HTML document in plain text:

$ curl https://www.google.com > google.html
$ w3m -dump google.html

Cloud

awscli | packer | pssh | tmux

awscli

Much of what can be done in the AWS console can also be done at the command line.

To get set up, go to IAM in the AWS console and create an access key. Then use this command to write the access key ID and secret access key to your ~/.aws/credentials file:

$ aws configure

Copy everything under an S3 prefix to the current directory:

$ aws s3 cp --recursive s3://MY_BUCKET/images/ .

Getting help:

$ aws help
$ aws s3 help
$ aws s3 cp help

packer

$ cat example.json
{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": "",
    "branch": "master"
  },
  "builders": [{
    "type": "amazon-ebs",
    "access_key": "{{user `aws_access_key`}}",
    "secret_key": "{{user `aws_secret_key`}}",
    "region": "us-west-2",
    "source_ami": "ami-efd0428f",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "ubuntu.16.04 python3.5 openmail {{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "sleep 30",
      "cat /etc/*elease",
      "sudo apt-get update",
      "sudo apt-get install -y gcc g++ git make python3.5-dev virtualenv",
      "ssh-keyscan github.com >> ~/.ssh/known_hosts",
      "mkdir ~/.aws",
      "echo '[default]' > ~/.aws/credentials",
      "echo 'aws_access_key_id = {{user `aws_access_key`}}' >> ~/.aws/credentials",
      "echo 'aws_secret_access_key = {{user `aws_secret_key`}}' >> ~/.aws/credentials",
      "git clone git@github.com:THE_USER/THE_PROJECT.git",
      "cd THE_PROJECT",
      "git checkout {{user `branch`}}",
      "virtualenv -p python3.5 ../ve",
      ". ../ve/bin/activate",
      "pip install -r /tmp/requirements.txt",
      "make -k test"
    ]
  }]
}

$ packer validate example.json

$ packer build -var 'aws_access_key=YOUR ACCESS KEY' \
  -var 'aws_secret_key=YOUR SECRET KEY' \
  example.json

pssh

Execute a command on multiple remote hosts:

$ pssh -i -A -H joe@host1 -H joe@host2 echo "hello world"

The -i flag causes stdout and stderr to be displayed. The -A flag prompts for a password.

tmux

tmux and screen compared

When using ssh to connect to a remote host, if the network connection is lost, the shell on the remote host will be sent a SIGHUP. The shell will in turn send a SIGHUP to any jobs that are running. The default behavior is for the jobs to exit.

To prevent loss of network connection from interrupting a long running batch job, there are a couple of things the user can do:

$ nohup sleep 1000 &

$ sleep 1000 &
[1]  + running    sleep 1000
$ disown %1

nohup writes stdout to the file nohup.out.
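
To choose the log file instead of getting nohup.out, redirect the streams explicitly. This is a minimal sketch with a short stand-in job; the wait is there only so the result can be inspected, whereas in a real session one would log out instead.

```shell
log=$(mktemp)
# Redirecting stdin, stdout and stderr means nohup does not create nohup.out
# and the job holds no reference to the terminal.
nohup sh -c 'sleep 1; echo finished' > "$log" 2>&1 < /dev/null &
pid=$!
echo "started job $pid, logging to $log"

wait "$pid"             # demo only: block until the job exits
result=$(cat "$log")
printf '%s\n' "$result"
rm -f "$log"
```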

A tmux session will survive the loss of a network connection. Start a tmux session with:

$ tmux

If disconnected, log back in to the remote server and attach to the session with:

$ tmux attach

Cryptography

gnupg

Setup

$ gpg2 --generate-key
$ gpg2 --list-keys
$ cp /etc/passwd passwd.txt
$ gpg2 --clear-sign passwd.txt
$ gpg2 --verify passwd.txt.asc

openssl

$ openssl dgst -sha256 < /etc/passwd
$ openssl base64 < /bin/ls
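
The same two subcommands can be used in a script. The sketch below hashes a small temporary file and round-trips it through base64. The label printed before the hash differs between OpenSSL versions, so only the last field of the output is kept.

```shell
msg=$(mktemp)
printf 'hello\n' > "$msg"

# Keep only the hex digest, whatever prefix this openssl build prints.
digest=$(openssl dgst -sha256 < "$msg" | awk '{ print $NF }')

# Encode to base64 and decode again with -d.
encoded=$(openssl base64 < "$msg")
decoded=$(printf '%s\n' "$encoded" | openssl base64 -d)

echo "sha256: $digest"
echo "base64: $encoded"
echo "decoded: $decoded"
rm -f "$msg"
```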

Compression

xz

suffix  compress paths  uncompress paths  uncompress stdin
.gz     gzip            gunzip            gzcat
.bz2    bzip2           bunzip2           bzcat
.xz     xz              unxz              xzcat

$ tar xf linux-3.17.1.tar.xz
$ find linux-3.17.1 -name '*.[ch]' | xargs cat > /tmp/big.txt
$ wc -c /tmp/big.txt
 482314125 /tmp/big.txt
$ time gzip /tmp/big.txt
gzip /tmp/big.txt  15.19s user 0.19s system 99% cpu 15.425 total
$ wc -c /tmp/big.txt.gz
 105127193 /tmp/big.txt.gz
$ time gunzip /tmp/big.txt.gz
gunzip /tmp/big.txt.gz  1.45s user 0.23s system 98% cpu 1.701 total

        size (bytes)  ratio   compress time (s)  uncompress time (s)
gzip    105127193     21.79%  15.425             1.701
bzip2   82311952      17.07%  43.773             16.617
xz      70984840      14.72%  222.770            6.308
zip     105138420     21.80%  15.632             3.294
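
The ratio column can be reproduced for any file with a few lines of shell. This sketch uses gzip only and a small generated file of repeated lines, so its ratio will be far smaller than the figures above. It also checks that decompressing from stdin returns the original number of bytes.

```shell
src=$(mktemp)
# 10000 identical 20-byte lines: highly compressible sample data.
yes 'the quick brown fox' | head -n 10000 > "$src"

orig=$(wc -c < "$src" | tr -d ' ')
comp=$(gzip -c "$src" | wc -c | tr -d ' ')
ratio=$(awk -v c="$comp" -v o="$orig" 'BEGIN { printf "%.2f%%", 100 * c / o }')

# gunzip -c reads compressed data from stdin and writes to stdout.
roundtrip=$(gzip -c "$src" | gunzip -c | wc -c | tr -d ' ')

echo "$orig -> $comp bytes ($ratio), $roundtrip bytes after round trip"
rm -f "$src"
```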

Databases

and map/reduce tools

elasticsearch

http://clarkgrubb.com/elasticsearch

hive

jq

jq language syntax

mongo

mysql

pig

postgresql

presto

redis

spark

sqlite

Appendix

$ brew tap dart-lang/dart
$ brew install cabal-install coq dart elixir erlang gforth ghc gnu-apl go groovy guile idris kotlin leiningen \
  lua maxima ocaml opam racket rust sbcl sbt scala scalastyle swi-prolog typescript

$ brew install --cask pharo sage

brew package apt package command line
cabal-install cabal-install cabal install
clojure1.6 clojure
coq coq coqtop
dart dart
elixir elixir
erlang erlang erl
erlc
fish fish fish
gforth gforth gforth
ghc ghc ghc
ghci
gnu-apl apl
go golang go
groovy groovy2 groovysh
guile guile-2.0 guile
idris idris
kotlin kotlinc
leiningen lein repl
lua lua5.3 lua
maxima maxima maxima
ocaml ocaml ocaml
opam opam opam
perl perl
cpan
php php
racket racket racket
ruby ruby
gem
rake
irb
erb
rust rustc rustc
sbcl sbcl sbcl
sbt sbt
scala scala scala
scalastyle scalastyle
swi-prolog swi-prolog swipl
tcl tclsh
typescript node-typescript tsc