buffer:

command history:

directory: find

file: grep | sed | tr

files in directory: xargs | ag | ack

file system: mdfind | locate

windows:

syntactic: ctags | etags | gtags | cgrep | pygmentize

from an editor: emacs | vim

repository: git | mercurial

package: apt-file | yum provides

web: google

fuzzy: tre-agrep

Buffer

also search and replace

Command History

Directory

find [-L|-x] PATH ... [EXPR] For every file in the specified PATHs print the file name. If any of the PATHs are directories, recursively print the file names of all files contained in the directory. Each file name is printed as a path which begins with the PATH under which the file was found.

If an EXPR is provided, only print the names of files for which EXPR is true. The EXPR is built up of primaries and operators. A primary is of the form -FLAG [ARG]. An EXPR consisting of multiple primaries is true if all of the primaries are true. Primaries which do not have side effects are documented in the following row.

With the -L flag follow symbolic links. The default behavior is to evaluate EXPR on the symbolic link, not the file it points to. With the -x flag don't follow symbolic links or mount points into a file system with a different device number.
find [-E] PATH ...
  [ -empty |
  -gid ID |
  -group NAME |
  -maxdepth N |
  -mindepth N |
  -mtime (-|+)N(smhdw) |
  -[i]name PATTERN |
  -newer FILE |
  -nogroup |
  -nouser |
  -[i]path PATTERN |
  -perm (-|+)OCTAL |
  -[i]regex PATTERN |
  -size N[k|M|G|T|P] |
  -type (b|c|d|f|l|p|s) |
  -uid ID |
  -user NAME] ...
Print all file names recursively in the specified PATHs if all of the primaries evaluate to true.

The -empty primary is true if the file or directory is empty.

The -gid ID primary is true if the group id associated with the file is ID.

The -group NAME primary is true if the name associated with the group id for the file is NAME.

The -maxdepth N primary is true if the file is no more than an N-fold child of one of the PATHs. If N is 0 then the file must be one of the PATHs. If N is 1 then the file must be one of the PATHs or immediately contained in one of the PATHs.

The -mindepth N primary is true if the file is an N-fold or greater child of one of the PATHs.

The -mtime (-|+)N(smhdw) primary is true if the last modification time is less than (-) or more than (+) N units of time ago. The unit of time can be seconds, minutes, hours, days, or weeks. Days is the default.

The -[i]name PATTERN primary is true if PATTERN matches the last component of the file name. The file globbing characters *, ?, and [ ] can be used but must be escaped from the shell. The -iname primary performs a case-insensitive match.

The -newer FILE primary is true if the file has a more recent modification time than FILE.

The -nogroup and -nouser primaries are true if the group id or user id associated with the file is not recognized.

The -[i]path PATTERN primary is true if PATTERN matches the path of the file name. The file globbing characters *, ?, [ ] can be used but must be escaped from the shell. The -ipath primary performs a case-insensitive match.

The -perm (-|+)OCTAL primary is true if all (-) or any (+) of the bits in OCTAL are also set in the file mode. See chmod for a description of the file mode bits.

The -[i]regex PATTERN primary is true if ...

The -size N[k|M|G|T|P] primary is true if ...

The -type (b|c|d|f|l|p|s) primary is true if the file type is block special (b), character special (c), directory (d), regular file (f), symbolic link (l), FIFO (p), or socket (s).

The -uid ID primary is true if the user id of the file owner is ID.

The -user NAME primary is true if the owner of the file is NAME.
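A minimal sketch of a few of these primaries on a scratch tree. The directory layout and file names are made up for illustration, and only primaries that behave the same in BSD and GNU find are used:

```shell
# Build a scratch tree (hypothetical names) to try the primaries on.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/lib"
echo 'int main(void) { return 0; }' > "$tmp/src/main.c"
echo 'void helper(void) {}' > "$tmp/src/lib/helper.c"
: > "$tmp/src/empty.c"            # zero-length file
echo 'notes' > "$tmp/README"

# Regular files ending in .c at any depth. The glob is quoted so that
# the shell passes it to find unexpanded.
c_files=$(find "$tmp" -type f -name '*.c' | sort)

# The same search, but no deeper than two levels below the PATH.
shallow=$(find "$tmp" -maxdepth 2 -type f -name '*.c' | sort)

# Empty regular files only.
empty=$(find "$tmp" -type f -empty)

printf '%s\n' "$c_files"
printf '%s\n' "$shallow"
printf '%s\n' "$empty"
rm -rf "$tmp"
```

Multiple primaries are implicitly joined with -and, so each search above prints only the files for which every primary is true.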
find PATH ... [EXPR]
  (-print|-print0|-ls)
With -print print the file names [for which EXPR is true] one file name per line. This is the default behavior if no primaries which have side effects are specified.

With the -print0 option print the file names [for which EXPR is true] separated by NUL characters. This output can be processed by xargs -0. The -print0 flag is necessary when using find and xargs together and the file names contain spaces, single quotes, or double quotes.

With the -ls option print a long listing for each file name [for which EXPR is true]. The information in the long listing is almost identical to the information in ls -lis.
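A small sketch of the -print0 and xargs -0 pairing, using made-up file names which contain a space and a single quote:

```shell
tmp=$(mktemp -d)
echo 'needle' > "$tmp/plain.txt"
echo 'needle' > "$tmp/has space.txt"
echo 'needle' > "$tmp/it's quoted.txt"

# NUL-delimited names reach grep intact, whatever characters they contain.
matched=$(find "$tmp" -type f -name '*.txt' -print0 | xargs -0 grep -l needle | sort)
printf '%s\n' "$matched"

count=$(printf '%s\n' "$matched" | wc -l | tr -d ' ')
echo "$count"
rm -rf "$tmp"
```

With plain -print and xargs without -0, the second name would be split at the space and the quote in the third would be treated as xargs quoting.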
find PATH ... [EXPR] -delete Remove the files and directories in the PATHs recursively for which EXPR is true.
find PATH ... [EXPR]
  (-exec|-ok) CMD {} \;
Run CMD on each file name in the PATHs recursively for which EXPR is true. Each time CMD is invoked it gets a single argument in the {} slot.
find PATH ... [EXPR]
  (-exec|-ok) CMD {} +
Run CMD on the file names in the PATHs recursively for which EXPR is true. Each time CMD is invoked it gets multiple file names in the {} slot.
find PATH ... [EXPR]
  (-execdir|-okdir) CMD {} \;
Run CMD on each file name in the PATHs recursively for which EXPR is true. Run CMD inside the directory containing the file, putting the basename of the file in the {} slot.
find PATH ... [EXPR]
  (-execdir|-okdir) CMD {} +
Run CMD on each file name in the PATHs recursively for which EXPR is true. Run CMD inside the directory containing the file, putting the basenames of multiple files in the {} slot.
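One way to see the difference between the \; and + terminators is to count how many times CMD runs. This sketch uses echo as CMD on three made-up files:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# With \; echo runs once per file, so it prints three lines.
per_file=$(find "$tmp" -type f -exec echo {} \; | wc -l | tr -d ' ')

# With + the names are batched. Three short names fit in one
# invocation, so echo prints a single line.
batched=$(find "$tmp" -type f -exec echo {} + | wc -l | tr -d ' ')

echo "$per_file $batched"
rm -rf "$tmp"
```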
find PATH ... \( EXPR \) Print the file names in the PATHs recursively for which EXPR is true.
find PATH ... (\!|-not) EXPR Print the file names in the PATHs recursively for which EXPR is not true.
find PATH ... EXPR1 -and EXPR2 Print the file names in the PATHs recursively for which EXPR1 and EXPR2 are true.
find PATH ... EXPR1 -or EXPR2 Print the file names in the PATHs recursively for which EXPR1 or EXPR2 is true.
_____________________________________

Find all files with .js suffix, but don't descend into any directory named foo or any directory named bar:

find . -name foo -prune -o -name bar -prune -o -name '*.js' -print
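A sketch which checks the pruning idiom on a scratch tree with made-up names. The last branch ends in an explicit -print, because when no action is given find applies -print to the whole expression and the pruned directories foo and bar are listed as well:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/foo" "$tmp/bar" "$tmp/src"
touch "$tmp/foo/skip.js" "$tmp/bar/skip.js" "$tmp/src/app.js" "$tmp/top.js"

# Only .js files outside foo and bar are printed.
found=$(cd "$tmp" && find . -name foo -prune -o -name bar -prune -o -name '*.js' -print | sort)
printf '%s\n' "$found"
rm -rf "$tmp"
```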

File

grep

grep [-F|-E|-P] PATTERN [FILE ...] Write lines in FILE ... which match PATTERN to standard output. If no files are specified read from standard input. With the -F, -E, or -P flag PATTERN is interpreted as a fixed string, extended regex, or Perl regex, respectively. Otherwise it is interpreted as a basic regex.
grep [-v] [-i] PATTERN [FILE ...] Grep FILE ... or standard input. With -v flag print lines which do not match PATTERN. With -i flag perform case insensitive match.
grep [-b] [-n] [-h|-H] PATTERN
  [FILE ...]
With the -H, -n, and -b flags the filename, line number, and byte offset of the start of the line are displayed. Filenames are displayed by default when grepping more than one file, but this behavior is suppressed with the -h flag.
grep -C NUM PATTERN [FILE ...] Display NUM lines before and after the matching line.
grep [-l|-L] PATTERN [FILE ...] With -l flag only the names of files containing a match are printed. With the -L flag only the names of files which don't contain a match are printed.
grep -q PATTERN [FILE ...] No output. Exit status is 0 if there was a match and 1 otherwise.
grep -c PATTERN [FILE ...] Print the number of lines which match.
grep -o PATTERN [FILE ...] Only print the portion of the line which matches PATTERN. If there are multiple matches on the line, print each match on a separate line.
grep -r PATTERN [PATH ...] If PATH is a directory recursively grep all files it contains.
grep --color=(always|auto) PATTERN
  [FILE ...]
Write lines containing PATTERN to stdout and highlight PATTERN in red. When the --color flag argument is auto, only highlight when stdout is a terminal.
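A short sketch of several of the flags above on a three-line sample file. The file contents are made up for illustration:

```shell
tmp=$(mktemp -d)
printf 'foo foo bar\nbaz\nFoo\n' > "$tmp/sample.txt"

matching=$(grep -c foo "$tmp/sample.txt")          # number of lines containing foo
ignoring_case=$(grep -ci foo "$tmp/sample.txt")    # same, ignoring case
inverted=$(grep -v foo "$tmp/sample.txt")          # lines which do not contain foo
numbered=$(grep -n baz "$tmp/sample.txt")          # match prefixed by its line number
occurrences=$(grep -o foo "$tmp/sample.txt" | wc -l | tr -d ' ')  # one line per match

# -q prints nothing; only the exit status reports whether there was a match.
if grep -q baz "$tmp/sample.txt"; then quiet=found; else quiet=missing; fi

echo "$matching $ignoring_case $numbered $occurrences $quiet"
rm -rf "$tmp"
```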
_____________________________________

use sed to search and replace; how to insert a newline

tr

Files in Directory

find

Four ways to search for the entry point in a C repository:

$ find . -name '*.c' | xargs grep main

$ find . -name '*.c' -exec grep main {} \;

$ find . -name '*.c' -exec grep main {} +

$ grep --include '*.c' -r main .

In the second example grep gets called for each file, so it is slower than the other examples.
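A sketch which runs the approaches above on a scratch tree and compares the results. The file names are made up, and -l is added to each grep so that every approach prints file names, which makes the outputs directly comparable:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src"
printf 'int main(void) { return 0; }\n' > "$tmp/src/prog.c"
printf 'void util(void) {}\n' > "$tmp/src/util.c"
printf 'main is mentioned here\n' > "$tmp/notes.txt"   # not a .c file

via_xargs=$(find "$tmp" -name '*.c' | xargs grep -l main | sort)
via_exec=$(find "$tmp" -name '*.c' -exec grep -l main {} \; | sort)
via_exec_plus=$(find "$tmp" -name '*.c' -exec grep -l main {} + | sort)
via_grep_r=$(grep --include '*.c' -rl main "$tmp" | sort)

printf '%s\n' "$via_xargs" "$via_exec" "$via_exec_plus" "$via_grep_r"
rm -rf "$tmp"
```

Without -l, the \; variant prints matching lines with no file name prefix, since each grep invocation sees only one file.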

xargs [-0] [-I {}] [-n NUM] [-P NUM]
  CMD [ARG ...]
Read arguments from standard input and invoke CMD on them. The arguments parsed from standard input are assumed to be whitespace delimited (i.e. space, tab, or newline), unless the -0 flag is used, in which case they are null character delimited. The -0 flag is used with the -print0 flag of find.

If one or more ARGs are provided, these are given to CMD as arguments with each invocation, and arguments parsed from standard input are appended after them. If the -I {} flag is used, then any occurrence of {} within the ARGs is replaced by the arguments parsed from standard input.

The number of standard input arguments passed to each invocation of CMD can be controlled by the -n NUM flag. By default as many as possible will be passed to CMD, as long as the command line does not exceed a large but system dependent number of bytes.

The -P NUM flag controls the maximum number of invocations of CMD that can run in parallel. The default is 1.
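A brief sketch of the -n, -I, and -P flags with echo as CMD. The input words are arbitrary:

```shell
# -n 2: at most two arguments per invocation, so echo runs three times.
pairs=$(printf 'a b c d e\n' | xargs -n 2 echo)

# -I {}: one invocation per input line, with the line substituted for {}.
renamed=$(printf 'one\ntwo\n' | xargs -I {} echo "file-{}.bak")

# -P 4: up to four invocations at once. The output order is not
# guaranteed, so it is sorted before counting.
parallel=$(printf '1 2 3 4\n' | xargs -n 1 -P 4 echo | sort | wc -l | tr -d ' ')

printf '%s\n' "$pairs" "$renamed" "$parallel"
```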
_____________________________________

ag

Installing ag on Mac OS X; installing ag on Ubuntu; cloning the GitHub repository:

$ brew install ag

$ sudo apt-get install silversearcher-ag

$ git clone https://github.com/ggreer/the_silver_searcher

Running this:

$ ag foo

is equivalent to running this:

grep -P -r foo .

except that by default it ignores binary files and hidden files. Also it will ignore files listed in a .gitignore, .hgignore, or .agignore file.

The -G flag is used to narrow the files searched to those matching a pattern. It plays the same role as the --include flag of grep, except that FILE_PATTERN is a regular expression rather than a glob.

The -Q flag interprets PATTERN as a fixed string and is equivalent to the -F flag of grep.

$ ag [-G FILE_PATTERN] [-Q] PATTERN

The following flags work the same way as with grep:

$ ag -A NUM -B NUM -C NUM  -i -l -L -m NUM -v -w

The following flags control which files are searched. The -f flag causes symlinks to be followed.

The --hidden flag causes hidden files (starting with a period) to be searched.

The --ignore FILE_PATTERN flag specifies files to be ignored.

The -u flag causes files matching patterns in .gitignore, .hgignore, or .agignore to be searched.

$ ag -f --hidden --ignore FILE_PATTERN  -u

Flags which control output appearance:

$ ag --no-numbers --pager 'less -R'

ack

ack flags that are different from ag:

ag            ack
-f            --follow
-G            none?
--ignore      --ignore-files
--no-numbers  none?
-u            none?

File System

mdfind

On Mac OS X, mdfind provides a command line interface to Spotlight:

$ mdfind foo                      # search for files containing foo
$ mdfind -name foo                # search for file names containing foo
$ mdfind "foo bar"                # search for files containing foo and bar
$ mdfind -onlyin ~/Documents foo  # restrict search to ~/Documents

locate

On Linux, there is a package called mlocate, which provides a utility called updatedb for updating the index and locate for searching it:

$ sudo updatedb
$ locate stdio.h

The config file controlling which files get indexed and the index itself are here:

/etc/updatedb.conf
/var/lib/mlocate/mlocate.db

Windows

Syntactic

ctags and etags

Exuberant Ctags

ctags is an early BSD tool which created a file, usually named tags, which could be read by vi to find the definition of identifiers. ctags could parse C, and later Fortran and Pascal.

etags was an equivalent tool for Emacs. The file it creates is usually named TAGS.

Exuberant Ctags is a later version of ctags that was initially part of vim. It can parse 41 languages. When run with the -e flag it will produce a TAGS file that can be read by Emacs.

gtags

How to index files; how to search for definitions; how to search for uses of a symbol:

$ find . -name '*.php' | gtags -f -

$ global -x setup_iterator

$ global -xr setup_iterator

cgrep

http://awgn.github.io/cgrep/

pygmentize

$ pip install Pygments

From an Editor

emacs

If grep is run from within Emacs using M-x grep, then Emacs will display the results in a buffer with links to the locations of the matches in their original files. For this to work, the filenames and the line numbers must be displayed by grep, which is why the -nH flags are passed to grep.

This variable contains a regular expression which Emacs uses to extract the filename and line number from grep output:

grep-regexp-alist

M-x find-grep works the same way as M-x grep, except that the default shell invocation is a pipeline with find, xargs, and grep.
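A sketch of the output shape that the -nH flags produce, which is what Emacs has to parse. The file name and contents are made up:

```shell
tmp=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmp/one.txt"

# -H forces the file name even though only one file is searched, and
# -n adds the line number, giving FILE:LINE:TEXT.
hit=$(cd "$tmp" && grep -nH beta one.txt)
echo "$hit"
rm -rf "$tmp"
```

Without -H, grepping a single file would print only 2:beta, leaving no file name to link to.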

                                                                          grep-mode  compile-mode
move to next match in grep (compile) buffer                               M-n        M-n
move to previous match in grep (compile) buffer                           M-p        M-p
open source of match (error) under point in separate buffer               C-c C-c    C-c C-c
stop grep (compile)                                                       C-c C-k    C-c C-k
move to next match in grep buffer and open source in separate buffer      n
move to previous match in grep buffer and open source in separate buffer  p

The results are displayed in a buffer in grep-mode, which is a customization of compilation-mode. Both have C-c C-c to go to the match (error) under the point and C-c C-k to stop the grep (compile) process.

vim

Repository

git

The following two invocations are similar:

$ git grep Kohana::config
$ grep -r Kohana::config .

The difference is that git grep only searches tracked files. In particular, it will not search inside the .git directory. git grep thus behaves like ag.

git log -S, which is called the pickaxe, is used to search the history. To create a small git repository for testing the pickaxe, put these commands into a script and run it:

mkdir test-pickaxe
git init test-pickaxe
cd test-pickaxe

echo '# INTRO' > README
git add README
git commit -m 'initial commit'

echo >> README
echo 'A demo of pickaxe.' >> README
git add README
git commit -m 'add text to README'

grep -v INTRO < README > README.new
mv README.new README
git add README
git commit -m 'rm INTRO header'

Find commits which introduced or removed the string INTRO:

$ git log -S INTRO

Also display the commit diffs:

$ git log -p -S INTRO

Display the commits one per line:

$ git log --pretty=oneline -S INTRO

mercurial

The Mercurial command hg grep searches the revision history like git log -S, not the working tree like git grep.

Package

apt-file

It is possible to search a package manager for a file which is not installed.

On Ubuntu and Debian:

$ sudo apt-get install apt-file
$ apt-file update
$ apt-file search foo.h

yum provides

On CentOS and Fedora:

$ yum provides foo.h

Web

google

synonyms: ~people
exact phrase "foo bar"
number range $300..$500, $300..

related:
filetype:
site:

flight times: united 337
ups and fedex packages:
time ZIP, sunrise ZIP, sunset ZIP, weather ZIP, pizza ZIP, traffic ZIP
time CITY, sunrise CITY, sunset CITY, weather CITY

Fuzzy

tre-agrep

There is a tool called agrep (approximate grep) which was popular.

Bitap algorithm. Manber-Wu algorithm extends it for fuzzy matching.

tre-agrep is an open source version of agrep using the tre library. It can be used to find all matches that are within a certain Levenshtein distance.

$ sudo apt-get install

$ find / -name '*.txt' | xargs tre-agrep -i kaitlyn

Notes

defining the project

  • path: C-c p f
  • recursive grep: M-x ag [FIXME: must be correct directory]
  • symbol definition: M-. [inside project will use gtags and offer to build indices]
  • symbol reference

installing projectile and ggtags

There is an ag.el library, and projectile provides a key-binding which uses it.

I don't use ag.el because I have a hard enough time remembering the flags for command-line ag, and I don't want to also have to remember all the command names provided by ag.el.

I use the following custom function which runs ag and displays the results in grep-mode with clickable links. It gives an opportunity to edit the ag invocation and add flags:

make it prompt for the directory, using the current directory as an editable default:

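(defun ag (command-args)
  "Run ag with COMMAND-ARGS and show the matches in a grep-mode buffer."
  (interactive
   ;; Offer the default invocation for editing before it is run.
   (let ((default "ag --nocolor --literal --smart-case --nogroup --column -- "))
     (list (read-shell-command "Run ag (like this): "
                               default
                               'grep-history
                               default))))
  ;; grep-mode turns the FILE:LINE prefixes in the output into links.
  (compilation-start command-args 'grep-mode))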

building ag, gnu global, exuberant ctags

$ brew install the_silver_searcher
$ brew install ctags-exuberant
$ brew install --with-exuberant-ctags global