3 minute read

If you’ve used grep to search for text or patterns in files, you’ll love ripgrep - a command-line utility tool written in Rust. By default, ripgrep will respect gitignore rules and automatically skip hidden files/directories and binary files. ripgrep is grep on steroids. It’s super fast for searching patterns within single files and huge directories of files. In this blog, I’ll help you get started with using ripgrep and hope it’ll help you become more productive on the command-line.

Installation and sample data

The first thing you’ll do is install ripgrep. It has first class support on Windows, macOS and Linux. Choose one of many installation options or you can build it from source. Fortunately, the binary is not called ripgrep; it’s rg.

I have generated some sample server data which I’ll use to test drive ripgrep. Feel free to download this public gist to play along. Each mock-server-dataX.json file has 1000 random server data and mock-server.conf file has a sample PostgreSQL configuration data.

Searching within a single file

rg port mock-server-data1.json

We’re telling ripgrep to search for the pattern “port” within a single file. You should see a 1000-line output since every object in that JSON array contains the “port” field.

Let’s try something specific:

rg 9778b33b-2c49-4340-a9b5-320339052ec0 mock-server-data1.json

When searching for a particular UUID within the file mock-server-data1.json, you should see only one row as a result. This is helpful. But does this UUID exist in other files?

Since ripgrep searches recursively by default, you don’t have to do anything special other than omitting the individual file name.

rg 9778b33b-2c49-4340-a9b5-320339052ec0

Since you didn’t specify a file name, ripgrep will search the current directory and sub-directories for files that contain the UUID 9778b33b-2c49-4340-a9b5-320339052ec0.

Recursive search

Not only ripgrep color-codes the search result, it also tells us the line number on the files where the pattern appears. Let’s validate this finding:

vim +203 mock-server-data3.json

The above command opens mock-server-data3.json on line 203. Did you know this vim shortcut? :set nu will confirm that the UUID 9778b33b-2c49-4340-a9b5-320339052ec0 is shown on this line.

The test data has some errors. Some of the fields contain emails although there isn’t a field for email. Let’s use regex to find these anomalies:

rg -e "[a-zA-Z0-9_]+@[a-z]+.[a-z]+" .

The -e flag is the regex option and the above is a poorly built email regex. The output is as follows:

./mock-server-data2.json
104:   "password": "test@test.com"

./mock-server-data3.json
152:   "password": "test@test.com"
288:   "password": "info@info.ca"

./mock-server-data1.json
48:       "password":"test@test.com"
112:       "password":"test@test.com"

We can see that some of the password fields incorrectly contain emails. Theoretically, it’s possible that someone might be using the email as the password. But considering that these are randomly generated server passwords, we’ll treat these as errors.

gitignore and hidden files

ripgrep won’t show these file types .rgignore, .gitignore, and .ignore in search results. Although you can disable all ignore-related fltering with the --no-ignore flag. ripgrep will also respect repository specific rules found in $GIT_DIR/info/exclude, as well as any global ignore rules in your core.excludesFile.

Let’s test this out:

rg tcp_keepalives_count

The output:

mock-server.conf
84:#tcp_keepalives_count = 0		# TCP_KEEPCNT;

Let’s add mock-server.conf file to .gitignore, .ignore, or .rignore.

Now if you re-run the previous search, there wouldn’t be any output since ripgrep is filtering the mock-server.conf file out of the search.

Performance

To get the maximum performance, ripgrep runs in a multi-threaded way which means that the result shown will not be in the same order for the same search running multiple times. The way ripgrep sorts the output is based on whichever file gets searched first. You can pass the --sort flag to sort the output which will come at the cost of some performance.

The line number and color cording are not the main selling point (it’s open-source so no one’s selling you anything :wink:) for ripgrep. Its main feature is being extremely fast and the author Andrew Gallant wrote a detailed blog on ripgrep benchmark.


I was inspired by Brodie Robertson and Jay LaCroix to use ripgrep so thank you both. If you’ve given ripgrep a try, please let me know how your experience was.