Announcing pss: a tool for searching inside source code
October 14th, 2011 at 11:29 am
What tool(s) do you use when you need to quickly search through a set of directories recursively, focusing only on C++ source code files (.cpp, .h, .hh and so on), looking for some string (or regular expression)? Oh, and if this search could also ignore some directories we really don’t want to look into, like .svn, all the better.
I think it would be interesting to see what programmers answer to this question. My guess is:
- Newbies – will have no idea ("err, just manually grep in each directory?"), or tell you to use an IDE's "find in files" command.
- Disciples of the Unix way will probably quickly produce a concoction of find and grep, connected with pipes and xargs (Quiz: what is the shortest such command to answer all the requirements from above?)
- Experienced users will likely pull a ready-made bash (or batch?) script that does this out of their toolbox, or say they use ack.
What is ack? Here’s a short description taken directly from its home page:
ack is a tool like grep, designed for programmers with large trees of heterogeneous source code. ack is written purely in Perl, and takes advantage of the power of Perl’s regular expressions.
Personally, I use ack. Or, more precisely, I had been using it until very recently, when I decided to write such a tool myself, in Python. This tool is called pss and is now publicly available (also on PyPI).
Here are some cool facts about pss:
- It searches directories recursively by default.
- It recognizes known file extensions for source code (for example, .c and .h files for C code) and lets you easily select which files you want to search (whether it’s all Python files, all C files, all C and Python files, etc.)
- You can search for patterns specified with regular expressions, and also use regular expressions to specify the file patterns to look at, in case the defaults aren’t enough.
- It ignores some well known temporary files and directories, as well as source-control directories such as .svn and .hg.
- It produces a terminal-friendly, colored output, on Windows too! Color is used to conveniently set apart file names from the matches within them, as well as the matching portion of each line (in case you hate to scan each line looking for the actual matching string).
- It contains a lot of options particularly helpful for searching source code.
- It plays well as part of the Unix command line, with options that make it suitable for taking part in pipe-connected chains, if required.
- All it requires to run (on Linux and Windows, although almost certainly on other platforms as well) is a Python installation (version 2.6 and up, including 3.x).
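To make the first few points concrete, here is a minimal sketch of the general technique: walk a tree, prune ignored directories, filter by extension and match a regular expression. This is not pss's actual code. The names `TYPE_EXTENSIONS`, `IGNORED_DIRS` and `search_tree` are hypothetical, the extension and directory tables are assumptions made for illustration, and the sketch targets Python 3 only.

```python
import os
import re

# Hypothetical tables for illustration; the real tool's lists may differ.
TYPE_EXTENSIONS = {
    "cpp": {".cpp", ".cc", ".cxx", ".h", ".hh", ".hpp"},
    "python": {".py"},
}
IGNORED_DIRS = {".svn", ".hg", ".git", "CVS"}


def search_tree(root, pattern, type_name):
    """Yield (path, line_number, line) for each line under root matching pattern."""
    regex = re.compile(pattern)
    extensions = TYPE_EXTENSIONS[type_name]
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in extensions:
                continue
            path = os.path.join(dirpath, name)
            # Undecodable bytes are replaced rather than raising an error.
            with open(path, errors="replace") as source:
                for lineno, line in enumerate(source, 1):
                    if regex.search(line):
                        yield path, lineno, line.rstrip("\n")
```

Something like `for path, lineno, line in search_tree(".", r"foo\d+", "cpp"): print(path, lineno, line)` would then list the matches. Assigning to `dirnames[:]` is what lets `os.walk` skip whole subtrees instead of filtering their files afterwards.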
pss clones ack’s functionality (implementing most of the features). The reason I decided to write and release it is mainly that Python is my language of choice, and installing Perl to run ack became a chore (chiefly on Windows machines, since on Linux Perl is usually installed by default). Really, the only reason I’ve been installing Perl on Windows boxes I had to work on in the past couple of years was to enable them to run ack.
Moreover, pss comes with a terminal-color library built-in, so unlike ack it doesn’t require installing any additional modules to nicely color its output on Windows (ack requires Win32::Console::ANSI).
I have some ideas for extending pss with extra features, and wanted to be able to do that in Python, without having to dust off my Perl skills. Other Pythonistas may find pss attractive for the same reason. pss is implemented in a very modular manner – the main script is just a thin wrapper over a library which can be used programmatically for a variety of purposes. In other words, pss is quite hackable.
Finally, pss just seemed like a cool project to do. Its existence is not meant to detract from ack in any way. I’ve been using and enjoying ack for many years – thanks to ack’s author Andy Lester for that!
October 14th, 2011 at 12:05
October 14th, 2011 at 12:13
I’ve been using grin (http://pypi.python.org/pypi/grin) for just this purpose – seems there could be some useful cross pollination of the two projects possible.
October 14th, 2011 at 12:25
ori,
Ouch, yeah, something like that. My wrists ache just from looking at that line
Tyberius,
I was actually looking at grin prior to starting this project. It has some different design goals. I do want ack-like behavior of just knowing which file extensions to look for, given a “type”. grin is closer to vanilla grep in this respect.
October 14th, 2011 at 12:33
Well, I guess I fall under your newbie category, because I always use my IDE’s “Find in files” / “Find in project”. For Visual Studio and IntelliJ IDEA those find tools have always worked wonderfully for me.
October 14th, 2011 at 12:44
Ron,
Yay, the flame-bait worked!
Seriously, an IDE only solves a part of the problem. I can think of two main reasons:
1. No single IDE will help when you have to work with many languages & file types in a single day (e.g. C++, Python, Javascript, XML and various textual config files)
2. When working from the command-line, it’s a shame to leave it in order to go to an IDE. A tool integrated into the command-line mixes in harmoniously with the working flow. Not that I want this to become an IDE vs. command-line war…
When working mostly on a single large project written in the same environment (e.g. Visual Studio for C++ code), I also use the Find in Files dialog because it’s integrated into the environment. pss integrates into a command-line environment.
October 14th, 2011 at 14:51
Sounds neat. I mostly use GNU global (and its emacs integration), though it has a different focus. It’s great for navigating through code, but not really meant for full searches. For that I use one of those find/grep/xargs-contraptions you mentioned, sitting in my history and getting bigger over time. Next time I’ll try yours instead.
October 14th, 2011 at 14:53
A colleague just introduced me to ack and having a Python-powered version that I can easily install to Windows when needed sounds great. Unfortunately sudo pip install pss failed for me on Ubuntu with the following error:
Running setup.py install for pss
error: file '/path/build/pss/scripts/pss' does not exist
October 14th, 2011 at 15:11
Pekka,
Yes, I’m aware of this problem which AFAIK happens only on Python 2.6, because of some known pip problem. I’m working to fix it really soon, but the installation from source should work fine anyway.
Update 15:37: it has now been fixed (in version 0.32) – pip installation should now work fine for Python 2.6.
October 14th, 2011 at 16:15
Fix confirmed also on my system. Thanks!
Seems to work great. Adding a few examples to --help might be a good idea, though.
October 14th, 2011 at 16:18
Pekka,
Thanks for checking!
Examples are available here: https://bitbucket.org/eliben/pss/wiki/Usage (this page is linked from the README). I prefer to keep the --help as concise as possible to serve as a reference for users after they gain the initial experience with the tool.
October 14th, 2011 at 16:59
Looks nice!
Unfortunately my first test was not too successful
So it means that the search term is always treated as a regex. That can complicate the process sometimes. Most of the time I’m searching only for normal “words”, not regex.
(Actually, most of the time I’m using TextWrangler/BBEdit to do this kind of search, simple and powerful and I can open the file directly.)
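(Editorial aside: the regex-versus-literal distinction raised here can be sketched in a few lines of Python. This is only an illustration and assumes nothing about how pss implements its own literal-search option; `compile_pattern` and its `literal` parameter are hypothetical names.)

```python
import re


def compile_pattern(pattern, literal=False):
    """Compile pattern as a regex, or as a plain string when literal is True."""
    # re.escape neutralizes metacharacters such as [ ] ( ) . so they match themselves.
    return re.compile(re.escape(pattern) if literal else pattern)


line = "value = array[0] + f(x)"

# As a regex, "array[0]" means "array" followed by the character "0",
# so it does not match the bracketed text in the line.
regex_hit = compile_pattern("array[0]").search(line)

# As a literal, the brackets are matched verbatim.
literal_hit = compile_pattern("array[0]", literal=True).search(line)
```

Here `regex_hit` is `None`, while `literal_hit` is a match object covering `array[0]`.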
October 14th, 2011 at 17:43
That’s great, Eli. I’m glad you like ack, and I’m glad that you made your own tool to scratch your own itch.
My plan for http://betterthangrep.com/ is to have all sorts of different tools that are better than grep, not just ack. When that time comes, I’d be glad to have pss on there.
October 14th, 2011 at 17:48
ori: Sure, you CAN write out a find/grep pipeline. No one is doubting your Unix chops. But why would you want to?
October 14th, 2011 at 19:23
Besides being Windows friendly, what benefits do I get over ack on Linux/Mac?
October 14th, 2011 at 21:01
Cool to see a python replacement, and cool to see a windows installation-friendly focus. But I think ack’s ability to run as a single file with no “installation” on unix is a killer feature that pss should emulate.
I suppose you can’t get around requiring an entire hg checkout for the script, but you should make sure it is possible to run the utility directly from the checkout, without requiring any installation. Custom system-wide pylib installations suck on systems that have package managers, and user-directory pylib installations are a pain if you use virtualenvs. I prefer to just have a src checkout of a utility and then symlink the main script into my ~/bin folder.
But pss fails when run without installation, because the script can’t find psslib when it’s run out of the checkout. I think all you have to do is move the script out of the ‘scripts’ subfolder to the root, and then the script will be able to import psslib without requiring psslib be installed to a site-directory.
Then people who don’t want to do a permanent install of pss can just do a hg checkout and add a softlink from ~/bin/pss to ~/utils/pss/pss (or wherever they stick the checkout). It’s also super easy to keep up to date then, just a single hg pull -u command.
October 14th, 2011 at 22:34
Etienne,
Yes, the search pattern is regex by default, and as you discovered, the -Q flag is the cure when all you want is a simple string. I think that the case you present is rare enough to not significantly affect the choice of regex as default – after all, if a literal string was the default, a flag would have to be passed each time a regex is required, and that would quickly become tiresome.
roy_hu,
There’s nothing more to it than I explained in the post itself, really. If you’re a Python programmer, it may appeal to you to have the tool written in Python thus lending itself to deeper understanding and modification. If you don’t care about that (which is totally legitimate), ack is a fine tool and does the work for you.
October 14th, 2011 at 22:36
Nick,
That’s easily solvable! All you have to do is run:
I actually have this attached to an alias pss, so whenever I run pss on my box it always uses the latest source version.
October 14th, 2011 at 23:18
Can you put in a --emacs or -n flag which makes it output
filename:lineno:…matching line…
That’s the same format used by ‘grep -n’ and ack and ‘ant -emacs’. Then it will work with emacs.
October 15th, 2011 at 02:57
Noticed two limitations:
- There’s no -l switch like in ack and grep.
- Colors should be turned off if sys.stdout.isatty() is False (like ack and grep do)
At least the latter was easy to implement:
https://bitbucket.org/eliben/pss/pull-request/2/highlight-colors-by-default-only-when
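(Editorial aside: the check described above can be sketched as follows. This is not the code from the pull request; `should_use_color` and `colorize` are hypothetical helpers, and the ANSI escape sequence is an assumption that only holds for terminals that understand ANSI codes.)

```python
import sys


def should_use_color(stream=None, force=None):
    """Return True when color codes should be written to stream."""
    if force is not None:
        # An explicit user choice overrides auto-detection.
        return force
    stream = sys.stdout if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    # Pipes, files and in-memory buffers report False (or lack isatty entirely).
    return bool(isatty and isatty())


def colorize(text, use_color):
    """Wrap text in a bold-green ANSI sequence when use_color is True."""
    return "\033[1;32m" + text + "\033[0m" if use_color else text
```

With this in place, output piped to another program comes out plain, while interactive use stays colored.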
October 15th, 2011 at 05:22
alias ag='ack-grep --python --ignore-dir=migrations'
October 15th, 2011 at 08:20
Leigh and Pekka,
It would be great if you could open Issues on the pss Bitbucket page (https://bitbucket.org/eliben/pss/issues) about these feature requests. Thanks in advance.
Jack,
Could you please elaborate? The context of your comment is unclear.
October 15th, 2011 at 10:47
Eli, you may want to look into including a “__main__.py” file in the top level directory, and potentially even publishing a drop in executable “pss” zip file.
Very cool idea, though.
October 16th, 2011 at 17:28
It does not apply to all projects, but git grep is another solution.
October 16th, 2011 at 20:23
Ummm… eclipse solves this pretty well
October 17th, 2011 at 15:46
I’ve been thinking of making such a program for a LONG time. Glad to see there is some interest in it.
Some requirements: Lightweight, NO indexing, NO background services, Open Source, NO copyright…
I always assumed Microsoft would add that feature to VS, but it hasn’t, as far as I know (and my knowledge is limited to VS2008 SP1).
Will probably make it some day, code will be available on “the code project”.
October 17th, 2011 at 16:36
cscope (mostly) does this too. It dates back all the way to the 80s at Bell Labs.
You put a cool spin on the idea though. Good work!
October 17th, 2011 at 17:21
For anyone who works under Windows, you might like to try Pipelines: http://www.tenfiftytwo.co.uk/pipelines.
October 18th, 2011 at 18:26
Exuberant ctags, anyone?
You can integrate it with Vim, too…
October 18th, 2011 at 22:38
Jon and Slinky,
ctags and cscope are great tools, and I actually use them both (integrated into Vim). But pss does not really compete with them directly, but rather co-exists with them.
October 19th, 2011 at 09:06
c(tags|scope) these are the tools from God himself!
October 19th, 2011 at 17:15
git grep ftw
October 20th, 2011 at 09:26
I’ve just tried pss with my cpp source code. That’s easy and helpful.
Thank you for the great work.
October 21st, 2011 at 11:37
u and Timo,
git grep appears to be useful for Git-based repositories, but obviously it’s not a general purpose solution. Besides, pss has more features.
October 27th, 2011 at 16:36
That’s a really nice tool, kudos!!
November 3rd, 2011 at 22:48
Okay, I’m late to the game. The shortest script is “grep -r ” – recursive grep. No need for find or xargs.
November 4th, 2011 at 07:36
bryane,
And how do you ignore certain directories you’re not interested in?
November 16th, 2011 at 11:54
For emacs users, there are a lot of integrated alternatives. I took (once) the time to install grep-o-matic http://www.emacswiki.org/emacs/GrepMode#toc16 and it works just great. Searching a keyword in a repository is now only a shortcut away.
November 16th, 2011 at 12:08
Olivier,
I agree editor-integrated search tools are sometimes more convenient, but not always – occasionally you’re just working in the terminal. Besides, creating an Emacs plugin for pss shouldn’t be hard – I think someone is already working on one for Vim.