SVN Searcher

Last time, we ended up with this, our pinnacle of achievement, to find all classes in our framework layer that instantiate non-PO public classes:

for c in `find -wholename '*/main/*.java' | xargs grep "public class [A-Z]" | sed -e "s/.*public class \([^ ]*\) .*/\1/"`; do find -wholename '*/main/*.java' | xargs egrep " = new $c" | grep --invert-match " = new [A-Za-z]*PO"; done

With SVN Searcher, we can replace that with this search:

+FileBody:"public class" AND +FileBody:" = new " AND +Name:/src/main/java/


Let’s assess the pros and cons of this:

  • + It doesn’t require checking out the world;
  • + It completes in seconds rather than minutes (though we could optimize our script to not do the same directory search multiple times, which would help it);
  • + It doesn’t require installing Cygwin (I doubt many of our developers have it installed);
  • – We can’t strain out the PO instantiations, because SVN Searcher’s underlying search technology, Lucene, doesn’t support wildcards in phrases:

    Lucene supports single and multiple character wildcard searches within single terms (not within phrase queries)…Note: You cannot use a * or ? symbol as the first character of a search.  (from the Lucene 2.4.0 Query Parser Syntax document)

    …so all those’ll show up too;

  • – In addition to the limitations on where a wildcard can be used, the Lucene query parsing syntax doesn’t support full regular expression searching.
  • 0 This one is a weakness of both approaches: I think SVN Searcher uses a non-real-time indexing approach, so the index you’re searching will be out of date if anyone has committed since it was last rebuilt. With the download-the-world approach you similarly have to remember to do an svn update or you’re searching an out-of-date working directory, and it’s still not real-time (you never know when someone might have just committed a new thing that depends on the thing you’re querying about…)
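
For what it’s worth, here is a rough sketch of the optimization mentioned above: walk the directory tree once, save the file list, and do a single grep pass for all the class names instead of one pass per class. The function name find_non_po_instantiations and the temp-file variables are made up for illustration, and I’m assuming GNU-style tools (xargs -0, grep -F -f, grep -H), which is what Cygwin ships.

```shell
# Hypothetical helper: same search as the one-liner, but the tree is walked once.
find_non_po_instantiations() {
    root=${1:-.}
    files=$(mktemp)      # list of matching .java files
    patterns=$(mktemp)   # one fixed string per class: " = new ClassName"

    # Walk the tree a single time and remember the result.
    find "$root" -path '*/main/*.java' > "$files"

    # Turn each "public class Foo ..." declaration into the pattern " = new Foo".
    tr '\n' '\0' < "$files" \
        | xargs -0 grep -h 'public class [A-Z]' \
        | sed -e 's/.*public class \([^ ]*\) .*/ = new \1/' \
        | sort -u > "$patterns"

    # One grep pass for all class names, then strain out the PO instantiations.
    tr '\n' '\0' < "$files" \
        | xargs -0 grep -H -F -f "$patterns" \
        | grep --invert-match ' = new [A-Za-z]*PO'

    rm -f "$files" "$patterns"
}
```

Call it as find_non_po_instantiations /path/to/checkout (or with no argument for the current directory). The output is grouped by file rather than by class, and a line that instantiates several public classes shows up once instead of once per class, so it isn’t byte-for-byte the same as the original loop.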


SVN Searcher could be a good first resort, for when I want to know where in the system a certain class is being used.  But I think there will be times when I need more power than it offers.

The download-the-world-and-execute-a-line-of-Greek approach is too clunky for many to feel comfortable with, and it’s going to get less workable for me too as we start having branches and tags in the repository.  I need a better solution too.
