My wife and I each have our own computer, plus there’s the old one in the basement that I use for recording projects (no kitchen computer, though!).

One computer runs a modern flavor of Linux; the one in the basement runs an older flavor of Linux, and I’ve cannibalized its CD-ROM drive, so I currently can’t install new programs from its installation CDs; and the other one runs Windows 95.

Sometimes it’s a challenge getting these computers to all be on speaking terms with each other!

The other night, I recorded a tape of my wife’s late grandmother reading James Whitcomb Riley, and rather than copy it up to the Windows 95 machine to burn a CD, I wanted to put it on the newer Linux box first, since I have a newer version of the Audacity sound editor there that has a noise eliminator filter I wanted to try out.

The basement computer makes available an SMB share that I usually access from Windows 95.  I thought there should be some command-line smb* command that would get me the recording file off the other computer’s share point.

There is: it’s smbget, and here’s how I used it:

smbget smb://host/sharename/path/to/file.ext

Then it prompted me for username and password (and maybe workgroup?  Though it defaulted to the right one).


Ultimate power!

I use Eclipse’s External Tools feature to run a lot of Maven goals: clean install, eclipse:clean eclipse:eclipse, etc.  I have an external tool configuration set up for each of these common commands.

Sometimes, though, I want to try out a goal I use less often.  I use Maven’s dependency:tree and dependency:list almost enough to justify a special external tool configuration for each… but from time to time I find out about other Maven plugins that I want to try out, and I would rather not clutter up the external tools list with a special entry for each one.

As an alternative, sometimes I try out these goals from the Windows command prompt, but it’s a pain to navigate to my project’s working directory each time before executing the command.  I probably experiment less with such commands than I would otherwise because of the effort to prepare to run them — I just get by without their output, without really thinking about it.

Yesterday, though, Keith was helping me get set up to do JBoss remote debugging in Eclipse (a whole ‘nother topic), and in the process of setting it up, one of us wondered if you could set up an external tool that just dropped you to a Windows command prompt so you could type whatever command you want.

You can!  Here’s my configuration, which I call Ultimate Power:


When I run this, it puts a Windows command prompt sitting in my project directory in a Console window, and now I can type my command:



SVN Searcher

Last time, we ended up with this, our pinnacle of achievement, to find all classes in our framework layer that instantiate non-PO public classes:

for c in `find -wholename "*/main/*.java" | xargs grep "public class [A-Z]" | sed -e "s/.*public class \([^ ]*\) .*/\1/"`; do find -wholename "*/main/*.java" | xargs egrep " = new $c" | grep --invert-match " = new [A-Za-z]*PO"; done

With SVN Searcher, we can replace that with this search:

+FileBody:"public class" AND +FileBody:" = new " AND +Name:/src/main/java/


Let’s assess the pros and cons of this:

  • + It doesn’t require checking out the world;
  • + It completes in seconds rather than minutes (though we could optimize our script to not do the same directory search multiple times, which would help it);
  • + It doesn’t require installing Cygwin (I doubt many of our developers have it installed);
  • – This doesn’t allow us to strain out the PO instantiations, because SVN Searcher’s underlying search technology, Lucene, doesn’t support wildcards in phrases:

    Lucene supports single and multiple character wildcard searches within single terms (not within phrase queries)…Note: You cannot use a * or ? symbol as the first character of a search.  (from the Lucene 2.4.0 Query Parser Syntax document)

    …so all those’ll show up too;

  • – In addition to the limitations on where a wildcard can be used, the Lucene query parsing syntax doesn’t support full regular expression searching.
  • 0 This one is a weakness of both approaches: I think SVN Searcher uses a non-real-time indexing approach, so the index you’re searching will not be up to date if anyone has committed since the last time the index was rebuilt; but with the download-the-world approach you similarly always have to remember to do an svn update or you’re searching an out-of-date working directory, and it’s still not real-time (you never know when someone might have just committed a new thing that depends on the thing you’re querying about…)


SVN Searcher could be a good first resort, for when I want to know where in the system a certain class is being used.  But I think there will be times when I need more power than it offers.

The download-the-world-and-execute-a-line-of-Greek approach is too clunky for many to feel comfortable with, and it’s going to get less workable for me too as we start having branches and tags in the repository.  I need a better solution too.

Finding the magic non-beans

Someone asked me yesterday what examples we have in the framework services “layer” where we provide functionality through a class that is not exposed as a Spring bean.  I could think of just a couple of examples off the top of my head, but both were kind of odd ones.  I wanted to search the framework services codebase to see if we have other uses of non-beans.

I reasoned that a distinctive characteristic of using a non-bean X is that you tend to see … = new X(... in code using the class.  Here’s what I decided I wanted:

  1. For *.java in the framework services code base, show me the names of the classes that are public (e.g., search for “public class”)
  2. For each class C in this list, search our entire code base for “ = new C(”

1. Finding the Public Classes

1.1. Listing the .java files

From the Windows command prompt I navigated to the directory of my workspace and issued a simple

dir /s /b *.java

This output the filenames in C:… format, which would normally be fine, except that the Cygwin utilities I use for this type of file processing don’t deal well with the backslashes.  They expect a more Unixy output.  I can provide that by using the following instead:

find -name "*.java"

This output the same filenames in ./…/path/to/the/ format.

1.2. Whittling away the non-public classes

Next, I wanted to search each file in the above list (281 .java files) for the string “public class”.  (I could probably have skipped this step and just chopped the .java from the filenames to yield the class names, except that we have several package-private classes that I don’t care about for purposes of this search.)

Xargs to the rescue!

find -name "*.java" | xargs grep "public class"

This pares our list down to 172 classes.

1.3. Whittling away the test classes

I notice as I examine the output from step 1.2 that several of the public classes are in …/src/test/java/… .  For purposes of this search, I don’t care about those; I only want to see the public classes in production code.  Without bothering to spend time reading the find utility’s manpage, I modify the search to be like this:

find -name "*/main/*.java" | xargs grep "public class"

At this point, I get one of the most helpful warnings I’ve ever seen (thanks, findutils team!):

find: warning: Unix filenames usually don't contain slashes (though pathnames do).  That means that '-name `*/main/*.java'' will probably evaluate to false all the time on this system.  You might find the '-wholename' test more useful, or perhaps '-samefile'.
Alternatively, if you are using GNU grep, you could use 'find ... -print0 | grep -FzZ `*/main/*.java''.

Sure enough, no results found.  I fix the search to use the -wholename test, as the warning suggests:

find -wholename "*/main/*.java" | xargs grep "public class"

This works, and now my list is down to 79 public classes, all in src/main/java.
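The difference between the two tests is easy to reproduce in a scratch directory.  This is a minimal sketch of my own rather than a replay of the session above: the directory and file names are made up, and it assumes GNU find (the one Cygwin and most Linux systems ship).  The pattern is quoted so the shell hands it to find unexpanded.

```shell
# Build a tiny throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/mod/src/main/java" "$demo/mod/src/test/java"
echo 'public class Foo {}'     > "$demo/mod/src/main/java/Foo.java"
echo 'public class FooTest {}' > "$demo/mod/src/test/java/FooTest.java"

# -name only looks at the last path component, so a pattern containing
# slashes never matches.  GNU find warns on stderr; we discard that here.
by_name=$(cd "$demo" && find . -name '*/main/*.java' 2>/dev/null)

# -wholename matches against the whole path; -path is the more portable spelling.
by_wholename=$(cd "$demo" && find . -wholename '*/main/*.java')
by_path=$(cd "$demo" && find . -path '*/main/*.java')

echo "name:      [$by_name]"
echo "wholename: [$by_wholename]"
echo "path:      [$by_path]"
rm -rf "$demo"
```

Only the two whole-path tests list the file under src/main/java; the -name version lists nothing.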

1.4. Just the class names

Actually, what I have is 79 lines like this:

./exceptions/src/main/java/com/ontsys/fw/exception/ class InvalidDataImpl implements InvalidData {

I want just the class names.

A while later…

Here’s our command line now (broken into its three parts for readability; it’s all one line when I run it):

find -wholename "*/main/*.java"
| xargs grep "public class "
| sed -e "s/.*public class \([^ ]*\) .*/\1/"

To put into words what this is doing:

  • Line 1 lists all production Java files (excludes src/test/java/)
  • Line 2 looks in each file listed and prints the lines that contain the string “public class “.
  • Line 3 looks for public class X (where X is a bunch of non-space characters) and prints just the class name.
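To see the sed step on its own, here is a small sketch that feeds it one line shaped like grep’s output.  The sample path and class name are invented for illustration.  In sed’s default (basic) regex syntax the capturing group is written with backslashed parentheses and recalled with \1; with -E the parentheses are left bare.

```shell
# A made-up line in the "file:matching line" shape grep prints for multiple files.
line='./svc/src/main/java/com/example/Widget.java:public class Widget implements Thing {'

# Basic regular expressions: \( \) captures, \1 recalls the capture.
name_bre=$(printf '%s\n' "$line" | sed -e 's/.*public class \([^ ]*\) .*/\1/')

# Extended regular expressions (-E): plain parentheses capture.
name_ere=$(printf '%s\n' "$line" | sed -E 's/.*public class ([^ ]*) .*/\1/')

echo "$name_bre"
echo "$name_ere"
```

Both forms print just the class name from the sample line.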

1.5. Just the classnames: a minor tweak

The regular expression we’re passing to grep in line 2 above has matched a Javadoc comment line in which the phrase “public class ” was used.  Let’s tweak line 2 to only match lines that have a capital letter after public class:

find -wholename "*/main/*.java" | xargs grep "public class [A-Z]" | sed -e "s/.*public class \([^ ]*\) .*/\1/"

So: now we have 78 class names printing out.  Now to see who instantiates these.
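The Javadoc false positive can be reproduced with a single throwaway file.  This is a sketch under assumptions: the file content is invented, and GNU find, grep and sed are assumed.  It runs the pipeline twice, once with the looser grep pattern and once with the capital-letter tweak.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/svc/src/main/java"
# Hypothetical source file whose Javadoc happens to contain "public class ".
cat > "$demo/svc/src/main/java/Widget.java" <<'EOF'
/**
 * Not every public class in this module is a bean.
 */
public class Widget implements Thing {
}
EOF

extract='s/.*public class \([^ ]*\) .*/\1/'

# Loose pattern: the comment line matches too, and sed extracts the word after it.
loose=$(cd "$demo" && find . -wholename '*/main/*.java' |
  xargs grep "public class " | sed -e "$extract")

# Tweaked pattern: only lines with a capitalized name after "public class" match.
strict=$(cd "$demo" && find . -wholename '*/main/*.java' |
  xargs grep "public class [A-Z]" | sed -e "$extract")

echo "loose:";  echo "$loose"
echo "strict:"; echo "$strict"
rm -rf "$demo"
```

The loose run reports two “class names”, one of which is just a word from the comment; the strict run reports only the real class.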

2. Who Instantiates These?

Now, for each class C in our list (78 of ’em), I want to know all the places in our codebase where “ = new C” appears.

2.1. Who-all instantiates these?

Here’s an approach (O(N^2) at best, but I find it’s often easier for me to make it work fast once it works at all…):


bash-3.2$ for c in `find -wholename "*/main/*.java" | xargs grep "public class [A-Z]" | sed -e "s/.*public class \([^ ]*\) .*/\1/"`; do find -wholename "*/main/*.java" | xargs grep " = new $c"; done

Notice that we’re running bash to get the backquote goodness.

For each class in our list-o-78, we search the working directory for production java source files that instantiate that class directly.  This took five-and-a-half minutes on my PC, not searching the whole codebase (which I guess I’d have to check out in its entirety…hmm…) but just framework services.

2.2. Who that we care about instantiates these?

The results generated by step 2.1 are mostly instantiations of our POs (persistence objects?).  I would like to remove the PO instantiations from the results and see what’s left.

I can tell a PO because its classname always ends with PO.  So here’s our updated command line:

bash-3.2$ for c in `find -wholename "*/main/*.java" | xargs grep "public class [A-Z]" | sed -e "s/.*public class \([^ ]*\) .*/\1/"`; do find -wholename "*/main/*.java" | xargs egrep " = new $c" | grep --invert-match " = new [A-Za-z]*PO"; done

This gets us just the five interesting instantiations.
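The loop walks the directory tree and greps every file once per class name, which is where the minutes go.  One possible way to speed it up (a sketch, not something I timed against the real codebase) is to list the files once, turn the class names into a file of fixed-string patterns, and let a single grep -F -f pass check all of them.  The sample tree and class names below are invented; GNU find, grep and sed are assumed, and like the original it does not cope with spaces in file names.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/svc/src/main/java"
printf 'public class Widget {\n}\n'     > "$demo/svc/src/main/java/Widget.java"
printf 'public class CustomerPO {\n}\n' > "$demo/svc/src/main/java/CustomerPO.java"
cat > "$demo/svc/src/main/java/Factory.java" <<'EOF'
public class Factory {
    Widget make() {
        Widget w = new Widget();
        CustomerPO po = new CustomerPO();
        return w;
    }
}
EOF

files=$(mktemp)      # production source files, listed once
patterns=$(mktemp)   # one " = new ClassName" pattern per public class

(cd "$demo" && find . -wholename '*/main/*.java') > "$files"
(cd "$demo" && xargs grep -h "public class [A-Z]" < "$files") |
  sed -e 's/.*public class \([^ ]*\) .*/ = new \1/' > "$patterns"

# One pass over the files for all patterns, then drop the PO instantiations.
hits=$(cd "$demo" && xargs grep -F -f "$patterns" < "$files" |
  grep -v " = new [A-Za-z]*PO")
echo "$hits"
rm -rf "$demo" "$files" "$patterns"
```

On the sample tree this leaves the one line in Factory.java that instantiates Widget.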

Next time: How to avoid all this using SVN Searcher!

Using jcFind to find duplicates within an archive

The other day I used jcFind to search for duplicate classes inside my .war file.  Just wanted to jot down how I did that.

  1. I have Cygwin installed, including the Python package.
  2. I downloaded the jcfind archive and extracted the jcfind Python script from it.
  3. I unzipped my xa-example .war file to a temporary directory (jcfind doesn’t currently support searching recursively inside archives).
  4. I ran the command below.

The Command

Here’s the command, and then I’ll explain each part:

python d:\path\to\jcfind\jcfind @c:\tempdir\xa-example.war | grep "\.class$" | rev | cut --delimiter=. --fields=2- | rev | cut --delimiter=" " --fields=2 | sort | uniq --repeated

The Explanation

Element                              Explanation
python d:\path\to\jcfind\jcfind      jcfind is a Python script, so we need to start up Python to use it
@c:\tempdir\xa-example.war-dir       The syntax jcfind expects is searchstring@directory-to-search. With no searchstring, jcfind lists all contents of the archives.
| grep "\.class$"                    jcfind lists directories too, but we want just classes, so we search for .class at the end of a line
| rev                                reverse the string; sometimes it’s easier to do things from the other end…
| cut --delimiter=. --fields=2-      chop off “.class”
| rev                                turn the strings back around frontways
| cut --delimiter=" " --fields=2     chop off the path so we can see which adjacent lines are identical
| sort                               get any duplicates next to each other
| uniq --repeated                    only show duplicate classnames
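The rev/cut/rev trick is easier to follow on a single line of input.  The sketch below uses a made-up line in the shape the cut steps imply (an archive path, a space, then the entry); that shape is my assumption about jcfind’s output, not something taken from its documentation.  -d and -f are the short spellings of --delimiter and --fields, and a sed equivalent is shown for comparison.

```shell
# Hypothetical listing line: "<archive> <entry>".
line='lib/example.jar com/example/util/Foo.class'

# Reverse, drop the first dot-separated field (the reversed "class"),
# reverse back, then keep the second space-separated field (the entry).
via_rev=$(printf '%s\n' "$line" | rev | cut -d. -f2- | rev | cut -d' ' -f2)

# The same two chops done directly with sed.
via_sed=$(printf '%s\n' "$line" | sed -e 's/\.class$//' -e 's/^[^ ]* //')

echo "$via_rev"
echo "$via_sed"
```

Both print the entry name without the archive path or the .class suffix.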

‘Course, for all this, it still doesn’t tell you which libraries the duplicates are in…
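One way that gap might be closed is to keep the archive name around instead of cutting it off, and group by entry with awk.  This is only a sketch: it assumes the listing arrives as “archive entry” lines (the shape the cut step above implies), and the sample listing is invented rather than real jcfind output.

```shell
# Made-up listing in the assumed "<archive> <entry>" format.
listing='lib/a.jar com/example/Foo.class
lib/a.jar com/example/Bar.class
lib/b.jar com/example/Foo.class
lib/b.jar META-INF/MANIFEST.MF'

dupes=$(printf '%s\n' "$listing" | awk '
  $2 ~ /\.class$/ {
    # Count each class entry and remember every archive it was seen in.
    count[$2]++
    where[$2] = (where[$2] == "" ? "" : where[$2] ", ") $1
  }
  END {
    for (entry in count)
      if (count[entry] > 1) print entry ": " where[entry]
  }' | sort)
echo "$dupes"
```

For the sample listing, the only entry reported is com/example/Foo.class, along with the two archives that contain it.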

Jar searcher research

Recently, when I have searched SourceForge for [jrw]ar searcher utilities, I have found some projects that look like they might be what I’m looking for, but then I realize they’re not, for one reason or another.  I wanted to jot down these project names so I don’t spend time re-evaluating them in the future without meaning to.

(The two utilities that I have tried and found helpful are JarSearch and jcFind.)

Could be helpful

  • File finder – You have to set up an Ant task to use it, but looks really powerful.  Possibly we could set up an Ant task to search the deploy and lib directories of JBoss, and the other teams could use this for a quick report for dependency resolution.
  • Jar Browser – Currently only supports JARs, not WAR, RAR, etc. (would need to unzip the WAR first, same as with jcFind)
  • JarBreaker – searches for a particular class name (supports regex wildcards); sports a whimsical homepage
  • SearchJars – searches for a given class name
  • JavaCheck – C++ source, apparently needs to be built with gcc… seems to need more of a development environment than I currently have installed with Cygwin.

Not Usable at the Moment

These are projects that appeared to be maybe just the thing I was looking for — and then I realized that there were no files available!

  • JClassFinder – no files released (in planning phase since 5/2006)
  • ZipSearch – no files released (though it’s been in “Alpha” status since end of 2007)
  • ZipLister – no files released (though it’s listed in “Alpha” status; project was registered 2008-02-20)

Searching .[jrw]ar files with JarSearch

I decided to look around again to see if someone had already made a .[jrw]ar file searcher so we don’t have to do all that manual searching to find duplicately-deployed classes.  A search turned up the JarSearch Eclipse plugin, made available under the GPLv2 license.


I extracted the .jar file from the download’s .zip file and copied it into my eclipse\plugins directory, then restarted Eclipse (JarSearch doesn’t seem to be set up to use the fancier method for installing an Eclipse plug-in using the Eclipse Update Manager).  The most recent build of JarSearch is version 1.0 for Eclipse 3.2, but it seems to work ok on my Eclipse 3.3.2 installation.


The Dialog

Now when I hit Ctrl+H, I have a new Jar Search tab in the Search dialog, which allows me to search for a class inside archive files:

Notice that I added the rar extension to the list of filetypes to search inside of.

Search Results

I click Search and JarSearch finds some classes with HAXA in their names inside .jar files that are nested inside two different .rar files.  Wow, that’s nice!

Future Feature Wish: Saved Defaults

This plug-in is going to be so helpful to us.  There’s just one thing I’d like to see improved — it would be nice if the options you pick could be saved as defaults for next time.  As it is, after doing the search mentioned above, if I press Ctrl+H again and navigate to the Jar Search tab, the options are back to the original settings:

This means that I will always be needing to select the folder (which I think will usually be JBoss’s default\deploy folder) and add in the .rar extension. [Update 11/5/2008: Actually it turns out the folder I always want to search is the server\default folder, not server\default\deploy…]

That’s a minor gripe though.  This utility is really going to be a help to us.  Thanks, Alain!