## Using jcFind to find duplicates within an archive

The other day I used jcFind to search for duplicate classes inside my .war file.  Just wanted to jot down how I did that.

1. I have Cygwin installed, including the Python package.
2. I downloaded jcfind-1.0.5.zip from sourceforge.net and extracted the jcfind Python script therefrom.
3. I unzipped my xa-example .war file to a temporary directory (jcfind doesn’t currently support searching recursively inside archives).
4. Ran the command

### The Command

Here’s the command, and then I’ll explain each part:

    python d:\path\to\jcfind\jcfind @c:\tempdir\xa-example.war | grep "\.class$" | rev | cut --delimiter=. --fields=2- | rev | cut --delimiter=" " --fields=2 | sort | uniq --repeated

### The Explanation

| Element | Explanation |
| --- | --- |
| `python d:\path\to\jcfind\jcfind` | jcfind is a Python script, so we need to start up Python to use it |
| `@c:\tempdir\xa-example.war-dir` | The syntax jcfind expects is `searchstring@directory-to-search`. With no searchstring, jcfind lists all contents of the archives. |
| `\| grep "\.class$"` | jcfind lists directories too, but we want just classes, so we search for `.class` at the end of a line |
| `\| rev` | reverse the string; sometimes it’s easier to do things from the other end… |
| `\| cut --delimiter=. --fields=2-` | chop off “.class” |
| `\| rev` | turn the strings back around frontways |
| `\| cut --delimiter=" " --fields=2` | chop off the path so we can see which adjacent lines are identical |
| `\| sort` | get any duplicates next to each other |
| `\| uniq --repeated` | only show duplicate classnames |
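
To see what each stage does without running jcfind, the same steps can be mirrored in a few lines of Python. This is only a sketch: it assumes every line of jcfind output looks like `<archive path> <entry name>` with a single space in between, which is what the `cut --delimiter=" " --fields=2` stage implies but which I haven't checked against jcfind's documentation. The sample lines and the function name `repeated_class_names` are made up for illustration.

```python
from collections import Counter


def repeated_class_names(lines):
    """Mirror grep | rev | cut | rev | cut | sort | uniq --repeated."""
    names = []
    for line in lines:
        line = line.rstrip("\n")
        # grep "\.class$": keep only lines that end in .class
        if not line.endswith(".class"):
            continue
        # rev | cut --delimiter=. --fields=2- | rev: drop everything after the last dot
        without_ext = line.rsplit(".", 1)[0]
        # cut --delimiter=" " --fields=2: keep the second space-separated field
        # (like cut, pass the line through unchanged if it has no space at all)
        fields = without_ext.split(" ")
        names.append(fields[1] if len(fields) > 1 else without_ext)
    # sort | uniq --repeated: one copy of each name that occurs more than once
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical listing in the assumed "<archive path> <entry name>" format
sample = [
    "WEB-INF/lib/a.jar com/example/Foo.class",
    "WEB-INF/lib/a.jar com/example/",
    "WEB-INF/lib/b.jar com/example/Foo.class",
    "WEB-INF/lib/b.jar com/example/Bar.class",
]
print(repeated_class_names(sample))  # prints ['com/example/Foo']
```

The directory line is dropped by the `.class` check, `Bar` appears only once, and `Foo` is the one name reported. Python's `sorted` orders by code point, which may differ slightly from what `sort` does under a given locale.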

‘Course, for all this, it still doesn’t tell you which libraries the duplicates are in…
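
One way to close that gap would be to keep the first field instead of cutting it away, and group the archives by class name. Here's a rough sketch in Python, under the same assumption as the pipeline: each line of output is `<archive path> <entry name>`, separated by a single space. That format is inferred from the `cut` stage, not taken from jcfind's documentation, and `archives_by_duplicate` plus the sample lines are hypothetical.

```python
from collections import defaultdict


def archives_by_duplicate(lines):
    """Map each class found in more than one archive to the archives holding it."""
    found = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Same filter as grep "\.class$"
        if not line.endswith(".class"):
            continue
        # Assumed format: "<archive path> <entry name>"; split on the first space
        archive, sep, entry = line.partition(" ")
        if not sep:
            continue  # line does not match the assumed format
        class_name = entry[: -len(".class")]
        found[class_name].add(archive)
    return {
        name: sorted(archives)
        for name, archives in found.items()
        if len(archives) > 1
    }


# Hypothetical listing in the assumed format
sample = [
    "WEB-INF/lib/a.jar com/example/Foo.class",
    "WEB-INF/lib/b.jar com/example/Foo.class",
    "WEB-INF/lib/b.jar com/example/Bar.class",
]
for name, archives in sorted(archives_by_duplicate(sample).items()):
    print(name, "->", ", ".join(archives))
# prints: com/example/Foo -> WEB-INF/lib/a.jar, WEB-INF/lib/b.jar
```

To try it on real output, the function can be given `sys.stdin` in place of `sample` and the jcfind listing piped into the script. Unlike `uniq --repeated`, this only reports a class when it shows up under at least two different archive paths, and like the original pipeline it would trip over archive paths that contain spaces.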