Using jcFind to find duplicates within an archive

The other day I used jcFind to search for duplicate classes inside my .war file.  Just wanted to jot down how I did that.

  1. I have Cygwin installed, including the Python package.
  2. I downloaded jcfind-1.0.5.zip from sourceforge.net and extracted the jcfind Python script therefrom.
  3. I unzipped my xa-example .war file to a temporary directory (jcfind doesn’t currently support searching recursively inside archives).
  4. I ran the command.

The Command

Here’s the command, and then I’ll explain each part:

python d:\path\to\jcfind\jcfind @c:\tempdir\xa-example.war | grep "\.class$" | rev | cut --delimiter=. --fields=2- | rev | cut --delimiter=" " --fields=2 | sort | uniq --repeated

The Explanation

Each element of the command, followed by its explanation:

python d:\path\to\jcfind\jcfind
    jcfind is a Python script, so we need to start up Python to use it.

@c:\tempdir\xa-example.war
    The syntax jcfind expects is searchstring@directory-to-search; the directory here is the temporary directory from step 3. With no searchstring, jcfind lists all contents of the archives.

| grep "\.class$"
    jcfind lists directories too, but we want just classes, so we search for .class at the end of a line.

| rev
    Reverse each line. Sometimes it’s easier to do things from the other end…

| cut --delimiter=. --fields=2-
    Chop off “.class” (which, reversed, is now the first dot-separated field).

| rev
    Turn the strings back around frontways.

| cut --delimiter=" " --fields=2
    Chop off the path so we can see which adjacent lines are identical.

| sort
    Get any duplicates next to each other.

| uniq --repeated
    Only show duplicate classnames.
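
For anyone who finds the rev/cut dance hard to follow, here is a rough Python sketch of what the stages after jcfind do to each line. It is not part of jcfind. The function name repeated_class_names and the sample lines are made up for illustration, and I'm assuming each line of jcfind output looks like "archive-path class-path" separated by a single space, which is what the cut on " " implies.

```python
from collections import Counter


def repeated_class_names(lines):
    """Mimic: grep "\\.class$" | rev | cut -d. -f2- | rev | cut -d" " -f2 | sort | uniq --repeated."""
    names = []
    for line in lines:
        line = line.rstrip("\n")
        # grep "\.class$": keep only lines that end in .class
        if not line.endswith(".class"):
            continue
        # rev | cut --delimiter=. --fields=2- | rev: drop the trailing ".class"
        stem = line[: -len(".class")]
        # cut --delimiter=" " --fields=2: keep the second space-separated field
        # (cut prints the whole line when the delimiter is absent)
        fields = stem.split(" ")
        names.append(fields[1] if len(fields) > 1 else stem)
    # sort | uniq --repeated: one line per name that occurs more than once
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical input lines; the real jcfind output format is assumed, not verified.
sample = [
    "WEB-INF/lib/a.jar com/example/Foo.class",
    "WEB-INF/lib/b.jar com/example/Foo.class",
    "WEB-INF/lib/a.jar com/example/",
    "WEB-INF/lib/b.jar com/example/Bar.class",
]
print(repeated_class_names(sample))
```

The sort order may differ slightly from what the sort command gives under your locale, but the set of repeated names should be the same.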

‘Course, for all this, it still doesn’t tell you which libraries the duplicates are in…
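
One way to get at that missing piece is to skip jcfind and read the jars directly with Python's standard zipfile module. The sketch below is only that, a sketch: find_duplicate_classes is a name I made up, it only looks at .jar files under the directory you give it (so loose classes under WEB-INF/classes are not considered), and I haven't compared its output against jcfind's.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_classes(root):
    """Return {class name: [jars containing it]} for classes found in more than one jar."""
    root = Path(root)
    owners = defaultdict(set)
    # Walk every .jar below the unpacked .war directory
    for jar in sorted(root.rglob("*.jar")):
        with zipfile.ZipFile(jar) as archive:
            for entry in archive.namelist():
                if entry.endswith(".class"):
                    class_name = entry[: -len(".class")]
                    # Record the jar by its path relative to root
                    owners[class_name].add(jar.relative_to(root).as_posix())
    # Keep only classes that appear in two or more jars
    return {name: sorted(jars) for name, jars in owners.items() if len(jars) > 1}
```

Calling find_duplicate_classes on the temporary directory from step 3 and printing each key with its list of jars would show both the duplicate class and where each copy lives.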
