Using jcFind to find duplicates within an archive
Posted by danielmeyer on July 29, 2008
- I have Cygwin installed, including the Python package.
- I downloaded jcfind-1.0.5.zip from sourceforge.net and extracted the jcfind Python script therefrom.
- I unzipped my xa-example .war file to a temporary directory (jcfind doesn’t currently support searching recursively inside archives).
- I ran the command below.
Here’s the command, and then I’ll explain each part:
python d:\path\to\jcfind\jcfind @c:\tempdir\xa-example.war | grep "\.class$" | rev | cut --delimiter=. --fields=2- | rev | cut --delimiter=" " --fields=2 | sort | uniq --repeated
|python|jcfind is a Python script, so we need to start up Python to use it|
|d:\path\to\jcfind\jcfind @c:\tempdir\xa-example.war|The syntax jcfind expects is searchstring@directory-to-search. With no searchstring, jcfind lists all contents of the archives.|
|grep "\.class$"|jcfind lists directories too, but we want just classes, so we search for .class at the end of a line|
|rev|reverse the string; sometimes it’s easier to do things from the other end…|
|cut --delimiter=. --fields=2-|chop off “.class”|
|rev|turn the strings back around frontways|
|cut --delimiter=" " --fields=2|chop off the path so we can see which adjacent lines are identical|
|sort|get any duplicates next to each other|
|uniq --repeated|only show duplicate classnames|
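To watch the text-processing half of the pipeline work without jcfind installed, here is a small sketch that feeds it made-up input. The sample lines and the function name find_duplicate_classes are my own invention; I am only assuming jcfind prints something like "archive, a space, entry" per line, which is what the cut on the space delimiter suggests. The real output format may differ.

```shell
# Hypothetical jcfind-style listing: "<archive> <entry>", one entry per line.
# This format is an assumption made for illustration only.
sample='WEB-INF/lib/a.jar com/example/Foo.class
WEB-INF/lib/a.jar com/example/
WEB-INF/lib/b.jar com/example/Foo.class
WEB-INF/lib/b.jar com/example/Bar.class'

# The same filters as in the post, wrapped in a function (hypothetical name).
find_duplicate_classes() {
  grep '\.class$' |                          # keep only lines ending in .class
    rev | cut --delimiter=. --fields=2- | rev |  # drop the trailing ".class"
    cut --delimiter=' ' --fields=2 |         # keep the entry, drop the archive
    sort | uniq --repeated                   # print each repeated name once
}

printf '%s\n' "$sample" | find_duplicate_classes
# prints: com/example/Foo
```

The directory entry and the class that appears only once both drop out, leaving just the name that shows up in two archives.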
‘Course, for all this, it still doesn’t tell you which libraries the duplicates are in…
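One possible way to close that gap, sketched with awk instead of the rev/cut trick: remember the first field for every class and print it alongside any class seen more than once. As before, the sample input, the "archive, space, entry" line format, and the function name list_duplicate_sources are assumptions of mine, not anything jcfind documents, and the sketch breaks if paths contain spaces.

```shell
# Hypothetical jcfind-style listing: "<archive> <entry>", one entry per line.
# This format is an assumption made for illustration only.
sample='WEB-INF/lib/a.jar com/example/Foo.class
WEB-INF/lib/a.jar com/example/
WEB-INF/lib/b.jar com/example/Foo.class
WEB-INF/lib/b.jar com/example/Bar.class'

# For each class entry, count occurrences and collect the archives it came from;
# at the end, print only the entries that occurred more than once.
list_duplicate_sources() {
  grep '\.class$' |
    awk '{ count[$2]++; where[$2] = where[$2] " " $1 }
         END { for (c in count) if (count[c] > 1) print c ":" where[c] }' |
    sort
}

printf '%s\n' "$sample" | list_duplicate_sources
# prints: com/example/Foo.class: WEB-INF/lib/a.jar WEB-INF/lib/b.jar
```

With real data you would pipe the jcfind output into the function in place of the printf.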