

Is there a way I can have a hash value as input when searching for files and a complete list of files and their locations as output? This could be helpful when trying to pinpoint file duplicates.

I often find myself in situations where I have a bunch of files that I know I already have stored in some location, but I don't know where. They are essentially duplicates.

For instance, I could have a bunch of files on a portable hard drive, and also hard copies of those files on the internal hard drive of a desktop computer, but I'm not sure of the location! Now if the files are not renamed, I could do a file name search to try to locate the hard copy on the desktop. I could then compare them side by side, and in case they are the same I could delete the copy I have on the portable hard drive. But if the files have been renamed on either one of the hard drives, this would probably not work (depending on how much the new names differ from the original).

If a file is renamed, but not edited, I could calculate its hash value. I would then like to have this value as input when searching for files and have the operating system search through a given directory or the entire file system for files with this exact SHA1 hash value and output a complete list of locations where these files are stored.

I'm using Windows, but I am generally interested in knowing how something like this could be achieved, regardless of operating system.

I have been using a tool called fdupes to accomplish something similar. Fdupes will recursively search through directories and compare every file with every other file. First it compares sizes. If the sizes are identical, it creates hashes of the files and compares those. If the hashes are the same, it actually goes through the files byte by byte and compares them.

When it finds all the files that are truly identical, you can have it do several things. I have it delete the duplicate and create a hardlink in its place (thus saving me HDD space), although you can have it simply output the locations of the duplicate files and not do anything with them. This is the scenario you are asking about.
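For the lookup described in the question (a hash value in, a list of locations out), here is a minimal Python sketch that should behave the same on Windows and other systems. The function names `sha1_of_file` and `find_by_sha1` and the optional `size` argument are my own choices for illustration, not part of any existing tool. It simply walks a directory tree and hashes every file, so on a whole file system it will be slow.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_by_sha1(root, wanted_hex, size=None):
    """List every regular file under root whose SHA-1 equals wanted_hex.

    If size (in bytes) is given, files of any other size are skipped
    without being hashed, which saves most of the work.
    """
    wanted_hex = wanted_hex.strip().lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Ignore symbolic links and anything that is not a plain file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                if size is not None and os.path.getsize(path) != size:
                    continue
                if sha1_of_file(path) == wanted_hex:
                    matches.append(path)
            except OSError:
                # Unreadable or vanished file: skip it and keep searching.
                continue
    return sorted(matches)
```

To use it, compute the hash of the copy you have at hand with `sha1_of_file`, then pass that value and a starting folder to `find_by_sha1`. If you also pass the size of the known copy, only files of that exact size get hashed.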

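The three-stage comparison described for fdupes (size first, then hash, then byte by byte) can be sketched in Python as follows. This is not fdupes' actual code, only an illustration of the same idea. The names `file_digest` and `find_duplicate_sets` are hypothetical, and SHA-1 is used here only because the question mentions it.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_sets(root):
    """Return lists of paths under root whose contents are byte-identical."""
    # Stage 1: only files of equal size can possibly be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    duplicate_sets = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Stage 2: hash only the files that share a size with another file.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        for candidates in by_hash.values():
            # Stage 3: confirm each hash match with a full byte comparison.
            groups = []
            for path in sorted(candidates):
                for group in groups:
                    if filecmp.cmp(group[0], path, shallow=False):
                        group.append(path)
                        break
                else:
                    groups.append([path])
            duplicate_sets.extend(g for g in groups if len(g) > 1)
    return sorted(duplicate_sets)
```

Each inner list in the result is one set of identical files. This sketch only reports them. Deleting duplicates or replacing them with hardlinks is deliberately left out.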