I have a folder (images) on my website which contains, in a rather unstructured way, all the images used on the site. Over time this folder has grown, and many of the files in it are no longer used. Now it's time to do some cleanup, and I need to choose the best strategy to remove all the unused files while preserving the used ones.
The best idea that came to my mind is to write a program that crawls the whole website following all the links and records every request made to the images folder, then deletes all the files that are not in that list.
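Roughly the kind of crawler I have in mind, sketched in Python with requests and BeautifulSoup (the start URL and images path are placeholders, and this only looks at `<img>` tags, not CSS backgrounds or JavaScript):

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START_URL = "https://www.example.com/"   # placeholder for my site's root
IMAGES_DIR = "/var/www/site/images"      # placeholder for the images folder on disk

def crawl(start_url):
    """Crawl internal pages and collect the file names of referenced images."""
    seen_pages, to_visit = set(), [start_url]
    used_images = set()
    site_host = urlparse(start_url).netloc

    while to_visit:
        url = to_visit.pop()
        if url in seen_pages:
            continue
        seen_pages.add(url)
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue
        soup = BeautifulSoup(resp.text, "html.parser")

        # Record every <img> whose src points into the images folder
        for img in soup.find_all("img", src=True):
            src_path = urlparse(urljoin(url, img["src"])).path
            if "/images/" in src_path:
                used_images.add(os.path.basename(src_path))

        # Follow internal links only
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == site_host:
                to_visit.append(link)

    return used_images

if __name__ == "__main__":
    used = crawl(START_URL)
    on_disk = set(os.listdir(IMAGES_DIR))
    # Print deletion candidates instead of deleting right away
    for name in sorted(on_disk - used):
        print(name)
```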
Before reinventing the wheel, I would like to know whether this idea makes sense and whether there are existing libraries that already perform part of this task. Any suggestion is welcome!