g1smd

msg:4425040 | 1:32 pm on Mar 5, 2012 (gmt 0) |
Xenu LinkSleuth will do this, but only if you let it also have FTP access to scan the server filesystem.
|
fm86

msg:4425065 | 2:32 pm on Mar 5, 2012 (gmt 0) |
Thanks for the interesting reply! This sounds like really cool software. I have a question though, in case you are an expert user: I just read the software's specs and I wonder whether it can tell that an image is required by a CSS declaration such as background-image: url(myUrl). It doesn't look like it checks the GET requests for files. If it does, would you be so kind as to give me some advice on how to configure it for my purpose? Besides, my site only runs locally for now, so I guess no FTP is needed.
|
g1smd

msg:4425069 | 2:56 pm on Mar 5, 2012 (gmt 0) |
After Xenu scans the website via HTTP (the site therefore needs to be running on an HTTP server such as Apache), it asks for the FTP credentials so it can look in all the folders and find any files that were not accessed during the HTTP scan. Those are the unused files. I have no idea whether Xenu looks for files mentioned in style sheets. I have never considered that possibility. I would hope that it does. It is quite easy to test whether it does or not.
|
fm86

msg:4425438 | 7:51 am on Mar 6, 2012 (gmt 0) |
Hmm... It seems it won't work for JS and CSS: "Please be careful with removing files when listed in an orphan report. Especially navbar mouseover images will be seen as “orphans”, because Xenu cannot find links to it." (from [integralworld.net...]). Xenu is cool but doesn't seem to be the perfect solution in this case. I would need something that checks the GET requests made to the server. That way all CSS, JS and AJAX requests would be covered. Any further suggestions?
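One way to approach the GET-request idea is to compare the web server's access log against the files on disk, after browsing every page of the site once. What follows is only a minimal sketch under stated assumptions: the log is in Apache common or combined format (request path in the seventh whitespace-separated field), and the function name `unrequested_files` and its two arguments are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# unrequested_files DOCROOT ACCESS_LOG   (hypothetical helper, not part of any tool)
# Prints the files under DOCROOT whose path never appears in a GET request
# in ACCESS_LOG. Assumes Apache common/combined log format.
unrequested_files() {
    docroot=$1
    log=$2
    requested=$(mktemp)
    on_disk=$(mktemp)

    # Field 6 is the quoted method ("GET), field 7 the request path.
    # Strip any query string so /a.png?v=2 counts as /a.png.
    awk '$6 == "\"GET" { sub(/\?.*/, "", $7); print $7 }' "$log" \
        | LC_ALL=C sort -u > "$requested"

    # List every regular file as a path relative to the document root,
    # with a leading slash so it lines up with the logged request paths.
    (cd "$docroot" && find . -type f | sed 's|^\.||') \
        | LC_ALL=C sort -u > "$on_disk"

    # comm -13 keeps lines that appear only in the second input:
    # files on disk that were never requested.
    LC_ALL=C comm -13 "$requested" "$on_disk"

    rm -f "$requested" "$on_disk"
}
```

It would be called as something like `unrequested_files /path/to/docroot /path/to/access.log`. The result is only as good as the browsing session behind the log: anything not visited shows up as unrequested, percent-encoded paths are not decoded, and a request for a directory such as `/` is not mapped to its index file.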
|
Jonesy

msg:4427922 | 9:30 pm on Mar 11, 2012 (gmt 0) |
linux: I generally use a web site layout à la:

    ...../documents/
    ...../documents/images/

On my workstation (using the command line) I enter the .../images/ sub-directory and run:

    ...../images$ for IMG in * ; do echo "$IMG" ; grep -l -- "$IMG" ../* ; done

(That's a lower-case "ell" for the grep switch. The quotes around $IMG keep file names with spaces in one piece.)

Any image listed without any following lines of files noted by grep is a file not to be found in any document: html, php, css, etc., usw. If you wish, you could get more elaborate with the -r|-R (recursive) switch for grep. Of course it works only for static web pages....

HTH,
Jonesy
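The recursive variant mentioned above could be sketched roughly as follows. This is an illustration rather than a definitive script: it assumes a grep that supports `-r` and `-I` (GNU and BSD grep do), and the function name `find_unreferenced` and its arguments are hypothetical.

```shell
#!/bin/sh
# find_unreferenced IMAGE_DIR DOC_DIR   (hypothetical helper)
# Prints the name of every file in IMAGE_DIR that is not mentioned
# in any text file under DOC_DIR.
find_unreferenced() {
    img_dir=$1
    doc_dir=$2
    for img in "$img_dir"/*; do
        # Skip sub-directories and the unexpanded glob of an empty directory.
        [ -f "$img" ] || continue
        name=$(basename "$img")
        # -r recurse, -q exit at the first match without printing,
        # -I ignore binary files, -F treat the name as a literal string.
        if ! grep -rqIF -- "$name" "$doc_dir"; then
            printf '%s\n' "$name"
        fi
    done
}
```

With the layout above it would be run as `find_unreferenced ...../documents/images ...../documents`. Because the match is a plain substring search, `bg.png` also counts as referenced when only `mybg.png` is mentioned, and file names assembled at run time by PHP or JavaScript are not found at all.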
|
|