Forum Moderators: coopster
I have been using the Xenu link checker to generate reports and site maps. The problem is that once Xenu hits the first file that uses PHP, it follows the other links with PHPSESS added to them. Obviously this is because a session ID was generated when the script ran, and Xenu carries it across as it traverses the other links.
I have a modified .htaccess file that lets PHP code be processed in .htm or .html files. If I did not have this, and every file with PHP code in it were simply named .php, the IDs wouldn't be passed to the other pages.
So now when I make a report or site map, I get many duplicate entries, because I will have somepage.htm and then also somepage.htm?SESS=whatever, whatever2, etc.
So my question comes down to this. Since all my files can process PHP anyway, is there a command I can put in the start of any file (that does not need to use php), to have it prevent arguments from being passed?
Did I overcomplicate my explanation? :) Thanks for any info!
The problem with Xenu might also affect search engine spiders. You need to detect the user agent and not append session IDs if a certain one is detected, e.g. Xenu or Google. (Some popular forum software does this.)
I've never done this, but it should be enough to look in your logs to see how Xenu identifies itself, then write a conditional statement that checks $_SERVER["HTTP_USER_AGENT"] and does NOT start sessions for particular ones.
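Something along these lines might work as a starting point. It is an untested sketch: the function name isKnownCrawler and the signature strings 'xenu' and 'googlebot' are my own guesses, so check your logs for the strings the crawlers actually send.

```php
<?php
// Hypothetical list of lowercase substrings that identify crawlers.
// These values are assumptions; replace them with what your logs show.
$crawlerSignatures = array('xenu', 'googlebot');

// Returns true if the user-agent string contains any of the signatures
// (case-insensitive substring match).
function isKnownCrawler($userAgent, array $signatures)
{
    foreach ($signatures as $signature) {
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// The header can be missing entirely, so fall back to an empty string.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Only ordinary visitors get a session; crawlers never see a session ID.
if (!isKnownCrawler($userAgent, $crawlerSignatures)) {
    session_start();
}
```

This would go at the top of each page in place of a bare session_start() call. Any code further down that reads $_SESSION would need to cope with no session having been started.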
hth
It seems like there would have to be some other way. So many sites use PHP for just part of their site.
I run a PHP-based forum, but I am using mod_rewrite. It seems like that would be too much trouble to set up just for some simple guestbooks and comment forms, though.
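One other way that might fit, assuming the IDs are being appended by PHP's transparent session ID support (the session.use_trans_sid setting) rather than by the scripts themselves: switch that setting off before the session starts, so the ID only travels in a cookie. A minimal sketch, not tested against this particular setup:

```php
<?php
// Assumption: the session IDs in the links come from session.use_trans_sid.
// Turn off automatic rewriting of links to include the session ID.
ini_set('session.use_trans_sid', '0');

// Accept session IDs from cookies only, never from the URL.
ini_set('session.use_only_cookies', '1');

// Both settings must be changed before the session is started.
session_start();
```

The trade-off is that visitors who refuse cookies will not keep a session from page to page, which may or may not matter for a guestbook or comment form.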