Forum Moderators: open
Just noticed Google added several "pages" from one of my sites that it shouldn't have found.
It's gone and found some PHP scripts that are only triggered by a form asking for location information (the form then sends the visitor to the local site for the service). This means that Google can (and probably does) follow every link you have in forms, and it constructed the URLs itself too, since they are not given complete in the form.
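For anyone unfamiliar with how this happens, here is a minimal sketch of the kind of form involved (the script name and field values are hypothetical, not from the original post). With a GET form, the target URL is fully constructable from the action attribute plus the option values, so a crawler can build and fetch each combination itself:

```html
<!-- Hypothetical sketch: submitting this form with location=london
     requests /locate.php?location=london, a plain URL a crawler can
     construct on its own without ever "filling in" the form. -->
<form action="/locate.php" method="get">
  <select name="location">
    <option value="london">London</option>
    <option value="leeds">Leeds</option>
  </select>
  <input type="submit" value="Go">
</form>
```

Switching the form to method="post" removes this, since a POST submission does not correspond to a simple URL the crawler can construct and follow.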
Also, links Google discovers this way do not pass PageRank at all (I'm certain of this).
I'm sure many people already know about this and some may even have benefited from it, but I'd just like to get back to having the real pages on the site listed. Does anyone know a simple fix so Google can't follow URLs it constructs from forms?
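One simple fix, assuming the form handler lives at a known path (the script name below is hypothetical), is to disallow it in robots.txt, which well-behaved crawlers such as Googlebot will honour:

```
# Hypothetical sketch: keep compliant crawlers away from the form handler
User-agent: *
Disallow: /locate.php
```

Note this only asks crawlers to stay away; it doesn't remove pages already indexed or stop a crawler that ignores robots.txt.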
I've had a look at quite a few sites and can't see the same issue. Could this be new?
Thanks
Remember, not all spiders abide by .htaccess, yet Googlebot usually does.
robots.txt is a "voluntary code of conduct" that spiders usually choose to adhere to. Human visitors aren't even aware of such files.
.htaccess controls what Apache will grant access to. It makes no difference whether the visitor is a spider or a human visitor - if .htaccess blocks it, you can't have it. There's no question of "abiding by" it.
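To enforce the block at the server rather than rely on crawler politeness, something along these lines could go in .htaccess. This is only a sketch: it assumes mod_rewrite is enabled and the script name is hypothetical:

```apache
# Hypothetical sketch: refuse crawler access to the form handler outright.
# Requires mod_rewrite; "locate.php" stands in for the real script name.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule ^locate\.php$ - [F]
```

The [F] flag returns a 403 Forbidden, so the crawler gets nothing to index regardless of whether it reads robots.txt.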
DerekH