For example, the page
http://example.com/test/page.htm
would have a link to
[cgi.example.com...]
Now I have noticed that when I ask Google
site:cgi.example.com
there are more than 40,000 entries instead of the expected 10,000.
I checked my log files to see what Googlebot is doing and discovered that it is requesting pages that should no longer exist.
So I have changed my service.pl script.
The script now checks whether the page for which a contact form should be generated actually exists.
If the original page for the contact form does not exist, it prints:
print "Location:http://error404.example.com\n$http::p3p\n$http::cookie\n";
The subdomain
error404.example.com
exists only to return a real 404 error for the request.
When I test with the browser, I really do get a 404 error, as expected.
But just a few minutes ago, Googlebot requested some pages for which a 404 should be returned, and my log file shows return code 302.
Any ideas how to solve the problem?
BTW, one client left me but kept using my old page and my scripts. For that case, I had the following solution:
if ( $domain eq "client-which-left-me.com" ) { require "not-exisiting-script.pl" }
to produce an error 500
Checking my log files, I discovered that Googlebot still spiders pages that have been returning error 500 for 8 months.
I have now switched to my new method, but instead of error 404, the log files show 302.
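The "Location" header generates a 302 response, unless you also specify a response code. That is why your log shows 302 for Googlebot's request: your browser simply follows the redirect and shows you the 404 from the target, but the response from service.pl itself is the 302. You need to replace this line with a print "Status:" line that outputs a 404 directly, instead of redirecting to another page to generate the 404 page.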
print "Status: 404 Not Found\r\n";
Jim
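For anyone who wants to see the whole response rather than just the one line, here is a minimal sketch of how the check in service.pl might look. It is not the original script: the build_response helper, the $DOC_ROOT path, and the page= query parameter are all made up for illustration, and it assumes the original pages are plain files under the document root. The point is only that the script sends the Status header itself, followed by a Content-Type header, a blank line, and a short body, and never sends a Location header.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical document root; adjust to wherever the original pages live.
my $DOC_ROOT = '/var/www/example.com';

# Build the full CGI response (headers plus body) for a contact-form request.
# $page is the site-relative path of the original page, e.g. "test/page.htm".
sub build_response {
    my ($doc_root, $page) = @_;

    # Reject missing or empty paths and path traversal,
    # then check that the original page exists as a file.
    my $exists = defined $page
        && length $page
        && $page !~ /\.\./
        && -f "$doc_root/$page";

    unless ($exists) {
        # The Status header sets the real response code. There is no
        # Location header, so the server has no reason to send a 302.
        return "Status: 404 Not Found\r\n"
             . "Content-Type: text/html; charset=utf-8\r\n"
             . "\r\n"
             . "<html><body><h1>404 Not Found</h1></body></html>\n";
    }

    # Page exists: return the normal response (placeholder body here).
    return "Content-Type: text/html; charset=utf-8\r\n"
         . "\r\n"
         . "<html><body>Contact form goes here</body></html>\n";
}

# Only emit a response when invoked by the web server as CGI
# (the server sets GATEWAY_INTERFACE). The page name is assumed
# to arrive in the query string as page=...
if ($ENV{GATEWAY_INTERFACE}) {
    my ($page) = ($ENV{QUERY_STRING} // '') =~ /(?:^|&)page=([^&]*)/;
    print build_response($DOC_ROOT, $page);
}
```

After a change like this, the access log entry for a missing page should show 404 for the request to service.pl itself, which is the code Googlebot sees.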