G, Y and Jeeves all continued crawling the server using its IP address, which has resulted in a large number of duplicate pages in their indexes, where the URLs look like [128.0.0.0...] but the contents of the page are the same as www.mydomain.com/page-name.html.
The host has given me access to the server and I need to upload an .htaccess file that will generate a 404 error for every request.
I’ve read everything I can find on .htaccess and I can’t figure out what commands to use.
Any help would be appreciated.
So, I'd go for method #1 or #2.
#1
In the .htaccess, put these lines:
AddHandler send-as-is .asis
DirectoryIndex index.asis
Then, create the asis document, index.asis:
Status: 404 Not found
Content-type: text/html

<html><body>
<h1>Error 404 Not Found</h1>
</body></html>
Apart from the Status and Content-type headers, you can edit the file and send any valid content you want.
#2
In the .htaccess, put this line:
DirectoryIndex index.cgi
Then create index.cgi with executable permission 700 (755 may also work):
#!/bin/sh
echo 'Status: 404 Not found
Content-type: text/html

<html><body>
<h1>Error 404 Not Found</h1>
</body></html>'
Basically, the script needs to send the same thing as the asis document.
The advantage of using CGI is that you can do other things as well, such as writing a log or sending mail.
But if the script does heavy work, it may tax the server.
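As a rough sketch of the "taking log" idea, index.cgi could append one line per request to a file before sending the 404. The LOG_FILE path and the log_request and send_404 function names are made up for illustration. REMOTE_ADDR, REQUEST_URI and HTTP_USER_AGENT are environment variables Apache normally sets for CGI scripts, but check what your host actually provides.

```shell
#!/bin/sh
# index.cgi: log the request, then answer 404 (a sketch, not a tested setup).
# LOG_FILE is a hypothetical path; pick one the web server user can write to.
LOG_FILE="${LOG_FILE:-/tmp/ip-requests.log}"

log_request() {
    # One line per hit: timestamp, client IP, requested URI, user agent.
    printf '%s %s "%s" "%s"\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "${REMOTE_ADDR:-unknown}" \
        "${REQUEST_URI:-unknown}" \
        "${HTTP_USER_AGENT:-unknown}" >> "$LOG_FILE"
}

send_404() {
    # Headers first, then one empty line, then the body.
    printf 'Status: 404 Not Found\n'
    printf 'Content-type: text/html\n'
    printf '\n'
    printf '<html><body>\n<h1>Error 404 Not Found</h1>\n</body></html>\n'
}

log_request
send_404
```

Appending a single line is cheap, so this should stay on the light side. Anything slower, such as sending mail on every hit, is where the server load starts to matter.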
asis <== the simplest and lightest way.
cgi (or other dynamic method) <== you can do many things
There are probably other ways, too.