404 for any requests for a certain subdomain

phpmaven

1:21 am on Feb 18, 2020 (gmt 0)

10+ Year Member



What is the easiest way to force a 404 response for any requests for a certain subdomain?

I accidentally got some URLs into Google's index and I want to make sure that any request for them will 404.

They all have the same subdomain "new.domainname.com"

Thanks,

Mark

phranque

2:28 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



i would suggest using mod_rewrite directives: a RewriteCond that tests the requested hostname using the %{HTTP_HOST} variable, and a RewriteRule with the G flag, which returns a 410 (Gone) status code. for this purpose that is essentially the same as a 404 (Not Found).
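
A minimal sketch of that suggestion, assuming the directives go in the .htaccess file (or vhost) that answers for the subdomain. The hostname new.example.com is a placeholder, not the poster's real domain:

```apache
RewriteEngine On
# Match only requests whose Host header is the test subdomain (placeholder name)
RewriteCond %{HTTP_HOST} ^new\.example\.com$ [NC]
# "-" means no substitution; the G flag answers every matching request with 410 Gone
RewriteRule ^ - [G]
```

Requests for any other hostname fall through the condition and are handled as usual.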

phpmaven

2:59 am on Feb 18, 2020 (gmt 0)

10+ Year Member



Thank you,

I actually ended up doing the following, since I really only want domain.com and www.domain.com to be served:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^(www\.)?domain\.com$ [NC]
RewriteRule ^ - [R=404,L]

phranque

4:30 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



the other option is to redirect these requests to the canonical hostname.
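
A rough sketch of that redirect option, assuming www.example.com is the canonical hostname and the site is served over https (both are placeholders for illustration):

```apache
RewriteEngine On
# Any request whose Host header is not the canonical hostname...
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# ...is sent permanently to the same path on the canonical host
# (the query string is carried over automatically)
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```

A redirect fits best when each unwanted URL has an equivalent page on the main site; otherwise a 404 or 410 says more plainly that the URL is gone.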

tangor

7:44 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just curious ... if the URLS in question no longer exist they will automatically return 404 ... the only "forcing" possible is 410.

tangor

7:46 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Secondary ... expect to see Google (or other search engines) request these URLs forever. Even forcing a 410 will not make the requests go away, merely reduce them.

lammert

8:31 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your rewrite rule suggests that all traffic which is not sent to the www. version of your domain should be 404-ed. There are some issues with that:

First, all traffic to example.com without a subdomain will also receive a 404. That is probably not what you intended. If I want to access a site directly without going through a search engine, I type the domain name in the browser address bar without the www and expect the webserver to direct me to the correct location.

Second, why is the webserver listening to these subdomains? It shouldn't if you have your VirtualHost sections set up properly. This rewrite rule is a fix for something which is caused somewhere else. It's a band-aid at best.

Third, if you don't use these subdomains, you could remove them from the DNS. In that case, search engines will remove the entries from the SERPs because they realize the endpoint doesn't exist.
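
The VirtualHost point above could look something like the following. This is only a sketch under assumptions: name-based virtual hosts on port 80, a made-up catch-all name (catchall.invalid), and a hypothetical DocumentRoot path.

```apache
# Listed first, so it becomes the default for any Host header
# that no other vhost on this address and port claims
<VirtualHost *:80>
    ServerName catchall.invalid
    # mod_alias: answer every path with 404 (no target URL is given for non-3xx codes)
    Redirect 404 /
</VirtualHost>

# The real site answers only for the hostnames listed here
<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias example.com
    DocumentRoot "/var/www/example"
</VirtualHost>
```

With this layout a request for new.example.com never reaches the site's own vhost, so no rewrite rule is needed there.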

phranque

10:57 am on Feb 18, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



lammert wrote: "Your rewrite rule suggests that all traffic which is not sent to the www. version of your domain should be 404-ed."

the RewriteCond checks for example.com OR www.example.com, because the (www\.)? group is optional:
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com$ [NC]


tangor wrote: "if the URLS in question no longer exist they will automatically return 404"

lammert wrote: "Second question, why is the webserver listening to these subdomains? It shouldn't if you have your VirtualHost sections properly setup. This rewrite rule is a fix for something which is caused somewhere else."

this situation can be caused by a VirtualHost that is configured for wildcard subdomains.
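
To illustrate the wildcard case, here is a hedged sketch of such a vhost, with the wildcard alias commented out and replaced by an explicit list. The hostnames and the DocumentRoot path are placeholders:

```apache
<VirtualHost *:80>
    ServerName www.example.com
    # A wildcard alias makes this vhost answer for every subdomain,
    # including new.example.com:
    # ServerAlias *.example.com
    # Listing hostnames explicitly narrows it to the ones meant to be served:
    ServerAlias example.com
    DocumentRoot "/var/www/example"
</VirtualHost>
```

Apache still hands unmatched hostnames to the first vhost defined for the address and port, so narrowing the alias helps only when this is not that first vhost.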

phpmaven

1:45 pm on Feb 18, 2020 (gmt 0)

10+ Year Member



I appreciate all of the follow-up responses. Perhaps I should explain the situation. I put a subdomain out there for testing purposes and somehow Google picked up on it and indexed a bunch of URLs from the test domain. Obviously this could create a serious problem with duplicate content. I requested removal through the Google Search Console and I want to make sure that the URLs never appear again. The way I have things set up, example.com redirects to www.example.com and new.example.com gets a 404. It sounds like 410 might be a better option. If any of you has any insight into what might be a better way to handle the situation, I would appreciate it. These forums have always been great. This is the first time I've posted here in quite a few years.

not2easy

2:42 pm on Feb 18, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If find/replace is a simple thing for these pages, you could use the old "unavailable_after" meta tag rather than just noindex and a 404 or 410. I have found it to work better than the GSC controls.
<meta name="GOOGLEBOT" content="unavailable_after: 18 Feb 2020 15:00:00 UTC"> 

phpmaven

3:00 pm on Feb 18, 2020 (gmt 0)

10+ Year Member



Would the best thing be to just not have a DNS entry for that subdomain at all? Or now that I've done the removal in GSC, is that going to create issues?

lucy24

6:31 pm on Feb 18, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



tangor wrote: "if the URLS in question no longer exist they will automatically return 404"

A 404 can also be returned manually. (Note that the L flag in mod_rewrite isn't needed with non-300-class responses. It will do no harm, it's just not necessary.) In some situations, this will make less work for the server, since it doesn't have to go look whether a specific file exists; at most it just has to glance at a RewriteCond.

My personal experience has been that a 410 makes Google stop requesting the URL sooner.
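
The manual 404 without the L flag might be written as below. It is a sketch only, using new.example.com as a placeholder hostname:

```apache
RewriteEngine On
# Only the test subdomain (placeholder name) is affected
RewriteCond %{HTTP_HOST} ^new\.example\.com$ [NC]
# A non-3xx status in the R flag drops the substitution and ends rewriting
# on its own, so no L flag is needed
RewriteRule ^ - [R=404]
```

Swapping [R=404] for [G] (or [R=410]) would give the 410 response instead.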

JayDub

1:30 am on Mar 25, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



Go with the 410. It's the best "Hey I took it down on purpose, so stop looking" header ... The only way a 410 header is served is with a deliberate/intentional action rather than a default "can't find it" (404) response from the server, which (by default) could be caused by something as simple as someone requesting a file while it's being replaced.

Example: When you upload a new version of a file, the remote version is deleted and the new version is saved in its place. If there's a delay in the upload/save portion of the process for any reason (including a slow connection) after the previous version is deleted, a 404 Not Found header will be served, because the server can't find the file, even though it will be back in a second or three.