Forum Moderators: Robert Charlton & goodroi

Wildcard CNAME record = unlimited subdomains. How to fix?

     
10:45 pm on Dec 22, 2014 (gmt 0)

New User

Top Contributors Of The Month

joined:Nov 20, 2014
posts: 31
votes: 0


(I found this WebmasterWorld thread that describes the same issue, but I don't really understand the proposed solution - http://www.webmasterworld.com/google/4635618.htm)

Running a site:mysite.com Google search turned up all sorts of weird subdomains on my site (e.g. test.mysite.com, niq.mysite.com, etc.) that are being indexed by Google for some unknown reason. My host recently suggested it's because my two CNAME records are set to:
1) *.mysite.com CNAME mysite.com
2) www.mysite.com CNAME mysite.com


The wildcard in #1 above means that I can literally type any prefix into a browser (e.g. yankees.mysite.com) and it will take a user to my site (with yankees.mysite.com showing in the URL field). Normally I wouldn't care too much, but Google is actually indexing some of these gibberish subdomains which can result in bad stuff since they might consider it duplicate content.
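
To make the wildcard behaviour concrete, here is a minimal Python sketch of how a name gets matched against a set of records. It is a simplified model, not a full implementation of DNS wildcard rules, and the function name `matching_record` and the record set are assumptions made up for illustration.

```python
def matching_record(hostname, records):
    """Return the record name that would answer for hostname, or None.

    Simplified model: an exact name wins; otherwise a '*.' wildcard
    covers any name beneath its parent domain.
    """
    hostname = hostname.lower().rstrip(".")
    if hostname in records:
        return hostname
    labels = hostname.split(".")
    # Try progressively shorter parents, e.g. a.mysite.com -> *.mysite.com -> *.com
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in records:
            return candidate
    return None


# Hypothetical record set mirroring the two CNAME entries plus the bare domain.
records = {"mysite.com", "www.mysite.com", "*.mysite.com"}
for host in ("www.mysite.com", "yankees.mysite.com", "niq.mysite.com"):
    print(host, "->", matching_record(host, records))
```

With the `*.mysite.com` entry removed from the set, the same lookup for `yankees.mysite.com` returns `None`, which corresponds to the name no longer resolving.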

I want to know how to fix that so that I don't get dinged. My host suggested I could just delete the wildcard CNAME record (#1 above). But I'm nervous to do that...especially since a poster in the WebmasterWorld thread I linked to above pointed out that even though I might think I only use mysite.com and www.mysite.com, it's possible that I have mail servers set up to use pop.mysite.com or mail.mysite.com...I really have no idea. And if it were as simple as just deleting the wildcard CNAME record, I don't understand why that 'solution' is never suggested in all the threads I've found about this issue. Instead, posters recommend redirects, or other htaccess configuration changes.

In a nutshell I just want to know whether deleting the *.mysite.com CNAME record would do the trick...I don't care if gibberish subdomains typed into a browser don't return any page...I just don't want Google indexing an infinite # of subdomain-versions of my site.

Also, I'd be really curious to know why Google is even finding & indexing these gibberish subdomains to begin with...I thought it built its index only from backlinks...but there aren't any actual links online to nonsense subdomains...

[edited by: brotherhood_of_LAN at 12:48 am (utc) on Dec 23, 2014]
[edit reason] fixed link [/edit]

1:11 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 887


Normally I wouldn't care too much, but Google is actually indexing some of these gibberish subdomains which can result in bad stuff since they might consider it duplicate content.

I don't understand. What physical content is it finding at ekcjklrjg.example.com? (It has just this instant occurred to me that Google could easily do wild-card-subdomain requests in exactly the same way they check for Soft 404s. Don't know if this is something they actually do.)
1:13 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2008
posts:2910
votes: 62


For Apache in either the httpd.conf [preferred] or the .htaccess:

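# Enable mod_rewrite for this context
RewriteEngine on
# Match any request whose Host header is neither empty nor exactly www.example.com
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
# Send it to the same path on the canonical hostname with a permanent redirect
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]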

The above will redirect [consolidate] everything that's not www.example.com to www.example.com. If you use example.com [non-www] rather than www.example.com, then simply remove the www\. and www. from the condition and rule above, respectively, and everything will be consolidated to example.com.
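
For anyone who wants to see what the condition and rule above decide for a given request, the logic can be mirrored in a few lines of Python. This is only a sketch of the decision, not of how Apache evaluates it internally; the function name `canonical_redirect` is an assumption made up for illustration.

```python
import re

CANONICAL_HOST = "www.example.com"
# Same pattern as the RewriteCond: an empty Host or exactly the canonical host.
ALLOWED = re.compile(r"^(www\.example\.com)?$")


def canonical_redirect(host, request_uri):
    """Return the 301 target URL, or None if no redirect is needed."""
    if ALLOWED.search(host):
        return None  # the negated condition fails, so the rule does not fire
    return "http://" + CANONICAL_HOST + request_uri


for host in ("www.example.com", "example.com", "yankees.example.com"):
    print(host, "->", canonical_redirect(host, "/internal_page.html"))
```

Any hostname other than the canonical one, including the bare domain and gibberish subdomains, maps to the same path on `www.example.com`.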

Note: It's likely unnecessary to worry about a wild-card subdomain causing any type of duplicate content issue. Google groups duplicate URLs together and attributes the ranking signals to the version it determines to be the canonical, which it uses a number of signals to try to determine. Still, consolidating is a best practice and may help you sleep better, so I'd just use the above code and not worry about it again.

BTW: Welcome to WebmasterWorld!
1:30 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 887


Oh, whoops, you mean you're not actually using any subdomains? Then all you need is the domain-name-canonicalization redirect given above. Don't worry about "mail" or "pop" or similar. Unless your site has a very weird configuration, those wouldn't be used by http requests.

But if you're really not using any subdomains, there's not much point to keeping the wild cards enabled.
1:34 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2008
posts:2910
votes: 62


Note 2: Yes, I'm aware the SEO community often talks about there being a duplicate content penalty, but there's really no such thing. It simply looks that way sometimes because Google picks a different URL with duplicate content to display in the results than we think it should.

Note 3: When using the above code, make sure it comes after any other external redirects and all external redirects are pointed to the version of the domain to be considered the canonical, and also make sure the code comes before any internal rewrites -- This will help prevent "chained redirects", which can be a "not good thing" if there are too many.
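
The "chained redirects" point in Note 3 can be illustrated with a small Python sketch that follows a URL through a redirect table and counts the hops. The URLs and the `follow` helper are hypothetical, chosen only to show the difference between pointing an older redirect at a non-canonical hostname and pointing it straight at the canonical one.

```python
def follow(url, redirects, limit=10):
    """Follow a URL through a redirect table; return the list of URLs visited."""
    chain = [url]
    while url in redirects and len(chain) <= limit:
        url = redirects[url]
        chain.append(url)
    return chain


# An older redirect targets the non-www host, so the canonical
# hostname redirect has to add a second hop.
chained = {
    "http://example.com/old.html": "http://example.com/new.html",
    "http://example.com/new.html": "http://www.example.com/new.html",
}

# The same older redirect pointed directly at the canonical hostname.
direct = {
    "http://example.com/old.html": "http://www.example.com/new.html",
}

for table in (chained, direct):
    chain = follow("http://example.com/old.html", table)
    print(len(chain) - 1, "hop(s):", " -> ".join(chain))
```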
7:45 am on Dec 23, 2014 (gmt 0)

New User

Top Contributors Of The Month

joined:Nov 20, 2014
posts: 31
votes: 0


Thanks for the replies, all.

I don't understand. What physical content is it finding at ekcjklrjg.example.com?


Well...it appears to be exactly the same website. The only difference being that the browser URL bar says ekcjklrjg.mysite.com...and if I navigate to other pages on the site, the URL retains the subdomain e.g. ekcjklrjg.mysite.com/internal_page.html

Oh, whoops, you mean you're not actually using any subdomains? Then all you need is the domain-name-canonicalization redirect given above. Don't worry about "mail" or "pop" or similar. Unless your site has a very weird configuration, those wouldn't be used by http requests. But if you're really not using any subdomains, there's not much point to keeping the wild cards enabled.


That's correct, I don't have any subdomains set up...and I don't want any.

Note 2: Yes, I'm aware the SEO community often talks about there being a duplicate content penalty, but there's really no such thing. It simply looks that way sometimes because Google picks a different URL with duplicate content to display in the results than we think it should.


I've read a lot of articles that suggest that the above isn't quite true...ofc because search engine algorithms are a bit of a black box, I haven't seen anyone 'prove' that duplicate content issues directly cause SERP penalties, but people do seem to be concerned about it...regardless, as you say, I'll sleep better knowing that I'm employing best practices.

Lastly, would the 'solution' I asked about in my OP (just deleting the wildcard CNAME record) do the trick? Or is that something I should avoid, and instead address using MadScientist's code?

Thanks!
8:22 am on Dec 23, 2014 (gmt 0)

Senior Member from NL 

WebmasterWorld Senior Member lammert is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 10, 2005
posts:2956
votes: 35


Deleting the wildcard CNAME record will solve your problem with weird domain names in Google, but that record isn't there without reason. One possible reason as already mentioned is the existence of subdomains used for mail (smtp.example.com, pop3.example.com, imap.example.com) or nameservers (ns1.example.com, ns2.example.com). Deleting the record may therefore cause other problems which may be more severe for your online presence than the annoyance of having multiple subdomains in the SERPs. I would therefore advise you to add the code which MadScientist proposed for your httpd.conf or .htaccess file, because that will solve your problem without any other negative effects.
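
One way to reason about that risk before touching DNS is to compare the hostnames actually in use against the records defined explicitly in the zone. The Python sketch below assumes the zone has been exported as simple (name, type, value) tuples; the record data, the list of names in use, and the function name `lost_without_wildcard` are all hypothetical.

```python
def lost_without_wildcard(names_in_use, zone_records):
    """Return the names that resolve only because of a wildcard record."""
    explicit = {
        name.lower()
        for name, rtype, value in zone_records
        if not name.startswith("*.")
    }
    return sorted(n for n in names_in_use if n.lower() not in explicit)


# Hypothetical zone export; the IP addresses are documentation placeholders.
zone = [
    ("mysite.com", "A", "203.0.113.10"),
    ("www.mysite.com", "CNAME", "mysite.com"),
    ("*.mysite.com", "CNAME", "mysite.com"),
    ("mail.mysite.com", "A", "203.0.113.20"),
]

# Hypothetical hostnames taken from mail client and FTP settings.
in_use = ["www.mysite.com", "mail.mysite.com", "pop.mysite.com"]

print(lost_without_wildcard(in_use, zone))
```

Any name this prints has no record of its own and would stop resolving if the wildcard were deleted, so it would need an explicit record first.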
9:47 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2008
posts:2910
votes: 62


I've read a lot of articles that suggest that the above isn't quite true...ofc because search engine algorithms are a bit of a black box, I haven't seen anyone 'prove' that duplicate content issues directly cause SERP penalties, but people do seem to be concerned about it...

Certainly what I said isn't "popularly" correct and people are concerned about it, but not being popular != wrong. Look into the meaning of FUD to further understand why people are concerned ;) -- Hint: It works and makes $$$! -- When you know what's going on, don't just blame Google for spreading it, because the SEO community spreads it thick -- Gotta "sound right" and "say the right thing", even if the info and what's presented is not accurate, because even if what someone says is totally wrong, sounding right helps make a buck ya know...

Yes, I'm a bit of "the black sheep", so no need for anyone to point it out, thanks.

Deleting the wildcard CNAME record will solve your problem with weird domain names in Google, but that record isn't there without reason. One possible reason as already mentioned is the existence of subdomains used for mail (smtp.example.com, pop3.example.com, imap.example.com) or nameservers (ns1.example.com, ns2.example.com).

Yeah, leave the cNames alone, unless you really know what you're doing and why you want to change/delete some from being accessed in any way -- The httpd.conf/.htaccess code I posted is a much more "safe" solution to the situation you're stating.

[edited by: TheMadScientist at 10:36 am (utc) on Dec 23, 2014]

10:35 am on Dec 23, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11870
votes: 245


welcome to WebmasterWorld, domino66!


would the 'solution' I asked about in my OP (just deleting the wildcard CNAME record) do the trick? Or is that something I should avoid, and instead address using MadScientist's code?

if a potential visitor requests http://ww.example.com/ would you rather they get an unresolved IP address or a 301 to the canonical hostname?
i would keep the wildcard CNAME and add a canonical hostname redirect.
11:10 am on Dec 23, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2008
posts:2910
votes: 62


Note 4 - LOL See the ww rather than www in phranque's question above?

Changing the cName to not allow wild cards will make it so people can't access your site with anything other than what you say [explicitly allow], so people will get an Unresolved IP Error if they make a simple typo [EG ww.example.com instead of www.example.com]. Installing a redirect [presented above] to send all traffic to what you would like to be the canonical hostname [www.example.com, example.com, some-subdomain-you-pick.example.com, etc.] instead makes it so anyone who types w.example.com, ww.example.com, somename.example.com, somejibberish-1234-im-making-stuff-up-now-deal-with-it-lol.example.com or anything else ending in .example.com [note the . (dot) preceding example.com] goes to the right place/page to see what you present on your site.

It's definitely a safer and better solution to redirect all http(s) hostnames to what you want as the canonical http(s) hostname than to delete/change a wildcard cName, unless you really know what you're doing and exactly why you're doing it, IMO.
6:59 pm on Dec 27, 2014 (gmt 0)

New User

Top Contributors Of The Month

joined:Nov 20, 2014
posts: 31
votes: 0


Thanks, all, for the replies.

The one oddity that I still don't quite understand is why Google would index any of these gibberish subdomains at all!? I was under the impression that they'll only index something their crawlers find backlinks to (or perhaps that gets submitted via their URL submission engine.)

But I'm 100% sure that there are no live backlinks to any of these subdomains...e.g. test.mysite.com, wwwmysite.com. So why is Google indexing them?

(The best idea that was suggested to me was that perhaps Google will also index a site *if* someone types a web address into the browser's URL field *and* Google finds an actual website there, as would be the case with my gibberish subdomains because of the wildcard CNAME thing I'm going to solve with a canonical redirect...still seems unlikely though...)
8:45 pm on Dec 27, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 887


At the beginning of this thread I had a moment of stream-of-consciousness:
It has just this instant occurred to me that Google could easily do wild-card-subdomain requests in exactly the same way they check for Soft 404s. Don't know if this is something they actually do.

I don't personally know how to verify this. But someone will know how subdomain logs work. See if they've been putting in requests for string-of-random-letters-here.example.com.
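
A rough way to check the logs for this is sketched below in Python. It assumes a custom access log format whose first field is the requested Host header (in Apache, `%{Host}i` in a LogFormat directive), followed by the usual combined-format fields; the default formats do not necessarily record the requested hostname. The sample lines, the `KNOWN_HOSTS` set, and the function name are made up for illustration, and a user agent that mentions Googlebot can be spoofed, so treat any hits as a hint rather than proof.

```python
import re
from collections import Counter

# Assumed layout:
# <host> <client-ip> - - [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

KNOWN_HOSTS = {"example.com", "www.example.com"}


def unexpected_googlebot_hosts(lines):
    """Count requests whose user agent mentions Googlebot, per unknown Host."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        host = match.group("host").lower()
        if "Googlebot" in match.group("agent") and host not in KNOWN_HOSTS:
            counts[host] += 1
    return counts


# Hypothetical log lines, not taken from any real server.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    'ekcjklrjg.example.com 66.249.66.1 - - [27/Dec/2014:20:45:00 +0000] '
    '"GET / HTTP/1.1" 200 5120 "-" "' + GOOGLEBOT + '"',
    'www.example.com 66.249.66.1 - - [27/Dec/2014:20:46:00 +0000] '
    '"GET / HTTP/1.1" 200 5120 "-" "' + GOOGLEBOT + '"',
    'test.example.com 198.51.100.7 - - [27/Dec/2014:20:47:00 +0000] '
    '"GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for host, hits in unexpected_googlebot_hosts(sample).items():
    print(host, hits)
```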

The one oddity that I still don't quite understand is why Google would index any of these gibberish subdomains at all!?

Google indexes absolutely everything, unless it has been explicitly told not to.
10:59 pm on Dec 27, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 11, 2007
posts:774
votes: 3


The one oddity that I still don't quite understand is why Google would index any of these gibberish subdomains at all!? I was under the impression that they'll only index something their crawlers find backlinks to (or perhaps that gets submitted via their URL submission engine.)


How do you know that a competitor who has discovered that you are using wildcarding is not placing links around the web pointing to a bunch of random sub-domains? Just because such links do not show up in WMT, Majestic, OSE, etc. does not mean that they do not exist.

I agree with others here that I would use 301 redirects to implement a URL canonicalization policy.
6:11 am on Dec 28, 2014 (gmt 0)

Preferred Member

5+ Year Member

joined:Jan 12, 2012
posts:397
votes: 0


The answer will depend on your host and the cname records they use. But for most sites, the basic cname records include something like:

yourdomain.com
www.yourdomain.com
pop.yourdomain.com
smtp.yourdomain.com
ftp.yourdomain.com

Beyond that, you can manually add more subdomains. Check with them and ask them to help you set things up -- but for the vast majority of sites out there, there isn't really a good use for wildcard subdomains.