Forum Moderators: buckworks & webwork


Domain Sellers Simply Wreck Domains

Search engine flags and domains on sale

         

Chico_Loco

3:39 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been seeing a lot of domains lately that are for sale. Visit the domains directly and you see the usual stuff: "Buy This Domain"..

And every single time I look at my Google Toolbar, what do I see? PR0.. So even if I wanted to buy that domain, it's virtually useless in terms of Google..

Why do they do the things they do? They host all of these domains on one server with the same common selling page, and Google goes and considers them a farm, banning them essentially for life from the index.

In doing this, they are reducing the value of those domains, reducing their profit AND destroying them for others .. I reckon they take a class in stupidity to get to this level.. It's not like they don't know about Google and the PR0 bans, so why do they persist?

rfgdxm1

3:49 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wrong, Chico_Loco. Likely these are domains bought by domain name speculators. They are PR0 not because they are penalized, but because they lack inbound links. These are domains which are waiting for a site to find them.

Chico_Loco

3:51 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Na, you're wrong .. There is a specific domain that expired a while ago, had a PR5, still has inbound links, yet it's now a PR0 since the domain seller got its hands on it!

rfgdxm1

4:19 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



But Chico, you wrote "I've been seeing a lot of domains lately that are for sale, by visiting the domains directly you see the usually stuff "Buy This Domain".." There are of course PR0 domains for sale out there that have been penalized. However, the vast majority of these are just held by domain name speculators. The last stats I saw showed that a huge percentage of registered domain names were inactive. A significant percentage are held by speculators who (mostly falsely) think thars gold in them thar domains.

BigDave

4:29 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



they host all of these domains on one server with the same common selling page

Looks to me like duplicate content. As soon as it is changed to unique content it should come right back.

jmccormac

4:36 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I run a few search engines and one of the techniques that I use is based on identifying clusters of dodgy domain registrations. Notably one particular company in Hong Kong purchases these expired domains and then points them to its portal. I've just been spidering the UK .coms and this particular company has approximately 81989 websites on these domains (the IP space used is assigned to the UK). Naturally all of these websites will be eliminated from the search index.

As for Google, they may have developed some program to do something similar. It would not be that difficult:
1. Generate a checksum based on the web content and the nameservers.
2. If this month's checksum differs from last month's, check the registrar.
3. If the domain has moved to a cybersquatter, nuke the website.

As simple as 1,2,3 really. :) However Google doing something so fundamentally simple is unusual. The inbound links argument is a possibility but the single IP hosting is a stronger one.
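The three steps above can be sketched roughly as follows. This is only an illustration of the idea, not anything Google or any search engine is known to run; the record layout (`content`, `nameservers`, `registrar`) and the `squatter_registrars` set are assumptions made for the example.

```python
import hashlib


def site_checksum(content, nameservers):
    """Step 1: hash the page content together with the nameserver list."""
    h = hashlib.sha256()
    h.update(content.encode("utf-8"))
    # Sort and lowercase so the order or case of nameservers does not matter.
    for ns in sorted(n.lower() for n in nameservers):
        h.update(b"\0" + ns.encode("utf-8"))
    return h.hexdigest()


def should_drop(last_month, this_month, squatter_registrars):
    """Steps 2 and 3: look at the registrar only when the checksum changed."""
    old = site_checksum(last_month["content"], last_month["nameservers"])
    new = site_checksum(this_month["content"], this_month["nameservers"])
    if old == new:
        return False  # nothing changed, nothing to check
    # Hypothetical list of registrars associated with squatting operations.
    return this_month["registrar"] in squatter_registrars
```

Note that any content edit changes the checksum, so the checksum only acts as a trigger for the registrar check; on its own it says nothing about squatting.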

Another thing that some SE operators look for when compiling a search index is linkswamps. These are single IPs with large numbers (5000-60000) of websites hosted on them. When you find them, it is a good thing because you can eliminate them from your index. These linkswamps are typically 'coming soon' or 'this domain was registered by a client of $hostingcompany' sites. The UK .com website index I am working on at the moment has a number of these linkswamps, the biggest having 50236 websites. Removing about 300,000 or so of these linkswamp websites from an index of 1069947 UK .com websites is a good thing for both the bandwidth and the end user. Though if Google has adopted a similar 'search and destroy' ;) attitude, it will upset domain speculators.
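Spotting candidate linkswamps comes down to counting websites per IP. A minimal sketch, assuming the crawler already has (domain, IP) pairs; the function name is made up for the example and the 5000 default is taken from the low end of the range quoted above:

```python
from collections import defaultdict


def find_linkswamps(site_ips, threshold=5000):
    """Return {ip: [domains]} for IPs hosting at least `threshold` websites.

    `site_ips` is an iterable of (domain, ip) pairs from the crawl.
    """
    by_ip = defaultdict(list)
    for domain, ip in site_ips:
        by_ip[ip].append(domain)
    return {ip: doms for ip, doms in by_ip.items() if len(doms) >= threshold}
```

A high count only marks an IP as a candidate. Whether the sites on it are holding pages still has to be checked separately.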

Regards...jmcc

JustSomeLameNewbie

7:00 am on Jan 23, 2003 (gmt 0)



Oh great yeah so every time somebody legitimately sells their site to someone else they would lose ALL pagerank and search engine value. So Google would be suffocating a significant part of the Internet economy.

Or if someone adjusts their domain name record to put a comma in their address when they change hosts, they get dropped from Google?

Or if their registrar decides to change the format of the whois record slightly, then the next time they change host, they are dropped?

Surely Google are not *that* stupid? Please somebody who knows, tell me they're not ....

joeuz

7:04 am on Jan 23, 2003 (gmt 0)

10+ Year Member



Another thing that some SE operators look for when compiling a search index is linkswamps. These are single IPs with large numbers (5000-60000) of websites hosted on them. When you find them, it is a good thing ...

Hello jmccormac,

Are you saying that if you find 5000 websites sharing the same IP, then you would automatically discard them all? I, for instance, run a server that currently has 2000+ websites, all on the same IP. They are all small websites sharing similar graphic templates, but each has its own different content. They were made for a customer who wanted a separate site for each product/service. The sites could potentially number more than 5000. Would a search engine like yours drop all those sites?

Thanks

Shakil

7:32 am on Jan 23, 2003 (gmt 0)



In doing this, they are reducing the value of those domains, reducing their profit AND destroying them for others .. I reckon they take a class in stupidity to get to this level.. It's not like they don't know about Google and the PR0 bans, why do they persist?
============================================================

Maybe it's that when we sell domains, we don't give a damn about Google PR.

I have sold some of the most valuable domains in the UK, and NEVER ever has a prospective client asked about PR.

Shak

jmccormac

7:33 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oh great yeah so every time somebody legitimately sells their site to someone else they would lose ALL pagerank and search engine value. So Google would be suffocating a significant part of the Internet economy.

Not necessarily. A legitimate sale of a site would involve largely the same content. As such it would be relatively transparent to Google and other search engines. The links and content would still exist as would the page rank.

Or if someone adjusts their domain name record to put a comma in their address when they change hosts, they get dropped from Google?

A changed host does not necessarily mean a change of registrant. The content would be the same in this case. It is probable that Google would maintain the ranking but there may be some outage while Google picks up the site on the new host. As for checking complete WHOIS data on sites, it would be unlikely that Google would be interested in going that deep on a domain.

Or if their registrar decides to change the format of the whois record slightly, then the next time they change host, they are dropped?

The bigger cybersquatters tend to use specific registrars, and checking the registrar of a domain name does not require a full whois query. Thus if a domain shifts from one registrar to one that is known to be used by cybersquatters, and the website IP changes to an IP associated with one of these cybersquatting operations, it is a pretty good indication that the domain is being squatted. Therefore Google and any other search engine would be wasting users' time and bandwidth in sending them to the website of the squatted domain.

Surely Google are not *that* stupid? Please somebody who knows, tell me they're not ....

A squatted site would involve a massive change in content from a SE point of view especially if that site had hundreds of pages in Google. A reduction to a single page with 'this domain is for sale' on it would probably indicate something was up. Actually for Google to remove cybersquatted domains would be a good thing for the end user unless the end user is specifically looking for domains to purchase. Though the building of specific profiles for large cybersquatters is increasingly going to be a part of all SE work, the smaller one or two domain squatter is probably going to escape for a while. The SEs are always going to be playing catch-up and can be 6 to 8 weeks out of date. There used to be a buy/deny problem with some registrars allowing customers to buy recently expired domains and then deny or void the purchase a few days later thus getting the traffic for that site for a few days. That option has been stopped (I think).

At a guess, Google would probably apply some 'known bad' IP routines. It already sorts websites for countries based on IPs, so if it could tie a particular IP or range of IPs to a large cybersquatting operation, it would make good sense to re-examine the sites associated with these domains to see if there is a significant change in content.
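A 'known bad' IP routine of this kind could look something like the sketch below. It is a guess at the shape of such a check, not a description of what Google does; the function names are invented for the example and the ranges used with it would be whatever an operator has tied to a squatting operation.

```python
import ipaddress


def in_known_bad_range(ip, bad_ranges):
    """True if `ip` falls inside any listed range (CIDR block or single IP)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in bad_ranges)


def sites_to_reexamine(site_ips, bad_ranges):
    """Return the domains whose current IP sits in a known-bad range.

    `site_ips` maps domain -> current IP. A match only queues the site
    for a content comparison; it is not a verdict in itself.
    """
    return sorted(d for d, ip in site_ips.items()
                  if in_known_bad_range(ip, bad_ranges))
```

For example, with `bad_ranges = ["203.0.113.0/24"]` (a documentation range standing in for a real one), a site that has moved to 203.0.113.45 would be queued for re-examination while one on 192.0.2.10 would not.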

I'd hate to think of the number of com/net/org domains that are just lying there unused. The dot.bomb bubble created massive speculation in domain names, though this has largely subsided. The purchase of an expired domain name purely for its Google PR is a dubious thing at best, and ultimately Google will find a way to eliminate such websites (even if it does involve comparing the cached version against the latest version for fundamental differences). The domain resellers help by destroying the content that the domain's website had and replacing it with a for sale sign.
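Comparing a cached version against the latest version for fundamental differences might be sketched like this. The word-overlap measure, the two thresholds and the {path: text} layout are all assumptions chosen for the example; a real engine would presumably use something more robust.

```python
def word_overlap(a, b):
    """Jaccard overlap of the word sets of two texts (1.0 = identical sets)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def fundamentally_changed(cached_pages, latest_pages,
                          min_overlap=0.3, max_page_loss=0.9):
    """Compare two crawls of one site, each given as {path: text}."""
    if not cached_pages:
        return False  # nothing cached to compare against
    shared = set(cached_pages) & set(latest_pages)
    if not shared:
        return True  # every previously known page is gone
    # Fraction of the cached pages that no longer exist.
    lost = 1 - len(shared) / len(cached_pages)
    avg = sum(word_overlap(cached_pages[p], latest_pages[p])
              for p in shared) / len(shared)
    return lost >= max_page_loss or avg < min_overlap
```

A site that collapses from hundreds of pages to a single for-sale page trips both conditions, while a sale that only changes the owner's name in the title leaves the overlap high and the page count intact.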

Regards...jmcc

JustSomeLameNewbie

7:55 am on Jan 23, 2003 (gmt 0)



A legitimate sale of a site would involve largely the same content.

It is not a question of "largely". It's a question of tripping up your arbitrary checksum triggers.
e.g. "<TITLE> Acme CoolWebHosting.com" gets changed to "<TITLE> Smith's CoolWebHosting.com". Bet alarms would go off all over the place for that simple change.

A changed host does not necessarily mean a change of registrant.

No of course not .... but it might. So be very cautious of tidying up your whois record when you are changing hosts then? Lame.

The content would be the same in this case.

So never do a site layout change or add new content to your page then? I don't think so. You can see these workarounds penalizing more and more innocent people, and the algorithms becoming more and more like cancer surgery, can't you?

The bigger cybersquatters tend to use specific registrars and checking the registrar of a domain name does not require a full whois query

"Tend to use" is just not good enough. I see where you are coming from - you take the view that "I am trying to stamp out bad guys here so if a few of you innocents get killed along the way, it's all for a good cause". Clearly after learning about your scheme, squatters will start choosing more mainstream registrars, and then according to your algorithms, even more casualties will occur.

These workarounds have major holes in them. I predict at some point, Google hires a large team of manual editors. There is no other robust way.

jmccormac

8:20 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Are you saying that if you find 5000 websites sharing the same IP, then you would automatically discard them all? I, for instance, run a server that currently has 2000+ websites, all on the same IP. They are all small websites sharing similar graphic templates, but each has its own different content. They were made for a customer who wanted a separate site for each product/service. The sites could potentially number more than 5000. Would a search engine like yours drop all those sites?

Nothing is ever automatic joeuz :)
What sites like mine would be specifically looking for on the first run (the pre-index) is the 'coming soon' and 'future home of' sites. Some of these sites even divert robots.txt to their 'coming soon' page. For a large registrar like Network Solutions or Register.com these 'coming soon' sites would have very high website counts. The biggest single IP count on UK .com websites was 50236 and it appears (on first check) that it is a holding page IP. The next one is 46003 and this is the IP of a Hong Kong operation that buys up expired domains and points them to its portal.

The large registrars/hosting companies tend to have a few specific IPs for these on-hold websites. If these can be identified reliably (and often that means checking all of the websites) then a website on that IP would not make it into the main index. Some of these high website count IPs could be redirectors/web forwarding, hence the need for checking each website. If the sites on the single IP have a lot of different content then they would be included. The fact that there are so many sites hosted on a single IP would flag it for closer checking. But in global terms it is not that high.
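The "checking all of the websites" step could be approximated by measuring how much of an IP serves the same page. This is a rough sketch under stated assumptions: the function names and the 0.9 cut-off are invented for the example, and exact hashing will miss holding pages that embed the domain name in the text.

```python
import hashlib
from collections import Counter


def normalise(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def holding_page_share(pages):
    """Fraction of sites on one IP that serve the single most common page.

    `pages` maps domain -> fetched page text for every site on that IP.
    """
    if not pages:
        return 0.0
    digests = Counter(
        hashlib.sha256(normalise(t).encode("utf-8")).hexdigest()
        for t in pages.values()
    )
    return digests.most_common(1)[0][1] / len(pages)


def is_holding_ip(pages, min_share=0.9):
    """Treat the IP as a holding-page IP when almost every site is identical."""
    return holding_page_share(pages) >= min_share
```

Under this check, an IP with thousands of sites that each carry their own content would score a low share and stay in the index.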

The whole idea of producing a good search engine is to give the user what they want when they want it. If the content on those 2000+ sites is good and varied then they should make it into any search engine. (There may be crosslinking issues with some PR based search engines though.) While the website count may be on the high side, the grouping of sites is very common with geographical/tourist sites where the end of each domain name would be the name of a particular region or county. A high website count on an IP can sometimes indicate a co-located box belonging to a webdev company hence the need to often check everything just to be sure. From a search engine operator point of view, knowing what not to spider is as important as knowing what should be spidered. Google's method is, as far as I know, to use a crawler and then analyse the data. However it is important for Google and any other SE to ignore some IPs that are just used for 'holding page' websites as it could end up with a few million requests for what is essentially the same 'on hold' page. The squatted domains will generally escape under Google's radar for a while but they will give themselves away because of the massively changed content. The pre-index process is intended to increase the usability of the search engine and removing 'on hold' and 'coming soon' pages is a good method.

Regards...jmcc

onlineleben

8:27 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>even if I wanted to buy that domain, it's virtually useless in terms of Google<
Usually you don't buy a domain for Google but to fit your cause and your content. Even if it has a PR0 (which newly created, virgin domains also have), you can start to build PR by getting links.
And for the instant success some of us always want, there still are some well known PPC engines.

heini

8:59 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Interesting insights from "the other side", thanks jmcc.

In any case this practice makes it harder for people to decide if a domain on sale is flagged by the SEs for past sins or not.