
Google SEO News and Discussion Forum

This 78 message thread spans 3 pages.
Google's CN Domain Spam Plague - now noted by John Dvorak
tedster




msg:3460549
 5:59 pm on Sep 25, 2007 (gmt 0)

In our September Google SERP Changes [webmasterworld.com] discussions, members have been discussing the plague of spam sites using a cn domain extension. This has gone on long enough that now John Dvorak of PC Magazine published an article about it. Pretty high profile exposure for an embarrassing problem that is apparently quite challenging for Google's infrastructure!

Our members here also note that many of these domains use an odd character instead of a dot, possibly as part of the scheme. Also, these spam domains may be from anywhere, and not necessarily China at all.

Warning for those doing detective work - many of these spam domains will attempt to install malware on your computer.

...nine of the top ten results are these weird Chinese sites....the more specific and detailed the search request, the more likely Google is to list these Chinese sites. The issue has apparently been reported to Google, but if the basic algorithms allow this sort of result, even banning the specific sites will not stop this sort of abuse.

[news.yahoo.com...]


 

tedster




msg:3460595
 6:42 pm on Sep 25, 2007 (gmt 0)

I noticed Dvorak's suggestion: "Right now the motives behind this phenomenon are obscure, unless it's being done just for testing purposes. You know, like underground nuclear testing."

Anytime there's a major malware campaign, the purpose may well be establishing enough outposts on remote computers to launch a successful DDOS attack, potentially undermining business for a major corporation. Even overwhelming the internet's infrastructure is a possible target. This could be the sign of such a campaign. The scale is certainly massive.

In the past, other such efforts had some success with a lot less insertion into the SERPs. But I'm not enough of a malware forensic worker to dive into the code on my own.

[edited by: tedster at 9:03 pm (utc) on Sep. 26, 2007]

whitenight




msg:3460603
 6:53 pm on Sep 25, 2007 (gmt 0)

Ahh. Been waiting for mainstream media to report this.

Let's see how quickly this gets fixed now that the spotlight is on the 'Plex.

tedster




msg:3460617
 7:01 pm on Sep 25, 2007 (gmt 0)

It looks like this is, at its base, an issue with internationalized domain names - those using unicode, non-ascii characters. The way recent browsers are handling these domain names always sounded like a kludge to me, and now apparently it is an issue for Google's infrastructure as well.

For example, IE7 is using something they call "punycode" [blogs.msdn.com] to guard against domain spoofing in phishing attacks. But Google's challenge apparently goes beyond what browser makers face. At its worst, this might require Google to change core routines on every server they're running!
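
For readers who have not seen Punycode before, here is a minimal Python sketch of the ASCII form that an internationalized name is converted to. The host name `bücher.example` is a made-up illustration, not one of the spam domains discussed in this thread, and Python's built-in `idna` codec follows the older IDNA 2003 rules, so treat this as an approximation of what a browser or crawler does.

```python
# Illustrative only: the host names used here are made-up examples.
def to_ascii(hostname: str) -> str:
    """Return the ASCII (Punycode) form of an internationalized host name."""
    # The built-in "idna" codec converts each label and prefixes
    # encoded labels with "xn--".
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Reverse the conversion, e.g. for display purposes."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("bücher.example"))
    print(to_unicode("xn--bcher-kva.example"))
```

A name that is already plain ASCII passes through unchanged, which is why ordinary domains are unaffected by this step.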

Lord Majestic




msg:3460630
 7:08 pm on Sep 25, 2007 (gmt 0)

The problem is that domain names have become too cheap - this allowed squatters to grab millions of them to have them parked (and Google encourages this by allowing AdSense on parked domains) and spammers also buy lots of throw-away domains.

There are maybe 100 people responsible for 90% of search engine spam - it is them, the root of the problem, that need to be dealt with, not the symptoms of this disease - their actions.

pageoneresults




msg:3460664
 7:42 pm on Sep 25, 2007 (gmt 0)

Interesting close to the story...

On that note, I should add that entering "reset mp3 player m240d" on Yahoo! yielded worse returns, with all the results being weird Chinese sites—including one that tried to load a Trojan (caution!) that AVG killed immediately. When the same search terms were used on MSN, there were no results at all.

So, it's not only Google but Yahoo! too. And what's up with MSN? Not enough market share to infiltrate their SERPs or they just haven't gotten around to it yet?

whitenight




msg:3460672
 7:48 pm on Sep 25, 2007 (gmt 0)

At its worst, this might require Google to change core routines on every server they're running!

Seems Goog better put all their resources into fixing this.
Before some enterprising Public Relations person at Ask, Yahoo! or MSN decides to "leak" this story to the national outlets that Google...err...results in Google are installing malware and people should be using "safe" results. (their engine, of course)

Beachboy




msg:3460710
 8:38 pm on Sep 25, 2007 (gmt 0)

How about Google just block all .cn domains -- with certain specific exceptions for well-known, established .cn sites -- from the USA-displayed SERPs until they come up with an algo fix? That should be a cinch to do.

g1smd




msg:3460720
 8:47 pm on Sep 25, 2007 (gmt 0)

>> The problem is that domain names have become too cheap <<

Add to that the problem of "domain sampling" and there are real issues.

acemi




msg:3460731
 9:01 pm on Sep 25, 2007 (gmt 0)

How about Google just block all .cn domains

Taking a look at Yahoo results for the search mentioned, #11, 12 and 13 appear to be of similar origin but with .info urls.

The .cn seem to be 525 - 550k while the .info are around 300k, all being very large file sizes. The file size may be used to trigger some alarm bells in the algos.
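
As a rough illustration of how file size could act as an alarm bell, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the threshold, the bucket width, and the function name `suspicious_by_size` are hypothetical, loosely based on the sizes quoted above, and nothing here reflects how any search engine actually works.

```python
from collections import Counter

# Hypothetical values, loosely based on the page sizes quoted in this post.
SIZE_THRESHOLD_BYTES = 250 * 1024   # ignore pages smaller than this
BUCKET_BYTES = 25 * 1024            # pages within the same 25k band count as "similar"


def suspicious_by_size(pages: dict[str, int], min_cluster: int = 3) -> list[str]:
    """Flag URLs that are unusually large AND share a size band with many others."""
    # Count how many large pages fall into each size band.
    buckets = Counter(
        size // BUCKET_BYTES
        for size in pages.values()
        if size >= SIZE_THRESHOLD_BYTES
    )
    # Keep only large pages whose band is crowded enough to look templated.
    return sorted(
        url
        for url, size in pages.items()
        if size >= SIZE_THRESHOLD_BYTES
        and buckets[size // BUCKET_BYTES] >= min_cluster
    )
```

A single large page is not flagged on its own; the signal in this sketch is many large pages of nearly identical size, which is a deliberate choice to avoid penalizing legitimate long documents.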

pageoneresults




msg:3460733
 9:01 pm on Sep 25, 2007 (gmt 0)

I think this is a good example of how "fragile" the "algo" really is. Remember in June 2006 when someone dumped a few billion pages into Google's indices? That was chalked up to a "bad data push". Is this what we are seeing now?

outland88




msg:3460738
 9:10 pm on Sep 25, 2007 (gmt 0)

Congrats to Gehrlekrona; he kept this issue on the front burner.

m1t0s1s




msg:3460778
 9:37 pm on Sep 25, 2007 (gmt 0)

Google is working on it as part of an infrastructure update, so it will take longer.

Achernar




msg:3460784
 9:50 pm on Sep 25, 2007 (gmt 0)

Yahoo displays no results for the search terms used in the example.
Is it already fixed on their side?

jimbeetle




msg:3460799
 9:59 pm on Sep 25, 2007 (gmt 0)

I get 15 results from Yahoo, including three dot infos.

Achernar




msg:3460803
 10:02 pm on Sep 25, 2007 (gmt 0)

Sorry, it was automatically redirecting to a local search with no result.

europeforvisitors




msg:3460817
 10:22 pm on Sep 25, 2007 (gmt 0)

Before some enterprising Public Relations person at Ask, Yahoo! or MSN decides to "leak" this story to the national outlets that Google...err...results in Google are installing malware and people should be using "safe" results. (their engine, of course)

I think the average user is more likely to blame China than Google, especially after the melamine-in-dog-food and lead-paint-on-toys scandals.

whitenight




msg:3460833
 10:54 pm on Sep 25, 2007 (gmt 0)

I think the average user is more likely to blame China than Google, especially after the melamine-in-dog-food and lead-paint-on-toys scandals.

Now, let me give you a basic marketing lesson.

MSN/Yahoo!/Ask PR department simply rewrites the above story under one of their many umbrella outlets and releases to NBC, ABC, FOX, CBS that Google is allowing these virus-infecting domains to rank - they aren't...

They don't even need to mention in the PR release what the TLD is (as the general public won't know or won't care) and even the most ignorant of persons will figure out that Google is the site that is allowing crap to be displayed, not another SE.

Btw - in the lead paint scandal the name that comes to mind to most people is Mattel not the 100 other companies that get toys from China.

Welcome to the wonderful world of market dominance and branding.
"Have you Googled your virus today?" :P

[edited by: encyclo at 2:00 am (utc) on Sep. 26, 2007]

gibbergibber




msg:3460835
 10:56 pm on Sep 25, 2007 (gmt 0)

Interesting to hear .info mentioned. I've had a serious problem with spammers on my site's forums, and almost all of it comes from members registered with .info addresses.

--There are maybe 100 people responsible for 90% of search engine spam - it is them, the root of the problem, that need to be dealt with, not the symptoms of this disease - their actions. --

If the conditions exist for them to make money though (or gain some other benefit from this activity), then their arrest or whatever would simply make way for a new generation of crooks.

Whatever loophole or flaw they're exploiting, that's the root cause. I agree the crooks need to be dealt with too, but the opportunity to profit from this crime has to be squashed or we'll just replace one set of crooks with another.

tedster




msg:3460837
 10:59 pm on Sep 25, 2007 (gmt 0)

I'd like this thread, at least, to clearly state that these domains could be owned and served from literally anywhere - and at any rate, they are operated by individual people and not by "countries". The choice of a "cn" extension seems to relate to the ability to get a domain with unicode characters.

But how on earth do they get away with replacing the "dot" in "dot cn"? Way too clever for my taste!

[edited by: tedster at 11:51 pm (utc) on Sep. 25, 2007]

Lord Majestic




msg:3460846
 11:11 pm on Sep 25, 2007 (gmt 0)

If the conditions exist for them to make money though (or gain some other benefit from this activity), then their arrest or whatever would simply make way for a new generation of crooks.

They engage in these activities because they don't get arrested and they don't go to prison - the worst thing that happens is that their cheap 50 cent .CN domain gets blacklisted. Big deal! Some of those guys are breaking laws by cracking sites and placing links there to their own spam sites in the hope of ranking high - this is happening with .EDU domains a lot. What Google needs to do is put a price on their heads, basically a bounty payable to anyone who would bring those guys to justice. Al Capone went to jail for tax evasion rather than murder, and the same approach can be used here - Google/Yahoo/MSN just need to pool their resources and put bounties on their heads.

tedster




msg:3460879
 11:32 pm on Sep 25, 2007 (gmt 0)

Here is a copy of an informative post from m1t0s1s in our September Google SERPS thread [webmasterworld.com], where he points out the ideographic variant for the dot, or "full stop" that these domains are using:

According to
[lcweb2.loc.gov...]
it is:
212B34 FF0E EFBC8E Ideographic variant full stop

He goes on to comment:

This does appear to be a google attack and not a china attack. Notice the difference between a site search for one of these sites:

[google.com...]
[google.com...]

Thanks for that, m1t0s1s. Normally we do not allow specific Google searches to be published in posts here, but we will make an "educational exception" in this case. I can certainly see why this kind of domain variant might have Google scrambling to fix some deeply embedded routines throughout their system.
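
To make the FF0E / EFBC8E values quoted above concrete, here is a minimal Python sketch that prints the code point and UTF-8 bytes for that character and folds it back into an ordinary dot. The host name `example．cn` is a made-up illustration, and the helper names are hypothetical. The list of dot variants is the one RFC 3490 says must be recognized as label separators; note that Unicode names U+FF0E "FULLWIDTH FULL STOP", whereas the Library of Congress table calls it an ideographic variant.

```python
import unicodedata

# Characters that IDNA (RFC 3490) treats as label separators besides ".":
# ideographic full stop, fullwidth full stop, halfwidth ideographic full stop.
DOT_VARIANTS = "\u3002\uff0e\uff61"


def describe(ch: str) -> str:
    """Return code point, UTF-8 bytes and Unicode name for one character."""
    utf8_hex = ch.encode("utf-8").hex().upper()
    return f"U+{ord(ch):04X} {utf8_hex} {unicodedata.name(ch)}"


def normalize_dots(hostname: str) -> str:
    """Replace dot look-alikes with an ASCII full stop."""
    return "".join("." if ch in DOT_VARIANTS else ch for ch in hostname)


if __name__ == "__main__":
    print(describe("\uff0e"))
    print(normalize_dots("example\uff0ecn"))  # hypothetical host name
```

Any routine that splits host names on the ASCII dot alone would see `example．cn` as a single label, which is one plausible reason a fix would have to reach deep into existing code.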

gehrlekrona




msg:3460886
 11:42 pm on Sep 25, 2007 (gmt 0)

Thanks outland88 :)
I recently started calling them "Chinese" spam sites instead, since they can come from anywhere. I tracked them down to an IP address here in the US and emailed the guy about it, but never heard anything back, so he is either in on it or doesn't know what to do about it. It might not even be there, even if a lot of the spam sites point to the same IP address.

The "dot" thing I think is just a translation error. They probably registered the domains with Chinese (?) characters that got translated to a domain name. I have seen some weird looking .cn domain names where the n in cn is not really an n. It LOOKS like an n but is different looking than a real one. I wish I could find it again to show you guys....

And yes, they are still all over the place even if you want to visit europe sometimes :)
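
For anyone wanting to check whether a character that looks like an "n" really is one, here is a minimal Python sketch that lists every non-ASCII character in a host name with its Unicode name. The sample host name uses a fullwidth "ｎ" purely as an assumption for illustration; the thread does not say which character was actually seen, and the function name is hypothetical.

```python
import unicodedata


def non_ascii_chars(hostname: str) -> list[tuple[int, str, str]]:
    """List (position, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(hostname)
        if ord(ch) > 127  # anything outside plain ASCII is worth a second look
    ]


if __name__ == "__main__":
    # Hypothetical example: the final letter is U+FF4E, not an ASCII "n".
    for position, code_point, name in non_ascii_chars("example.c\uff4e"):
        print(position, code_point, name)
```

An empty result means the name is plain ASCII, so whatever looks odd about it is not a look-alike character.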

Miamacs




msg:3460893
 11:48 pm on Sep 25, 2007 (gmt 0)

Forget the obligatory character from a "different charset".
( besides, it's not always the dot. nor do they make any sense in either Chinese encoding )

...

Forget it.
Most of these domains don't even exist.
( Go and check their whois. You did think of that, didn't you? )

Yeah, that means the URL you see on the SERPs is not a real URL.

Someone feeds Google with data which transforms into something else by the time it gets stored in its database / displayed.

...

Not sure how much of this has *anything* to do with China either.
I'm yet to see a Chinese ( .cn ) landing page.
All of them are in the US, UK, NL, RU... no real pattern there.

...

I'm not sure if the fake non-existent domain names aren't saying .cn to divert your attention from something really important.

And boy did they do a good job with it, I was surprised at the level of uh... hmm... well, interesting emotions and remarks vented towards East Asia all of a sudden.

Tracking the activity and remnants of previous batches of the same cr@p... this kind of an attack has been tested during the summer, and this latest rush was initiated only like... 3 weeks ago. All the domains that were registered to be the landing pages are about a month old at most...

...

Have seen some hosting companies removing subdomains by the bulk on which the same plague was spreading. By the time I got there they got rid of everything, still the SERPs would show the pattern you see with the "almost .cn" spam pages... scraped content, almost exact same filesize, varying everything and anything and coming up for just as much. ... where TrustRank isn't a factor. Meaning obscure, non-competitive, not so monitored stuff. Yahoo! and Google have their thresholds for trust set high enough at the moment for 1,2,3,4 word searches to scare away all the SEO wannabes for a lifetime *smirk*

...

Well, whatever, just stop bashing China and Japan... ( at least stop saying it as if they were like... evil twins plotting to take over the world together. Although it'd be real fun news to see them developing new ties of this depth. )

Stop and think for a moment: Check the IPs, redirects and landing pages. As I said, I haven't really seen a single Chinese IP / real .cn domain name so far.

...

*cough*

But then again, you better stay away from the fire.
Just don't click a title that's a URL ( Yahoo! ) or uses English words yet... it doesn't make sense in English.

[edited by: Miamacs at 11:53 pm (utc) on Sep. 25, 2007]

gehrlekrona




msg:3460902
 12:00 am on Sep 26, 2007 (gmt 0)

Miamacs,
You're right! I guess I should have said "with a Chinese TLD" instead of Chinese spam sites when I first reported it.
I do not want to put any blame on Chinese or Japanese people at all.

I did check their IPs and traced them back, checked their Whois and everything. Reported it here but my post got removed since we are not allowed to post IP addresses (which is fine!).

I'd like to get down to some understanding of how they did it..... That would be interesting. Is it a DNS server thing?

Miamacs




msg:3460903
 12:05 am on Sep 26, 2007 (gmt 0)

uh, Gehrlekrona... I didn't mean -you- btw...

*grin*

Just that others in the SERP changes thread did say some pretty nasty things.

callivert




msg:3460946
 1:04 am on Sep 26, 2007 (gmt 0)

If the conditions exist for them to make money though (or gain some other benefit from this activity), then their arrest or whatever would simply make way for a new generation of crooks.

The conditions always exist for criminals to make money. The day that banks don't exist, bank robbers will be out of business. That doesn't mean that we have to be fatalistic and just give up all hope.

gehrlekrona




msg:3460995
 2:12 am on Sep 26, 2007 (gmt 0)

I wanted to check with MC to see if he had anything to say about it, but I keep getting a database error :(

He said 2 weeks ago or so that they knew about the spam sites and that they were testing things, but it needed a new infrastructure to get rid of it, which is amazing.

Most, if not all of them, probably have a check to see if it is a crawler or not, and if it is then they don't do a redirect. I have looked at the cached page and it looks fine, but if you refresh the cache then it redirects you....

Infrastructure change? Reloading the page more than once would do it. These things have happened before and I don't understand how this can pass GOOG so easily. And the "sandbox", where did that one go when you can get new domains in the index this quickly while people here complain about not getting indexed?
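
The crawler check described above is usually called cloaking. Here is a minimal Python sketch of how the symptom could be tested once the same URL has been fetched twice, once identifying as a crawler and once as a browser. The `Response` class and `looks_cloaked` function are hypothetical names invented for this illustration, the redirect heuristics are assumptions, and the sketch deliberately leaves the fetching itself out given the malware warning at the top of the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical summary of one fetch of a URL."""
    status: int
    location: Optional[str] = None  # value of the Location header, if any
    body: str = ""


def is_redirect(resp: Response) -> bool:
    """True for HTTP 3xx responses or simple in-page redirects."""
    body = resp.body.lower()
    return (
        300 <= resp.status < 400
        or 'http-equiv="refresh"' in body  # meta refresh (rough heuristic)
        or "location.href" in body         # JavaScript redirect (rough heuristic)
    )


def looks_cloaked(as_crawler: Response, as_browser: Response) -> bool:
    """True when a browser is redirected but a crawler is served a normal page."""
    return is_redirect(as_browser) and not is_redirect(as_crawler)
```

For example, a crawler fetch returning status 200 alongside a browser fetch returning status 302 would be reported as cloaked, while two identical responses would not.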

crobb305




msg:3461004
 2:27 am on Sep 26, 2007 (gmt 0)

And what's up with MSN? Not enough market share to infiltrate their SERPs or they just haven't gotten around to it yet?

There isn't room because of all the blogspots. LOL

tedster




msg:3461015
 2:49 am on Sep 26, 2007 (gmt 0)

Infrastructure change? Reload the page more than once would do it.

You're thinking like a browser user here - spidering is a different creature.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved