site: Operator Shows Duplicates Due To ?ref= Query Strings

Forum Moderators: Robert Charlton & goodroi

figure3

9:32 pm on Dec 9, 2008 (gmt 0)

Anyone seen this or know what I'm faced with?

It makes it look like I have duplicate content.

So if my page's URL is http://www.example.com/page1.html

And I search for that page with this:

site:http://www.example.com/page1.html

I get results like this:

http://www.example.com/page1.html?ref=whatever.org
http://www.example.com/page1.html?ref=spammer.net

and so on...

it appears someone is duplicating my site and getting spidered which is hurting me.

marc

[edited by: tedster at 10:06 pm (utc) on Dec. 9, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

g1smd

10:10 pm on Dec 9, 2008 (gmt 0)

Set up a 301 redirect to force www, and strip the parameters off.

Two lines of code. Fit and forget.
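For anyone who wants to see the logic spelled out: the thread does not show the actual server rule, so the following is only a rough Python sketch of the decision such a redirect makes, not the poster's code. The host name `www.example.com`, the function name `canonical_redirect`, and the idea that these static pages never need a query string are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site's preferred host name. Not taken from a real config.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url):
    """Return the URL a 301 should point to, or None if url is already canonical.

    Hypothetical helper: forces the www host and drops the query string and
    fragment, on the assumption that these pages take no parameters.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Force www on the bare domain.
    if host == "example.com":
        host = CANONICAL_HOST
    # Rebuild the URL with an empty query and fragment, which strips ?ref=...
    target = urlunsplit((parts.scheme, host, parts.path, "", ""))
    return None if target == url else target


if __name__ == "__main__":
    print(canonical_redirect("http://www.example.com/page1.html?ref=whatever.org"))
    print(canonical_redirect("http://example.com/page1.html"))
    print(canonical_redirect("http://www.example.com/page1.html"))
```

A server would answer with status 301 and a Location header set to the returned URL whenever the result is not None. Stripping every parameter is only safe if no page on the site depends on a query string.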

figure3

10:16 pm on Dec 9, 2008 (gmt 0)

Actually, I did this two months ago...

now when someone visits http://www.example.com/test1.html?ref=whatever.org, it just pops them to the correct url.

However, google has LOADS of these pages indexed... any search for one of my subpages returns up to 3 variations of this per page.

Would I be wrong to assume that this sudden influx of these over the past 5 months would have caused my slow downturn in the SERPS? Does google treat these as duplicates?

I sent google a reinclusion request to hopefully have this corrected, but it hasn't helped thus far.

Should I painstakingly try to submit each of these for removal from google's serps?

thanks for the help.

marc

figure3

11:08 pm on Dec 9, 2008 (gmt 0)

Do you think it's better to redirect those queries as 301 to the right page, or 404 them and submit a removal request to google?

marc

tedster

11:12 pm on Dec 9, 2008 (gmt 0)

Actually, I did this two months ago... However, google has LOADS of these pages indexed

Have you verified that your server actually returns a 301 status in the http header? Google would normally have dropped those urls by now if it's really a 301 status rather than some other kind of redirect.
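One way to run that check without an external tool is a short script that requests the URL and reads the raw status line without following the redirect. This is a minimal sketch using only Python's standard library; the function names `fetch_status` and `describe_redirect` are made up for this example, and the URL at the bottom is just the placeholder from this thread.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send a HEAD request and return (status, Location header).

    http.client does not follow redirects, so the status returned is the
    one the server actually sent for this URL.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status, location):
    """Turn a status code and Location header into a short verdict."""
    if status == 301 and location:
        return "permanent (301) redirect to " + location
    if status in (302, 303, 307) and location:
        return "temporary (%d) redirect to %s" % (status, location)
    # A meta refresh or JavaScript redirect shows up here as a plain 200.
    return "no redirect (status %d)" % status


if __name__ == "__main__":
    # Placeholder URL from the thread; substitute a real one to test.
    status, location = fetch_status("http://www.example.com/page1.html?ref=whatever.org")
    print(describe_redirect(status, location))
```

Anything other than the "permanent (301)" verdict would mean the redirect is not the kind described above.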

Should I painstakingly try to submit each of these for removal from google's serps?

I'd say no - that's not compatible with the 301 redirect approach, since you would need a robots.txt disallow rule or a robots meta tag. And to switch strategies would lose any backlink juice that is there.

it appears someone is duplicating my site and getting spidered which is hurting me.

They don't need to duplicate your content. I've seen this kind of thing a bit, and it's often related to a kind of log spamming. The site just adds the query string to the end of a URL on your site to identify their domain as the source of the referral. By any chance are your logs open to public view?

You could also do a search like [site:spammer.net keyword string] to see if your content is also being served on that domain. But what you've described is just creating duplicate URLs on your domain.

figure3

11:40 pm on Dec 9, 2008 (gmt 0)

Tedster, thanks for the reply...

Yes, I'm seeing the following when checking headers with an external tool:

HTTP Status Code: HTTP/1.1 301 Moved Permanently

In case it means anything, when you view the cached page on one of the offending URLs within google, it goes all the way back to September... these are quite old. One of my subpages has 4 different ref= instances... all of them show a timestamp of a different day within September.

No, running this query does not find anything relevant:
[site:spammer.net keyword string]

And lastly, there are no public log files available. I've never used them.

One other thing of note... I started using a sitemap 3 weeks ago... hopefully that will help google figure out what is important.

Oh, and I almost forgot... a couple of months ago, when I first figured this out, Google Webmaster Tools was reporting "multiple duplicate title tags" which revealed these ref= spammers... That disappeared from WMT quite a while ago... I just wish they would remove those pages from the SERPS and restore my rankings to the way they used to be before this started!

Thanks again for thinking about this... I'll ask around and find the proper location to send your Xmas gift :-)

marc

[edited by: Robert_Charlton at 2:34 am (utc) on Dec. 10, 2008]

figure3

6:55 pm on Dec 10, 2008 (gmt 0)

How long does it normally take for google to wipe something out of its system that has been 301 redirected? 2 months seems kinda long?

marc

g1smd

7:03 pm on Dec 10, 2008 (gmt 0)

*** In case it means anything, when you view the cached page on one of the offending URLs within google, it goes all the way back to September... ***

It looks like all those extra URLs have been dropped into the Supplemental Index. Leave them alone now. Google will quietly drop them in their own timescale. Where they show in the SERPs they will bring traffic. Over time, they will drop out, often in large blocks, a month or two apart.