Forum Moderators: Robert Charlton & goodroi
I develop sites for not-for-profits and political campaigns and organizations.
I was brought in to take over a site for a statewide caucus.
Several years ago, the caucus had a college student develop the site as a technology demonstration for his schoolwork. He's graduated and moved on. (It's difficult to describe the problem without being able to use the actual terms or URLs.)
I have created an entirely new site, it has an excellent Google rank, and we are doing just fine, thank you.
However, this young man has kept a complete copy of his version of the site live on his server space as a demonstration of his skills as a web designer. He has promised to use "no follow" tags and such, but has failed to do so.
So if you type in the obvious search terms, the real version of the site is the first result, but his bogus copy shows up next.
To the uninformed web surfer, it's confusing.
Is there anything that I can do to get Google to quit returning his (bogus) copy of the site in their SERPs?
(It's outdated, inaccurate, etc.)
There are legal issues here that I cannot comment on, both because I don't know the full details of the situation and because I am not a lawyer :-). But with that said, you may wish to determine whether the DMCA applies here:
[google.com...]
Google, in more instances than not, totally ignores robots.txt as well as .htaccess files, spiders the duped content anyway, and typically penalizes the original content developer.
Google cannot ignore stuff in the .htaccess file. That file controls exactly what Google actually "sees". It cannot be over-ridden.
How can I explain this without resorting to URLs or the actual search terms?
OK, imagine this. Fearless (myself) runs for Congress and an unpaid volunteer creates a website called "fearlessforcongress.com". Several years later, he still has a complete copy of the site posted to his server space. And if anybody does a Google search for the terms "fearless congress", my current site is number one in the SERPs but his old, outdated and incorrect copy of the site is listed second.
Of course, I have absolutely no idea how much traffic he is getting as a result.
A couple of quick notes...
1. I doubt very much if Google is penalizing us. Our pagerank is too high. Also, it is not a duplicate copy of the site. It's the site a few years ago when the student in question was an unpaid intern.
2. I tried to file a complaint with his current ISP and THAT got me an email from the young man saying that the content of the site is "his" and that I "have no right to ask him to take it down."
Now he is blocking my emails to him. How mature...
Even if he was an unpaid volunteer, the site was created for us. The problem all started way back then when he was allowed to post a "testing" version of the site on his server space.
However, at this point, he has taken a number of measures which are all positive. I can't locate a robots.txt file for the site, but he has added noindex meta tags and apparently has used the Google automatic exclusion tool to stop listing the content... which is great as long as NOBODY has an indexed site that is linked to his version of the site.
There are only 76 days until the election and my only hope is that Google will respond ASAP.
I don't even want to THINK about the other search engines...
Again, thanks for your help.
If, on the other hand, he was given permission to place the stuff on his site, there's nothing you can do.
You could ask him to use robots.txt - but I guess he wants his site to be seen (or there's not much point in him making it!)
However, if yours is a campaign site, and he is promoting the campaign, that could be good - right?
He was an unpaid intern at the time and isn't being entirely co-operative. This smells to me like he has bad feelings about his time there. Why don't you just try to resolve that issue? I think you'll find he happily removes the site altogether at your request... at least during the election period.
Google, in more instances than not, totally ignores robots.txt as well as .htaccess files, spiders the duped content anyway, and typically penalizes the original content developer.
That was my first approach. Didn't work.
He did put a TON of "heavy lifting" into this, and without going into too many details, it was one of those "a miss is as good as a mile" situations. He seems upset that we just aren't using his project anymore. As a technology tour de force, it was great. I'm certain that he got an "A+".
As a real-world website, OK, not great. Plus, since he created everything from scratch, I have no hope of updating or maintaining it.
Plus, what he doesn't realize is that if his intention in leaving it online was to land future clients, I could have provided some AWESOME references...
[edited by: Fearless at 10:36 pm (utc) on Aug. 24, 2006]
He has made an impassioned defense of the copyright issue...
which of course, as everyone has pointed out, is utterly specious.
He's not running for office, it's not "his" site.
Sometimes, very, very smart people are their own worst enemies.
I know because I married into a family of (literally) mad rocket scientists and nutty professors, so this all has a very familiar feel to it.
Sometimes, there is no substitute for a little (un)common sense.
Gee, many people here will be happy to get as many sites as possible into the first 10...
He put "no index" tags - that pretty much covers his part... I see it as somewhat unfair to him - he is helping you (with SERPs, free traffic, and tags) and you are bashing him... yep, doesn't sound right to me at all...
He put "no index" tags - that pretty much covers his part
I can't locate a robots.txt file
has used the Google automatic exclusion tool to stop listing the content
as long as NOBODY has an indexed site that is linked to his version of the site.
The exclusion tool will not work without a robots.txt in place as far as I know. But if there is a robots.txt and he (or you) then submits a removal request, the URLs will be removed for 6 months, and that removal happens pretty fast -- backlinks will not enter into it.
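For what it's worth, you can check offline whether a given robots.txt would actually block Googlebot, using Python's standard urllib.robotparser. This is only a minimal sketch: the robots.txt contents and the host old-copy.example are made up for illustration and do not come from the site discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the old copy of the site (an assumption,
# not the real file): block every crawler from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder URL; substitute a real page of the old copy.
blocked = is_blocked(ROBOTS_TXT, "Googlebot", "http://old-copy.example/index.html")
print("Blocked for Googlebot:", blocked)
```

This only tells you what the file asks crawlers to do; it says nothing about whether a URL has already been dropped from the index.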
The exclusion tool will not work without a robots.txt in place as far as I know.
If you use the "noindex" meta tag, the URL removal tool does work. However, you have to enter every URL in the removal tool, one by one, which can take some time with a larger site.
Some things still don't add up.
My experience is that when "noindex" is used, URLs are removed from the index in the next crawl. Depending on the site, this takes a few weeks to a month. Also, Fearless mentioned that the person used the exclusion tool. Two questions about this: how does he know, if he is no longer on speaking terms with the site owner (who is blocking all of Fearless' emails), and why are the URLs still in the SERPs?
Part of the story is still missing, IMO. Or maybe the "exclusion tool" is not the "URL removal tool" as I and others assume?
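One way to settle whether the "noindex" tags are really on the pages is to parse the HTML and look for a robots meta tag. Here is a rough sketch using Python's standard html.parser; the sample HTML is invented, and the helper names (RobotsMetaParser, has_noindex) are just for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a noindex directive in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Invented sample page, not taken from the site under discussion.
sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print("noindex present:", has_noindex(sample))
```

To check a live page you would fetch its HTML first and pass it to has_noindex; every URL has to be checked individually, just as with the removal tool.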
Just give him twenty bucks or something... and turn a bad thing into a good one - tell him to forward all the traffic to your site with some redirect. This may actually boost your PR too...
And if he wants it so badly, he may recreate his old site somewhere else with "no index" tags... (or, as suggested above, in some directory which is restricted for SEs by robots.txt)
Lawyers... shmoyers... often it's easier just to pay a few bucks to get it over with... saves time and everybody's happy...
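If he does agree to redirect, the usual mechanism is a permanent (HTTP 301) redirect from each old URL to the matching URL on the current site. A minimal sketch with Python's standard http.server follows; the target http://www.example.org is a placeholder, and the names redirect_target, RedirectHandler and serve are made up for this example. On a typical shared host the same thing would more likely be done in the server configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder for the current, official site (an assumption for illustration).
NEW_SITE = "http://www.example.org"


def redirect_target(path: str, new_site: str = NEW_SITE) -> str:
    """Map a path on the old copy to the same path on the new site."""
    return new_site.rstrip("/") + "/" + path.lstrip("/")


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers every GET/HEAD request with a 301 pointing at the new site."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port: int = 8000) -> None:
    """Run the redirecting server locally; call this manually to try it out."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and requesting http://127.0.0.1:8000/about.html should then return a 301 whose Location header points at the same path on the new site.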