Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 277 message thread spans 10 pages: < < 277 ( 1 2 3 4 5 6 [7] 8 9 10 > >     
How to Remove Hijacker Page Using Google Removal Tool
8,058,044,651 pages indexed (now minus 1)
Idaho




msg:756514
 6:19 pm on Mar 17, 2005 (gmt 0)

Continued from: [webmasterworld.com...]


With the help of posts from crobb305 and others, I was able to remove a hijacker's page from the Google index.

My site was doing very well in the SERPs. For over 2 years it had been on the first page for a competitive term (1.2 million listings). Then during the first week in January my site disappeared and traffic tanked for no obvious reason.

When searching for "site:www.mydomain.com" I noticed that my index page often wasn't listed, or it appeared on about page 3 or 4 of the results, after all my supplemental pages.

A search for "allinurl:mysite.com" often didn't show my index page at all but instead showed somebody else's domain (located in Turkey). When I clicked on this link, my site came up. When I clicked on the cached version of the site, it showed a very old cache of the page. This same site also showed up after all my results when doing a "site:www.mydomain.com"

Using a header checker tool on the site's URL I was able to see it was using a 302 link to my site.
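For anyone who wants to reproduce that check: a header checker boils down to fetching the URL without following redirects and looking for a 301/302 status with a Location header. A minimal sketch in Python that parses the head of a raw HTTP response (the URL shown is a placeholder, not the actual hijacker's address):

```python
def parse_redirect(raw_response):
    """Parse the head of a raw HTTP response and return (status, location).

    location is None unless the response is a 301/302 with a Location header.
    """
    lines = raw_response.split("\r\n")
    # Status line looks like: "HTTP/1.1 302 Found"
    status = int(lines[0].split()[1])
    location = None
    if status in (301, 302):
        for line in lines[1:]:
            if not line:  # blank line ends the headers
                break
            name, _, value = line.partition(":")
            if name.strip().lower() == "location":
                location = value.strip()
    return status, location

raw = "HTTP/1.1 302 Found\r\nLocation: http://www.example.com/\r\n\r\n"
print(parse_redirect(raw))  # -> (302, 'http://www.example.com/')
```

If the status comes back 302 with a Location pointing at your domain, you are looking at exactly the kind of redirect described above.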

Last night after reading some posts by crobb305 and others I went to Google.com and clicked on "About Google." Then I clicked on "Webmaster Info." Then I clicked on "I need my site information removed." Then I clicked on "remove individual pages." Where I found instructions on how to remove the page.

(Here's the exact page where I ended up. If mod needs to remove then snip away:) [google.com...]

I then clicked on the "urgent" link.

Then:
1. I signed up for an account with Google and replied back to them from an email they sent me;
2. I added the "noindex" meta tag according to their instructions and uploaded it to my site;
3. Using the instructions to remove a single page from the Google index, I added the hijacker's URL that was pointing to my site. (copy and paste from the result found on "allinurl" search)

This didn't work the first time because I had to remove a space from the url to get it to work.

4. I got a message back saying that the request would be taken care of within 24 hours. The URL that I entered showed on the upper right-hand part of the screen, saying "removal of (hijacker's url) pending."
5. I then removed the "noindex" meta tag from my page and re-uploaded it to my site.
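For reference, the "noindex" meta tag referred to in step 2 is the standard robots meta element, placed in the page's head:

```html
<meta name="robots" content="noindex">
```

As the steps above describe, the tag only needs to stay in place long enough for the removal tool to verify it; it is then taken back out so the real page can be re-indexed.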

This morning the google account still shows the url removal as "pending" but when I do "site:" and "allinurl" searches the offending URL is gone and my index URL is back.

Conclusions and Speculations:
At some point last September, Google cached the hijack page's URL pointing to my site. In January, Google penalized my site for duplicate content because it found both URLs and compared them. Mine got penalized because it was the only page that really existed; the hijacker's page didn't get penalized because it existed only as a redirect to my site.

Because my index page was now penalized, it dropped almost completely from the SERPs. Some of my supplemental pages showed up for obscure searches, but none of my money terms.

Because I haven't been able to get a response from the hijacker's webmaster, the 302 is still in place but it is buried deep in his site and the last Google cache of the page was sometime in September. Therefore with some luck Google won't re-index it any time soon.

Will my site return to the SERPs? I don't know. Any thoughts?

 

Lorel




msg:756694
 8:39 pm on Mar 23, 2005 (gmt 0)


If we put up a noarchive the 302 can't hijack your cache because good ole Google doesn't store it anymore. Unless they do it before the tag goes up.

I had noarchive on all my pages and that didn't stop a multitude of redirects and stolen content. I took it back off so I can make a COPY of Google's cache so I have a 3rd party witness when I find stolen content.

ken_b




msg:756695
 8:48 pm on Mar 23, 2005 (gmt 0)

Google's response to this issue has been less than encouraging.

zeus




msg:756696
 8:52 pm on Mar 23, 2005 (gmt 0)

Ken_b - it's a kick in the face. They say they don't care; I can't get anything else out of it. Oh, one thing, maybe: they love scraper sites with AdSense.

I really hope Yahoo, WiseNut, and MSN get more popular; they have been getting more visits here over the last year. Hope for the best.

Nosmada




msg:756697
 10:15 pm on Mar 23, 2005 (gmt 0)

I still think there is an advantage to the noarchive meta tag. If your content is constantly updating, then every time they go to compare a page with itself (as duplicated through the hijacked URL) they will see a different page. Every time my page reloads it is different, so they could never see the same page twice! What do you guys think? Having a third party cache in Google when Google doesn't care if you have been hijacked or not seems hardly worth it. I am not saying the noarchive will prevent hijacking, but if your content is constantly changing then you just might not get killed by the duplicate content filter, and then we may not need to care about the hijackers at all - but only if your content is constantly changing with each page reload.

Reid




msg:756698
 10:30 pm on Mar 23, 2005 (gmt 0)

Well, one encouraging comment by Googleguy is that they did identify some test sites where they can see what effects their algo changes are having.
Googleguy seems to have a problem with his own communication skills and doesn't really seem to fully understand the problem personally, but he has made some engineers aware of the issue, and I'm sure they understand it a lot better. So something is being done. From reading his posts about this issue, though, I don't think Googleguy himself has fully grasped it; it's not his field, after all. His job is to bring the issue to the attention of Google engineers, which he has done.
Give the guy a break; he is doing his part. It's not his job to report to us everything going on in Google's back rooms.
He did provide a way to communicate our concerns, and if they only got 30 VALID complaints that demonstrate the issue, then that should be enough for them to figure out how to deal with it.
Keep the e-mails going, though: if you have a VALID hijacking issue, then make every effort to explain fully and completely how it appears in the SERPs, along with every bit of info you know about the URL.

incrediBILL




msg:756699
 10:43 pm on Mar 23, 2005 (gmt 0)

GoogleGuy has a big 302 explanation post on Slashdot today:

Looked more like PR damage control than an explanation.

I was pretty skeptical about the 302 hijacking claims until I actually saw a couple of them. What appears to be happening seems fairly obvious to me and I wouldn't expect it to have anything to do with canonical pages, it looks like a plain and simple BUG, but I could be wrong.

The real problem is how Google handles inquiries about hijacked sites. Google should just FIX the links as reported, not tell people to go chasing after the other web site and beg them to remove your listing, assuming you can find them in the first place and they even read your email.

Very bad support policies, very bad PR situation, they will either learn quickly or someone will step up to the plate and take their business away.

[edited by: incrediBILL at 10:51 pm (utc) on Mar. 23, 2005]

crobb305




msg:756700
 10:43 pm on Mar 23, 2005 (gmt 0)

Googleguy seems to have a problem with his own communication skills and doesn't really seem to fully understand the problem personally

I am not going to defend Google per se, because I am very frustrated that my site is still MIA since tracker2s/302 took over in May 2004. But Googleguy has been very helpful here for years. It's sad that he has gone silent...perhaps under pressure from Google, or because of numerous insults from this forum.

Chris

[edited by: crobb305 at 10:50 pm (utc) on Mar. 23, 2005]

Atticus




msg:756701
 10:44 pm on Mar 23, 2005 (gmt 0)

Reid,

I agree that GG doesn't seem to get it at all. And based on his comments in that other forum, I am not encouraged about G's commitment to fixing this anytime soon.

GG says that the sites that are getting hijacked are spammer sites about viagra and texas hold em! He seems to think that only spammers get hijacked and so he graciously offered amnesty to them if they report the problem -- he's talking about amnesty for the hijacked, not the hijackers!

Unbelievable.

Meanwhile the index pages of two of my sites are duplicated in the SERPs, one for me and one for the hijackers (which I can't remove because one is a dead account sans 404, the other is a 302 backed up by a metarefresh=0).

Vec_One




msg:756702
 10:58 pm on Mar 23, 2005 (gmt 0)

I also think GG has underestimated the scope of this problem. This thread has grown to 188 posts in less than a week. Japanese's is up to 710. I'm sure there are a lot more than 30 sites affected.

Most people are probably oblivious to the hijacking problem though. They might know that their sites have tanked but they have no idea why. Some of the more enlightened people might be aware of GG's ‘canonicalpage’ instructions but they are not all willing to submit their handiwork directly to Google engineers for examination.

Anyway, back to the search for a solution. I have a few things that change on their own, such as the date and a random quote. It apparently isn't enough, however, because I'm the victim of multiple hijackings. Does anyone have any clever ways to automatically randomize pages?
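On the question of automatically randomizing pages: one simple approach is to have the server rotate a random fragment into every response, so no two fetches render byte-identically. A minimal sketch in Python (any server-side language would do; the quote list is invented for illustration):

```python
import random

# Hypothetical pool of rotating fragments; in practice this could be
# quotes, tips, or recent-update blurbs pulled from a file or database.
QUOTES = [
    "First make it correct, then make it fast.",
    "The best link is the one you earned.",
    "Content is king; markup is the castle.",
]

def random_fragment():
    """Return a small HTML fragment that differs between page loads."""
    return '<p class="quote">%s</p>' % random.choice(QUOTES)
```

Embedding the fragment in each rendered page makes it less likely that a comparison of the "two" URLs sees identical copies - though, as noted above, this mitigates rather than prevents the hijack.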

snoremaster




msg:756703
 11:00 pm on Mar 23, 2005 (gmt 0)

We simply need a new tag telling search engines what domain the content is from.

<meta name="basecontent" content="mysite.com">

Then whenever google indexes a page it knows what the main site is and all 302s will be invalidated.

It's that simple isn't it?

Google is our business partner; they're a lot bigger than me and I can't afford to alienate them. I really wish the negativity, the spreading of rumours, and the unfounded accusations would stop. GoogleGuy seems to be helpful, but each time he/she/they show up there's this deluge of complaining. We should really be having a constructive discussion about how to improve things.

cornwall




msg:756704
 11:09 pm on Mar 23, 2005 (gmt 0)

I was pretty skeptical about the 302 hijacking claims until I actually saw a couple of them.

Being, you might say, a little involved in this issue today, I was pretty sceptical about 302 hijacking claims till I discovered 30 or 40 of them, and that I could zap the ****'s that were doing it to me ;)

Anyway, back to the serious issue. As the gentleman "claiming" to be Googleguy says in his SlashDot post

it often boils down to wanting to choose the url with the most reputation. PageRank is a pretty good proxy for reputation, and incorporating PageRank into the decision for the canonical url helps to choose the right url

I am deeply grateful to him for that post; this gets to the very core of the problem. One can only deduce from this that a scraper site that can afford to buy links (come on, Google, you know a lot of them do), and hence PR, can then bang on the 302s, and their site will grow while yours (if it has less PR) will shrink to nothing.

Hence, Google, the problem! Tell me if I am wrong.

g1smd




msg:756705
 11:12 pm on Mar 23, 2005 (gmt 0)

>> Then whenever google indexes a page it knows what the main site is and all 302s will be invalidated. <<

So, the spammers put that tag on their site too and where does that leave you?

Atticus




msg:756706
 11:15 pm on Mar 23, 2005 (gmt 0)

snoremaster,

I'm all for being objective and looking on the bright side. Thing is, GG has not commented HERE regarding this issue. He posted in another forum today (if that's really him; the more I think about how stupid and insulting his comments were, the more I think it must be a fake).

Assuming the comments are from the genuine GG, I'm not sure why you are interested in having G implement a new tag to fix this 'problem.' According to GG, this situation only affects 30 spammy web sites about viagra. If your premise is that G is communicating in an open, honest manner, then the situation as described by GG is really no problem at all.

snoremaster




msg:756707
 11:31 pm on Mar 23, 2005 (gmt 0)

g1smd:

So, the spammers put that tag on their site too and where does that leave you?

They can't, because it's a 302, so they can't change the content.

Atticus:
why you are interested in having G implement a new tag to fix this 'problem.' ... If your premise is that G is communicating in a open, honest manner, then the situation as described by GG is really no problem at all.

I'm not prepared to believe Google isn't taking this seriously. They'll probably start to refrain from informal communications with the field through "googleguy" because the forums and blogs are simply pushing them into a defensive position where they have to do damage control and have lawyers and politicians write their communications. I don't want to see that happen, and hopefully Google can find some way to continue informal communications.

Atticus




msg:756708
 11:36 pm on Mar 23, 2005 (gmt 0)

snoremaster,

I also think G is working on it -- another reason that GG post seems bogus to me. Stuff going on behind the scenes has my site going up and down like a yo-yo. It seems like there's a dark force trying to drag me down and some unknown agent snatching me from the jaws of complete obscurity.

Ever noticed how hard it is to type with your fingers crossed?

Emmett




msg:756709
 11:41 pm on Mar 23, 2005 (gmt 0)

I just watched my site hit the serps for a "my company name" search for the 1st time since Feb 2nd. It's in almost all data centers now. Will be interesting to see if I start getting traffic again. The 302 is still there, however I changed a lot of content on my homepage to get out of the duplicate content penalty.

incrediBILL




msg:756710
 11:44 pm on Mar 23, 2005 (gmt 0)

it often boils down to wanting to choose the url with the most reputation. PageRank is a pretty good proxy for reputation, and incorporating PageRank into the decision for the canonical url helps to choose the right url

See, that's a load of happy horse droppings if I ever heard it.

Come on, GoogleGuy: PageRank hasn't got a thing to do with content ownership. If I write and publish a page, it's MY page; nobody else's reputation or PageRank should outrank my content just because they link to me and have better PR, especially when a specific search should only pull up my content.

If that's how Google truly works then it's flawed from the bottom up.

steveb




msg:756711
 12:00 am on Mar 24, 2005 (gmt 0)

More to the point, higher pagerank does not guarantee (and seems to have nothing to do with) preventing your page from being pagejacked.

crobb305




msg:756712
 12:02 am on Mar 24, 2005 (gmt 0)

I just watched my site hit the serps for a "my company name" search for the 1st time since Feb 2nd. It's in almost all data centers now. Will be interesting to see if I start getting traffic again.

I am seeing the same thing. For the first time in 9 months, my site is showing up at position 1 or 2 for its name (mydomain) on a few datacenters. Before this past weekend my site was showing at position 65 or worse. Furthermore, if I search for snippets of my content, my site is showing up #1 on those same datacenters, whereas before I was in the supplemental basket. My site is not showing up for any search terms yet, but these are some good signs.

C

Trawler




msg:756713
 12:02 am on Mar 24, 2005 (gmt 0)

incrediBILL

If that's how Google truly works then it's flawed from the bottom up.
____

From everything I see concerning 302s, there is no doubt that their algo is flawed. Most of the crap that is at the top of the SERPs is there because of 302s. The reason they are having a hard time admitting to a problem is that their whole palace is built upon the premise that links are supreme in determining PageRank, position, and relevancy. Of course, if you have the links, who cares what the content on the page is? That's the way the algo now works.

To admit that people have now figured out how to exploit their dream is an admission that the whole house of cards can be dealt from under the table. They will never admit that; they will just play around with the algo and hope it all goes away.

Sad to see a once great search engine reduced to what it now is.

steveb




msg:756714
 12:10 am on Mar 24, 2005 (gmt 0)

I had a domain hijacked, by a far lower PR page.

I submitted the clear example to the address GoogleGuy posted that same day. That was over a month ago.

About a week ago I made the one change to the domain (that I should have had in place previously) that was among the things suggested in one of these darn threads (absolute links, redirect non-www to www, etc.).

The domain now appears on all datacenters (though ranking waaaaaaaaaaaaaay worse than before).

I don't know if there is something unique in my situation, but I would suggest people take all those steps, plus use the URL removal tool when possible. Perhaps there is hope.

I'd also add that it would seem to make sense to self-link all your pages.
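For anyone wanting to implement the non-www to www redirect mentioned above, the usual Apache recipe is a 301 rewrite in .htaccess (example.com stands in for your own domain):

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The 301 (permanent) status matters here; a 302 would reintroduce the very ambiguity this thread is about.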

Reid




msg:756715
 12:33 am on Mar 24, 2005 (gmt 0)

We simply need a new tag telling search engines what domain the content is from.

We already have a tag for that in the document head:
<base href=""> tells the canonical URL of the page itself (strictly speaking it's not a META tag, but it serves the purpose).

theBear




msg:756716
 1:06 am on Mar 24, 2005 (gmt 0)

Google has that information as well: the content exists at the target of the redirect, and that is where it should be indexed, under its target URL.

This ain't rocket science folks.

I know several middle school kids years ago who could have figured it out.

zeus




msg:756717
 1:12 am on Mar 24, 2005 (gmt 0)

theBear, are we saying here that all links on future sites should always use the whole URL [domain.com...] in all internal links?

"Canonical URL" - I never really got what that meant.

g1smd




msg:756718
 1:14 am on Mar 24, 2005 (gmt 0)

Hmm. Having read the threads, seen the SERPs, and thought about it some more....

If there are two pages URLa and URLb, Google would cache, index, and rank both of them. If one provided a normal link to the other then it would "pass" some PR too. Both could appear in SERPs.

If URLa provided a 301 redirect to URLb then I would assume that just the URL for URLa would be stored internally in Google, and marked as being a redirect. I also assume that URLa would be dropped from the SERPs for the period that it returned that status, and that URLa would be respidered occasionally to see what its status was. The content residing at URLb would be spidered and indexed and would appear in the SERPs with the URL for URLb against it. If at any time URLa went 404 then it would be dropped from the index, likewise URLb.

If URLa did a 302 redirect to URLb, then this is a temporary redirect. URLa is saying that the content temporarily resides at URLb. There is no reason to include URLa in the search results though. Google could quite easily include URLb in the results with its associated content being cached and indexed. However, Google could also keep an internal note that it had been redirected from URLa, and if the status of URLb ever changed from 200 to 404 then Google would know to go back to URLa and ask it for the new location of the information. That is, Google "remembers" URLa as being the starting point for the 302 redirect but does NOT show URLa in the SERPs as there never was any content AT that location.

Does this make sense? What flaws would there be in that?
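The handling proposed above can be stated very compactly. The sketch below (hypothetical function name, not anything Google has published) returns which URL should be indexed for a given fetch:

```python
def urls_to_index(source_url, status, target_url=None):
    """Decide which URLs belong in the index for one crawled URL,
    following the redirect handling proposed above.

    - 301/302: index only the redirect target; the source has no
      content of its own, so it never appears in the SERPs (a 302
      source would additionally be remembered internally so the
      crawler can revisit it if the target later goes 404).
    - 200: index the URL itself.
    - 404: drop the URL entirely.
    """
    if status in (301, 302):
        return [target_url]
    if status == 200:
        return [source_url]
    return []  # 404 and anything else: nothing to index
```

Under this rule a hijacker's 302 URL could never displace the real page, because the source of a redirect is never a candidate for the SERPs.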

Lorel




msg:756719
 2:12 am on Mar 24, 2005 (gmt 0)

Having a third party cache in Google when Google doesn't care if you have been hijacked or not seems hardly worth it.

When you are writing to the hosting company, after contacting the copyright thief to no avail, to tell them their client is infringing copyright (rules every host must uphold or lose their license), having Google's cache of the original content will dispel any he-said, she-said arguments.

I've gotten a site totally removed more than once when there is 3rd party evidence.

idoc




msg:756720
 3:31 am on Mar 24, 2005 (gmt 0)

<self snip> On second thought, I can't bring myself to post how I think this is actually being done. But I don't think a 302 hijack happens as a *general* rule from normal, non-malicious use of 302s, such as from normal tracking-script use.

Emmett




msg:756721
 4:50 am on Mar 24, 2005 (gmt 0)


About a week ago I made the one change to the domain (that I should have had in place previously) that was among the things suggested in one of these darn threads (absolute links, redirect non-www to www, etc.).

steveb,

That's exactly what I ended up doing, along with severely changing my homepage content, and I'm in the SERPs today.

geekay




msg:756722
 5:28 am on Mar 24, 2005 (gmt 0)

It has been suggested here that webmasters should add certain code to their pages in order to avoid being hijacked. I hope such code will not be necessary, because in reality only a small fraction of the world's webmasters will ever learn about this solution.

It's like having to opt in to secure your site's rightful position in Google's SERPs. It should be the reverse: if you don't care about how Google indexes your site, then you could proactively opt out of fair treatment. Hijacking is not an SEO problem. Google created this mess and Google must solve it.

Reid




msg:756723
 6:02 am on Mar 24, 2005 (gmt 0)

On second thought, I can't bring myself to post how I think this is actually being done. But I don't think a 302 hijack happens as a *general* rule from normal, non-malicious use of 302s, such as from normal tracking-script use.

Normally there is no problem with normal use of a 302 tracking script, although several people have reported that a high-PR page can hijack a low-PR one.

The real 302 hijack is more sinister and purposeful. The hijacker's script does a 302 redirect, but not straight at your homepage: it points to a blank page with a 0-second META refresh aimed at your page, and Google is assigning your PR to that blank page. That is the type I have pointing at me. I'm having dark thoughts recently about studying hacking techniques and open proxies and stuff.
<self-snip>
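The pattern described above - a blank page whose only job is a 0-second META refresh - is easy to spot mechanically. A rough detector sketch in Python (the regex assumes http-equiv appears before content, which covers the common case):

```python
import re

# Matches e.g. <meta http-equiv="refresh" content="0;url=http://...">
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*0\s*;\s*url=([^"\'>]+)',
    re.IGNORECASE,
)

def zero_refresh_target(html):
    """Return the destination of a 0-second meta refresh, or None."""
    m = META_REFRESH.search(html)
    return m.group(1).strip() if m else None
```

Fetching the suspect URL (following its 302) and running the body through this check confirms whether you are looking at the redirect-plus-refresh chain described above.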

[edited by: Reid at 6:12 am (utc) on Mar. 24, 2005]

4crests




msg:756724
 6:07 am on Mar 24, 2005 (gmt 0)

Four out of my six sites had all or part of their pages hijacked. I agree with everyone that this is a problem that Google needs to fix.

But, I am happy to say that the Google Removal Tool worked great for me.

But, along the way I really came to realize how big this problem is. Someone above said that most people don't even realize their sites are falling victim. I really have to agree with that statement. I had no idea until I started digging into it.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved