My site was doing very well in the SERPs. For over 2 years it had been on the first page for a competitive term (1.2 million listings). Then during the first week in January my site disappeared and traffic tanked for no obvious reason.
When searching for "site:www.mydomain.com" I noticed that my index page often wasn't listed, or it appeared on about page 3 or 4 of the results, after all my supplemental pages.
A search for "allinurl:mysite.com" often didn't show my index page at all but instead showed somebody else's domain (located in Turkey). When I clicked on this link, my site came up. When I clicked on the cached version of the site, it showed a very old cache of the page. This same site also showed up after all my results when doing a "site:www.mydomain.com" search.
Using a header checker tool on that site's URL, I was able to see it was using a 302 redirect to my site.
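For anyone without a header checker handy, here is a minimal Python sketch of the same check. It assumes nothing about any particular tool: `fetch_status` sends a single HEAD request without following redirects, and `points_at_my_site` (a hypothetical helper name) reports whether the answer is a 302 whose Location header targets your own host. The host names are placeholders.

```python
import http.client
from urllib.parse import urlsplit


def points_at_my_site(status, location, my_host):
    """Return True when a response is a 302 whose Location targets my_host."""
    if status != 302 or not location:
        return False
    target = urlsplit(location).hostname or ""
    return target.lower() == my_host.lower()


def fetch_status(url, timeout=10):
    """Send one HEAD request; return (status, Location header).

    http.client never follows redirects on its own, so a 302 is
    reported as a 302 instead of being silently resolved.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Usage (placeholder URLs, not run here because it needs network access):
#   status, location = fetch_status("http://hijacker.example/go?id=123")
#   print(points_at_my_site(status, location, "www.mydomain.com"))
```

Some servers answer HEAD differently from GET, so a GET request may be worth trying if the result looks odd.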
Last night, after reading some posts by crobb305 and others, I went to Google.com and clicked on "About Google," then "Webmaster Info," then "I need my site information removed," and finally "remove individual pages," where I found instructions on how to remove the page.
(Here's the exact page where I ended up. If mod needs to remove then snip away:) [google.com...]
I then clicked on the "urgent" link.
1. I signed up for an account with Google and replied back to them from an email they sent me;
2. I added the "noindex" meta tag according to their instructions and uploaded it to my site;
3. Using the instructions to remove a single page from the Google index, I added the hijacker's URL that was pointing to my site. (copy and paste from the result found on "allinurl" search)
This didn't work the first time because I had to remove a space from the url to get it to work.
4. I got a message back saying that the request would be taken care of within 24 hours. The URL that I entered showed on the upper right-hand part of the screen saying "removal of (hijacker's url) pending."
5. I then removed the "noindex" meta tag from my page and re-uploaded it to my site.
This morning the Google account still shows the URL removal as "pending," but when I do "site:" and "allinurl" searches the offending URL is gone and my index URL is back.
Conclusions and Speculations:
At some point last September, Google cached the hijack page's URL pointing to my site. In January, Google penalized my site for duplicate content because it found both URLs and compared them. Mine got penalized because it was the only page that really existed. The hijacker's page didn't get penalized because it only existed as a redirect to my site.
Because my index page was now penalized, it dropped almost completely from the SERPs. Some of my supplemental pages showed up for obscure searches, but nothing showed up for my money terms.
Because I haven't been able to get a response from the hijacker's webmaster, the 302 is still in place but it is buried deep in his site and the last Google cache of the page was sometime in September. Therefore with some luck Google won't re-index it any time soon.
Will my site return to the SERPs? I don't know. Any thoughts?
An interesting aspect is that, according to your experience, Google are removing the submitted URL and not the destination URL. This does make sense, given that Google fixed the "remove competitor's home page" exploit last summer.
The next question is how quickly the benefit of the backlinks will be applied to the rightful URL.
Can you confirm that you were able to remove the hijacker without inflicting any collateral damage on your own URL? If so that's great news and we finally have a decent way to fight this.
To do this, you use the removal tool and set the meta robots tag to "noindex" just long enough to get the url submitted. Then, instantly return the metatag to "index". If you forget to change the tag back, you obviously risk having the intended url removed next time Googlebot checks your site. When you submit a url for removal via the removal tool, the program will instantly check to make sure the tag is set to "noindex" (for your protection), but it will not check again. That is why you are able to immediately return the tag to index after you get the submission "success".
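Since forgetting to switch the tag back is the real risk here, a quick local check of what your page is currently serving may help. Below is a minimal sketch using only Python's standard library. `has_noindex` is a hypothetical helper name; it looks only at `<meta name="robots">` tags and makes no claim about how Google's own checker parses the page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the page carries a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# While the removal request is being submitted, this should be True:
print(has_noindex('<head><meta name="robots" content="noindex"></head>'))
# After you restore the page, it should be False again:
print(has_noindex('<head><meta name="robots" content="index,follow"></head>'))
```

Running it against the HTML of your live page right after re-uploading is a cheap way to confirm the tag really is gone.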
The only thing I am not sure about is whether Google still knows about the URL(s) that are removed and uses them in ranking calculations. Does Google only remove the URL from the visible index? If Google removes the URLs from the visible index but retains the URL/information somewhere else for its own purposes, then our efforts to remove the URLs and help Google clean up its horrific mess are in vain. This would not surprise me in the least.
This is a *very* interesting post. It's the first thing I've seen about 302's that seems like it actually would work. Here's hoping.
In November 2004, there were as many as 20 URLs that were NOT mine showing in a site:mysite.com search. These URLs were mostly tracker2's. But, after Google had associated those tracker2s with my site, it then began associating all redirects it found with my site. Incidentally, the site: search is supposed to show only URLs that are truly part of your site. If it shows unrelated URLs (and certainly 20 unrelated URLs) then there is a problem.
So, I began submitting those URLs to the Google removal tool. Since the redirects ultimately landed on my page (the destination page), I had control of its removal using the robots meta tag. The last one was removed in late January. Unfortunately, nothing has changed. My site is still MIA. There are still some unrelated URLs showing in the site: search, since those redirects were removed before I learned about the removal tool. Those remaining URLs were last cached on Nov 2, and until Googlebot revisits them it will never know they no longer redirect to me. I am convinced that there is nothing we can do. Google is just broken and they don't care. When I search for my company name, my home page is nowhere to be found. Rather, dozens of scraper/directory style sites with 0 PageRank are listed. Very pathetic and sad.
Been lurking here for years, but with this 302 fun, I just gotta join in.
By searching for text unique to my sites, I found 5 URLs in Google (not mine) using my page title and having my info in the cache. I was able to use the methods detailed above to initiate removal for 3 of these URLs. However, one of these URLs is not just a 302 but also uses a meta refresh set to 0, so the Google removal tool is not seeing the noindex meta tag which I temporarily added to my page (Google sees a page that has nothing but the hijacker's meta refresh tag on it). The second URL I could not remove goes to a page that says "This Account Disabled...". The 302 no longer redirects, but the URL is still in the SERPs with my cache.
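To tell the two cases apart before submitting a URL, you can look at the HTML the hijacking URL itself returns and see whether it contains a meta refresh. Here is a minimal sketch with the standard library; `find_meta_refresh` is a hypothetical helper name, and the parsing of the `content` attribute is deliberately simple (delay, then an optional `url=` part).

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Record the first <meta http-equiv="refresh"> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.refresh = None  # becomes (delay_seconds, target_url or None)

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.refresh is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "0; url=http://example.com/" or just "5".
        delay_part, _, rest = attr.get("content", "").partition(";")
        try:
            delay = float(delay_part.strip())
        except ValueError:
            return
        rest = rest.strip()
        target = None
        if rest.lower().startswith("url="):
            target = rest[4:].strip(" '\"") or None
        self.refresh = (delay, target)


def find_meta_refresh(html):
    """Return (delay, url) for the first meta refresh in html, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.refresh


page = '<html><head><meta http-equiv="refresh" content="0; url=http://www.mydomain.com/"></head></html>'
print(find_meta_refresh(page))
```

A result with a delay of 0 pointing at your own URL would match the situation described above, where the tool sees only the hijacker's refresh page and never reaches your noindex tag.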
Is there a Google e-mail or other method that can help rectify these problems?
I have submitted those URLs AND I have linked to them from various pages. But Googlebot has yet to revisit and update its cache. I linked to them because when the webmasters removed the redirect to my page, they simply redirected the URL back to their own homepages. But, as I said, Googlebot is not interested in visiting, and Google would rather have an old, outdated cache in its database.
On one of them, the removal tool wouldn't work because it couldn't recognize the characters. I think it got hung up on the %2F. I don't know why Google can index and display a URL but can't remove it.
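For reference, %2F is just the percent-encoded form of "/", which often appears when a redirect script carries the destination URL inside its query string. The sketch below shows the two spellings side by side using Python's `urllib.parse`. The tracker URL is a made-up placeholder, `same_url_ignoring_encoding` is a hypothetical helper, and whether the removal tool really tripped on the encoding is only the guess stated above.

```python
from urllib.parse import quote, unquote

# A made-up redirect URL that carries the destination in its query string.
encoded = "http://tracker.example/redirect?url=http%3A%2F%2Fwww.mydomain.com%2F"

# Decoding turns %3A back into ":" and %2F back into "/".
decoded = unquote(encoded)
print(decoded)

# Encoding the destination with no "safe" characters reproduces the %2F form.
print(quote("http://www.mydomain.com/", safe=""))


def same_url_ignoring_encoding(a, b):
    """Loose comparison: True if two URL strings match once percent-decoded."""
    return unquote(a) == unquote(b)
```

If a form rejects one spelling, trying the other (encoded versus decoded) costs nothing, though there is no guarantee either will be accepted.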
macdave, that seems like a good suggestion. I would be interested in knowing if anyone has had success with this.
In addition to removing all those urls, I rewrote the content. You are right, we are approaching the 90 day mark. But, it is not right for Google to penalize an innocent site. It is not my fault that webmasters copied my content, or set up malicious redirects to my homepage. Google should have a way of manually removing penalties such as these.
I still have my doubts that anything will change in the next 6 months.
Does anybody know how to get to the other 8900 results?
I can't use this remove feature if I can't see all of the results!
Can you confirm that you were able to remove the hijacker without inflicting any collateral damage on your own URL?
I just got confirmation from the Google control panel that the offending URL removal is "complete."
An "allinurl:www.mydomain.com" search shows the hijacker gone and it shows my index page alive and well.
Just dropped by to say that you can also serve a 404 or 410 code; that works just as well. (No need to serve it to everyone: just do a little .htaccess magic and serve it to Google for a few minutes, then take it down again.)
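The post talks about doing this in .htaccess. As a stand-in, here is the same idea as a small Python WSGI sketch, not the poster's actual setup. It assumes the checking script sends a user agent containing "Googlebot", which is not confirmed anywhere in this thread. `choose_status` and `make_app` are hypothetical names.

```python
def choose_status(user_agent, removal_window_open):
    """Return '410 Gone' only for the assumed crawler while the window is open."""
    # Assumption: the checker identifies itself with "Googlebot" in its user agent.
    if removal_window_open and "googlebot" in user_agent.lower():
        return "410 Gone"
    return "200 OK"


def make_app(removal_window_open):
    """Build a WSGI app; flip removal_window_open back to False when done."""

    def app(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        status = choose_status(user_agent, removal_window_open)
        if status.startswith("410"):
            body = b"Gone\n"
        else:
            body = b"<html><body>normal page</body></html>\n"
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]

    return app


# Usage (not run here): serve locally with the standard library.
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, make_app(removal_window_open=True)).serve_forever()
```

As with the noindex tag, the important part is turning the switch off again right after the request is accepted, so regular crawls see the normal page.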
Q: What was the IP and/or User-Agent of the script that checked your page (ie. the URL removal tool)?
Although I'm glad that you managed to get some redirect scripts removed, I'm also a bit worried, as this should really not be the responsibility of the hijacked webmaster. This does not fix the problem. Google should plainly fix this, so that we could get on with building and maintaining our sites instead of fixing their errors for them.
[edited by: claus at 11:26 pm (utc) on Mar. 17, 2005]
Trying to understand...
Not following "case of template.." I use SSI headers and footers but not getting your meaning.
[edited by: ciml at 8:07 am (utc) on Mar. 18, 2005]
[edit reason] Examplified [/edit]
Although I'm glad that you managed to get some redirect scripts removed, I'm also a bit worried, as this should really not be the responsibility of the hijacked webmaster.
While I too am happy to see the Google removal tool working to an extent, it is simply impractical.
I have over 20 sites which have all been hijacked by over 100 web sites all using the same template.
Let me rephrase that.
What I mean is that each site has been hijacked 100 times, so I would have to submit 2000 pages to the Google removal tool, and for each one either temporarily take my site down or add the noindex tag and then put everything back to normal again afterwards.
That would take a good deal of time.
What's to stop me from getting them to take all of my competitor's pages out of their index? How are you being authorized to remove a page from their index?
It doesn't sound like there is any authorization going on at all ... just requests for removal of someone else's page being granted. Worrisome.
Idaho, it didn't work in the past for crobb305 but it worked for you just now.
I'm not saying my site has come back in the SERPs. Judging from what Crobb305 has said, this probably won't happen until:
1. Google re-indexes my page;
2. The next update; or
3. The next update after some duplicate content penalty expires.
All I'm saying is that I successfully removed the offending URL without removing my own page from Google's index.
You submit a page to be removed, G checks to see if it has a noindex tag and, if it does, it removes the page. So, yes, you could submit URLs not your own, but only pages with a noindex tag get dropped -- so you couldn't do it maliciously to anyone who hasn't included that tag on their page.
It works in this case, because G thinks the page is yours -- G fetches YOUR page with the noindex tag, when it checks the "hijacking" URL.
This is a case of removing a page that is using a 302 redirect to one of YOUR pages, stealing your content and pretending that it is theirs.
By temporarily removing your page and signaling to Google that you'd like to remove the hijacker's page, Google mistakenly believes that the hijacker's page is no longer online and wants it to be removed.
The same flaw in the algorithm which is responsible for indexing the hijacker's page (which is stealing your content) also can remove it.
The problem here is that first off it's difficult to discover all the pages that are hijacking your pages, and secondly it requires a substantial amount of time directly proportional to the number of total pages to be removed.
It works like this:
Google has indexed my own page as belonging to some other site. This is called a hijack. It's really just a Google glitch because it's really my page, it resides on my site; I control the content, etc, but Google thinks it belongs to some other site.
So what you do is put a meta tag on the page that tells Google not to index the page. Then you have Google go have another look at the page through the offending url. When it does, it sees the "noindex" meta and removes the offending url. It doesn't remove the page because you told it to, it removes the page because the author of the page has a meta tag on the page that says to remove it.
After Google looks at the page, it removes the URL from its cache of pages belonging to the other site. The trick is to remove the meta tag before Google comes along through your URL and notices the tag. If this happens it will also remove it from your site.
The 302 is resolving to YOUR page, hence either G has the 302 page AS BEING your page or it doesn't. In either case, the site is NOT within a domain you are authorized to manage, and G doesn't ask you for any authorization ... does it?
In the latter case, G sees BOTH pages ... and you are authorized to manage only YOUR page. In the former case, G sees only the 302 page ... which you are not authorized to manage.
How can G remove a page at your request when you are not authorized to manage that domain? Are we in agreement that G is too stupid to realize what it has indexed and who is asking it to take a page out of that index?
The "trick" described above only works if G does not validate authorization to manage the offending page's domain. If they go ahead and remove a page from someone else's domain from their index because you ask them to, that just doesn't sound right.
The ends do not justify the means, and this leaves a lot of issues on the table ... issues far more serious than what the 302 perpetrator did in the first place.
All the tool does is tell Google to look at the page and read the meta tag. If the meta tag is there, it removes the page. If it isn't there, it won't remove the page. You couldn't possibly use the tool on a competitor's page to remove his content unless you could somehow get your competitor to also include the meta tag.
It does seem interesting to think that maybe you could use the tool to reindex your page back into Google's index by setting the metas to "index."