|If they go ahead and remove a page from someone else's domain from their index because you ask them to, that just doesn't sound right. |
That's not how it works.
In essence, all the URL removal tool does is tell Googlebot to visit a URL sooner than it normally would. The actual removal is only accomplished if the webmaster has put that URL into robots.txt, or added
<meta name="robots" content="noindex">, or the URL returns 404 Not Found.
I can submit your URL all day long, but it won't be removed unless you've specified (via robots.txt, meta tag, etc.) that it shouldn't be indexed.
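To make that condition concrete, here is a rough sketch of the check as described above. It is not Google's actual logic; the helper name `is_eligible_for_removal` is made up for illustration, it assumes the standard `content` attribute on the robots meta tag, and it ignores robots.txt entirely.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def is_eligible_for_removal(status_code, html):
    """Hypothetical helper: True if the response says 'do not index'.

    Covers only two of the signals mentioned in the thread
    (404 status, robots meta noindex); robots.txt is not checked.
    """
    if status_code == 404:
        return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Run it against the HTML of the page the redirect lands on: if it returns False, submitting the URL should have no effect.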
|Are we in agreement that G is too stupid to realize what it has indexed and who is asking it to take a page out of that index? |
That's the root of the 302 problem. Google is incorrectly crediting the content of the hijacked page to the hijacker's URL. Since the webmaster of the hijacked site (e.g. Idaho, crobb305, myself) still controls the content, we've been able to take advantage of the bug to remove the hijacking URL.
The same logic that allows the hijack also enables the removal of the hijacker.
StupidScript: They aren't really removing a page from the other site without authorisation, because the page doesn't even exist on the other site. Google only thinks that a page exists there.
So I just noticed that someone did this to one of my sites. I just followed the steps and got 3 visits from "googlebot-urlconsole"
126.96.36.199 - - [17/Mar/2005:21:44:48 -0500] " "googlebot-urlconsole"
188.8.131.52 - - [17/Mar/2005:21:45:08 -0500] " "googlebot-urlconsole"
Should I now take down the noindex for googlebot, or wait for another visit from this bot?
If you got a message in the console that the removals are "pending," then it's time to remove the noindex tag. Take it out soon, before Gbot visits the page directly rather than through the redirect. If the tag is still there when Gbot goes directly to the page via your own URL, you will lose that listing in the SERPS.
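If you want to confirm the removal bot has actually been by before you pull the tag, a quick scan of the access log is enough. A minimal sketch, assuming the user agent string appears in each log line as "googlebot-urlconsole" (as in the excerpt above); the function name and the sample lines in the usage note are made up.

```python
def urlconsole_hits(log_lines, agent="googlebot-urlconsole"):
    """Return the log lines that mention the given user agent.

    The match is a plain case-insensitive substring test, so it does
    not depend on the exact access log format.
    """
    needle = agent.lower()
    return [line for line in log_lines if needle in line.lower()]
```

Typical use would be `urlconsole_hits(open("access.log"))` (the log path is an assumption), then checking that the timestamps fall after your submission.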
I will read through this tonight. First question - any risks to doing this? (vs. just waiting for google to clean up their own act)?
It amazes me that this has made headlines here at WW several times now and yet Google still has not addressed this problem.
|I'm curious how Google lets someone who does not own an offending domain remove it from their index? |
With all due respect, this is not that complicated. If a url REDIRECTS to YOUR page, then in a sense, you own the control of that url from a Google-URL-Removal-Tool point of view. Google will look at YOUR metarobots tag before removing the url, as it should (since your page is the destination page of the redirect url).
If a url is redirecting to my page (homepage, etc) and I don't want it to, I simply set my metarobots to "noindex", submit that unwanted redirect through the google url removal tool, and POOF, it's gone. Simple as that.
As I stated earlier, whether or not the Google algorithm notices/recognizes the url's disappearance is a different story. It is quite possible that the urls are removed only from the visible serps.
|I will read through this tonight. First question - any risks to doing this? (vs. just waiting for google to clean up their own act)? |
To my knowledge, the only risk is forgetting to set your metarobots tag back to "index" as soon as you get the "success" notification.
This problem is nowhere close to being resolved. We only hope our emails to Google will ultimately cause some good content sites to one day reappear.
I agree with you Crobb305 and Gala_Nixon there are many quality content sites missing until this issue is resolved.
This site removal tool should not be what we need to use to get this done. Google should be taking care of this.
...she had a good point about submitting urls that still exist but have no real means for Google url removal. I.e., Google will not remove a page that "still exists" even if the page returned is a "server not available"
I have just used this method to remove 2 offending URLs.
Both had caches dating from mid-September which seems to be a common thing.
I have a situation where the hijacker is not redirecting only to one of my pages but is redirecting to 10 different pages that they took from overture listings.
Even if I did modify my page temporarily I don't believe it would result in the removal of the hijacker's page.
just curious. If the original page is higher in pagerank, will that prevent it from being jacked?
Why is everyone still using methods that are not 100% accurate all the time to get rid of this crap? Simply use mod_rewrite to redirect anything coming from the site that has 302'd you to a page that basically says that site is stealing other people's rankings using a known Google exploit, and submit the page to Google's addurl. I am quite sure if that was done by enough people Google would fix the issue quickly. Also doing so would stop that page from ranking using your IBLs and keywords.
|Simply use mod_rewrite to redirect anything coming from the site that has 302'd you to a page that basically says that site is stealing other people's rankings using a known Google exploit, and submit the page to Google's addurl. |
Unfortunately, there's no way to determine that a specific page request came through a redirect.
In most cases apache still attaches a referer header to redirects.
I have a site doing a 301 redirect to mine. Is this fine, or should I have it removed?
The mod_rewrite/redirect/referrer angle has been beaten to death in the 700-post thread and a couple other places. Bottom line: it doesn't work because Googlebot never sends a referrer.
boredguru has posted a few valiant attempts involving tracking databases and on-the-fly URL rewriting, but the ones that *might* work are pretty cumbersome to implement and would require creating a lot of duplicate content on your own site. I'd hardly consider that preferable to zapping hijackers with the removal tool.
If you don't know whether or not you've been hijacked, please read this thread for complete background info:
Let's keep this one on-topic.
> In most cases apache still attaches a referer header to redirects.
A referrer header is set by the browser (or bot) making the request - not by Apache.
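To illustrate why the referrer approach falls down, here is a small sketch of the test such a rule boils down to. The function name and the hostname `hijacker.example` are made up; `headers` is assumed to be a plain dict of request headers as the client sent them.

```python
from urllib.parse import urlparse


def came_from_host(headers, host):
    """True only if the request carries a Referer header naming `host`."""
    referer = headers.get("Referer", "")
    if not referer:
        # Nothing to match on. This is the Googlebot case described in
        # the thread: the bot sends no referrer, so the rule never fires.
        return False
    return urlparse(referer).hostname == host.lower()
```

A browser that clicked through from the hijacker's page would match, but a request with no Referer header looks the same as any direct visit.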
illusionist: The 301 is not a problem and you should thank the 301 linker for providing you with a link to your site. 302's shouldn't be a problem either and normally you should say thanks for those too... it's google's fault that they are a problem.
Okay this is encouraging:
Today my site started to move up in the results. I was in the top 10 until Dec 16, then dropped to high 90s. After removing the hijacker, my site just moved to number 68 today. It's the first time it's shown better than 97 since December 16.
In my experience, the <base href="http://www.wherever.com/it-really-is.htm"> tag in the header seems to have a positive effect. It's not best practice to do in normal terms, because it then turns all your links (from the visitor perspective) from relative to absolute (I won't delve into why this is a Bad Thing).
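For anyone unsure what the base tag actually changes, here is a small sketch of how relative links resolve with and without it. The hijacker URL is made up, and whether Google treats the tag as a hint about the page's real location is only the poster's observation, not something this code shows.

```python
from urllib.parse import urljoin


def resolve_link(link, document_url, base_href=None):
    """Resolve a link the way a client would.

    Relative links resolve against <base href> when the page declares
    one, otherwise against the URL the document was fetched from.
    """
    return urljoin(base_href or document_url, link)


# Hypothetical URL through which the page was reached via the redirect.
hijack_url = "http://hijacker.example/redirect.php?id=42"
real_url = "http://www.wherever.com/it-really-is.htm"

print(resolve_link("contact.htm", hijack_url))
# http://hijacker.example/contact.htm
print(resolve_link("contact.htm", hijack_url, base_href=real_url))
# http://www.wherever.com/contact.htm
```

Links that are already absolute come out the same either way.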
A question really aimed at those using the Google Removal Tool then.. what else did you try that *didn't* work? I get the impression that mileage may vary with some of these techniques.
What should I do If I've already had the site owner remove the 302 redirect pointed to my page?
As of now, allinurl: lists the offending site ahead of mine with my title and my description, my site is URL ONLY beneath it.
The old 302 url NOW points to the site owner's template page. Whenever I click on the cache it shows that template page, but Google still hasn't changed the title and description to match it.
What should I do?
pcarlow, in that situation I have linked to the URL to ensure timely fetching by Google. If the link to the redirect URL was removed, then it could stay in Google for a long time.
Submitting the redirect URL has also been suggested over the last couple of days - that is one of the few positive reasons to submit a URL to Google in my opinion.
Re "any risks to doing this"
Isn't there a risk of getting your own site/page removed if you don't do it correctly?
I would really appreciate it if someone would post a step by step process to remove those redirects.
ciml, thanks for the advice. I have purchased a PR5 link pointing to it and also have linked to it in several other places for about 3 weeks now. I assume that's why it shows the cache of the new page it points to (the owner's template). The problem is it's still showing above my URL ONLY and maintains my description and title when using the allinanchor: and site: commands. The other strange thing is it shows supplemental. Maybe I should try submitting the URL to Google also, not sure that will do any good though.
if you use only absolute referencing throughout your site, do you gain any additional benefit by using the base href tag?
I have just tried this method on an internal page that was hijacked and it seemed to be successful. Not sure if I have the gumption to try it on my home page yet.
I have three other internal pages being hijacked by another site, and I don't think the removal tool will work in this instance.
The link found during the allinurl search looks like this:
but click it and you go here:
mouse over [ed.namechangedtoprotectheguilty.com]
I tried both urls in the removal tool, and both came up with this message:
|We could not detect any meta tags on that page. Please verify that the URL and page are correct. |
When I clicked the first link, my blocker stopped a popup, and I can't seem to get it to appear. I'm assuming it's an ad - almost like an interstitial - between the first link and the hijacked page of my site.
[edited by: Brett_Tabke at 7:50 pm (utc) on Mar. 18, 2005]
[edit reason] fix side scroll [/edit]
I believe that you have hit a wall with those URLs. Had similar experience where one hijacker URL has my cache, but goes thru a page with nothing but metarefresh=0 tag to my site before the redirect. So G can't see the noindex tag I temp. added to the page, it just sees the 'in between' page.
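One way to spot this kind of in-between page before wasting a submission is to look for a meta refresh in the HTML that the hijacker URL returns. A rough sketch; the function name is made up, and it only handles the common `content="0; url=..."` form.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Records the first <meta http-equiv="refresh"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.found = None  # (delay, target_url) once a tag is seen

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.found is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "0; url=http://..." (the url part is optional)
        delay, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            rest = rest[4:].strip().strip("'\"")
        self.found = (delay.strip(), rest)


def find_meta_refresh(html):
    """Return (delay, target_url) for the first meta refresh, else None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.found
```

If this returns a target, the removal tool is presumably reading the intermediate page rather than yours, which would fit the "could not detect any meta tags" message.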
The other example I came across was a hijacker URL with my cache (Nov. 04) but the page is gone and replaced with "This Account has been terminated..." So, although it is still cached, there is no redirecting page currently in existence for me to shoot down.
Short of contacting the hosting companies or the hijackers themselves in such cases, I don't think there's anything we can do but continue to wait for G to fix this, or wait for G to become so full of irrelevant bs that people start using engines that list our sites correctly.