Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 277 message thread spans 10 pages.
How to Remove Hijacker Page Using Google Removal Tool
8,058,044,651 pages indexed (now minus 1)
Idaho

10+ Year Member



 
Msg#: 28612 posted 6:19 pm on Mar 17, 2005 (gmt 0)

Continued from: [webmasterworld.com...]


With the help of posts from crobb305 and others, I was able to remove a hijacker's page from the Google index.

My site was doing very well in the SERPs. For over 2 years it had been on the first page for a competitive term (1.2 million listings). Then during the first week in January my site disappeared and traffic tanked for no obvious reason.

When searching for "site:www.mydomain.com" I noticed that my index page often wasn't listed, or it appeared on about page 3 or 4 of the results after all my supplemental pages.

A search for "allinurl:mysite.com" often didn't show my index page at all but instead showed somebody else's domain (located in Turkey). When I clicked on this link, my site came up. When I clicked on the cached version of the site, it showed a very old cache of the page. This same site also showed up after all my results when doing a "site:www.mydomain.com"

Using a header checker tool on the site's URL I was able to see it was using a 302 link to my site.
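A header check of this kind can be sketched in Python. This is only a minimal sketch, not the tool used above; the function names and the example hosts are hypothetical. It requests the URL without following redirects, then tests whether the response is a 302 whose Location header points at your own host.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(url, timeout=10):
    """Send a HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def points_at_my_site(status, location, my_host):
    """True if the response is a 302 whose Location is on my_host."""
    if status != 302 or not location:
        return False
    host = urlsplit(location).hostname or ""
    return host.lower() == my_host.lower()
```

Typical use would be `points_at_my_site(*fetch_status_and_location(suspect_url), "www.mydomain.com")`, where `suspect_url` is the address copied from the "allinurl" result.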

Last night after reading some posts by crobb305 and others I went to Google.com and clicked on "About Google." Then I clicked on "Webmaster Info." Then I clicked on "I need my site information removed." Then I clicked on "remove individual pages," where I found instructions on how to remove the page.

(Here's the exact page where I ended up. If mod needs to remove then snip away:) [google.com...]

I then clicked on the "urgent" link.

Then:
1. I signed up for an account with Google and replied back to them from an email they sent me;
2. I added the "noindex" meta tag according to their instructions and uploaded it to my site;
3. Using the instructions to remove a single page from the Google index, I added the hijacker's URL that was pointing to my site. (copy and paste from the result found on "allinurl" search)

This didn't work the first time because I had to remove a space from the url to get it to work.

4. I got a message back saying that the request would be taken care of within 24 hours. The URL that I entered showed on the upper right hand part of the screen saying "removal of (hijacker's url) pending."
5. I then removed the "noindex" meta tag from my page and re-uploaded it to my site.
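Steps 2 and 5 above (adding the tag, then taking it off again) can be sketched as two small Python helpers. This is a sketch under assumptions: the thread does not quote the exact markup from Google's instructions, so the standard robots meta tag is used here, and the helper names are hypothetical.

```python
import re

# Assumption: the standard robots meta tag; the thread does not quote
# the exact markup Google's instructions asked for.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html):
    """Insert the noindex tag right after the opening <head> tag (once)."""
    if NOINDEX_TAG in html:
        return html
    return re.sub(r"(<head(?:\s[^>]*)?>)", r"\1" + NOINDEX_TAG, html,
                  count=1, flags=re.IGNORECASE)


def remove_noindex(html):
    """Remove the tag again so the page's own URL stays indexable."""
    return html.replace(NOINDEX_TAG, "")
```

Keeping both operations in one place makes it harder to forget step 5, which is the step that protects your own URL.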

This morning the google account still shows the url removal as "pending" but when I do "site:" and "allinurl" searches the offending URL is gone and my index URL is back.

Conclusions and Speculations:
At some point last September, Google cached the hijack page's URL pointing to my site. In January, Google penalized my site for duplicate content because it found both URLs and compared them. Mine got penalized because it was the only page that really existed. The hijacker's page didn't get penalized because it only existed as a redirect to my site.

Because my index page was now penalized, it dropped almost completely from the SERPs. (Some of my supplemental pages showed up for obscure searches, but none for my money terms.)

Because I haven't been able to get a response from the hijacker's webmaster, the 302 is still in place but it is buried deep in his site and the last Google cache of the page was sometime in September. Therefore with some luck Google won't re-index it any time soon.

Will my site return to the SERPs? I don't know. Any thoughts?

 

ciml

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28612 posted 7:55 pm on Mar 17, 2005 (gmt 0)

I've used the manual removal tool to remove redirect URLs, but only after using robots exclusion for the redirect URL (which requires control of the URL or cooperation from its owner).

An interesting aspect is that, according to your experience, Google are removing the submitted URL and not the destination URL. This does make sense, given that Google fixed the "remove competitor's home page" exploit last Summer.

The next question is how quickly the benefit of the backlinks will be applied to the rightful URL.

macdave

10+ Year Member



 
Msg#: 28612 posted 8:23 pm on Mar 17, 2005 (gmt 0)

Yesterday I removed a hijacker using the method you described. Since the hijacker was redirecting to a slightly different URL than the URL I have indexed (e.g. mysite.com/index.html vs. mysite.com/), it was a low-risk move. I probably wouldn't have done it if the hijacker had been redirecting to a URL I needed to keep in the index.

Can you confirm that you were able to remove the hijacker without inflicting any collateral damage on your own URL? If so that's great news and we finally have a decent way to fight this.

crobb305

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28612 posted 8:27 pm on Mar 17, 2005 (gmt 0)

Yes...you can remove a url that redirects to any page on your site without causing harm to the intended url for that page.

To do this, you use the removal tool and set the meta robots tag to "noindex" just long enough to get the url submitted. Then, instantly return the metatag to "index". If you forget to change the tag back, you obviously risk having the intended url removed next time Googlebot checks your site. When you submit a url for removal via the removal tool, the program will instantly check to make sure the tag is set to "noindex" (for your protection), but it will not check again. That is why you are able to immediately return the tag to index after you get the submission "success".
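Since forgetting to switch the tag back is the main risk here, a quick check of the live page's markup may help. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, and it only looks at a `<meta name="robots">` tag, nothing else.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """True if the page currently carries a robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Running `has_noindex` on the page source right after the submission reports "success" confirms whether the tag really is back to "index" (or gone).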

The only thing I am not sure about is if Google still knows about the url(s) that are removed and uses them in ranking calculations. Does Google only remove the url from visible index? If Google removes the urls from visible index but retains the url/information somewhere else for its own purpose, then our efforts to remove the urls and help Google clean up its horrific mess are in vain. This would not surprise me in the least.

Chris

martingale

5+ Year Member



 
Msg#: 28612 posted 8:35 pm on Mar 17, 2005 (gmt 0)

Please update this thread periodically to let us know how your site is doing in the search results; have you lost ground? And how long does it take you to get back to where you should be?

This is a *very* interesting post. It's the first thing I've seen about 302's that seems like it actually would work. Here's hoping.

crobb305

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28612 posted 8:47 pm on Mar 17, 2005 (gmt 0)

Well I have a bit of information regarding the success of the url removal.

In November 2004, there were as many as 20 urls that were NOT mine showing in a site:mysite.com search. These urls were mostly tracker2's. But, after Google had associated those tracker2s with my site, it then began associating all redirects it found with my site. Incidentally, the site: search is supposed to show only urls that are truly part of your site. If it shows unrelated urls (and certainly 20 unrelated urls) then there is a problem.

So, I began submitting those urls to Google Removal tool. Since the redirects ultimately landed on my page (destination page), I had control of its removal using robots metatag. The last one was removed in late January. Unfortunately, nothing has changed. My site is still MIA. There are still some unrelated urls showing in the site: search since the redirect was removed prior to my learning about the removal tool. Those remaining urls were last cached on Nov 2, and until Googlebot revisits them it will never know they no longer redirect to me. I am convinced that there is nothing we can do. Google is just broken and they don't care. When I search for my company name, my home page is nowhere to be found. Rather, dozens of scraper/directory style sites with 0 pagerank are listed. Very pathetic and sad.

Chris

macdave

10+ Year Member



 
Msg#: 28612 posted 8:59 pm on Mar 17, 2005 (gmt 0)

Chris: Have you tried prompting Googlebot to visit those no-longer-existent redirects by submitting their URLs to [google.com...] and/or linking to them from a frequently-spidered page?

Atticus



 
Msg#: 28612 posted 9:30 pm on Mar 17, 2005 (gmt 0)

Greetings All;

Been lurking here for years, but with this 302 fun, I just gotta join in.

By searching for text unique to my sites, I found 5 URLs in Google (not mine) using my page title and having my info in the cache. I was able to use methods detailed above to initiate removal for 3 of these URLs. However, one of these URLs is not just a 302, but also uses a meta refresh set to 0, so the Google removal tool is not seeing the meta noindex tags which I temporarily added to my page (Google sees a page that has nothing but the hijacker's meta refresh tag on it). The second URL I could not remove goes to a page that says "This Account Disabled...," the 302 no longer redirects, but the URL is in the SERPS with my cache.
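To tell in advance whether a hijacking URL relies on a meta refresh rather than (or in addition to) a 302, the page body it returns can be inspected. This is a rough sketch with hypothetical names; it only handles the common `content="N; url=..."` form and says nothing about how the removal tool itself treats such pages.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Find the first <meta http-equiv="refresh"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.refresh = None  # (delay_seconds, target_url or None)

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.refresh is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        delay, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        target = None
        if rest.lower().startswith("url="):
            target = rest[4:].strip(" '\"")
        try:
            seconds = float(delay.strip())
        except ValueError:
            return  # malformed delay; ignore this tag
        self.refresh = (seconds, target)


def find_meta_refresh(html):
    """Return (delay, target) for the first meta refresh, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.refresh
```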

Is there a Google e-mail or other method that can help rectify these problems?

crobb305

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28612 posted 10:03 pm on Mar 17, 2005 (gmt 0)

macdave

I have submitted those urls AND I have linked to them from various pages. But, Googlebot has yet to revisit and update its cache. I linked to them because when the webmasters removed the redirect to my page, they simply redirected the url back to their own homepages. But, as I said, Googlebot is not interested in visiting, and Google would rather have old, outdated cache in its database.

Chris

Vec_One

10+ Year Member



 
Msg#: 28612 posted 10:07 pm on Mar 17, 2005 (gmt 0)

My situation is almost exactly the same - redirects from old pages with old caches.

On one of them, the removal tool wouldn't work because it couldn't recognize the characters. I think it got hung up on the %2F. I don't know why Google can index and display a URL but can't remove it.

macdave, that seems like a good suggestion. I would be interested in knowing if anyone has had success with this.

Idaho

10+ Year Member



 
Msg#: 28612 posted 10:16 pm on Mar 17, 2005 (gmt 0)

Vec_One

The URL I removed had the %2 in it also but this wasn't the problem. I initially got an error message from Google saying not valid because of the character " "

This turned out to be a hard to see space in the offending URL. When I removed it and resubmitted it worked.
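A stray space like this is easy to miss by eye. As a small sketch (the function names and the sample URL are hypothetical), Python can list any whitespace or non-printable characters in a pasted URL and strip them before resubmitting:

```python
def find_hidden_whitespace(url):
    """Return (index, repr) for each whitespace or non-printable character."""
    return [(i, repr(ch)) for i, ch in enumerate(url)
            if ch.isspace() or not ch.isprintable()]


def strip_whitespace(url):
    """Drop whitespace and non-printable characters from a pasted URL."""
    return "".join(ch for ch in url
                   if not ch.isspace() and ch.isprintable())
```

For example, `find_hidden_whitespace("http://hijacker.example/go.php?u=http%3A%2F%2F www.mydomain.com")` reports one entry for the space after `%2F`.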

Idaho

Emmett

5+ Year Member



 
Msg#: 28612 posted 10:22 pm on Mar 17, 2005 (gmt 0)


The last one was removed in late January. Unfortunately, nothing has changed.

crobb305,

I read somewhere that the duplicate content penalty ranges from 30-90 days depending on how long the dupe content exists. You should be getting close to the 90 day mark. I've got my fingers crossed for you.

crobb305

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28612 posted 10:26 pm on Mar 17, 2005 (gmt 0)

Emmett,

In addition to removing all those urls, I rewrote the content. You are right, we are approaching the 90 day mark. But, it is not right for Google to penalize an innocent site. It is not my fault that webmasters copied my content, or set up malicious redirects to my homepage. Google should have a way of manually removing penalties such as these.

I still have my doubts that anything will change in the next 6 months.

C

rehabguy

10+ Year Member



 
Msg#: 28612 posted 10:55 pm on Mar 17, 2005 (gmt 0)

When I do an "inurl:mysite.com" search on Google it will only show me 1,000 of 9,900 results! (My site only has 1,700 pages, so I know that there are more hijackers than the few I see in the first 1,000 results.)

Does anybody know how to get to the other 8900 results?

I can't use this remove feature if I can't see all of the results!

Tnx;
Rehabguy

Atticus



 
Msg#: 28612 posted 11:03 pm on Mar 17, 2005 (gmt 0)

rehabguy,

Don't know how to see them, but I agree that there are probably a lot more bogus URLs affecting our rankings than those that can be found by methods posted here.

I think that this problem is far bigger than it appears to be on the surface.

Idaho

10+ Year Member



 
Msg#: 28612 posted 11:10 pm on Mar 17, 2005 (gmt 0)

Can you confirm that you were able to remove the hijacker without inflicting any collateral damage on your own URL?

I just got confirmation from the Google control panel that the offending URL removal is "complete."

An "allinurl:www.mydomain.com" search shows the hijacker gone and it shows my index page alive and well.

Idaho

Emmett

5+ Year Member



 
Msg#: 28612 posted 11:18 pm on Mar 17, 2005 (gmt 0)


Does anybody know how to get to the other 8900 results?

Rehabguy,

Try this:

inurl:yoursite.com "unique text from your home page"

That should narrow it down to your home page and anyone else 302'ing to it.

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28612 posted 11:23 pm on Mar 17, 2005 (gmt 0)

>> I added the "noindex" meta tag

Just dropped by to say that you can also serve a 404 or 410 code; that works just as well. (No need to serve it to everyone. Just do a little .htaccess magic and serve it to Google for a few minutes, then take it down again.)
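The suggestion above is for .htaccess; the same idea can be sketched as a tiny Python WSGI app for anyone not on Apache. This is only an illustration under assumptions: the user-agent string is the one a poster reports further down this thread "in one case", so it may not be the only one, and the page body is a placeholder.

```python
# Assumption: user-agent reported by one poster in this thread; not verified.
REMOVAL_TOOL_UA = "googlebot-urlconsole"


def app(environ, start_response):
    """Serve 410 Gone to the removal tool's user-agent, 200 to everyone else."""
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    if REMOVAL_TOOL_UA in ua:
        body = b"Gone"
        start_response("410 Gone", [("Content-Type", "text/plain"),
                                    ("Content-Length", str(len(body)))])
        return [body]
    body = b"<html><body>Normal page</body></html>"  # placeholder content
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

For a local trial it can be run with the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. As with the meta tag, the rule should be taken down again as soon as the removal request is accepted.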

Q: What was the IP and/or User-Agent of the script that checked your page (ie. the URL removal tool)?

Although I'm glad that you managed to get some redirect scripts removed, I'm also a bit worried as this should really not be the responsibility of the hijacked webmaster. This does not fix the problem. Google should plainly fix this, so that we could get on with building and maintaining our sites instead of fixing their errors for them.

[edited by: claus at 11:26 pm (utc) on Mar. 17, 2005]

Hanu

10+ Year Member



 
Msg#: 28612 posted 11:24 pm on Mar 17, 2005 (gmt 0)

Idaho, it didn't work in the past for crobb305 but it worked for you just now. Could this mean that Google did at least a partial fix in the meantime?

blend27

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 28612 posted 11:53 pm on Mar 17, 2005 (gmt 0)

Atticus,

Pay attention to the case of the template you are using on your site; this might be your issue.

Template.html and tempLATE.HtML are 2 pages in G's index.

https://www:mydomain.tld/template.html and
[www:mydomain.tld...] count as 2 pages.
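To check a list of indexed URLs for this kind of case-only duplication, something like the following Python sketch could be used. The function name and sample URLs are hypothetical; it simply groups URLs whose host and path are identical once lowercased.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def case_variants(urls):
    """Group URLs that differ only by letter case in host or path."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        key = (parts.netloc.lower(), parts.path.lower())
        groups[key].add(url)
    # Keep only keys that have more than one distinct spelling.
    return {key: sorted(variants)
            for key, variants in groups.items() if len(variants) > 1}
```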

Atticus



 
Msg#: 28612 posted 12:05 am on Mar 18, 2005 (gmt 0)

Blend27,

Trying to understand...

Not following "case of template.." I use SSI headers and footers but not getting your meaning.

I typed https://example.com into my browser and my web host's front page comes up with that. G shows no listing for https://example.com, but does list [example.com...]

[edited by: ciml at 8:07 am (utc) on Mar. 18, 2005]
[edit reason] Examplified [/edit]

Bobby

10+ Year Member



 
Msg#: 28612 posted 12:29 am on Mar 18, 2005 (gmt 0)

Although I'm glad that you managed to get some redirect scripts removed, I'm also a bit worried as this should really not be the responsibility of the hijacked webmaster

Good point.

While I too am happy to see the Google removal tool working to an extent, it simply is impractical.

I have over 20 sites which have all been hijacked by over 100 web sites all using the same template.
Let me rephrase that. What I mean is that each site has been hijacked 100 times, so I would have to submit 2000 pages to the Google removal tool and temporarily remove my sites or place the noindex tag in there and then put it all back to normal again afterwards.

That would take a good deal of time.

StupidScript

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28612 posted 12:38 am on Mar 18, 2005 (gmt 0)

I'm curious how Google lets someone who does not own an offending domain remove it from their index?

What's to stop me from getting them to take all of my competitor's pages out of their index? How are you being authorized to remove a page from their index?

It doesn't sound like there is any authorization going on at all ... just requests for removal of someone else's page being granted. Worrisome.

Idaho

10+ Year Member



 
Msg#: 28612 posted 12:42 am on Mar 18, 2005 (gmt 0)

Idaho, it didn't work in the past for crobb305 but it worked for you just now.

Hanu
I'm not saying my site has come back in the SERPs. Judging from what Crobb305 has said, this probably won't happen until:
1. Google re-indexes my page;
2. The next update; or
3. The next update after some duplicate content penalty expires.

All I'm saying is that I successfully removed the offending url without removing my own page from Google's index.

Atticus



 
Msg#: 28612 posted 12:43 am on Mar 18, 2005 (gmt 0)

SS,

You submit a page to be removed, G checks to see if it has a noindex tag and if it does, it removes the page. So, yes, you could submit URLs not your own, but only pages with a noindex tag get dropped -- so you couldn't do it maliciously to anyone who hasn't included that tag on their page.

It works in this case, because G thinks the page is yours -- G fetches YOUR page with the noindex tag, when it checks the "hijacking" URL.

Edouard_H

10+ Year Member



 
Msg#: 28612 posted 12:46 am on Mar 18, 2005 (gmt 0)

What was the IP and/or User-Agent of the script that checked your page (ie. the URL removal tool)?

216.239.38.136 "googlebot-urlconsole" ...in one case.

Bobby

10+ Year Member



 
Msg#: 28612 posted 12:47 am on Mar 18, 2005 (gmt 0)

StupidScript,

This is a case of removing a page that is using a 302 redirect to one of YOUR pages, stealing your content and pretending that it is theirs.

By temporarily removing your page and signaling to Google that you'd like to remove the hijacker's page, Google mistakenly believes that the hijacker's page is no longer online and wants it to be removed.

The same flaw in the algorithm which is responsible for indexing the hijacker's page (which is stealing your content) also can remove it.

The problem here is that first off it's difficult to discover all the pages that are hijacking your pages, and secondly it requires a substantial amount of time directly proportional to the number of total pages to be removed.

Idaho

10+ Year Member



 
Msg#: 28612 posted 12:58 am on Mar 18, 2005 (gmt 0)

StupidScript; if you follow the original post you'll understand.

It works like this:
Google has indexed my own page as belonging to some other site. This is called a hijack. It's really just a Google glitch because it's really my page, it resides on my site; I control the content, etc, but Google thinks it belongs to some other site.

So what you do is put a meta tag on the page that tells Google not to index the page. Then you have Google go have another look at the page through the offending url. When it does, it sees the "noindex" meta and removes the offending url. It doesn't remove the page because you told it to, it removes the page because the author of the page has a meta tag on the page that says to remove it.

After Google looks at the page, it removes the URL from its cache of pages belonging to the other site. The trick is to remove the meta tag before Google comes along through your URL and notices the tag. If this happens it will also remove it from your site.
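The mechanism described in this post can be illustrated with a toy model in Python. This is purely a sketch of the behaviour as posters in this thread describe it, not a description of how Google actually works; the class, its fields, and the example URLs are all hypothetical.

```python
class ToyIndex:
    """Toy model of the removal behaviour described in this thread."""

    def __init__(self):
        self.entries = set()        # URLs listed in the index
        self.redirects = {}         # url -> destination url (a 302)
        self.noindex_pages = set()  # real pages currently carrying noindex

    def resolve(self, url):
        """Follow redirects until a real page is reached (loop-safe)."""
        seen = set()
        while url in self.redirects and url not in seen:
            seen.add(url)
            url = self.redirects[url]
        return url

    def request_removal(self, submitted_url):
        """Check the page reached via submitted_url once.

        If that page carries noindex, drop only the submitted URL.
        """
        page = self.resolve(submitted_url)
        if page in self.noindex_pages:
            self.entries.discard(submitted_url)
            return True
        return False
```

In this model, submitting the hijacker's URL while your page carries the tag drops only the hijacker's entry, and submitting a URL whose destination has no such tag does nothing, which matches the point made elsewhere in the thread about not being able to remove a competitor's page.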

Idaho

StupidScript

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28612 posted 1:14 am on Mar 18, 2005 (gmt 0)

Yes, I see what you are saying, and I understand the "trick": You put the "noindex" instruction on YOUR page, not the offender's page. G looks at your page via the redirect page ... and removes the redirect page only? But ... YOUR page is the one with the "noindex" on it.

The 302 is resolving to YOUR page, hence either G has the 302 page AS BEING your page or it doesn't. In either case, the site is NOT within a domain you are authorized to manage, and G doesn't ask you for any authorization ... does it?

In the latter case, G sees BOTH pages ... and you are authorized to manage only YOUR page. In the former case, G sees only the 302 page ... which you are not authorized to manage.

How can G remove a page at your request when you are not authorized to manage that domain? Are we in agreement that G is too stupid to realize what it has indexed and who is asking it to take a page out of that index?

The "trick" described above only works if G does not validate authorization to manage the offending page's domain. If they go ahead and remove a page from someone else's domain from their index because you ask them to, that just doesn't sound right.

The ends do not justify the means, and this leaves a lot of issues on the table ... issues far more serious than what the 302 perpetrator did in the first place.

IMHO.

Idaho

10+ Year Member



 
Msg#: 28612 posted 1:28 am on Mar 18, 2005 (gmt 0)

Yes, it is a glitch with Google. We all agree that it's a huge problem that Google needs to fix. It is looking at one page and indexing it as two pages; one for me and one for the hijacker.

All the tool does is tell Google to look at the page and read the meta tag. If the meta tag is there it removes the page. If it isn't there it won't remove the page. You couldn't possibly use the tool on a competitor's page to remove his content unless you could somehow get your competitor to also include the meta tag.

It does seem interesting to think that maybe you could use the tool to reindex your page back into Google's index by setting the metas to "index."

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved