Forum Moderators: Robert Charlton & goodroi

Is There Any Method To Remove Old Site Info From Google?

         

RedBar

2:41 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



.uk (not co.uk) domain name first registered when allowed, 2014? Never registered before and I own the .co.uk

Used for 3 years until 2017 and then 301'd to a .com

New site launched on this domain July 2021. Bing / DDG / etc all have it correct. G has some pages correct; however, I was shocked to find today that it still has several hundred of the original pages from 2014 indexed, which obviously go to a 404.

Will G EVER de-index these pages?

Will G EVER rank this site well again?

I'm at the point of removing it since all it seems to attract is bots from everywhere ... TBH, I've never seen so many single page visit bots.

Any recommendations? Should I just let it die?

It's not a brand name, but it is an important widget trade keyword. Still, I'm getting to the point where I simply don't care about this confusion any more.

not2easy

4:04 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If it were just a few I'd put up a tiny temporary page to redirect to a "Gone" response. Several hundred is a lot more to keep up with though. IF by any chance they have common URL structures that could be handled with rewrites, maybe it is not too much to consider. They really make it hard to change things using only common sense.

engine

4:39 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



How about using this permanent removal process.

[support.google.com...]

RedBar

4:51 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Aha, thanks, I'll have a good read through that but just why would G retain stuff that hasn't been there for at least 4 years?

engine

5:01 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



why would G retain stuff that hasn't been there for at least 4 years?


Google never forgets! ;)

RedBar

7:16 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Agree however what's the point for something that no longer exists?

phranque

8:38 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Will G EVER de-index these pages?

not until they are at least recrawled...
have you checked to see when they were last crawled?

Any recommendations? Should I just let it die?

you might try creating and submitting a sitemap containing those several hundred urls to encourage and track googlebot's crawl of those 404 pages.
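A minimal sketch of building such a sitemap, assuming the old URLs have already been collected into a list. The `example.uk` URLs and the `build_sitemap` name are hypothetical placeholders; the output follows the sitemaps.org 0.9 format.

```python
from xml.sax.saxutils import escape


def build_sitemap(urls):
    """Return sitemap XML (sitemaps.org 0.9 schema) listing the given URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns "&" into "&amp;" so query strings stay valid XML.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Hypothetical old URLs; replace with the several hundred that are still indexed.
old_urls = [
    "https://example.uk/widgets/blue.html",
    "https://example.uk/widgets/list.php?cat=1&page=2",
]
sitemap_xml = build_sitemap(old_urls)
print(sitemap_xml)
```

The resulting file can be saved (as sitemap-old.xml, for example) and submitted in Search Console, where the crawl of those urls can then be followed.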

If it were just a few I'd put up a tiny temporary page to redirect to a "Gone" response.

i wouldn't recommend any kind of redirect under these circumstances, but a 410 (Gone) status code in the response is a stronger signal than a 404.
assuming apache, this is easily accomplished with mod_rewrite directives.
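With several hundred urls, the directives could be generated rather than typed by hand. The following is only a sketch, assuming the rules go into an .htaccess file in the document root and that the dead urls have no query strings (RewriteRule matches the path only). The paths and the `gone_rules` name are hypothetical. The `G` flag is what makes mod_rewrite answer 410 Gone.

```python
import re


def gone_rules(paths):
    """Build .htaccess lines that answer 410 Gone for each dead URL path."""
    lines = ["RewriteEngine On"]
    for path in paths:
        # In .htaccess context the pattern is matched without the leading slash.
        pattern = re.escape(path.lstrip("/"))
        # "-" means no substitution; the G flag returns 410 Gone.
        lines.append(f"RewriteRule ^{pattern}$ - [G]")
    return "\n".join(lines) + "\n"


# Hypothetical examples of the old 2014 paths.
dead_paths = ["/widgets/blue.html", "/about/team.html"]
htaccess_snippet = gone_rules(dead_paths)
print(htaccess_snippet)
```

If the old urls share a common structure, a single hand-written pattern covering the whole group would be shorter than one rule per url.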

martinibuster

8:46 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Agree however what's the point for something that no longer exists?


It's a sign of extra crawl capacity. They will go out and check whether old URLs have returned, just in case the business made a mistake or something is back in stock, etc.

Robert Charlton

10:42 pm on Oct 25, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Other questions to consider here are "what's the query?" ...and "where does Google try to send you when you click on the result?". And "did you retain ownership of the old domains?"

If the query is the old domain name or old url, it makes much more sense for Google to return that url as a result than if the query is a competitive term. Searches for the old url may stick around in the SERPs for a long time, for reasons phranque mentions.

They're only really a problem, though, if someone is searching for a competitive term and your old domain or url still ranks above your new address for the term. Even here, the question of where Google tries to send you when you click on the result is key.

Several other thoughts...
- One theory has it that Google rechecks old databases when a major update is in the works. I've felt that this is true. Google has denied this, at least as the question is sometimes asked, but FWIW I mention it now.

- One thought I've toyed with occasionally is that Google might display such results routinely to discourage the buying and redirecting of old domains for their PageRank/"link-juice" etc.... It is a sign that they keep track of these things.

I'd try 410s, and after that I'd forget it. I've long felt that it's unfortunate that these are called 404 "errors", which IMO gets everybody worried that they've done something wrong. John Mueller has gone into this at length, that 404s are perfectly normal, they don't hurt you, etc... that they're there for your information in case they weren't intended.

John also mentioned in an old Google+ post that I haven't been able to find elsewhere, that if Google returns, say, 10 pages of 100 results each of 404 "errors", if you don't find anything applicable to you in the first 100, you're not likely to find anything in the remaining nine, so simply disregard them.

Google seems to have an aversion to contextual help, which is too bad... but may rediscover it some day.



Robert Charlton

1:31 am on Oct 26, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



PS: engine.... re your comment above...
How about using this permanent removal process.

I remember from one of our old discussions that this method was not to be used to get 404s out of the index. That appears to be still the case. I remember that there were some elaborate reasons for that, not mentioned in this Google Support article... but I would go with previous and current advice and not use it...

Removals Tool
Temporarily block search results from your site, or manage SafeSearch filtering
[support.google.com...]

When not to use this tool...
...To clean up cruft, like old pages that 404. If you recently changed your site and now have some outdated URLs in the index, Google's crawlers will see this as we recrawl your URLs, and those pages will naturally drop out of our search results. There's no need to request an urgent update.

tangor

4:11 am on Oct 26, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If it's a 404 I don't usually get hot and bothered ...

If g (or bing) keep asking for pages that disappeared 10 years ago ... yet another 404, I can only wonder what is their business plan.

Either way, it's just a bit of bandwidth for me, wasted time on their end, and likely not sufficient to impact rankings one way or the other.

RedBar

1:54 pm on Oct 26, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm with tangor on this; it's simply annoying on my part. As such, I've checked all my other domains that I have removed information from and 301'd to new sites: all those domains are clean, not one single old page whatsoever.

In response to Robert
Other questions to consider here are "what's the query?" ...and "where does Google try to send you when you click on the result?". And "did you retain ownership of the old domains?"

What product - a standard widget trade product
Where to - to my customised 404 page which gives the full site navigation
Ownership - I still own both .co.uk and .uk
Google cache - October / November 2014

@MB - Personally I feel 7 years of 404s is long enough surely:-)

engine

3:59 pm on Oct 26, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



..To clean up cruft, like old pages that 404. If you recently changed your site and now have some outdated URLs in the index, Google's crawlers will see this as we recrawl your URLs, and those pages will naturally drop out of our search results. There's no need to request an urgent update.


Well, clearly, in this case, that doesn't appear to be working.

Robert Charlton

6:44 pm on Dec 30, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Well, clearly, in this case, that doesn't appear to be working.

Google removes pages/urls from the index in an iterative way... that is, if it's a 404 rather than a 410, and as long as there's a link to it somewhere on the web, Google will try the url several times to make sure that it wasn't either accidental... or temporary, as might happen with, say, a server glitch or an out of stock page, etc. There's nothing you can do, short of using a 410, to communicate to Google that it was intentionally removed.

Google will try repeatedly... and it never forgets. My thought is that there are certain times when Google wants to go all the way back to square one... or whatever it currently regards as square one... and re-spiders the web from that point, to establish something like a new reset point.

Again, the 410 is the best way I know to get Google to stop trying a particular url... and even then, if there's an existing link still pointing at that url, from time to time it might pop up again.

Robert Charlton

3:52 am on Dec 31, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Where to - to my customised 404 page which gives the full site navigation

Another possibility, worded in perfectly good English, but unfortunately leaving some ambiguity about how this has been set up on a server, as there are several ways to accomplish this.

RedBar... please know that even if you've got this set up completely "right", since we're giving the topic some attention, it can often be helpful to have a thread that covers the topic in several situations for a variety of user expertise, and it's with that spirit that I'm digging further here. Several questions (some of which might be redundant) come to mind...

- are you running on Apache, or are you running IIS?
- on your final customized error page, the one with the full site navigation, what server header response is returned?
- and what URL is displayed in the address bar for that page?

If you like, please include the code used in your server configuration file... eg, .htaccess or equivalent... to call up the error page.
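The header and URL questions above can also be checked from a script. The following is a rough sketch, not a definitive test: it sends one HEAD request to an old URL without following redirects and reports the status code and any Location header. The function names and the `example.uk` URL are hypothetical, and some servers answer HEAD differently from GET.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location=None):
    """Summarise what a status code means for a page that no longer exists."""
    if status == 410:
        return "410 Gone: an explicit signal that the page was removed on purpose"
    if status == 404:
        return "404 Not Found: a normal response for a missing page"
    if status in (301, 302, 303, 307, 308):
        return (f"{status} redirect to {location}: the error page is reached "
                "via a redirect, so check what that URL returns")
    if status == 200:
        return "200 OK: a missing page that answers 200 is a 'soft 404'"
    return f"{status}: unexpected for a removed page"


# Usage (hypothetical URL, needs network access):
# status, location = fetch_status("https://example.uk/some-old-page.html")
# print(describe(status, location))
```

A redirect or a 200 here would suggest that the custom error page is not being served with a 404 status on the originally requested URL.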