Forum Moderators: Robert Charlton & goodroi


Do you think I'm heading for a loss at all with my 301 clean up?


shaunm

8:53 am on Apr 26, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hello All,

I need your help with how you would go about validating a particular backlink. I've been cleaning up my redirect file, as it has become unmanageable with so many rules. I've noticed hundreds of 301 URLs with backlinks on highly authoritative sites/pages that haven't had a single hit in the last six months.

Leaving link metrics aside: if a particular URL hasn't had a hit in over six months, especially from Googlebot, do you think it's still worth keeping the redirect rule in my .config file? My thinking is that if Google hasn't crawled a third-party page that I take to be highly authoritative, and so never finds my link there, then Google must consider that external page 'low value', even if DA, PA, TF and every other link metric say otherwise. After all, Google's crawl frequency depends mostly on a site's PR.
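For context, this is roughly how I'm counting hits - a small sketch (not my actual production script) that parses W3C-format IIS logs, reads the column layout from the #Fields directive, and tallies Googlebot requests per old URL. The sample paths below are made up:

```python
# Sketch: tally Googlebot hits per old URL in W3C-format IIS logs.
# The column layout is read from the "#Fields:" directive, so the
# function adapts to whatever fields the server is configured to log.

def googlebot_hits(log_lines, old_urls):
    """Return {url: count} of Googlebot requests for each old URL."""
    hits = {url: 0 for url in old_urls}
    fields = []
    for line in log_lines:
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # e.g. ['date', 'time', 'cs-uri-stem', ...]
            continue
        if line.startswith("#") or not line.strip():
            continue  # skip other directives and blank lines
        row = dict(zip(fields, line.split()))
        if (row.get("cs-uri-stem") in hits
                and "Googlebot" in row.get("cs(User-Agent)", "")):
            hits[row["cs-uri-stem"]] += 1
    return hits
```

Anything sitting at zero over the six-month window is what I'm thinking of dropping.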

So, say I drop such 301 rules, since I haven't had a single referral visit or bot hit in over six months. Am I heading for any loss, or nothing at all? Also, does how often Google visits a page relate directly to the link juice it passes to the links on it?

Thanks much for the clarification!

aakk9999

12:38 pm on Apr 26, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My thinking is that if Google hasn't crawled a third-party page that I take to be highly authoritative, and so never finds my link there

I am wondering how you know this. How do you know that Google has not crawled the third-party page on which your link sits? The fact that it did not request your linked-to page does not mean that it did not crawl the third-party page where the link to your page is, and see that link.

shaunm

1:44 pm on Apr 26, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks :-)

That's a critical question for me.
How do you know that Google has not crawled the third-party page on which your link sits?
OK, what do you think keeps Google from crawling my link? There's no 'nofollow' at either the page level or the link level. Isn't Google supposed to crawl all the links on a page, given there aren't that many outbound links and it's unlikely to be running out of budget? On a side note, are external links counted when calculating a site's crawl budget?!

The fact that it did not request your linked-to page does not mean that it did not crawl the third-party page where the link to your page is, and see that link.
Going by the brand, popularity and authority of that third-party page, it's a great place for a backlink. So, with no hits for my URL in my IIS logs, I concluded that it isn't getting crawled. Then again, if Google crawled that third-party page but for some strange reason not mine, how would the link juice flow from that page to my URL? I'm just confused, as I'd never considered the scenario you describe.

Can you please explain? Thanks!

aakk9999

2:41 pm on Apr 26, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think we may be talking at cross purposes :)

Of course you DO know whether Googlebot crawled your own page, because you have your IIS logs - at least for the period your logs cover.

Regarding the other site's page: if it has been indexed, and the SERPs do not show the message that it is blocked by robots.txt, then it has been crawled. If it has not been indexed, you cannot know for sure whether it has been crawled.

For the avoidance of doubt, search for a sentence in quotes taken from the third-party page where your link sits - if the page shows in the SERPs for the quoted text (i.e. it is indexed), then almost certainly it has been crawled too.

Regarding your own page, try the same: put a sentence from your page in quotes into Google and search for it. If it shows your page, then almost certainly it has been crawled.

If you do not find a Googlebot hit in your IIS logs, then the page has not been crawled in the period you are looking at. Running out of crawl budget is one reason why this may happen; another may be that Google considers it a low-value page and therefore does not visit it often.

If the third-party page carrying the link is a low-value page, I would not think twice about removing the redirect (and consequently serving a 404 to Googlebot the next time it requests the old URL). But since you say the third-party page is on an authoritative and popular site, I would not risk removing the redirect, because link juice does not just arrive at that page of your site - from there it circulates to other pages on your site.

Again, if it is only a single such link, removing the redirect will probably not make much difference. But if you have a lot of such cases (an authoritative third-party page linking to content of yours that Googlebot has not crawled for a while), then out of prudence I would probably not remove these - or I would remove them slowly, one by one, watch my logs, and when Googlebot eventually does recrawl the old URL (which now gives a 404), watch for some time afterwards for any adverse effects.
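To make the "remove one by one and watch the logs" idea concrete, something like this could flag the first Googlebot 404 per removed URL, giving you a date from which to watch for adverse effects. Only a sketch - the field layout is whatever your #Fields line says, and the URLs are placeholders:

```python
# Sketch: after removing redirects, record the date of the first
# Googlebot request that received a 404 for each removed URL, so any
# ranking change afterwards can be correlated with that date.
# Field names follow the W3C/IIS log convention; the exact columns
# logged on a given server may differ.

def first_404_recrawl(log_lines, removed_urls):
    """Return {url: date} of the first Googlebot 404 per removed URL."""
    first_seen = {}
    fields = []
    for line in log_lines:
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names from the directive
            continue
        if line.startswith("#") or not line.strip():
            continue  # skip other directives and blank lines
        row = dict(zip(fields, line.split()))
        url = row.get("cs-uri-stem")
        if (url in removed_urls
                and url not in first_seen
                and "Googlebot" in row.get("cs(User-Agent)", "")
                and row.get("sc-status") == "404"):
            first_seen[url] = row.get("date", "")
    return first_seen
```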

aristotle

4:39 pm on Apr 26, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



googlebot usually doesn't "follow" the links it finds, at least not directly. Instead, it creates a record of the link and eventually visits the linked-to page later, but normally doesn't tell you where it found the original link.

Walt Hartwell

5:59 pm on Apr 26, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



I need your help with how you would go about validating a particular backlink. I've been cleaning up my redirect file, as it has become unmanageable with so many rules. I've noticed hundreds of 301 URLs with backlinks on highly authoritative sites/pages that haven't had a single hit in the last six months.

Leaving link metrics aside: if a particular URL hasn't had a hit in over six months, especially from Googlebot, do you think it's still worth keeping the redirect rule in my .config file?


This sounds to me like you are redirecting hundreds of URLs on your site - URLs with outside links pointing at them - to other, unspecified pages on your site. If those original pages were authoritative, and the redirect targets carry the exact same content in a new location, Google shouldn't have an issue with it.
If the redirects point to pages with significantly different content from the originals, then Google may consider it a "sneaky redirect". They mention it specifically in their guidelines.

tangor

8:29 pm on Apr 26, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What are you redirecting?

301 is usually used to tell a search engine, browser, or linking site that content that appeared at example.com/oldpage has MOVED to example.com/newpage

This has nothing to do with third-party links external to your site. Any links from a third party to your site will also see that redirect, so you need do nothing further to keep that relationship alive.
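For illustration, one such rule in an IIS web.config (URL Rewrite module) looks something like this - "old-page" and "new-page" are placeholders, not real paths:

```xml
<!-- One 301 rule in IIS URL Rewrite. Hundreds of entries like this
     are what makes a redirect file unmanageable. -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Old page moved" stopProcessing="true">
        <match url="^old-page$" />
        <action type="Redirect" url="/new-page" redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```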

lucy24

8:38 pm on Apr 26, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



if a particular URL hasn't had a hit in over six months, especially from Googlebot, do you think it's still worth keeping the redirect rule in my .config file?

The day after you remove the redirect, the Googlebot will crawl the old URL. Fact.

If there is a typo in your redirect and you fix it after one hour, the Bingbot will have crawled the incorrect URL in the meantime. Fact.

;)

:: detour to raw logs ::

Picking at random one rarely-visited, almost-never-changing page in an obscure subdirectory on a low-traffic site (er, mine): The Googlebot has requested the page 60 times in 5 years, which averages out to once a month. So are you sure about the "last 6 months"?

Unless you have thousands upon thousands of lines of redirect-- which is awfully unlikely unless your redirects themselves are badly written-- it can't possibly affect server performance.

shaunm

6:59 am on Apr 27, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks @aakk9999
Running out of crawl budget is one reason why this may happen; another may be that Google considers it a low-value page and therefore does not visit it often. But since you say the third-party page is on an authoritative and popular site, I would not risk removing the redirect, because link juice does not just arrive at that page of your site - from there it circulates to other pages on your site

Yeah, that's what we wanted to accomplish: letting the link juice circulate to other pages. But the lack of hits suggests that Google may consider either the linking page or the linked-to page low value. In either case, I don't really see a reason to keep that rule in the .config file. If Google isn't finding my link on that third-party page, it's unlikely to be passing any link juice (my opinion). Is my assumption correct?

then out of prudence I would probably not remove these - or I would remove them slowly, one by one, watch my logs, and when Googlebot eventually does recrawl the old URL (which now gives a 404), watch for some time afterwards for any adverse effects.

Good one. But in my case I don't think it's possible to trace whether any 'adverse effects' are due to the redirect removal - we're a Fortune 500 company with many quality editorial backlinks from .edu, .gov, .org and other sites, plus Google's usual ups and downs.

Thanks @aristotle
googlebot usually doesn't "follow' the links it finds, at least not directly. Instead, it creates a record of the link and eventually visits the linked-to page later, but normally doesn't tell you where it found the original link.
Isn't how often Google visits a page based on its PR? In that case, it would seem that Google didn't crawl my URL because it considers it to be of low value.

Thanks @Walt Hartwell
If the redirects point to pages that have significantly different content than the original pages, then Google may consider it a "sneaky redirect". They mention it specifically in their guidelines.
I think you misunderstood the original intention of the question. It's not about whether a redirect is sneaky or genuine (it's genuine); it's about discontinuing a URL that has a few authoritative backlinks but no hits whatsoever over a period of six months.

Thanks @tangor
thus you need do nothing further to keep that relationship alive.
OK. 404s are perfectly acceptable web standards too, as far as Google is concerned. A visitor lands on page A, which links to my site, clicks my link, and gets a 404 - that's bad for user experience and referral traffic. Or Google crawls my URL from some other source and gets a 404 - again bad, because it could be finding my links on external sites, and so I'm losing some link love. But if you read my question, it's about a hit-or-miss situation. And I don't have an alternative URL that I can ask the external sites to use instead.

Thanks @lucy24
The day after you remove the redirect, the Googlebot will crawl the old URL. Fact.

If there is a typo in your redirect and you fix it after one hour, the Bingbot will have crawled the incorrect URL in the meantime. Fact.
Haha. As usual, a witty and profound answer. How does G know that I removed the redirect?

So are you sure about the "last 6 months"?
Yes - didn't you help me with the 'what's-so-great-about-log-files' thing? :-)

Unless you have thousands upon thousands of lines of redirect-- which is awfully unlikely unless your redirects themselves are badly written-- it can't possibly affect server performance.
Yeah, that's the intention - cleaning up the bad redirects along with the unnecessary ones.