Forum Moderators: Robert Charlton & goodroi


Tons of 301-redirects in website and Googlebot's crawl rate dropped

         

guarriman3

9:33 am on Jan 5, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi,

After deciding to remove all my AMP pages (https://www.webmasterworld.com/google/5021884.htm) and merging thousands of redundant URLs (https://www.webmasterworld.com/google/5019617.htm) of my website, there are currently tons of 301-redirects implemented:
- from 'amp.example.com/whatever' to 'www.example.com/whatever'
- from 'www.example.com/product-X/photos' to 'www.example.com/product-X/main'
- from 'www.example.com/product-X/opinions' to 'www.example.com/product-X/main'

I started making these changes on Dec 23, and on Dec 27 Googlebot's crawl rate, as shown in Search Console's crawl stats, suddenly dropped from 100k/day to 12k/day.

I'm worried that Googlebot has become upset about seeing so many 301-redirects. Any similar experience is welcome. Thank you.

not2easy

2:33 pm on Jan 5, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



From your examples it appears you are redirecting to unrelated .html URLs. You show 'www.example.com/product-X/photos' and 'www.example.com/product-X/opinions' both redirecting to 'www.example.com/product-X/main', which does not look like the content visitors would expect to find. When there is no equivalent content, Google prefers to get a 404 response. Redirecting to completely different content is treated as a soft 404, which is worse for a site than a plain 404. A 410 (Gone) is better, but failing that, use a 404.

The slowdown is likely due to the non-routine findings that they need to sort through. When little changes from one crawl to the next, they move faster. Processing 'new' content slows it down for a while in my experience.

lucy24

6:18 pm on Jan 5, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm worried that Googlebot has become upset about seeing so many 301-redirects.
When Googlebot encounters a lot of redirects, its immediate reaction is to make sure you’re not serving up soft 404s. It does this by requesting files with nonsense names that it can be confident don’t really exist. Your logs should therefore show a steady stream of

66.249.79.xyz - - [19/Apr/2020:20:07:37 -0700] "GET /exftbbclrvcbdsu.html HTTP/1.1" 404 6636 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
where the request is any random string of 15 letters.
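
To look for these probes in bulk, a short script can scan the access log. What follows is only a minimal sketch, not something posted in this thread: it assumes Apache's combined log format and the ".html" extension seen in the sample line, and the function name `find_probes` and the second sample line are made up for illustration.

```python
import re

# Assumption: Apache "combined" log format, as in the sample line above.
# Matches a request for /<15 lowercase letters>.html that was answered with 404.
PROBE = re.compile(r'"GET /([a-z]{15})\.html HTTP/[\d.]+" 404 ')

def find_probes(lines):
    """Return the random-looking names requested by Googlebot that got a 404."""
    names = []
    for line in lines:
        # Cheap user-agent filter; it does not verify that the IP is really Google's.
        if "Googlebot" not in line:
            continue
        match = PROBE.search(line)
        if match:
            names.append(match.group(1))
    return names

sample = [
    # Log line quoted in the post above.
    '66.249.79.xyz - - [19/Apr/2020:20:07:37 -0700] "GET /exftbbclrvcbdsu.html HTTP/1.1" 404 6636 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    # Hypothetical line for an ordinary page, added only to show a non-match.
    '66.249.79.xyz - - [19/Apr/2020:20:08:01 -0700] "GET /product-X/main HTTP/1.1" 200 15320 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

print(find_probes(sample))
```

On a real server the list of lines would come from the access log file instead of `sample`. An empty result over a few days of logs would suggest Googlebot is not running this check on the site.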


Parlor game: Make up sequences of 15 lower-case letters that could plausibly be legitimate URLs.

guarriman3

9:27 pm on Jan 5, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



@not2easy
Thank you very much for your answer
From your examples it appears you are redirecting to unrelated .html URLs.

Maybe I didn't make myself clear. I'm trying to merge the contents of several webpages to consolidate the content.

The contents of 'www.example.com/product-X/photos' and 'www.example.com/product-X/opinions' are now included within 'www.example.com/product-X/main'. That is, the main page of each product now contains that product's photos and opinions. For that reason, I added 301-redirects from 'www.example.com/product-X/photos' to 'www.example.com/product-X/main' and from 'www.example.com/product-X/opinions' to 'www.example.com/product-X/main'.

From what I have read on several websites, and even on WebmasterWorld, 301-redirects are the best option for consolidating the content of my pages. IMHO, if I return 404 responses to visitors of the 'photos' or 'opinions' pages, I'm not giving them a good experience; it's better to redirect them to a page with the same information, along with extra data.

Processing 'new' content slows it down for a while in my experience.

Ok, I'll wait then for a while.

@lucy24
Thank you very much for your answer
It does this by requesting files with nonsense names that it can be confident don’t really exist. Your logs should therefore show a steady stream of

I checked yesterday's Apache logs for 404 responses to Googlebot requests and found no pattern like that. I'll check again in a few days.

lucy24

2:26 am on Jan 6, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you're consolidating pages, be aware that it is possible to redirect to a fragment, so that a human who requests a former page is sent directly to the part of the new page that has the content you want. In Apache, it looks like:

RewriteRule ^oldpage\.html https://www.example.com/newpage.html#fragment [R=301,NE,L]
where the [NE] flag (“no escape”) stops mod_rewrite from escaping the # character to %23, which the browser would not treat as the start of a fragment.

If you have full control over naming your fragments, sometimes it's even possible to do things like

RewriteRule ^(page1|page2|page3)\.html https://www.example.com/newpage.html#$1 [R=301,NE,L]
where you're cleverly using the name of each former page as the exact name of the fragment, so it can all be lumped into one rule. This is something I have actually done, combining multiple thin pages into one slightly plumper one.
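
Before deploying a combined rule like this, it can help to try the pattern against a few old paths offline. The sketch below mimics the rule with Python's `re` module; it is an approximation rather than a mod_rewrite emulator, and the function name `redirect_target` and the non-matching 'page4.html' are invented for the example.

```python
import re

# Same pattern as the RewriteRule above. In .htaccess context Apache matches
# against the path without its leading slash, so the sample paths omit it too.
RULE = re.compile(r'^(page1|page2|page3)\.html')
TARGET = 'https://www.example.com/newpage.html#'

def redirect_target(path):
    """Return the 301 target for an old path, or None if the rule does not apply."""
    match = RULE.search(path)
    if match is None:
        return None
    # match.group(1) plays the role of $1 in the RewriteRule substitution.
    return TARGET + match.group(1)

for path in ('page1.html', 'page3.html', 'page4.html'):
    print(path, '->', redirect_target(path))
```

For patterns this simple, Python's regex syntax and the PCRE syntax Apache uses behave the same way, so a path that returns None here should also fall through the rule on the server.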

guarriman3

7:37 am on Jan 6, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



@lucy24 Thank you again for your answer and smart tip.
If you're consolidating pages, be aware that it is possible to redirect to a fragment, so that a human who requests a former page is sent directly to the part of the new page that has the content you want.

Yes, it makes sense, and I can implement it via PHP within (e.g.) the old 'photos.php' script:

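// Build the target URL; the #photos fragment points at the photos section of the main page.
$newLink = $mybaseURL . "/" . $productLabel . "/main#photos";
// Send the Location header with a 301 status, then stop so nothing else is output.
header("Location: $newLink", true, 301);
exit();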


However, I have two questions (they may be a bit naive):
- Is the link juice of 'www.example.com/product-X/photos' passed to 'www.example.com/product-X/main' if I use the '#' to link to the fragment of the main webpage?
- Can Google consider that I have two different URLs ('www.example.com/product-X/main' and 'www.example.com/product-X/main#photos') and interpret it as duplicate content?

phranque

8:24 am on Jan 6, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



However, I have two questions (they may be a bit naive)

in short:
- yes
- no
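
One way to see why the second answer is "no": the part after '#' is handled by the browser and is not sent in the HTTP request, so both addresses fetch the same resource from the server. The snippet below is only an illustration of that URL-level equivalence using Python's standard library; the helper name `without_fragment` is made up, and it says nothing about how Google processes fragments internally.

```python
from urllib.parse import urldefrag

def without_fragment(url):
    """Drop the #fragment, which the client keeps to itself and never sends to the server."""
    return urldefrag(url).url

main = 'https://www.example.com/product-X/main'
photos = 'https://www.example.com/product-X/main#photos'

print(without_fragment(photos))
print(without_fragment(photos) == without_fragment(main))
```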