
Forum Moderators: Robert Charlton & goodroi


Why does Google value Directory A versus Directory B?

7:14 pm on Sep 11, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 14, 2008
posts: 80
votes: 4

Hello again!

It's been a while since I've started my own thread, but this concern has been an ongoing issue for the last two years and we could really use some outside perspective. I'd appreciate any brainstorming, suggestions or general ideas, especially from the old-timers like myself ;-) (this is my 2nd account; I lost my first one from waaaaay long ago). Here are the basics:
*Note: Numbers are fictitious and are used just for simplicity's sake.

Brief Overview of Website
- This website has a high volume of pages, many backlinks, is made up of User Generated Material & augmented with information from verified sources, and has been online since around 2000.
- The website has three primary sections, but only two are in question. Let's call them Directory A and Directory B.
- Directory A is the 'main' directory. It contains, say, 10,000 pages that act as a hub for all the pages contained in Directory B. Directory B contains, let's say, 500,000 pages.
- Directory A URL = /directoryA/UniqueID
- Directory B URL = /directoryB/UniqueID
- Directory B honestly contains the most important information to our users. In other words, people searching can be looking for Directory A information, but 90% of the time are looking for information on the pages in Directory B.

The Problems/Challenges
- Directory A has a 99% index rate and monopolizes about 90% of our crawl rate.
- Directory B is not being crawled nearly as much, and it has gone from a 98% index rate to bleeding indexed pages until we are now down to 0.5%.
- The weird thing is that all of the pages contained in Directory B are unique text (there are very near duplicates based on the type of information these pages contain), while Directory A pages (as they are a hub) are mainly filled with generic text. The value to our users is clearly in Directory B pages, but Google continues to value Directory A over Directory B.
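
For anyone wanting to put numbers on a crawl split like the one described above, here is a minimal sketch in Python. It assumes you have already extracted the request paths for verified Googlebot hits from your server logs; the function name `crawl_share` and the sample paths are hypothetical, not something from this site's setup.

```python
from collections import Counter
from urllib.parse import urlsplit

def crawl_share(request_paths):
    """Return the fraction of requests that fall under each top-level directory.

    request_paths: iterable of request paths (query strings allowed),
    assumed to be pre-filtered to Googlebot requests.
    """
    counts = Counter()
    for path in request_paths:
        # Drop the query string, then take the first path segment.
        segments = urlsplit(path).path.strip("/").split("/")
        top = "/" + segments[0] if segments[0] else "/"
        counts[top] += 1
    total = sum(counts.values())
    return {directory: n / total for directory, n in counts.items()} if total else {}

# Hypothetical sample: nine hits on Directory A, one on Directory B.
sample = ["/directoryA/1"] * 9 + ["/directoryB/7?tid=3"]
shares = crawl_share(sample)
```

Tracking the same ratio week over week would show whether any internal linking change actually moves crawl activity toward Directory B.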

Changelog details:
- We fouled ourselves over by changing the url structure of Directory A and Directory B prior to seeing a decline in indexing, but again it has now been 2 years, so the big G should have changed over to our new url structure (and we set up 301 redirects to the new pages). Plus, Directory A is fully indexed and it also went through the url change.
- We fouled ourselves over again by incorrectly handling canonicals and ?tid tracking in urls, which caused duplicate content, but these pages are now all removed from Google's index. Again, enough time has passed for G to catch up.
- We've verified that robots.txt is accurate and that we do not block any of the pages we want indexed.
- We've done a great job (if I do say so myself) on updating sitemaps. With a site this size, we use .xml.gz for children sitemaps.
- We even engaged SearchBros to get search quality ex-googlers insights and followed all the basic clean-up items they have found.
- It's obviously easier for internal links for Directory A (as there are 100x more pages in Directory B) but we have gone through and utilized multiple solutions for making sure that there are at least a few internal links to every page in Directory B.
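
As an illustration of the ?tid cleanup mentioned in the changelog, here is a rough Python sketch of deriving a canonical URL by dropping tracking parameters. The `tid` name comes from the post; the helper `canonical_url`, the `TRACKING_PARAMS` set and the example.com URLs are assumptions for illustration, not the site's actual implementation.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: "tid" is the only tracking parameter that needs stripping.
TRACKING_PARAMS = {"tid"}

def canonical_url(url):
    """Return the URL without tracking parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    # Rebuild the URL with the remaining query parameters and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

clean = canonical_url("https://example.com/directoryB/123?tid=abc")
mixed = canonical_url("https://example.com/directoryB/123?tid=abc&page=2")
```

The result could feed the rel="canonical" tag, so every tracked variant points at one indexable URL.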

I could go on for days with additional information, but know you don't have the time to read every nuance. If you've made it this far thank you for reading! =-) Questions? Clarifications? Random insights or suggestions? Anything is welcome and appreciated at this point.

Thanks all! If we are able to resolve this issue I'll make sure to update this thread with our findings.
6:13 am on Sept 12, 2017 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1066
votes: 108

there are at least a few internal links to every page in Directory B

What about link structure? You say that Directory A acts as a hub, and if the internal linking structure presents Directory A as Section pages, and Directory B as Subsections, that is probably how Google sees it. The directories are irrelevant (they are file-paths, not hierarchy). If you are effectively ranking the pages yourself, Google may just be ranking them as you do.

Link count is also important: if every page in Directory B links back to its "parent" in Directory A, the fact that "there are at least a few internal links to every page in Directory B" may be of little consequence if there are thousands of internal links to every page in Directory A.
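
To make the link-count comparison concrete, here is a small Python sketch that tallies inbound internal links per page. It assumes you can export internal links from a crawl as (source, target) pairs; the function name and the sample URLs are hypothetical.

```python
from collections import Counter

def inbound_link_counts(links):
    """Count distinct internal links pointing at each target page.

    links: iterable of (source_url, target_url) pairs from a site crawl.
    Duplicate pairs and self-links are ignored.
    """
    counts = Counter()
    for source, target in set(links):
        if source != target:
            counts[target] += 1
    return counts

# Hypothetical sample: two B pages link up to their hub, the hub links to one of them.
sample_links = [
    ("/directoryB/1", "/directoryA/1"),
    ("/directoryB/2", "/directoryA/1"),
    ("/directoryA/1", "/directoryB/1"),
]
counts = inbound_link_counts(sample_links)
```

Comparing the typical count for Directory A pages against Directory B pages would show how lopsided the internal linking really is.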
5:02 pm on Sept 13, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 14, 2008
posts: 80
votes: 4

Thank you Wilbur. Very good food for thought. Directory A is made up of section pages, and Directory B pages are indeed the 'meat and potatoes' that people are looking for.

Definitely correct with the link count. And your suggestion does correspond to a push we are about to commit where we will be linking all of Directory B pages from Directory A (with a maximum 3 clicks to get to the actual page contained in Directory B). I'll make sure to update this thread post-launch to report on:
1.) Googlebot crawl in Directory B pages.
2.) Index changes.
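
As a sanity check on the 3-click target before launch, something like the following Python sketch could flag Directory B pages that sit deeper than intended. It is a plain breadth-first search over an internal link graph; the graph format, function names and sample URLs are assumptions for illustration.

```python
from collections import deque

def click_depths(graph, start):
    """Breadth-first search: minimum number of clicks from start to each reachable page.

    graph: dict mapping a page URL to the list of pages it links to.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def pages_beyond(graph, start, all_pages, max_clicks=3):
    """Return pages that are unreachable or more than max_clicks from start."""
    depths = click_depths(graph, start)
    return sorted(p for p in all_pages if depths.get(p, max_clicks + 1) > max_clicks)

# Hypothetical sample graph starting from one Directory A hub page.
graph = {
    "/directoryA/1": ["/directoryA/1/p2"],
    "/directoryA/1/p2": ["/directoryB/10"],
    "/directoryB/10": ["/directoryB/11"],
    "/directoryB/11": ["/directoryB/12"],
}
b_pages = ["/directoryB/10", "/directoryB/11", "/directoryB/12", "/directoryB/99"]
too_deep = pages_beyond(graph, "/directoryA/1", b_pages)
```

Here `/directoryB/12` is four clicks away and `/directoryB/99` is not linked at all, so both would be flagged.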
1:57 am on Sept 26, 2017 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 14, 2008
posts: 80
votes: 4

This is interesting. Seems like even though I can verify specific pages in Directory B have been indexed (using a site: query), the search console continues to say "-" (aka 0) pages indexed. I look forward to the new beta search console release so perhaps we get more insight into why Search Console says pages aren't indexed even though we can verify that they are via a site query.

Will continue to update this thread as I find out more details in the hopes that this will help others.
11:13 am on Sept 26, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
votes: 781

Very nice overview of ... what?

What is the solution sought? A long litany of fails, missteps, and boo boos. What is the end desire?

A group of URLs (Dir A) scores well. The actual content is in Dir B? Promote that (as you should, it is the CONTENT) and don't worry about A.

Foot shooting is part of the web. I've done it myself from time to time.

View it from the USER point of view first. G certainly will (as will B and others, too).