Forum Moderators: Robert Charlton & andy langton & goodroi

Why does Google value Directory A versus Directory B?

     
7:14 pm on Sep 11, 2017 (gmt 0)

Junior Member

5+ Year Member

joined:Aug 14, 2008
posts: 69
votes: 3


Hello again!

It's been a while since I've started my own thread, but this concern has been an ongoing issue for the last two years and we could really use some outside perspective. I'd appreciate any brainstorming, suggestions or general ideas especially from the old-timers like myself ;-) (this is my 2nd account- I lost my first one from waaaaay long ago). Here's the basics:
*Note: Numbers are fictitious and are used just for simplicity's sake.

Brief Overview of Website
- This website has a high volume of pages, many backlinks, is made up of User Generated Material & augmented with information from verified sources, and has been online since around 2000.
- The website has three primary sections, but only two are in question. Let's call them Directory A and Directory B.
- Directory A is the 'main' directory. It contains, say, 10,000 pages that act as hubs for all the pages contained in Directory B. Directory B contains, let's say, 500,000 pages.
- Directory A URL = /directoryA/UniqueID
- Directory B URL = /directoryB/UniqueID
- Directory B honestly contains the most important information to our users. In other words, people searching can be looking for Directory A information, but 90% of the time are looking for information on the pages in Directory B.

The Problems/Challenges
- Directory A has a 99% index rate and monopolizes about 90% of our crawl rate.
- Directory B is not being crawled nearly as much, and it went from a 98% index rate to bleeding indexed pages until we are now down to 0.5%.
- The weird thing is that all of the pages contained in Directory B are unique text (there are very near duplicates based on the type of information these pages contain), while Directory A pages (as they are a hub) are mainly filled with generic text. The value to our users is clearly in Directory B pages, but Google continues to value Directory A over Directory B.

Changelog details:
- We fouled ourselves over by changing the URL structure of Directory A and Directory B prior to seeing a decline in indexing. But again, it has now been 2 years, so the big G should have changed over to our new URL structure (and we set up 301 redirects to the new pages). Plus, Directory A is fully indexed and it also went through the URL change.
- We fouled ourselves over again with incorrectly handling canonicals and ?tid tracking in urls which caused duplicate content but these pages are now all removed from Google's index. Again, time has passed enough for G to catch up.
- We've verified that robots.txt is accurate and that we do not block any of the pages we want indexed.
- We've done a great job (if I do say so myself) on updating sitemaps. With a site this size, we use .xml.gz for the child sitemaps.
- We even engaged SearchBros to get search quality ex-googlers insights and followed all the basic clean-up items they have found.
- Internal linking is obviously easier for Directory A (as there are 100x more pages in Directory B), but we have gone through and utilized multiple solutions for making sure that there are at least a few internal links to every page in Directory B.
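
For anyone following along, here is a rough sketch of the kind of sitemap setup described above: URLs split into gzipped child sitemaps plus one sitemap index that lists them. The base URL, the "directoryB" file prefix and the function names are placeholders I made up for illustration, not our real ones. The 50,000 figure is the per-file URL limit from the sitemaps.org protocol.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file URL limit in the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


def child_sitemap_xml(urls):
    """Build the XML text of one child sitemap."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<urlset xmlns="%s">' % NS]
    lines += ["  <url><loc>%s</loc></url>" % escape(u) for u in urls]
    lines.append("</urlset>")
    return "\n".join(lines)


def sitemap_index_xml(child_urls):
    """Build the XML text of the sitemap index that lists the children."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<sitemapindex xmlns="%s">' % NS]
    lines += ["  <sitemap><loc>%s</loc></sitemap>" % escape(u) for u in child_urls]
    lines.append("</sitemapindex>")
    return "\n".join(lines)


def build(urls, base="https://www.example.com/sitemaps/", prefix="directoryB"):
    """Return {filename: bytes}: gzipped children plus one uncompressed index.

    `base` and `prefix` are hypothetical placeholders.
    """
    files = {}
    child_urls = []
    for n, part in enumerate(chunk(urls), start=1):
        name = "%s-%d.xml.gz" % (prefix, n)
        files[name] = gzip.compress(child_sitemap_xml(part).encode("utf-8"))
        child_urls.append(base + name)
    files[prefix + "-index.xml"] = sitemap_index_xml(child_urls).encode("utf-8")
    return files
```

Keeping one index per directory (rather than one for the whole site) would make it easier to compare submitted versus indexed counts for Directory A and Directory B separately, assuming the reporting tool breaks numbers out per sitemap.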

I could go on for days with additional information, but know you don't have the time to read every nuance. If you've made it this far thank you for reading! =-) Questions? Clarifications? Random insights or suggestions? Anything is welcome and appreciated at this point.

Thanks all! If we are able to resolve this issue I'll make sure to update this thread with our findings.
6:13 am on Sept 12, 2017 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 974
votes: 77


there are at least a few internal links to every page in Directory B


What about link structure? You say that Directory A acts as a hub, and if the internal linking structure presents Directory A as Section pages, and Directory B as Subsections, that is probably how Google sees it. The directories are irrelevant (they are file-paths, not hierarchy). If you are effectively ranking the pages yourself, Google may just be ranking them as you do.

Link count is also important: if every page in Directory B links back to its "parent" in Directory A, the fact that "there are at least a few internal links to every page in Directory B" may be of little consequence if there are thousands of internal links to every page in Directory A.
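
If you have a crawl of your own site, the imbalance is easy to measure. Below is a minimal sketch, assuming you can export the internal link graph as a mapping of page path to the paths it links to (the `links` structure and the function names are my own invention for illustration). It averages internal inlinks per page by top-level directory and computes click depth from the home page with a breadth-first search.

```python
from collections import defaultdict, deque


def section(path):
    """Top-level directory of a path, e.g. '/directoryB/10' -> 'directoryB'."""
    first = path.strip("/").split("/", 1)[0]
    return first or "(root)"


def average_inlinks_by_section(links):
    """links: dict mapping a page path to the list of paths it links to.

    Counts each linking page once per target and ignores self-links.
    """
    inlinks = defaultdict(int)
    for src, targets in links.items():
        inlinks[src] += 0  # make sure pages with no inlinks are still counted
        for target in set(targets):
            if target != src:
                inlinks[target] += 1
    per_section = defaultdict(list)
    for page, count in inlinks.items():
        per_section[section(page)].append(count)
    return {name: sum(c) / len(c) for name, c in per_section.items()}


def click_depth(links, start="/"):
    """Fewest clicks needed to reach each page from `start` (breadth-first search)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth
```

Pages that never appear in the `click_depth` result are unreachable by links from the start page, which is worth checking for Directory B before looking at the averages.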
5:02 pm on Sept 13, 2017 (gmt 0)

Junior Member

5+ Year Member

joined:Aug 14, 2008
posts: 69
votes: 3


Thank you Wilbur. Very good food for thought. Directory A is made up of section pages, and Directory B pages are indeed the 'meat and potatoes' that people are looking for.

Definitely correct on the link count. And your suggestion does correspond to a push we are about to commit, where we will be linking to all Directory B pages from Directory A (with a maximum of 3 clicks to get to the actual page contained in Directory B). I'll make sure to update this thread post-launch to report on:
1.) Googlebot crawl in Directory B pages.
2.) Index changes.
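
For item 1, here is a rough sketch of how the Googlebot crawl split could be tallied from server logs. It assumes access logs in the common "combined" format and matches on the user-agent string only, so it would also count anything merely claiming to be Googlebot; the function names are just placeholders.

```python
import re
from collections import Counter

# Assumes combined log format: "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_by_directory(lines):
    """Count requests whose user-agent mentions Googlebot, by top-level directory."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        path = match.group("path").split("?", 1)[0]  # drop any query string
        directory = path.strip("/").split("/", 1)[0] or "(root)"
        hits[directory] += 1
    return hits


def crawl_share(hits):
    """Convert raw hit counts into each directory's fraction of the total."""
    total = sum(hits.values())
    return {name: count / total for name, count in hits.items()} if total else {}
```

Running this over the same length of log window before and after the launch should show whether Directory B's share of the crawl actually moves.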