Forum Moderators: Robert Charlton & goodroi

Does google decide not to rank really deep pages?

SmAce

9:05 am on Nov 17, 2011 (gmt 0)

10+ Year Member



Hiya,

I am doing some work on a website whereby the URL has up to 9 subdirectories which are essentially the CMS's way of doing search.

An example URL is:
www.domain.com/dir1/dir2/a/a/a/a/a/a/a/

The a's are various search segments that need to be there to ensure the correct content is showing on the page.

However, these URLs are not showing up in the site: operator results, and I'm wondering whether Google is dismissing them because they are too long?

toplisek

9:21 am on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is no official explanation for this.

Marketing Guy

10:50 am on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Depends on the size of the site - I've seen very large sites have trouble ranking deep pages. Generally down to internal link structure / link equity rather than an indexing issue though.

toplisek

11:03 am on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What do you mean by deep pages?
Pages with too many internal links to reach them, or deep pages in the sense of too many directories in the URL?

SmAce

11:08 am on Nov 17, 2011 (gmt 0)

10+ Year Member



Thanks guys.

Marketing Guy - the site is around 450 pages, but it's growing every day.

toplisek - I mean deep directories, as in the URL gives the impression the content is 9 directories deep... so is it seen as pretty unimportant?

toplisek

11:29 am on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I use the following:
en-GB/products-sales/category/subcategory/brandname/page1.html

As you can see, the deepest pages are at the 5th level, which seems to be tolerated. If I understand correctly, you have 9 levels. In my opinion that won't be tolerated, though I have never found any article banning such a structure.

Let us know if this was helpful.

Marketing Guy

11:46 am on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



450 pages shouldn't cause any issues with crawl depth - I was talking more in the region of a million pages. :)

This raises the question - why does a 450-page site require such a directory structure?

Blogs only need /category/postname, forums generally need /forum/category/thread, ecommerce sites maybe a bit more categorisation with /category/productrange/brand/product.

You say the subdirectories are search segments - could it be the case that a number of your pages are versions of other pages with similar content?

For example you might have;

> Blue widgets category (4 products)
>>> Fluffy blue widgets (2 products)
>>> Ugly blue widgets (2 products)

In this case the parent category is essentially an amalgamation of the two child categories. This small example isn't a big deal, but if you're pulling off a similar thing on a larger scale to cover multiple category / product / brand / etc variations, then it's likely some of your deep content won't rank.

Is a lot of your content duplicated across multiple pages?

topr8

12:01 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>>why does a 450 page site require such a directory structure?

i would assume because this is the way the internal redirect is managed and the directories are in fact just parameters ... this would be the fault of the cms, because if this is the case then:

www.example.com/dir1/dir2/a/a/a/a/a/a/a/

could just as easily be

www.example.com/dir1/dir2/a-a-a-a-a-a-a

however if that is the way it is set up, then that is how it is.
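as a rough sketch of that point (assuming the trailing segments really are just parameters passed to the cms - the two-segment prefix and the urls are hypothetical), the flattening could look something like this:

```python
# hypothetical sketch: treat everything after the first two directories
# as search parameters and collapse them into a single hyphenated slug
from urllib.parse import urlsplit, urlunsplit

def flatten_search_path(url):
    """Collapse /dir1/dir2/a/a/... into /dir1/dir2/a-a-... (illustrative only)."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    head, params = segments[:2], segments[2:]
    new_path = ("/" + "/".join(head + ["-".join(params)])) if params else parts.path
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))

print(flatten_search_path("http://www.example.com/dir1/dir2/a/a/a/a/a/a/a/"))
# prints http://www.example.com/dir1/dir2/a-a-a-a-a-a-a
```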

toplisek

12:03 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Content duplicated across multiple pages is an issue with the INTERNAL link (content) structure, not with deep pages, which by definition is a matter of directory structure. I agree with Marketing Guy on the other points.

Google sees your URL content. Too many directories combined with duplicate content will not be tolerated...

In the past we had rules of thumb, like keyword limits (no more than 20 keywords, or a maximum number of characters in the Title). I have never seen an official rule defined for deep pages. .COM domains are checked by search engines when you register them - that is also not officially explained...

scooterdude

2:02 pm on Nov 17, 2011 (gmt 0)

10+ Year Member



Submit a sitemap.

Marketing Guy

2:31 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Submit a sitemap

Yeah, and completely ignore any potential site architecture problems. A 450-page website shouldn't need an XML sitemap to get indexed properly.

topr8

2:52 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



i agree with marketing guy.
it is certainly possible to get thousands of pages indexed without a sitemap.
we've never used xml sitemaps and haven't so far had an indexing problem.

toplisek

3:03 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



We use sitemaps even for smaller websites.
A sitemap influences indexing speed (it can take 30 days for the WHOLE site to be checked otherwise) and accuracy. I would not advise eliminating this option, even though 450 pages is not too big...
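A minimal sitemap is only a few lines. As a sketch, it can be generated like this (the URLs and the output are placeholders, not the OP's real pages):

```python
# sketch: build a minimal XML sitemap string; the URLs are placeholders
from xml.sax.saxutils import escape

def build_sitemap(urls):
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>"
    )

print(build_sitemap(["http://www.example.com/", "http://www.example.com/dir1/dir2/"]))
```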

Marketing Guy

3:58 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, but the crucial point you are missing is that if the site has any technical issues that might be hampering indexing (and the subsequent link juice distribution, and therefore rankings), using a sitemap to get deep content indexed just masks that problem and does nothing to resolve it.

The absolute worst thing a webmaster can do if their website is suffering from indexing problems is submit an XML sitemap to Google. Best case scenario, it doesn't help - worst case scenario, it makes identifying the real problem much more difficult.

In the case of the OP - if it is indeed duplicated content / partially duplicated content / low PageRank / combination of all 3 that is causing his indexing issues, then using an XML sitemap will not help him. For whatever reason, Google has made the decision to not include certain content within the index - a sitemap isn't a way to circumvent that decision.

Don't get me wrong, sitemaps have their uses, but an easy alternative to understanding the indexing process isn't one of them.

Robert Charlton

5:38 pm on Nov 17, 2011 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



SmAce - As has been suggested, you may have duplicate content issues.

If your CMS is creating different pathnames to a product page depending on your navigational priorities, that would be one kind of duplication.

www.example.com/dir1/dir2/a1/a2/a3/a4/a5/a6/a7/

...is not the same, eg, as...

www.example.com/dir1/dir2/a1/a3/a4/a7/

Any given piece of content should be returned under one URL only.

Your quickest fix, if you have the ability to control meta content on an individual page, would be to use the Canonical Link Element in the head sections...

<link rel="canonical" href="http://example.com/page.html"/>

As a rough rule of thumb, I'd generally pick the shortest pathname as your "canonicalized" form. For some basic info on the Canonical Link Element, see this Matt Cutts blog post [mattcutts.com...] and search WebmasterWorld (use site search) for relevant discussions.
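For what it's worth, that "shortest pathname" rule of thumb could be automated along these lines - a rough sketch, using the made-up duplicate URLs from earlier in this thread:

```python
# rough sketch: among duplicate URLs for the same content, prefer the
# one with the fewest path segments (shortest path string breaks ties)
from urllib.parse import urlsplit

def pick_canonical(duplicates):
    def depth(u):
        path = urlsplit(u).path
        return (len([s for s in path.split("/") if s]), len(path))
    return min(duplicates, key=depth)

print(pick_canonical([
    "http://www.example.com/dir1/dir2/a1/a2/a3/a4/a5/a6/a7/",
    "http://www.example.com/dir1/dir2/a1/a3/a4/a7/",
]))
# prints http://www.example.com/dir1/dir2/a1/a3/a4/a7/
```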

If your CMS is creating different page content or combinations of content on a page depending on your navigation path (eg, if it's a form driven website), then you likely have more complicated issues.

With regard to page depth and indexing... the issue is PageRank distribution, which can fall off to "deeper" pages. Note that the directory path is not what decides "depth" of a page. Rather, the important factor is the number of clicks from the pages that attract inbound links. Most discussions assume that your default home page is going to be your most important link magnet... so the consideration is link depth from home, not directory depth.
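To make the click-depth point concrete, here's a small sketch (the internal-link graph is made up): a breadth-first walk from the home page gives each page's depth in clicks, regardless of how many directories appear in its URL.

```python
# sketch: measure "depth" as clicks from the home page, not directories
# in the URL; the internal-link graph below is hypothetical
from collections import deque

def click_depth(links, start="/"):
    """BFS over a page -> [linked pages] graph; returns clicks-from-start per page."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

links = {
    "/": ["/category/", "/dir1/dir2/a/a/a/a/a/a/a/"],  # home links straight to a deep URL
    "/category/": ["/category/product/"],
}
print(click_depth(links)["/dir1/dir2/a/a/a/a/a/a/a/"])
# prints 1 - the nine-directory URL is only one click from home
```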

Inbound links to deep pages and an intelligent navigation structure can help with PageRank distribution to those inner pages.

Though keywords in a pathname may get highlighted on a serps page and attract the eye, keywords in a path have only a minuscule effect on ranking itself.

It sounds like you should be looking for a new CMS as a long-term solution.