Forum Moderators: open
Something to the effect of blue-widgets-230.html, blue-widgets-2938.html, red-widgets-2338.html, etc.? It seems that, although the toolbar "guesses" PR based on how far down from / a page is, once true PR is calculated it's more a factor of the link structure than the directory structure.
Or, is it some of both?..
But..when is TOO flat?
In any case, I think back to 10 years ago, visiting a computer novice whose machine was "broken" because he had too many files in the root directory. He didn't understand that you could create subdirectories, and therefore never did. :) All his files were in / (well, \).
I imagine a finicky googlebot who thinks that a site is too messy because it has 15000 files in the root directory instead of nicely sorting them into proper directories. Am I crazy? :)
The performance gain or loss of having that many files in a single directory on a machine is irrelevant to this exercise.
Opinions? What is too flat? Is there such a thing?
It would also get a little messy in your root.
[edit] "PageRank is on a page-by-page basis" so this means that every html will be indexed? or not?
Anyway, I would never do this. At the very least I would use sensibly named subdirectories.
If you have enough PageRank to encourage Googlebot to spider deep through the link structure, then I don't see why you couldn't get indexed.
Does anyone here have experience of thousands of URLs in one directory? (or that might look like they are)
But..when is TOO flat?
Never, I guess: I really can't think of any good reason why Googlebot should have any problem with a "flat" site. I don't think a site with many subdirs is bad, either, as long as the link structure is good and the URLs fit in your browser. ;)
In fact,
Internet Explorer has a maximum uniform resource locator (URL) length of 2,083 characters, with a maximum path length of 2,048 characters. This limit applies to both POST and GET request URLs.
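For anyone generating long URLs programmatically, a quick sanity check against those quoted limits might look like the sketch below (the limit values come straight from the quote above; the example URLs are made up for illustration):

```python
# Sanity-check a URL against Internet Explorer's documented limits:
# 2,083 characters total, 2,048 characters for the path.
from urllib.parse import urlparse

IE_MAX_URL_LEN = 2083
IE_MAX_PATH_LEN = 2048

def fits_in_ie(url: str) -> bool:
    """Return True if the URL is within IE's total and path length limits."""
    path = urlparse(url).path
    return len(url) <= IE_MAX_URL_LEN and len(path) <= IE_MAX_PATH_LEN

print(fits_in_ie("http://example.com/blue-widgets-230.html"))  # True
print(fits_in_ie("http://example.com/" + "a" * 3000))          # False
```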
However,
RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1 [ietf.org], does not specify any requirement for URL length.
:)
Does anyone here have experience of thousands of URLs in one directory? (or that might look like they are)
i have millions of pages in one directory. google has indexed about 120,000 of them, those that have a relevant amount of inbound links and PR (the rest is a bunch of pages of lesser importance).
no problems whatsoever. i think i have noticed a slight preference of googlebot on pages with shorter filenames on my site. so all pages in the root directory shouldn't be a disadvantage IMO. however, as ciml points out, how many of your pages will be indexed will rather depend on PR issues.
no problems whatsoever. i think i have noticed a slight preference of googlebot on pages with shorter filenames on my site. so all pages in the root directory shouldn't be a disadvantage IMO. however, as ciml points out, how many of your pages will be indexed will rather depend on PR issues.
When you say filenames, do you mean filenames? hehe, sorry. I just want to be clear..you mean short filenames contained in that one flat directory, right?
Interesting.
Reading what GG has to say about dynamic URLs [webmasterworld.com]... and thinking of how easy it is to create one page with many dynamic URLs, you can maybe see why they encourage showing dynamic URLs rather than something that looks like a static file. :) Well, it's a bit 2+2...
I see the use of folders as good for humans, at least... I don't see why Google should bother.
The site structure that is important to Google is the site link structure, not the site directory structure.
Of course, a popular way to organize your site is to have your site link structure shadow your site directory structure. If you are doing this, then we are really discussing the same issue.
But the question of how flat a site link structure can be is an important one we have had to deal with. It arises for us when the data we work with is inherently flat (such as a big list of names) and there are a large number of pages to present. In this regard, it comes down to this:
Yes, a site (link) structure can be too flat.
If the number of child pages for a parent page is greater than the number of links that googlebot will successfully read off that page, then the site structure is too flat.
How many links can a page have? Here's how you can get a handle on that. Technically, the maximum page size visible to Google is 100K. So the question is: how many links can you put on the page so that the total page size will not exceed 100K? If the rest of the page content is brief, you use relative addressing, your page names are short, and your link descriptions are short, you will be able to maximize this number. My experience shows 1000-2000 links are reasonably possible. Extreme measures (no non-link content, a simple page, and short page names and descriptions) could do a lot better.
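The arithmetic behind that estimate is easy to sketch. Assuming roughly 55 bytes per link (a short relative href plus short anchor text) and a couple of kilobytes of boilerplate markup, both of which are illustrative guesses rather than measured figures:

```python
# Rough estimate of how many links fit in the 100K page size visible to Google.
PAGE_BUDGET = 100 * 1024   # 100K visible to Googlebot
OVERHEAD = 2 * 1024        # assumed head/boilerplate markup
BYTES_PER_LINK = 55        # assumed: e.g. <a href="blue-widgets-230.html">blue widgets 230</a>

max_links = (PAGE_BUDGET - OVERHEAD) // BYTES_PER_LINK
print(max_links)  # ~1800, within the 1000-2000 range mentioned above
```

Shorter filenames and anchor text push the per-link byte cost down and the link count up, which is why extreme measures can do a lot better.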
So, if this works out to 1000 links per page for you, then a site of a million pages will require only two levels below the home page. The home page links to 1000 intermediate pages and each of them links to 1000 final pages.
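That depth calculation generalizes: with L links per page, each level multiplies the number of reachable pages by L. A minimal sketch (function name and figures are just for illustration):

```python
def levels_needed(total_pages: int, links_per_page: int) -> int:
    """Number of levels below the home page needed to link to every page,
    if each page carries links_per_page outgoing links."""
    levels, reachable = 0, 1
    while reachable < total_pages:
        reachable *= links_per_page
        levels += 1
    return levels

print(levels_needed(1_000_000, 1000))  # 2: home -> 1000 hubs -> 1,000,000 pages
print(levels_needed(1_000_000, 100))   # 3: a smaller link budget needs one more level
```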
[edited by: Jack_Straw at 2:07 am (utc) on Sep. 15, 2002]
savvy1, since muesli's pages all reside in the same dir, I guess he's referring to the filename (not pathname) length.
If I may ask, what's your site about?
My site is a community and every registered user gets his own little website.
When you say filenames, do you mean filenames? hehe, sorry. I just want to be clear..you mean short filenames contained in that one flat directory, right?
I mean it in a more general way: I have noticed Googlebot not crawling certain pages. When I shortened the URL, suddenly it did. No real empirical data. The experience is also from other parts of my site, not the million pages. The URLs in question were all dynamic.