helping Googlebot get around your site

how crucial is directory structure?

         

fom2001uk

3:47 pm on Jul 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've always assumed (rightly or wrongly) that the further back (deeper) your pages are in the directory structure, the longer they take to get spidered. How crucial an issue is this still for Google, and at what point do you get into bother (2nd level, 3rd level, 4th level of site structure)?

For example, one client wants to promote a small website but has every page at least 2 levels deep.

domain.com/widgets/widgets.html
domain.com/prices/prices.html

Each of these folders (widgets, prices) has nothing else in it, apart from that 1 page, so no reason to have a folder at all, as far as I can see.

I hate to see this kind of waste and I'd like them to ditch the redundant directory level. Am I right to get so annoyed about this, or does Google cope fine with this nowadays?

Even a short delay in indexing would be reason enough for me to tear up this structure!

trimmer80

9:28 pm on Jul 27, 2004 (gmt 0)

10+ Year Member



always assumed (rightly or wrongly) that
the further back (deeper) your pages are in the directory structure

Maybe I have misunderstood you, but directory structure does not affect crawling. The number of steps (link levels) needed to reach a page does.

For example, a site could have a link from the main page to [example.com...]

This is still only one step, as Googlebot has found the link on the main page.

If directory structure impacted crawling we would all have all of our files in the root directory.

Thus changing the physical location of your pages (without changing your linking structure) is going to be of no benefit.
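
To make the distinction concrete, here is a minimal Python sketch that compares the two notions of depth. The link graph, the URLs, and the helper names `click_depth` and `path_depth` are invented for illustration; this is not a description of how Googlebot actually schedules its crawl.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph: page URL -> URLs that page links to.
LINKS = {
    "http://example.com/": [
        "http://example.com/a/b/c/deep.html",
        "http://example.com/about.html",
    ],
    "http://example.com/about.html": ["http://example.com/team.html"],
    "http://example.com/a/b/c/deep.html": [],
    "http://example.com/team.html": [],
}

def click_depth(links, start):
    """Breadth-first search: fewest link hops from start to each page."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

def path_depth(url):
    """Number of directory levels in the URL path (file name excluded)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if segments and "." in segments[-1]:
        segments = segments[:-1]  # drop the file name
    return len(segments)

depths = click_depth(LINKS, "http://example.com/")
for url, hops in depths.items():
    print(f"{hops} hop(s), {path_depth(url)} directory level(s): {url}")
```

In this made-up graph the page three directories down is a single hop from the home page, while a page sitting in the root directory is two hops away, so the two measures can disagree in either direction.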

caspita

9:43 pm on Jul 27, 2004 (gmt 0)

10+ Year Member



In my opinion this is more related to PR. In my experience, Google uses the directory structure to calculate the PR of dependent pages. As an example, say your www.example.com home page has PR5 and links directly to www.example.com/dir/page.html. If that is the only link to the page, Google will use the parent to calculate its PR, and because the directory structure puts the page two steps below rather than one, it could get PR3 instead of PR4. I'm almost sure that the page would be PR4 if the URL were just www.example.com/page.html.

It is not a confirmed theory, of course, but it is what I have seen happening on my site for pages that have no external link referrals.

CS.

graywolf

9:52 pm on Jul 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Subdirectories are one of the reasons you want deep links, instead of links only to the homepage.

trimmer80

9:53 pm on Jul 27, 2004 (gmt 0)

10+ Year Member



This shouldn't affect PR in this way.
In my experience the transfer of PR is the same no matter what the physical directory level.

steveb

11:14 pm on Jul 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Directory level has nothing to do with anything. Crawling and PR come from linking.

In the first example don't ditch the folders, ditch the html files... domain.com/widgets/

HarryM

10:58 am on Jul 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As Steveb said "Directory level has nothing to do with anything". The linking structure is all that counts in crawling and PR.

For example all my files (except the home page) are in directories only 1 level deep. But Google isn't aware of this, only the linking structure.

/index.php > /dir1/file1.php > /dir1/file2,3,4, etc. File 1 gets the PR of the index page minus 1, and files 2, 3, 4, etc. get the PR of the index page minus 2.

It's still the same if directories are crossed. E.g., /index.php > /dir1/file1.php > /dir2/filex.

matrix_neo

11:19 am on Jul 28, 2004 (gmt 0)

10+ Year Member



In the first example don't ditch the folders, ditch the html files... domain.com/widgets/

Does that help all the files in the directory to be crawled? Any Plus ditching the file name?

jcoronella

10:34 pm on Jul 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The most important factor in getting all your pages crawled is the amount of PR; the second is how you distribute that PR. There is a point at which PR trails off and Google stops paying a visit.

There is a trade-off between giving your site a strongly pyramid-shaped structure that focuses the PR at the top, and using a grid topology that steals the 'bang' from the top pages but distributes the PR to the lower pages.

Removing the ".html" isn't going to do anything for you in Google.
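
The trade-off can be illustrated with a toy calculation. The sketch below runs the simplified, published PageRank iteration (damping factor 0.85) on two invented seven-page sites: a pyramid where pages link down the tree and back to the home page, and a fully meshed "grid" where every page links to every other page. The page names and the `pagerank` helper are assumptions made for illustration, and the numbers are not what Google's toolbar would report.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration.

    Assumes every page has at least one outgoing link and that every
    link target is also a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical pyramid: home -> 2 sections -> 2 leaf pages each,
# with every page linking back to the home page.
pyramid = {
    "home": ["s1", "s2"],
    "s1": ["home", "l1", "l2"],
    "s2": ["home", "l3", "l4"],
    "l1": ["home"],
    "l2": ["home"],
    "l3": ["home"],
    "l4": ["home"],
}

# Hypothetical fully meshed site: every page links to every other page.
mesh = {p: [q for q in pyramid if q != p] for p in pyramid}

pyramid_rank = pagerank(pyramid)
mesh_rank = pagerank(mesh)
for name in ("home", "s1", "l1"):
    print(f"{name}: pyramid={pyramid_rank[name]:.3f} mesh={mesh_rank[name]:.3f}")
```

Under these assumptions the pyramid concentrates score on the home page and leaves the leaf pages with less than an equal share, while the mesh gives every page the same score.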

steveb

12:31 am on Jul 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Does that help all the files in the directory to be crawled?"

No. But as you appear to have it structured now, a request for the bare directory URL (domain.com/widgets/) returns a 404 page, and there isn't any benefit in that.

"Any Plus ditching the file name?"

Definitely, but not in terms of crawling. The first benefit is to not have the 404 page. The second benefit is it allows you the ability to change from index.html to index.htm or to index.php as the default directory page without losing any PR or link benefit (especially from off your domain that you can't easily change).
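
As a rough illustration of linking to the directory rather than the file, the Python sketch below rewrites URLs that end in a default document name to the bare directory URL. The list of default document names and the `directory_url` helper are assumptions; which file a server actually serves for a directory request depends on its own configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; the real list depends on the server setup.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "index.php"}

def directory_url(url):
    """Return the directory form of a URL that ends in a default document."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last.lower() in DEFAULT_DOCUMENTS:
        return urlunsplit(parts._replace(path=head + "/"))
    return url  # not a default document: leave the URL untouched

print(directory_url("http://domain.com/widgets/index.html"))
print(directory_url("http://domain.com/widgets/widgets.html"))
```

Links written in the directory form keep working if the default page later changes from index.html to index.php, which is the second benefit described above.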