For example, one client wants to promote a small website but has every page at least 2 levels deep:
domain.com/widgets/widgets.html
domain.com/prices/prices.html
Each of these folders (widgets, prices) has nothing else in it apart from that one page, so there is no reason to have a folder at all, as far as I can see.
I hate to see this kind of waste and I'd like them to ditch the redundant directory level. Am I right to get so annoyed about this, or does Google cope fine with this nowadays?
Even a short delay in indexing would be reason enough for me to tear up this structure!
always assumed (rightly or wrongly) that
the further back (deeper) your pages are in the directory structure
Maybe I have misunderstood you, but directory structure does not affect crawling. The number of link steps (levels) needed to reach a page does.
For example, a site could have a link from the main page to [example.com...]
This is still only one step, as Googlebot has found the link on the main page.
If directory structure impacted crawling we would all have all of our files in the root directory.
Thus changing the physical location of your pages (without changing your linking structure) is going to be of no benefit.
It is not, of course, a confirmed theory, but it is what I have seen happening on my site for pages that have no external link referrals.
CS.
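To make the difference between folder depth and link depth concrete, here is a rough sketch. It assumes a made-up link graph: only the two domain.com page paths come from this thread, while /a/b/c/page.html and /top.html are hypothetical, and a real crawler schedules its visits in ways this does not model.

```python
from collections import deque
from urllib.parse import urlparse


def directory_depth(url):
    """Count the folders in the URL path; a trailing file name is not counted."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    if segments and not path.endswith("/"):
        segments.pop()  # the last segment is a file name, not a folder
    return len(segments)


def click_depth(links, start):
    """Breadth-first search: fewest link hops from `start` to each reachable page."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


# Hypothetical link graph (who links to whom), not a real site.
links = {
    "http://domain.com/": [
        "http://domain.com/widgets/widgets.html",
        "http://domain.com/a/b/c/page.html",
    ],
    "http://domain.com/widgets/widgets.html": [
        "http://domain.com/top.html",
    ],
}

depths = click_depth(links, "http://domain.com/")
for url, hops in depths.items():
    print(url, "| folders:", directory_depth(url), "| clicks from home:", hops)
```

The page three folders down is one click from the home page, while the page sitting in the root is two clicks away, because only the links decide how far a page is from the start.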
For example, all my files (except the home page) are in directories only 1 level deep. But Google isn't aware of this; it sees only the linking structure.
/index.php > /dir1/file1.php > /dir1/file2,3,4, etc. File 1 gets the PR of the index page minus 1, and files 2, 3, 4, etc. get the PR of the index page minus 2.
It's still the same if directories are crossed. E.g., /index.php > /dir1/file1.php > /dir2/filex.
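The rule of thumb above can be written down in a few lines. This is only a sketch of what is described in this post, not anything Google has confirmed; the function name estimate_pr and the home page PR of 5 are made up for illustration.

```python
def estimate_pr(chain, home_pr):
    """Rule of thumb: each link hop away from the home page costs one PR point.

    `chain` lists the pages in link order, home page first. Directory names
    play no part in the result. The estimate never goes below 0.
    """
    return {page: max(home_pr - hops, 0) for hops, page in enumerate(chain)}


# Link chain that crosses directories; home_pr=5 is an assumed value.
chain = ["/index.php", "/dir1/file1.php", "/dir2/filex"]
print(estimate_pr(chain, home_pr=5))
```

Swapping /dir2/filex for a file in /dir1 gives the same numbers, since only the position in the chain is used.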
There is a trade-off between giving your site a strongly pyramid-type shape, which focuses the PR at the top, and using a grid topology, which steals the 'bang' from the top pages but distributes the PR to the lower pages.
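The trade-off can be illustrated with the simplified PageRank calculation from the original published description. This is a toy sketch under stated assumptions: the seven-page site, the page names, and the 0.85 damping factor are all hypothetical, and Google's live ranking is certainly more involved than this.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by repeated redistribution of rank along links.

    `links` maps every page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            receivers = targets if targets else pages
            share = damping * rank[page] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


# Pyramid: home links down, sections link down and back up, leaves link home.
pyramid = {
    "home": ["a", "b"],
    "a": ["home", "a1", "a2"],
    "b": ["home", "b1", "b2"],
    "a1": ["home"], "a2": ["home"],
    "b1": ["home"], "b2": ["home"],
}
# Grid: every page links to every other page.
grid = {p: [q for q in pyramid if q != p] for p in pyramid}

pyramid_rank = pagerank(pyramid)
grid_rank = pagerank(grid)
for page in pyramid:
    print(f"{page:5s} pyramid={pyramid_rank[page]:.3f} grid={grid_rank[page]:.3f}")
```

In this toy model the home page holds a much larger share in the pyramid than in the grid, while the lowest pages do better in the grid, which matches the trade-off described above.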
Removing the ".html" isn't going to do anything for you in Google.
No, except as you appear to have it structured now, there is a 404 page at the root directory level. There isn't any benefit in that.
"Any Plus ditching the file name?"
Definitely, but not in terms of crawling. The first benefit is not having the 404 page. The second benefit is that it lets you change the default directory page from index.html to index.htm or to index.php without losing any PR or link benefit (especially from links outside your domain that you can't easily change).
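If the pages are moved to default documents, internal links can point at the folder instead of the file name. A small helper along these lines could rewrite them; the function name directory_url is made up, and the list of default documents is an assumption that depends on how the server is configured.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; the real list depends on the server setup.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")


def directory_url(url):
    """Return the folder URL when `url` ends in a default document name."""
    parts = urlsplit(url)
    folder, _, filename = parts.path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        parts = parts._replace(path=folder + "/")
    return urlunsplit(parts)


print(directory_url("http://domain.com/widgets/index.html"))
print(directory_url("http://domain.com/widgets/widgets.html"))
```

Links written this way keep working if index.html later becomes index.php, which is the second benefit mentioned above. A page such as widgets.html is left alone, because the server would not serve it for a request to the bare folder.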