How can I get Google (or any bot) to crawl that deep? Right now the site is a PR 6 and NONE of the backlinks are from the site's own pages. Due to the former design, Google is only aware of a few fluff pages (about us, etc.).
The site is built around a search engine right now, which I am improving. For the user's sake I'd also like to build a hierarchical directory à la Yahoo/DMOZ, but I'm faced with technical hurdles (the current database leaves a lot to be desired). If I did that, though, it would certainly be more than 3 levels deep. How deep will Google go, and how can I get her there?
I also thought about having an alphabetical listing just for Google (click "A" to see all the A's, etc.) but each letter has anywhere from several hundred to many thousands of items. Plus that wouldn't be very themed.
Is this a good case for cnames? There are about 5 broad categories I could divide it up into. Would that have any impact on any PR that would be generated from internal linking?
Whichever approach I go with will take a lot of work, so I was hoping to get some feedback from you gurus in here first. How would you do it and what are the pros and cons?
What I did was make sure that all the pages were only two levels from the root and had no query string variables. I did this with rewrites in Apache. If you're running Apache, I'll be happy to give you some tips. Otherwise you'll have to look up rewrites for your server software.
Here's a sample url on my site:
[sampleurl.com...]
The URL used to be similar to [sampleurl.com...]
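For anyone who has not used rewrites before, the idea can be sketched in a few lines of Python. This is only an illustration of the mapping, not Apache configuration and not the actual rule from the site above; the path layout (/category/id.html) and the script name (item.php) are made-up placeholders.

```python
import re

# Hypothetical flat URL layout: /<category>/<numeric id>.html
# Neither this layout nor "item.php" comes from the real site.
FLAT_URL = re.compile(r"^/(?P<cat>[a-z0-9-]+)/(?P<id>\d+)\.html$")


def rewrite(path):
    """Map a crawler-friendly path to the internal query-string URL.

    Returns None when the path does not match, so the caller can
    serve the request unchanged.
    """
    match = FLAT_URL.match(path)
    if match is None:
        return None
    return "/item.php?cat={cat}&id={id}".format(**match.groupdict())


print(rewrite("/widgets/1234.html"))
print(rewrite("/about.html"))
```

The spider only ever sees the short path; the query string exists only on the server side after the translation.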
I can sticky mail the url of the site to you if you wish.
Thanks,
George
So with 140,000 pages not more than two levels deep, that's got to add up to a lot of links per page, which is one thing I'm concerned about.
I'd love to see the site if you don't mind sticky'ing me. Thanks! :)
Try to keep the names short. Google does not like long URLs.
Or list all folders in a database ¦ID¦Name¦ and link to them as /file/SubID1.SubID2.SubID3.SubID4
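A rough sketch of how that dotted-ID scheme might look in Python. The folder table and its contents are invented for illustration; only the ID/Name columns and the /file/ID.ID.ID form come from the suggestion above.

```python
# Hypothetical folder table with the ID / Name columns suggested above.
FOLDERS = {1: "Computers", 7: "Software", 42: "Editors"}


def build_path(ids):
    """Join a chain of folder IDs into a single short path segment."""
    return "/file/" + ".".join(str(i) for i in ids)


def parse_path(path):
    """Recover the chain of folder IDs from a /file/... path."""
    prefix = "/file/"
    if not path.startswith(prefix):
        raise ValueError("not a /file/ path: " + path)
    return [int(part) for part in path[len(prefix):].split(".")]


def breadcrumb(path):
    """Look up each ID's name to build a human-readable trail."""
    return " > ".join(FOLDERS[i] for i in parse_path(path))


print(build_path([1, 7, 42]))
print(breadcrumb("/file/1.7.42"))
```

However deep the category chain gets, the URL stays one directory below the root and the names stay short.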
--
globay
home page
......level 2: contains 100 links to level 3 pages
..........level 3: 100 links to level 4 pages
...............level 4: 5 links to the real content
.....................the real content
100 * 100 * 5 = 50000
So the real content would be 5 levels deep in that scenario. If I used cnames for the level one, then it would still be 4 levels. To compress it to two or three levels, there would need to be many more than 100 links per page.
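To put numbers on the depth versus links-per-page trade-off, here is a small Python sketch. It assumes a perfectly balanced tree where every page, including the home page, carries the same number of links, which a themed directory rarely manages, so treat the result as a lower bound on clicks rather than a prediction.

```python
def levels_needed(total_pages, links_per_page):
    """Fewest clicks from the home page needed to reach total_pages
    content pages when every page carries links_per_page links.

    Assumes a perfectly balanced link tree (an idealization).
    """
    if links_per_page < 2:
        raise ValueError("need at least 2 links per page")
    clicks = 0
    reachable = 1  # only the home page is reachable with zero clicks
    while reachable < total_pages:
        reachable *= links_per_page
        clicks += 1
    return clicks


for fan_out in (100, 300, 500):
    print(fan_out, levels_needed(50000, fan_out))
```

With 50,000 content pages this gives 3 clicks at 100 links per page and 2 clicks at 300 or 500. The scenario above ends up deeper than the balanced figure because its last level only carries 5 links per page.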
Everything I've read around here says keep it no deeper than 3 levels, and try to stay within about 100 links per page max. I don't see how that's possible with such a large site. So should I:
A) keep it at 100 or so links per page and let it go 5 levels deep, or
B) go with 300-500 links per page so it'll be 2 or 3 levels deep, or
C) is there another way?
A is much more user-friendly than B I would think, but I'm hoping there is a C. :) And I have no problem with creating a multi-page site map containing several hundred links per page if that's what works.
Why do these pages have to be in sub directories?
Surely they could all be in the root if you so desired...
You can have fewer links per page, and fewer subdirectories.
Everything I've read around here says keep it no deeper than 3 levels,
"Three levels" means no more than 3 subdirectories.
If you have the following site-structure:
/index.html -> (links to)
--- categories [/cat] ->
------ subcat [/cat/subcat] ->
--------- subcat2 [/cat/subcat.name] ->
------------ ... ->
--------------- ... -> [/cat/subcat.name.name2.name3.]...
everything should be ok, am I right?
You will be just 2 directories down from your root directory, and you can have as many subpages as you want, so you can have fewer than 100 links a page.
Correct me if I am wrong!
--
globay
But the proof is in the pudding - gbaker123 stickied me a PR6 site where the content pages are four click-levels down (the paths/filenames are all /dir/str.str.str/) and Google spidered all 140,000 pages. So I'm going to give that a shot, along with the mod_rewrite of course. I'll post back the results (probably a couple of months).
Thanks for all the help so far. :)
linking structure determines the level, not the path name
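One way to see the difference is to measure click depth directly from the link graph and compare it with the number of directories in the path. A minimal Python sketch, using a made-up link graph shaped like the /cat/subcat.name example earlier in the thread (the page names are placeholders, not real URLs):

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/": ["/cat/"],
    "/cat/": ["/cat/subcat/"],
    "/cat/subcat/": ["/cat/subcat.name/"],
    "/cat/subcat.name/": ["/cat/subcat.name.name2/"],
}


def click_depths(links, start="/"):
    """Breadth-first search: fewest clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def directory_depth(path):
    """Number of directory segments in the path itself."""
    return len([segment for segment in path.split("/") if segment])


for page, clicks in click_depths(LINKS).items():
    print(page, "clicks:", clicks, "directories:", directory_depth(page))
```

In this toy graph the last page sits only 2 directories below the root but is 4 clicks from the home page, so a short path does not by itself make a page shallow.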
Just for the record a lot of what I've been reading around here tends to confirm this, especially jdMorgan's posts (#2 and #5) in this thread: