

Important for directories to have index pages?

         

smithaa02

4:41 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



On one of the sites I oversee, there is a directory structure like this:

/california/blue/widgeta.php
/washington/red/widgetz.php
etc...

If a user navigates to /california...they get page not found. Same with /california/blue.

How bad is this and should all my directories have index pages? (I certainly don't link to these directories, so broken links shouldn't be an issue)

In the official 'Search Engine Optimization Starter Guide' from Google, they make the following observation:


Consider what happens when a user removes part of your URL - Some users might navigate your site in odd ways, and you should anticipate this. For example, instead of using the breadcrumb links on the page, a user might drop off a part of the URL in the hopes of finding more general content. He or she might be visiting http://www.brandonsbaseballcards.com/news/2010/upcoming-baseballcard-shows.htm, but then enter http://www.brandonsbaseballcards.com/news/2010/ into the browser's address bar, believing that this will show all news from 2010. Is your site prepared to show content in this situation or will it give the user a 404 ("page not found" error)?


Since this is a topic in the 'Search Engine Optimization Starter Guide', I suspect this isn't merely about helping users who are being creative with the URL paths.

Does Google think: wait a moment, if there is no index page, then that's not a real directory but merely a URL rewrite and a very long page name, and should be judged accordingly? Do index pages add to the legitimacy of a site's directory structure as used in the URLs?

[edited by: Robert_Charlton at 7:06 am (utc) on Oct 12, 2011]
[edit reason] fixed example link display problem [/edit]

tedster

7:11 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First let's address the assumption behind the last paragraph - that Google somehow "thinks less" of rewritten URLs, or scores some negative judgment, however small. That's not the case. If the URL rewriting structure is technically solid, then "SEO-friendly", short & sweet descriptive URLs are a better practice.

I'd say it is a better practice either to serve content for the directory root or to return a 403 Access Denied status (not a 404 Not Found). By default, most servers would list all the URLs within the directory, but the server can easily be configured to return a 403.

As something that's good for the end user, I prefer serving useful "high level" content - for all the reasons that Google explains on that Help page you quoted.
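
A minimal sketch of the choice described above (serve content for a directory, or answer 403 rather than 404), written in Python purely for illustration. The function name `directory_status` and the two sets of paths are hypothetical; on a real site this would normally be handled in the web server's own configuration rather than in application code.

```python
def directory_status(request_path, pages, indexed_dirs):
    """Return the HTTP status a request would get under the policy above.

    pages        -- set of page URLs that exist (hypothetical sample data)
    indexed_dirs -- set of directory URLs, with trailing slash, that serve
                    index content
    """
    # Treat "/california" and "/california/" as the same directory.
    as_dir = request_path.rstrip("/") + "/"
    if request_path in pages:
        return 200
    if as_dir in indexed_dirs:
        return 200  # the directory serves "high level" content
    # The path is a directory prefix of some real page but has no index.
    if any(page.startswith(as_dir) for page in pages):
        return 403
    return 404  # nothing exists at or below this path


pages = {"/california/blue/widgeta.php", "/washington/red/widgetz.php"}
indexed = {"/washington/"}  # assumption: only this directory has an index

print(directory_status("/california", pages, indexed))   # 403
print(directory_status("/washington", pages, indexed))   # 200
print(directory_status("/oregon/", pages, indexed))      # 404
```

The 404 branch is reserved for paths where nothing exists at all, which matches the distinction drawn in the post between a missing page and a directory without an index.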

smithaa02

8:25 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



To clarify, I don't think google has a problem with rewritten urls in general (although they are on record as saying they hate rewritten urls for dynamic content).

Don't you think it was odd that an SEO guide from google asks that you ensure that your directories have index pages?

To me the matter is how Google treats directories in the URL path. If they know each dir in /blue/yellow/green/red/orange/widgets_a.php has a directory index, don't you think they would give more weight to such a URL because they know it's not a glorified file name pretending to be a long file path, but does belong to a proper hierarchy and is less likely to be keyword stuffing?

mark_roach

8:49 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



although they are on record as saying they hate rewritten urls for dynamic content


That is a new one on me, do you have a link ?

smithaa02

8:53 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



[googlewebmastercentral.blogspot.com...]

I think this was written by Cutts in 2008, and it is more about making sure you don't use URL rewriting to replace GET variables than about being against URL rewriting in general (although it is worded in a confusing manner):

Does that mean I should avoid rewriting dynamic URLs at all?
That's our recommendation, unless your rewrites are limited to removing unnecessary parameters, or you are very diligent in removing all parameters that could cause problems. If you transform your dynamic URL to make it look static you should be aware that we might not be able to interpret the information correctly in all cases. If you want to serve a static equivalent of your site, you might want to consider transforming the underlying content by serving a replacement which is truly static. One example would be to generate files for all the paths and make them accessible somewhere on your site. However, if you're using URL rewriting (rather than making a copy of the content) to produce static-looking URLs from a dynamic site, you could be doing harm rather than good. Feel free to serve us your standard dynamic URL and we will automatically find the parameters which are unnecessary.

mark_roach

9:27 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



If you transform your dynamic URL to make it look static you should be aware that we might not be able to interpret the information correctly in all cases


That certainly is a confusing statement. What is there to interpret about a static looking URL ? Just get your spider to request the URL and take the data.

I originally re-wrote my URLs because back in the day (when Google wasn't the only SE worth bothering with) spiders did have trouble crawling dynamic URLs. These days I re-write them because it is easier for my visitors to remember and share them.

I would like to think tedster is right and that it makes no difference to Google either way whether you re-write or not.

g1smd

9:35 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google wants you to use URL rewriting only if you're doing a very good job.

An example of a bad job is when a site using URLs like:
www.example.com/index.php?product=12532&maker=acme&size=large&colour=blue

is altered to use URLs like this:
www.example.com/product/12532/maker/acme/size/large/colour/blue


In this case, a request for:
www.example.com/product/12532/maker/
is quite meaningless.

That's an example of a URL system that is not technically sound.
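
To illustrate why the truncated URL in the example above is meaningless, here is a small Python sketch that reads such a path as alternating key/value segments. The helper `parse_pair_path` is hypothetical and is not taken from any real rewriting module; it only shows that cutting the path leaves a key with no value.

```python
def parse_pair_path(path):
    """Split a /key/value/key/value path into a dict plus any dangling key."""
    segments = [s for s in path.split("/") if s]
    # Even positions are keys, odd positions are their values.
    params = dict(zip(segments[0::2], segments[1::2]))
    # An odd number of segments means the last key has no value.
    dangling = segments[-1] if len(segments) % 2 else None
    return params, dangling


full = "/product/12532/maker/acme/size/large/colour/blue"
print(parse_pair_path(full))
# ({'product': '12532', 'maker': 'acme', 'size': 'large', 'colour': 'blue'}, None)

print(parse_pair_path("/product/12532/maker/"))
# ({'product': '12532'}, 'maker')
```

In the second call the key `maker` is left without a value, so there is no sensible page to serve for that address.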

lucy24

9:42 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is your site prepared to show content in this situation or will it give the user a 404 ("page not found" error)?

Why would it ever return a 404 in the first place? That's for nonexistent pages. For directories, you get either an index (whether named or auto-generated) or a 403. In fact, that's what 403 means for ordinary humans. It's not "You are an evil robot who is not allowed to cross this doorstep" but simply "Sorry, there's nothing for you in this directory".

deadsea

10:10 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As a user, if I find directory structure in your url I expect to be able to use it. Either have index pages, or be able to redirect me to an appropriate page, even if that page is the home page.

scooterdude

10:38 pm on Oct 11, 2011 (gmt 0)

10+ Year Member



Out of curiosity, I had a look at Amazon, and they return custom 404s if you try a truncated URL. OK, a lesser-branded site isn't Amazon.

To me, a custom 404 is friendly, common, and well known, and I don't see any search engine reporting that type of 404 as an error or trying to index it.

403 forbidden i'm not so keen on

lucy24

11:19 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



403 forbidden i'm not so keen on

That's because you're a webmaster yourself and have gotten in the habit of equating 403 with lockouts. Deny from, - [F], that kind of thing. To ordinary humans, it's just a word. Maybe an overly strong word, but ours not to reason why.

As a user, if I find directory structure in your url I expect to be able to use it. Either have index pages, or be able to redirect me to an appropriate page, even if that page is the home page.

Someone really needs to do a broad-based survey. See, I absolutely hate it when I try an address at random and if there's nothing there, I get sent to the home page unasked. If I had wanted the home page, I would have said so.

The 404 and 403 pages are both made for humans, so there's no reason not to offer clickable links. But let the human decide. OK, so there's nothing at dir1/dir2/dir3. Give me a list of actual page titles-- which may or may not include the home page, depending on site structure-- and I can decide for myself which one is most likely to have what I'm looking for.
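
One way an error page could offer that list of actual page titles is sketched below in Python. The function `suggest_pages` and the sample `titles` mapping are hypothetical; the idea is to walk up from the requested path to the nearest ancestor directory that contains real pages and list those, leaving the choice to the visitor.

```python
def suggest_pages(request_path, titles):
    """Return (url, title) pairs under the nearest ancestor that has pages.

    titles -- dict mapping existing page URLs to their titles
    """
    stripped = request_path.strip("/")
    prefix = f"/{stripped}/" if stripped else "/"
    while True:
        matches = sorted(
            (url, title) for url, title in titles.items()
            if url.startswith(prefix)
        )
        if matches or prefix == "/":
            return matches
        # Drop the last path segment and try the parent directory.
        prefix = prefix.rstrip("/").rsplit("/", 1)[0] + "/"


titles = {  # assumption: sample data for illustration only
    "/index.html": "Home",
    "/dir1/about.html": "About",
    "/dir1/dir2/list.html": "List",
}

print(suggest_pages("/dir1/dir2/dir3", titles))
# [('/dir1/dir2/list.html', 'List')]
```

The home page only appears in the suggestions when no closer directory has anything to offer, which fits the preference for letting the human decide.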

deadsea

11:34 pm on Oct 11, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The home page is usually a poor choice to redirect to. If somebody requests www.example.com/product/12532/maker/ (to pull from an earlier example), then a redirect to www.example.com/brands/7366/maker-products.html or www.example.com/search?q=maker would be better, so that I could find a list of products under the brand that I was looking for.

Sgt_Kickaxe

11:59 pm on Oct 11, 2011 (gmt 0)



I know of a link directory that accepts no submissions, nor does it ever update the 50 or so links in the directory, and it hasn't in many years.

What the site does have however is two links to "stores" on the same domain name which show literally millions of items under various sorting options/pagination which in turn create additional pages.

8 million+ pages indexed, 7 million+ images indexed (all affiliate, not even hosted by this site) and over half a million visitors per month.

HOW !?!?

~ Age, it's one of the oldest directories out there.
~ incoming links, nearly a decade of backlink building has to be worth something.
~ but what else ?!?!

The end result is a link directory FRONT with affiliate spam back end and for whatever reason Google has never downgraded this site, the owner is now very rich too.

So when you ask "do link directories..." my knee-jerk reaction is to say nothing you can do to the site will matter since Google focuses on the things you cannot change (years of history and backlinks) too heavily. The site is proof.

(p.s. if a mod needs me to divulge the site to make this claim/post stay feel free to sticky me and ask. Others, please don't ask)

g1smd

12:04 am on Oct 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The OP was asking about "directories" as in "folders" and whether each folder should contain an index page, rather than "directories" as in "directory websites". :)

mhansen

2:42 am on Oct 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As a user, if I find directory structure in your url I expect to be able to use it


Ditto... and I DO use them.

lucy24

3:29 am on Oct 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oh, wait a minute. What do you mean, "if I find directory structure"? If you don't find directory structure, you find a site where all pages are lying loose in the top level. Did you mean "multiple levels of directories"?

Some people seem to be saying: if the content of /dir3/ is user-accessible, then the content of /dir2/ must also be user-accessible. And I'm not getting that conceptual leap.

Suppose, say, you've got a gazillion images. To save your sanity and keep everything organized, you've got them grouped. So instead of /dir1/ containing files beyond number, you've got /dir1/dir2a/ for one category, /dir1/dir2b/ for another category, /dir1/dir2c/ ... et cetera. Sure, some users might be inquisitive and see what's lurking in /dir1/. But I don't see where that action turns into a moral imperative for you to provide something.

The user didn't get there by clicking on a link-- which does carry an obligation. They're just trying doors. If you tell a houseguest that the bathroom is the second door on the left, there will always be those who get nosy and try the first door. But that doesn't confer any obligation on you. Well, maybe to lock door #1 if that's where your attack-trained Rottweiler lives. But that's all.

Sgt_Kickaxe

3:58 am on Oct 12, 2011 (gmt 0)



g1smd - I plead the... knee-jerk reaction!

Oops.

g1smd

6:42 am on Oct 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No problem. What you said was useful in and of itself, just slightly off topic. :)

g1smd

6:45 am on Oct 12, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@lucy I think it is fair to say that if a site has /dir11/dir21 and /dir11/dir22, there is an expectation that asking for example.com/dir11/ will provide a list of the folders below it.
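
As a rough Python sketch of that expectation, the function below derives the folders directly beneath a directory from a list of known page URLs. The name `child_folders` and the sample URLs are hypothetical; a real index page would more likely be generated by the server or the CMS.

```python
def child_folders(directory, pages):
    """List the immediate subfolders of `directory` implied by page URLs."""
    stripped = directory.strip("/")
    prefix = f"/{stripped}/" if stripped else "/"
    folders = set()
    for url in pages:
        if url.startswith(prefix):
            rest = url[len(prefix):]
            # Only URLs with a further "/" sit inside a subfolder.
            if "/" in rest:
                folders.add(prefix + rest.split("/", 1)[0] + "/")
    return sorted(folders)


pages = [  # assumption: sample data for illustration only
    "/dir11/dir21/a.html",
    "/dir11/dir22/b.html",
    "/dir11/dir22/c.html",
    "/other/x.html",
]

print(child_folders("/dir11/", pages))
# ['/dir11/dir21/', '/dir11/dir22/']
```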