Forum Moderators: Robert Charlton & goodroi


href='domain.com/cat1/cat2/./page1cat1.html - is it ok with Google?

         

AjiNIMC

6:51 am on Aug 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Every time I think that I won't hit a new webby thing today, the gods of the web prove me wrong. Here is another such instance. After 5 years, I saw a very strange thing (don't ask me why someone would want to do this; do you have any idea?).

Links given in this format:
a href='example.com/cat1/cat2/../page1cat1.html' takes me to example.com/cat1/page1cat1.html. I checked the headers and there is no redirection; browsers handle it really well. In most browsers (FF, IE) the link also displays as example.com/cat1/page1cat1.html. In Safari the link displays as example.com/cat1/cat2/../page1cat1.html but still takes you to example.com/cat1/page1cat1.html.

You can play with this. Any idea whether the HTML standards accept this format? Logically it is perfectly OK, since once it hits the Apache server it can understand slashes and dots. I also tested that it doesn't go above the document root (obviously, I just tested it).
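For anyone who wants to play with this offline, here is a minimal sketch of the dot-segment collapsing described above, loosely modeled on the "remove dot segments" step in RFC 3986. It is an illustration only, not what any particular browser or Apache actually runs. The function names are made up for this example, it assumes the href really starts with http:// (the forum strips that part), and it only handles absolute paths.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path: str) -> str:
    """Collapse "." and ".." segments in an absolute path.

    Hypothetical helper for illustration; assumes the path is empty
    or starts with "/". It never climbs above the root.
    """
    segments = path.split("/")
    output = []
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        if segment == ".":
            # "." means "this directory": drop it, keep a trailing slash.
            if is_last:
                output.append("")
        elif segment == "..":
            # ".." removes the previous segment, but never the leading root.
            if len(output) > 1:
                output.pop()
            if is_last:
                output.append("")
        else:
            output.append(segment)
    return "/".join(output)


def normalize_url(url: str) -> str:
    """Return the URL with dot segments removed from its path."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))


if __name__ == "__main__":
    print(normalize_url("http://example.com/cat1/cat2/../page1cat1.html"))
    print(normalize_url("http://example.com/../../page1cat1.html"))
```

The second call shows the document-root observation: extra ".." segments at the top are simply dropped, so the path cannot end up above "/".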

Any idea about Google spiders?

Enjoy!
Aji

[edited by: tedster at 3:55 pm (utc) on Aug. 21, 2008]
[edit reason] switch to example.com [/edit]

tedster

4:03 pm on Aug 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If the server resolves the url with a 200 OK, then it's fine with Google and can be spidered and indexed.

My guess is that it's a poorly implemented URL rewrite system. Some people set up their logic to key off the "cat1" and "page1", and their server ignores extra characters anywhere in the filepath. You can put in any kind of extra garbage and the URL will still resolve. That is a recipe for duplicate content problems in the future.
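To make that guess concrete, here is a rough sketch of the difference between a loose match that only keys off "cat" and "page" and a strict match that accepts exactly one URL shape. Both patterns and both function names are hypothetical, invented for this illustration; they are not taken from the site in question or from any real rewrite configuration.

```python
import re

# Hypothetical loose rule: only looks for "catN" followed somewhere by "pageN".
LOOSE = re.compile(r"cat(\d+).*?page(\d+)")

# Hypothetical strict rule: the whole path must have exactly this shape.
STRICT = re.compile(r"/cat(\d+)/page(\d+)cat(\d+)\.html")


def loose_route(path: str):
    """Return (category, page) if the loose pattern is found anywhere."""
    match = LOOSE.search(path)
    return (match.group(1), match.group(2)) if match else None


def strict_route(path: str):
    """Return (category, page) only if the entire path matches."""
    match = STRICT.fullmatch(path)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    for path in ("/cat1/page1cat1.html", "/cat1/junk-123/page1cat1.html"):
        print(path, loose_route(path), strict_route(path))
```

With the loose rule both paths resolve to the same page, which is how duplicate URLs get created; the strict rule rejects the one with garbage in it.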

tedster

1:19 am on Aug 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's another possibility - you may be seeing this kind of thing, mentioned in the main thread about canonical URL problems:

jdMorgan:
Here's one more. The following URL is valid, but non-canonical:

http://www.WebmasterWorld.com/http://www.WebmasterWorld.com/

The default server behaviour is to use the domain in the "local URL-path" part, as long as it's valid.

Canonical URL Issues - including a new one [webmasterworld.com]

AjiNIMC

11:08 am on Sep 2, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Tedster,

Now I see that more people are using it. Does it have something to do with FrontPage? Maybe sites created in FrontPage show this pattern.

Regards,
Aji