Strange URLs in Webmaster Tools "Not Found Pages"

         

omoutop

7:40 am on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi all and thanks for any info/insight you can provide.

I use Google Webmaster Tools for site statistics, and I checked the "Not found" pages report.
The bad news: it lists thousands of pages that cannot be found.

But after checking many of them, I have noticed that there is nothing wrong with my pages or code.
For some reason, Google adds extra folders to the URLs, and so it considers that the pages do not exist.

I have the following page
www.example.com/some/path/index.htm
...that links the page
www.example.com/folder/page.html using the code:
<a href="../../folder/page.html">Forums</a>

but Google finds the link www.example.com/folder/other/page.html, which is totally wrong because of the extra folder "other" that shouldn't be there.

Do you know why that is created? Is anything wrong in my code? Should I change the link to <a href="/folder/page.html">Forums</a>, or is it a bug in Googlebot?

I have checked all pages that link to www.example.com/folder/page.html with link checkers, sitemap generators and other tools, but so far none of them finds or generates the "wrong" URL. And I am talking about several thousand error links.
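One way to check how the relative link above should resolve is to run it through a standard URL resolver. The sketch below uses Python's `urllib.parse.urljoin`. The `deeper_base` URL is purely hypothetical (it is not taken from the site) and only illustrates that the same relative href gives a different result when the linking page is reached at a different folder depth, while a root-relative href does not. It does not reproduce the exact "other" URL that Google reports.

```python
from urllib.parse import urljoin

HREF = "../../folder/page.html"   # relative link from the post
ROOT_HREF = "/folder/page.html"   # root-relative alternative

# The linking page as described in the post.
base = "http://www.example.com/some/path/index.htm"
# Hypothetical: the same page reached one folder deeper (assumption).
deeper_base = "http://www.example.com/some/path/extra/index.htm"

# Relative href: the result depends on the depth of the linking page.
resolved = urljoin(base, HREF)
resolved_deeper = urljoin(deeper_base, HREF)

# Root-relative href: the result is the same from any depth.
root_resolved = urljoin(base, ROOT_HREF)
root_resolved_deeper = urljoin(deeper_base, ROOT_HREF)

print(resolved)
print(resolved_deeper)
print(root_resolved)
print(root_resolved_deeper)
```

If the expected target comes out of the first call, the link itself is fine, and the stray URLs more likely come from the linking page being reachable under more than one address.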

Any help is appreciated.

[edited by: tedster at 7:54 am (utc) on Dec. 17, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

ZydoSEO

12:41 pm on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are you sure you don't have canonicalization issues with the original page? (Although this still wouldn't explain Google inserting some random folder name.) Perhaps they found that page under a different URL with a different number of folder names (possibly because you linked web folders in Unix, e.g. with symlinks).

In IIS you can set up virtual directories so that:

http://www.example.com/virtualdir

is essentially an alias for:

http://www.example.com/folder1/folder2/folder3/realdir

If a program in the real directory is accessed via http://www.example.com/virtualdir and has a relative reference to, say, "../images/mypic.jpg" to get to /images/mypic.jpg, then the code would work as long as it was accessed via the virtual directory, but could break if accessed via the real directory, because it would be looking for /folder1/folder2/images/mypic.jpg and that folder doesn't exist.

For this and other reasons, I avoid folder/page relative paths. I always use root relative paths (e.g. /images/mypic.jpg) or absolute paths.
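The virtual directory case can be sketched with Python's `urllib.parse.urljoin`. This assumes both URLs are requested exactly as written above, without a trailing slash. With a trailing slash or a file name appended, the relative reference would resolve one level deeper.

```python
from urllib.parse import urljoin

REL = "../images/mypic.jpg"    # folder-relative reference
ROOT_REL = "/images/mypic.jpg" # root-relative reference

virtual_url = "http://www.example.com/virtualdir"
real_url = "http://www.example.com/folder1/folder2/folder3/realdir"

# Folder-relative: the result depends on which alias was requested.
via_virtual = urljoin(virtual_url, REL)
via_real = urljoin(real_url, REL)

# Root-relative: both aliases give the same target.
root_via_virtual = urljoin(virtual_url, ROOT_REL)
root_via_real = urljoin(real_url, ROOT_REL)

print(via_virtual)
print(via_real)
print(root_via_virtual)
print(root_via_real)
```

The folder-relative reference lands in /images/ only through the alias, whereas the root-relative one is the same either way.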

[edited by: ZydoSEO at 12:43 pm (utc) on Dec. 17, 2008]

omoutop

2:00 pm on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No virtual directories whatsoever (we are on a dedicated server with a Unix/Apache/PHP environment).

All of our URLs are created by mod_rewrite (so an extra folder in the URL should trigger a different set of rules for the page, either loading completely different content from our database or leading to a custom 404 error page).

The strange behaviour is that the cached page in Google shows the normal content (as if no extra folder had been inserted in the URL), but Google itself has the modified URL stored. Strange, eh?