Google doesn't seem to like relative links

Anyone else getting googlebot 404's?

         

Josefu

5:25 pm on Sep 14, 2003 (gmt 0)

10+ Year Member



First off I'd like to thank ALL of you here for helping me through the thumb-twiddling and hand-wringing first 'googleless' period plus giving me tips on how to make Mr. Googlebot like me more. Thanks, especially to those who made this place possible.

Now that Mister Googlebot has thoroughly groped my site I've noticed that he's hit a few errors for internal links that don't exist in my site. I started off with a few relatively linked pages ("../../widgets.html") but then switched to absolute links ("/widgetfile/widgets.html") - but there are still a few of the relative ones left.

I also noticed that the errors always include "/images/" with the rest of an existing link after it - my robots.txt says stay out of "/images/" - perhaps something to do with it as well?

Just wondering, and hoping Google 404's aren't of any consequence...

Josefu

11:41 am on Sep 15, 2003 (gmt 0)

10+ Year Member



Perhaps this has already been discussed elsewhere? If so, I couldn't find it : P

doc_z

12:05 pm on Sep 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Josefu,

Are you using Flash? This is the only case I know of where problems arise from relative links (see this thread: Googlebot reading flash? [webmasterworld.com])

Zigire

12:20 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



Googlebot has been acting strangely today.

We use a <base href="http://www.widgets.co.uk/"> and all our links are relative due to the directory structure (i.e. www.widgets.co.uk/products/product.htm). I'm noticing hundreds of '500 errors' because Google was trying to pull www.widgets.co.uk/category/products/product.htm rather than the correct www.widgets.co.uk/products/product.htm, and also A LOT of 404's ;)

This has only happened this morning and nothing has been changed over the past month or so (we get crawled daily).

The IP addresses of the googlebots were:
64.68.86.9
64.68.87.43
64.68.86.54
64.68.86.79
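
To illustrate the difference between the two URLs above, here is a minimal sketch of how a relative link resolves with and without the <base href>. The page location "category/index.htm" and the helper name resolve() are assumptions made up for the example; only the base href and the link come from our site.

```python
from urllib.parse import urljoin

# Hypothetical page location; the real page URLs are not given in this post.
PAGE_URL = "http://www.widgets.co.uk/category/index.htm"
# Values taken from the post above.
BASE_HREF = "http://www.widgets.co.uk/"
LINK = "products/product.htm"


def resolve(link, page_url, base_href=None):
    """Resolve a relative link; a <base href>, when present, replaces the page URL."""
    return urljoin(base_href or page_url, link)


with_base = resolve(LINK, PAGE_URL, BASE_HREF)  # what a browser should request
without_base = resolve(LINK, PAGE_URL)          # what a client ignoring <base> requests

print(with_base)
print(without_base)
```

If the bot is ignoring the <base> tag, the second URL is the kind of request that would show up in the logs, which would match the errors we are seeing. That is only a guess at the cause.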

gaouzief

12:31 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



I get the same type of errors too.
None of my links are relative, yet sometimes Googlebot messes with the path as if they were relative.

For example, I get 404's for URLs like "/print/send", whereas on the "/print/" page I only have links like href="/send/".

It seems like a bug to me, unless it's not really Googlebot.

Zigire

12:38 pm on Sep 15, 2003 (gmt 0)

10+ Year Member



it seems like a bug to me unless it's not really googlebot

Must be - I've never noticed anything like this before in my logs. It's not an imposter as far as I can tell. The IP addresses I mentioned earlier appear in this post (http://www.webmasterworld.com/forum3/15655.htm)

Our other website was spidered fine, although it doesn't use subdirectories for product categories - all files (products/categories) sit in the root directory.

Josefu

7:24 am on Sep 17, 2003 (gmt 0)

10+ Year Member



(d-oh! Moved from elsewhere)

Google has visited yet again today! But...

[error] [client 64.68.88.145] File does not exist: /home/********/public_html/salon/boutique/images/select/select.html
[Tue Sep 16 08:13:59 2003] [error] [client 64.68.88.151] File does not exist: /home/********/public_html/salon/boutique/images/original/original.html
[Tue Sep 16 05:59:37 2003] [error] [client 64.68.88.7] File does not exist: /home/********/public_html/salon/images/gallery/gallery.html
[Tue Sep 16 03:15:05 2003] [error] [client 64.68.88.131] File does not exist: /home/********/public_html/salon/boutique/images/couture/couture.html

...no relative links in Flash, either : P
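
For anyone wanting to pull the same kind of entries out of their own error log, here is a rough sketch. It assumes the Apache error-log wording shown in the lines above; the regex and the function name missing_paths() are my own and not part of any tool.

```python
import re

# Matches the "[client IP] File does not exist: PATH" part of an error-log line.
LOG_LINE = re.compile(
    r"\[client (?P<ip>[\d.]+)\] File does not exist: (?P<path>\S+)"
)

# The four log lines quoted above, unchanged.
SAMPLE = """\
[error] [client 64.68.88.145] File does not exist: /home/********/public_html/salon/boutique/images/select/select.html
[Tue Sep 16 08:13:59 2003] [error] [client 64.68.88.151] File does not exist: /home/********/public_html/salon/boutique/images/original/original.html
[Tue Sep 16 05:59:37 2003] [error] [client 64.68.88.7] File does not exist: /home/********/public_html/salon/images/gallery/gallery.html
[Tue Sep 16 03:15:05 2003] [error] [client 64.68.88.131] File does not exist: /home/********/public_html/salon/boutique/images/couture/couture.html
"""


def missing_paths(log_text):
    """Return (client_ip, path) pairs for every 'File does not exist' entry."""
    return [(m["ip"], m["path"]) for m in LOG_LINE.finditer(log_text)]


hits = missing_paths(SAMPLE)
# Count how many of the missing paths pass through an /images/ folder.
via_images = sum(1 for _, path in hits if "/images/" in path)
print(f"{via_images} of {len(hits)} missing paths contain /images/")
```

Grouping the misses this way makes it easier to see whether they all share the same stray folder in the path.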

Josefu

8:15 am on Sep 17, 2003 (gmt 0)

10+ Year Member



DOH! I don't know what it's been with me over the past few days...

We now have irrefutable proof that the googlebot DOES read Flash files. Sure enough, after doubting the certainty of my comments above, I opened up my Flash files to have a look again, and lo and behold, I had a few relative links in there. The thing is, they were relative to the page where the Flash file was embedded, not to the Flash file itself: the "widgetroom.html" page, residing within the "widgetroom" folder, called the Flash file from its own /images/ folder within the same folder... so if one were to follow a relative "newlink" link directly from the Flash file, one would get "/widgetroom/images/newlink"
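
A small sketch of the two ways that link can be resolved, in case it helps anyone picture it. The host name "www.example.com" and the file name "movie.swf" are placeholders invented for the example; only the folder layout and the "newlink" target follow my description above.

```python
from urllib.parse import urljoin

# Placeholder URLs modelled on the layout described above.
PAGE_URL = "http://www.example.com/widgetroom/widgetroom.html"
SWF_URL = urljoin(PAGE_URL, "images/movie.swf")  # Flash file lives in /widgetroom/images/
LINK = "newlink"

# A browser resolves the link against the page that embeds the Flash file.
from_page = urljoin(PAGE_URL, LINK)
# A crawler fetching the Flash file directly resolves it against the file's own URL.
from_swf = urljoin(SWF_URL, LINK)

print(from_page)
print(from_swf)
```

The second URL carries the extra /images/ segment, the same pattern as the 404's in my log, so switching the links inside the Flash file to absolute paths should avoid the mismatch.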

This brings up a few more questions, though... : P