Yahoo indexing directories instead of pages?

I got indexed by Yahoo, but it's not indexing my site properly

PipSqueak

3:36 pm on Sep 11, 2005 (gmt 0)

10+ Year Member



Hi

My site was indexed by Yahoo, but only my directories were indexed, not my pages. For example, searching for my domain.com lists "Index of /images/ads", etc. It brings users to my directories instead of my HTML pages.

Can anyone tell me what the problem is?

PipSqueak

1:31 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



Should I wait for the next crawl? My site was first indexed around 31 August.

Marcia

1:40 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From the looks of what you've posted, there's some kind of a problem on your end, not theirs. Does your /images/ directory have an index.html page? And if not, have you excluded the roots of directories from being accessed if there isn't any index page?

Also, if there are images in that particular directory, how are they being linked to? And are you using absolute or relative URLs for images and pages?

Crawlers follow links, so take a look at what you've got that they're following.
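One way to check what a crawler can follow is to list every link on a page that points at a directory root rather than at a file. This is only a rough sketch: the sample markup, the `example.com` base URL, and the `LinkCollector` and `directory_links` names are made up for illustration and are not taken from the poster's site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href/src targets from a page, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative URLs the same way a crawler would.
                self.links.append(urljoin(self.base_url, value))


def directory_links(html, base_url):
    """Return resolved links whose path ends in '/', i.e. a directory root."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [u for u in parser.links if urlparse(u).path.endswith("/")]


# Hypothetical homepage markup, for illustration only.
SAMPLE = """
<a href="about.html">About</a>
<a href="/images/ads/">Ads</a>
<img src="images/logo.gif">
"""

if __name__ == "__main__":
    for url in directory_links(SAMPLE, "http://example.com/"):
        print(url)
```

Any URL this reports is a candidate for showing up as an "Index of ..." listing if the directory has no index page.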

PipSqueak

1:45 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



Well, it crawls fine on MSN. I have index.html pages in my root directory. The problem with Yahoo is that it crawls every directory I have, even the directories I use for my images.

Lord Majestic

1:46 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why don't you use robots.txt to exclude the directories you don't want indexed?
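As a rough illustration of the robots.txt approach, the sketch below holds a small rule set as a string and checks it with Python's standard `urllib.robotparser`. The disallowed paths and the `example.com` URLs are placeholders, not the poster's real layout, and `is_allowed` is a helper name invented here. Note that a `Disallow` on a directory also covers the files inside it, so blocking an image directory would keep those images out of image search as well.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the paths are placeholders for directories to keep out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/ads/
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt, url, agent="*"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/images/ads/", "http://example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, url))
```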

PipSqueak

1:57 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



I don't mind it spidering my image directory when there are only pictures in it, but the spider also indexes my directories even when I have HTML docs in them. :(

Marcia

2:07 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's nice when images get indexed because they can bring in a decent amount of traffic from image search, which is reason enough not to exclude them from being indexed. But there's no reason not to exclude the ROOT from being accessed if it's empty. However, even in cases where I haven't taken the time to do it, there still isn't the problem you're describing.

How are the files in those directories being linked to?

Lord Majestic

2:16 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"but the spider also indexes my directories even when I have html docs in it."

Well then, exclude those. Spiders can't read your mind, right?

PipSqueak

2:16 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



I think it's kinda difficult to explain what I'm seeing. Try searching for <snip> on Yahoo and it'll mostly list my directories instead of my pages.

[edited by: martinibuster at 3:36 pm (utc) on Sep. 12, 2005]
[edit reason] Please, no references to sites. [/edit]

Lord Majestic

2:24 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, the directory listings were of type text/html, so they got indexed. Their contents were image files, which obviously won't be indexed by a web search engine, though they may be indexed by an image search engine.

The solution would be to have a default script (index.php) that outputs a 404 header and thus prevents the directory itself from being listed.
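The post above suggests an index.php that sends a 404. The same idea can be sketched in Python instead of PHP, using the standard `http.server` module: serve files normally, but answer 404 wherever an automatic directory listing would otherwise appear. This is a local-testing sketch under assumptions, not a production setup; `NoListingHandler`, `make_server`, and port 8000 are invented for the example.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class NoListingHandler(SimpleHTTPRequestHandler):
    """Serve files as usual, but answer 404 instead of a directory listing."""

    def list_directory(self, path):
        # The stock handler calls this only for a directory that has no
        # index.html or index.htm, so files and real index pages still work.
        self.send_error(404, "Not Found")
        return None


def make_server(port=8000):
    """Build a local test server; the port is an arbitrary choice."""
    return ThreadingHTTPServer(("127.0.0.1", port), NoListingHandler)
```

Calling `make_server().serve_forever()` from the site's document root serves it with listings suppressed, so a crawler requesting a bare directory URL gets a 404 rather than an indexable text/html page.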

PipSqueak

2:31 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



But why is it indexing my image directories and not my other web pages other than my main page?

Lord Majestic

2:42 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"But why is it indexing my image directories and not my other web pages other than my main page?"

Who knows? There could be many reasons. Perhaps their algo gives preference to directories when choosing pages to index from your site.

Marcia

3:25 pm on Sep 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Make sure you have a site map that's linked to from the homepage, and exclude the root pages of directories that don't have an index.htm file (or whatever the file extension is) from being indexed.

abates

9:46 pm on Sep 12, 2005 (gmt 0)

10+ Year Member



This thread is confusing me. Yahoo only indexes pages which have links to them. PipSqueak: have you got links to the directories on your site? If so, it would not be surprising that Yahoo is indexing them...

I tend to put dummy index.html files in directories like that, which redirect users to a more appropriate page. I've observed that people may access an image directly, then attempt to go to the directory it is in (could be using the "up" button on a toolbar). I decided a redirect was better than showing them raw directory contents or (as I have directory level access disabled) a 403 error.

Hope that is of some help.

Alden.
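The dummy index.html approach described above could be automated along these lines. This is a minimal sketch under assumptions: the post does not say how its redirect is done, so the meta refresh used here is a guess, and the `/` target, the `STUB` template, and the `add_redirect_stubs` name are all illustrative.

```python
import os
import tempfile

INDEX_NAMES = ("index.html", "index.htm")

# Meta-refresh stub; "/" as the target is an assumption, pick a fitting page.
STUB = """<!DOCTYPE html>
<html><head>
<meta http-equiv="refresh" content="0; url={target}">
<title>Redirecting</title>
</head><body><a href="{target}">Continue</a></body></html>
"""


def add_redirect_stubs(root, target="/"):
    """Write a redirecting index.html into each directory under `root`
    that has no index file yet. Returns the list of files created."""
    created = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any(name in filenames for name in INDEX_NAMES):
            continue  # directory already has an index page, leave it alone
        stub_path = os.path.join(dirpath, "index.html")
        with open(stub_path, "w", encoding="utf-8") as fh:
            fh.write(STUB.format(target=target))
        created.append(stub_path)
    return created


if __name__ == "__main__":
    # Demonstrate on a throwaway tree rather than a real document root.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "images", "ads"))
        open(os.path.join(root, "index.html"), "w").close()
        for path in add_redirect_stubs(root):
            print(os.path.relpath(path, root))
```

Existing index files are never overwritten, so running it a second time creates nothing new.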