
Best practice for "directories" with no index page?

     
1:40 pm on Sep 23, 2012 (gmt 0)

Preferred Member

5+ Year Member

joined:Dec 19, 2007
posts: 404
votes: 0


Just wondering what would be considered better practice for directories that have no index page: a noindexed (robots meta) index page with links to relevant pages elsewhere on the site, or just leaving them with no index page and forbidding directory browsing in htaccess? Yes, I know this arises from poor design; I just wondered what would work best for Google from this point.
5:37 pm on Sept 23, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2005
posts:12929
votes: 200


If you don't want anyone browsing, just stick a blank index page in there.
5:39 pm on Sept 23, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Blank page returns 200 OK.

It's often better to turn off the DirectoryIndex feature.
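
For what it's worth, a minimal .htaccess sketch of that, assuming Apache (the "disabled" keyword needs 2.4.4 or later; on older versions the usual workaround is to point DirectoryIndex at a filename that never exists):

    # Stop mod_dir from searching for index.html, index.php, etc.
    # when a bare directory URL is requested (Apache 2.4.4+)
    DirectoryIndex disabled

    # Older Apache: roughly the same effect by naming a file that will never exist
    # DirectoryIndex no.such.file.ever

With directory listings also switched off, the bare directory URL then returns a 403 rather than a 200 for a blank page.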
5:48 pm on Sept 23, 2012 (gmt 0)

Preferred Member

5+ Year Member

joined:Dec 19, 2007
posts: 404
votes: 0


No, it's not a question of preventing browsing; it's a question of how to treat what are clearly low-grade pages in Google's eyes when the site is spidered. Doing nothing means each directory gets an auto-generated listing, which Google indexes as though it were an index page. Clearly bad. If I turn off directory browsing, Google gets hundreds of Not Authorized responses, so it doesn't know what is being forbidden. If I add an index page and put noindex in the meta, then it sees hundreds of noindexed pages, which would be about 10% of the site. I wonder if that's a problem, given the view that such pages, in significant numbers, MAY hurt the site overall.
6:20 pm on Sept 23, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3227
votes: 146


A noindex meta tag in a subdirectory's index.html file should not have any effect on other pages in the same directory, BUT that assumes those pages are linked from somewhere else in your navigation and appear in your sitemap. Or you could make the index pages more useful and let them be indexed as /directoryname/. If the pages you want indexed have inbound links, are in your sitemap, and don't carry a meta noindex tag, they should get crawled and indexed just fine, without the index.html pages being indexed or affecting other pages.

I recently needed to do something similar for a site where the folders previously did not contain any HTML pages, and an urgent site redo left me with pages I want indexed in subdirectories where I had had a noindexed index.html for years. I'm slowly working the site into a proper structure, but those index pages were showing up in the sitemap until I changed them to index.php files. Now they sit there doing nothing except shielding the file listings from nosey visitors. When I'm done, they will be indexed pages with the URL /subdirectoryname/.
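
For illustration, a placeholder index file of the sort described above can be as small as this (the title text and the link target are just examples):

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <!-- keep the placeholder itself out of the index; its links can still be followed -->
      <meta name="robots" content="noindex, follow">
      <title>Placeholder</title>
    </head>
    <body>
      <!-- optionally point crawlers and visitors at the pages that matter -->
      <p><a href="/">Back to the home page</a></p>
    </body>
    </html>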
10:39 pm on Sept 23, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3332
votes: 140


Theoretically, if there are no references to the root of the directory, it won't have any real side effect - or at least not one worth worrying about. That said, I've seen a hungry Googlebot try directory roots just to see what happens, so best practice would suggest that you need those URLs to do something.

If you create any content at the URL, you run the risk of creating additional, low-quality URLs with no particular purpose, so I would say either refuse the requests or, if there's somewhere appropriate, redirect them. I wouldn't serve anything with a 200, robots-excluded or otherwise - that just creates new content to be evaluated.
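
If redirecting is the route taken, a hedged .htaccess sketch (the directory and target names here are made up) would look something like:

    # Hypothetical example: send requests for the bare /widgets/ URL
    # to the page that actually covers that content
    RedirectMatch 301 ^/widgets/$ /widgets.html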
8:32 am on Sept 24, 2012 (gmt 0)

Preferred Member

5+ Year Member

joined:Dec 19, 2007
posts: 404
votes: 0


I thought that was a given; I've never seen a site where Googlebot DIDN'T try the root of a directory, even when there are no links to an index.
4:59 pm on Sept 24, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Apr 30, 2008
posts:2630
votes: 191


You can stick the same index.php in every directory, one which just returns a 404 or 403 header when the directory is requested.
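
As a rough sketch of that idea (the status line is easy to swap for 403 or 410):

    <?php
    // Drop-in index.php: refuse requests for the bare directory URL.
    header($_SERVER['SERVER_PROTOCOL'] . ' 404 Not Found');
    exit;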
11:27 pm on Sept 24, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3332
votes: 140


You can stick the same index.php in every directory, one which just returns a 404 or 403 header when the directory is requested.


But that creates additional files that require additional management. Options -Indexes in htaccess will 403 every directory root that doesn't contain an index file, which seems like a more elegant solution if a 4xx response is desired.
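
For reference, that is a single line in .htaccess, assuming Apache and an AllowOverride setting that permits Options:

    # Disable auto-generated listings; directory roots with no index file
    # then return 403 Forbidden instead of listing their contents
    Options -Indexes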
8:25 am on Sept 25, 2012 (gmt 0)

Preferred Member

5+ Year Member

joined:Dec 19, 2007
posts: 404
votes: 0


I think the 404 route sounds like the best idea. It ensures the index pages cannot be counted as content and prevents directory browsing. I just don't like the idea of hundreds of Forbidden responses where once there was indeed a page. I will probably make the headers return a 410 instead.
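
A mod_rewrite sketch of the 410 approach, assuming index.html is the only index filename in use and the rules live in the site's root .htaccess:

    RewriteEngine On
    # The request maps to a real directory...
    RewriteCond %{REQUEST_FILENAME} -d
    # ...which has no index file of its own...
    RewriteCond %{REQUEST_FILENAME}/index.html !-f
    # ...so answer 410 Gone instead of a listing or a 403
    RewriteRule ^ - [G]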
 
