
Google SEO News and Discussion Forum

Best practice for "directories" with no index page?
santapaws
1:40 pm on Sep 23, 2012 (gmt 0)

Just wondering what would be considered better practice for directories that have no index page: a noindexed (robots meta) index page with links to relevant pages elsewhere on the site, or just leaving them with no index page and forbidding directory browsing in .htaccess? Yes, I know this arises from poor design; I just wondered what would work best for Google from this point.

 

netmeg
5:37 pm on Sep 23, 2012 (gmt 0)

If you don't want anyone browsing, just stick a blank index page in there.

g1smd
5:39 pm on Sep 23, 2012 (gmt 0)

A blank page returns 200 OK.

It's often better to turn off the DirectoryIndex feature.
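
Something like this in .htaccess does it (assuming Apache; Options -Indexes is mod_autoindex, and DirectoryIndex disabled needs mod_dir on Apache 2.4+):

    # Turn off automatic directory listings
    Options -Indexes

    # Apache 2.4+ can also switch off the index-file lookup entirely
    DirectoryIndex disabled

With no listing and no index file to serve, Apache answers requests for the bare directory with a 403.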

santapaws
5:48 pm on Sep 23, 2012 (gmt 0)

No, it's not a question of browsing; it's a question of how to treat what are clearly low-grade pages when Google spiders them. Doing nothing means Google gets a directory listing, which it indexes as if it were an index page. Clearly bad. If I turn off directory browsing, Google gets hundreds of Forbidden responses, so it doesn't know what is being forbidden. If I add an index page and put noindex in the meta, then it sees hundreds of noindexed pages, which would be about 10% of the site. I wonder if that's a problem, given the thinking that such pages, in significant numbers, MAY hurt the site overall.

not2easy
6:20 pm on Sep 23, 2012 (gmt 0)

A noindex meta tag in a subdirectory's index.html file should not have any effect on other pages in the same directory, BUT that assumes those pages are linked from somewhere in your navigation and appear in your sitemap. Or you could make the index pages more useful and let them be indexed as /directoryname/. If the pages you want indexed have inbound links, are in your sitemap, and don't have a meta noindex tag, they should get crawled and indexed just fine without the index.html pages being indexed or affecting other pages.
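
For reference, the tag in question goes in the <head> of each subdirectory's index.html; "noindex, follow" keeps the page itself out of the index while still letting its links be followed:

    <meta name="robots" content="noindex, follow">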

I recently needed to do something similar for a site where the folders previously contained no HTML pages, and an urgent site redo left me with pages I want indexed in subdirectories where I had had a noindexed index.html for years. I'm slowly working the site into a proper structure, but those index pages were showing in the sitemap until I changed them to index.php files. Now they sit there doing nothing except hiding the file lists from nosey things. When I'm done they will be indexed pages with the URL /subdirectoryname/.

Andy Langton
10:39 pm on Sep 23, 2012 (gmt 0)

Theoretically if there are no references to the root of the directory, it won't have any real side effect - or at least not one worth worrying about. That said, I've seen hungry Googlebot try directory roots just to see what happens, so best practice would suggest that you need those URLs to do something.

If you create any content at the URL, you run the risk of creating additional low-quality URLs with no particular purpose, so I would say either refuse the requests or, if there's somewhere appropriate, redirect them. I wouldn't serve anything with a 200, robots-excluded or otherwise - that just creates new content to be evaluated.
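
A rough .htaccess sketch of the redirect option (assuming Apache with mod_rewrite; the site root is only a placeholder target, and the index.html test should match your actual DirectoryIndex names):

    RewriteEngine On
    # Leave the site root alone
    RewriteCond %{REQUEST_URI} !^/$
    # Only touch requests that map to a real directory...
    RewriteCond %{REQUEST_FILENAME} -d
    # ...one that has no index file of its own
    RewriteCond %{REQUEST_FILENAME}/index.html !-f
    RewriteRule ^ / [R=301,L]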

santapaws
8:32 am on Sep 24, 2012 (gmt 0)

I thought that was a given; I've never seen a site where Googlebot DIDN'T try the root of a directory, even when there are no links to an index.

aakk9999
4:59 pm on Sep 24, 2012 (gmt 0)

You can put the same index.php in every directory, one that just returns a 404 or 403 header when the directory is requested.
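
A minimal sketch of such a shared index.php (404 shown; use 403 if preferred):

    <?php
    // Shared index.php: answer any request for the bare directory
    // with an error status and no body worth indexing.
    header($_SERVER['SERVER_PROTOCOL'] . ' 404 Not Found');
    exit;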

Andy Langton
11:27 pm on Sep 24, 2012 (gmt 0)

You can put the same index.php in every directory, one that just returns a 404 or 403 header when the directory is requested.


But that creates additional files that require additional management. Options -Indexes in .htaccess will 403 every directory root that doesn't contain an index file, which seems like a more elegant solution if a 4xx response is desired.

santapaws
8:25 am on Sep 25, 2012 (gmt 0)

I think the 404 route sounds like the best idea. It ensures the index pages can't be counted as content and prevents directory browsing. I just don't like the idea of hundreds of forbidden requests where once there was indeed a page, so I will probably make the headers return a 410 instead.
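
If I go without per-directory files, mod_alias in .htaccess should manage it (the directory names here are placeholders):

    # Return 410 Gone for specific index-less directory roots
    RedirectMatch 410 ^/example-dir-one/$
    RedirectMatch 410 ^/example-dir-two/$

Or the shared index.php from earlier could simply send "410 Gone" instead of "404 Not Found".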
