Internet Explorer finds directory links that are 403

deadsea

11:40 am on May 29, 2008 (gmt 0)

I'm trying to track down a large number of 403 forbidden accesses in our log files. We have pages like this:

http://example.com/directory/file.html
That page has links to:
/
/directory/otherfile.html
/page.html
All the links on the page start with a slash.

Clients that use Internet Explorer end up frequently requesting
http://example.com/directory/
with a referrer of
http://example.com/directory/file.html
despite the fact that there are no links to the directory on the page.

We don't have an index page, and for technical reasons, it is difficult for us to create one. So, IE users end up getting lots of 403 forbidden errors.

Yahoo Slurp also requests http://example.com/directory/ but Safari, Firefox, Googlebot, msnbot, and other user agents *never* seem to fetch the directory url.

What could IE and Slurp possibly be seeing on these pages as a link to the directory that no other user agent sees? Has anybody run across anything like this before?

pageoneresults

11:54 am on May 29, 2008 (gmt 0)

We don't have an index page, and for technical reasons, it is difficult for us to create one. So, IE users end up getting lots of 403 forbidden errors.

Can you set up a redirect so the users end up at a valid page instead?

Yahoo Slurp also requests http://example.com/directory/ but Safari, Firefox, Googlebot, msnbot, and other user agents *never* seem to fetch the directory url.

There must be a link somewhere. Or a Toolbar that is phoning home with someone's dev movements? Who knows...

It's also possible that someone is hacking the URI. Land here http://example.com/directory/file.html and then trim back to here http://example.com/directory/ to see what's there. Depends on your audience and if they are nosy or not. I do it quite frequently but I'm probably not the average site visitor either. And when I do it, my Google Toolbar is active so it's sending information. ;)

deadsea

1:12 pm on May 29, 2008 (gmt 0)

I would think there was a link too, but I've searched the page source, and clicked on every link on the page. If there were a link I would expect that Firefox and Safari users would also follow it.

People that explore the url by hand don't end up sending a referrer string, so I don't think it is that either. Also, about 10% of IE users get to the directory level, which is a very high percentage for url exploration.

The toolbar theory is a good one, but I'm not sure what toolbar that would be or what it would be trying to do. If there is such a beast, I would like to somehow prevent it from doing that.

pageoneresults

1:18 pm on May 29, 2008 (gmt 0)

I would think there was a link too, but I've searched the page source, and clicked on every link on the page. If there were a link I would expect that Firefox and Safari users would also follow it.

Ah, is there an external link though? Did someone maybe link to that page by accident? Or purposely? Do a site: search for that particular URI and see if the SE's have the reference indexed.

deadsea

1:25 pm on May 29, 2008 (gmt 0)

The referrer is always from our page.

deadsea

1:34 pm on May 29, 2008 (gmt 0)

If you submit the directory url to google search, you get a list of the pages under that directory, but no link to the directory itself.

Yahoo search shows no results.

jdMorgan

1:45 pm on May 29, 2008 (gmt 0)

This is common with IE because of the "discussion bar" for on-line collaboration -- part of the MS Office suite, and because some people use FrontPage to surf.

You will probably find, after looking at your raw server logs, that these are HTTP OPTIONS and PROPFIND requests for the directory "page," rather than GET requests. These are often accompanied by requests such as "GET /_vti_bin/owssvr.dll" with an IE user-agent, and "GET /_vti_inf.html" and "POST /_vti_bin/shtml.exe/_vti_rpc" with a FrontPage user-agent.
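One way to check this against the raw logs is to tally the HTTP methods used for the directory URL. The following is only a rough sketch: it assumes Apache's "combined" log format, and the names LOG_RE, method_counts and SAMPLE, the 192.0.2.x addresses and the "ExampleAgent" user-agent are made up for illustration.

```python
import re
from collections import Counter

# Loose pattern for Apache "combined" log lines (assumed format).
# The host field is matched as any non-space run, so masked or
# anonymized addresses still parse.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def method_counts(lines, path="/directory/"):
    """Count the HTTP methods of all requests for exactly `path`."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path") == path:
            counts[m.group("method")] += 1
    return counts


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [16/May/2008:05:25:54 -0400] "GET /directory/ HTTP/1.1" '
    '403 347 "http://www.example.com/directory/file.html" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
    '192.0.2.10 - - [16/May/2008:05:25:55 -0400] "OPTIONS /directory/ HTTP/1.1" '
    '403 347 "-" "ExampleAgent/1.0"',
    '192.0.2.11 - - [16/May/2008:05:26:01 -0400] "GET /directory/file.html HTTP/1.1" '
    '200 5120 "-" "ExampleAgent/1.0"',
]

if __name__ == "__main__":
    for method, n in method_counts(SAMPLE).most_common():
        print(method, n)
```

To run it on a real log, pass an open file instead of SAMPLE, for example `method_counts(open("access.log"))`, where the file name is whatever your server writes to.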

Because you have no directory index page, and your server likely has "Options -Indexes" set (or equivalent for IIS), the server responds with a 403 for the index requests.

The non-Microsoft browsers have no such integration with Office, so you won't see this problem with them.

Yahoo? What can I say? Their 'digging' for unlinked directory index pages is just annoying.

Jim

deadsea

3:39 pm on May 29, 2008 (gmt 0)

I don't see any OPTIONS or PROPFIND requests. They all appear to be GET.

Something like this:
124.#*$!.#*$!.#*$! - - [16/May/2008:05:25:54 -0400] "GET /directory/ HTTP/1.1" 403 347 "http://www.example.com/directory/file.html" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 3.1)"

However, I do also see the _vti_ requests from the same IP addresses. So I think that your MS Office suite suggestion is correct.
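A quick way to test that correlation is to split the hosts that got a 403 on the directory into those that also made _vti_ requests and those that did not. This is a minimal sketch, assuming Apache-style access log lines; REQUEST_RE, split_by_vti, SAMPLE and the 192.0.2.x addresses are hypothetical names and data used only for illustration.

```python
import re

# Matches the start of an Apache-style access log line (assumed format):
# host, identity, user, [time], "METHOD path protocol", status.
REQUEST_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "([A-Z]+) (\S+)[^"]*" (\d{3}) '
)


def split_by_vti(lines, directory="/directory/"):
    """Return (with_vti, without_vti): hosts that got a 403 on a GET for
    `directory`, split by whether the same host also requested a /_vti_ path."""
    vti_hosts = set()
    forbidden_hosts = set()
    for line in lines:
        m = REQUEST_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access log entries
        host, method, path, status = m.groups()
        if "/_vti_" in path:
            vti_hosts.add(host)
        if method == "GET" and path == directory and status == "403":
            forbidden_hosts.add(host)
    return forbidden_hosts & vti_hosts, forbidden_hosts - vti_hosts


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [16/May/2008:05:25:54 -0400] "GET /directory/ HTTP/1.1" 403 347 "-" "-"',
    '192.0.2.10 - - [16/May/2008:05:25:55 -0400] "GET /_vti_bin/owssvr.dll HTTP/1.1" 404 290 "-" "-"',
    '192.0.2.20 - - [16/May/2008:05:30:00 -0400] "GET /directory/ HTTP/1.1" 403 347 "-" "-"',
    '192.0.2.30 - - [16/May/2008:05:31:00 -0400] "GET /_vti_inf.html HTTP/1.1" 404 290 "-" "-"',
]

if __name__ == "__main__":
    with_vti, without_vti = split_by_vti(SAMPLE)
    print("403 on directory and _vti_ requests:", sorted(with_vti))
    print("403 on directory only:", sorted(without_vti))
```

If most of the hosts land in the first set, that supports the Office integration explanation. Hosts in the second set would need another explanation. Note that grouping by IP address is only an approximation, since several clients can share one address.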

And Yahoo may just be digging; they never send a referrer.

Thanks!

deadsea

3:45 pm on May 29, 2008 (gmt 0)

It would appear, then, that these directory-level requests happen automatically when the user visits a deeper page. The user never actually sees the directory-level page, because it is just a background request made by the Office integration.

Is there a way to identify these background requests? Are there special headers that get sent, for example?