

Googlebot User Agent Can Trip a Bug in ASP.NET 2.0

     
3:41 pm on Nov 6, 2007 (gmt 0)

Junior Member

10+ Year Member

joined:July 4, 2005
posts:163
votes: 0


About a month ago, many of our pages (.NET / IIS6) that were being crawled began producing a 500 -ONLY- when googlebot came by. This lasted for roughly 3 weeks. We could see the affected pages in cache with our custom 500 page. While we were researching the problem, we found threads of others experiencing the same problem related to a change in the user-agent string.

After about 3 weeks, the 500s stopped and the normal pages in the cache and live results began to appear. We did nothing as we found no solution to the problem. It looks like Google may have "fixed" something.

The problem is that over those three weeks our rankings slowly went down due to all of the 500s, which all served the same error page. The rankings have not been restored since the "fix".

I kinda feel like this may have been a bug on Google's part and we are being penalized.

Anyone else?

9:28 pm on Nov 6, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


I'm not clear why you are calling this a "user agent string" bug. If your server returned a 500 http status, then your server was saying it encountered an unexpected internal condition that prevented it from answering the request. Is your server configured to treat different user agents differently?
9:55 pm on Nov 6, 2007 (gmt 0)

Junior Member

10+ Year Member

joined:July 4, 2005
posts:163
votes: 0


It appears that way. The only requests resulting in a 500 were the ones from googlebot, and it was the user-agent string that appeared to have changed.
10:44 pm on Nov 6, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


What was the offending user-agent string, then?

Jim

2:05 pm on Nov 7, 2007 (gmt 0)

Junior Member

10+ Year Member

joined:July 4, 2005
posts:163
votes: 0


User-Agent: Mozilla/5.0
instead of User-Agent: Mozilla/4.0

Admin note: Here's a link to a good summary of the issue:

[kowitz.net...]

[edited by: tedster at 6:34 pm (utc) on Nov. 7, 2007]

3:30 pm on Nov 7, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Oct 26, 2002
posts:3295
votes: 6


Reading the fix I found from your search, it appears that ASP.NET 2.0 was choking on the "Mozilla/5.0" part of the string -- not really a Google-related problem at all. And it wasn't limited to just googlebot; ASP.NET also choked on a few other UAs. It's tough, but it sounds like if you did put the recommended fixes in, the only thing you can do now is sit back and wait until your pages are reindexed.
6:51 pm on Nov 7, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


I've been researching this for several clients on the .NET 2.0 platform, and those who use Helicon's ISAPI Rewrite for url rewriting do not seem to have a problem with googlebot's UA string. I haven't zeroed in yet on which solutions do trigger the 500 error problem, or why. A number of .NET/IIS rewriting solutions are discussed on Scott Guthrie's blog [weblogs.asp.net] (he's a developer for Microsoft), and only some of them have the problem.

I'd suggest that anyone using .NET 2.0 install Firefox and the UserAgentSwitcher add-on so they can request a page from their site as Googlebot and discover whether they get a 500 error. It's a quick health check-up that is well worth the little bit of time it takes.

I also note that mikey158's problem "fixed itself" with no apparent changes on his server - so possibly the Google crawl team is taking some action here when they see too many 500 http errors. All I can imagine them doing, however, is re-crawling with a different user-agent string, one that does not include "Mozilla".

Does anyone see that in their server logs?
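
For the log check tedster mentions, here is a rough Python sketch that scans an access log for 500s served to Googlebot. It assumes the Apache/IIS "combined" log format (the regex, sample paths, and function name are illustrative assumptions, not anything from this thread):

```python
import re

# Matches the request line, status code, and quoted user-agent field of a
# combined-format access log entry. Adjust if your log layout differs.
LINE_RE = re.compile(
    r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def googlebot_500s(lines):
    """Return the request paths that returned a 500 to a Googlebot user-agent."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("status") == "500" and "Googlebot" in m.group("ua"):
            hits.append(m.group("path"))
    return hits
```

Feeding it the lines of your raw log file and getting back an empty list suggests googlebot is not tripping the bug on your server.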

7:36 pm on Nov 7, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:July 13, 2006
posts:500
votes: 0


Anyone hazard a guess why Microsoft has not fixed this yet? It's been a while since Google changed their user agent...
2:22 pm on Nov 8, 2007 (gmt 0)

Administrator

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month

joined:Jan 14, 2004
posts:859
votes: 3


Do you have IIS set up to use its internal compression, or a third-party compression module, by chance?
5:15 pm on Nov 12, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 2, 2003
posts:1184
votes: 0


Could this be addressed by updating the browscap.ini file to include this flavor of Mozilla, or am I misunderstanding the use of browscap?
5:54 pm on Nov 12, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


The suggestion on the page linked above [kowitz.net] is:

create a "genericmozilla5.browser" file in your "/App_Browsers" folder in the root of your application... This will match generic Mozilla compatible browsers and spiders with user-agents strings such as Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
9:10 pm on Nov 12, 2007 (gmt 0)

New User

10+ Year Member

joined:Mar 14, 2006
posts:9
votes: 0


There are no bugs in ASP.NET

Bugs are in your code.......

10:20 pm on Nov 13, 2007 (gmt 0)

New User

10+ Year Member

joined:June 28, 2006
posts: 12
votes: 0


I banged my head against the wall for weeks with this bug. (And yes, it's an ASP.NET 2.0 bug.)

There is a simple work-around:

1. In your ASP.NET's App_Browsers directory, add a file called BrowserFile.browser

2. Add the following to the BrowserFile.browser file:


<browsers>
  <browser refID="Mozilla">
    <capabilities>
      <capability name="cookies" value="true" />
    </capabilities>
  </browser>
</browsers>

Pete

4:04 am on Nov 14, 2007 (gmt 0)

New User

5+ Year Member

joined:Jan 9, 2007
posts:6
votes: 0


How do I reproduce this bug?
My whole site is written on ASP.NET 2.0 and I do not see any problem with googlebot.

What do I need to do to see the bug?

5:35 am on Nov 14, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Just set up Firefox with the UserAgentSwitcher add-on and change your user agent to googlebot's:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Then go to your site. If you can browse it, then you don't have a problem with this bug.
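
The same browser check can also be scripted. A minimal Python sketch, assuming you just want the HTTP status your server returns when the request carries Googlebot's user-agent string (the URL and function name are illustrative):

```python
import urllib.request
import urllib.error

# Googlebot's post-change user-agent string, as quoted earlier in the thread.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def fetch_as_googlebot(url):
    """Request a page with Googlebot's User-Agent and return the HTTP status code."""
    req = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        # A 500 here is exactly the symptom described in this thread.
        return e.code
```

Calling `fetch_as_googlebot("http://www.example.com/somepage.aspx")` and getting 500 back, while a normal browser gets 200, would point to this bug.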

The exact conditions that trigger this bug are still not 100% clear to me, and neither is it clear whether Google or Microsoft have taken steps to compensate for it. The most I can say is that it seems to be triggered at times by using .NET's native url rewriting, for example the HttpContext.RewritePath method [msdn2.microsoft.com]

The third party re-write utilities such as ISAPI Rewrite do not seem to trigger it. But I do want to emphasize that I'm using language filled with weasel words like "seems to". I recommend people do this quick check with the Firefox add-on, or check your server error logs to make sure googlebot is not generating 500 error codes.

12:21 pm on Nov 14, 2007 (gmt 0)

New User

10+ Year Member

joined:Dec 29, 2004
posts: 2
votes: 0


I had the same issue for over 4 months and never figured out why my sites weren't getting indexed by Google. Last night I found out! Like you've said, Google always got a 500 response, and the reason was the server-side ad-request code from one of the advertisers I use.

The ":" in googlebot's user-agent string was breaking it!

7:22 pm on Nov 14, 2007 (gmt 0)

New User

10+ Year Member

joined:June 28, 2006
posts: 12
votes: 0


I can vouch for tedster's supposition: for me the 500 error occurred when I was performing URL mapping using Context.RewritePath within my ASP.NET Application_BeginRequest code in Global.asax, such as this:

Context.RewritePath("~/Default2.aspx", False)

The workaround I posted above, adding a .browser file to your ASP.NET application's App_Browsers directory, was recommended by Microsoft after I submitted the bug to them in early 2006. The problem went away completely, and I have seen no side effects at all.

I would think that others in this community would have experienced this problem bigtime, considering that many of us use Context.Rewritepath for SEO purposes.

That said, I am unable to recreate the problem today with the test case that I submitted to Microsoft. So, my assumption here is that a patch may have come out for ASP.NET that has addressed the problem. (I have since upgraded to .NET 3.0, maybe that's what did it.)

Pete

11:55 am on Nov 20, 2007 (gmt 0)

New User

5+ Year Member

joined:Nov 20, 2007
posts:7
votes: 0


I have a site which uses classic ASP only, and the 500 error is occurring. I have installed the Firefox extension, and when I visit the page as Googlebot it works fine, yet the log records a 500 error! Can anyone explain that?

My server is ASP.NET-capable, however, so could I still create the browser file in App_Browsers? (I have just tried this and the error still occurs, though I can't restart the site manually to properly test it.)

Chris.

6:21 pm on Nov 20, 2007 (gmt 0)

New User

10+ Year Member

joined:Mar 14, 2006
posts:9
votes: 0


Kill me. I switched my user agent to Googlebot and went to my site.
Everything works perfectly.
And I do use HttpContext.RewritePath to make my urls SE friendly.

So I still think that you people have a bug somewhere......

6:42 pm on Nov 20, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Thanks George - hope we can move to more clarity here. I note in pclaar's post above that Microsoft confirmed a bug affecting his case. I also note that you are not seeing the problem in your case. It seems there must be another factor involved in tripping the problem.
7:16 pm on Nov 20, 2007 (gmt 0)

New User

10+ Year Member

joined:June 28, 2006
posts: 12
votes: 0


Yes, Microsoft *confirmed* that this was a bug in ASP.NET 2.0 and offered me the workaround I listed previously. I had created a very simple test case that had two web pages and one line of code in the Global.asax that caused the error when the User-Agent string was set to Googlebot or Ask Jeeves/Teoma. The error that caused the 500 was "Cannot use a leading .. to exit above the top directory."

I believe that this bug has been fixed, because I cannot get it to reproduce anymore. This bug was over a year old, and I was running on ASP.NET 2.0 in early 2006. After a few 2.0 patches and the installation of .NET 3.0, I can no longer reproduce the error.

I'd be curious to know if mikey158 has installed .NET 3.0 or has at least patched 2.0. I think that alone might fix his issue, rather than the workarounds.

Pete