Escaped characters in URLs

     
4:26 am on Mar 19, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I occasionally get 404 notification emails when agents try to load a page whose URL carried something like

?variable=

which for some reason got translated into

%3Fvariable%3D

Why does this happen in the first place, and what determines whether it is supposed to happen or not?

This concerns my Google AdWords campaigns, where I use such variables so I know where the traffic originated.

Besides the "why", I am also trying to figure out whether those agents are live people who really wanted to reach my site through an AdWords ad, or something else.

The user agent always seems to be the same, while the IP addresses differ:

User Agent = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

I see that SV1 stands for "Internet Explorer 6 with enhanced security features (Windows XP SP2 and Windows Server 2003 only)."

Would that SV1 matter?

Thanks

9:46 am on Mar 19, 2008 (gmt 0)

5+ Year Member



URLs are automatically escaped in headers by the browser. Example:
www.google.com/search?q=godaddy&ie=utf-8&oe=utf-8

becomes
www.google.com/search%3Fq%3Dgodaddy%26ie%3Dutf-8%26oe%3Dutf-8

> live people
Probably.

> Would that SV1 matter?
Probably not.

6:55 pm on Mar 19, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



It still happens only with SV1 user agents, no others.

12:38 am on Mar 20, 2008 (gmt 0)

penders (WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member)



> URLs are automatically escaped in headers by the browser. Example:
> www.google.com/search?q=godaddy&ie=utf-8&oe=utf-8
> becomes
> www.google.com/search%3Fq%3Dgodaddy%26ie%3Dutf-8%26oe%3Dutf-8

IMHO characters should only be escaped when they are data, i.e. part of a path segment or of a value passed in the query string. '?', '=' and '&' have special meaning (the start of the query string, the separator in name=value pairs, and the delimiter between pairs). If these characters are escaped they lose their special meaning and become part of the URL path, which is the problem smallcompany is having: the URLs are no longer valid.
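
To illustrate how escaped delimiters lose their special meaning, here is a minimal Python sketch using the standard library's urllib.parse. The host www.example.com, the path /landing and the value adwords are placeholders, not the actual site or campaign from this thread.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder URLs (assumptions for illustration only).
good = "http://www.example.com/landing?variable=adwords"
bad = "http://www.example.com/landing%3Fvariable%3Dadwords"

good_parts = urlsplit(good)
bad_parts = urlsplit(bad)

# With a literal '?', the parser separates the path from the query string.
print(good_parts.path)             # /landing
print(parse_qs(good_parts.query))  # {'variable': ['adwords']}

# With '%3F' and '%3D', everything stays in the path and the query is empty,
# so the server looks for a page literally named 'landing?variable=adwords'.
print(bad_parts.path)              # /landing%3Fvariable%3Dadwords
print(parse_qs(bad_parts.query))   # {}
```

A server handed the second form will most likely not find a matching page, which would be consistent with the 404s reported above.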

1:58 am on Mar 20, 2008 (gmt 0)

5+ Year Member



> IMHO characters should only be escaped

You misunderstand. The escaping happens automatically within the browser's HTTP client to make strings URL-friendly - the URL must be a continuous unbroken string, without spaces and with limited punctuation. There's an RFC that covers this, but its number escapes me at the moment.
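
For comparison, here is a hedged Python sketch of the distinction being argued over, using urllib.parse from the standard library. The current generic URI syntax is described in RFC 3986, which separates reserved delimiters from data. The parameter names and values below are made up for illustration; only the google.com example is taken from this thread.

```python
from urllib.parse import quote, urlencode

# Escaping applied to the *values* only: the '&' inside the value is escaped,
# while the '=' and '&' that separate the pairs stay literal.
params = {"q": "go daddy & co", "ie": "utf-8"}  # illustrative values
query = urlencode(params)
print(query)  # q=go+daddy+%26+co&ie=utf-8

# Escaping applied to the whole path-plus-query as if it were one piece of data.
# This reproduces the broken form quoted earlier in the thread.
over_escaped = "www.google.com/" + quote("search?q=godaddy&ie=utf-8&oe=utf-8", safe="")
print(over_escaped)  # www.google.com/search%3Fq%3Dgodaddy%26ie%3Dutf-8%26oe%3Dutf-8
```

The first form is what a well-behaved client would normally send; the second is what results when something in the chain treats the entire URL as a single value to be escaped.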

7:38 am on Mar 20, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I found that RFC (an old one from the '90s), and all of that stands.

What is not OK is the fact that all other user agents are working just fine.

When you browse and pay attention to the full URLs in the address bar, you will see

?, =, and &

not their “translations”.

I am just trying to get to the bottom of why this particular user agent gets those characters translated on the first click on a Google AdWords ad,

… while all the rest don't.

If that were true for all user agents, nobody would be able to browse the web today.

I just checked how it looks by looking into HTTP headers (LiveHTTPHeaders - Firefox), and those characters did not get translated.

The only time I get them translated is when I use a service like Profixy.

What am I missing here?

8:04 am on Mar 20, 2008 (gmt 0)

tedster (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)



Back end scripts at the proxy service that translate the characters for security reasons?

11:57 am on Mar 20, 2008 (gmt 0)

5+ Year Member



Are you sure it's not a fake bot? I've seen plenty of them so badly written that they request malformed URLs.

5:15 pm on Mar 20, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



> Back end scripts at the proxy service that translate the characters for security reasons?

That SV1 stands for IE6 with enhanced security, whatever that is. Proxy services? Not sure. The IP addresses start with 66, 71, and 76, belong to different ISPs, and are located in different states.

> Are you sure it's not a fake bot?

Would they be able to pick up ads from Google AdWords? I opted for search only, so my ads do not show on AdSense sites.
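
Whatever the cause turns out to be, the 404s themselves could in principle be handled on the server by spotting a request whose query delimiters arrived escaped and rebuilding the intended URL (for example, to issue a redirect). The following Python function is only a sketch under stated assumptions: the name repair_escaped_query is hypothetical, and it assumes that any %3D or %26 after the first %3F was meant as a delimiter, which would be wrong for a value that legitimately contains an escaped '=' or '&'.

```python
import re
from typing import Optional

_ESCAPED_QMARK = re.compile(r"%3F", re.IGNORECASE)
_ESCAPED_DELIMS = re.compile(r"%(3D|26)", re.IGNORECASE)


def repair_escaped_query(target: str) -> Optional[str]:
    """Return a repaired request target, or None if nothing looks broken.

    Hypothetical helper: restores an escaped '?' and the escaped '=' / '&'
    that follow it. Assumes those escapes were all meant as delimiters.
    """
    if "?" in target:
        return None  # already has a real query string; leave it alone
    match = _ESCAPED_QMARK.search(target)
    if match is None:
        return None  # no escaped '?', so this is some other kind of 404
    path = target[:match.start()]
    tail = target[match.end():]
    # %3D -> '=' and %26 -> '&', matched case-insensitively.
    query = _ESCAPED_DELIMS.sub(
        lambda m: "=" if m.group(1).upper() == "3D" else "&", tail
    )
    return path + "?" + query


print(repair_escaped_query("/landing%3Fvariable%3Dadwords"))  # /landing?variable=adwords
```

Logging the repaired target alongside the user agent and IP address may also help in judging whether these requests come from real visitors.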

 
