Home / Forums Index / Code, Content, and Presentation / HTML
Forum Library, Charter, Moderators: incrediBILL

HTML Forum

    
Escaped characters in URLs
smallcompany




msg:3604881
 4:26 am on Mar 19, 2008 (gmt 0)

I occasionally get 404 notification emails when agents try to load a page whose URL carried something like

?variable=

which for some reason got translated into

%3Fvariable%3D

Why does this happen in the first place, and what determines whether it is supposed to happen or not?

This concerns my Google AdWords campaigns, where I use such variables so I know where the traffic originated.

Besides “WHY”, I am also trying to figure out whether those agents are live people who really wanted to get to my site through an AdWords ad, or something else.

The user agent always seems to be the same; the IP addresses are different:

User Agent = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

I see that SV1 stands for "Internet Explorer 6 with enhanced security features (Windows XP SP2 and Windows Server 2003 only)."

Would that SV1 matter?

Thanks

 

MarkFilipak




msg:3605076
 9:46 am on Mar 19, 2008 (gmt 0)

URLs are automatically escaped in headers by the browser. Example:
www.google.com/search?q=godaddy&ie=utf-8&oe=utf-8
becomes
www.google.com/search%3Fq%3Dgodaddy%26ie%3Dutf-8%26oe%3Dutf-8

> live people
Probably.

> Would that SV1 matter?
Probably not.

smallcompany




msg:3605632
 6:55 pm on Mar 19, 2008 (gmt 0)

It still happens only with the SV1 user agent, no other.

penders




msg:3605914
 12:38 am on Mar 20, 2008 (gmt 0)

> URLs are automatically escaped in headers by the browser. Example:
> www.google.com/search?q=godaddy&ie=utf-8&oe=utf-8
> becomes
> www.google.com/search%3Fq%3Dgodaddy%26ie%3Dutf-8%26oe%3Dutf-8

IMHO these characters should only be escaped when they appear as literal data in the URL path or in a value passed in the querystring. '?', '=' and '&' have special meaning (the start of the querystring, the separator inside a name=value pair, and the delimiter between pairs). If they are escaped they lose their special meaning and become part of the URL path - which is the problem smallcompany is having: the URLs are no longer valid.
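To make the point above concrete, here is a small sketch using Python's standard `urllib.parse`. The host `www.example.com` and the page name `page.html` are made up for illustration; only the `?variable=` pattern comes from the thread. It shows how a URL parser sees the two forms, under the assumption that the server parses request URLs in a similar way.

```python
from urllib.parse import urlsplit

# Hypothetical landing page URL with an intact querystring.
intact = urlsplit("http://www.example.com/page.html?variable=adwords")

# The same URL after '?' and '=' have been percent-escaped.
escaped = urlsplit("http://www.example.com/page.html%3Fvariable%3Dadwords")

# With real delimiters, the parser separates path and query.
print(intact.path)    # /page.html
print(intact.query)   # variable=adwords

# With escaped delimiters, everything is treated as one long path
# and there is no querystring at all.
print(escaped.path)   # /page.html%3Fvariable%3Dadwords
print(escaped.query)  # (empty string)
```

A server that looks for a file named after the second path will normally not find one, which would be consistent with the 404s described in the first post.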

MarkFilipak




msg:3605963
 1:58 am on Mar 20, 2008 (gmt 0)

> IMHO characters should only be escaped

You misunderstand. The escaping happens automatically within the browser's HTTP client to make strings URL-friendly - the URL must be a continuous unbroken string, without spaces and with limited punctuation. There's an RFC that covers this, but its number escapes me at the moment.
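As an illustration of the kind of escaping being discussed, here is a sketch using Python's standard `urllib.parse` rather than any browser's HTTP client, so it is only an analogy. The search value and the `www.example.com` host are invented for the example. It shows a library percent-encoding spaces and punctuation inside a value while writing the delimiters that join the components as-is.

```python
from urllib.parse import quote, urlencode

# A made-up value containing spaces and an ampersand.
value = "small company & sons"

# Escaping the value on its own: spaces and '&' are percent-encoded.
encoded_value = quote(value, safe="")
print(encoded_value)  # small%20company%20%26%20sons

# Building a whole querystring: urlencode escapes the value
# (spaces become '+') but emits the '=' between name and value itself.
url = "http://www.example.com/search?" + urlencode({"q": value})
print(url)  # http://www.example.com/search?q=small+company+%26+sons
```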

smallcompany




msg:3606151
 7:38 am on Mar 20, 2008 (gmt 0)

I found that RFC (an old one from the '90s), and all of that stands.

What is not OK is the fact that all other user agents are working just fine.

When you browse and pay attention to the full URLs in the address bar, you will see

?, =, and &

not their “translations”.

I am just trying to get to the bottom of why this particular user agent gets those characters translated on the first click on an ad in Google AdWords.

… while all the rest don’t.

If this were true for all user agents, nobody would be able to browse the web today.

I just checked how it looks in the HTTP headers (LiveHTTPHeaders in Firefox), and those characters did not get translated.

The only time I get them translated is when I use a service like Profixy.

What am I missing here?
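Whatever the cause turns out to be, requests like these can at least be recognised on the server side. The following is a minimal sketch in Python, not a description of any particular server setup: the function name `recover_target` and the path `/landing.html` are hypothetical, and it assumes the only damage is a single round of percent-escaping applied to the `?` that should have started the querystring.

```python
from urllib.parse import unquote, urlsplit

def recover_target(request_path):
    """Return (path, query) if the request looks over-escaped, else None.

    Hypothetical helper: a request is treated as over-escaped when it has
    no real querystring but its path contains an escaped '?' (%3F).
    """
    parts = urlsplit(request_path)
    if parts.query or "%3f" not in parts.path.lower():
        return None
    # Decode exactly once, then split again on the restored delimiters.
    fixed = urlsplit(unquote(parts.path))
    return fixed.path, fixed.query

# An over-escaped request is mapped back to the intended page and query.
print(recover_target("/landing.html%3Fvariable%3Dadwords"))
# ('/landing.html', 'variable=adwords')

# A normal request is left alone.
print(recover_target("/landing.html?variable=adwords"))
# None
```

A 404 handler could log the recovered pair, or redirect to it, so that the campaign variable is not lost for these visitors.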

tedster




msg:3606177
 8:04 am on Mar 20, 2008 (gmt 0)

Back end scripts at the proxy service that translate the characters for security reasons?

Achernar




msg:3606273
 11:57 am on Mar 20, 2008 (gmt 0)

Are you sure it's not a fake bot? I've seen plenty of them so badly written that they request malformed URLs.

smallcompany




msg:3606530
 5:15 pm on Mar 20, 2008 (gmt 0)

> Back end scripts at the proxy service that translate the characters for security reasons?

That SV1 stands for IE6 with enhanced security, whatever that is. Proxy services? Not sure. IP addresses start with 66, 71, 76, and they belong to different ISPs, and reside in different states.

> Are you sure it's not a fake bot?

Would they be able to pick ads from Google AdWords? I opt for search only so my ads do not show on AdSense sites.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved