Re-visiting Data URI

Unexpected data:image URI calls to web sites

         

dstiles

10:03 pm on Jul 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Last October there was a discussion here about data URIs in the form:

url(data:image/png;base64,[long Base64 string])

[webmasterworld.com...]

The discussion concluded that a specific version of Google Toolbar (GTB) was probably at fault. This far on, after many GTB version updates, the behaviour would seem to be inherent in (probably) the Google Toolbar design.

I see many hits of this form in my "trap" logs.

Having revisited the problem, I have concluded that they look like illegal access attempts, but I have no idea why they should exist.

As far as I can tell, a data URI should be generated by a web server and fed to a web browser. What I'm seeing is the reverse: the data URI arrives at the server as a request. At the time of the original problem this seemed so unlikely that I returned a 403 and killed the IP that sent it, on the basis that it was stupid if not dangerous behaviour.
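To illustrate the normal direction of travel: the page carries the image bytes inline and the browser decodes them locally, with no further request to the server. A minimal Python sketch follows. The helper names `make_data_uri` and `parse_data_uri` are made up for illustration, and the sketch only handles the base64 form quoted in the OP.

```python
import base64
from typing import Tuple


def make_data_uri(payload: bytes, mime: str = "image/png") -> str:
    """Embed raw bytes in a data: URI, as a server-side page might."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def parse_data_uri(uri: str) -> Tuple[str, bytes]:
    """Recover the MIME type and bytes, roughly as a browser does locally."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, body = uri[len("data:"):].partition(",")
    mime, _, encoding = header.partition(";")
    if encoding != "base64":
        raise ValueError("this sketch only handles base64 payloads")
    return mime, base64.b64decode(body)


# The 8-byte signature that starts every PNG file, used as a tiny payload.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

uri = make_data_uri(PNG_SIGNATURE)
mime, payload = parse_data_uri(uri)
print(uri)                        # data:image/png;base64,iVBORw0KGgo=
print(mime, payload == PNG_SIGNATURE)
```

Nothing in that round trip needs a network request, which is why a data URI turning up as a requested path in a server log is odd.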

After monitoring a known (reasonably) good source IP, I have now stopped killing the IP and have modified the trap system so that it returns a 404 with the on-page message "Google Application Error".
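For anyone wanting to try a trap along these lines, here is a rough Python sketch. It is not the actual trap system, which is not shown in this thread: the function name `trap_response` and the regular expression are assumptions, and only the 404 status and the message text come from the post above.

```python
import re
from typing import Optional, Tuple
from urllib.parse import unquote

# Assumed pattern: an inline base64 image data URI anywhere in the request
# target, e.g. "/page/url(data:image/png;base64,iVBOR...)".
DATA_URI_RE = re.compile(r"data:image/[a-z0-9.+-]+;base64,", re.IGNORECASE)


def trap_response(request_target: str) -> Optional[Tuple[int, str]]:
    """Return (status, body) for a stray data URI request, else None."""
    # Percent-decode first so "data%3Aimage..." is caught as well.
    if DATA_URI_RE.search(unquote(request_target)):
        return 404, "Google Application Error"
    return None


print(trap_response("/url(data:image/png;base64,iVBORw0KGgo=)"))
print(trap_response("/index.html"))
```

Returning `None` for everything else leaves normal requests to whatever handler would have served them anyway.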

Does anyone have any comments on this, please? Any idea why GTB (if that's what is causing it) is doing this, and why it's not every instance of GTB that causes the problem?

g1smd

11:11 pm on Jul 6, 2011 (gmt 0)

Some images within MediaWiki exhibit this behavior, not least the "Powered By..." logo at the bottom of the page.

dstiles

10:08 pm on Jul 7, 2011 (gmt 0)

Can I assume MediaWiki is a web site/server and not an OS application? If so, that would be fine. That's one of the things data URIs are designed for (as far as I understand it).

I had confirmation from a network manager today (UK education) that the source of the data URI seems to be GTB on some desktops. Either that is a programming error or Google are expecting it to do something clever: either case wouldn't surprise me, but the latter worries me.

g1smd

11:10 pm on Jul 7, 2011 (gmt 0)

MediaWiki is the PHP software that runs sites like Wikipedia. It's open source and available for anyone to start their own wiki.

lucy24

11:33 pm on Jul 7, 2011 (gmt 0)

Yup. I know those. They come from human visits, immediately before the favicon request. That is, normally the second-to-last item in a group. There's absolutely no unique and unifying feature in the UAs or IPs. (Dug up the last seven and then got tired.) The search queries also tend to be a big mess, but I'm ### if I can find any unifying feature here:

{g###}.com/search?q={search terms}&hl=en&rlz=1T4ADSA_enUS416US416&biw=1191&bih=681&num=10&lr=&ft=i&cr=&safe=images&tbs= 

{g###}.co.uk/search?sourceid=navclient&aq=0h&oq={search terms}+&ie=UTF-8&rlz=1T4ADSA_enGB413GB414&q={search terms}

{g###}.ca/search?source=ig&hl=en&rlz=1G1ACAW_ENCA401&q={search terms}&aq=f&aqi=&aql=&oq=

{g###}.co.uk/url?sa=t&source=web&cd=4&ved=0CDYQFjAD&url={my url}&rct=j&q={search terms}&ei=bTcATur0FJC7hAfohOWTDQ&usg=AFQjCNEb_8717MifBL6Wx9GxSxgNzQrHWw&sig2=fnLM5pyNo1-6FtvCCBJiSg

{g###}.com.mx/url?sa=t&source=web&cd=2&ved=0CB8QFjAB&url={my url}&rct=j&q={search terms}&ei=zrT-Tb6TKoq-sQP846zeBQ&usg=AFQjCNEhD3iq5b9MJULKvTIm49RBDP6m2Q

{g###}.co.jp/search?q={search terms}&hl=ja&rlz=1T4ADBR_jaJP229JP229&prmd=ivnsb&ei=ZUT9TbGXMIaYvAOLsdS3Aw&start=30&sa=N&biw=854&bih=538

{g###}.com.tr/url?sa=t&source=web&cd=19&ved=0CEgQFjAIOAo&url={my url}&rct=j&q={search terms}&ei=JmgMTt2nNsXCswaIoZnnDg&usg=AFQjCNEkQqYMbVWPAouHwUtdP7-tXHj4SQ


The URLs (mine) and queries are all different.

dstiles

9:24 pm on Jul 8, 2011 (gmt 0)

Are we talking about the same thing here, Lucy?

The thing I was describing is formatted exactly as I wrote in the OP. The URL entities in your examples seem to refer to search engine referers, which are rather different.
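The difference can be checked mechanically in an access log: in one case the data URI sits in the requested path, in the other a search URL sits in the referer field. A small Python sketch, assuming the server writes the common "combined" log format; the function name `classify`, the category labels and the sample lines (using example.com and a documentation IP) are all invented for illustration.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed layout: ip ident user [time] "method target protocol" status size
# "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def classify(line: str) -> str:
    """Label one log line by where the interesting URL appears."""
    match = LOG_RE.match(line)
    if match is None:
        return "unparsed"
    # Case 1: the data URI is the thing being requested.
    if "data:image/" in match.group("target").lower():
        return "data-uri-request"
    # Case 2: an ordinary request that arrived from a search results page.
    referer = urlparse(match.group("referer"))
    if referer.path in ("/search", "/url") and "q" in parse_qs(referer.query):
        return "search-referer"
    return "other"


data_hit = ('192.0.2.1 - - [08/Jul/2011:21:24:00 +0000] '
            '"GET /url(data:image/png;base64,iVBORw0KGgo=) HTTP/1.1" '
            '404 512 "-" "Mozilla/4.0"')
search_hit = ('192.0.2.1 - - [08/Jul/2011:21:24:05 +0000] '
              '"GET /page.html HTTP/1.1" 200 2048 '
              '"http://www.example.com/search?q=widgets&hl=en" "Mozilla/4.0"')

print(classify(data_hit))     # data-uri-request
print(classify(search_hit))   # search-referer
```

The request target is checked first, so a data URI request that also carries a search referer is still counted as a data URI request.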