Forum Moderators: Robert Charlton & goodroi
When I sent the original DMCA notices to Google, and referenced that my client's homepage was indexed in Google on a multitude of Appspot subdomains, they had absolutely no idea what I was talking about.
Unless you take the time to learn enough about what you're doing to know you could go by IP Range too.
I'm pointing out that it's frustrating that Google gives anything that pretends to be a proxy a pass on the rules.
And if they get smarter and copy the content from the Google cache, the Wayback Machine, etc., what then?
This is exactly what bothered me. Google, by the nature of what they do, should understand the situation *more* than anyone else. They don't. Leveraging Appspot proxies as part of a negative SEO campaign is very common. They don't seem to know, or perhaps don't care.
I followed up to each of the DMCA denials stating that Google is storing the cache on these Appspot subdomains, and I would like these removed ASAP. I'm still waiting on their response to the follow-ups.
I'd like to know if Google's DMCA people can figure out when you say something is stored on one site you really mean a "cache of what it displays" is stored on another. TIA
It appears that is what you are trying to do to this thread - turn it into an English lesson with the proper use of verbs, nouns, etc.
They do not. You do not understand the rules online, so you don't know proxies aren't "getting a pass" on them from anyone, including Google.
Lastly, again, I disagree that these "web based proxies" should be considered anything other than a normal website from Google's perspective.
The RFCs are talking about an entirely different kind of proxy.
An intermediary program which acts as both a server and a client
for the purpose of making requests on behalf of other clients.
Requests are serviced internally or by passing them on, with
possible translation, to other servers. A proxy MUST implement
both the client and server requirements of this specification. A
"transparent proxy" is a proxy that does not modify the request or
response beyond what is required for proxy authentication and
identification. A "non-transparent proxy" is a proxy that modifies
the request or response in order to provide some added service to
the user agent, such as group annotation services, media type
transformation, protocol reduction, or anonymity filtering. Except
where either transparent or non-transparent behavior is explicitly
stated, the HTTP proxy requirements apply to both types of
proxies.
This is where I keep talking about knowledge level. I'm not trying to sound harsh, but the reality is if you read the following with an open mind, you'll see the RFC is not talking about a totally different type of proxy at all.
If a proxy receives a host name which is not a fully qualified domain name, it MAY add its domain to the host name it received. If a proxy receives a fully qualified domain name, the proxy MUST NOT change the hostname.
But how do you think they're changing the hostname?
- inserting Via: headers (sec 14.45)
HTTP/1.0 200 OK
x-xss-protection: 1; mode=block
via: HTTP/1.1 GWA
p3p: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
content-type: text/html; charset=ISO-8859-1
x-frame-options: SAMEORIGIN
cache-control: max-age=3600
Vary: Accept-Encoding
Date: Mon, 06 May 2013 22:46:49 GMT
Server: Google Frontend
[source: WebmasterWorld control panel server header]
- deleting hop-by-hop headers such as Connection: (sec. 14.10).
13.5.1 End-to-end and Hop-by-hop Headers
Hop-by-hop headers, which are meaningful only for a single
transport-level connection, and are not stored by caches or
forwarded by proxies.
The following HTTP/1.1 headers are hop-by-hop headers:
- Connection
- Keep-Alive
- Proxy-Authenticate
- Proxy-Authorization
- TE
- Trailers
- Transfer-Encoding
- Upgrade
- adding Warning: headers since they've changed the entity-body (sec. 14.46)
14.46 Warning
The Warning general-header field is used to carry additional
information about the status or transformation of a message which
might not be reflected in the message. This information is typically
used to warn about a possible lack of semantic transparency from
caching operations or transformations applied to the entity body of
the message.
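For anyone who wants to see those three behaviors side by side, here is a minimal Python sketch of what a proxy following the quoted sections might do to response headers before passing them on. It is only an illustration under assumptions: the function name `proxy_response_headers`, the pseudonym `example-proxy`, and the idea of passing headers as a list of tuples are all made up for this example and say nothing about how the AppSpot proxies are actually written.

```python
# The eight hop-by-hop header names listed in RFC 2616 section 13.5.1.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def proxy_response_headers(upstream, received_protocol="1.1",
                           pseudonym="example-proxy", body_modified=False):
    """Return the headers a proxy would forward downstream.

    `upstream` is a list of (name, value) tuples from the origin response.
    `pseudonym` is a hypothetical name for this proxy, used in Via/Warning.
    """
    # Any header named in Connection is hop-by-hop for this message as well.
    listed = set()
    for name, value in upstream:
        if name.lower() == "connection":
            listed.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | listed

    # Delete hop-by-hop headers (sec. 14.10 / 13.5.1).
    forwarded = [(n, v) for n, v in upstream if n.lower() not in drop]

    # Insert a Via entry (sec. 14.45), appending to an existing one if present.
    hop = f"{received_protocol} {pseudonym}"
    for i, (n, v) in enumerate(forwarded):
        if n.lower() == "via":
            forwarded[i] = (n, f"{v}, {hop}")
            break
    else:
        forwarded.append(("Via", hop))

    # Add a Warning when the entity-body was transformed (sec. 14.46).
    if body_modified:
        forwarded.append(("Warning", f'214 {pseudonym} "Transformation applied"'))
    return forwarded


if __name__ == "__main__":
    origin = [("Server", "Apache"), ("Connection", "close"),
              ("Content-Type", "text/html; charset=utf-8")]
    for name, value in proxy_response_headers(origin, body_modified=True):
        print(f"{name}: {value}")
```

Comparing the output of something like this against what a given proxy really sends is one way to check each of the three points independently.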
So, getting back to the original question, in those scenarios "changing the url" is what they are doing to every url in the body of the page...they are changing it to point to their own domain. It's a neat little hack that I feel is a go-kart.
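To make the "changing every url in the body" idea concrete, here is a rough Python sketch of that kind of rewrite. The regex approach, the function name `rewrite_links`, and the `fetch?u=` URL scheme are assumptions chosen for the illustration; the actual proxies may do it quite differently, and a real implementation would want a proper HTML parser.

```python
import re
from urllib.parse import quote, urljoin

# Matches href="..." or src='...' attributes; deliberately simplistic.
LINK_ATTR = re.compile(r'(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_links(html, origin_base, proxy_prefix):
    """Point every href/src in `html` back at the proxy.

    `origin_base` is the URL of the page being proxied, used to resolve
    relative links. `proxy_prefix` is a hypothetical proxy endpoint that
    takes the target URL as its final query value.
    """
    def repl(match):
        attr, quote_char, url = match.group(1), match.group(2), match.group(3)
        absolute = urljoin(origin_base, url)   # resolve relative links
        proxied = proxy_prefix + quote(absolute, safe="")
        return f"{attr}={quote_char}{proxied}{quote_char}"

    return LINK_ATTR.sub(repl, html)


if __name__ == "__main__":
    page = '<a href="/about">About</a>'
    print(rewrite_links(page, "http://www.example.com/",
                        "http://proxy.example.net/fetch?u="))
```

Because every link now resolves to the proxy's own hostname, a visitor (or a crawler) following links never leaves the proxy's domain.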
- inserting Via: headers (sec 14.45)
Hop-by-hop headers, which are meaningful only for a single transport-level connection, and are not stored by caches or forwarded by proxies.
The links are the only thing I can find modified by the proxy.
A "trick", imo, would be if a person thought they were "surfing via proxy" and then they clicked a link and requested the URL on the site being proxied themselves and landed on the site they thought they were surfing via proxy.
Correct...so a proxy that's compliant doesn't forward one if it exists...in other words, it deletes it. The AppSpot proxy doesn't comply in that respect.
I suspect it's an example where an AppSpot proxy passes on an existing Via: header. The AppSpot proxies will pass one on if it exists, but they don't inject one as they should.
HTTP/1.0 200 OK
Server: Apache
Content-Type: text/html; charset=utf-8
P3P: CP='NO P3P'
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=15
Date: Tue, 07 May 2013 00:41:17 GMT
Connection: close
HTTP/1.0 200 OK
via: HTTP/1.1 GWA
vary: Accept-Encoding
p3p: CP='NO P3P'
content-type: text/html; charset=utf-8
cache-control: max-age=3600
Date: Tue, 07 May 2013 00:39:09 GMT
Server: Google Frontend
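If it helps to compare the two dumps above without eyeballing them, here is a small Python sketch that diffs them by header name. The helper names `parse_headers` and `diff_headers` are invented for this example; the header text is copied from the two responses as posted, with the first taken as the direct response and the second as the proxied one.

```python
def parse_headers(raw):
    """Parse a raw response dump into {lowercased-name: value}, skipping the status line."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(direct_raw, proxied_raw):
    """Report which header names were removed, added, or changed in value."""
    direct, proxied = parse_headers(direct_raw), parse_headers(proxied_raw)
    return {
        "removed": sorted(set(direct) - set(proxied)),
        "added": sorted(set(proxied) - set(direct)),
        "changed": sorted(k for k in set(direct) & set(proxied)
                          if direct[k] != proxied[k]),
    }


DIRECT = """HTTP/1.0 200 OK
Server: Apache
Content-Type: text/html; charset=utf-8
P3P: CP='NO P3P'
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=15
Date: Tue, 07 May 2013 00:41:17 GMT
Connection: close
"""

PROXIED = """HTTP/1.0 200 OK
via: HTTP/1.1 GWA
vary: Accept-Encoding
p3p: CP='NO P3P'
content-type: text/html; charset=utf-8
cache-control: max-age=3600
Date: Tue, 07 May 2013 00:39:09 GMT
Server: Google Frontend
"""

if __name__ == "__main__":
    for kind, names in diff_headers(DIRECT, PROXIED).items():
        print(kind, names)
```

For these two dumps it reports `connection` and `content-encoding` as removed, `via` as added, and `cache-control`, `date`, and `server` as changed.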
Honestly, rish3, I wish there were some things they would change, but in all the years I've looked, they don't make many mistakes, and they may "play in the grey" a bit, but they're usually not totally, flat out wrong when it comes to standards and protocol.
Well, rather than suspecting, you should try it out for yourself.
Then WTF are you F'ing B*tching about AppSpot for in the First Place?