My website logs revealed an Amsterdam IP address (proxy?) that scanned all my webpages and then generated a 404 error code for about 15 webpages similar to:
"GET /https://mysite.com/contact.html HTTP/1.1" 404 972 "-" "Java/1.5.0"
Looking at my other log files, I have never seen a "GET /https: … " request before.
Is the 404 caused by the "GET /https: …" form of the request, or is something fishy going on?
Comments much appreciated.
StillTrying
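It's likely that this is just a badly written scraper or robot. The proper form for the request line of an HTTP(S) request to your server would be
"GET /contact.html HTTP/1.1"
That is, the protocol or 'scheme' and the hostname should not be included in the requested path. Because of the leading slash, your server treats "/https://mysite.com/contact.html" as a literal path on your site; no such file exists, so it returns 404.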
You could block all requests from that IP address or address range, block the Java/ user-agent, or simply let these requests return 404. They're not likely to be requests that would benefit you in any way, even if you 'corrected' them using code on your server.
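If you want to see how often this happens before deciding what to block, a small script can flag such entries in your logs. The following is only a rough sketch, assuming your log lines contain the request, status, size, referrer, and user-agent fields in the same order as the entry quoted above; the names LOG_PATTERN and classify are made up for this illustration.

```python
import re

# Matches the quoted request, status, size, referrer, and user-agent
# fields as they appear in the log entry quoted in this thread.
# (Assumption: your server uses this field order.)
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) (?P<protocol>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def classify(line):
    """Return a summary of one log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    target = match.group("target")
    agent = match.group("agent")
    return {
        "target": target,
        "status": int(match.group("status")),
        "agent": agent,
        # True when a full URL was pasted after the leading slash.
        "scheme_in_path": target.lower().startswith(("/http://", "/https://")),
        # True for the default user-agent sent by Java's HTTP client.
        "java_agent": agent.startswith("Java/"),
    }


sample = '"GET /https://mysite.com/contact.html HTTP/1.1" 404 972 "-" "Java/1.5.0"'
print(classify(sample))
```

Running classify over each line of a log file and counting the entries where scheme_in_path or java_agent is true should show whether these requests come from one address or many, which helps in choosing between an IP block and a user-agent block.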
Jim
Thanks much. I am just paranoid that this person may be trying to hijack my website, or may already have hijacked it somehow;
i.e., they hijack the site for certain IP addresses (not all, but many), then use the above "GET" request to check whether their code is blocking access to the site.
Regards from "The New Guy" -
Still Trying