moe

New Google spawn?

         

Pfui

12:39 am on Nov 3, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



User-Agent: moe
From MYIP.MS:
IP Owner: Google Inc
Owner Website: sites.google.com
Owner IP Range: 34.4.5.0 - 34.63.255.255 (3,930,880 ip)
Comment: *** The IP addresses under this Org-ID are in use by Google Cloud customers ***
- - - - -

In the past two days I received 755 hits from 8 G IPs using the UA “moe”. Today, three more G IPs but less than 100 hits total (thus far). The files sought are all images; 404s do not dissuade, neither do 403s. All hits are by IP, not Host.
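For anyone wanting to reproduce this kind of tally from raw logs, here is a minimal Python sketch. It assumes the Apache "combined" log format (the post does not say which LogFormat is in use), and the name `hits_by_client` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern if your
# LogFormat differs. User-Agents containing escaped quotes are not handled.
COMBINED = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def hits_by_client(lines, agent="moe"):
    """Count requests per client address whose User-Agent is exactly `agent`."""
    counts = Counter()
    for line in lines:
        match = COMBINED.match(line.strip())
        # Exact comparison, so a longer UA that merely contains "moe" is ignored.
        if match and match.group("agent") == agent:
            counts[match.group("client")] += 1
    return counts
```

Feeding it an open log file, for example `hits_by_client(open(path))` with your own path, gives a per-address count that can be sorted with `.most_common()`.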

IPs (partial listing):

34.27.56.171
34.27.84.139
34.27.96.5
34.27.197.135
34.27.241.169
34.28.38.55
34.28.72.182
34.28.170.147
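As a sanity check on the MYIP.MS figures, the reported range can be broken into CIDR blocks with Python's standard `ipaddress` module and the listed addresses tested against it. This is only a sketch; the helper name `covering_block` is hypothetical, and the range and addresses are copied from the post above.

```python
import ipaddress

# Range as reported by MYIP.MS in the post above.
first = ipaddress.ip_address("34.4.5.0")
last = ipaddress.ip_address("34.63.255.255")

# Break the arbitrary start-end range into aligned CIDR blocks.
blocks = list(ipaddress.summarize_address_range(first, last))

# Addresses from the partial listing.
seen = [
    "34.27.56.171", "34.27.84.139", "34.27.96.5", "34.27.197.135",
    "34.27.241.169", "34.28.38.55", "34.28.72.182", "34.28.170.147",
]

def covering_block(ip, networks):
    """Return the first network containing `ip`, or None if none does."""
    addr = ipaddress.ip_address(ip)
    for net in networks:
        if addr in net:
            return net
    return None

# Total number of addresses in the range, for comparison with the "ip" count.
total = sum(net.num_addresses for net in blocks)

for ip in seen:
    print(ip, "->", covering_block(ip, blocks))
print("addresses in range:", total)
```

Membership in this range only confirms the addresses sit in Google Cloud customer space, as the MYIP.MS comment says; it says nothing about who is running the bot.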

Here's all I could find about "moe"/MOE:

[github.com...] › google › moe
GitHub - google/MOE: Make Opensource Easy - tools for synchronizing ...
MOE is a system for synchronizing, translating, and scrubbing source code repositories. Often, a project needs to exist in two forms, typically because it is released in open-source, which may use a different build system, only be a subset of the wider project, etc. Maintaining code in two repositories is burdensome. MOE allows users to: ...

[opensource.google...] › documentation › reference › thirdparty › moe
MOE Support | Google Open Source
But some google3 projects define in blaze an open source subset of their code, via a tool like go/moe or an export script. These projects/teams would like to "blaze {build,test}" their open source export from CL to CL, to ensure that their changes don't break a to-be-public ant or maven or make (or whatever...) build.

Thoughts?

(Hi, all! Long time no see:)

tangor

1:40 am on Nov 3, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pfui! Long time lurk, happy to see!

Not found in my logs, but I suspect will happen soon. Bots tend to extend tentacles by days and weeks... Whew!

A lot of the 34.2* range is living in my .htaccess already...

Sgt_Kickaxe

4:22 pm on Nov 5, 2022 (gmt 0)



Unless your images are unique and important to your site, this is all the more reason to add the following to your htaccess:
# Requires mod_headers. Asks crawlers not to index the matched image files.
<FilesMatch "\.(jpg|jpeg|gif|png|webp)$">
Header append X-Robots-Tag "noindex"
</FilesMatch>


The ratio of spam to converting visitors coming from image search is not good on many sites. You can see if that's true on your site by going to the Search Console "pages" report and comparing WEB and IMAGE performance tabs. While blocking image search sounds draconian there is also the issue of an image in the images tab of SERPs blocking an actual web result entry. I can't believe that's still the case a decade later...

Important: If you take the draconian route you should still observe image SEO best practices with alt tags, image compression etc. Noindex means keep it out of your index, not "don't evaluate it".

lucy24

5:49 pm on Nov 5, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Annie? Is that you?

Is “moe” literally the entire UA string?

I don’t think the fact that G### owns the IP counts for much. There are (1) search ranges, (2) ranges for googloid functions such as site-verification or translate, and (3) uh, well, G's answer to AWS. 34.anything is definitely the third. (In fact I've been remiss in updating my local IP records, and hadn't realized 34.0/10 is no longer Halliburton. Oops.)

"404s do not dissuade, neither do 403s."
I can’t remember when I last saw a robot that, meeting an immediate 403--let alone 404--abandoned its mission. They have a shopping list and stick to it. Being robots, they are not injured by doors repeatedly slamming in their face.

"All hits are by IP, not Host."
You mean, they arrived at your site without a hostname? Surely at this late date it is safe to block hostless requests categorically, unless you have a very unusual target audience.

Pfui

7:11 pm on Nov 5, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



- Sgt., thanks for the info. FWIW, I manage one primary site with its own old Apache installation, so no Search Console or similar. And I've been using ALT tags and blocking images since, oh, 1995:)

- Lucy, m'dear, oui, c'est moi! And it's SO nice to see my old bot-hunting buds around here still, all still watching, tracking, reporting! On to moe, etc. ...

When it appeared -- and yep, t'was just plain moe -- I thought it was worth a mention here. And yep, no hostname, despite most G-related Hosts appearing as-is. We're set up to do a double or hostname lookup (I forget the exact name of the directive). The minuscule hit in speed more than makes up for my not having to look up too many IPs when reading logs.
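The "double" lookup described here sounds like forward-confirmed reverse DNS: look up the PTR name for the address, then resolve that name and check it points back at the same address. In Apache that is probably `HostnameLookups Double`, though that is a guess, since the post does not name the directive. A minimal Python sketch of the same idea follows; the function name `forward_confirmed` and the injectable resolver parameters are assumptions made so the logic can be exercised without live DNS.

```python
import socket

def forward_confirmed(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return the PTR hostname for `ip` if it resolves back to `ip`, else None.

    Note: socket.gethostbyname_ex only handles IPv4, so this sketch is
    IPv4-only when used with the default resolvers.
    """
    try:
        hostname, _aliases, _addrs = reverse(ip)          # reverse (PTR) lookup
        _name, _aliases, addresses = forward(hostname)    # forward (A) lookup
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname if ip in addresses else None
```

An address that fails either step simply stays a bare IP, which may be part of why some hits show up without a Host name in logs.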

The curious thing about moe, aside from its anonymous ID, is that it hit as if scanning our copy-pasted content somewhere. Which kind of makes sense if someone's put a chunk on a G-based site or in a post.

Lastly, regarding blocking hostless hits, that would tax my dusty ModRewrite skills fer shure:) But also, large numbers of plain-IP hits come from bona fide Host-named addresses that for some reason don't double-lookup on the fly. Maybe it's their coding, more probably it's mine.

lucy24

7:35 pm on Nov 5, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"hostless hits"

I may have misunderstood. I thought you meant the requests came in to your site using only your IP rather than your site's name. Those you probably do block already.
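To make the distinction concrete: the case meant here is a request whose Host header carries the server's bare IP address instead of the site's name. A small Python sketch of that test is below; the function name `host_is_ip_literal` is hypothetical, and how the header value is obtained is left out.

```python
import ipaddress

def host_is_ip_literal(host_header):
    """True if the Host header names a bare IP address rather than a hostname."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by :port.
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Hostname or IPv4 address followed by :port.
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False
```

A missing or empty Host header returns False here, so it would need its own check if those requests should be refused too.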

Incidentally, while looking up 34 I found that the letsencrypt verification thingie uses one chunk of 34 (mysteriously only for two of my sites, though all live on the same server) as part of their wide range of seemingly random IPs. So that's a hole that some sites may need to poke.

Pfui

11:47 pm on Nov 9, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Anyone else seeing "moe" yet? Hits have been minimal in recent days, fewer than 100, until today, when three Hosts hit more heavily over about an hour. FWIW:

226.160.27.34.bc.googleusercontent.com (203 hits)
107.11.27.34.bc.googleusercontent.com (53 hits)
36.155.28.34.bc.googleusercontent.com (3 hits)

Am curious -- what do y'all think about my outright killing 34.27.0.0/16 and 34.28.0.0/16 at the server level via iptables? Too extreme, or what the heck?
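Before committing to a block, it may help to confirm that the two /16s actually cover the hosts listed above. The Python sketch below assumes the `bc.googleusercontent.com` names encode the address with its octets reversed (an assumption based on how these names look, not something confirmed in the thread); `ip_from_hostname` and `would_be_blocked` are made-up names.

```python
import ipaddress
import re

# Assumption: names look like d.c.b.a.bc.googleusercontent.com for address a.b.c.d.
PTR_STYLE = re.compile(
    r"^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.bc\.googleusercontent\.com$"
)

# The two ranges proposed for blocking.
candidates = [
    ipaddress.ip_network("34.27.0.0/16"),
    ipaddress.ip_network("34.28.0.0/16"),
]

hosts = [
    "226.160.27.34.bc.googleusercontent.com",
    "107.11.27.34.bc.googleusercontent.com",
    "36.155.28.34.bc.googleusercontent.com",
]

def ip_from_hostname(hostname):
    """Recover the IPv4 address encoded in the hostname, or None."""
    match = PTR_STYLE.match(hostname)
    if not match:
        return None
    try:
        return ipaddress.ip_address(".".join(reversed(match.groups())))
    except ValueError:
        return None  # an octet was out of range

def would_be_blocked(hostname):
    """True if the decoded address falls inside one of the candidate ranges."""
    addr = ip_from_hostname(hostname)
    return addr is not None and any(addr in net for net in candidates)

for name in hosts:
    print(name, "->", ip_from_hostname(name), would_be_blocked(name))
```

Each /16 holds 65,536 addresses, all of them Google Cloud customer space according to the MYIP.MS comment, so the block would also shut out any other tenant in those ranges.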

SumGuy

12:36 am on Nov 18, 2022 (gmt 0)

5+ Year Member Top Contributors Of The Month



The CIDRs 34.27.0.0/16 and 34.28.0.0/16 are somewhat interesting. Unlike many many many google, MSFT, Amazon /16 CIDRs that I am blocking, those 2 I am not. No IP in those subnets has ever cropped up and caused malarkey on any in-bound port. Not the web ports (80/443), not SMTP and no other ports. But interestingly, just a few days ago on Nov 14, a request for a PDF file on my site came out of the blue from 34.28.76.144. The user-agent was:

Python/3.9 aiohttp/3.8.1

So for now I will keep those CIDRs open. If I were getting the hits that you are from those IPs, yes, I would block them instantly without giving it a second thought.