I'm pretty militant about going through my web server logs and blocking organizations and net-ranges (by IP) from reaching my web server. They give themselves away by requesting nonexistent files (e.g. WordPress files), or they're just scraping my site from cloud hosting providers, etc.
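For anyone wanting to do the same kind of log review, here is a rough sketch of how the "looking for nonexistent files" check could be automated. It assumes the server writes Common/Combined Log Format lines, and the `PROBE_MARKERS` list and the `count_probes` name are my own illustrative choices, not anything standard. The sample IPs are from the reserved documentation ranges.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line and captures
# the client IP, the requested path, and the HTTP status code.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*" (\d{3})')

# Hypothetical probe paths; adjust to whatever shows up in your own logs.
PROBE_MARKERS = ("/wp-login.php", "/wp-admin", "/xmlrpc.php", "/wp-content")


def count_probes(lines):
    """Count 404 responses for WordPress-style paths, grouped by client IP."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        ip, path, status = m.groups()
        if status == "404" and any(marker in path for marker in PROBE_MARKERS):
            hits[ip] += 1
    return hits


sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/8.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /xmlrpc.php HTTP/1.1" 404 153 "-" "curl/8.0"',
    '198.51.100.4 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
print(count_probes(sample).most_common())
```

The IPs that come out on top can then be looked up (whois) to find the owning organization and net-range before deciding what to block.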
When it comes to Google, I block IP ranges that come back as "google user content". When it comes to Amazon, I block all of Amazon unless I need to keep a small range open for something (like Let's Encrypt).
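To make the "block the whole range except a small exception" policy concrete, here is a minimal sketch using Python's standard `ipaddress` module. The ranges are placeholders from the reserved documentation blocks, not real Google or Amazon ranges, and `BLOCKED`, `ALLOWED`, and `is_blocked` are names I made up for illustration. A real setup would normally do this in the firewall or web server config rather than in a script.

```python
import ipaddress

# Placeholder ranges from the reserved documentation address blocks,
# NOT real Google or Amazon ranges.
BLOCKED = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

# A small carve-out inside a blocked range that must stay reachable.
ALLOWED = [ipaddress.ip_network("203.0.113.64/28")]


def is_blocked(ip_text):
    """Return True if the address is in a blocked range and not in a carve-out."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in ALLOWED):
        return False  # exceptions win over the broader block
    return any(ip in net for net in BLOCKED)


for addr in ("203.0.113.10", "203.0.113.70", "192.0.2.1"):
    print(addr, is_blocked(addr))
```

Checking the allow list first is what lets a small range stay open inside an otherwise blocked provider.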
I don't own a cell phone, and hence I don't own or use devices like Alexa and Google Assistant (I guess Apple has something similar?). So I have no idea whether these devices intercept all web traffic between users and websites. Maybe some of the traffic I'm blocking is people trying to reach my site through Alexa or Assistant, and they don't know how to, or can't, bypass the device and hit my site directly? I'm just speculating here, because I don't know how these devices operate at the home network / routing level. Otherwise, are all the "google user content" hits and AWS hits I see (or rather, that I block) just junk?