Among the various development vectors of search cum internet information retrieval, there is one that has quite fascinating possibilities as well as quite dire potential consequences for webdevs. The chronic angst as Google adds variations of ‘answers’ is nothing compared to what is, possibly, just barely over the horizon.
I’ll use two illustrations, from different categories, of what I’m talking about:
1. Current news, e.g. Hurricane Florence. Currently, if I want to know what’s happening I may visit one or another news site, check Google News, etc. Instead, however, I can send out a number of bots, scrape any number of local news and news agency sites, scrape bounded data, and use an algo, with or without ML, to create a custom news result, from which further specific data can be targeted and specific interests followed.
2. Reviews, e.g. for Widget model FooBar19. Currently, I can do a search and read the results one after another. Instead, however, I can send out a number of bots to scrape reviews of the target item and, again using an algo with or without ML, create combined results designed to order.
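To make the ‘combine’ step in the reviews case a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not a description of any particular tool: it assumes the scraping has already happened and produced (source, rating, text) tuples, and the function name `combine_reviews`, the stopword list, and the sample data are all hypothetical. The ‘algo’ is plain counting with no ML.

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical stopword list, just enough for the sample below.
STOPWORDS = {"the", "and", "but", "this", "for", "was", "after", "with"}

def combine_reviews(reviews, top_n=3):
    """Merge already-scraped (source, rating, text) tuples into one summary."""
    seen = set()
    unique = []
    for source, rating, text in reviews:
        # Normalise the text so syndicated copies of one review count once.
        key = re.sub(r"\W+", " ", text.lower()).strip()
        if key in seen:
            continue
        seen.add(key)
        unique.append((source, rating, text))

    if not unique:
        return {"count": 0, "mean_rating": None, "top_terms": []}

    # Count the words that recur across the distinct reviews.
    words = Counter()
    for _, _, text in unique:
        words.update(
            w for w in re.findall(r"[a-z']+", text.lower())
            if len(w) > 2 and w not in STOPWORDS
        )

    return {
        "count": len(unique),
        "mean_rating": round(mean(r for _, r, _ in unique), 2),
        "top_terms": [w for w, _ in words.most_common(top_n)],
    }

# Made-up sample data standing in for scraped results.
sample = [
    ("site-a.example", 4, "Battery life is great, battery charges fast."),
    ("site-b.example", 2, "Battery died after a month."),
    ("site-c.example", 4, "Battery life is great, battery charges fast!"),
]
summary = combine_reviews(sample)
print(summary)
```

The third sample entry is a near-verbatim copy of the first, so it is dropped before anything is counted. The same shape would work for the news case by swapping ratings for timestamps.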
I’ve actually done the above. And others have done much more. It puts control and context in the hands of the user. And it is all automated. No human is involved beyond designating the target and parameters, pressing ‘go’, then reading the results and redefining if desired.
As a hobbyist hacker it is heaven. As a webdev looking for revenue from site visitors it is like looking into hell.
As a side issue, I’ve been testing just how blatant a user-agent and/or behavior it takes to get blocked. Incredibly, many sites, even enterprise ones, are pretty much wide open. And no site, even my own with all my years of building defences, catches all the deliberately ‘human’ bots.
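For anyone wondering why the blatant ones are the easy part, here is a rough Python sketch of the simplest kind of user-agent check a site might run. The token list and the function name `is_blatant_bot` are hypothetical, and real defences look at behavior as well as headers. The only point is that a string match catches bots that announce themselves and nothing else.

```python
# Hypothetical blocklist; a real site would maintain its own.
BLATANT_TOKENS = ("bot", "crawler", "spider", "scrapy",
                  "python-requests", "curl", "wget")

def is_blatant_bot(user_agent):
    """Return True if the User-Agent header openly looks automated."""
    if not user_agent or not user_agent.strip():
        return True  # a missing or blank UA is treated as automated
    ua = user_agent.lower()
    return any(token in ua for token in BLATANT_TOKENS)

honest = "python-requests/2.31.0"
disguised = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
             "Gecko/20100101 Firefox/115.0")

print(is_blatant_bot(honest))     # the self-declared client is flagged
print(is_blatant_bot(disguised))  # the browser-like string is not
```

A bot that sends the second string passes this check untouched, which is why header checks alone say nothing about the deliberately ‘human’ ones.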
Also, depending on how it is done, it is NOT illegal (in most/all jurisdictions), nor does it necessarily infringe anyone’s copyright. Think of all the chronic angst as Google shifts from search to answer. Now imagine that Google et al., the middlemen, are rendered moot and visitors act as their own search operators. Instead of Googlebot it’s everyone’s bots.
Where’s the beef?