my htaccess file contains no blocks whatsoever
Holy ###. NO blocks? Welcome to mod_authz-thingummy. (Its exact name depends on which Apache version you've got. In 1.3-- which I devoutly hope you haven't got-- it was mod_access instead.)
Don't bother about importing other people's lists. Make your own. There are two prongs:
#1 The raw
Deny from 184.108.40.206
directive, using IP addresses in CIDR format. At first you will spend a lot of time counting on your fingers; after a while you'll get it internalized so when you see
220.127.116.11 - 18.104.22.168
you instantly translate it into the matching CIDR block.
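Here's a rough sketch of the finger-counting. The addresses are reserved documentation ranges (RFC 5737) that I picked for illustration; they are not from anyone's real block list. The rule of thumb: if the range holds 2^n addresses and starts on a multiple of its own size, the suffix is 32 minus n.

```apache
# Illustrative only: reserved documentation addresses, not real offenders.

# 192.0.2.0 - 192.0.2.255         256 addresses = 2^8, so 32 - 8 = /24
Deny from 192.0.2.0/24

# 198.51.100.0 - 198.51.100.127   128 addresses = 2^7, so 32 - 7 = /25
Deny from 198.51.100.0/25

# 203.0.113.64 - 203.0.113.95     32 addresses = 2^5, so 32 - 5 = /27
Deny from 203.0.113.64/27

# 203.0.113.96 - 203.0.113.159    64 addresses, but 96 is not a multiple
# of 64, so the range has to be split into two /27 blocks
Deny from 203.0.113.96/27
Deny from 203.0.113.128/27
```

The last case is the one that trips people up: a range of the right size still needs two lines if it doesn't start on a clean boundary.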
#2 User-agent blocks using mod_setenvif, which always executes before mod_authz-whatever. You can look at all kinds of aspects of the request, but the most useful shortcut is BrowserMatch which means "look for this RegEx in the user-agent string":
BrowserMatch ^-?$ keep_out
BrowserMatch Ahrefs keep_out
BrowserMatch "America Online Browser" keep_out
BrowserMatch AppEngine keep_out
where quotation marks are used to preserve literal spaces, and the first thing on the alphabetical list is the null user-agent. (Technically "-" is if they don't send the User-Agent header at all, while "" [nothing] is if the header is empty. Hedge your bets and cover both.) The variable called "keep_out" doesn't mean anything; give it any name you want and then proceed to
Deny from env=keep_out
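Put together, the pieces might look something like this. It's a sketch, not a drop-in file: the Referer line and its hostname are made up to show SetEnvIf looking at something other than the user-agent, and the IfModule wrappers are my assumption about how to keep one file working on both the old (Order/Allow/Deny) and the 2.4 (Require) syntax.

```apache
# Flag unwanted requests. "keep_out" is an arbitrary variable name.
BrowserMatch ^-?$ keep_out
BrowserMatch Ahrefs keep_out

# Hypothetical example: flag by Referer instead of user-agent
SetEnvIfNoCase Referer "spam-example\.invalid" keep_out

# Apache 2.4: mod_authz_core is present, use Require
<IfModule mod_authz_core.c>
    <RequireAll>
        Require all granted
        Require not env keep_out
    </RequireAll>
</IfModule>

# Apache 2.2 and earlier: classic Order/Allow/Deny
<IfModule !mod_authz_core.c>
    Order Allow,Deny
    Allow from all
    Deny from env=keep_out
</IfModule>
```

With Order Allow,Deny the Deny lines are evaluated last, so anything flagged keep_out gets a 403 even though "Allow from all" matched first.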
If you want to start building up Deny lists, go next door to the Search Engine Spiders and User-Agent forum (SSID). There's always a running thread on server farms. Some people also deny whole countries. You'll also get a sales pitch on whitelisting (by user-agent, not IP). Personally I think this is only appropriate for huge sites that don't mind locking out the occasional human.
You can also use mod_rewrite for access control, but save it for the more complicated actions, especially the ones that are specific to your site.
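For a sense of what "more complicated" means, here is a minimal sketch of a rule that combines two conditions, which plain Deny can't do. The robot name "ExampleBot" and the /private/ directory are hypothetical; substitute whatever is specific to your site.

```apache
RewriteEngine On

# Hypothetical: refuse this one robot, but only inside /private/
RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
RewriteRule ^private/ - [F]
```

The [F] flag returns a 403, the same response the Deny directives produce, so it shows up the same way in your logs.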
Do I just keep an eye on my cPanel raw access logs
I'm not sure how you fit "cPanel" and "raw" into the same sentence. Raw means raw. If you've never looked, you will first have to find where your host keeps the logs. They may be aliased from your site's physical directory (where you go to upload stuff) or you may have to follow a different path, possibly using a different password. And you will almost certainly have to change the default number of days that they keep raw logs. A raw log is simply a text file; any text editor will open it.
Analytics programs like GA or Piwik are good for tracking real human visitors. Robots and lockouts can only be tracked in raw logs. As you start building up your Deny lists you'll see the 403 responses start accumulating.