Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 44 message thread spans 2 pages.
What about your Parked domains.
And bots?
Angonasec




msg:4647407
 9:16 am on Feb 21, 2014 (gmt 0)

I've just moved our domains to a new host, with the consequent, character-building-bug-trawling-all-nighters.

Now that the dust is settling, I have the previously parked domains on my host account (sharing the main IP), so I'm more in control of each parked page's construction.

Fellow log-watchers, with a healthy attitude to bot-blocking:

Should I let the bot-tide in, or put the pristine domains behind a password?

What is +your+ preferred set-up for previously unused, parked domains, especially in regard to bots?

 

wilderness




msg:4647480
 2:51 pm on Feb 21, 2014 (gmt 0)

Angonasec,
I've an active domain (and a sub-domain on the same) that contains virtually ZERO content, and the rogue bots are rampant.
I've actually had to add denys that don't exist on my large, long-established site.
Although there's nothing to gain or lose by adding the denys on a ZERO-content site, I don't see any reason to encourage their activity.

Angonasec




msg:4647558
 5:59 pm on Feb 21, 2014 (gmt 0)

Bot control of parked domains is not something I've ever thought about, whilst they were parked with my Registrar.

Mine are unused domains and totally empty, not even an index file.

My first thought is to have a password protected pop-up.

But long term, that may appear suspicious to the search engine bots we approve of.

Hence the thread to see what you lot do :)

dstiles




msg:4647592
 9:00 pm on Feb 21, 2014 (gmt 0)

Is there a reason for "parking" the domain rather than simply not setting up IPs etc for it? I can understand if you expect to get some kind of gain from it but if not I would leave the setup blank.

lucy24




msg:4647601
 9:48 pm on Feb 21, 2014 (gmt 0)

One of the secondary uses of my test site is robot monitoring. All domains share a single htaccess that lists the numerical and simple-UA "Deny from..." directives; the more complicated lockouts are site-specific. That means some robots get into the test site where they'd be blocked from "real" sites. Every week or so I check logs and slap on some further lockouts as appropriate.

The other useful thing is to periodically check the 403 logs. Cross-check that everyone who got locked out-- especially multiples-- is also from a blocked IP, as opposed to being blocked only on referer and/or UA grounds. Things do level out. By now I hardly ever see a lockout that's based purely on referer (unwanted ones such as .mobi or .ua other than search engines); they're from Ukrainian ranges that would be blocked in their own right.
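The 403 cross-check described above could be scripted roughly as follows. This is only a sketch under assumptions: the log is in Apache combined format, and the names `BLOCKED_RANGES` and `unblocked_403_ips` plus the example ranges are hypothetical placeholders, not anyone's real block list. It lists the IPs that received a 403 without being inside a blocked range, meaning they were locked out on referer or UA grounds only.

```python
import ipaddress
import re

# Hypothetical blocked ranges (documentation addresses, not real offenders).
BLOCKED_RANGES = [
    ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.0/24")
]

# Start of a combined-format log line: IP, ident, user, [time], "request", status
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def unblocked_403_ips(lines):
    """Return sorted IPs that got a 403 but are not in any blocked range."""
    found = set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group(2) != "403":
            continue
        try:
            ip = ipaddress.ip_address(m.group(1))
        except ValueError:
            continue  # first field was a hostname or garbage
        if not any(ip in net for net in BLOCKED_RANGES):
            found.add(str(ip))
    return sorted(found)
```

Any address this returns is a candidate for a look at its range before deciding whether an IP-level block is worth adding.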

If you want to split hairs, it isn't "parked". It simply isn't used for anything publicly, and is 100% roboted-out.

Angonasec




msg:4647634
 11:44 pm on Feb 21, 2014 (gmt 0)

dstiles:
"Is there a reason for "parking" the domain rather than simply not setting up IPs etc for it? I can understand if you expect to get some kind of gain from it but if not I would leave the setup blank."

By that I take it you mean leave the Registrar nameservers blank: Is that a secure option?
You'd still be able to lock the domain to prevent theft and transfer?

Parking in order to secure them as .tld variants of the main domain. Not used for any diverting of traffic, or financial gain.

Angonasec




msg:4647635
 11:50 pm on Feb 21, 2014 (gmt 0)

Lucy: Interesting thinking. Presumably you don't intend developing the "bot monitoring" domain for any public use?

Google would of course keep records of hit-responses from the domain, and related domains on the same IP.

lucy24




msg:4647674
 4:12 am on Feb 22, 2014 (gmt 0)

"Presumably you don't intend developing the 'bot monitoring' domain for any public use?"

Every week or so it gets a human visitor from Bing search thanks to their fondness for exact matches, so I've added three or four pages just to amuse them. But other than that ... I'm probably not the right person to develop the name.

I don't think there would be any unintended consequences if I did use the site, though. Search engines can tell they're all on the same server-- but so are

:: detour to look up ::

77 other sites. And they know it's shared hosting.

Angonasec




msg:4647686
 5:14 am on Feb 22, 2014 (gmt 0)

"And they know it's shared hosting."

Indeed, and more to the point, the biggest search monsters, also know who owns each domain on each IP, hence my concern.

For example, of the half a dozen domains I own, most were previously parked at my Registrar and used their IP.

Now that I've parked them on my shared IP account, I'm wondering if I should let the monsters in, to see the house is empty, or drop a portcullis, and let them wonder what's inside.

The Registrar doubtless let the bots have free rein previously.

Once I decide to develop a parked domain, I'd build it behind a password, and only open it once it's built.

lucy24




msg:4647708
 6:04 am on Feb 22, 2014 (gmt 0)

"the biggest search monsters, also know who owns each domain on each IP"

You think? That kinda defeats the point of private registration, doesn't it? :(

I've a lurking suspicion, with no hard evidence to back it up, that robots may go away faster if they meet a 404. The 403 is just a challenge: "What don't they want me to see?"

keyplyr




msg:4647733
 9:00 am on Feb 22, 2014 (gmt 0)


"That kinda defeats the point of private registration, doesn't it?"

I think you'd be surprised how un-private private registration really is. Depending on which WHOIS clone you're looking at, some hide your personal info and some just don't.

Many shared hosting environments (example: GoDaddy, DreamHost) will charge extra for so-called "private registration" but they have no control over how that info is displayed at the hundreds of sites that display your info.

I gave up on that BS a while ago. Save your money. Just use a throw-away email address, a PO Box and "administrator" as your name.

lucy24




msg:4647745
 9:30 am on Feb 22, 2014 (gmt 0)

Huh. I don't pay extra for mine; it's just a clickbox. But you can look up my sitename and see if any public sources give my real name. This would be an unpleasant surprise, since my everyday free lookup has me down as "another happy {host/registrar} customer". Or, uhm, whatever the exact wording is.

It's obviously different if your name is amazon dot com or equivalent. Then people do have a right to know who's behind the name. But I'm just a human.

Angonasec




msg:4647817
 2:31 pm on Feb 22, 2014 (gmt 0)

"That kinda defeats the point of private registration, doesn't it?"

I'm as cynical about private registration as KeyP; but even assuming it was private, I surely would not want my Registration details to be private if the site I was launching was a serious project.

When a bod, or bot, is trying to assess the "integrity" of a domain, surely transparency sends better signals than "mind yer own business".

But, of course, for your casual bot-monitoring domains, yes private registration would be best.

Angonasec




msg:4647822
 2:46 pm on Feb 22, 2014 (gmt 0)

"I've a lurking suspicion, with no hard evidence to back it up, that robots may go away faster if they meet a 404. The 403 is just a challenge: 'What don't they want me to see?'"

That would be a pleasant surprise, if so.

And by joyful circumstance, a theory I can now test.

During my current domains move, my addled thoughts being preoccupied, I left the drawbridge down on my unused /test/ domain.

It was only live for two days, but BANG, the big boys arrived, and sucked up the lot!

I've now emptied the domain, just a blank index.html file, so I will soon be in a position to see if they do indeed stay away when bloated with 404's.

And... if the test site appears in their serps...

But of course, it was my mistake to not protect my test site with a password htaccess.

blend27




msg:4647823
 2:48 pm on Feb 22, 2014 (gmt 0)

"I'm wondering if I should let the monsters in"


4 bot fracking purposes, why not? Log it! see what comes out of it. Throw a background process that logs the info you are interested in and let it storm in.

Unless you get David Lo Pan on your tail, what do you have to lose?

Angonasec




msg:4647824
 3:01 pm on Feb 22, 2014 (gmt 0)

Ta for the suggestion B27!

I'll certainly be monitoring the traffic, being pretty sure that when the parked domains were at my Registrar's IP, there was zero bot-blocking in place.

"What do I have to lose?"

My concern is the long term "reputation" of them, and current "reputation" of the main domain they are variant .tld's of.

They all now share the same IP, as well as owner, me.

None of them are disposable domains.
All secured for the maximum allowable duration.
My instinct is to barricade them, but your suggestions are most welcome. :)

Angonasec




msg:4647834
 3:30 pm on Feb 22, 2014 (gmt 0)

Hah! I've just checked.

The test domain I accidentally exposed to the Dragons, for 2 days this week, is now appearing at number 2 in the Bing serp for its domain name. Complete with snippet of my test site text. Aggh! So I clicked it, and saw the blank index.html. Let's see if Bing drops the snippet. That's Bing bot for you.

Thankfully, not found at all in G...yet.

keyplyr




msg:4648159
 7:20 pm on Feb 22, 2014 (gmt 0)

Lucy, there's a thread at your host's forum about this. And as I said, not all WHOIS clones display the same thing.

Then there's all the hundreds of site info scraper sites that display your site's preview image, Google PR, IP and A record history, traffic stats, etc. If your domain registration is/was ever public, they'll have this info and sadly it will never go away.

dstiles




msg:4648181
 9:29 pm on Feb 22, 2014 (gmt 0)

Angonasec - I have a couple of domains that have no web IPs and no mail servers. They are registered for future use but not currently in use. Some used to be registered with "parking" companies that paid me money per hit but I haven't received anything for a long time so now I just remove everything and leave them be. Where a domain is registered for domain name anti-theft or to accommodate people who cannot spell, those domains are pointed to the real sites and 301'd through (in my case) IIS.
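The 301 handling for anti-theft and misspelling domains can be pictured with a small sketch of the logic. This is not IIS configuration. The `CANONICAL` mapping, the `redirect_for` name and the example hostnames are all hypothetical, and the plain `http://` scheme is an assumption.

```python
# Hypothetical mapping of defensive or misspelled domains to the real site.
CANONICAL = {
    "example.net": "www.example.com",
    "exmaple.com": "www.example.com",
}


def redirect_for(host, path="/"):
    """Return (301, location) for a variant domain, or None otherwise."""
    # Normalise: lower-case, drop any :port, drop a trailing dot.
    name = host.lower().split(":", 1)[0].rstrip(".")
    if name.startswith("www."):
        name = name[4:]
    target = CANONICAL.get(name)
    if target is None:
        return None  # not a variant domain; serve normally
    # Assumption: the real site is served over plain http.
    return 301, "http://" + target + path
```

Keeping the requested path in the redirect means a visitor who mistypes the domain on a deep link still lands on the matching page of the real site.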

I can't see a domain name being prejudiced just because it's not in use as long as the registration details are valid.

NOTE: .COM/NET/ORG domains are in some cases being verified, with emails being sent to the registrant. If they are not confirmed the domain WILL be suspended. Nominet has just begun a similar exercise, validating company types, registrant organisations etc. If the domain name has an invalid/out-of-date/untended email address you will obviously not get the notification email. I recommend ALL domain names be checked. Not sure about the nominet ones, but if a com/net/org is suspended the DNS will be altered to point to a verification page and the real site disconnected from the web.

The com/net/org registration process was always primitive and building paid-for "security" on top is just a money-maker. I get all sorts of spam from email addresses stolen from the arin database. I never get spam to my nominet (UK) domain registration addresses. I wonder why? :)

Angonasec




msg:4648217
 12:34 am on Feb 23, 2014 (gmt 0)

Thank you dstiles, interesting.

I have received those annual automatic official "Confirm your Registration details are correct" emails for years, and I always verify. They are sent by my Registrar, to satisfy the relevant registration authority's regulations.

If I ever get spam through the parked domain's Registrant details being public, I've not noticed more than a couple a year at most. :)

dstiles




msg:4648358
 8:24 pm on Feb 23, 2014 (gmt 0)

Those annual verification emails are not the same at all! This is a new thing. The old confirmation emails didn't matter if you ignored them. The new ones WILL suspend your domain! :(

Angonasec




msg:4648419
 3:39 am on Feb 24, 2014 (gmt 0)

Different eh? What criteria do the Authorities use to decide whether to send this type of verification request?

Thread drifting:

Consensus on best practices for a valuable Parked Domain in regard to bots.

dstiles




msg:4648701
 8:17 pm on Feb 24, 2014 (gmt 0)

I would consider this part of the "best practices for a valuable parked domain". :)

The new com/net/org validation emails are issued on change of details, according to "my" registrar. My registrar made some unexpected Tech updates and triggered a number of domain emails, some to obsolete email addresses (registered by the owners, not me). It's possible the annual verifications may follow suit but I'm guessing.

The UK ones are triggered by missing or suspect details (eg untraceable registrant, missing Limited Company number etc).

Angonasec




msg:4648934
 3:46 pm on Feb 25, 2014 (gmt 0)

I see, since I just changed the name-servers, I'll keep an eye on my domains inbox.

Thank you for emphasising it.

blend27




msg:4651900
 3:10 am on Mar 7, 2014 (gmt 0)

If I could just Add to this:

I was reviewing some of the IIS logs the other day on a hosting account (it was on a static IP for 6 months, until 2 months ago) that has logs of visitors trying to access the account by IP address.

Bots. Bots Bots! Juicy Stuff!

I mean think about it, who else?

keyplyr




msg:4651905
 5:53 am on Mar 7, 2014 (gmt 0)


@blend27 - Or it could just be the way the logs read that day. It happens :)

Angonasec




msg:4651966
 1:50 pm on Mar 7, 2014 (gmt 0)

"Juicy Stuff!"

Indeed. But personally, I don't want to use a valued parked domain in this way. One of my other .tld's will suffice for bot-spotting.

It's a couple of weeks now since I inadvertently let the bots invade my test site.

The raw access logs now reflect that the major SEs have stopped hammering the 404s. (The domain/site is totally empty.) Which is a relief. So it's time to erect the password protection, and begin development behind that barrier.
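One way to watch that tapering-off in the raw logs is to count 404s per day for each crawler. The sketch below assumes Apache combined log format. The `BOT_MARKERS` list and the `daily_404s` name are illustrative assumptions, and matching on the user-agent string alone does not prove a request really came from that search engine.

```python
import re
from collections import Counter

# Combined format: capture day/month/year, status, and the user-agent field.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[(\d{2})/([A-Z][a-z]{2})/(\d{4}):[^\]]*\] '
    r'"[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)

MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}

# Illustrative user-agent substrings; extend as needed.
BOT_MARKERS = ("Googlebot", "bingbot")


def daily_404s(lines):
    """Count 404 responses per (ISO date, bot marker)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group(4) != "404":
            continue
        day, mon, year = m.group(1), m.group(2), m.group(3)
        if mon not in MONTHS:
            continue
        date = f"{year}-{MONTHS[mon]:02d}-{day}"
        agent = m.group(5).lower()
        for marker in BOT_MARKERS:
            if marker.lower() in agent:
                counts[(date, marker)] += 1
    return counts
```

Sorting the keys of the result gives a day-by-day series per bot, which makes it easy to see whether the 404 requests really do fall away over a couple of weeks.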

Thankfully, Bing has now dropped the listing it had of the site when I used it for testing, and it never was listed in G :)

So apparently, no harm done.

dstiles




msg:4652104
 9:08 pm on Mar 7, 2014 (gmt 0)

blend27 - Quite a few, actually. Despite our server having several shared-hosting IPs we still get access attempts using only the IP, which would, given laxness on my part, return meaningless results.
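Requests that arrive with only the IP in the Host header can be flagged with a small check like the one below. It is a sketch that assumes the Host value is available to a script or log processor. The function name `is_bare_ip_host` is hypothetical.

```python
import ipaddress


def is_bare_ip_host(host_header):
    """True if a Host header value is an IP literal rather than a hostname."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by :port
        end = host.find("]")
        if end == -1:
            return False
        host = host[1:end]
    elif host.count(":") == 1:
        # hostname:port or IPv4:port, so drop the port
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False
```

On a shared-hosting IP a legitimate visitor has little reason to send a bare address, so requests flagged this way are a reasonable pool to inspect for bot activity.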

lucy24




msg:4652124
 10:30 pm on Mar 7, 2014 (gmt 0)

<continuing topic drift>
"Despite our server having several shared-hosting IPs we still get access attempts using only the IP, which would, given laxness on my part, return meaningless results."

What, if anything, would happen by default if you request an IP used by more than one host? I tried it (well, of course I did ;)) and just got the browser's loading-up bar, as if it's stuck at 1% of a large download. But not the same as the timeout I get from an https request for my own (non-secure) site. Same request in a different browser led to "site temporarily unavailable". At which point I had to go request my test site by its proper name to make sure I hadn't triggered some automated response.
</end topic drift>

keyplyr




msg:4652141
 12:02 am on Mar 8, 2014 (gmt 0)

"What, if anything, would happen by default if you request an IP used by more than one host? I tried it (well, of course I did ;)) and just got the browser's loading-up bar, as if it's stuck at 1% of a large download. But not the same as the timeout I get from an https request for my own (non-secure) site. Same request in a different browser led to 'site temporarily unavailable'. At which point I had to go request my test site by its proper name to make sure I hadn't triggered some automated response."

Yeah, but that's a browser. A bot written to retrieve all documents (or whatever) at that location would behave differently.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved