
Forum Moderators: Ocean10000 & incrediBILL & keyplyr

humans.txt file

We Are Humans, Not Machines!

     
1:59 am on Oct 25, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


As the robots.txt [robotstxt.org] file is to robots, the humans.txt [humanstxt.org] file is to humans.

The humans.txt file basically identifies the humans responsible for the website & its content: author(s), staff, address, contact info, software & platforms being used, etc. This is a loose protocol, meaning you can put anything you like in this file.

I use one and it occasionally gets requested by bots & humans alike... called from the HEAD:
<link rel="author" href="humans.txt">
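For anyone curious how a crawler might follow that link, here is a minimal sketch in Python (standard library only). The `AuthorLinkParser` class, the `author_links` function and the example.com URLs are made up for illustration. It also shows that a relative `href="humans.txt"` resolves against the directory of the page it appears on, so pages below the site root would need a root-relative or absolute URL to point at the same file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AuthorLinkParser(HTMLParser):
    """Collects href values of <link rel="author"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens
        rels = (attr.get("rel") or "").lower().split()
        if "author" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def author_links(page_url, html):
    """Return absolute URLs for every rel="author" link found in html."""
    parser = AuthorLinkParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.hrefs]


# Illustrative page markup and URLs, not taken from any real site
html = '<html><head><link rel="author" href="humans.txt"></head></html>'
print(author_links("https://example.com/", html))
print(author_links("https://example.com/blog/post.html", html))
```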
2:46 am on Oct 25, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 166
votes: 12


I am unsure that I buy into the purpose and benefits of the humans.txt file. Who would use such information? Please forgive my sceptical nature.
2:51 am on Oct 25, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


@TorontoBoy - Did you read the documentation? Have you done a few web searches for more info? Beware of condemnation prior to investigation :)

I'm not saying you need this file, or that there is any benefit. This is more of an FYI post.

The WWW is for humans, yet most of the traffic is not human [webmasterworld.com]. Robots control what we see, what we buy, what we read.

This file credits the humans.
12:30 pm on Oct 25, 2017 (gmt 0)

Junior Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 166
votes: 12


I actually did read all the docs on humanstxt.org. Maybe as I age I am getting more ornery. We live in a world of bots, that is for sure. A human browsing normally will never see the humans.txt file, since browsers do not display it by default. It sounds like a beacon of hope sent out in a sea of bots, hidden where only a bot will look: altruistic, but not so effective. Maybe in the future displaying it will become standard browser behaviour.

Here is Google's:
Google is built by a large team of engineers, designers, researchers, robots, and others in many different sites across the globe. It is updated continuously, and built with more tools and technologies than we can shake a stick at. If you'd like to help us out, see google.com/careers.
1:30 pm on Oct 25, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11080
votes: 106


Here is Google's

404 on microsoft, apple, yahoo, bing, w3.org, amazon, ...

there are a few that took a page from g's playbook:
[nytimes.com...]
[gov.uk...]
6:31 pm on Oct 25, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


Google's humans.txt [google.com]

There are many versions of a humans.txt. As I said, it's a loose protocol so it can be pretty much anything you like.

I use the format laid out by humanstxt.org & others.
/* IMPRESSUM */
[stuff here]

/* AUTHOR */
[stuff here]

/* SITE */
[stuff here]

/* COMPANY */
[stuff here]
By using this format, the data can be used to further validate the authorship/ownership of the digital property (your website), especially when linked from the HEAD section of your documents.
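Because the layout is just comment-style headers followed by free text, it is easy to read by machine. A rough sketch in Python, assuming each section starts with a `/* NAME */` line: the `parse_humans_txt` function and the sample contents are invented for illustration, and since the protocol is loose, real files may not follow this shape at all.

```python
import re

# Matches a header line such as "/* AUTHOR */" (spacing inside is optional)
SECTION_RE = re.compile(r"^/\*\s*(.+?)\s*\*/$")


def parse_humans_txt(text):
    """Split humans.txt text into {section name: [non-empty lines]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1).upper()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
        # lines before the first header are ignored in this sketch
    return sections


# Hypothetical file contents, for illustration only
sample = """\
/* AUTHOR */
Name: Jane Example
Contact: jane [at] example.com

/* SITE */
Standards: HTML5, CSS3
"""

print(parse_humans_txt(sample))
```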
5:39 pm on Oct 26, 2017 (gmt 0)

Junior Member from DE 

10+ Year Member

joined:June 25, 2005
posts:182
votes: 1


As the robots.txt [robotstxt.org] file is to robots, the humans.txt [humanstxt.org] file is to humans.
Wrong.
robots.txt contains directives and is intended for bots.
humans.txt contains my personal data and is intended for scrapers and harvesters.

humans.txt would be cool if I could control the access to my website with it as with robots.txt.
Name: John Doe
Disallow: /
7:31 pm on Oct 26, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


Wrong... humans.txt would be cool if I could control the access to my website with it as with robots.txt.
That's not the function of humans.txt.

As for scrapers, here are a few Blocking Methods [webmasterworld.com] to consider.

BTW - you cannot "control the access to your website" with robots.txt. That file does not give you control over who accesses anything. Very few agents support robots.txt and those that do can disregard it if they choose.
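To illustrate why robots.txt is advisory, here is a small sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` name and the URLs are made up. The check runs entirely inside the client: a well-behaved bot asks `can_fetch` before requesting a URL, while a badly behaved one simply never asks, and nothing on the server side stops it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def polite_allowed(user_agent, url):
    """What a compliant crawler would decide; compliance is voluntary."""
    return parser.can_fetch(user_agent, url)


print(polite_allowed("ExampleBot", "https://example.com/private/page.html"))
print(polite_allowed("ExampleBot", "https://example.com/humans.txt"))
```

Actually refusing requests takes server-side rules, not this file.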
8:22 pm on Oct 26, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


Hasn't the whole thing been by and large superseded by the “About Us” page?

:: detour to logs ::

:: pause for “wtf” as counter in background suddenly leaps from 2 digits to 4 ::

Nope, red herring. On Valentine's Day 2014 (really) some idiotic robot came in with a flurry of
GET /long-complicated-path-here/index.php?some-param-name=http://www.google.com/humans.txt? HTTP/1.0
The rest can be counted on your fingers. I think one site actually has a file by this name, but it's just a spoof on robots.txt.
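For anyone wanting to make the same count in their own logs, a minimal Python sketch follows. It assumes you have already pulled the request line (method, target, protocol) out of each log entry; the `classify_request` name and the three labels are invented here. It separates genuine requests for the file from requests that merely carry "humans.txt" inside a query string, like the one quoted above.

```python
from urllib.parse import urlsplit


def classify_request(request_line):
    """Return 'direct', 'in-query', or 'unrelated' for one request line."""
    parts = request_line.split()
    if len(parts) < 2:
        return "unrelated"
    target = urlsplit(parts[1])
    # A real request for the file has humans.txt as the last path segment
    if target.path.lower().endswith("/humans.txt"):
        return "direct"
    # Otherwise the name may only appear inside a parameter value
    if "humans.txt" in target.query.lower():
        return "in-query"
    return "unrelated"


lines = [
    "GET /humans.txt HTTP/1.1",
    "GET /long-complicated-path-here/index.php"
    "?some-param-name=http://www.google.com/humans.txt? HTTP/1.0",
    "GET /index.html HTTP/1.1",
]
for line in lines:
    print(classify_request(line))
```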
8:27 pm on Oct 26, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


Hasn't the whole thing been by and large superseded by the “About Us” page?
I don't think it has been "superseded." All indications are that the humans.txt file has a different purpose from the About Us webpage, although much of the information on the two is likely similar.

IMO, since it is not a front-facing webpage, the humans.txt file is more for techies and those data-mining agents that are looking for this type of info.

Very coincidentally, I just saw a request for this file from:
Mozilla/5.0+(compatible;+PiplBot;+http://www.pipl.com/bot/)
5:28 am on Oct 27, 2017 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10113
votes: 550


BTW - I've seen that Google humans.txt used in params as well as referrer... illogical.
5:59 pm on Oct 27, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


illogical
Whenever I see things like that, I wonder if they're doing it on purpose--for reasons which pass ordinary human ken--or whether the stupid robot is the work of a stupid botrunner who entered the wrong variable while creating the script, and didn't bother to test-run it.
 
