
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


apple.bot

massive requests

     
11:12 pm on Jan 28, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 22, 2005
posts:657
votes: 20


Anybody got experience of massive apple.bot requests - seems official?
2:26 am on Jan 29, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13908
votes: 491


Do you mean Applebot from 17.138 and 17.142?
Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)
AND
Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)

Or some unrelated entity really called "apple.bot"?
9:04 am on Jan 29, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 22, 2005
posts:657
votes: 20


@lucy24

After a look through yesterday's rather large log...

17.138.55.240 - - [28/Jan/2016:22:10:48 +0000] "GET /example.htm?id=xxx HTTP/1.0" 200 35124 "-" "Mozilla/5.0 (compatible; Applebot/0.3; +http://www.apple.com/go/applebot)"

It is acting as a bot: no supporting page elements were requested.

Thanks.
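For anyone wanting to sort a large log the same way, here is a minimal Python sketch. It assumes the combined log format shown above and treats "IP inside Apple's 17.0.0.0/8 block plus Applebot in the user-agent" as a rough identity check; the function name and labels are made up for illustration, and a reverse DNS lookup would be a stronger test.

```python
import ipaddress
import re

# Apple's 17.0.0.0/8 allocation; the 17.138 and 17.142 ranges
# mentioned in this thread fall inside it.
APPLE_NET = ipaddress.ip_network("17.0.0.0/8")

# Combined log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def classify(line):
    """Label a log line 'applebot', 'spoofed', 'other' or 'unparsed'."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return "unparsed"
    try:
        ip = ipaddress.ip_address(m.group("ip"))
    except ValueError:
        # The first field was a hostname or garbage, not an IP address.
        return "unparsed"
    claims_applebot = "applebot" in m.group("agent").lower()
    if claims_applebot and ip in APPLE_NET:
        return "applebot"
    if claims_applebot:
        return "spoofed"
    return "other"
```

Feeding it the line quoted above should give "applebot"; the same user-agent from an address outside 17.0.0.0/8 would come back as "spoofed".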
9:25 pm on Jan 29, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13908
votes: 491


There's a thread about it here [webmasterworld.com] with supplementary information and link here [webmasterworld.com]. (Note the comments about Googlebot robots.txt directives in the main thread.) I've noticed it increasingly often in recent months. It seems to be primarily interested in one specific directory, which suggests that it's paying attention to someone else's links and/or RSS feed.

There are two versions, vanilla and mobile. It does not ask for robots.txt as often as one would like. (My personal preference is at the beginning of every visit for sporadic robots, or at least once every 24 hours for regular crawlers.)
11:27 pm on Jan 29, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 22, 2005
posts:657
votes: 20


But is it a bad bot or a good bot?
3:37 am on Jan 30, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13908
votes: 491


That's what we would all like to know :(

The other thread does point out some benefits in the specific context of local businesses. I don't know what it's good for (or not-good for) when it comes to purely informational sites. With me, it has primarily been visiting ebooks. No idea what it does with them.
1:05 am on Feb 3, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 22, 2005
posts:657
votes: 20


Um, the bot was back again today. I have decided not to block it. It seems to go to a sub-directory, fetch about 8 pages, then move on. No idea why. We are a UK info site (for reference).
7:23 am on Feb 3, 2016 (gmt 0)

Junior Member

joined:July 8, 2014
posts:46
votes: 1


I block Applebot. They don't stick to the rules outlined in my robots.txt. For me that's enough to treat them as a bad bot. Blocked by User-agent.
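The post doesn't say how the block is set up; it is usually done in the server config. Purely as an illustration, a user-agent block could look like this in a Python WSGI stack. The names BLOCKED_AGENTS, is_blocked and block_bots are hypothetical.

```python
# Hypothetical block list: lowercase substrings to look for in the User-Agent.
BLOCKED_AGENTS = ("applebot",)

def is_blocked(user_agent):
    """True if the User-Agent header contains any blocked token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_AGENTS)

def block_bots(app):
    """Wrap a WSGI app so blocked agents get a 403 before the app runs."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

A substring match like this catches both the plain and the mobile Applebot user-agent strings, but it also blocks anything else that puts "applebot" in its user-agent.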
10:30 am on Feb 4, 2016 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9255
votes: 443


Rumor has it Apple is building an index, but the exact purpose is so far unknown.

I've not seen any issues with this bot and I keep a diligent watch.
10:49 am on Feb 4, 2016 (gmt 0)

Junior Member

joined:July 8, 2014
posts:46
votes: 1


@keyplyr Applebot does react to User-agent groups where Googlebot is mentioned, if Applebot itself isn't named.
11:03 am on Feb 4, 2016 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9255
votes: 443


Some bots implement robots.txt differently. Yes, robots.txt is a web standard, but sadly its implementation is open to very wide interpretation.

I use wildcards. Google, Yandex & DuckDuckGo read & obey them just fine. Bing does not. Go figure.
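For readers unsure what wildcard support means in practice, here is a rough Python sketch of the Google-style matching rules: * matches any run of characters and a trailing $ anchors the end of the URL path. The helper names are made up, and any given crawler may interpret patterns differently.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def rule_matches(rule, path):
    """True if the pattern matches the path, starting from its first character."""
    return rule_to_regex(rule).match(path) is not None
```

With these rules, "/*.pdf$" matches "/docs/a.pdf" but not "/docs/a.pdf?x=1". A bot without wildcard support would treat the * and $ as literal characters and match neither.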
7:02 pm on Feb 4, 2016 (gmt 0)

Junior Member

joined:July 8, 2014
posts:46
votes: 1


Bing does not respect the wildcard? Wow, that's weak.
At least Bing does not follow Googlebot instructions if instructions don't explicitly mention Bing... one plus over Applebot.
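To make the difference concrete, here is a simplified Python sketch of how a crawler might pick which robots.txt group applies to it. Only Allow and Disallow lines are tracked, and the fallbacks argument is an assumption modelling the behaviour described in this thread (Applebot falling back to Googlebot rules), not anything taken from Apple's code.

```python
def parse_groups(robots_txt):
    """Return {user-agent token (lowercased): [(directive, value), ...]}."""
    groups = {}
    current_agents = []
    in_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups

def group_for(groups, bot, fallbacks=()):
    """Pick the group a bot obeys: its own, then any fallbacks, then '*'."""
    for name in (bot, *fallbacks, "*"):
        if name.lower() in groups:
            return name.lower()
    return None
```

Given a file with only a Googlebot group and a * group, group_for(groups, "Applebot", fallbacks=("Googlebot",)) selects the Googlebot group, while group_for(groups, "bingbot") selects the * group.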
8:55 pm on Feb 7, 2016 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 25, 2002
posts:123
votes: 0


On my site at least, applebot doesn't seem to recognize the 'base href' tag. For example, on a page http://www.example.com/foo/bar/ I have

<base href="http://www.example.com/foo/">

and internal links like

<a href="baz/">

which is supposed to refer to http://www.example.com/foo/baz/ but Applebot tries to crawl http://www.example.com/foo/bar/baz/ instead. I had a whole slew of these yesterday.
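The two outcomes can be reproduced with Python's standard urljoin. This is only a sketch of the resolution rules; resolve_link is a made-up helper name, and nothing here is known about how Applebot resolves links internally.

```python
from urllib.parse import urljoin

def resolve_link(page_url, href, base_href=None):
    """Resolve href the way a client should: against <base href> if present,
    otherwise against the URL of the page itself."""
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)

page = "http://www.example.com/foo/bar/"

# Honouring <base href="http://www.example.com/foo/">
print(resolve_link(page, "baz/", "http://www.example.com/foo/"))
# Ignoring the base tag, which matches the URLs Applebot requested
print(resolve_link(page, "baz/"))
```

The first call gives http://www.example.com/foo/baz/ and the second gives http://www.example.com/foo/bar/baz/, the bad URL seen in the logs.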
10:57 pm on Feb 7, 2016 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:9255
votes: 443


"applebot doesn't seem to recognize the 'base href' tag."
A lot of agents don't support base href.
11:07 pm on Feb 7, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13908
votes: 491


I have

<base href="http://www.example.com/foo/">

and internal links like

<a href="baz/">
You may as well play it safe and use site-absolute links beginning in / (slash) or in this case /foo/. Then the robot will have no excuse for misunderstanding. Some robots still will get it wrong-- I've seen them-- but this will be due purely to the robot's own gratuitous stupidity. Save the relative links for URLs that you know will always be in the same directory, even if the whole directory packs up and moves.
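If you want to find the links worth converting, a small Python sketch like the following lists every anchor whose href is document-relative, meaning it depends on the page location or a base tag. The class name is hypothetical, and it only looks at a elements.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class RelativeLinkFinder(HTMLParser):
    """Collect <a href> values that are neither absolute nor site-absolute."""

    def __init__(self):
        super().__init__()
        self.relative = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        parts = urlsplit(href)
        if parts.scheme or parts.netloc:
            return  # full URL (http:, mailto:, //host/...)
        if href.startswith(("/", "#")):
            return  # site-absolute path or in-page fragment
        self.relative.append(href)

finder = RelativeLinkFinder()
finder.feed('<a href="baz/">one</a> <a href="/foo/baz/">two</a>')
print(finder.relative)
```

Here only "baz/" is reported; "/foo/baz/" resolves to the same URL no matter which directory the page lives in or whether the robot reads the base tag.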