This 32 message thread spans 2 pages.

Questioning the wisdom of using fat pings to deal with scrapers
So, I'm [slowly] trying to wrap my head around the concept of PubSubHubbub, and the only reason I looked at a protocol with a name like this is because tedster recommended it :)
But perhaps both MC (at Pubcon 2011) and tedster had something other than the default implementation in mind when they suggested it might give your site preference in terms of content authorship over your scrapers, especially those scrapers with some authority, if you immediately "fat ping" Google (i.e. push the entire content of your new post through a hub).
Perhaps I am not getting some important detail here, but it looks like the most popular WP plugin for "fat pings" (not "fat pigs", silly spell checker! ) pushes content to two "default" hubs: the demo hub on Google App Engine and Superfeedr.
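For anyone following along, the "fat ping" flow the spec describes is actually two steps: the publisher sends a tiny "light ping" to the hub, and the hub then fetches the feed and pushes the full content out to its subscribers. A rough sketch of that first step (my own illustration, not code from any plugin; the feed URL is made up):

```python
# Sketch of the publisher-side "light ping" defined by the PuSH spec:
# the publisher tells the hub that a feed has changed; the hub then
# fetches the feed and fans the new full-content entries out to all
# of its subscribers.

def build_publish_ping(feed_url):
    """Return the form parameters the publisher POSTs to the hub."""
    return {
        "hub.mode": "publish",
        "hub.url": feed_url,  # the feed that just gained a new entry
    }

# Hypothetical feed URL; this dict would be POSTed to a hub such as
# the demo hub on App Engine (one of the plugin's defaults).
ping = build_publish_ping("http://example.com/feed/")
```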
Both hubs (as well as any other PuSH hub, for that matter) are completely open, and anyone can subscribe to your pings just like you hope Google will. In other words, whether Google will actually subscribe to your "fat pings" through the open hub is an open question, but you can be sure the scrapers would LOVE to get your full article content the second it gets published. More so considering that you normally publish only excerpts in the RSS feed, which is what they used to aggregate before.
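To make the openness concrete, here is a rough sketch (again my own illustration, with made-up URLs) of the subscription request the PuSH spec defines. Any party that can stand up a callback endpoint can POST these parameters to the hub, and nothing in the protocol itself distinguishes Google from a scraper:

```python
# Sketch of a PubSubHubbub subscription request. Any party with a
# reachable callback URL can build and send this to an open hub --
# the protocol does not distinguish Google from a scraper.

def build_subscribe_request(topic_url, callback_url):
    """Return the form parameters a subscriber POSTs to the hub."""
    return {
        "hub.mode": "subscribe",       # or "unsubscribe"
        "hub.topic": topic_url,        # the feed being subscribed to
        "hub.callback": callback_url,  # where the hub pushes fat pings
        "hub.verify": "async",         # hub confirms intent via a GET challenge
    }

params = build_subscribe_request(
    "http://example.com/feed/",                  # hypothetical feed URL
    "http://subscriber.example/push-endpoint",   # hypothetical callback
)
```

The only gate in the flow is the verification challenge, and that merely proves the subscriber controls its own callback URL, not that it is a welcome one.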
So, I think I'm missing an important bit of info here: how to make sure that Google gets the fat ones and the scrapers (pardon, aggregators) don't?
Can anyone more experienced with fat pigs chime in?
Thank you, Ted. I'm glad I caught your post before checking out for tonight. I was able to connect the dots now. I don't use a common CMS, so the development process will be drawn-out but I think I understand what direction to take. Fortunately, there's sample code on the Web to help write my own.
Tedster, if you're still interested in the subject matter, I wanted to add some of my own observations here after about 6 weeks of using PuSH on one of my sites just for a test.
I have to admit, I feel even more skeptical about it now than I did before starting this thread and starting the test. I'll try to explain:
My biggest confusion about this whole PuSH thing, specifically as it relates to authorship and Google, was that there seemed to be no way to make (or entice - whatever word you'd use for an inanimate software bot) Google to subscribe to my PuSH pings.
And that's exactly what I'm seeing now. Within less than 2 hours of my installing a PuSH hub (a WP plugin), I could see that two entities had subscribed:
Yandex and linksalpha.appspot.com (whoever they are). And for the longest time there was no one else, until about a week ago another seemingly random appspot app subscribed: pshb-service.appspot.com
I could see *some* use in the Yandex subscription, because I do get valid traffic from Russia on this site, but for the other two I could find no information at all, and for all I know they might as well be cutting-edge scrapers.
But still no Google subscription!
During these 6 weeks several updates have been posted, all indexed by Google within hours, and one post in particular has gotten pretty big coverage in related popular blogs etc. In other words, it's a living and breathing site, developing at a normal pace, visited by regular Googlebot often. So there would seemingly be no reason for Google not to subscribe, if that is really what they generally want to do when they see the <atom:link rel='hub' href='http://example.com/?pushpress=hub'/> line in a feed.
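For anyone checking their own feed, the discovery step a subscriber (Google or otherwise) would perform is just reading that link element out of the feed. A minimal sketch in Python, using a made-up Atom feed carrying the same hub declaration:

```python
# Sketch: extract the PuSH hub URL from an Atom feed -- the same
# discovery step any subscriber (Google or otherwise) performs
# before deciding whether to subscribe.
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def find_hub(feed_xml):
    """Return the href of the first rel='hub' link, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.iter("{%s}link" % ATOM_NS):
        if link.get("rel") == "hub":
            return link.get("href")
    return None

# Hypothetical minimal feed with the same hub declaration as above:
feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <link rel="hub" href="http://example.com/?pushpress=hub"/>
</feed>"""

hub = find_hub(feed)  # "http://example.com/?pushpress=hub"
```

If this returns the hub URL for your feed, the declaration side is working, and the question becomes purely whether Googlebot chooses to act on it.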
Do you think I'm missing something in the implementation (or messing something up), or has the importance of PuSH been slightly exaggerated? Any particular steps to take to steer Google towards your hub?