Dissecting the Attention Recorder: Write Your Own

Originally published: 03/2006 by J Wynia

So, the other day, I posted about how to use the AttentionTrust toolkit to record your attention.

Recording just *which* pages you visit is a good start on tracking your attention, but really leveraging your attention is going to require more flexible and complete tracking. The easiest place to expand on the basics that the AttentionTrust Firefox plugin and sample server provide is to build your own server from scratch. The information below explains how the existing extension works. With it, most web developers should be able to implement a server that clones the existing sample server with pretty much any programming language or data storage method.

The data is posted to the server "raw". In PHP, this means that the $_POST array is empty because there are no name/value pairs. Instead, the toolkit reads the raw posted data through the "php://input" stream. That interface is documented at: http://us2.php.net/wrappers.php.
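The same idea carries over to other stacks. Here is a minimal Python sketch, assuming a WSGI-style server. The name `read_raw_body` is a hypothetical helper, not part of the toolkit. It reads the body straight from the request's input stream rather than from parsed form fields, which is roughly what "php://input" gives you in PHP.

```python
import io


def read_raw_body(environ):
    """Return the raw request body (bytes) from a WSGI environ dict."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        # A malformed Content-Length header is treated as "no body".
        length = 0
    if length <= 0:
        return b""
    return environ["wsgi.input"].read(length)


# Simulate a raw POST without starting a real server.
fake_environ = {
    "CONTENT_LENGTH": "12",
    "wsgi.input": io.BytesIO(b"<attention/>"),
}
print(read_raw_body(fake_environ))
```

Reading exactly `CONTENT_LENGTH` bytes avoids blocking on the stream if the client keeps the connection open.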

The contents of that raw POST is an XML string that looks like this:


<attention xmlns="http://attentiontrust.org/attention/ns#" version="0.11" recorderGUID="{7118cc65-ee56-4af0-b5fc-37205e1bc61e}" recorderVersion="0.16">
<httptransactions>
<httptransaction>
   <title>The Glass is Too Big - J Wynia</title>
   <url>http://www.wynia.org/wordpress/</url>
   <cookie>0</cookie>
   <setcookie>0</setcookie>
   <responsecode>0</responsecode>
   <method>CACHE</method>
   <date>Sun, 05 Mar 2006 13:41:22 GMT</date>
</httptransaction>
</httptransactions>
</attention>
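As a rough sketch of the parsing step, here is one way to pull the transactions out of that payload in Python with the standard library's `xml.etree.ElementTree`. The function name `parse_attention` and the choice to return plain dictionaries are my assumptions, not anything the toolkit defines. Note that the default namespace on the root element applies to every child, so lookups have to be namespace-aware.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the sample payload.
NS = {"a": "http://attentiontrust.org/attention/ns#"}

SAMPLE = """<attention xmlns="http://attentiontrust.org/attention/ns#" version="0.11" recorderGUID="{7118cc65-ee56-4af0-b5fc-37205e1bc61e}" recorderVersion="0.16">
<httptransactions>
<httptransaction>
   <title>The Glass is Too Big - J Wynia</title>
   <url>http://www.wynia.org/wordpress/</url>
   <cookie>0</cookie>
   <setcookie>0</setcookie>
   <responsecode>0</responsecode>
   <method>CACHE</method>
   <date>Sun, 05 Mar 2006 13:41:22 GMT</date>
</httptransaction>
</httptransactions>
</attention>"""


def parse_attention(raw_xml):
    """Return one dict per <httptransaction>, keyed by un-namespaced tag name."""
    root = ET.fromstring(raw_xml)
    records = []
    for tx in root.findall("a:httptransactions/a:httptransaction", NS):
        # Tags arrive as "{namespace}name"; keep only the local name.
        records.append(
            {child.tag.split("}", 1)[-1]: (child.text or "") for child in tx}
        )
    return records


for record in parse_attention(SAMPLE):
    print(record["date"], record["method"], record["url"])
```

If the server accepts posts from clients you don't control, keep in mind that the Python documentation warns the standard XML modules are not hardened against maliciously constructed input.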

The rest of the request just comes through as a regular web request. This means that you can use your normal cookie or other authentication mechanisms. If you have the user visit some control panel and set up a lifetime cookie via a regular web page, the extension/browser will also send that cookie along with the tracking posts.
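To make the cookie idea concrete, here is a small Python sketch that pulls a user id out of the request's `Cookie` header using the standard library's `http.cookies`. The cookie name `attention_user` is purely hypothetical; this post doesn't say what name the sample server actually uses.

```python
from http.cookies import SimpleCookie


def user_id_from_cookie(cookie_header, name="attention_user"):
    """Return the value of the named cookie, or None if it is absent.

    `name` defaults to a made-up cookie name for illustration only.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


# The header value as a browser would send it on a tracking post.
print(user_id_from_cookie("attention_user=abc123; theme=dark"))
```

In a WSGI application the header would typically be found in `environ.get("HTTP_COOKIE")`.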

As such, you can see that a quick parse of that XML is all you need before saving the data wherever you want. In a single-user system, just put some HTTP authentication on it and you'd be done. For multiple users, you'd just need to do what the sample server did, which is set the user id as a permanent cookie.
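"Wherever you want" could be as simple as a SQLite file. The sketch below is one possible storage step in Python, not how the sample server does it. The table layout, the function names `open_store` and `save_visits`, and the assumption that each record is a dictionary with `url`, `title`, `method` and `date` keys are all mine.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the visit log. ':memory:' keeps the demo self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits ("
        "user_id TEXT, url TEXT, title TEXT, method TEXT, visited TEXT)"
    )
    return conn


def save_visits(conn, user_id, records):
    """Insert one row per parsed transaction; returns the number of rows written."""
    rows = [
        (
            user_id,
            r.get("url", ""),
            r.get("title", ""),
            r.get("method", ""),
            r.get("date", ""),
        )
        for r in records
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


conn = open_store()
save_visits(
    conn,
    "demo-user",
    [
        {
            "url": "http://www.wynia.org/wordpress/",
            "title": "The Glass is Too Big - J Wynia",
            "method": "CACHE",
            "date": "Sun, 05 Mar 2006 13:41:22 GMT",
        }
    ],
)
print(conn.execute("SELECT user_id, url FROM visits").fetchall())
```

Storing the user id on every row keeps the multi-user case trivial: the single-user setup just writes the same id each time.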

All in all, a pretty simple server. I'm not putting up code for a full-blown prototype for a specific reason.

While the existing extension is useful, it doesn't capture (or offer easy ways to capture) more than basic "this URL was loaded" type of recording. There's no way to track how long a page was the primary window, no way of tracking what the text under the mouse was when you clicked, etc.

As a result, I actually want to see a different, more flexible client before I write my own server. At that point, I'll probably make it compatible with the existing client, but also support the other mode.

At any rate, if you were intrigued by the sample toolkit, but wanted a little more info on how it works, this should get you going.

Now, on to building that better client.

Comments

Aaron Westre on 3/6/2006
What I really want, which I'm sure you're getting at with this, and like we've talked about before, is a constantly updated profile generated from this data, with which an agent of some type could make me a morning reading list to die for. Hurry up and make me one J!
Aaron Westre on 3/6/2006
Of course there are more "important" goals with this than satisfying my info-whore tendencies. Academic research, writing, personal research, cyber-stalking, all kinds of stuff. Not just a suggestion service or directory, but a real live personal web crawler.
J Wynia on 3/6/2006
Yep. That's exactly where I'm headed. Realistically, this is just step 1 in a digital version of my METER approach. If you want to really understand the "digital you", you need to start measuring whatever you can. Tools like this let you measure for long periods of time without having to keep up with it. It's handled for you. When you've gotten a bunch of measurements, you can use those as evaluation tools, as training data for learning algorithms, etc. You then ask the filter for things like "What topics am I really spending my time looking at?", "What is my daily 'biorhythm'?", "Am I doing most of my surfing when I should be working?", "Can you give me 10 articles that are fresh and interesting?". Then, you feed raw data (think subscribe to 50,000 feeds) through the derived filter and ask those questions of it, it can do its job. However, without a huge pile of data as input, you can't build step 2. The real likelihood is that the truly useful things you can do with this kind of data aren't going to be apparent until you start working with the resulting dataset. *Then*, the really interesting stuff starts. This is just the mechanics that need to be in place to make the good stuff happen.
Alex Barnett on 3/12/2006
Keep going with this, I love where you are going... Alex.
Elroy Jetson on 3/17/2006
This is fantastic work. Let me see if I understand where you are going. What you are proposing is a way to "subscribe" to services that you would, by means of the subscription, allow to access your data. They in turn would then be able to better compile information (news, whatever) that would be specifically tailored for you. If that is correct, let me pose a few questions. If you control/store the data locally, how do you propose to share this data? Since we are transient, firewall protected and what not, it couldn't be the local machine, it would have to be server based. Would you propose a distributed protocol for a vendor to tap into your data similar to say bittorrent? Also if you house the data, which would only consist of location information (URL), then processing would need to occur. For this to work you would suspect a vendor to process that through a batch of some sort. How do you see this data being crunched? Certainly not in real time? I like the idea here and am interested to see where it ends up.
» First Crack 76. Paying Attention with J Wynia - the First Crack Podcast with Garrick Van Buren on 3/25/2006
[...] J’s Windows Attention Recorder and Edison Thomaz’ OnLife and the AttentionTrust Recorder [...]
JustPlain » Blog Archive » Paying Attention with J Wynia on 3/27/2006
[...] Recording Your Attention: Spying on Yourself Dissecting the Attention Recorder: Write Your Own [...]
© 2003- 2014 J Wynia. Very Few Rights Reserved. This article is licensed under the terms of the Creative Commons Attribution License. Quoted content or content included from others is not subject to that license and defaults to normal copyright.