Originally published on: 7/31/2009 9:40:22 PM
I liked hers so much that I ordered one for myself shortly thereafter and it's gone with me nearly every day, everywhere. You can read any of several hundred online reviews of the device itself. Of course, most of those will either recommend or pan the device on its own, with no real consideration of how you read.
Lots of the stuff that I saw in reviews before I bought mine goes on a list of features I couldn't care less about. For instance, in the 4 months I've had mine, I've wanted to look up precisely 5 words in the dictionary and NONE of them was in there.
At any rate, I like it and my use probably isn't exactly what people think of when the Kindle gets mentioned. See, for me, the killer feature is the ability to highlight a sentence or paragraph or entire page and save it for later. Most of what I read on it is non-fiction: technical books and stuff like Curious? and Ratio.
In print books, I've always dog-eared pages and taken notes, but retrieving that information is always less than convenient. So, when I found out that the Kindle makes clipping text from books easy, I was excited. After clipping a couple of things, I went looking at the file format. All of the clippings get dumped into a single .txt file.
While that's fine for opening and looking through or even searching through directly on the Kindle, I wanted something that made it a little more organized. So, I wrote a quick proof of concept app to parse the file and put its contents into SQL Server and thought I'd share it with you.
I'd been wanting to do a little more with SimpleRepository, a new feature in SubSonic 3.x that functions somewhat like db4o but is still SQL underneath. It strikes me as particularly suited to quick prototyping applications just like this one.
Rather than setting up the database schema, you just create your plain old CLR objects (POCOs) and SimpleRepository takes care of creating the tables as well as retrieving and storing your objects.
My POC app puts each clipping/quote into the database once and avoids duplication on subsequent runs. The result makes it much easier to, for instance, pull together everything from one book. The Kindle stores the clippings in the order they were clipped, which makes a mess if you don't read books linearly. With the app written, I can easily keep my database up to date with anything I want to save from anything I'm reading on the device. Now that the data is in that form, it could easily sit behind a web front end or be remixed in other interesting ways.
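To make the "everything from one book" idea concrete, here is a minimal sketch that groups quotes in memory with LINQ. The `Quote` class and the `QuoteGrouping.BySource` helper are hypothetical names made up for this illustration; `Quote` is only a trimmed stand-in for the Quotation class in the sample app, and nothing here touches the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the app's Quotation class, reduced to the fields used here.
public class Quote
{
    public string Source { get; set; }
    public string Text { get; set; }
    public DateTime DateClipped { get; set; }
}

public static class QuoteGrouping
{
    // Groups quotes by book title and sorts each group by the date it was clipped.
    public static Dictionary<string, List<Quote>> BySource(IEnumerable<Quote> quotes)
    {
        return quotes
            .GroupBy(q => q.Source)
            .ToDictionary(
                g => g.Key,
                g => g.OrderBy(q => q.DateClipped).ToList());
    }
}
```

In the real app the input would presumably come from the repository rather than a hand-built list, but the grouping step would look much the same.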
Anyway, on to the sample console app. I named it IpseDixit, which is Latin for "he himself said it".
You need a connection string to your database server. The database named in the connection string needs to exist and your Windows account needs access to it, but you don't have to create any tables in it.
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <connectionStrings>
    <add name="IpseDixit" connectionString="Data Source=localhost;Initial Catalog=IpseDixit;Integrated Security=True" providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
I created a simple class for a Quotation. Here's the entire source to the console application itself. I put multiple classes into the one file for example purposes. You need a reference to SubSonic.Core.dll from the SubSonic 3.x download. Comments in the code itself explain what's going on and why.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using SubSonic.Repository;
using SubSonic.SqlGeneration.Schema;

namespace IpseDixit
{
    public class Quotation
    {
        public int Id { get; set; }
        public string Source { get; set; }
        public string SourceLocation { get; set; }
        [SubSonicLongString]
        public string Text { get; set; }
        public DateTime DateClipped { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            //Set up a SubSonic SimpleRepository
            var _repo = new SimpleRepository("IpseDixit", SimpleRepositoryOptions.RunMigrations);
            //Open the Kindle clippings file. Obviously, moving this to the config file would make sense when turning this example into your own "real" app.
            string _kindleClippingsString;
            using (var _kindleClippingsFile = new FileStream(@"E:\documents\My Clippings.txt", FileMode.Open, FileAccess.Read))
            using (var _streamReader = new StreamReader(_kindleClippingsFile))
            {
                //Get the complete contents of the Kindle clippings into a string
                _kindleClippingsString = _streamReader.ReadToEnd();
            }
            //The individual clippings are split up by a "=====" kind of separator, so we split it into records on that "seam".
            var _kindleClippings = SplitByString(_kindleClippingsString, "==========");
            foreach (var _clipping in _kindleClippings)
            {
                try
                {
                    //A basic property bag object called Quotation provides a container for all of the bits.
                    Quotation _newQuote = new Quotation();
                    //More splitting by line breaks to get the individual bits of a given quote
                    var _clippingElements = SplitByString(_clipping, "\r\n");
                    //The title of the book/magazine/etc is one of the basic bits
                    _newQuote.Source = _clippingElements[1];
                    //Because bookmarks are kept in this same file, we check only for those called "Highlight"
                    //which is Amazon's term for a quote.
                    if (_clippingElements[2].Contains("Highlight"))
                    {
                        //The date the quote was added and the digital equiv of a page number are on the same line
                        //so we split that up too.
                        var _quoteDateAndLocation = _clippingElements[2].Split('|');
                        //I haven't found a reasonable way to use the location outside of the Kindle, but I figured it
                        //made sense to store it anyway.
                        _newQuote.SourceLocation = _quoteDateAndLocation[0];
                        //I wanted the date the quote was added as a DateTime, so parse that out.
                        var _quoteDateAddedString = _quoteDateAndLocation[1];
                        if (_quoteDateAddedString.Length > 0)
                        {
                            _quoteDateAddedString = _quoteDateAddedString.Replace("Added on", "").Trim();
                            DateTime _quoteDateAdded = DateTime.Parse(_quoteDateAddedString);
                            _newQuote.DateClipped = _quoteDateAdded;
                        }
                        //Finally, what was the highlighted text of the quote.
                        _newQuote.Text = _clippingElements[4];
                        //Let's check the SimpleRepository for whether this quote's already been added.
                        //Since the Kindle keeps adding to this file, it's going to have all of the quotes
                        //from the last import on it. Check if we've already stored this one.
                        //I used the full text of the quote for comparison, which is slower than other comparisons,
                        //but more likely to catch collisions.
                        if (!_repo.Exists<Quotation>(x => x.Text == _newQuote.Text))
                        {
                            //Add the quote to the database via the SimpleRepository
                            _repo.Add(_newQuote);
                        }
                    }
                }
                catch (Exception)
                {
                    //Since this is just a blog post sample, I'm not handling the exceptions. You should.
                }
            }
        }

        private static string[] SplitByString(string testString, string split)
        {
            int offset = 0;
            int index = 0;
            int[] offsets = new int[testString.Length + 1];
            //Record the position of every occurrence of the separator
            while (index < testString.Length)
            {
                int indexOf = testString.IndexOf(split, index);
                if (indexOf != -1)
                {
                    offsets[offset++] = indexOf;
                    index = (indexOf + split.Length);
                }
                else
                {
                    index = testString.Length;
                }
            }
            //One more piece than there are separators
            string[] final = new string[offset + 1];
            if (offset == 0)
            {
                final[0] = testString;
            }
            else
            {
                offset--;
                final[0] = testString.Substring(0, offsets[0]);
                for (int i = 0; i < offset; i++)
                {
                    final[i + 1] = testString.Substring(offsets[i] + split.Length, offsets[i + 1] - offsets[i] - split.Length);
                }
                final[offset + 1] = testString.Substring(offsets[offset] + split.Length);
            }
            return final;
        }
    }
}
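As a side note on the SplitByString helper, the framework's own `String.Split(string[], StringSplitOptions)` overload should give the same pieces for this kind of input. The sketch below shows that, plus how the indexes used in the app (1, 2 and 4) line up against one record. `ClippingSketch` and `SampleRecord` are hypothetical names, and the sample record is made up: its layout is only inferred from those indexes, not copied from a real Kindle file.

```csharp
using System;

public static class ClippingSketch
{
    // Same contract as the hand-written helper, delegated to the framework.
    public static string[] SplitByString(string input, string separator)
    {
        return input.Split(new[] { separator }, StringSplitOptions.None);
    }

    // Made-up record in the layout the app's indexes imply; not real Kindle output.
    public const string SampleRecord =
        "\r\nSome Book (Some Author)" +
        "\r\n- Highlight Loc. 100-101 | Added on Friday, July 31, 2009, 09:40 PM" +
        "\r\n" +
        "\r\nA highlighted passage." +
        "\r\n";
}
```

Splitting `SampleRecord` on "\r\n" leaves an empty string at index 0, the title at index 1, the location and date line at index 2, and the highlighted text at index 4. If the layout assumption holds, a record with no leading line break (such as the very first one in the file) would have its fields one index lower, so trimming each record before splitting might be worth considering in a real app.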