You’ve just finished a book you love and you want to read another one just like it. How do you find it?
You could go to an e-bookstore like Amazon and see what other people who bought that book bought or what people who liked that book liked. You could ask a friend what they’d recommend. You could pick up a book by the same author or one that seems like it’s about the same subject.
Or you could use artificial intelligence to analyze books you already like and recommend books you haven’t heard of yet. Booksai, a new project from a small, international group of developers based in places like Russia, Germany and India, is developing a system that makes book recommendations based on the style, tone, mood and genre of things you already know you like to read. The tool is currently in beta and, the developers say, is not yet a finished product, just an engine for testing.
One problem Booksai purports to solve: how do you find books outside the echo chambers of best-sellers, books your friends liked or books by the same author?
“Our technology analyzes only the book contents and has no idea about sales ranks and purchasing history. It can sense more delicate mood and hidden connections between different books and authors,” says the Booksai “about” page.
The technology uses an algorithm that detects qualities such as writing style, tone and mood, then finds similar books. Booksai has created what it calls “text features,” a unique description of a book’s style. It then matches those “text features” against other books in similar genres. Users can ask the engine to “find similar books” or to play “book roulette,” which picks a book at random and lists similar titles below it. (Test it out yourself here.)
The simple explanation from one of the Booksai developers, Dima Bitnique: “Self-learning technology with innovative content ranging algorithm not based on semantics. Complex algorithm, which includes elements of graphematic, phonosemantic, Markov models and expert systems.” Say that five times fast.
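For readers who want something more concrete, here is a minimal sketch of the general idea of matching books by style rather than by sales data. It is not Booksai's algorithm, which has not been published. It swaps in a much simpler, well-known technique: character trigram frequencies compared by cosine similarity. The function names (`text_features`, `cosine_similarity`, `find_similar_books`) and the `library` dictionary of titles and texts are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def text_features(text, n=3):
    """Return relative frequencies of character n-grams.

    A simple stand-in for a stylistic fingerprint; this is an assumption
    for illustration, not Booksai's actual "text features".
    """
    # Lowercase and collapse whitespace so layout does not affect the result.
    normalized = " ".join(text.lower().split())
    grams = Counter(
        normalized[i:i + n] for i in range(len(normalized) - n + 1)
    )
    total = sum(grams.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in grams.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(weight * b.get(gram, 0.0) for gram, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_books(sample, library, top_k=3):
    """Rank titles in `library` (title -> text) by similarity to `sample`."""
    sample_features = text_features(sample)
    scored = [
        (cosine_similarity(sample_features, text_features(body)), title)
        for title, body in library.items()
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]
```

Pasting a passage in as `sample` and passing a dictionary of book texts as `library` returns the closest titles by this crude measure. A real system would precompute the features for every book and would need far richer signals than trigrams to capture tone or mood.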
For now, there are only 10,000 books in the system, a rather small sample of the millions of books currently for sale. Short text is a problem: the longer the text, the more chance the algorithm has to identify text features and associate them with books in its database. The algorithm works best for works of 40,000 characters or longer (something like 5,000 to 8,000 words) but can work effectively with as few as 4,000 characters, according to Bitnique.
But there are some promising early results. During a group discussion on Kindle Boards, Gloucester, Mass.-based self-published author Kathleen Valentine wrote: “Okay, this is freaky. I pasted a selection from my new ‘The Whiskey Bottle in the Wall’ and got 2 books by Donald Harrington. I was reading Donald Harrington books while I was working on my book.”
The advantage of the Booksai technology over similar technologies is scale and accuracy, according to Bitnique.
“There are a lot of simple programs in internet which demonstrate recognition of authors by using machine learning classifiers,” he said. “Typically, they can easily cope with the classification of several dozen texts and authors in which they were trained. The accuracy of such analysis is at the level of about 60%.”
The implication is that Booksai has better accuracy — and certainly a lot more authors. There are nearly 5,000 authors who wrote the 10,000 books in the Booksai database right now.
For now, the point of Booksai isn’t to create a business-ready product; it’s to demonstrate the power of the technology and help advance the cause of science. As the company’s “who we are” blurb explains, “What we really want to have is to have interesting technologies provide us and you with interesting reading materials.”
Today, that’s the goal. Perhaps tomorrow similar technology will power booksellers and offer readers recommendations outside of the best-seller bubble.