The creator of Marlowe spills the goods on the new artificial intelligence for fiction authors
Can a computer really know what makes up an enticing novel? And if so, how does that actually work? That’s often the first question we get from readers and authors when we explain what Marlowe, our artificial intelligence, is accomplishing in the field of fiction.
We brought in Dr. Matthew Jockers, Marlowe’s creator, to explain how the A.I. can distinguish a great novel from a weak one and point out ways to improve it. Watch the chat below, where Matt does a good job of unwrapping the mystery. The core explanation lies in pattern recognition and Marlowe’s ability to read thousands of books and remember minute details, traits and linguistic features of every single novel.
Here is where things get cool, because the more she reads, the more she knows and the more she can share with us. In this live chat, we discuss how popular fiction has changed over the years, what elements of a plot are needed to hold readers’ interests, and how Marlowe can identify where an author is from simply by how often she uses the word “the.”
Click above to watch our half-hour discussion about A.I. in fiction, vampires and qualities that Marlowe has identified in bestselling books. If you’re interested in using Marlowe on your novels, click here to explore her free and advanced A.I. reports and plans.
Transcript of our conversation
Alessandra: All right, we are live. This is First Draft Friday. I’m Alessandra Torre and I am so excited to have with us today Dr. Matt Jockers. Matt is one of our founders at Authors A.I. He’s also the co-author of The Bestseller Code and he’s the creator of Marlowe. So for all of you who have used Marlowe or have looked at Marlowe and thought about using her on your manuscript, this is the man behind the machine. I don’t like to use the word machine, but the man behind the technology. So, he’s got a lot of great information to share with us today and I can’t wait to dive in. So thank you for being here with us, Matt.
Matt: You bet, great to be here.
Alessandra: I thought we’d dive in and just give them a little bit of background, if you can introduce yourself and say how you ended up working in artificial intelligence and fiction and how Marlowe came to be.
Matt: Sure. Yeah, so I have kind of a weird backstory in that I’m a technologist, but I’m also a Ph.D. researcher with a classical kind of literary training. But I became very interested in computation, probably in the eighties really, and got my first Mac back in those days. And at some point the interest in technology and literature collided, and I discovered that we could use computers to help us understand prose. And so, I built my academic career around text mining and text analysis of fiction.
Alessandra: And when did you start thinking about using it like in a commercial capacity?
Matt: So that’s like a pretty funny story, really. So I was teaching at Stanford back… I was there from 2001 through 2011, and I had written a piece of programming code that I was calling the Canonizer. And the purpose of this was to study classic works of literature to try and find what the sort of DNA of a classic looked like and pull out what are the patterns, the linguistic patterns. And I was talking about this in a class one day and typical Stanford student, you know, they’re very entrepreneurial. He’s in the back row and raises his hand and says, “Have you considered the commercial applications of this technology?” And of course, the answer was no, I’d never thought about that. But within a year of that moment, I think I had gotten involved with my first startup company doing this sort of thing.
Alessandra: I love that. And for those of you who aren’t familiar with Authors A.I. and aren’t familiar with Marlowe, she is… well, I know how I always describe her, but I’d love to know… what I always say is she’s kind of like a developmental editor, an artificial intelligence developmental editor. She can read your novel and then point out things where you might be missing the mark, but that is really a crude way of describing her. So, can you explain how she works? Like, what she knows and how she knows it and then how she applies that when she reads anybody’s manuscript, like my manuscript that I upload into her.
Matt: Sure. So I first have to kind of say that the whole personification of this programming code is a little weird for me. I didn’t ever along the way think about personifying it.
Alessandra: Humanizing it?
Matt: Yeah. It really is what these kinds of technologies do, or what they’re good at; what computers are good at is pattern recognition. And that means that they’re good at finding common patterns, but also then identifying outlier patterns, right, what’s different from the norm. So, there are different things that Marlowe does. She does read, and she reads sequentially the way that we do, from left to right. But what she does during that process is to notice things that we as readers and writers aren’t tuned in to the way she’s tuned in to them. And so, she can show us when we’re overusing particular constructions, linguistic constructions, and that could be a word, when we’re overusing the word “very,” for example. She can do that, but she can also read with an awareness of everything that she’s read before. And this is the really cool thing.
So as a literature major and Ph.D. in literature, I read a lot of books; maybe a thousand in a lifetime of reading, Marlowe reads a thousand books in an afternoon and she can then keep track of what she’s read. She can keep track of common themes that she sees occurring in those books, common ways of expressing things. And then when she sits down to look at your manuscript, she has all of that knowledge. And we do this as critics too, and as editors we have that, but we’ve only got our 500 or a thousand and maybe I’ve read a thousand, but can I remember?
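For readers who want to see the outlier idea in concrete terms, here is a minimal Python sketch of flagging overused words. It is an illustration only, not Marlowe’s code: the function name `overused_words`, the `baseline_per_thousand` table of typical usage rates, and the `factor` cutoff are all assumptions made for this example, and any baseline numbers you supply would have to come from your own reference library of books.

```python
import re
from collections import Counter


def overused_words(text, baseline_per_thousand, factor=2.0):
    """Flag words used far more often than a baseline rate.

    baseline_per_thousand maps a word to its typical number of uses per
    thousand words (a hypothetical reference table, not real Marlowe data).
    A word is flagged when its rate in `text` exceeds factor * baseline.
    Returns a dict of flagged word -> observed rate per thousand words.
    """
    words = re.findall(r"[a-z'’]+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    flagged = {}
    for word, base_rate in baseline_per_thousand.items():
        rate = 1000.0 * counts[word] / len(words)
        if rate > factor * base_rate:
            flagged[word] = rate
    return flagged
```

Calling `overused_words(manuscript_text, {"very": 2.0})` would report “very” only if the manuscript uses it more than four times per thousand words; the 2.0 here is a placeholder, not a measured norm.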
Alessandra: I have. I can remember like 15 books, yeah.
Matt: Right. So she’s pretty special in that regard. And then she can…
Alessandra: Marlowe is built on those really great books, right? Like she prioritizes, I mean, you have fed her or she has read poor-selling books and bestselling books. And she has learned what is common in the bestsellers is that correct, and that is what she uses as her goalpost or what’s the…?
Matt: So right now she’s actually, she’s sort of… we sent her back to grade school and wanted her to kind of thank her library. So the pre-Marlowe research that I did with my writing partner, Jodie Archer, for the book The Bestseller Code, there was sort of a prototype Marlowe there and we never thought about naming her. Actually, we did name it The Bestsellerometer. But that code was trained on a pretty limited set of books that had hit the New York Times Bestseller list, hardcover adult fiction. And so, what we’re doing now is quite a departure from that because we’re looking at books very much in a genre specific way so that we can identify within the romance genre or within the thriller genre what are the hallmarks of books that tend to sell well or… and selling isn’t always necessarily, I mean, now in this new marketplace of content; sometimes it’s downloads that are the markers of success.
Alessandra: And reviews. We have reviews now, you know. You’re right, like what used to… and things change over the genres. Like, does the AI tell us that, or has your research shown that books written in the eighties have a different structure and different hot points than books written today, or have readers changed in what they like?
Matt: Yeah, that’s a great question. So in the work that Jodie and I did, we did kind of find that over a 30 or so year period, certain things sort of had a steady presence, but some things are more of the moment. And I often think about vampires and… well, I don’t often think about vampires… I think about vampires as a theme in fiction though, because, you know, we have the original Dracula, which actually was a follow-on to Carmilla, an earlier book about vampires. And then, you know, vampires disappear for a while and then ghost comes around and we get this big thing in vampires, then it disappears again. And then you get Twilight, and so, this is… I wouldn’t recommend that anybody who wants to be a bestseller write a book about vampires, but there have been periods where they’ve been popular. But then there are these other things that are just steady throughout, and those are the kinds of things that Jodie and I were looking for. And now, of course, Marlowe is looking for those things in a different kind of way.
Alessandra: So, like if I as an author want to write a page turner, and that is really kind of the key, right, the common thing, one common thing in bestsellers is that the readers devoured them quickly. Like, they move through them and keep turning pages, obviously. But if that’s my goal, what are the things that A.I. has taught you or has taught us about what… in terms of plot; let’s just look at plot for a minute. In terms of plot, what are the key attributes that I should strive for in my novels?
Matt: Yeah, it’s interesting. Plot is one of those things that we can sit around in a literary class and talk about and debate what exactly we mean by plot. And there are character-driven plots and there are action-driven plots, and so on and so forth. If we’re just talking about this class of books that we describe as page turners, and not all bestsellers by the way are page turners. All the Light We Cannot See is a wonderful bestseller that won a… I can’t remember which prize it won, Pulitzer Prize maybe. I don’t remember which prize, but I wouldn’t describe that book as a page turner. The Da Vinci Code I think is a page turner. And I think Dan Brown in that book mastered sort of the page turner. And there are certain things that Marlowe can detect about the way that Dan Brown wrote that book, by the way. And Fifty Shades of Grey is another one that has this kind of pattern to it.
Marlowe is able to detect a signal that is consistent with the books that we describe as bestsellers. And you’ve read The Da Vinci Code so you know that he ends every chapter with this sort of cliffhanger or some kind of suspense. And then he goes and deals with another set of characters for a chapter so you have to wait, and you get that. And Marlowe is able to detect that pattern of sort of suspense and delayed resolution of conflict and plot that as a graph so that you can see the moments where conflict and suspense are introduced, and then when they get resolved, and introduced, and resolved. And the pattern of the bestseller that we found was that, when you do that as a writer in a very rhythmic sort of consistent pattern, those are the books that people can’t put down.
Alessandra: That makes perfect sense. And thinking about it from a writing point of view, I love the idea and the thought that Dan Brown does, which is like leave the reader, you know, the person’s like hanging onto the edge of the cliff. I don’t think that ever happened in The Da Vinci Code, but then, you know, cut and then we’ve moved to a totally different scene somewhere else, and then you have to get through that scene so you can get back to find out what happened. So those are called story beats, is that right? Like, is beats what you would refer to as moments of conflict and you want them to be spaced like 10%, every 10%, is that correct?
Matt: So the idea of beats I don’t think was ever in my literary training. I don’t ever remember talking about beats, but I think that in creative writing programs and in writer communities, the idea of beats is more commonly understood. So for me, the more kind of academic way of thinking about this is conflict and conflict resolution. And, you know, we use the example of The Da Vinci Code, and I don’t want to suggest here that that’s the only way to achieve this because Fifty Shades of Grey doesn’t have people chasing each other in car scenes and so on. That’s not what you need to create a page turner. It’s a matter of how you’re managing the introduction and resolution of conflict. And so the beats, the beats that we provide in the Marlowe report mark those moments, and they give you a sense too of the magnitude of those moments. So, there can be minor conflicts and then there can be the major conflicts. And so, that graphic in fact shows both of those things.
It has the macro-level arc of the story. And so you can imagine a story that has the main conflict reach its absolute apex at 85% of the way through the novel; it’s just before the end. And then maybe it gets resolved at the very end. And so, you see this dip at the very end and then an uptick, which is the resolution. Another shape might have that central conflict… so, I’ll use a literary example: A Portrait of the Artist as a Young Man, which is an early James Joyce novel; the central conflict happens exactly at the midway point.
Alessandra: And two things. First, if anyone has any questions, please pop them into the comment section. We want to answer as many of your questions as we can about the report, about A.I. in general, about the qualities of a bestseller; don’t be shy. And for those of you who have run a report on authors.ai, which is the website where you can run your own manuscript through Marlowe: one thing that I found really interesting, which is kind of the opposite of what I was thinking, is when you are looking at your plot and your moments of conflict and resolution, conflict is the dips in the… which in my mind, for some reason, I was just thinking, I don’t know, like the climax, you know, or something. But the moments of conflict are shown low on the chart and then resolution is high.
And I’ve run a lot of manuscripts. I’ve probably run 10 manuscripts of mine through Marlowe, and that’s the one thing I always go for; I’m very curious to see. And every novel is different, and some have a big dip at the bottom near the end and then it goes up, and you can definitely see books that end on a cliffhanger, you know, because they don’t have that resolution. So, that’s fascinating and that’s plot and Marlowe dealing with plot. The other thing which I found interesting, and you and I were talking about this the other day, was style. So when you look at the elements of a great book, you’ve got plot, you’ve got characters and you’ve got style. Is it true that I have a distinct writing style that an A.I., a computer, can recognize, me versus someone else?
Matt: So yes to a point, right? And so, this can get a little in the weeds, so we’ll keep it simple.
Alessandra: Yeah, AI for dummies.
Matt: So every one of us, and this is a little bit of a, you know, if we were having this conversation in an academic conference, I’d have to put a lot of asterisks next to things I’m about to say, right. But everyone has sort of a distinct way of using language. And I remember after I published my first book, called Macroanalysis, one of the critics who reviewed it noticed that I had a habit, a bad habit, of using the phrase “in other words.” And I overused that phrase and I was unaware that I had this tic. And so if you’ve played poker, right, you know that certain players have tics: if they scratch their temple, it means they’ve got a full house. And they don’t know that they’re scratching the temple. “In other words” was my temple scratch.
They don’t necessarily have to be phrases either. They’re actually embedded in really mundane things, like the way that you use the word “the” and how often you use it. And so here’s an interesting little thing. Take a writer who was brought up in Great Britain or Australia versus America. An Australian or British writer uses the word “the” about four to four and a half times every hundred words, whereas if you’re from America, it’s more like five to five and a half times for every hundred words.
Alessandra: Wow.
Matt: So we can pull a book off my shelf here, count the frequency of “the” and determine pretty well if it was authored by a Brit or a Yank.
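The counting Matt describes is simple enough to sketch in a few lines of Python. This is a toy illustration under stated assumptions, not how Marlowe actually classifies authors: the cutoff of 4.75 is just the midpoint between the two ranges quoted in the chat, and the function names are made up for this example. A real classifier would look at many features, not one word.

```python
import re

# Assumed cutoff: the midpoint between the ranges quoted in the chat
# (about 4 to 4.5 vs. about 5 to 5.5 uses of "the" per hundred words).
THRESHOLD_PER_HUNDRED = 4.75


def the_rate_per_hundred(text):
    """Return how many times "the" occurs per hundred words of text."""
    words = re.findall(r"[a-z'’]+", text.lower())
    if not words:
        return 0.0
    return 100.0 * words.count("the") / len(words)


def guess_origin(text):
    """Very rough guess at the author's origin from the rate of "the" alone."""
    if the_rate_per_hundred(text) >= THRESHOLD_PER_HUNDRED:
        return "American"
    return "British or Australian"
```

On a short passage the rate swings wildly, so a guess like this only starts to mean something on book-length text.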
Alessandra: Can you tell if it’s a man or a woman?
Matt: Yeah, that’s an interesting and sometimes controversial topic. And so, there was a really good study of that done by Koppel and a colleague, a couple of scholars, and they found that they could detect author gender, and we can for now treat that as binary, male or female authors, with 80% accuracy. And when I wrote Macroanalysis, I repeated their analysis and I got the same result. It’s about 80%.
Alessandra: Yeah, so that’s interesting. Do you know off the top of your head what it is that distinguishes a woman from a man in terms of their writing? Are we more descriptive or less descriptive? I think I read in The Bestseller Code that men are more literary in their writing. Is that correct? Or maybe literary wasn’t the right word, but I was curious if there is anything that…
Matt: Yeah, so the one really easy one to kind of grasp (I don’t remember the details of the mundane words like “of” and “the”), and this just applies to a study that I did of 19th-century novels. In the 19th-century novel, if you look at the frequency with which male and female authors use the pronouns “he” and “she,” and “him” and “her,” male writers in the 19th century have way more “he” than “she.” They’re much more male driven. And part of this has to do with genres that males were writing in versus genres that females were writing in. If you graphed the usage of “he” over the 19th century, here’s the male writers using “he,” and here’s the male writers using “she.” Now, if you just plot the same data for the female authors, it’s here. They use “he” and “she” at about the same frequency.
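The pronoun comparison can be sketched the same way. The Python below is a minimal illustration, not the code behind the study Matt mentions: it simply counts the four pronouns named in the chat and reports what share of them are male. The function names and the choice to summarize the counts as a single share are assumptions for this example.

```python
import re
from collections import Counter

MALE_PRONOUNS = {"he", "him"}
FEMALE_PRONOUNS = {"she", "her"}


def pronoun_counts(text):
    """Return (male_count, female_count) for he/him and she/her in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    male = sum(counts[w] for w in MALE_PRONOUNS)
    female = sum(counts[w] for w in FEMALE_PRONOUNS)
    return male, female


def male_pronoun_share(text):
    """Return the fraction of counted pronouns that are male, or None if none occur."""
    male, female = pronoun_counts(text)
    total = male + female
    return male / total if total else None
```

A share near 0.5 corresponds to the balanced usage described for the female authors; a share well above 0.5 corresponds to the male-driven pattern described for the male authors.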
Alessandra: That is really interesting. I love that. We do have a few questions. I just want to pop over and make sure we get to them. Audrey said, “Is it possible to submit a partially finished manuscript through Marlowe? How far in do you have to have written before it’s helpful and acceptable to send it in?”
Matt: Yeah. So, I mean, the more words Marlowe has the better she does. So I wouldn’t probably submit a chapter; that’s not going to be super useful. There are certain things, I mean, she would find the cliches, if you had used cliches. She would find frequent words and so on, but some of her metrics, like the thematic metrics, the plot metrics and the character profiling, really need a lot more data to get real actionable information. So, if you’ve got a draft and you haven’t written the ending yet, that’s fine. You can get a lot of useful information, but once you add that ending on, it’s going to change, particularly the plot; the plot shape is going to change. There’s no set number. I mean, I haven’t tried putting a hundred words in; I think that would be a disaster, but 20,000 words, something like that is probably plenty to get some good information.
Alessandra: Yeah. And she was saying, it might be easier to make changes as she goes, if she can see early on that, you know, the beginning is really slow. And a lot of the things that I think, like, that I often look at when I’m looking at a book, and I’ve realized a lot about myself when looking at the reports, like the words that I do overuse. Here’s another question: “Can you give us an idea how best to read the narrative arc line and the plot turns line? What should we look for and what do they indicate?”
Matt: Yeah, that’s a good question. So the green narrative arc gives you a sense of the sort of plot archetype, if you will. There’s a great video, you can Google for it, of Kurt Vonnegut talking about the shape of stories. And he describes, for example, what he calls a “man in the hole” story, and it’s a shape that’s a big U, and the way Vonnegut describes it, a guy gets into trouble and then he gets out again, right, so the plot has this U shape. And in some other work that I’ve done, and some other scholars, in fact, did a very similar study and found a similar result, we think there’s probably six or seven basic shapes, so the man in a hole shape is one of those. If you think about, again, sort of classical studies of literature, we tend to divide books into tragedies and comedies. And so the tragedy has this kind of shape and the comedy has a positive one. So those are sort of high-level plot archetypes.
Then there’s the plot turns line that’s also on that graphic. This is the one that shows you more about the conflict and conflict resolution that happens along that arc of the novel. So, a very low point or dip in that plot turns line is suggesting that this is a moment where things have gone very, very badly for your characters; there’s a crisis. And then when it goes up to the top, it’s that crisis or some micro crisis being resolved.
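To make the idea of reading dips on a line concrete, here is a small Python sketch that finds the low points in a series of scores. It is purely illustrative and assumes a hypothetical input: a list of numbers, one per equal-sized slice of the book, where lower means things are going worse for the characters. The chat does not say how Marlowe computes her plot turns line, so the scores, the `threshold` parameter and the function name are all assumptions.

```python
def find_dips(scores, threshold=0.0):
    """Return (position, score) pairs for local low points below threshold.

    scores is a hypothetical list with one value per equal-sized slice of
    the book. position is the fraction of the way through the book, from
    0.0 to 1.0. The first and last slices are never reported as dips.
    """
    dips = []
    last_index = len(scores) - 1
    for i in range(1, last_index):
        lower_than_before = scores[i] < scores[i - 1]
        not_higher_than_after = scores[i] <= scores[i + 1]
        if lower_than_before and not_higher_than_after and scores[i] < threshold:
            dips.append((i / last_index, scores[i]))
    return dips
```

Given made-up scores such as `[0.5, -0.2, 0.4, 0.1, -0.8, 0.3, 0.6, -0.1, 0.2, 0.0, 0.7]`, the function reports three dips, with the deepest one 40% of the way through; evenly spaced dips would match the rhythmic pattern discussed earlier in the chat.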
Alessandra: Sorry, I’m trying to… that is one of the most common questions that we get, oftentimes through just the support desk with Authors A.I., so I’m really glad that he asked that question. And hopefully, chime back in if that wasn’t clear or if you need any further help.
Matt: I was just going to add that, you know, generally speaking, and this is, you know, remember that we’re dealing with an imperfect intelligence here, right? She’s you know… so generally speaking, those dips also are going to be moments where your suspense is heightened. Now it’s not a perfect, perfect match, but generally speaking, those dips in the purple line are marking moments where there’s some conflict or suspense that the reader is waiting for you to resolve.
Alessandra: So the alarming thing, if you’re looking at it, where you should be concerned about your novel, is if it’s flat, right? If either one is flat should you be concerned, or can the green be flat, pretty flat?
Matt: The green one can be flatter. They’re never going to be perfectly flat, but what you want in that smaller plot turns line is more bumps, more ups and downs. And what we’ll sometimes see is like the first third of the book is very flat and then it gets bumpy. I would spend a lot of time thinking about the writing in that first part then, because you’re just not doing enough to manipulate the reader’s emotions.
Alessandra: Yeah, and that makes perfect sense. And you’ll often then also see that when you go to the next area, which has those, I’m calling them beats, but when it has that pacing kind of throughout the book.
Matt: Yeah.
Alessandra: Someone asked, “How do you teach Marlowe?” So, we’ve covered a little bit about this talking about how she’s read books, but go ahead.
Matt: She gets up very early in the morning. So, how do we teach her?
Alessandra: In layman terms.
Matt: Yeah. Well, first you have to give her the tools to read with, right. And so there’s all these pieces of her code that are about understanding what a sentence is and what the units of composition are, and what a noun is. She knows all of this, about grammar and so on, although she’s not a grammarian, there are other tools for that. But once she knows the primitives, if you will, of how writing works, then how does she learn? She learns by reading and looking at those. The more we feed her to read, the more she learns about the galaxy of books. And we can help her learn by telling her, “Hey, this book is a thriller or this book is a mystery” because then she’ll begin to associate patterns that appear in those human genres and look for those the next time she reads a mystery.
Alessandra: Somebody asked me something the other day and I started to answer it. And then I really thought, I should ask Matt this. They said, “Does it get to a point where she’s read too much and it almost becomes like she’s read too much and everything blends together, the learning curve flattens out?” And I thought… my impression was the more she reads the better, but then I was like, I don’t know, let me ask Matt about that.
Matt: Well, I’ll just put the question back to the human beings, right. At what point have you read so many romances that they all just seem the same? When I was a kid, I loved Louis L’Amour novels and I read every Western. And thinking back now, I think, yeah, I think that was the same five stories told 1600 different ways, so Marlowe is sensitive to that too, but I think generally, she can just keep reading.
Alessandra: Yeah, that makes sense. And especially like in the exciting future, and it’s not that far out a future, as her report evolves and we’re able to add more comparison points, I think the more she reads will help authors a lot. Because especially she can get more finely tuned per genre, and that takes reading, like you said, a ton of books, but she’s fast so she can read quickly. We are already out of time. It’s been so great to have you, Matt. We do have a free handout on A.I., so the link is there. If you would like to try Marlowe out on your manuscript or your partial manuscripts, as Audrey was asking about, visit authors.ai. We have free plans, and then if you want her more advanced plan, which includes a lot of what we were talking about today with those plots, where it will chart your plot or show really the conflict and suspense moments in your plot, that’s part of our pro plan, but either way, we’d love you to give her a try. So thank you so much, Matt. It was really great to have you here. We’ll try to circle back. If we miss some of your questions, I will keep an eye on the comment section. If you watch this later and you’re watching a recording of it, feel free to pop your questions in the comment section and the team will be watching that. So, thank you guys. Thank you, Matt.
Matt: Thanks, Allie, great to see you.
Disclosure: Matt Jockers and I are co-founders of Authors A.I.
What’s your story?
We’d love to hear your thoughts on this First Draft Friday discussion. Please drop your thoughts or questions in the comments below, or drop me a line. If you enjoyed the chat, watch more video chats about characters, story beats and overcoming writing struggles here.
Free handout: How A.I. can enhance your fiction writing
Would Marlowe understand a framed story, one with an external frame that uses a different set of characters in a different time period, such that there’s no overlap between the frame and inner story?
Hi Mark, Marlowe reads a book the same way that a human does, which is to say, she reads from front to back. When your narrative moves back and forth in time, Marlowe moves along with the narrative just as a reader would.
Thanks. I can see some of Marlowe’s routines working fine: word count, punctuation, risque words, etc. But narrative arc, plot turns, and beats might be tough to follow in that the outer frame in my story has its own set of those, while the inner story has its set of those. So, when I ran Marlowe, I stripped off the outer frame, so that it wouldn’t be confused.