Lara's Storytelling Resources

Here is a non-exhaustive list of resources for anyone interested in automated story generation, interactive fiction (IF), or related research areas such as tabletop roleplaying games (TRPGs). I first created this list when I co-taught Interactive Fiction and Text Generation at UPenn with Chris Callison-Burch.

I also maintain a list of related researchers, and I try to keep a list of upcoming conference and workshop deadlines up to date. If you want me to add or update anything on any of these lists, please let me know! You can unscramble my email address here:

Note: This is not a list of papers in the field, but rather a list of corpora and code, along with their corresponding papers where they exist.
If you're looking for paper lists, you might be interested in @arnicas's list of text generation papers found on arXiv, Stephen Ware's Narrative Intelligence Lab reading list, or the Tsinghua Natural Language Processing Group's text generation list.

Story Datasets

| Dataset | Papers | Paper Code (Baselines) | Hugging Face Link | Leaderboard |
|---|---|---|---|---|
| Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence – dndbeyond.com | Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence | | | |
| Deep Dungeons and Dragons (DDD) Corpus – roleplayerguild.com | Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts | | | |
| ROCStories – 5-sentence crowdsourced stories for Story Cloze Test | A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories and LSDSem 2017 Shared Task: The Story Cloze Test | | | https://competitions.codalab.org/competitions/15333 |
| CaTeRS – Causal and temporal relations using ROC Stories | CaTeRS: Causal and Temporal Relation Scheme for Semantic Annotation of Event Structures | | | |
| Scifi TV Plots – science fiction episode summaries from Fandom | Story Realization: Expanding Plot Events into Sentences | https://github.com/rajammanabrolu/StoryRealization | https://huggingface.co/datasets/lara-martin/Scifi_TV_Shows | |
| WritingPrompts – r/WritingPrompts | Hierarchical Neural Story Generation | https://github.com/facebookresearch/fairseq/tree/main/examples/stories | https://huggingface.co/datasets/rewardsignal/reddit_writing_prompts | |
| Lit Bank – annotated Project Gutenberg | An Annotated Dataset of Literary Entities and Literary Event Detection | | | |
| STORIUM – storium.com (gamified storytelling) | STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation | https://github.com/dojoteef/storium-gpt2 | | |
| ESTER – tagged events from news articles from the TempEval3 (TE3) workshop | ESTER: A Machine Reading Comprehension Dataset for Event Semantic Relation Reasoning | https://github.com/PlusLabNLP/ESTER | | https://eventqa.github.io/ |
| CMU Movie Summary Corpus – Wikipedia movie summaries | Learning Latent Personas of Film Characters | | | |
| The Children’s Book Test – kids' books from Project Gutenberg | The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations and Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks | https://github.com/facebookarchive/bAbI-tasks | | |
| Cornell Movie Dialog – movie scripts and metadata | Chameleons in Imagined Conversations: A New Approach to Understanding Coordination of Linguistic Style in Dialogs | https://convokit.cornell.edu/documentation/movie.html | https://huggingface.co/datasets/cornell_movie_dialog | |
| ScriptWriter – from GraphMovie, which no longer exists (descriptions of movie plots) | ScriptWriter: Narrative-Guided Script Generation | https://github.com/DaoD/ScriptWriter | | |
| NarrativeQA – movie scripts from various sources and Project Gutenberg books | The NarrativeQA Reading Comprehension Challenge | https://github.com/deepmind/narrativeqa | https://huggingface.co/datasets/narrativeqa | https://paperswithcode.com/sota/question-answering-on-narrativeqa |
| MCTest – 150-300 word stories written by crowdworkers | MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text | | https://huggingface.co/datasets/sagnikrayc/mctest | https://paperswithcode.com/dataset/mctest |
| InSentive – authored stories from BookCorpus | Inspiration through Observation: Demonstrating the Influence of Automatically Generated Text on Creative Writing | https://github.com/roemmele/InSentive | | |
| CoAuthor – collaborative writing dataset | CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities | | | |
| TimeTravel – stories and counterfactual continuations | Counterfactual Story Reasoning and Generation | https://github.com/qkaren/Counterfactual-StoryRW | | |
| TellMeWhy – Q&A for stories | TellMeWhy: A Dataset for Answering Why-Questions in Narratives | | | |
| PerSenT – author sentiment prediction (news articles) | Author's Sentiment Prediction | | | |
| EmotionLines – dialog from the Friends TV show & EmotionPush private chat logs | EmotionLines: An Emotion Corpus of Multi-Party Conversations | | | |
| TVRecap – TV shows from Fandom and TVMegaSite (soap operas) | TVRecap: A Dataset for Generating Stories with Character Descriptions | | | |
| FanFiction Archive – fanfiction.net | Beyond Canonical Texts: A Computational Analysis of Fanfiction | | | |
| HPAC | Harry Potter and the Action Prediction Challenge from Natural Language | https://github.com/aghie/hpac | | |
| SummScreen | SummScreen: A Dataset for Abstractive Screenplay Summarization | | | |
| SQuAD 2.0 (Stanford Question Answering Dataset) – reading comprehension | SQuAD: 100,000+ Questions for Machine Comprehension of Text and Know What You Don't Know: Unanswerable Questions for SQuAD | https://worksheets.codalab.org/worksheets/0x8212d84ca41c4150b555a075b19ccc05/ | | |
| Naive Psychology of Characters in Simple Commonsense Stories – "cause and effect of mental state changes of characters in a story" | Modeling Naive Psychology of Characters in Simple Commonsense Stories | | | |
| Character Relations | Annotating Character Relations in Literary Texts | | | |
| TVShowGuess | TVShowGuess: Character Comprehension in Stories as Speaker Guessing | | | |
| Various corpora from UCSC's Natural Language and Dialogue Systems (NLDS) lab | | | | |
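
Several of the datasets above (ROCStories in particular) are used for the Story Cloze Test, where a system reads a short story context and has to pick the right ending out of two candidates. Here is a minimal sketch of how that evaluation can be set up. The word-overlap scorer, the function names, and the two toy examples are all made up for illustration; they are not the official baseline or real ROCStories data, and a real system would swap in a language model score.

```python
def overlap_score(context, ending):
    """Toy scorer (an assumption, not an official baseline):
    fraction of ending words that already appear in the context."""
    context_words = set(context.lower().split())
    ending_words = ending.lower().split()
    if not ending_words:
        return 0.0
    return sum(word in context_words for word in ending_words) / len(ending_words)


def choose_ending(context, endings, score=overlap_score):
    """Return the index of the candidate ending with the highest score."""
    return max(range(len(endings)), key=lambda i: score(context, endings[i]))


def cloze_accuracy(examples, score=overlap_score):
    """Fraction of examples where the chosen ending matches the gold label."""
    correct = sum(
        choose_ending(ex["context"], ex["endings"], score) == ex["label"]
        for ex in examples
    )
    return correct / len(examples)


# Hypothetical examples written for this sketch (not taken from ROCStories).
EXAMPLES = [
    {
        "context": "anna baked a cake for her brother he was turning ten",
        "endings": ["her brother loved the cake", "the car would not start"],
        "label": 0,
    },
    {
        "context": "tom forgot his umbrella and it started to rain",
        "endings": ["tom won the lottery", "tom got wet in the rain"],
        "label": 1,
    },
]

accuracy = cloze_accuracy(EXAMPLES)
```

Any function that takes a context and an ending and returns a number can be passed as `score`, so the same loop works for comparing a neural model against this kind of lexical baseline.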

Mixed Visual & Textual Datasets

| Dataset | Papers | Paper Code | Hugging Face Link | Leaderboard |
|---|---|---|---|---|
| BookCorpus | Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books and Skip-thought vectors | https://github.com/ryankiros/skip-thoughts | https://huggingface.co/datasets/bookcorpus | |
| COIN | COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis | https://github.com/coin-dataset | | |
| WikiHow | WikiHow: A Large Scale Text Summarization Dataset | https://github.com/mahnazkoupaee/WikiHow-Dataset | https://huggingface.co/datasets/wikihow | |
| VIST – Visual storytelling data + task | Visual Storytelling | | | https://paperswithcode.com/dataset/vist |
| MovieGraphs – knowledge graphs, images, and descriptions | MovieGraphs: Towards Understanding Human-Centric Situations from Videos | | | |
| KG-Story | Knowledge-Enriched Visual Storytelling | | | |
| Character-Preserving Coherent Story Visualization (CP-CSV) – character-based story visualization | Character-Preserving Coherent Story Visualization | | | |
| StoryGAN – story visualization | StoryGAN: A Sequential Conditional GAN for Story Visualization | | | |
| Pororo-SV – StoryGAN CLEVR dataset | StoryGAN: A Sequential Conditional GAN for Story Visualization | | | https://paperswithcode.com/sota/story-visualization-on-pororo |
| DramaQA – Video Story Understanding on Korean TV Show "Another Miss Oh" | DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | https://github.com/liveseongho/DramaQA | | |
| MovieQA | MovieQA: Understanding Stories in Movies through Question-answering | https://github.com/makarandtapaswi/MovieQA_CVPR2016/ | | |

Story Evaluation & Cloze Tests

Data Scrapers & Processors

| Dataset | Info |
|---|---|
| Novel Chapter Summaries | full book chapters and their summaries |
| Archive of Our Own Scraper | scraper for Archive of Our Own fanfiction |
| Fanfiction Scraper | scraper for fanfiction.net |
| BookNLP | process your own book data |
| Newspaper3k | newspaper scraper Python library |
| Homemade BookCorpus | recreation of BookCorpus |

Interactive Fiction Environments

Interactive Fiction Agents

Story Planning Systems

| Planner | Papers |
|---|---|
| Glaive - a fast planner for multi-agent stories | Glaive: a state-space narrative planner supporting intentionality and conflict |
| Sabre - next-gen Glaive | Sabre: A Narrative Planner Supporting Intention and Deep Theory of Mind |
| StoryAssembler - "a narrative system for procedurally generating choice-based interactive narratives" | StoryAssembler: An Engine for Generating Dynamic Choice-Driven Narratives |
| Belief and Intentional PDDL | Using Domain Compilation to Add Belief to Narrative Planners |
| Winnow - "declarative domain-specific query language for story sifting" | Winnow: A Domain-Specific Language for Incremental Story Sifting |
| Felt - "simple story sifting and simulation engine for emergent narrative play experiences" | Felt: A Simple Story Sifter |
| Recurve (C++) - decompositional planner | |
| STRIPS Planner (Python) | |
| Partial Order Causal-Link (POCL) Planner (Python) | |
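
If you have not worked with planners before, a tiny example may help show what a STRIPS-style planner does: states are sets of facts, each action has preconditions plus facts it adds and deletes, and the planner searches for a sequence of actions that reaches the goal. The sketch below is a breadth-first forward search over a toy story domain that I made up for illustration. It is not the code from any of the planners listed above, and narrative planners like Glaive and Sabre add things (such as character intentions and beliefs) that plain STRIPS does not model.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: frozenset   # facts that must hold before the action
    add_effects: frozenset     # facts made true by the action
    delete_effects: frozenset  # facts made false by the action


def plan(initial, goal, actions):
    """Breadth-first forward search.

    Returns a shortest list of action names that reaches the goal,
    or None if no plan exists.
    """
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for action in actions:
            if action.preconditions <= state:
                next_state = (state - action.delete_effects) | action.add_effects
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, steps + [action.name]))
    return None


# Hypothetical toy story domain (all fact and action names are invented).
ACTIONS = [
    Action(
        "knight travels to cave",
        frozenset({"knight at castle"}),
        frozenset({"knight at cave"}),
        frozenset({"knight at castle"}),
    ),
    Action(
        "knight takes sword",
        frozenset({"knight at castle", "sword at castle"}),
        frozenset({"knight has sword"}),
        frozenset({"sword at castle"}),
    ),
    Action(
        "knight slays dragon",
        frozenset({"knight at cave", "knight has sword", "dragon alive"}),
        frozenset({"dragon dead"}),
        frozenset({"dragon alive"}),
    ),
]
INITIAL = {"knight at castle", "sword at castle", "dragon alive"}
GOAL = {"dragon dead"}

story = plan(INITIAL, GOAL, ACTIONS)
print(story)
```

Breadth-first search keeps the sketch short and guarantees a shortest plan, but it does not scale; real planners rely on heuristics to prune the search space.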

Story Generation Code

| Code | Papers |
|---|---|
| DOC - generate stories from outlines with OPT | DOC: Improving Long Story Coherence With Detailed Outline Control |
| Re^3 - code to generate stories from premises with GPT-3 | Re3: Generating Longer Stories With Recursive Reprompting and Revision |
| Story Gen BART | Content Planning for Neural Story Generation with Aristotelian Rescoring |
| EnGen | Neural text generation in stories using entity representations as context |
| AI Dungeon 2 | |
| Plan-And-Write | Plan-and-Write: Towards Better Automatic Storytelling |
| ASTER (Automated Story-Telling using Event Representations) | Event Representations for Automated Story Generation with Deep Neural Nets |
| Story Realization | Story Realization: Expanding Plot Events into Sentences |
| COINS (COntextualized Inference Rules for Narrative Story Completion) | COINS: Dynamically Generating COntextualized Inference Rules for Narrative Story Completion |
| C2PO | Automated Storytelling via Causal, Commonsense Plot Ordering |
| Creative Help | Creative Help: A Story Writing Assistant |
| Infilling by Language Modeling (ILM) | Enabling Language Models to Fill in the Blanks |
| Switching Linear Dynamical System (SLDS) | Generating Narrative Text in a Switching Dynamical System |
| Label Semantics for Predicting Emotional Reactions | Modeling Label Semantics for Predicting Emotional Reactions |
| Paranoid Transformer | Paranoid Transformer: Reading Narrative of Madness as Computational Approach to Creativity |
| SoCP (Storytelling of multi-Character Psychology) | Controllable Multi-Character Psychology-Oriented Story Generation |
| TD-VAE for Story Generation | A Temporal Variational Model for Story Generation |
| PlotMachines | PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking |
| Talk of the Town | Characters Who Speak Their Minds: Dialogue Generation in Talk of the Town and Simulating Character Knowledge Phenomena in Talk of the Town |
| Toward Better Storylines with Sentence-Level Language Models | Toward Better Storylines with Sentence-Level Language Models |

Libraries & Toolkits

| Library | Info |
|---|---|
| OpenAI | GPT-3, ChatGPT, GPT-4 |
| Hugging Face | Hugging Face provides state-of-the-art general-purpose neural language model architectures like BERT, GPT-2, and others. |
| Hugging Face Transformer Library | |
| AllenNLP | Deep learning for NLP with state-of-the-art models |
| spaCy | "Industrial-Strength Natural Language Processing" in Python |
| NLTK - Natural Language Toolkit | Basic NLP tools for Python & interfacing with some external models |
| Stanford NLP | various NLP models in Java |
| Stanza | Stanford NLP for Python |
| ConvoKit | Cornell Conversation Analysis Toolkit |
| Open IE | information extraction on sentences |

Knowledge Bases & Commonsense Reasoning

| Knowledge Base | Papers | Hugging Face Link |
|---|---|---|
| VerbNet | VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon | |
| FrameNet | FrameNet II: Extended Theory and Practice | |
| WordNet | WordNet: An Electronic Lexical Database | |
| ConceptNet | ConceptNet 5.5: An Open Multilingual Graph of General Knowledge | https://huggingface.co/datasets/conceptnet5 |
| ATOMIC (ATlas Of MachIne Commonsense) | ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning | https://huggingface.co/datasets/atomic |
| COMET (COMmonsEnse Transformers) - uses ATOMIC and ConceptNet | COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | |
| GLUCOSE (GeneraLized and COntextualized Story Explanations) | GLUCOSE: GeneraLized and COntextualized Story Explanations | https://huggingface.co/datasets/glucose |
| Power and Agency in modern films | Connotation Frames of Power and Agency in Modern Films | |
| Eraser - Movie Rationales | ERASER: A Benchmark to Evaluate Rationalized NLP Models | https://huggingface.co/datasets/movie_rationales |
| ECIpedia | | |
| The NOC List | Round Up The Usual Suspects: Knowledge-Based Metaphor Generation | |
| NULEX - combines WordNet, VerbNet, and Wiktionary | NULEX: An Open-License Broad Coverage Lexicon | |
| CausalBank | Guided Generation of Cause and Effect | |
| SCRUPLES - ethical judgements | SCRUPLES: A Corpus of Community Ethical Judgments on 32,000 Real-life Anecdotes | |
| PeKo - event preconditions | PeKo: A Large Scale Precondition Knowledge Dataset | |
| SWAG (Situations With Adversarial Generations) - NLI from video captions | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | |
| HellaSwag (Harder Endings, Longer contexts, and Low-shot Activities for Situations With Adversarial Generations) - commonsense inference (harder SWAG) | HellaSwag: Can a Machine Really Finish Your Sentence? | https://huggingface.co/datasets/hellaswag |
| CLUTRR (Compositional Language Understanding with Text-based Relational Reasoning) | CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text | https://huggingface.co/datasets/CLUTRR/v1 |
| Social Chemistry | Social Chemistry 101: Learning to Reason about Social and Moral Norms | |
| VADER (Valence Aware Dictionary and sEntiment Reasoner) | VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text | |

Extras

Programming Languages & Authoring Tools for Writing Interactive Fiction

Notable IF Games

Tutorials

RPG/IF Inspiration

| Name | Info |
|---|---|
| Polygon's Favorite Actual Play Podcasts | Personal recommendation: The Adventure Zone |
| Actual Play Podcasts | |
| Roll 20 | Play tabletop games with friends virtually |
| chooseyourstory.com | |
| AI Dungeon | |
| Interactive Fiction on Itch.io | Find cool indie IF games |
| Interactive Fiction Database | IMDb for IF |
| Interactive Fiction Wiki | |

Related Courses

| Course | Taught By | Year |
|---|---|---|
| Interactive Narrative | Nick Montfort | 2019 (Fall) |
| Interactive Fiction and Text Generation | Lara J. Martin & Chris Callison-Burch | 2022 (Spring) |
| AI Storytelling in Virtual Worlds | Mark Riedl | 2022 (Spring) |
| Computational Poetics | Kathy Wu | 2021 (Spring & Fall) |

Generators for TRPGs and IF

| Name | Info |
|---|---|
| Picrew | Make customizable character images |
| Fantasy Map Generator | |
| RPG Tinker | D&D 5e NPC Generator |
| AnyDice | Dice Probability Calculator |
| Print graph paper | Just blank graph paper! |
| donjon | Random generators for tabletop games |
| RPG Maps in Wolfram Language | Code to tile hex pieces together to make a map |
| RPG Map Editor 2 | Downloadable app for making maps |
| RPGgen | A collection of generators |
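
For quick dice math without leaving Python, the kind of calculation a dice probability calculator performs can be sketched in a few lines. This is my own illustrative code, not AnyDice's language or implementation; the function names are made up, and it enumerates every roll, so it only suits small dice pools.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def dice_sum_distribution(num_dice, sides):
    """Exact probability of each total when rolling `num_dice` dice
    with `sides` faces, found by enumerating every possible roll."""
    counts = Counter(
        sum(roll) for roll in product(range(1, sides + 1), repeat=num_dice)
    )
    total_rolls = sides ** num_dice
    return {
        value: Fraction(count, total_rolls)
        for value, count in sorted(counts.items())
    }


def chance_at_least(target, num_dice, sides):
    """Probability that the total is greater than or equal to `target`."""
    distribution = dice_sum_distribution(num_dice, sides)
    return sum(
        (prob for value, prob in distribution.items() if value >= target),
        Fraction(0),
    )


two_d6 = dice_sum_distribution(2, 6)
print(two_d6[7])                  # chance of rolling exactly 7 on 2d6
print(chance_at_least(15, 1, 20)) # chance of meeting a DC of 15 on 1d20
```

Using `Fraction` keeps the results exact (for example, 1/6 rather than 0.1666...), which makes it easier to compare against hand calculations.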

Various Tools

| Name | Info |
|---|---|
| Versu | "an engine for telling interactive stories about people" |
| WOOL | "dialogue platform for creating virtual agent conversations" |
| Sudo Write | "Bust writer’s block with our magical writing AI." |
| Verse by Verse | "An experimental AI-powered muse that helps you compose poetry inspired by classic American poets" |