These notes are a summary of concepts presented in “Toyteller: Toy-Playing with Character Symbols for AI-Powered Visual Storytelling.”
John Joon Young Chung, Melissa Roemmele, and Max Kreminski. 2024. Toyteller: Toy-Playing with Character Symbols for AI-Powered Visual Storytelling. In Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST Adjunct ’24). Association for Computing Machinery, New York, NY, USA, Article 22, 1–5.
- Motion System
- Enables users to create stories combining text and visuals through direct manipulation of character symbols.
- Character symbols convey nuanced social interactions via anthropomorphized motions.
- Supports bidirectional interaction
- Motion-steered text generation
- Text-steered motion generation
- Core Technology
- Motion and text inputs mapped onto a shared semantic vector space
- This shared space acts as a translation layer, letting large language models and motion generation models exchange information
- Inspired by Heider and Simmel’s experiments on symbolic movements expressing social semantics
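
The idea of a shared semantic vector space can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's model: the two-dimensional space (approach, energy), the hand-picked action vectors in `ACTION_VECTORS`, and the helper names `embed_motion` and `motion_to_text` are all hypothetical. A real system would learn these mappings rather than hard-code them.

```python
import math

# Hypothetical 2-D shared space: (approach, energy).
# These vectors are hand-picked for illustration, not learned.
ACTION_VECTORS = {
    "approaches": (1.0, 0.2),
    "chases": (1.0, 1.0),
    "avoids": (-1.0, 0.2),
    "flees from": (-1.0, 1.0),
}


def cosine(a, b):
    """Cosine similarity between two 2-D vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def embed_motion(frames):
    """Map a two-symbol trajectory into the shared space.

    frames: list of ((ax, ay), (bx, by)) positions, one entry per time step.
    """
    (a_first, b_first), (a_last, b_last) = frames[0], frames[-1]
    d_start = math.dist(a_first, b_first)
    d_end = math.dist(a_last, b_last)
    # Positive when the symbols end up closer together, negative when farther.
    approach = (d_start - d_end) / max(d_start, d_end, 1e-9)
    # Mean per-step displacement of the first symbol, squashed into [0, 1).
    steps = [math.dist(p[0], q[0]) for p, q in zip(frames, frames[1:])]
    mean_step = sum(steps) / len(steps) if steps else 0.0
    energy = mean_step / (mean_step + 1.0)
    return (approach, energy)


def motion_to_text(frames, subject="A", obj="B"):
    """Pick the action whose vector is closest to the motion's embedding."""
    vec = embed_motion(frames)
    action = max(ACTION_VECTORS, key=lambda k: cosine(vec, ACTION_VECTORS[k]))
    return f"{subject} {action} {obj}."


# Symbol A closes in quickly on a stationary symbol B.
chase = [((0.0, 0.0), (10.0, 0.0)),
         ((4.0, 0.0), (10.0, 0.0)),
         ((8.0, 0.0), (10.0, 0.0))]
print(motion_to_text(chase))
```

In this sketch the motion-to-text direction is a nearest-neighbor lookup. The text-to-motion direction would run the other way: look up the vector for an action phrase and hand it to a motion generator as the target.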
- Features of the System
- Story generation input flexibility
- Gestural manipulation of symbols
- Natural language prompts
- Output versatility
- Story text aligned with symbol motions
- Character motions generated from story text
- Interactive and Collaborative Storytelling
- Users can choose their contribution level, with AI completing the remaining parts
- User Interaction Workflow
- Setting up the story
- Configure characters, background, and scene via a setup page
- Options for manual input or AI-generated text and images
- Story Creation Modes
- Motion-to-Text
- Users manipulate symbols to create motions, generating corresponding story text
- Text-to-Motion
- Users or AI provide story text, and motions align with the text
- Interface Components
- Setting page
- Configure characters’ names, descriptions, profile images, and scene details
- Story timeline
- Displays progress with recorded frames and corresponding textboxes
- Playground
- Interactive area for manipulating character symbols alongside story text
- Editing and Revision Capabilities
- Revise motions by overriding frames via symbol manipulation
- Edit story text manually or regenerate using AI tools
- Delete specific frames or text, or delete all content after a given point
- Playback functionality for reviewing story content
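
The timeline and its editing operations can be pictured as a small data structure. This is a minimal sketch under stated assumptions, not the system's actual implementation: the `Segment` and `Timeline` classes and all of their method names are hypothetical, and frames are represented as plain position tuples.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One timeline entry: recorded motion frames plus the matching story text."""
    frames: list
    text: str = ""


@dataclass
class Timeline:
    """Hypothetical story timeline supporting the edits listed above."""
    segments: list = field(default_factory=list)

    def append(self, frames, text=""):
        self.segments.append(Segment(list(frames), text))

    def override_frames(self, index, start, new_frames):
        """Revise motion by overwriting frames from position `start` onward."""
        seg = self.segments[index]
        seg.frames[start:start + len(new_frames)] = new_frames

    def edit_text(self, index, text):
        """Replace a segment's story text (typed by hand or regenerated)."""
        self.segments[index].text = text

    def delete_frames(self, index, start, stop):
        """Remove frames in the half-open range [start, stop) of one segment."""
        del self.segments[index].frames[start:stop]

    def delete_after(self, index):
        """Drop every segment after `index`; the segment at `index` is kept."""
        del self.segments[index + 1:]

    def playback(self):
        """Yield (text, frames) pairs in story order for review."""
        for seg in self.segments:
            yield seg.text, seg.frames


timeline = Timeline()
timeline.append([(0, 0), (1, 0), (2, 0)], "A walks toward B.")
timeline.append([(2, 0), (3, 0)], "A stops.")
timeline.override_frames(0, 1, [(9, 9)])  # re-record the middle frame
timeline.delete_after(0)                  # discard everything after the first segment
for text, frames in timeline.playback():
    print(text, frames)
```

Keeping frames and text together in one segment keeps each textbox aligned with the motion it describes, so deleting after a given point removes both at once.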