Are Large Language Models Capable of Generating Human-Level Narratives?

Read original: arXiv:2407.13248 - Published 7/19/2024 by Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, Nanyun Peng

Overview

• This paper explores whether large language models (LLMs) can generate human-level narratives, which would be a significant breakthrough in AI storytelling capabilities.

• The researchers analyzed the discourse structure of narratives generated by LLMs and compared them to human-written stories to assess their quality and coherence.

Plain English Explanation

In this study, the researchers wanted to find out if large language models - the powerful AI systems that can generate human-like text - are capable of creating narratives that are on par with stories written by people. Telling engaging, coherent stories is a complex task that requires understanding things like character development, plot structure, and emotional dynamics. If LLMs could match human-level narrative abilities, it would be a major milestone for AI storytelling.

To evaluate the LLMs' narrative skills, the researchers looked at the structure and flow of the stories they generated, comparing them to stories written by people. They analyzed factors like how the narratives unfolded, how the different parts connected to each other, and how the language and ideas progressed to create a compelling narrative experience.

The findings from this analysis provide insights into the current capabilities and limitations of LLMs when it comes to generating human-like stories. This has important implications for the development of more advanced AI systems that could one day assist or even replace human writers in certain contexts.

Technical Explanation

The paper investigates the ability of large language models (LLMs) to generate narratives that are comparable in quality and coherence to those written by humans. The researchers analyzed the discourse structure of narratives produced by LLMs and compared them to a corpus of human-written stories.

The study examined three discourse-level aspects of each story: story arcs, turning points, and affective dimensions (arousal and valence), drawing on both expert and automatic annotations [internal link: https://aimodels.fyi/papers/arxiv/do-language-models-enjoy-their-own-stories]. This involved looking at how the different components of the narratives, such as plot points, dialogue, and descriptions, were connected and sequenced.
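
To make the idea of tracking a story's emotional dynamics concrete, here is a minimal Python sketch that scores each sentence for valence and measures how much the resulting arc moves. It is an illustration only, not the paper's framework: the `TOY_VALENCE` lexicon, the naive sentence splitter, and the `arc_range` measure are all hypothetical stand-ins for the expert and automatic annotations the authors describe.

```python
import re
from statistics import mean

# Hypothetical toy lexicon (word -> valence in [-1, 1]); NOT from the paper.
TOY_VALENCE = {
    "happy": 0.8, "joy": 0.9, "love": 0.9, "safe": 0.6, "smiled": 0.7,
    "fear": -0.8, "lost": -0.6, "dark": -0.4, "died": -0.9, "alone": -0.5,
}


def split_sentences(story: str) -> list[str]:
    """Naively split on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", story) if s.strip()]


def sentence_valence(sentence: str) -> float:
    """Mean valence of the lexicon words in a sentence; 0.0 if none match."""
    words = re.findall(r"[a-z']+", sentence.lower())
    scores = [TOY_VALENCE[w] for w in words if w in TOY_VALENCE]
    return mean(scores) if scores else 0.0


def valence_arc(story: str) -> list[float]:
    """One valence score per sentence, in story order."""
    return [sentence_valence(s) for s in split_sentences(story)]


def arc_range(arc: list[float]) -> float:
    """Gap between the highest and lowest point; a small value means a flat arc."""
    return max(arc) - min(arc) if arc else 0.0


# Two made-up stories: one uniformly positive, one that moves from tension to relief.
flat = "She smiled. They felt happy and safe. Love filled the room."
tense = "She was lost and alone. The dark brought fear. Then she was safe and happy."

for name, story in [("flat", flat), ("tense", tense)]:
    arc = valence_arc(story)
    print(name, [round(v, 2) for v in arc], "range:", round(arc_range(arc), 2))
```

In this toy setup, a story whose arc stays positive throughout has a small range, while a story that moves from distress to relief has a large one. A real analysis would need a validated affect model and a separate arousal score, which this sketch does not attempt.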

The findings revealed both strengths and limitations in the LLMs' narrative generation capabilities. While the models were able to produce stories with some level of structure and progression, the paper's abstract reports significant discrepancies from human writing: human-written stories are suspenseful, arousing, and diverse in narrative structure, whereas LLM-generated stories are homogeneously positive and lack tension [internal link: https://aimodels.fyi/papers/arxiv/can-nuanced-language-lead-to-more-actionable].

These insights have important implications for the continued development of large language models and their potential applications in creative domains like fiction writing and screenplay generation [internal link: https://aimodels.fyi/papers/arxiv/improving-visual-storytelling-multimodal-large-language-models] [internal link: https://aimodels.fyi/papers/arxiv/leveraging-large-language-models-learning-complex-legal].

Critical Analysis

The paper provides a rigorous analysis of the narrative capabilities of large language models, but it also acknowledges several limitations and areas for further research.

One key caveat is that the study focused primarily on the discourse structure of the generated narratives, rather than more subjective factors like emotional resonance and artistic merit. While the discourse analysis is valuable, it may not fully capture the nuanced qualities that make a story compelling to human readers [internal link: https://aimodels.fyi/papers/arxiv/analyzing-narrative-processing-large-language-models-llms].

Additionally, the researchers note that the LLMs were trained on a limited dataset of human-written stories, which may have constrained their ability to generate truly novel and innovative narratives. Expanding the training data and exploring different model architectures could lead to more advanced storytelling capabilities.

Further research is also needed to understand how LLMs' narrative skills might be influenced by prompting techniques, as well as how they might perform in collaborative writing scenarios alongside human authors.

Conclusion

This study offers important insights into the current state of large language models' narrative generation abilities. While the models show promising signs of being able to construct structured stories, they still struggle to match the depth, coherence, and emotional nuance of human-written narratives.

Continued advancements in large language models, coupled with further research on the cognitive and creative processes underlying human storytelling, could eventually lead to AI systems that are capable of generating narratives that are truly indistinguishable from those crafted by people. This would have significant implications for the future of creative writing, filmmaking, and other storytelling domains.

However, as the researchers acknowledge, there are still many open challenges to overcome before large language models can be considered truly human-level storytellers. Maintaining a critical and thoughtful approach to the development of these technologies will be essential as the field of AI narrative generation continues to evolve.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Are Large Language Models Capable of Generating Human-Level Narratives?

Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, Nanyun Peng

This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression. We introduce a novel computational framework to analyze narratives through three discourse-level aspects: i) story arcs, ii) turning points, and iii) affective dimensions, including arousal and valence. By leveraging expert and automatic annotations, we uncover significant discrepancies between the LLM- and human-written stories. While human-written stories are suspenseful, arousing, and diverse in narrative structures, LLM stories are homogeneously positive and lack tension. Next, we measure narrative reasoning skills as a precursor to generative capacities, concluding that most LLMs fall short of human abilities in discourse understanding. Finally, we show that explicit integration of aforementioned discourse features can enhance storytelling, as is demonstrated by over 40% improvement in neural storytelling in terms of diversity, suspense, and arousal.

7/19/2024

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation

Cyril Chhun, Fabian M. Suchanek, Chloé Clavel

Storytelling is an integral part of human experience and plays a crucial role in social interactions. Thus, Automatic Story Evaluation (ASE) and Generation (ASG) could benefit society in multiple ways, but they are challenging tasks which require high-level human abilities such as creativity, reasoning and deep understanding. Meanwhile, Large Language Models (LLM) now achieve state-of-the-art performance on many NLP tasks. In this paper, we study whether LLMs can be used as substitutes for human annotators for ASE. We perform an extensive analysis of the correlations between LLM ratings, other automatic measures, and human annotations, and we explore the influence of prompting on the results and the explainability of LLM behaviour. Most notably, we find that LLMs outperform current automatic measures for system-level evaluation but still struggle at providing satisfactory explanations for their answers.

5/24/2024

Improving Visual Storytelling with Multimodal Large Language Models

Xiaochuan Lin, Xiangyong Chen

Visual storytelling is an emerging field that combines images and narratives to create engaging and contextually rich stories. Despite its potential, generating coherent and emotionally resonant visual stories remains challenging due to the complexity of aligning visual and textual information. This paper presents a novel approach leveraging large language models (LLMs) and large vision-language models (LVLMs) combined with instruction tuning to address these challenges. We introduce a new dataset comprising diverse visual stories, annotated with detailed captions and multimodal elements. Our method employs a combination of supervised and reinforcement learning to fine-tune the model, enhancing its narrative generation capabilities. Quantitative evaluations using GPT-4 and qualitative human assessments demonstrate that our approach significantly outperforms existing models, achieving higher scores in narrative coherence, relevance, emotional depth, and overall quality. The results underscore the effectiveness of instruction tuning and the potential of LLMs/LVLMs in advancing visual storytelling.

7/4/2024

Can Nuanced Language Lead to More Actionable Insights? Exploring the Role of Generative AI in Analytical Narrative Structure

Vidya Setlur, Larry Birnbaum

Relevant language describing trends in data can be useful for generating summaries to help with readers' takeaways. However, the language employed in these often template-generated summaries tends to be simple, ranging from describing simple statistical information (e.g., extrema and trends) without additional context and richer language to provide actionable insights. Recent advances in Large Language Models (LLMs) have shown promising capabilities in capturing subtle nuances in language when describing information. This workshop paper specifically explores how LLMs can provide more actionable insights when describing trends by focusing on three dimensions of analytical narrative structure: semantic, rhetorical, and pragmatic. Building on prior research that examines visual and linguistic signatures for univariate line charts, we examine how LLMs can further leverage the semantic dimension of analytical narratives using quantified semantics to describe shapes in trends as people intuitively view them. These semantic descriptions help convey insights in a way that leads to a pragmatic outcome, i.e., a call to action, persuasion, warning vs. alert, and situational awareness. Finally, we identify rhetorical implications for how well these generated narratives align with the perceived shape of the data, thereby empowering users to make informed decisions and take meaningful actions based on these data insights.

5/7/2024