chtmp223/Frankentext

🧟‍♂️ Frankentext: Stitching random text fragments into long-form narratives

arXiv: https://arxiv.org/abs/2505.18128

This repository hosts the code for our paper, Frankentext: Stitching random text fragments into long-form narratives.

[Figure: Pipeline overview]

🧟‍♂️ Frankentexts are a new type of long-form narrative produced by LLMs under the extreme constraint that most tokens (e.g., 90%) must be copied verbatim from human writing. To produce a Frankentext, we instruct the model to write a draft by selecting and combining human-written passages, then to iteratively revise the draft while maintaining a user-specified copy ratio.
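To make the copy-ratio constraint concrete, here is a minimal sketch of one way such a ratio could be estimated. It is not the metric used in the paper or in this repository: the function names `ngrams` and `copy_ratio`, the whitespace tokenization, and the n-gram matching rule are all assumptions chosen for illustration.

```python
from typing import Iterable


def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """Return the set of all contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def copy_ratio(text: str, sources: Iterable[str], n: int = 4) -> float:
    """Hypothetical estimate: the fraction of whitespace tokens in `text`
    covered by at least one n-gram that also occurs in a source passage."""
    tokens = text.split()
    if len(tokens) < n:
        return 0.0

    # Collect every n-gram seen in any human-written source passage.
    source_ngrams: set[tuple[str, ...]] = set()
    for src in sources:
        source_ngrams |= ngrams(src.split(), n)

    # Mark each token that falls inside a matched n-gram.
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in source_ngrams:
            for j in range(i, i + n):
                covered[j] = True

    return sum(covered) / len(tokens)


if __name__ == "__main__":
    human = ["the quick brown fox jumps over the lazy dog"]
    draft = "the quick brown fox ran away"
    # 4 of the 6 draft tokens sit inside a copied trigram, so about 0.67.
    print(round(copy_ratio(draft, human, n=3), 2))
```

Under this sketch, a revision step could be accepted only if the estimate stays at or above the user-specified target (e.g., 0.9). A larger `n` makes the check stricter, because short coincidental overlaps no longer count as copied text.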

📣 Updates

  • [2025-05-23]: Dataset and prompts for Frankentext are now available! Pipeline code is coming soon...

📦 Using Frankentext

Getting Started

(Coming soon...)

Project Structure

.
├── README.md
├── assets
├── data
│   ├── outputs
│   └── inputs
├── prompts
└── scripts
    ├── pipeline
    └── eval
  • data:
    • inputs contains input data for each experiment.
    • outputs contains outputs from each experiment mentioned in the paper. Each subfolder represents outputs from an experiment.
  • scripts contains code to obtain and automatically evaluate Frankentexts:
    • eval contains code to obtain metrics.
    • pipeline contains code to construct Frankentexts with models tested in the paper.
  • prompts contains all prompts used in the paper.
    • llm_judge contains prompts to obtain coherence and relevance judgments with an LLM.
    • pipeline contains the prompts used to generate and edit Frankentexts.
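Since each subfolder of data/outputs corresponds to one experiment, the available experiments can be listed with a few lines of Python. This is a minimal sketch that assumes only the directory layout described above. The helper name `list_experiments` is hypothetical, and the sketch makes no assumption about the files inside each subfolder.

```python
from pathlib import Path


def list_experiments(repo_root: str) -> list[str]:
    """Return the sorted names of experiment subfolders under data/outputs.

    Hypothetical helper: it relies only on the README's description that
    each subfolder of data/outputs holds the outputs of one experiment.
    """
    outputs = Path(repo_root) / "data" / "outputs"
    if not outputs.is_dir():
        return []
    # Keep directories only; stray files at this level are ignored.
    return sorted(p.name for p in outputs.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Run from the repository root.
    for name in list_experiments("."):
        print(name)
```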

📜 Citation

@misc{pham2025frankentextstitchingrandomtext,
      title={Frankentext: Stitching random text fragments into long-form narratives}, 
      author={Chau Minh Pham and Jenna Russell and Dzung Pham and Mohit Iyyer},
      year={2025},
      eprint={2505.18128},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.18128}, 
}
