NLP-Framing-Analysis

Implementation of our EMNLP 2025 main conference paper,
“Social Good or Scientific Curiosity? Uncovering the Research Framing Behind NLP Artefacts.”
This project extends an earlier version of this work.

This repository extracts epistemic elements from paper introductions, computes semantic/uncertainty-based scores, ranks research framings with rule-based scoring, and runs an LLM-based framing classifier.

data/ layout:

data/
  fact-checking_val.json              # AFC annotated papers with processed abstract + introduction (val)
  fact-checking_test.json             # AFC annotated papers with processed abstract + introduction (test)
  hate_speech_data.json               # HS annotated papers with processed abstract + introduction
  automated-afc-analysis.json         # Processed abstract + introduction for our automated analysis.
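
A dataset file can be sanity-checked by loading it with the standard library. This is a minimal sketch that only assumes the files are valid JSON; the actual record schema is best inspected this way rather than assumed:

import json

with open("data/fact-checking_test.json", encoding="utf-8") as f:
    data = json.load(f)

records = data if isinstance(data, list) else list(data.values())
print(f"Loaded {len(records)} papers")
if isinstance(records[0], dict):
    print("Example record keys:", sorted(records[0].keys()))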

src/ layout:

src/
  configs/                          # YAML configs per domain
  elements/                         # Epistemic elements generation + semantic entropy
  framing/                          # Narrative ranking (rules) + narrative classification (LLM)
  inference/                        # LLM wrapper for Gemini
  utils/                            # Shared helpers (elements + framing)

Supported domains (YAML names in src/configs):

  • fact-checking
  • hate-speech
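
The value passed to --domain below is simply the YAML file name without its extension, so the available domains can be listed directly from the configs directory; a minimal sketch:

from pathlib import Path

domains = sorted(p.stem for p in Path("src/configs").glob("*.yaml"))
print(domains)  # expected to include 'fact-checking' and 'hate-speech'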

Environment Setup

1.1 Create/activate Conda env

conda create -n nlp-framing-analysis python=3.12.1 -y
conda activate nlp-framing-analysis

1.2 Install Python packages

pip install -r requirements.txt

1.3 Gemini API key

The LLM wrapper uses Google Gemini via google-generativeai.

export GEMINI_API_KEY="YOUR_KEY_HERE"
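
The repository's own Gemini wrapper lives in src/inference; a quick standalone way to confirm the key and package work (illustrative only, not that wrapper):

import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Reply with the single word: ok")
print(response.text)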

Step-by-Step Pipeline

Step A — Generate Epistemic Elements with an LLM and Run Semantic Clustering

python -m src.elements.run_inference \
  --domain fact-checking \
  --dataset_path data/fact-checking_test.json \
  --num_generations 10 \
  --temperature 1.0 \
  --model gemini-2.0-flash
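
For intuition, a semantic-entropy style uncertainty score over the sampled generations can be sketched as follows, assuming the generations have already been grouped into meaning clusters (e.g. via mutual entailment) and that each carries an average log-probability. This is an illustration, not the exact computation in src.elements:

import math
from collections import defaultdict

def semantic_entropy(cluster_ids, avg_logprobs):
    # Turn per-generation average log-probs into probability mass per meaning cluster.
    cluster_mass = defaultdict(float)
    for cid, lp in zip(cluster_ids, avg_logprobs):
        cluster_mass[cid] += math.exp(lp)
    total = sum(cluster_mass.values())
    probs = [m / total for m in cluster_mass.values()]
    # Entropy over clusters: low when generations agree, high when they scatter.
    return -sum(p * math.log(p) for p in probs)

# Hypothetical example: 10 generations falling into 3 meaning clusters.
print(semantic_entropy([0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
                       [-0.5, -0.4, -0.6, -0.5, -1.0, -1.1, -0.9, -1.3, -1.2, -1.4]))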

src.elements.run_inference — Arguments

Argument Type Description
--domain str Name of config in src/configs (without .yaml). Task domain (fact-checking or hate-speech).
--dataset_path str Path to the JSON dataset to analyze.
--use_assoc_labels bool If set, only consider paragraphs annotated as containing research framing information.
--num_generations int Number of generations per input from the LLM.
--temperature float LLM sampling temperature.
--model str LLM model ID (e.g., gemini-2.0-flash).
--strict_entailment bool Keep only high-confidence entailments during clustering.

Outputs in outputs/ (filenames derived from the dataset filename):

  • *_generations.pkl — raw LLM generations and avg logprobs per query
  • *_mrrs.txt — element-level metrics (filtered MRR)
  • *_generation_scores.txt — JSON of per-paper, per-element scores (used for ranking and as hints to classification)
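
The scores file is JSON despite the .txt extension; a minimal sketch for inspecting it (the exact path follows the dataset base name, as in the Step B command below, and the nesting of papers and elements is an assumption to verify against your own output):

import json

with open("outputs/elements/fact-checking_test_generation_scores.txt", encoding="utf-8") as f:
    scores = json.load(f)

# Assumed layout: {paper_id: {element: score, ...}, ...}
for paper_id, elements in list(scores.items())[:3]:
    print(paper_id, elements)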

Step B — Rule-based Narrative Ranking

Takes the epistemic-element scores JSON and computes research framing scores per paper using semi-automatically inferred domain rules.

python -m src.framing.ranking \
  --input_file outputs/elements/<dataset_base>_generation_scores.txt \
  --output_path ranking_results \
  --gold_path data/fact-checking_test.json \
  --domain fact-checking
  

src.framing.ranking — Arguments

Argument Type Description
--input_file str Path to the JSON file with epistemic-element predictions.
--gold_path str Path to human-annotated gold data (used for evaluation).
--output_path str Output stem under outputs/ where results are saved.
--use_gold_rankings bool Whether to use gold narrative rankings instead of model predictions.
--domain str Task domain (fact-checking or hate-speech).

Outputs:

  • *.json — narrative scores per paper
  • *.txt — filtered MRR summary (overall + per gold label, if gold provided)
  • *_aggregated.json — aggregated epistemic-element scores used for ranking
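
For reference, the plain mean reciprocal rank behind these summaries can be sketched as follows: for each paper, take one over the rank of the gold narrative in the predicted ranking and average over papers. The labels below are hypothetical, and the filtered variant used by src.framing.ranking may restrict which papers or labels are counted:

def mean_reciprocal_rank(ranked_predictions, gold_labels):
    reciprocal_ranks = []
    for ranking, gold in zip(ranked_predictions, gold_labels):
        rank = ranking.index(gold) + 1  # 1-based rank of the gold narrative
        reciprocal_ranks.append(1.0 / rank)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Hypothetical example with two papers: (1/1 + 1/2) / 2 = 0.75
print(mean_reciprocal_rank(
    [["social good", "scientific curiosity"], ["scientific curiosity", "social good"]],
    ["social good", "social good"]))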

Step C — LLM Narrative Classification

In this stage, a large language model (LLM) refines narrative predictions by reasoning over the epistemic element confidence scores and their supporting evidence from previous stages.
Rather than classifying from scratch, the LLM is prompted with structured justifications summarizing the system’s prior reasoning, alongside paper context and framing definitions.

Set both --system_generations and --paper_scores_path to provide these structured summaries to the LLM.

python -m src.framing.classification \
  --dataset_path data/fact-checking_test.json \
  --output_path classification \
  --domain fact-checking \
  --system_generations \
  --paper_scores_path outputs/framing/<ranking_output_path>_aggregated.json  \
  --model gemini-2.0-flash \
  --temperature 1.0 \
  --trials 15

src.framing.classification — Arguments

Argument Type Description
--domain str Task domain (fact-checking or hate-speech).
--dataset_path str Path to the JSON dataset with paper introductions.
--output_path str Output stem (predictions will append _{trial}_predictions.json).
--system_generations bool If set, include ranking-model hints in the prompt.
--paper_scores_path str Path to the epistemic-element scores JSON used to build hints.
--model str LLM model ID (e.g., gemini-2.0-flash).
--temperature float Sampling temperature for the LLM.
--trials int Number of runs to perform (to compute confidence intervals).

Outputs:

  • *_<trial>_predictions.json — model-selected labels + reasoning per paper (one file per trial)
  • Console shows macro-averaged Precision/Recall/F1 and average F1 over trials
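
A minimal sketch of this evaluation with hypothetical gold labels and predictions; scikit-learn's macro averaging matches the console metrics only under the assumption that the script computes them in the standard way:

from sklearn.metrics import precision_recall_fscore_support

gold = ["social good", "scientific curiosity", "social good"]
trial_predictions = [
    ["social good", "scientific curiosity", "scientific curiosity"],
    ["social good", "social good", "social good"],
]

f1_per_trial = []
for preds in trial_predictions:
    p, r, f1, _ = precision_recall_fscore_support(gold, preds, average="macro", zero_division=0)
    print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
    f1_per_trial.append(f1)
print("Average F1 over trials:", sum(f1_per_trial) / len(f1_per_trial))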

Domain Adaptation

To add a new domain, update the following two directories:

src/configs/

Edit or add a YAML file (*.yaml) to define:

  • Epistemic-element questions, labels, and base task templates
  • Framing (narrative classification) labels and prompt templates
  • System roles, definitions, mappings, and confidence thresholds
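
Since the config is plain YAML, a new domain file can be sanity-checked by loading it. The file name below assumes the domain is called fact-checking, and the top-level keys should be compared against the shipped configs rather than taken from this sketch:

import yaml

with open("src/configs/fact-checking.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

print(sorted(config.keys()))  # compare against an existing domain's config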

src/framing/rules/

Implement domain-specific ranking rules in a new Python file.
Each file defines how epistemic-element scores combine into narrative predictions.
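
As a purely hypothetical illustration of the idea (function name, elements, and weights are made up; see the existing modules in src/framing/rules/ for the real interface), such a rule might map aggregated element scores to narrative scores like this:

def score_narratives(element_scores: dict[str, float]) -> dict[str, float]:
    # element_scores, e.g. {"means": 0.8, "ends": 0.3, "stakeholders": 0.1}
    application = 0.5 * element_scores.get("ends", 0.0) + 0.5 * element_scores.get("stakeholders", 0.0)
    curiosity = element_scores.get("means", 0.0) * (1.0 - element_scores.get("stakeholders", 0.0))
    return {"social good": application, "scientific curiosity": curiosity}

print(score_narratives({"means": 0.8, "ends": 0.3, "stakeholders": 0.1}))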

Together, these two components (configs + rules) fully determine how the system operates for a given domain.


Citation

If you find this useful, please cite our paper as:

@inproceedings{chamoun-etal-2025-social,
    title = "Social Good or Scientific Curiosity? Uncovering the Research Framing Behind {NLP} Artefacts",
    author = "Chamoun, Eric  and
      Ousidhoum, Nedjma  and
      Schlichtkrull, Michael Sejr  and
      Vlachos, Andreas",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.1286/",
    pages = "25310--25346",
    ISBN = "979-8-89176-332-6",
    abstract = "Clarifying the research framing of NLP artefacts (e.g., models, datasets, etc.) is crucial to aligning research with practical applications when researchers claim that their findings have real-world impact. Recent studies manually analyzed NLP research across domains, showing that few papers explicitly identify key stakeholders, intended uses, or appropriate contexts. In this work, we propose to automate this analysis, developing a three-component system that infers research framings by first extracting key elements (means, ends, stakeholders), then linking them through interpretable rules and contextual reasoning.We evaluate our approach on two domains: automated fact-checking using an existing dataset, and hate speech detection for which we annotate a new dataset{---}achieving consistent improvements over strong LLM baselines.Finally, we apply our system to recent automated fact-checking papers and uncover three notable trends: a rise in underspecified research goals, increased emphasis on scientific exploration over application, and a shift toward supporting human fact-checkers rather than pursuing full automation."
}
