
EASE: Towards Real-Time Fake News Detection under Evidence Scarcity

License: MIT · Python 3.10 · PyTorch 2.7.0 · arXiv:2510.11277


Guangyu Wei*, Ke Han*, Yueming Lyu†, Yu Luo, Yue Jiang, Caifeng Shan, Nicu Sebe
(*Contribute equally, †Corresponding author)

We will publicly release all implementation details, including code, datasets, and infrastructure, to enable result verification and contribute to the research community.

If you have any questions, please open an issue or contact gywei@stu.ouc.edu.cn.

📰 News

[2026.3.23] We have further improved the Agent and added support for the DeepSeek series of models.

[2025.12.27] We have fully open-sourced the agent code in step 1.

[2025.12.22] We have fully open-sourced the expert architecture and training code in step 3.

Previous Releases

[2025.10.17] We have publicly released the EmergingNews-25 Dataset. Researchers can download and use it by completing [this form](https://forms.office.com/r/mJRTtJR2Qf).

📊 Dataset Download

The dataset can be accessed by completing the "Application to Use the EmergingNews-25 from EASE for Emerging Fake News Detection" form. Upon approval, the dataset will be available for download and use.

Dataset Examples

The dataset is structured as follows:

├── data
│   ├── news
│   │   └── news.json
│   └── imgs
│       ├── 0.png
│       ├── 1.jpg
│       ├── 2.png
│       └── ... # {id}.jpg/png/webp
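As a rough sketch of how this layout can be consumed (the function name is ours; the file and field names follow the tree above, assuming each entry in news.json carries an integer "id" matching its image file):

```python
import json
from pathlib import Path

def load_dataset(root):
    """Load news.json and attach the matching image path for each entry.

    Assumes images are named {id}.jpg/.png/.webp, as in the tree above.
    """
    root = Path(root)
    entries = json.loads((root / "news" / "news.json").read_text(encoding="utf-8"))
    for entry in entries:
        for ext in (".jpg", ".png", ".webp"):
            candidate = root / "imgs" / f"{entry['id']}{ext}"
            if candidate.exists():
                entry["img_path"] = str(candidate)
                break
    return entries
```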

👨‍💻 Code

Environment Setup

  1. Clone the repository:
git clone https://github.com/wgyhhhh/EASE.git
cd EASE/Expert
  2. Install dependencies:
conda create --name EASE python=3.10
conda activate EASE
pip install -r requirements.txt

Pretrained BERT

After downloading the pretrained models from their links (bert-base-uncased and chinese-bert-wwm-ext), set the local bert_path in your scripts to the download location.
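One way to keep both local paths in one place is a small mapping; this is a minimal sketch, not the repository's actual configuration, and the paths are placeholders for wherever you downloaded the models:

```python
# Hypothetical local-path configuration for the two pretrained encoders.
BERT_PATHS = {
    "en": "/models/bert-base-uncased",     # English news
    "zh": "/models/chinese-bert-wwm-ext",  # Chinese news
}

def bert_path(language: str) -> str:
    """Return the local encoder path for a language code ("en" or "zh")."""
    return BERT_PATHS[language]
```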

  1. Agent

Prepare API

Please first register API keys for the OpenAI API and Serper API, and fill them into the ./Agent/config/api_keys.yaml file. Currently, the Agent integrates only OpenAI's GPT-series and DeepSeek-series models. Support for other models will be provided in future updates, and we encourage everyone to submit pull requests.

Prepare Dataset

The dataset should be organized in the following format and placed in the /data/ directory, divided into train.json, val.json, and test.json:

[
  {
    "id": 0,
    "content": "News",
    "label": "real" # real or fake
  },
  {
    "id": 1,
    "content": "News",
    ...
  },
  ...
]
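Before running the agent, it can be worth sanity-checking the three split files against this schema. A minimal validator sketch, with field names taken from the example above:

```python
import json
from pathlib import Path

def validate_split(path):
    """Check that a split file is a JSON list of {id, content, label} records."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    assert isinstance(records, list), "split must be a JSON list"
    for rec in records:
        assert isinstance(rec["id"], int), f"non-integer id: {rec['id']!r}"
        assert isinstance(rec["content"], str), "content must be a string"
        assert rec["label"] in ("real", "fake"), f"bad label: {rec['label']!r}"
    return len(records)
```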

Domain Knowledge Generation

cd EASE/Agent
conda create --name Agent python=3.10
conda activate Agent
pip install -r requirements.txt

If you need to use 🔥Firecrawl, please run the following command; otherwise, the agent will fall back to BeautifulSoup.

docker run -d -p 3002:3002 tudamailab/firecrawl

Currently, the agent supports two languages for data processing. You can switch the language by modifying the second line in run.py to os.environ["LANGUAGE"] = "zh" (where zh is for Chinese and en is for English).

python Agent/scripts/run.py

Web UI

We also provide a lightweight Web UI for configuring API keys, switching the Agent language, launching run.py, and monitoring logs and token cost in real time.

Start the Web UI with:

python Agent/scripts/webui.py

Then open http://127.0.0.1:8001 in your browser.

The Web UI currently supports:

  • editing Agent/config/api_keys.yaml directly from the page
  • switching between Chinese (zh) and English (en), which is synced back to Agent/scripts/run.py
  • selecting the model defined in Agent/config/available_models.csv
  • real-time log streaming for the current run
  • real-time token and cost statistics based on available_models.csv
  • a custom OpenAI-compatible base_url for GPT-series models

After processing, the data will be transformed into a format suitable for the expert model.

[
  {
    "id": 0,
    "content": "News",
    "label": "real", # real or fake
    "sentiment": "Sentiment analysis from Agent",
    "reasoning": "Reasoning knowledge from Agent",
    "evidence": "External evidence from Agent",
    "sentiment_pred": "fake", # Prediction for the news based on this knowledge
    "reasoning_pred": "real",
    "evidence_pred": "real",
    "sentiment_acc": 0, # Whether it matches the label (1 if matches, otherwise 0)
    "reasoning_acc": 1,
    "evidence_acc": 1
  },
  {
    "id": 1,
    "content": "News",
    ...
  },
  ...
]
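The *_acc flags make it easy to measure how often each knowledge source alone predicts the label correctly. A small stdlib sketch (the function name is ours; field names are from the example above):

```python
import json
from pathlib import Path

def knowledge_accuracy(path):
    """Per-source accuracy over the agent's output file."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    sources = ("sentiment", "reasoning", "evidence")
    return {
        src: sum(rec[f"{src}_acc"] for rec in records) / len(records)
        for src in sources
    }
```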
  2. Expert

Training Scripts

# For training on FakeNewsDetection dataset
bash train.sh

Testing Scripts

After obtaining the trained weights (saved in results/EASE_{expert_type}_{dataset}/checkpoints/parameter_{expert_type}_{dataset}.pkl), simply update the corresponding paths in test.sh to run batch testing on the news dataset.

bash test.sh
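When scripting batch tests over several expert types and datasets, the checkpoint path template above can be assembled programmatically; the argument values in the test are placeholders, not names shipped by the repository:

```python
from pathlib import Path

def checkpoint_path(expert_type: str, dataset: str) -> Path:
    """Build results/EASE_{expert_type}_{dataset}/checkpoints/parameter_{expert_type}_{dataset}.pkl."""
    run_name = f"EASE_{expert_type}_{dataset}"
    return Path("results") / run_name / "checkpoints" / f"parameter_{expert_type}_{dataset}.pkl"
```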

❤️ Citation

Please cite the paper as follows if you use the data or code from EASE:

@misc{wei2025realtimefakenewsdetection,
      title={Towards Real-Time Fake News Detection under Evidence Scarcity}, 
      author={Guangyu Wei and Ke Han and Yueming Lyu and Yu Luo and Yue Jiang and Caifeng Shan and Nicu Sebe},
      year={2025},
      eprint={2510.11277},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
}

About

Official repository for "Towards Real-Time Fake News Detection under Evidence Scarcity".