MERIt: a MEta-path guided contrastive learning method for logical Reasoning of text, which performs self-supervised pre-training on abundant unlabeled text data to reduce the heavy reliance on annotated training data.
Fangkai Jiao1, Yangyang Guo2*, Xuemeng Song1, Liqiang Nie1*
1 School of Computer Science and Technology, Shandong University, Qingdao, China
2 School of Computing, National University of Singapore
* Corresponding author
- Paper: Paper Link
- Hugging Face Models:
- Dataset (Wikipedia Entities & Relations): Dataset
- Code Repository: GitHub
- Updates
- Introduction
- Highlights
- Method / Framework
- Project Structure
- Installation
- Checkpoints / Models
- Dataset / Benchmark
- Usage
- Demo / Visualization
- Citation
- Acknowledgement
- License
- [2022] The paper was presented at the Findings of the Association for Computational Linguistics: ACL 2022.
- [2022] Initial release of the code and Hugging Face checkpoints.
This repository is the official implementation of the paper MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning.
Logical reasoning is of vital importance to natural language understanding. Previous studies heavily depend on annotated training data, and thus suffer from overfitting and poor generalization problems due to dataset sparsity.
Our method addresses these problems by proposing MERIt, a self-supervised pre-training method on abundant unlabeled text data. This repository provides the official training and fine-tuning implementation, pretrained checkpoints, and data pre-processing scripts.
- Novel Self-Supervised Pre-training: To the best of our knowledge, we are the first to explore self-supervised pre-training for logical reasoning to reduce the heavy reliance on annotated data.
- Meta-Path Strategy: We successfully employ the meta-path strategy to mine the potential logical structure in raw text, automatically generating negative candidates via logical relation editing.
- Counterfactual Augmentation: We propose a simple yet effective counterfactual data augmentation method to eliminate the information shortcut during pre-training.
- State-of-the-Art Performance: We evaluate our method on two logical reasoning tasks, LogiQA and ReClor, achieving new state-of-the-art performance on both benchmark datasets.
The core of MERIt consists of two novel components: meta-path guided data construction and counterfactual data augmentation.
- Meta-Path Guided Construction: Given an arbitrary document, we build an entity-level graph utilizing both external relations from knowledge graphs and intra-sentence relations. We then derive positive instances linking an entity pair via a search algorithm.
- Negative Instance Generation: We construct negative options and negative contexts by editing relations through entity replacement.
- Counterfactual Data Augmentation: To prevent the model from relying on real-world knowledge shortcuts, we substitute entities in both positive and negative instance pairs with random entities from other documents. A toy sketch of this construction pipeline is given below.
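The following is an illustrative-only sketch of this construction pipeline under simplifying assumptions. It is not the official preprocessing code (see preprocess/ for that); the toy sentences, entity names, and helper function are made up purely for illustration.

```python
from collections import defaultdict, deque

# Toy document: each sentence mentions one entity pair (a distantly annotated relation).
sentences = {
    "s1": ("EntityA", "EntityB"),   # e.g. "A was founded by B."
    "s2": ("EntityB", "EntityC"),   # e.g. "B studied at C."
    "s3": ("EntityA", "EntityC"),   # anchor sentence stating the target relation
}
anchor_sid = "s3"
anchor_head, anchor_tail = sentences[anchor_sid]

# 1. Build an entity-level graph over all sentences EXCEPT the anchor sentence.
graph = defaultdict(list)
for sid, (head, tail) in sentences.items():
    if sid == anchor_sid:
        continue
    graph[head].append((tail, sid))
    graph[tail].append((head, sid))

def find_meta_path(start, end):
    """BFS for a chain of sentences (a meta-path) linking `start` to `end`."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == end:
            return path
        for nxt, sid in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [sid]))
    return None

# 2. Positive instance: the meta-path context should logically support the anchor.
context = find_meta_path(anchor_head, anchor_tail)   # -> ["s1", "s2"]
positive = (context, anchor_sid)

# 3. Negative instance: corrupt the relation by replacing one anchor entity with an
#    entity taken from another document, so the context no longer supports it.
negative_anchor = ("EntityFromOtherDoc", anchor_tail)

# 4. Counterfactual augmentation: consistently replace entities in BOTH the context
#    and the anchor with random entities, removing world-knowledge shortcuts.
replacement = {"EntityA": "RandomX", "EntityB": "RandomY", "EntityC": "RandomZ"}
```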
.
├── conf/ # Configs for all experiments, in YAML.
├── dataset/ # Classes/functions to convert raw text inputs into tensors, plus utilities for batch sampling.
├── experiments/ # Prediction results on datasets (e.g., .npy file for ReClor leaderboard).
├── general_util/ # Metrics and training utils.
├── models/ # Transformers for pre-training and fine-tuning.
├── modules/ # Core logical processing modules.
├── preprocess/ # Scripts to pre-process Wikipedia documents for pre-training.
├── scripts/ # Bash scripts of our experiments.
├── reclor_trainer....py # Trainers for fine-tuning.
└── trainer_base....py # Trainers for pre-training.
git clone https://github.com/SparkJiao/MERIT.git
cd MERIT
pip install -r requirements.txt

Additional Notes:
- fairscale is used for fast and memory-efficient distributed training (optional).
- NVIDIA Apex is required if you want to use the `FusedLAMB` optimizer for pre-training, though `AdamW` is a viable alternative.
- Hardware Requirements: `RoBERTa-large` requires at least 12GB of memory on a single GPU, and `ALBERT-xxlarge` requires at least 14GB. We used A100 GPUs for `DeBERTa-v2-xlarge` pre-training.
The following pre-trained models are available on Hugging Face:
- RoBERTa-large-v1
- RoBERTa-large-v2
- ALBERT-v2-xxlarge-v1
- DeBERTa-v2-xlarge-v1
- DeBERTa-v2-xxlarge-v1
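To quickly sanity-check one of these checkpoints locally, a minimal loading sketch with the Hugging Face `transformers` library might look like the following. The model identifier is a placeholder (substitute the actual one from the links above), and note that the trainers in this repository load the checkpoints through the model classes under models/ rather than this generic entry point.

```python
# Minimal sanity-check sketch, NOT the official fine-tuning entry point.
# The model ID below is a placeholder; replace it with the actual
# Hugging Face identifier linked above.
from transformers import AutoModel, AutoTokenizer

model_id = "<huggingface-namespace>/MERIt-roberta-large-v1"  # placeholder ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Encode a (context, option) pair roughly the way a multiple-choice reader would.
context = "All metals conduct electricity. Copper is a metal."
option = "Therefore, copper conducts electricity."
inputs = tokenizer(context, option, return_tensors="pt", truncation=True)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)   # (1, sequence_length, hidden_size)
```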
Our method is evaluated on two challenging logical reasoning benchmarks: ReClor and LogiQA.
Our pre-training procedure uses Wikipedia data pre-processed by Qin et al., which includes entities and distantly annotated relations.
To process the data manually, run:
python preprocess/wiki_entity_path_preprocess_v7.py --input_file <glob path for input data> --output_dir <output path>

We use hydra to manage experiment configurations.
To run pre-training:
python -m torch.distributed.launch --nproc_per_node N trainer_base_mul_v2.py -cp <directory of config> -cn <name of the config file>

(DeepSpeed is supported using trainer_base_mul_ds_v1.py.)
To run fine-tuning on downstream tasks (ReClor/LogiQA):
python -m torch.distributed.launch --nproc_per_node N reclor_trainer_base_v2.py -cp <directory of config> -cn <name of the config file>

Specific config files for different models (e.g., RoBERTa, ALBERT, DeBERTa) and hardware setups can be found in the conf/ directory.
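As a concrete illustration of how these pieces fit together, the hypothetical invocation below fine-tunes on ReClor with 4 GPUs. The config name and the overridden field name are placeholders introduced here for illustration only (hydra allows overriding any field defined in the selected YAML file directly on the command line); consult conf/ and scripts/ for the actual configurations used in our experiments.

```bash
# Hypothetical example only: "roberta_large_reclor" and "learning_rate" are
# placeholder names; the real config files live under conf/.
python -m torch.distributed.launch --nproc_per_node 4 \
    reclor_trainer_base_v2.py -cp conf -cn roberta_large_reclor \
    learning_rate=1e-5
```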
Our experiments demonstrate the superior performance of MERIt across different backbones.
| Model | ReClor (Test) | LogiQA (Test) |
|---|---|---|
| RoBERTa | 55.6 | 35.3 |
| MERIt (RoBERTa) | 59.6 | 38.9 |
| ALBERT | 66.5 | 37.6 |
| MERIt (ALBERT) | 70.1 | 42.5 |
For complete evaluation details, ablation studies, and prompt-tuning improvements, please refer to Section 6 of our paper.
If you find the paper and code helpful, please kindly cite our paper:
@inproceedings{Jiao22merit,
author = {Fangkai Jiao and
Yangyang Guo and
Xuemeng Song and
Liqiang Nie},
title = {{MERI}t: Meta-Path Guided Contrastive Learning for Logical Reasoning},
booktitle = {Findings of the Association for Computational Linguistics: ACL 2022},
publisher = {{ACL}},
year = {2022},
pages = {3496--3509}
}

We sincerely appreciate the valuable comments from all the reviewers, which helped us polish the paper. We also greatly thank Liqiang Jing and Harry Cheng for their kind suggestions.

This work is supported by the National Natural Science Foundation of China (No. U1936203), the Shandong Provincial Natural Science Foundation (No. ZR2019JQ23), and the Young Creative Team in Universities of Shandong Province (No. 2020KJN012).
This project is released under the Apache License 2.0.
