SyncTrack: Rhythmic Stability and Synchronization in Multi-Track Music Generation

📖 Abstract

Multi-track music generation has garnered significant research interest due to its precise mixing and remixing capabilities. However, existing models often overlook essential attributes such as rhythmic stability and synchronization, leading to a focus on differences between tracks rather than their inherent properties.

In this paper, we introduce SyncTrack, a synchronous multi-track waveform music generation model designed to capture the unique characteristics of multi-track music. SyncTrack features a novel architecture that includes a shared module to establish a common rhythm across all tracks and track-specific modules to accommodate diverse timbres and pitch ranges. The shared module employs two cross-track attention mechanisms to synchronize rhythmic information, while the track-specific modules utilize learnable instrument priors to better represent timbre and other unique features.

Additionally, we enhance the evaluation of multi-track music quality by introducing rhythmic consistency through three novel metrics: Inner-track Rhythmic Stability (IRS), Cross-track Beat Synchronization (CBS), and Cross-track Beat Dispersion (CBD). Both objective metrics and subjective listening tests demonstrate that SyncTrack significantly improves multi-track music quality by enhancing rhythmic consistency.

[Figure: SyncTrack architecture]

🔥 News

  • [2026.03] The official implementation and evaluation metrics for SyncTrack are released.
  • [2026.01] Paper accepted to ICLR 2026!

🛠️ Prerequisites

1. Environment Setup

# Clone the repository
git clone https://github.com/YangLabHKUST/SyncTrack.git
cd SyncTrack

# Install dependencies 
conda create -n synctrack python=3.9
conda activate synctrack
pip install -r requirements.txt

Note: The evaluation metrics rely on the madmom audio processing library for beat tracking; make sure it is installed in your environment.
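As a quick sanity check before running the metrics, the snippet below tests whether madmom can be found in the active environment. It is only a sketch: the helper name `is_installed` is hypothetical and not part of this repository.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if not is_installed("madmom"):
        print("madmom not found; install it before running the evaluation metrics.")
```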

2. Model Checkpoints

To run the model, you must download the pre-trained weights for VAE, HiFi-GAN, and MusicLDM.

After downloading, unzip the file and place the contents into the ckpt/ directory. Ensure your directory structure looks like this:

SyncTrack/
├── ckpt/
│   ├── vae-ckpt.ckpt        # (Example name, matches your unzipped files)
│   ├── hifigan-ckpt.ckpt
│   └── musicldm-ckpt.ckpt
├── config/
├── src/
└── ...
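The layout above can be checked programmatically before launching a run. The sketch below assumes the example file names shown in the tree; your unzipped files may be named differently, so adjust the tuple accordingly. The helper `missing_checkpoints` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Example names taken from the tree above (assumption: real names may differ).
EXAMPLE_CHECKPOINTS = ("vae-ckpt.ckpt", "hifigan-ckpt.ckpt", "musicldm-ckpt.ckpt")


def missing_checkpoints(ckpt_dir, expected=EXAMPLE_CHECKPOINTS):
    """Return the expected checkpoint file names that are not present in ckpt_dir."""
    ckpt_dir = Path(ckpt_dir)
    return [name for name in expected if not (ckpt_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints("ckpt")
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
```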

🚀 Training

To train the SyncTrack model, use the train_synctrack.py script. The configuration is managed via YAML files.

Configuration Setup

In config/synctrack_train.yaml, please configure the following paths before starting:

  • data.params.path.train_data: Path to your training dataset.
  • data.params.path.valid_data: Path to your validation dataset.
  • model.params.ckpt_path: (Optional) Path to a pre-trained checkpoint to resume training.
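The dotted keys above presumably correspond to nested mappings in the YAML file (an assumption about the config layout, not something confirmed here). The sketch below shows how such a dotted key resolves to a nested dictionary entry; the helper `set_by_dotted_path` and the placeholder paths are hypothetical.

```python
def set_by_dotted_path(config: dict, dotted_key: str, value) -> None:
    """Set config['a']['b']['c'] = value for dotted_key 'a.b.c', creating levels as needed."""
    keys = dotted_key.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Placeholder paths; replace them with your own dataset locations.
config = {}
set_by_dotted_path(config, "data.params.path.train_data", "/path/to/train")
set_by_dotted_path(config, "data.params.path.valid_data", "/path/to/valid")
```

In practice you would edit config/synctrack_train.yaml directly; the snippet only illustrates where each key sits in the hierarchy.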

Run Training

python train_synctrack.py --config config/synctrack_train.yaml

⚡ Inference & Evaluation

To generate samples or evaluate the model on the test set, use the eval_synctrack.py script.

Configuration Setup

In config/synctrack_eval.yaml, please configure the following:

  • mode: Ensure this is set to test.
  • data.params.path.valid_data: Path to your test dataset.
  • model.params.ckpt_path: Path to the model checkpoint.
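A small pre-flight check for the three settings above might look like the following. It is a sketch that operates on an already-loaded config dictionary and assumes the dotted keys map to nested mappings; `get_by_dotted_path` and `check_eval_config` are hypothetical helpers, not functions from this repository.

```python
def get_by_dotted_path(config, dotted_key, default=None):
    """Look up 'a.b.c' in nested dictionaries; return default if any level is missing."""
    node = config
    for key in dotted_key.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def check_eval_config(config):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if config.get("mode") != "test":
        problems.append("mode should be 'test'")
    for key in ("data.params.path.valid_data", "model.params.ckpt_path"):
        if not get_by_dotted_path(config, key):
            problems.append(f"{key} is not set")
    return problems
```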

🌟 Use Our Pre-trained Model

Run Inference

python eval_synctrack.py --config config/synctrack_eval.yaml

📊 Evaluation Metrics

We provide a comprehensive suite of metrics to measure rhythmic stability and synchronization, located in the eval_metrics directory.

1. Cross-track Beat Dispersion (CBD)

CBD.py quantifies rhythmic synchronization in multitrack music by measuring the dispersion of beat alignment across all pairs of tracks.

Key Parameters:

  • --folder: Path to the directory containing audio stem subfolders (stem_0, stem_1, stem_2 and stem_3).

python eval_metrics/CBD.py --folder /path/to/generated/stems
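All three metric scripts expect the same --folder layout, so it can help to verify it once up front. The sketch below only checks that the four stem subfolders named above exist; the helper `missing_stem_dirs` is hypothetical and makes no assumption about the files inside each subfolder.

```python
from pathlib import Path

# Subfolder names as listed for the --folder parameter.
STEM_DIRS = ("stem_0", "stem_1", "stem_2", "stem_3")


def missing_stem_dirs(folder):
    """Return the names of expected stem subfolders that do not exist under folder."""
    folder = Path(folder)
    return [name for name in STEM_DIRS if not (folder / name).is_dir()]
```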

2. Cross-track Beat Synchronization (CBS)

CBS.py measures rhythmic synchronization among multiple tracks.

Key Parameters:

  • --folder: Path to the directory containing audio stem subfolders (stem_0, stem_1, stem_2 and stem_3).
  • --window_size: Length of the sliding window in seconds (default: 0.15).

python eval_metrics/CBS.py --folder /path/to/generated/stems
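To give a feel for how a sliding window can be used to score beat synchronization, here is a toy illustration. It is not the computation performed by CBS.py: the scoring rule (the share of windows containing any beat in which every track has a beat), the `hop` parameter, and the function name are all assumptions made for this example. Only the 0.15 s window length comes from the parameter description above.

```python
def windowed_beat_sync(track_beats, window_size=0.15, hop=0.05):
    """Toy score in [0, 1]: among windows with at least one beat, the share where all tracks have one.

    track_beats: list of per-track lists of beat times in seconds.
    """
    end = max((max(beats) for beats in track_beats if beats), default=0.0)
    active = 0  # windows containing a beat from at least one track
    synced = 0  # windows containing a beat from every track
    for i in range(int(end / hop) + 1):
        start = i * hop
        stop = start + window_size
        hits = [any(start <= t < stop for t in beats) for beats in track_beats]
        if any(hits):
            active += 1
            if all(hits):
                synced += 1
    return synced / active if active else 0.0
```

Identical beat lists score 1.0 under this toy rule, while tracks whose beats never fall in a common window score 0.0.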

3. Inner-track Rhythmic Stability (IRS)

IRS.py quantifies temporal consistency by averaging the standard deviation of the Inter-Beat Interval across all samples for each track.

Key Parameters:

  • --folder: Path to the directory containing audio stem subfolders (stem_0, stem_1, stem_2 and stem_3).

python eval_metrics/IRS.py --folder /path/to/generated/stems
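The description of IRS above can be sketched as follows, starting from beat times that have already been extracted (for example by a beat tracker). This is a minimal illustration, not the code in IRS.py: the function names, the use of the population standard deviation, and the rule of skipping samples with fewer than three beats are assumptions.

```python
from statistics import mean, pstdev


def ibi_std(beat_times):
    """Standard deviation of inter-beat intervals; None if there are fewer than two intervals."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return pstdev(intervals) if len(intervals) >= 2 else None


def irs_per_track(beats_by_track):
    """Average the per-sample IBI standard deviation for each track.

    beats_by_track: {track_name: [beat_times_of_sample_1, beat_times_of_sample_2, ...]}
    """
    result = {}
    for track, samples in beats_by_track.items():
        stds = [s for s in (ibi_std(beats) for beats in samples) if s is not None]
        result[track] = mean(stds) if stds else None
    return result
```

Under this reading, a lower value indicates a steadier tempo within a track, since perfectly regular beats give a standard deviation of zero.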

🔗 Citation

If you find this code or our paper useful for your research, please cite:

@inproceedings{wangsynctrack,
  title={SyncTrack: Rhythmic Stability and Synchronization in Multi-Track Music Generation},
  author={Wang, Hongrui and Zhang, Fan and Yu, Zhiyuan and Zhou, Ziya and Chen, Xi and Yang, Can and Wang, Yang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

🙏 Acknowledgements

This repository is built upon MSG-LD and utilizes Madmom for beat tracking. We thank the authors for their open-source contributions.

📧 Contact

Please feel free to contact Hongrui Wang (hwangfb@connect.ust.hk), Fan Zhang (mafzhang@ust.hk), or Prof. Can Yang (macyang@ust.hk) if you have any questions.
