Multi-track music generation has attracted significant research interest because it enables precise mixing and remixing. However, existing models often overlook essential attributes such as rhythmic stability and synchronization, focusing on the differences between tracks rather than on their inherent properties.
In this paper, we introduce SyncTrack, a synchronous multi-track waveform music generation model designed to capture the unique characteristics of multi-track music. SyncTrack features a novel architecture that includes a shared module to establish a common rhythm across all tracks and track-specific modules to accommodate diverse timbres and pitch ranges. The shared module employs two cross-track attention mechanisms to synchronize rhythmic information, while the track-specific modules utilize learnable instrument priors to better represent timbre and other unique features.
Additionally, we enhance the evaluation of multi-track music quality by introducing rhythmic consistency through three novel metrics: Inner-track Rhythmic Stability (IRS), Cross-track Beat Synchronization (CBS), and Cross-track Beat Dispersion (CBD). Both objective metrics and subjective listening tests demonstrate that SyncTrack significantly improves multi-track music quality by enhancing rhythmic consistency.
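As a rough, illustrative sketch of the cross-track idea above (not the authors' implementation — the frame pooling and single-head dot-product attention here are simplifying assumptions), each track's frames can attend over frames pooled from all tracks so that rhythmic information is shared:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_track_attention(tracks):
    """tracks: array of shape (n_tracks, seq_len, dim).

    Each track's queries attend over the frames of ALL tracks, so
    every track sees a shared, synchronized view of the rhythm.
    Illustrative only; the real shared module is more elaborate.
    """
    n, t, d = tracks.shape
    keys = tracks.reshape(n * t, d)       # pool frames from all tracks
    out = np.empty_like(tracks)
    for i in range(n):
        q = tracks[i]                     # (t, d) queries for track i
        scores = q @ keys.T / np.sqrt(d)  # (t, n*t) similarity scores
        out[i] = softmax(scores, axis=-1) @ keys
    return out
```

The output keeps the per-track shape, so track-specific modules can still process each track separately afterwards.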
- [2026.03] The official implementation and evaluation metrics for SyncTrack are released.
- [2026.01] Paper accepted to ICLR 2026!
```bash
# Clone the repository
git clone https://github.com/YangLabHKUST/SyncTrack.git
cd SyncTrack

# Install dependencies
conda create -n synctrack python=3.9
conda activate synctrack
pip install -r requirements.txt
```

Note: For the evaluation metrics, make sure the audio processing library madmom is installed on your system.
To run the model, you must download the pre-trained weights for VAE, HiFi-GAN, and MusicLDM.
- 📥 Download Link: Model Checkpoints
After downloading, unzip the file and place the contents into the ckpt/ directory. Ensure your directory structure looks like this:
```
SyncTrack/
├── ckpt/
│   ├── vae-ckpt.ckpt       # (example name; match your unzipped files)
│   ├── hifigan-ckpt.ckpt
│   └── musicldm-ckpt.ckpt
├── config/
├── src/
└── ...
```
To train the SyncTrack model, use the train_synctrack.py script. The configuration is managed via YAML files.
In config/synctrack_train.yaml, please configure the following paths before starting:
- `data.params.path.train_data`: Path to your training dataset.
- `data.params.path.valid_data`: Path to your validation dataset.
- `model.params.ckpt_path`: (Optional) Path to a pre-trained checkpoint to resume training.
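For illustration, the dotted keys above would nest in config/synctrack_train.yaml roughly as follows (any surrounding fields beyond the ones listed are hypothetical; consult the shipped YAML for the full structure):

```yaml
data:
  params:
    path:
      train_data: /path/to/train   # your training dataset
      valid_data: /path/to/valid   # your validation dataset
model:
  params:
    ckpt_path: null                # optional: checkpoint to resume from
```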
```bash
python train_synctrack.py --config config/synctrack_train.yaml
```

To generate samples or evaluate the model on the test set, use the eval_synctrack.py script.
In config/synctrack_eval.yaml, please configure the following:
- `mode`: Ensure this is set to `test`.
- `data.params.path.valid_data`: Path to your test dataset.
- `model.params.ckpt_path`: Path to the model checkpoint.
🌟 Use Our Pre-trained Model
```bash
python eval_synctrack.py --config config/synctrack_eval.yaml
```

We provide a comprehensive suite of metrics to measure rhythmic stability and synchronization, located in the eval_metrics directory.
CBD.py quantifies rhythmic synchronization in multitrack music by measuring the dispersion of beat alignment across all pairs of tracks.
Key Parameters:
- `--folder`: Path to the directory containing audio stem subfolders (`stem_0`, `stem_1`, `stem_2`, and `stem_3`).
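As a hedged sketch of the idea behind CBD (not the released CBD.py — the nearest-beat pairing and the choice of standard deviation as the dispersion measure are assumptions), the dispersion of beat alignment across track pairs can be computed from per-track beat times, e.g. as produced by a beat tracker such as madmom:

```python
from itertools import combinations

import numpy as np

def cross_track_beat_dispersion(beat_times_per_track):
    """For every pair of tracks, take each beat's signed offset to the
    nearest beat in the other track and measure the spread (std) of
    those offsets; average over all pairs. Lower = tighter sync.
    Illustrative sketch only."""
    disps = []
    for a, b in combinations(beat_times_per_track, 2):
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        if a.size == 0 or b.size == 0:
            continue
        offsets = a[:, None] - b[None, :]          # all pairwise offsets
        idx = np.abs(offsets).argmin(axis=1)       # nearest beat in b
        nearest = offsets[np.arange(a.size), idx]  # signed nearest offsets
        disps.append(nearest.std())
    return float(np.mean(disps)) if disps else float("nan")
```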
```bash
python eval_metrics/CBD.py --folder /path/to/generated/stems
```

CBS.py measures rhythmic synchronization among multiple tracks by checking whether beats from different tracks align within a sliding window.
Key Parameters:
- `--folder`: Path to the directory containing audio stem subfolders (`stem_0`, `stem_1`, `stem_2`, and `stem_3`).
- `--window_size`: Length of the sliding window in seconds (default: `0.15`).
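One plausible simplified reading of CBS for a single pair of tracks (an assumption, not the released CBS.py): the fraction of beats in one track that are matched by a beat in the other track within the window:

```python
import numpy as np

def cross_track_beat_sync(beats_a, beats_b, window_size=0.15):
    """Fraction of beats in track A that have a beat in track B within
    +/- window_size seconds. Higher = better synchronized.
    Simplified sketch, not the released implementation."""
    beats_a = np.asarray(beats_a, dtype=float)
    beats_b = np.asarray(beats_b, dtype=float)
    if beats_a.size == 0 or beats_b.size == 0:
        return 0.0
    # Distance from each beat in A to its nearest beat in B.
    dists = np.abs(beats_a[:, None] - beats_b[None, :]).min(axis=1)
    return float((dists <= window_size).mean())
```

Averaging this score over all track pairs would give a single per-piece synchronization number.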
```bash
python eval_metrics/CBS.py --folder /path/to/generated/stems
```

IRS.py quantifies temporal consistency by averaging the standard deviation of the Inter-Beat Interval across all samples for each track.
Key Parameters:
- `--folder`: Path to the directory containing audio stem subfolders (`stem_0`, `stem_1`, `stem_2`, and `stem_3`).
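The IRS definition above maps almost directly to code; a minimal sketch over per-track beat times (assuming beats have already been extracted, e.g. with madmom — this is not the released IRS.py):

```python
import numpy as np

def inner_track_rhythmic_stability(beat_times_per_track):
    """Average, over tracks, of the standard deviation of the
    inter-beat interval (IBI). Lower = more stable rhythm.
    Illustrative sketch only."""
    stds = []
    for beats in beat_times_per_track:
        ibi = np.diff(np.asarray(beats, dtype=float))  # gaps between beats
        if ibi.size > 0:
            stds.append(ibi.std())
    return float(np.mean(stds)) if stds else float("nan")
```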
```bash
python eval_metrics/IRS.py --folder /path/to/generated/stems
```

If you find this code or our paper useful for your research, please cite:
```bibtex
@inproceedings{wangsynctrack,
  title={SyncTrack: Rhythmic Stability and Synchronization in Multi-Track Music Generation},
  author={Wang, Hongrui and Zhang, Fan and Yu, Zhiyuan and Zhou, Ziya and Chen, Xi and Yang, Can and Wang, Yang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
```

This repository is built upon MSG-LD and utilizes Madmom for beat tracking. We thank the authors for their open-source contributions.
Please feel free to contact Hongrui Wang (hwangfb@connect.ust.hk), Fan Zhang (mafzhang@ust.hk), or Prof. Can Yang (macyang@ust.hk) if you have any questions.
