Merged
2 changes: 1 addition & 1 deletion Cargo.lock


2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "cd-da-reader"
-version = "0.3.0"
version = "0.3.1"
edition = "2024"
description = "CD-DA (audio CD) reading library"
repository = "https://github.com/Bloomca/rust-cd-da-reader"
149 changes: 136 additions & 13 deletions README.md
@@ -3,25 +3,148 @@
[![Crates.io](https://img.shields.io/crates/v/cd-da-reader.svg)](https://crates.io/crates/cd-da-reader)
[![CI](https://github.com/Bloomca/rust-cd-da-reader/actions/workflows/pull-request-workflow.yaml/badge.svg?branch=main)](https://github.com/Bloomca/rust-cd-da-reader/actions/workflows/pull-request-workflow.yaml)

-This is a library to read audio CDs. This is intended to be a fairly low-level library, it intends to read TOC and allow to read raw PCM tracks data (there is a [simple helper](https://docs.rs/cd-da-reader/0.1.0/cd_da_reader/struct.CdReader.html#method.create_wav) to prepend RIFF header to convert raw data to a wav file), but not to provide any encoders to MP3, Vorbis, FLAC, etc -- if you need that, you'd need to compose this library with some others.
This is a simple library for reading audio CDs. It was written primarily to enable CD ripping, but you can also use it to build a live audio CD player. It is cross-platform (tested on Windows, macOS and Linux) and abstracts both access to the CD drive and reading the actual data from it. On every platform, operations happen in this order:

-It works on Windows, macOS and Linux, although each platform has slightly different behaviour regarding the handle exclusivity. Specifically, on macOS, it will not work if you use the audio CD somewhere -- the library will attempt to unmount it, claim exclusive access and only after read the data from it. After it is done, it will remount the CD back so other apps can use, which will cause the OS to treat as if you just inserted the CD.
1. Get a CD drive's handle
2. Read ToC (table of contents) of the audio CD
3. Read track data using ranges from ToC

-For example, if you want to read TOC and save the first track as a WAV file, you can do the following:
Let's go through each concept in order.

## CD access

First, we need to get hold of the CD drive. On Windows, you can see the drive letter in File Explorer (although the actual handle will be something like `"\\.\E:"`); on Linux, run `cat /proc/sys/dev/cdrom/info`; on macOS, run `diskutil list`.

Looking drives up by hand is brittle, so this library provides a few helper methods to find the correct CD drive. By far the most straightforward approach is to simply open the "default" drive:

```rust
use cd_da_reader::{CdReader};

let reader = CdReader::open_default()?;
```

This scans the CD drives and opens the first one with an audio CD in it, which is _usually_ what you want. If you want to let the user choose, there is an additional function to list all drives:

```rust
use cd_da_reader::{CdReader};

let drives = CdReader::list_drives()?;
```

This gives you a vector of drives; each entry has a `has_audio_cd` field that tells you whether the drive currently holds an audio CD. Unfortunately, this does not work on macOS because of how CD drive handles are treated there. To send any command to a CD drive (which we must do to check whether the disc is an audio CD), we have to claim exclusive access, which unmounts the disc. Releasing the handle remounts it, and that does two things:

1. call the default application for an audio CD (probably Apple Music)
2. that app will claim exclusivity, so we won't be able to get it back for some time

Because of that, on macOS you should either provide the drive name yourself or open the default drive:

```rust
use cd_da_reader::{CdReader};

// get the default drive, which should be what you want
let reader = CdReader::open_default()?;

// or read the disk directly
let reader = CdReader::open("disk14")?;
```
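If you let users pick a drive letter on Windows, the letter has to be turned into the device path form mentioned above. A minimal sketch, assuming a hypothetical helper named `windows_device_path` (it is not part of this library) whose result you would then pass to `CdReader::open`:

```rust
/// Hypothetical helper (not part of cd-da-reader): builds the Windows
/// device path for a drive letter, e.g. 'e' -> `\\.\E:`.
/// Returns `None` when the input is not an ASCII letter.
fn windows_device_path(letter: char) -> Option<String> {
    letter
        .is_ascii_alphabetic()
        .then(|| format!(r"\\.\{}:", letter.to_ascii_uppercase()))
}
```

On Linux and macOS no such translation is needed, since names like `/dev/sr0` or `disk6` are passed through unchanged.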

## Reading ToC

Each audio CD carries a Table of Contents (ToC): a map of all the tracks on the disc along with their block addresses. The only semantic metadata we get from it is the number of tracks, but reading it is crucial because we need those addresses to issue the commands that read the actual track data.

```rust
use cd_da_reader::{CdReader};

let reader = CdReader::open_default()?;
let toc = reader.read_toc()?;
```

This will give us a struct like:

```
{
first_track: 1,
last_track: 11,
tracks: [{
number: 1,
start_lba: 0,
start_msf: (0, 2, 0),
is_audio: true,
}, {
number: 2,
start_lba: 14675,
start_msf: (3, 17, 50),
is_audio: true,
}, ...],
leadout_lba: 221786
}
```

**LBA (Logical Block Address)** is a simple sequential sector index. LBA 0 is the first readable sector after the 2-second lead-in pre-gap at the start of every disc. It is the most convenient format for issuing read commands and is used internally to read data blocks.

**MSF (Minutes:Seconds:Frames)** is a time-based address inherited from the physical disc layout. A "frame" here is one CD sector, and the spec defines 75 frames per second. MSF includes a fixed 2-second (150-frame) offset for the lead-in area, so `MSF (0, 2, 0)` corresponds to LBA 0 — the very start of track data.

The two are fully interchangeable: `LBA + 150 = total frames from disc start`. From that total, `frames = total % 75`, `seconds = (total / 75) % 60` and `minutes = total / 75 / 60`, all using integer division. You will typically only need LBA values for reading track data, while MSF is required for services like MusicBrainz disc ID calculation.
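The conversion can be sketched in a few lines. The function names and the `(u8, u8, u8)` tuple type below are assumptions made for illustration; the library may represent MSF differently.

```rust
/// Frames (sectors) per second of audio.
const FRAMES_PER_SECOND: u32 = 75;
/// Fixed 2-second offset between MSF and LBA addressing.
const MSF_OFFSET: u32 = 150;

/// Converts an LBA into (minutes, seconds, frames).
/// The casts assume a valid disc address (at most 99 minutes).
fn lba_to_msf(lba: u32) -> (u8, u8, u8) {
    let total = lba + MSF_OFFSET;
    let frames = total % FRAMES_PER_SECOND;
    let total_seconds = total / FRAMES_PER_SECOND;
    ((total_seconds / 60) as u8, (total_seconds % 60) as u8, frames as u8)
}

/// Converts (minutes, seconds, frames) back into an LBA.
/// Returns `None` for addresses before (0, 2, 0), which have no LBA >= 0.
fn msf_to_lba((m, s, f): (u8, u8, u8)) -> Option<u32> {
    let total = (u32::from(m) * 60 + u32::from(s)) * FRAMES_PER_SECOND + u32::from(f);
    total.checked_sub(MSF_OFFSET)
}
```

For instance, `lba_to_msf(0)` returns `(0, 2, 0)`, and `msf_to_lba((0, 2, 0))` returns `Some(0)`.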

## Reading tracks

Finally, once we have the ToC, we can read tracks. A track spans from its own starting LBA up to the starting LBA of the next track (or the lead-out LBA for the last track). This library hides those details and simply reads the track numbers you provide. To read a track, all you need to do is call:

```rust
-let reader = CDReader::open_default()?;
use cd_da_reader::{CdReader};

let reader = CdReader::open_default()?;
let toc = reader.read_toc()?;
// we assume that track #1 exists for simplicity
let data = reader.read_track(&toc, 1)?;
```

This is a blocking call and can take a long time, depending on the track length and on how many retries the condition of the disc and drive requires. If you want to process the data as it arrives, use the streaming API:

-let first_audio_track = toc
-.tracks
-.iter()
-.find(|track| track.is_audio)
-.ok_or_else(|| std::io::Error::other("no audio tracks in TOC"))?;
```rust
use cd_da_reader::{CdReader, RetryConfig, TrackStreamConfig};

-let data = reader.read_track(&toc, last_audio_track.number)?;
-let wav_track = CdReader::create_wav(data);
-std::fs::write("myfile.wav", wav_track)?;
let reader = CdReader::open_default()?;
let toc = reader.read_toc()?;

let stream_cfg = TrackStreamConfig {
sectors_per_chunk: 27, // ~64 KB per chunk
retry: RetryConfig::default(),
};

let mut stream = reader.open_track_stream(&toc, 1, stream_cfg)?;
while let Some(chunk) = stream.next_chunk()? {
// do something with the chunk directly
}
```
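For progress reporting it helps to know in advance how many chunks a track should produce. The sketch below applies the boundary rule described earlier (a track ends where the next one starts, or at the lead-out) and the 2,352-byte sector size. `Track` and `Toc` here are local stand-ins that mirror only the fields shown in the ToC example; they are not the library's types, and the result is an estimate, not a guarantee about how the stream splits data.

```rust
/// Bytes of audio payload in one CD-DA sector.
const SECTOR_BYTES: u32 = 2_352;

/// Local stand-ins for illustration; they assume tracks are stored in disc order.
struct Track {
    number: u8,
    start_lba: u32,
}

struct Toc {
    tracks: Vec<Track>,
    leadout_lba: u32,
}

/// Returns the half-open sector range `[start, end)` of a track.
fn track_sector_range(toc: &Toc, number: u8) -> Option<(u32, u32)> {
    let idx = toc.tracks.iter().position(|t| t.number == number)?;
    let start = toc.tracks[idx].start_lba;
    // The last track ends at the lead-out; every other track ends where the next begins.
    let end = toc
        .tracks
        .get(idx + 1)
        .map_or(toc.leadout_lba, |next| next.start_lba);
    Some((start, end))
}

/// Largest whole number of sectors that fits in `target_bytes` (at least 1).
fn sectors_per_chunk_for(target_bytes: u32) -> u32 {
    (target_bytes / SECTOR_BYTES).max(1)
}

/// Estimated number of chunks needed to stream a track, rounding up.
fn chunk_count(toc: &Toc, number: u8, sectors_per_chunk: u32) -> Option<u32> {
    if sectors_per_chunk == 0 {
        return None;
    }
    let (start, end) = track_sector_range(toc, number)?;
    let sectors = end.checked_sub(start)?;
    Some((sectors + sectors_per_chunk - 1) / sectors_per_chunk)
}
```

`sectors_per_chunk_for(64 * 1024)` yields 27, which is where the value in the streaming example comes from: 27 sectors are 63,504 bytes, the largest whole-sector size under 64 KiB.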

-You can open a specific drive, but often the machine will have only 1 valid audio CD, so the default drive method should work in most scenarios. Reading track data is a pretty slow operation due to size and CD reading speeds, so there is a streaming API available if you want to interact with chunks of data directly.
## Track format

The data you receive by reading tracks is [PCM](https://en.wikipedia.org/wiki/Pulse-code_modulation), the same raw format used by WAV files. Audio CDs use 16-bit stereo PCM sampled at 44,100 Hz, so each second of audio is:

```
44,100 samples * 2 channels * 2 bytes = 176,400 bytes/second
```

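Each CD sector holds exactly 2,352 bytes of audio payload (176,400 / 75 = 2,352), which is why one second of audio spans 75 sectors. A typical 3-minute track is about 32 MB of raw PCM (180 × 176,400 = 31,752,000 bytes), and a full 74-minute CD holds about 783 MB of audio. The familiar 650 MB figure is the capacity of the same disc used as a data CD, where each sector stores only 2,048 bytes of user data.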
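The arithmetic above is easy to reproduce when sizing buffers or estimating disk usage. A small sketch (the constant and function names are illustrative, not part of the library):

```rust
const SAMPLE_RATE_HZ: u64 = 44_100;
const CHANNELS: u64 = 2;
const BYTES_PER_SAMPLE: u64 = 2; // 16-bit samples
const SECTORS_PER_SECOND: u64 = 75;

/// Raw PCM bytes produced by one second of CD audio.
fn bytes_per_second() -> u64 {
    SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
}

/// Audio payload of a single sector.
fn bytes_per_sector() -> u64 {
    bytes_per_second() / SECTORS_PER_SECOND
}

/// Raw PCM size of `seconds` of audio.
fn pcm_bytes(seconds: u64) -> u64 {
    seconds * bytes_per_second()
}
```

`pcm_bytes(180)` gives 31,752,000 bytes for a 3-minute track.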

Converting PCM data to a playable WAV file only requires prepending a 44-byte RIFF header. In fact, there is a helper for that in this library:

```rust
use cd_da_reader::{CdReader};

let reader = CdReader::open_default()?;
let toc = reader.read_toc()?;
// we assume that track #1 exists for simplicity
let data = reader.read_track(&toc, 1)?;
let wav = CdReader::create_wav(data);
std::fs::write("myfile.wav", wav)?;
```

This code reads the first track from the CD and saves it as a WAV file, which any music player can play.
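To show what such a 44-byte header contains, here is a sketch of the canonical PCM WAV layout for CD audio. It illustrates the format only and is not necessarily how `CdReader::create_wav` is implemented; `wav_header` and `to_wav` are hypothetical names.

```rust
/// Builds a canonical 44-byte RIFF/WAVE header for 16-bit stereo 44.1 kHz PCM.
/// `pcm_len` is the size of the raw audio in bytes; a full CD fits well within u32.
fn wav_header(pcm_len: u32) -> [u8; 44] {
    const CHANNELS: u16 = 2;
    const SAMPLE_RATE: u32 = 44_100;
    const BITS_PER_SAMPLE: u16 = 16;
    let block_align: u16 = CHANNELS * BITS_PER_SAMPLE / 8; // bytes per sample frame
    let byte_rate: u32 = SAMPLE_RATE * u32::from(block_align);

    let mut h = [0u8; 44];
    h[0..4].copy_from_slice(b"RIFF");
    h[4..8].copy_from_slice(&(36 + pcm_len).to_le_bytes()); // file size minus 8
    h[8..12].copy_from_slice(b"WAVE");
    h[12..16].copy_from_slice(b"fmt ");
    h[16..20].copy_from_slice(&16u32.to_le_bytes()); // size of the fmt chunk
    h[20..22].copy_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    h[22..24].copy_from_slice(&CHANNELS.to_le_bytes());
    h[24..28].copy_from_slice(&SAMPLE_RATE.to_le_bytes());
    h[28..32].copy_from_slice(&byte_rate.to_le_bytes());
    h[32..34].copy_from_slice(&block_align.to_le_bytes());
    h[34..36].copy_from_slice(&BITS_PER_SAMPLE.to_le_bytes());
    h[36..40].copy_from_slice(b"data");
    h[40..44].copy_from_slice(&pcm_len.to_le_bytes());
    h
}

/// Prepends the header to raw PCM data.
fn to_wav(pcm: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(44 + pcm.len());
    out.extend_from_slice(&wav_header(pcm.len() as u32));
    out.extend_from_slice(pcm);
    out
}
```

All multi-byte fields are little-endian, and the byte rate field works out to the same 176,400 bytes per second computed above.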

## What about metadata?

You might wonder why we expose LBA/MSF values if track reading is abstracted behind track numbers. The reason is metadata. Audio CDs can store text directly on the disc through the [CD-TEXT](https://en.wikipedia.org/wiki/CD-Text) extension, but this library does not expose it because it is extremely unreliable.

Instead, you can calculate a Disc ID for a service like [MusicBrainz](https://musicbrainz.org/), which requires the full ToC ([reference](https://musicbrainz.org/doc/Disc_ID_Calculation)). You can see an example of how to calculate the ID [here](https://github.com/Bloomca/audio-cd-ripper/blob/main/src/music_brainz/calculate_id.rs).
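As a rough illustration of why the full ToC is needed, the sketch below builds the ASCII string that the MusicBrainz algorithm hashes: the first and last track numbers as two uppercase hex digits each, followed by 100 eight-digit hex offsets (lead-out first, then tracks 1 to 99, zero for unused slots), each offset being `LBA + 150`. The function name and the `(number, start_lba)` tuple input are assumptions; the remaining steps (SHA-1, then the MusicBrainz Base64 variant) need an external crate and are left out. Multi-session discs have extra rules that this sketch ignores.

```rust
/// Builds the 804-character hex string that is fed to SHA-1 when
/// calculating a MusicBrainz Disc ID. `tracks` holds (number, start_lba) pairs.
fn disc_id_hash_input(
    first_track: u8,
    last_track: u8,
    leadout_lba: u32,
    tracks: &[(u8, u32)],
) -> String {
    // Slot 0 is the lead-out; slots 1..=99 are track offsets (0 when unused).
    let mut offsets = [0u32; 100];
    offsets[0] = leadout_lba + 150;
    for &(number, start_lba) in tracks {
        if (1..=99).contains(&number) {
            offsets[usize::from(number)] = start_lba + 150;
        }
    }

    let mut input = String::with_capacity(804);
    input.push_str(&format!("{:02X}{:02X}", first_track, last_track));
    for offset in offsets.iter() {
        input.push_str(&format!("{:08X}", offset));
    }
    input
}
```

The resulting string is always 2 + 2 + 100 × 8 = 804 characters long, regardless of how many tracks the disc has.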
160 changes: 135 additions & 25 deletions src/lib.rs
@@ -1,41 +1,151 @@
-//! # CD-DA (or audio CD) reading library
//! # CD-DA (audio CD) reading library
//!
-//! This library provides cross-platform audio CD reading capability,
-//! it works on Windows, macOS and Linux.
-//! It is intended to be a low-level library, and only allows you read
-//! TOC and tracks, and you need to provide valid CD drive name.
-//! Currently, the functionality is very basic, and there is no way to
-//! specify subchannel info, access hidden track or read CD text.
//! This library provides cross-platform audio CD reading capabilities
//! (tested on Windows, macOS and Linux). It was written to enable CD ripping,
//! but you can also implement a live audio CD player with its help.
//! The library works by issuing direct SCSI commands and abstracts both
//! access to the CD drive and reading the actual data from it, so you don't
//! deal with the hardware directly.
//!
-//! The library works by issuing direct SCSI commands.
//! All operations happen in this order:
//!
-//! ## Example
//! 1. Get a CD drive's handle
//! 2. Read the ToC (table of contents) of the audio CD
//! 3. Read track data using ranges from the ToC
//!
//! ## CD access
//!
//! The easiest way to open a drive is to use [`CdReader::open_default`], which scans
//! all drives and opens the first one that contains an audio CD:
//!
//! ```no_run
//! use cd_da_reader::CdReader;
//!
//! let reader = CdReader::open_default()?;
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
//! If you need to pick a specific drive, use [`CdReader::list_drives`] followed
//! by calling [`CdReader::open`] with the specific drive:
//!
//! ```no_run
//! use cd_da_reader::CdReader;
//!
//! // Windows / Linux: enumerate drives and inspect the has_audio_cd field
//! let drives = CdReader::list_drives()?;
//!
//! // Any platform: open a known path directly
//! // Windows: r"\\.\E:"
//! // macOS: "disk6"
//! // Linux: "/dev/sr0"
//! let reader = CdReader::open("disk6")?;
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
//! > **macOS note:** querying drives requires claiming exclusive access, which
//! > unmounts the disc. Releasing it triggers a remount that hands control to
//! > the default app (usually Apple Music). Use `open_default` or `open` with a
//! > known path instead of `list_drives` on macOS.
//!
//! ## Reading ToC
//!
//! Each audio CD carries a Table of Contents with the block address of every
//! track. You need to read it first before issuing any track read commands:
//!
//! ```no_run
//! use cd_da_reader::CdReader;
//!
//! let reader = CdReader::open_default()?;
//! let toc = reader.read_toc()?;
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
//! The returned [`Toc`] contains a [`Vec<Track>`](Track) where each entry has
//! two equivalent address fields:
//!
//! - **`start_lba`** -- Logical Block Address, which is a sector index.
//! LBA 0 is the first readable sector after the 2-second lead-in pre-gap.
//! This is the format used internally for read commands.
//! - **`start_msf`** — Minutes/Seconds/Frames, a time-based address inherited
//! from the physical disc layout. A "frame" is one sector; the spec defines
//! 75 frames per second. MSF includes a fixed 2-second (150-frame) lead-in
//! offset, so `(0, 2, 0)` corresponds to LBA 0. You can convert between them easily:
//! `LBA + 150 = total frames`, then divide by 75 and 60 for M/S/F.
//!
//! ## Reading tracks
//!
//! Pass the [`Toc`] and a track number to [`CdReader::read_track`]. The
//! library calculates the sector boundaries automatically:
//!
//! ```no_run
//! use cd_da_reader::CdReader;
//!
-//! fn read_cd() -> Result<(), Box<dyn std::error::Error>> {
-//! let reader = CdReader::open(r"\\.\E:")?;
-//! let toc = reader.read_toc()?;
-//! println!("{:#?}", toc);
-//! let data = reader.read_track(&toc, 11)?;
-//! let wav_track = CdReader::create_wav(data);
-//! std::fs::write("myfile.wav", wav_track)?;
-//! Ok(())
//! let reader = CdReader::open_default()?;
//! let toc = reader.read_toc()?;
//! let data = reader.read_track(&toc, 1)?; // we assume track #1 exists and is audio
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
//! This is a blocking call. For a live-playback or progress-reporting use case,
//! use the streaming API instead:
//!
//! ```no_run
//! use cd_da_reader::{CdReader, RetryConfig, TrackStreamConfig};
//!
//! let reader = CdReader::open_default()?;
//! let toc = reader.read_toc()?;
//!
//! let cfg = TrackStreamConfig {
//! sectors_per_chunk: 27, // ~64 KB per chunk
//! retry: RetryConfig::default(),
//! };
//!
//! let mut stream = reader.open_track_stream(&toc, 1, cfg)?;
//! while let Some(chunk) = stream.next_chunk()? {
//! // process chunk — raw PCM, 2 352 bytes per sector
//! }
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
-//! This function reads an audio CD on Windows, you can check your drive letter
-//! in the File Explorer. On macOS, you can run `diskutil list` and look for the
-//! Audio CD in the list (it should be something like "disk4"), and on Linux you
-//! can check it using `cat /proc/sys/dev/cdrom/info`, it will be like "/dev/sr0".
//! ## Track format
//!
//! Track data is raw [PCM](https://en.wikipedia.org/wiki/Pulse-code_modulation),
//! the same format used inside WAV files. Audio CDs use 16-bit stereo PCM
//! sampled at 44 100 Hz:
//!
//! ```text
//! 44 100 samples * 2 channels * 2 bytes = 176 400 bytes/second
//! ```
//!
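//! Each sector holds exactly 2 352 bytes (176 400 ÷ 75 = 2 352), which is why
//! one second of audio spans 75 sectors. A typical 3-minute track is
//! ~32 MB; a full 74-minute CD is ~783 MB of audio (the familiar 650 MB
//! figure applies to data CDs, which store 2 048 user bytes per sector).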
//!
//! Converting raw PCM to a playable WAV file only requires prepending a 44-byte
//! RIFF header — [`CdReader::create_wav`] does exactly that:
//!
//! ```no_run
//! use cd_da_reader::CdReader;
//!
//! let reader = CdReader::open_default()?;
//! let toc = reader.read_toc()?;
//! let data = reader.read_track(&toc, 1)?;
//! let wav = CdReader::create_wav(data);
//! std::fs::write("track01.wav", wav)?;
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
//!
//! ## Metadata
//!
-//! This library does not provide any direct metadata, and audio CDs typically do
-//! not carry it by themselves. To obtain it, you'd need to get it from a place like
-//! [MusicBrainz](https://musicbrainz.org/). You should have all necessary information
-//! in the TOC struct to calculate the audio CD ID.
//! Audio CDs carry almost no semantic metadata. [CD-TEXT] exists but is
//! unreliable, so it is not exposed by this library. The practical approach is to
//! calculate a Disc ID from the ToC and look it up on a service such as
//! [MusicBrainz]. The [`Toc`] struct exposes everything required for the
//! [MusicBrainz disc ID algorithm].
//!
//! [CD-TEXT]: https://en.wikipedia.org/wiki/CD-Text
//! [MusicBrainz]: https://musicbrainz.org/
//! [MusicBrainz disc ID algorithm]: https://musicbrainz.org/doc/Disc_ID_Calculation
#[cfg(target_os = "linux")]
mod linux;
#[cfg(target_os = "macos")]