decimeta

Determines the Dewey Decimal Classification (DDC) for a given query. Because the DDC is proprietary and copyrighted, this project instead scrapes data from the Melvil Decimal System (MDS), the closest freely available equivalent.

How It Works

Decimeta uses two complementary approaches to classify queries:

Embedding search: compares the query's embedding against a vector database of MDS classifications, using OpenAI's text-embedding-3-small model and Pinecone. This works well when the query relates directly to a classification's topic, but it may miss nuance or context.
Guided GPT navigation: uses OpenAI's GPT-4.1 to walk the classification hierarchy step by step, narrowing from the hundreds place to the tens to the ones (and into decimals if needed). GPT is prone to hallucinating DDC numbers, so at each step it is given a concrete list of options to pick from. GPT-4.1 was chosen for being a cheap and fast non-reasoning model.
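The two approaches above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the MdsNode shape, the function names, and the use of cosine similarity are all assumptions (the Pinecone index's metric and the layout of mds.json are not described here), and the chooser is a synchronous stand-in for what would be an asynchronous model call in practice.

```typescript
// Hypothetical shape for one classification entry; the real mds.json layout may differ.
interface MdsNode {
  code: string;
  label: string;
  children: MdsNode[];
}

// Approach 1 (assumed metric): rank stored vectors by cosine similarity to the query vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function nearest(
  queryVec: number[],
  entries: { code: string; vector: number[] }[],
  k: number,
): string[] {
  return entries
    .map((e) => ({ code: e.code, score: cosineSimilarity(queryVec, e.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((e) => e.code);
}

// Approach 2: descend the hierarchy one level at a time, only ever offering concrete options.
// A Chooser stands in for the model: it sees the query and the options, and returns one code.
type Chooser = (query: string, options: MdsNode[]) => string;

function narrow(query: string, roots: MdsNode[], choose: Chooser): MdsNode[] {
  const path: MdsNode[] = [];
  let options = roots;
  while (options.length > 0) {
    const picked = choose(query, options);
    const next = options.find((o) => o.code === picked);
    // Stop if the answer is not one of the offered options (a hallucinated number).
    if (!next) break;
    path.push(next);
    options = next.children;
  }
  return path;
}

// Tiny illustrative tree; labels are for demonstration only.
const sampleTree: MdsNode[] = [
  {
    code: "500",
    label: "Science",
    children: [
      {
        code: "510",
        label: "Mathematics",
        children: [{ code: "516", label: "Geometry", children: [] }],
      },
    ],
  },
  { code: "700", label: "Arts & recreation", children: [] },
];

// Toy chooser: picks the first option whose label appears in the query.
// It returns "999" otherwise, imitating a code that was never offered.
const pickByKeyword: Chooser = (query, options) => {
  const hit = options.find((o) => query.toLowerCase().includes(o.label.toLowerCase()));
  return hit ? hit.code : "999";
};

const demoPath = narrow("science mathematics geometry of triangles", sampleTree, pickByKeyword);
console.log(demoPath.map((n) => n.code).join(" > ")); // prints "500 > 510 > 516"
```

Validating each answer against the offered options is what keeps a hallucinated number from propagating: the walk simply stops at the last valid level.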

Setup

Install dependencies:

bun install

Set environment variables:

OPENAI_API_KEY=your_key_here
PINECONE_API_KEY=your_key_here

Usage

Scrape MDS data. This shouldn't be necessary, as scraped data is already provided in mds.json. Scraping is also slow and puts a heavy load on LibraryThing's servers, so please use it sparingly:

bun run scrape

Generate and store embeddings (this will also take a while):

bun run vectorize

Start the server:

bun start

The web interface will be available at http://localhost:3000. It calls the API routes /api/classify/gpt?query=your_query and /api/classify/embeddings?query=your_query.
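The API routes can also be called directly. The following is a small sketch of a client, assuming the server is running locally on port 3000 as described above. The helper names classifyUrl and classify are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```typescript
const BASE_URL = "http://localhost:3000";

type Method = "gpt" | "embeddings";

// Build the request URL, letting URLSearchParams handle query encoding.
function classifyUrl(method: Method, query: string): string {
  const url = new URL(`/api/classify/${method}`, BASE_URL);
  url.searchParams.set("query", query);
  return url.toString();
}

// Call one of the two routes and return the raw response body.
// The response shape is an unknown here, so no JSON structure is assumed.
async function classify(method: Method, query: string): Promise<string> {
  const res = await fetch(classifyUrl(method, query));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.text();
}

console.log(classifyUrl("gpt", "history of jazz"));
// prints "http://localhost:3000/api/classify/gpt?query=history+of+jazz"
```

With the server started, something like classify("embeddings", "history of jazz").then(console.log) would print whatever that route returns.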
