🎓 Knowledge-Augmented Academic Assistant (AI Module)

This repository contains the AI core of a Knowledge-Augmented Academic Assistant designed to help students explore universities, get guidance, and receive accurate academic information.

The system is built using a Retrieval-Augmented Generation (RAG) approach, combining:

📊 Structured university data
🤖 Large Language Models (OpenAI)
🎙️ Voice input + transcription pipeline

🌐 Live Website: Visit clasia.net

🚀 Key Features

🤖 AI Chatbot for Students
- Provides guidance on universities, programs, and decisions
- Answers contextual and follow-up questions
📚 RAG-Based Knowledge System
- Retrieves relevant university data
- Augments LLM responses with real information
- Improves factual accuracy and reduces hallucinations
🎙️ Voice Input Support
- Accepts audio queries
- Converts speech → text → AI response
🧠 Context-Aware Conversations
- Maintains chat history
- Generates more relevant responses over time

🧠 RAG Architecture (Core Idea)

This project follows a Retrieval-Augmented Generation pipeline, which works in two main stages:

1. Retrieval Stage

User query is processed
Relevant university data is selected from the internal dataset
Context is prepared dynamically

2. Generation Stage

Retrieved context + user query is sent to the LLM
OpenAI model generates a grounded response

This approach improves reliability because:

The model does not rely only on pre-trained knowledge
It uses real, up-to-date, domain-specific data

RAG systems enhance LLM outputs by injecting external knowledge before generation, improving accuracy and reducing hallucination. :contentReference[oaicite:0]{index=0}

⚙️ How the System Works

Text Flow

User sends a query
System checks:
- Chat history
- University dataset
Relevant context is retrieved
OpenAI generates a structured response

Voice Flow

User provides audio input (.mp3)
Audio is processed and transcribed
Transcribed text follows the same RAG pipeline
Final response is generated

🗂️ Project Structure

├── main.py                       # Core RAG chatbot pipeline (text)
├── modified_main.py              # Enhanced / structured response version
├── audio_main.py                 # Voice input + transcription + RAG pipeline
├── output/                       # Generated outputs and processed data
├── dummy1.mp3 - dummy5.mp3       # Sample audio files for testing

🧩 Tech Stack

Python
OpenAI API (LLM + reasoning)
Pydantic (structured outputs)
Audio processing (local handling)
RAG architecture (custom implementation)

Note: This project primarily uses OpenAI for intelligence. No external speech API dependency is strictly required in this version.

🛠️ Setup Instructions

1. Clone the repository

git clone https://github.com/BusraRafa/Knowledge-Augmented-Academic-Assistant.git
cd Knowledge-Augmented-Academic-Assistant

2. Install dependencies

pip install -r requirements.txt

(If requirements.txt is missing, install manually: openai, python-dotenv, pydantic, pydub)

3. Set environment variables

Create a .env file:

OPENAI_API_KEY=your_openai_api_key

▶️ Usage

Run Text-Based Chatbot

python main.py

Run Voice-Based Assistant

python audio_main.py

🎧 Sample Audio

You can test the system using provided files:

dummy1.mp3
dummy2.mp3
dummy3.mp3
dummy4.mp3
dummy5.mp3

📌 Notes

This repository only contains the AI module
Frontend, UI/UX, and full application integration are handled separately
Designed to be integrated into:
- Web apps
- WhatsApp bots
- Student platforms

💡 Future Improvements

Better retrieval optimization (vector DB / embeddings)
Multilingual support
Better personalization for students
Integration with live university APIs
Text-to-Speech (voice output)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎓 Knowledge-Augmented Academic Assistant (AI Module)

🚀 Key Features

🧠 RAG Architecture (Core Idea)

1. Retrieval Stage

2. Generation Stage

⚙️ How the System Works

Text Flow

Voice Flow

🗂️ Project Structure

🧩 Tech Stack

🛠️ Setup Instructions

1. Clone the repository

2. Install dependencies

3. Set environment variables

▶️ Usage

🎧 Sample Audio

📌 Notes

💡 Future Improvements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
output		output
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
audio_main.py		audio_main.py
dummy1.mp3		dummy1.mp3
dummy2.mp3		dummy2.mp3
dummy3.mp3		dummy3.mp3
dummy4.mp3		dummy4.mp3
dummy5.mp3		dummy5.mp3
main.py		main.py
modified_main.py		modified_main.py

Folders and files

Latest commit

History

Repository files navigation

🎓 Knowledge-Augmented Academic Assistant (AI Module)

🚀 Key Features

🧠 RAG Architecture (Core Idea)

1. Retrieval Stage

2. Generation Stage

⚙️ How the System Works

Text Flow

Voice Flow

🗂️ Project Structure

🧩 Tech Stack

🛠️ Setup Instructions

1. Clone the repository

2. Install dependencies

3. Set environment variables

▶️ Usage

🎧 Sample Audio

📌 Notes

💡 Future Improvements

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages