Arabic Chat with PDF is a user-friendly application that lets you interact with Arabic PDF documents. Powered by advanced language models, OCR, and vector search, it allows you to upload PDFs, ask questions, and receive accurate Arabic responses 🚀
This research fine-tunes an Arabic OCR model with Tesseract 5.0, improving text recognition accuracy through extensive data collection, preprocessing, and image generation. By combining advanced training techniques with data augmentation, we achieve a significant reduction in word error rate (WER).
Alef-OCR-Image2Html is an OCR model designed to transform Arabic documents, including historical texts, scanned pages, and handwritten materials, into structured, semantic HTML.
Official code for "Ketaba-OCR at AR-MS NakbaNLP 2026": QLoRA fine-tuning of a specialized HTR model with a Linear+Boost ensemble for Arabic manuscript recognition. Placed 1st per-line (CER 0.082) and 3rd on the official leaderboard at NakbaNLP 2026 (LREC 2026).
Additional experimental model for the NakbaNLP 2026 Shared Task (AR-MS): LoRA/DoRA fine-tuning of Qari-OCR (Qwen2-VL-2B) for Arabic handwritten manuscript recognition on the Omar Al-Saleh Memoir Collection (1951-1965).
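Several of the projects above report word error rate (WER) or character error rate (CER). As a rough illustration of what those numbers mean, here is a minimal sketch that computes both as Levenshtein edit distance divided by reference length. The function names are hypothetical and not taken from any of the listed repositories, and each project may normalize text differently (for example, stripping diacritics) before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Illustrative strings only, not data from any listed project.
print(cer("كتاب", "كتب"))            # one missing character out of four
print(wer("a b c d", "a x c d"))     # one substituted word out of four
```

Under this definition, a CER of 0.082 corresponds to roughly 8 character edits per 100 reference characters.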