feat(docs): Quarto-generated documentation site#96
Conversation
Merge [strategy theirs] of main to bring up to date with latest codebase.
…boto) and backup font (system - Helvetica Neue)
There was a problem hiding this comment.
Really cool to see how the quarto docs work in terms of the yaml and styles. And nice work done on adding the documentation and some of the typing on arguments. Looks good overall, I've added a few questions and suggestions but I think most are just small things.
src/classifai/indexers/__init__.py
Outdated
| @@ -1,4 +1,30 @@ | |||
| """Indexers package.""" | |||
| # pylint: disable=C0301 | |||
| """This module provides functionality for creating a vector index from a text file. | |||
There was a problem hiding this comment.
"from a csv (text) file" instead?
src/classifai/indexers/__init__.py
Outdated
|
|
||
| This class interacts with the Vectoriser class from the vectorisers submodule, | ||
| expecting that any vector model used to generate embeddings used in the | ||
| VectorStore objects is an instance of one of these classes, most notably |
There was a problem hiding this comment.
I would remove the hard reference to there being 3 Vectoriser classes. We might add more, users might make a custom one. Possibly it could refer to the base class instead
There was a problem hiding this comment.
Should this doctoring also capture the other functionalities of the VectorStore such as search() reverse_saerch(0 and embed(), and from_filesepace(). The key features and doctoring currently only refers to what happens in the index create / constructor step.
There was a problem hiding this comment.
Good suggestions - I've updated this docstring in full (will push the updates shortly).
I've dropped the dependency information and usage description to keep the information more targeted towards the indexers module.
src/classifai/indexers/main.py
Outdated
| import shutil | ||
| import time | ||
| import uuid | ||
| from typing import Optional, Self, Union # noqa: F401 |
There was a problem hiding this comment.
Good catch, thanks!
|
|
||
| Returns: | ||
| VectorStore: An instance of the `VectorStore` class. | ||
| (VectorStore): An instance of the `VectorStore` class. |
There was a problem hiding this comment.
I think the parenthesis is not conventional style for writing docstrings - or is this needed for Quarto to work?
https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
There was a problem hiding this comment.
This was needed to get quartodoc to detect & render it as being a datatype rather than a variable name.
src/classifai/servers/__init__.py
Outdated
| an api-endpoint. | ||
|
|
||
| from .main import get_router, run_server | ||
| These functions interact with the ClassifAI PackageIndexer modules |
There was a problem hiding this comment.
should there be a space between Package and Indexer?
src/classifai/servers/__init__.py
Outdated
| @@ -1,5 +1,12 @@ | |||
| """FastAPI server as a package.""" | |||
| """This module provides functionality for creating a start a restAPI service which | |||
| allows a user to call the search methods of different VectorStore objects, from | |||
There was a problem hiding this comment.
'allows a user to call the search methods of one or more VectorStore objects'?
There was a problem hiding this comment.
Just checking were the TODO sections left intentionally?
There was a problem hiding this comment.
I'd initially wanted to keep the scope of the PR to the quarto static site generation, but as these were the only ones missing I'm happy to add them to it.
|
Final note that I could not get quarto to run locally using the how to test guide. Working on the current branch, I was able to |
…lasses to website, updated docstrings as suggested, merged schema definitions directly into dataclasses
Apologies for that, I did some of this work on a workstation, where I have I'll update the testing instructions for the correct |
frayle-ons
left a comment
There was a problem hiding this comment.
Looks good to me, just one or two final suggestions after re-reviewing. I still couldn't get the local version of the docs to run as I was getting a ModuleNotFoundError: No module named 'griffe' when trying to run uv run quartodoc build but the temp deployed version looks good.
Can we add something to the GitHub readme page that links to the deployed quarto docs? I can't see anything at the moment in Github that points to new docs site. And can that somehow be hidden in the quarto rendered pages so it doesn't link to itself?
| and similarity scores. | ||
|
|
||
| This class represents the output of vector store search operations, containing | ||
| query information, matched documents, and similarity rankings. |
There was a problem hiding this comment.
Thanks, addressed in latest commit
src/classifai/indexers/main.py
Outdated
| batch_size (int): [optional] The batch size for processing the input file and batching to | ||
| vectoriser. Defaults to 8. | ||
| meta_data (dict, optional): key,value pair metadata column names to extract from the input file and their types. | ||
| meta_data (dict): key,value pair metadata column names to extract from the input file and their types. |
There was a problem hiding this comment.
is this one no longer optional?
There was a problem hiding this comment.
Thanks, I should have updated that to
(dict): [optional]
There was a problem hiding this comment.
Addressed in latest commit 👍
Thanks for the review, I'll implement the two changes you suggested. I'll check the
Not at this point - I'll be submitting a follow-up PR to add the static site building & deployment to the CICD workflow, after which we can update the README to reference the GitHub Pages site. The one that's currently deployed is a throwaway one on my personal github pages, for testing / demo purposes.
That might be possible, but it wouldn't be straightforward / typical. I don't think having a self-referencing link to the documentation site in the landing page of the documentation site would be especially off-putting to people though. |
|
Thanks for the review @frayle-ons - I've implemented the changes you recommended, and this is ready for re-review |
📌 Quarto-generated Documentation for the package
✨ Summary
This PR uses Quarto and Quartodoc to build a documentation website for the package, based on the package docstrings, as well as the markdown and notebook files.
Scope - this PR introduces only the requirements to generate the documentation site, and some updates to typehints / docstrings to facilitate forming the documentation.
A short follow-up PR will introduce the CICD YAML changes to build and deploy to GitHub Pages as part of the release workflow.
📜 Changes Introduced
index.qmd) on opening✅ Checklist
terraform fmt&terraform validate)🔍 How to Test
uv syncto get the additional (dev) packages requireduv run quartodoc buildto build the quarto markdown filesuv run quarto renderto render html documentation siteuv run quarto previewto locally host the documentation site - explore it in a browser