Skip to content

Conversation

@kirit93
Copy link
Contributor

@kirit93 kirit93 commented Jan 31, 2026

Updated docs to simplify introduction, talk about library vs. ms, and give more details on inference.

@kirit93 kirit93 requested a review from a team as a code owner January 31, 2026 03:35
@greptile-apps
Copy link

greptile-apps bot commented Jan 31, 2026

Greptile Overview

Greptile Summary

This PR significantly improves the documentation by consolidating the getting started experience and adding comprehensive guides on deployment and performance optimization.

Major improvements:

  • Merged installation.md and quick-start.md into a streamlined index.md that provides installation, setup, and a first dataset example in one place
  • Added architecture-and-performance.md with detailed execution model explanation, concurrency formulas, and tuning guidelines
  • Added deployment-options.md comparing library vs microservice deployment with decision flowchart
  • Introduced a Dev Notes blog section with mkdocs-material blog plugin configuration
  • Updated all internal cross-references from deleted pages to the new consolidated index.md

Known issues (already flagged in previous reviews):

  • Broken link in docs/index.md to notebooks/README.md (should be notebook_source/README.md)
  • Broken link in docs/blog/posts/welcome.md to ../../notebooks/README.md
  • Future date in welcome blog post (2026-01-22 should likely be 2025-01-22)
  • Heading structure inconsistency in deployment-options.md

The documentation restructuring is well-executed and makes the getting started experience much more cohesive. The new performance and deployment guides fill important gaps.

Confidence Score: 4/5

  • This PR is safe to merge with minor documentation issues that should be addressed
  • Documentation changes only with no code modifications. The restructuring is logical and well-executed. The score reflects the presence of broken links and a date issue that were already flagged but should be fixed before or shortly after merge.
  • Fix broken links in docs/index.md and docs/blog/posts/welcome.md, and verify the blog post date

Important Files Changed

Filename Overview
docs/blog/posts/welcome.md Added welcome blog post (has broken link and future date already flagged)
docs/concepts/architecture-and-performance.md Added comprehensive performance tuning guide with execution model and configuration parameters
docs/concepts/deployment-options.md Added deployment guide comparing library vs microservice (has heading structure issue already flagged)
docs/index.md Streamlined welcome page with installation, setup, and first dataset example (has broken link already flagged)
mkdocs.yml Updated navigation to remove deleted pages, add new guides, and configure blog plugin

Sequence Diagram

sequenceDiagram
    participant User
    participant IndexMD as docs/index.md
    participant ArchGuide as architecture-and-performance.md
    participant DeployGuide as deployment-options.md
    participant ModelDocs as Model Documentation
    participant Blog as Dev Notes Blog

    User->>IndexMD: Visit documentation homepage
    Note over IndexMD: Now includes installation & quick start<br/>(merged from deleted files)
    
    IndexMD->>User: Show install, setup, first dataset example
    
    User->>DeployGuide: Learn about deployment options
    Note over DeployGuide: New: Library vs Microservice guide
    DeployGuide->>User: Decision flowchart & comparison
    
    User->>ArchGuide: Optimize performance
    Note over ArchGuide: New: Comprehensive performance guide
    ArchGuide->>User: Execution model, tuning parameters,<br/>concurrency formulas
    
    User->>ModelDocs: Configure models
    Note over ModelDocs: Updated links to index.md<br/>(removed quick-start.md references)
    ModelDocs->>ArchGuide: Cross-reference for concurrency
    
    User->>Blog: Read dev notes
    Note over Blog: New: Blog section with welcome post<br/>(mkdocs-material blog plugin)
    Blog->>User: Team insights & deep dives
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile


The library is the right choice for most users. Choose it if you:

### You Have Access to LLMs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heading structure inconsistent with rest of section - "You Have Access to LLMs" is a level 3 heading but reads like it should be a bullet point or part of the previous section's explanation

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!


- [Inference Architecture](inference-architecture.md): Understanding separation of concerns
- [Model Configuration](models/model-configs.md): Complete model settings reference
- [RunConfig Reference](../code_reference/run_config.md): API documentation
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broken link to non-existent ../code_reference/run_config.md - the code_reference directory doesn't exist in the repository

…g started

- Add deployment-options.md: Library vs. Microservice decision guide
- Add inference-architecture.md: Separation of concerns with LLM servers
- Add performance-tuning.md: Concurrency and batching optimization guide
- Streamline index.md: Merge installation, add quick example, simplify
- Remove quick-start.md: Content merged into welcome page
- Remove installation.md: Content merged into welcome page
- Update model docs: Add concurrency control sections and cross-references
- Update mkdocs.yml: Add new Architecture section to navigation
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines 205 to 207
---

---
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate horizontal rules (two --- in a row)

Suggested change
---
---
---
## Next Steps

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

- Remove duplicate max_parallel_requests tables from model-configs.md and inference-parameters.md
- Remove duplicate Concurrency Control section from model-configs.md
- Simplify Concurrency Control in inference-parameters.md to link to performance-tuning.md
- Remove Troubleshooting section from inference-architecture.md (covered in performance-tuning.md)
- performance-tuning.md is now the authoritative source for tuning guidance
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link

greptile-apps bot commented Feb 2, 2026

Additional Comments (1)

mkdocs.yml
missing blog plugin configuration - the PR adds blog files (.authors.yml, blog/index.md, blog/posts/welcome.md) but doesn't configure the MkDocs blog plugin

  - blog
  - search


---

## Execution Model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd perhaps call out that this is our current column-wise dataset generator... with other dataset generator in the works...

```
┌─────── Batch 1 (buffer_size records) ────────────────────────────────────────┐
│ │
│ Column 1 (Sampler): ════════► (non_inference_max_parallel_workers) │
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth calling out that this is just an example. While LLM columns will always come after samplers, expression columns may come before certain llm columns

```python
from data_designer.config import RunConfig

run_config = RunConfig(buffer_size=2000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be show how to set this? data_designer.set_run_config(...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we fllow dd.config.RunConfig pattern in these examples?

|-------------------|-------------------|
| NVIDIA API Catalog | 4-8 |
| Self-hosted vLLM (single GPU) | 8-16 |
| Self-hosted vLLM (multi-GPU) | 16-64 |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is dependent on the vLLM server stack + model performance. Could easily be as high as 1024, for example. It would be good to suggest running some benchmarks to record how long generation took across different values of parallelism until the vllm sever is saturated and the sdg run time doesn't decrease.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

johnnygreco
johnnygreco previously approved these changes Feb 2, 2026
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile


<div class="grid cards" markdown>

- :material-book-open-variant: **[Tutorials](notebooks/README.md)**
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broken link - notebooks/README.md doesn't exist (file is at notebook_source/README.md). Check if there's a build step that creates it or update the link.


Watch this space for technical deep dives into synthetic data generation, model customization, and more. We'll be sharing insights from our work and the broader community.

In the meantime, check out our [Welcome guide](../../index.md) and [tutorial notebooks](../../notebooks/README.md) to get started with Data Designer.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broken link - ../../notebooks/README.md doesn't exist (file is at ../../notebook_source/README.md)

@@ -0,0 +1,26 @@
---
date: 2026-01-22
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

date is in the future (2026-01-22). Should this be 2025-01-22?

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@kirit93 kirit93 merged commit de7c3ab into main Feb 3, 2026
46 checks passed
@kirit93 kirit93 deleted the doc-improvements branch February 3, 2026 05:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants