🛠️ Refactor: modularize API handler into internal packages #142

Open
lucasew wants to merge 1 commit into master from refactor/modularize-api-7015727293619554448

@lucasew
Owner

@lucasew lucasew commented Feb 13, 2026

This PR refactors the api/index.go file by extracting logic into a modular internal/ directory structure.

Changes:

  • Created internal/transport for HTTP client and SSRF logic.
  • Created internal/article for fetching and parsing logic.
  • Created internal/request for request utility functions (URL normalization, format detection, LLM detection).
  • Created internal/formatter for response formatting (HTML, Markdown, JSON, etc.).
  • Refactored api/index.go to import and use these packages.
  • Moved and updated tests to correspond with the new package structure.
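To illustrate the kind of helper that fits in internal/request, here is a minimal sketch of URL normalization. The function name normalizeURL, the default-to-https rule, and the error messages are assumptions for illustration; the PR's actual NormalizeURL may behave differently.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// normalizeURL is a hypothetical stand-in for the PR's NormalizeURL.
// It trims whitespace, defaults to https when no scheme is given,
// and rejects anything that is not http(s) or has no host.
func normalizeURL(raw string) (string, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return "", errors.New("empty URL")
	}
	// Assumption: a bare host such as "example.com/post" defaults to https.
	if !strings.Contains(raw, "://") {
		raw = "https://" + raw
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Hostname() == "" {
		return "", errors.New("missing host")
	}
	return u.String(), nil
}

func main() {
	for _, in := range []string{"example.com/post", "ftp://example.com", "  "} {
		out, err := normalizeURL(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Keeping this as a pure string-in, string-out function is what allows it to be covered by table-driven tests without starting a server.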

Justification:

  • Separation of Concerns: Each package now has a single responsibility.
  • Testability: Logic is easier to test in isolation (e.g., Fetch accepts a client).
  • Maintainability: Smaller, focused files are easier to understand and modify.
  • Safety: SSRF protection and other critical logic are preserved and isolated.
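The testability point can be sketched as follows: when the fetch function receives its HTTP client as a parameter, a test can hand it the client of an in-process server. The names fetchBody and fetchFromStub, the signature, and the 1 MiB body cap are assumptions for illustration, not the PR's actual Fetch.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchBody is a hypothetical, simplified stand-in for article.Fetch.
// The client is injected by the caller rather than constructed here.
func fetchBody(ctx context.Context, client *http.Client, target string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
	if err != nil {
		return "", fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", fmt.Errorf("do request: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	// Cap the body size; 1 MiB is an arbitrary limit chosen for this sketch.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return "", fmt.Errorf("read body: %w", err)
	}
	return string(body), nil
}

// fetchFromStub serves html from an in-process test server and fetches it
// back through fetchBody, using the client that httptest provides.
func fetchFromStub(html string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, html)
	}))
	defer srv.Close()
	return fetchBody(context.Background(), srv.Client(), srv.URL)
}

func main() {
	body, err := fetchFromStub("<html><title>hi</title></html>")
	fmt.Println(body, err)
}
```

In production the same function would presumably receive the SSRF-guarded client from internal/transport, so the test double and the hardened client share one code path.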

Verification:

  • Ran go test ./... to ensure all tests pass.
  • Ran mise run ci to verify linting and formatting.

PR created automatically by Jules for task 7015727293619554448 started by @lucasew

Decomposed the monolithic `api/index.go` into domain-specific internal packages to improve maintainability, testability, and separation of concerns.

- `internal/transport`: HTTP client configuration and SSRF protection (`NewSafeClient`).
- `internal/article`: Article fetching and parsing logic (`Fetch`).
- `internal/request`: Request parsing, validation, and URL reconstruction (`NormalizeURL`, `ReconstructURL`, `GetFormat`).
- `internal/formatter`: Output formatting and rendering (`Render`, `Template`).
- Updated `api/index.go` to serve as a clean orchestration layer using these packages.
- Migrated and split tests into their respective package test files.
- Added integration test in `api/index_test.go` to verify wiring.

This refactoring adheres to the Single Responsibility Principle and makes the codebase easier to navigate and extend.

Co-authored-by: lucasew <15693688+lucasew@users.noreply.github.com>
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@vercel

vercel bot commented Feb 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
articleparser | Error | Error | Feb 13, 2026 7:22am

@gemini-code-assist

Summary of Changes

Hello @lucasew, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the application's API handler by breaking down a monolithic api/index.go file into a set of specialized internal packages. This architectural change aims to enhance the codebase's organization, making it more maintainable, testable, and scalable by clearly delineating responsibilities for HTTP transport, article processing, request handling, and response formatting. The refactoring ensures that critical security features like SSRF protection are preserved and isolated within their respective modules.

Highlights

  • API Handler Modularization: The core API handler (api/index.go) has been refactored by extracting its functionalities into dedicated internal packages, significantly improving separation of concerns.
  • New Internal Packages: Four new internal packages were created: internal/transport for HTTP client and SSRF logic, internal/article for fetching and parsing, internal/request for URL utilities and format detection, and internal/formatter for response rendering.
  • Improved Testability and Maintainability: Logic is now easier to test in isolation, and smaller, focused files enhance overall maintainability.
  • Test Restructuring: Existing tests were moved and updated to align with the new modular package structure, and redundant test files were removed.
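As a rough illustration of response rendering by format, the sketch below dispatches on a format name through a map of handler functions. The article type, the handler map, and the exact output shapes are assumptions for illustration; the PR's Render and its formatHTML, formatJSON, and formatText helpers are not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"html"
	"io"
	"strings"
)

// article is a hypothetical minimal payload for this sketch.
type article struct {
	Title   string `json:"title"`
	Content string `json:"content"`
}

// formatHandler writes one article to w in a specific output format.
type formatHandler func(w io.Writer, a article) error

// handlers maps a format name to its renderer (assumed set of formats).
var handlers = map[string]formatHandler{
	"json": func(w io.Writer, a article) error {
		return json.NewEncoder(w).Encode(a)
	},
	"text": func(w io.Writer, a article) error {
		_, err := fmt.Fprintf(w, "%s\n\n%s\n", a.Title, a.Content)
		return err
	},
	"html": func(w io.Writer, a article) error {
		// Escape both fields so article text cannot inject markup.
		_, err := fmt.Fprintf(w, "<h1>%s</h1>\n<p>%s</p>\n",
			html.EscapeString(a.Title), html.EscapeString(a.Content))
		return err
	},
}

// render looks up the handler for format and fails on unknown formats.
func render(w io.Writer, format string, a article) error {
	h, ok := handlers[format]
	if !ok {
		return fmt.Errorf("unsupported format %q", format)
	}
	return h(w, a)
}

// renderString is a convenience wrapper that renders into a string.
func renderString(format, title, content string) (string, error) {
	var sb strings.Builder
	err := render(&sb, format, article{Title: title, Content: content})
	return sb.String(), err
}

func main() {
	out, err := renderString("text", "Title", "Body")
	fmt.Print(out, err, "\n")
}
```

With this shape, adding a new output format means registering one more map entry, and the HTTP handler only needs to call render with an http.ResponseWriter.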

Changelog

  • api/index.go
    • Removed numerous imports, constants, and functions related to HTTP client, URL parsing, article fetching, and response formatting.
    • Added imports for the new internal/article, internal/formatter, internal/request, and internal/transport packages.
    • Updated the handler function to utilize functions from the newly imported internal packages for URL normalization, format detection, article fetching, and rendering.
  • api/index_test.go
    • Removed tests for TestNormalizeAndValidateURL, TestFetchAndParse, and TestSSRFProtection as their corresponding logic moved to internal packages.
    • Added a new test TestHandler_InvalidURL to verify error handling for invalid URLs at the API handler level.
  • api/llm_test.go
    • Removed the file, as its tests for LLM detection and format determination were moved to internal/request/utils_test.go.
  • api/reconstruct_test.go
    • Removed the file, as its tests for URL reconstruction were moved to internal/request/utils_test.go.
  • internal/article/fetch.go
    • Added a new file containing the ReadabilityParser, userAgentPool, getRandomUserAgent, and the Fetch function, which handles retrieving and parsing article content.
  • internal/article/fetch_test.go
    • Added a new file with tests for the Fetch function, verifying its ability to retrieve and parse HTML content.
  • internal/formatter/render.go
    • Added a new file containing HTML template definitions, format handler types, and functions (Render, formatHTML, formatMarkdown, formatJSON, formatText) responsible for rendering article content in various output formats.
  • internal/request/utils.go
    • Added a new file containing llmUserAgents, IsLLM for detecting LLM bots, GetFormat for determining output format, ReconstructURL for handling URL query parameter quirks, and NormalizeURL for cleaning and validating URLs.
  • internal/request/utils_test.go
    • Added a new file with comprehensive tests for IsLLM, GetFormat, ReconstructURL, and NormalizeURL functions.
  • internal/transport/client.go
    • Added a new file defining constants for HTTP client timeouts and redirect limits, and functions (NewSafeClient, newSafeDialer) for creating an HTTP client with SSRF protection.
  • internal/transport/client_test.go
    • Added a new file with tests for TestSSRFProtection, ensuring the custom dialer correctly blocks connections to private and loopback IP addresses.
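For readers unfamiliar with dialer-level SSRF protection, the sketch below shows one common way to do it in Go: a net.Dialer Control hook that inspects the resolved address just before connecting. The names isBlockedIP, isBlockedAddr, and newGuardedClient, the timeout values, and the exact set of blocked ranges are assumptions for illustration; the PR's NewSafeClient and newSafeDialer may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"syscall"
	"time"
)

var errBlockedAddr = errors.New("destination address is not allowed")

// isBlockedIP reports whether ip is loopback, private, link-local, or unspecified.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified()
}

// isBlockedAddr takes an "ip:port" string as passed to Dialer.Control.
// Anything that cannot be parsed as a literal IP is blocked (fail closed).
func isBlockedAddr(address string) bool {
	host, _, err := net.SplitHostPort(address)
	if err != nil {
		return true
	}
	ip := net.ParseIP(host)
	return ip == nil || isBlockedIP(ip)
}

// newGuardedClient is a hypothetical stand-in for NewSafeClient.
func newGuardedClient() *http.Client {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second, // assumed value
		// Control runs after DNS resolution and before connect, so the
		// check applies to the IP actually dialed, not the hostname.
		Control: func(_, address string, _ syscall.RawConn) error {
			if isBlockedAddr(address) {
				return fmt.Errorf("%w: %s", errBlockedAddr, address)
			}
			return nil
		},
	}
	return &http.Client{
		Timeout:   10 * time.Second, // assumed value
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

func main() {
	_ = newGuardedClient()
	for _, addr := range []string{"127.0.0.1:80", "10.0.0.5:443", "169.254.169.254:80", "93.184.216.34:80"} {
		fmt.Printf("%s blocked=%v\n", addr, isBlockedAddr(addr))
	}
}
```

Checking at connect time rather than validating the hostname up front also covers redirects and DNS names that resolve to internal addresses, since every connection the transport opens passes through the same hook.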

Activity

  • The pull request was automatically created by Jules for task 7015727293619554448, initiated by @lucasew.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does a great job of refactoring the monolithic API handler into well-defined internal packages. The new structure improves separation of concerns, testability, and maintainability. The logic has been moved correctly, and the corresponding tests have been updated, ensuring no regressions. I have one suggestion to improve the randomness of user-agent selection, but overall, this is a solid improvement.

Comment on lines +45 to +47
func getRandomUserAgent() string {
	return userAgentPool[rand.Intn(len(userAgentPool))]
}

Severity: medium

Using math/rand can lead to predictable user-agent selection if not seeded properly, which can be tricky in serverless environments and with Go versions before 1.20. For better randomness and to avoid any ambiguity with seeding, it's recommended to use the cryptographically secure crypto/rand package.

This change will require updating your imports to include "crypto/rand", "log", and "math/big", while removing "math/rand".

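// Requires imports "crypto/rand", "log", and "math/big".
func getRandomUserAgent() string {
	// rand.Int returns a uniform value in [0, max) and panics if max <= 0,
	// so userAgentPool must be non-empty.
	n, err := rand.Int(rand.Reader, big.NewInt(int64(len(userAgentPool))))
	if err != nil {
		log.Printf("crypto/rand failed: %v; falling back to first user agent", err)
		return userAgentPool[0]
	}
	return userAgentPool[n.Int64()]
}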
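
A self-contained variant of the suggestion, sketched under a few assumptions: the pool is passed in as a parameter instead of read from a package-level variable, an empty pool is reported as an error (crypto/rand's Int panics when max <= 0), and a failure of the random source is returned to the caller rather than logged. The name pickUserAgent and the sample pool values are hypothetical.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
	"math/big"
)

// pickUserAgent returns a uniformly chosen entry from pool using crypto/rand.
// It is a hypothetical, parameterized variant of getRandomUserAgent.
func pickUserAgent(pool []string) (string, error) {
	if len(pool) == 0 {
		// Guard explicitly: rand.Int panics if max <= 0.
		return "", errors.New("empty user-agent pool")
	}
	n, err := rand.Int(rand.Reader, big.NewInt(int64(len(pool))))
	if err != nil {
		return "", fmt.Errorf("crypto/rand: %w", err)
	}
	return pool[n.Int64()], nil
}

func main() {
	// Placeholder values, not the PR's actual userAgentPool.
	pool := []string{"agent-a", "agent-b", "agent-c"}
	ua, err := pickUserAgent(pool)
	fmt.Println(ua, err)
}
```

Taking the pool as an argument keeps the function easy to test with a one-element slice, where the result is deterministic.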
