| layout | title | nav_order | has_children | parent |
|---|---|---|---|---|
default |
OpenHands Tutorial - Chapter 5: Testing |
5 |
false |
OpenHands Tutorial |
Welcome to Chapter 5: Testing - Comprehensive Test Suite Generation and Quality Assurance. In this part of OpenHands Tutorial: Autonomous Software Engineering Workflows, you will build an intuitive mental model first, then move into concrete implementation details and practical production tradeoffs.
Master OpenHands' testing capabilities for creating unit tests, integration tests, performance tests, and automated quality assurance.
OpenHands provides comprehensive testing capabilities, from simple unit tests to complex integration and performance testing. This chapter covers automated test generation, execution, and quality assurance workflows.
from openhands import OpenHands
# Unit test generation agent
test_generator = OpenHands()
# Generate comprehensive unit tests for a calculator module
calculator_tests = test_generator.run("""
Create a comprehensive unit test suite for this calculator module:
```python
class Calculator:
def add(self, a, b):
return a + b
def subtract(self, a, b):
return a - b
def multiply(self, a, b):
return a * b
def divide(self, a, b):
if b == 0:
raise ValueError("Cannot divide by zero")
return a / b
def power(self, base, exponent):
return base ** exponent
def sqrt(self, x):
if x < 0:
raise ValueError("Cannot take square root of negative number")
return x ** 0.5Generate tests covering:
- Normal operation for all methods
- Edge cases (zero, negative numbers, large numbers)
- Error conditions and exception handling
- Type validation (integers, floats, invalid types)
- Boundary conditions
- Performance considerations
- Parametrized tests for multiple input combinations """)
test_framework = test_generator.run(""" Set up a complete testing framework with:
- pytest configuration and plugins
- Test fixtures and setup/teardown
- Mocking and patching utilities
- Test data factories
- Coverage reporting
- Test categorization (unit, integration, e2e)
- Parallel test execution
- CI/CD integration """)
advanced_patterns = test_generator.run(""" Implement advanced testing patterns:
- Property-based testing with hypothesis
- Fuzz testing for input validation
- Mutation testing for test quality assessment
- Test-driven development helpers
- Behavior-driven development with Gherkin
- Contract testing for APIs """)
### Test Quality Assessment
```python
# Test quality evaluation
quality_assessor = OpenHands()
# Evaluate test suite quality
quality_evaluation = quality_assessor.run("""
Assess the quality of this test suite and suggest improvements:
```python
import pytest
from calculator import Calculator
import math
class TestCalculator:
def setup_method(self):
self.calc = Calculator()
def test_add(self):
assert self.calc.add(2, 3) == 5
assert self.calc.add(-1, 1) == 0
assert self.calc.add(0, 0) == 0
def test_divide(self):
assert self.calc.divide(6, 2) == 3
assert self.calc.divide(5, 2) == 2.5
def test_divide_by_zero(self):
with pytest.raises(ValueError, match="divide by zero"):
self.calc.divide(5, 0)
Evaluate:
- Test coverage (statement, branch, condition)
- Test effectiveness (mutation score)
- Edge case coverage
- Maintainability and readability
- Performance impact
- CI/CD integration readiness """)
coverage_analysis = quality_assessor.run(""" Create test coverage analysis tools:
- Line coverage measurement
- Branch coverage analysis
- Condition coverage tracking
- Function coverage reporting
- Class and method coverage
- Integration with coverage tools (coverage.py)
- Coverage reporting and visualization
- Coverage goals and thresholds """)
## Integration Testing
### API Integration Tests
```python
# API testing agent
api_tester = OpenHands()
# Generate comprehensive API tests
api_tests = api_tester.run("""
Create comprehensive API integration tests for a user management service:
API Endpoints:
- POST /users - Create user
- GET /users - List users
- GET /users/{id} - Get user by ID
- PUT /users/{id} - Update user
- DELETE /users/{id} - Delete user
Test scenarios:
1. Successful CRUD operations
2. Input validation and error responses
3. Authentication and authorization
4. Rate limiting behavior
5. Concurrent request handling
6. Database transaction integrity
7. External service dependencies
8. Edge cases (non-existent users, invalid data)
9. Performance under load
10. Security vulnerability testing
Include test data setup, teardown, and mocking of external dependencies.
""")
# Load testing integration
load_tests = api_tester.run("""
Implement load testing for the API:
1. Gradual load increase testing
2. Spike testing for sudden traffic increases
3. Stress testing to find breaking points
4. Endurance testing for prolonged load
5. Volume testing for large data sets
6. Scalability testing across multiple instances
7. Resource usage monitoring
8. Performance bottleneck identification
Use tools like Locust, JMeter, or custom async load testing.
""")
# Database testing agent
db_tester = OpenHands()
# Database integration tests
db_integration = db_tester.run("""
Create database integration tests for an e-commerce system:
Database Schema:
- users (id, name, email, created_at)
- products (id, name, price, category, stock)
- orders (id, user_id, total, status, created_at)
- order_items (id, order_id, product_id, quantity, price)
Test scenarios:
1. Data integrity and referential constraints
2. Transaction isolation levels
3. Concurrent access and locking
4. Database migration testing
5. Backup and restore procedures
6. Performance with large datasets
7. Index effectiveness and query optimization
8. Connection pooling and resource management
9. Error handling and recovery
10. Data validation and sanitization
Include test data generation, database fixtures, and cleanup procedures.
""")
# Database migration testing
migration_tests = db_tester.run("""
Implement database migration testing:
1. Forward migration testing
2. Rollback migration testing
3. Data migration integrity
4. Schema compatibility testing
5. Migration performance testing
6. Migration conflict resolution
7. Migration dependency management
8. Migration rollback procedures
9. Migration testing in CI/CD
10. Migration documentation generation
""")# E2E testing agent
e2e_tester = OpenHands()
# Complete e2e test suite
e2e_tests = e2e_tester.run("""
Create end-to-end tests for a complete web application:
Application: Task Management System
Components:
- Frontend: React with routing
- Backend: FastAPI with database
- Database: PostgreSQL
- Authentication: JWT tokens
- File uploads: Image attachments
E2E Test Scenarios:
1. User registration and login flow
2. Task creation, editing, and deletion
3. Task assignment and status updates
4. File upload and attachment handling
5. Search and filtering functionality
6. User permission and access control
7. Real-time updates and notifications
8. Mobile responsiveness testing
9. Cross-browser compatibility
10. Performance and load testing
Include test automation setup, test data management, and CI/CD integration.
""")
# Browser automation testing
browser_tests = e2e_tester.run("""
Implement browser automation testing:
1. Selenium WebDriver setup and configuration
2. Page Object Model implementation
3. Test data management and fixtures
4. Screenshot capture on test failures
5. Video recording for test debugging
6. Cross-browser testing (Chrome, Firefox, Safari, Edge)
7. Mobile device emulation
8. Headless browser testing for CI/CD
9. Visual regression testing
10. Accessibility testing integration
Include test reporting, parallel execution, and failure analysis.
""")# Performance testing agent
perf_tester = OpenHands()
# Load testing implementation
load_testing = perf_tester.run("""
Create comprehensive load testing for a web application:
Application Metrics:
- Response time < 200ms for 95th percentile
- Error rate < 1% under normal load
- Throughput: 1000 requests/second
- Concurrent users: 5000
Load Testing Scenarios:
1. Ramp-up load testing (gradual increase)
2. Spike testing (sudden load increases)
3. Stress testing (beyond normal capacity)
4. Endurance testing (prolonged load)
5. Volume testing (large data processing)
6. Scalability testing (horizontal scaling)
7. Failover testing (server failures)
8. Network degradation testing
Include monitoring, metrics collection, and performance analysis.
""")
# Performance profiling
profiling_tests = perf_tester.run("""
Implement performance profiling and optimization:
1. CPU profiling and bottleneck identification
2. Memory usage analysis and leak detection
3. Database query performance analysis
4. API response time profiling
5. Frontend performance metrics
6. Network latency measurement
7. Caching effectiveness analysis
8. Resource utilization monitoring
Include profiling tools integration and automated performance regression detection.
""")# Security testing agent
security_tester = OpenHands()
# Security vulnerability testing
security_tests = security_tester.run("""
Create comprehensive security testing suite:
Security Test Categories:
1. Input Validation Testing
- SQL injection attempts
- XSS (Cross-Site Scripting) attacks
- Command injection testing
- Path traversal attacks
2. Authentication & Authorization Testing
- Brute force attack prevention
- Session management security
- JWT token security
- Password policy enforcement
3. API Security Testing
- Rate limiting effectiveness
- CORS configuration testing
- API key validation
- Request size limits
4. Data Protection Testing
- Encryption at rest validation
- Data transmission security
- Secure data disposal
- GDPR compliance testing
5. Infrastructure Security
- Container security scanning
- Dependency vulnerability checking
- Configuration security analysis
- Network security testing
Include automated scanning tools, custom security tests, and compliance reporting.
""")
# Penetration testing automation
penetration_tests = security_tester.run("""
Implement automated penetration testing:
1. Vulnerability scanning with automated tools
2. Common attack vector testing
3. Custom exploit development and testing
4. Security regression testing
5. Third-party component security analysis
6. Configuration security auditing
7. Network security assessment
8. Wireless security testing (where applicable)
Include ethical hacking methodologies, reporting, and remediation tracking.
""")# CI/CD integration agent
ci_cd_agent = OpenHands()
# Complete CI/CD testing pipeline
pipeline_integration = ci_cd_agent.run("""
Create comprehensive CI/CD testing pipeline:
Pipeline Stages:
1. **Code Quality Gates**
- Linting and static analysis
- Code formatting checks
- Security scanning
- License compliance checking
2. **Unit Testing**
- Parallel test execution
- Coverage reporting
- Test result analysis
- Performance regression detection
3. **Integration Testing**
- API testing automation
- Database integration tests
- External service mocking
- Contract testing
4. **End-to-End Testing**
- Browser automation testing
- Mobile app testing
- API integration testing
- Performance testing
5. **Security Testing**
- SAST (Static Application Security Testing)
- DAST (Dynamic Application Security Testing)
- Dependency vulnerability scanning
- Container security scanning
6. **Performance Testing**
- Load testing automation
- Performance regression detection
- Scalability testing
- Resource usage monitoring
Include GitHub Actions, GitLab CI, Jenkins, or other CI/CD platform configurations.
""")
# Automated deployment testing
deployment_tests = ci_cd_agent.run("""
Implement deployment testing and validation:
1. Blue-green deployment testing
2. Canary deployment validation
3. Rollback testing and automation
4. Configuration testing across environments
5. Database migration testing
6. Infrastructure as Code testing
7. Service mesh and orchestration testing
8. Monitoring and alerting validation
Include deployment pipeline automation, environment management, and post-deployment validation.
""")# Test data management agent
data_manager = OpenHands()
# Comprehensive test data management
test_data_management = data_manager.run("""
Create comprehensive test data management system:
1. **Test Data Generation**
- Realistic data generation with Faker
- Edge case and boundary data creation
- Statistical distribution matching
- Multi-language and cultural data support
2. **Test Data Management**
- Database fixtures and seeding
- API response mocking
- File and media test data
- Environment-specific data sets
3. **Data Privacy and Security**
- PII data masking and anonymization
- GDPR compliance for test data
- Data encryption for sensitive information
- Secure data storage and access control
4. **Data Lifecycle Management**
- Test data cleanup and disposal
- Data versioning and rollback
- Data sharing across test environments
- Data consistency validation
5. **Performance and Scalability**
- Large dataset generation for performance testing
- Distributed data generation
- Memory-efficient data handling
- Parallel data processing
Include tools for data generation, management scripts, and integration with testing frameworks.
""")
# Synthetic data generation
synthetic_data = data_manager.run("""
Implement synthetic data generation for testing:
1. **Structured Data Generation**
- User profiles with realistic attributes
- Product catalogs with pricing and descriptions
- Transaction data with temporal patterns
- Hierarchical organizational data
2. **Unstructured Data Generation**
- Natural language text generation
- Image and media content creation
- Time series data with trends and seasonality
- Graph and network data structures
3. **Domain-Specific Generators**
- Healthcare data (HIPAA compliant)
- Financial data with regulatory compliance
- E-commerce data with purchasing patterns
- Social media data with engagement metrics
4. **Quality Assurance**
- Data validity and consistency checks
- Statistical property validation
- Business rule compliance
- Performance and scalability testing
Include configuration files, generation scripts, and validation tools.
""")# Test reporting agent
reporting_agent = OpenHands()
# Advanced test reporting
test_reporting = reporting_agent.run("""
Create comprehensive test reporting and analytics system:
1. **Test Result Aggregation**
- Multi-format test result collection
- Historical trend analysis
- Test execution time tracking
- Failure pattern identification
2. **Reporting Formats**
- HTML dashboards with interactive charts
- PDF reports for stakeholders
- JSON/XML APIs for integration
- Real-time dashboards
3. **Analytics and Insights**
- Test failure prediction
- Code coverage trend analysis
- Performance regression detection
- Risk assessment and prioritization
4. **Visualization**
- Test execution timelines
- Failure heatmaps
- Coverage visualization
- Performance metrics graphs
5. **Integration and Notification**
- Slack/Teams notifications
- Email reports with summaries
- Dashboard integrations
- Alert system for critical failures
Include reporting tools, dashboard templates, and automation scripts.
""")
# Test metrics and KPIs
test_metrics = reporting_agent.run("""
Implement test metrics and KPI tracking:
1. **Quality Metrics**
- Test coverage percentage
- Defect detection effectiveness
- Test execution reliability
- Code quality scores
2. **Performance Metrics**
- Test execution time
- Resource utilization
- Scalability measurements
- Response time percentiles
3. **Process Metrics**
- Test case authoring time
- Defect resolution time
- Test automation percentage
- Continuous integration health
4. **Business Impact Metrics**
- Production defect rates
- Time-to-market improvements
- Customer satisfaction scores
- Cost savings from automation
Include KPI dashboards, trend analysis, and predictive analytics.
""")In this chapter, we've covered OpenHands' comprehensive testing capabilities:
- Unit Testing: Automated test generation, quality assessment, coverage analysis
- Integration Testing: API tests, database tests, load testing
- End-to-End Testing: Full application testing, browser automation
- Performance Testing: Load testing, profiling, optimization
- Security Testing: Vulnerability scanning, penetration testing
- CI/CD Integration: Pipeline automation, deployment testing
- Test Data Management: Generation, privacy, lifecycle management
- Reporting & Analytics: Comprehensive reporting, KPI tracking
OpenHands can create complete testing suites from requirements, execute them, and provide detailed analysis and reporting.
- Comprehensive Coverage: Unit, integration, E2E, performance, and security testing
- Automated Generation: Create tests from specifications and code analysis
- Quality Assurance: Coverage analysis, quality metrics, and continuous improvement
- CI/CD Integration: Seamless integration with deployment pipelines
- Data Management: Secure, realistic test data generation and management
- Analytics & Reporting: Detailed insights and actionable metrics
Next, we'll explore refactoring - OpenHands' ability to improve code structure, performance, and maintainability.
Ready for the next chapter? Chapter 6: Refactoring
Generated for Awesome Code Docs
Most teams struggle here because the hard part is not writing more code, but deciding clear boundaries for testing, Test, test so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without clear rollback or observability strategy
After working through this chapter, you should be able to reason about Chapter 5: Testing - Comprehensive Test Suite Generation and Quality Assurance as an operating subsystem inside OpenHands Tutorial: Autonomous Software Engineering Workflows, with explicit contracts for inputs, state transitions, and outputs.
Use the implementation notes around Performance, integration, analysis as your checklist when adapting these patterns to your own repository.
Under the hood, Chapter 5: Testing - Comprehensive Test Suite Generation and Quality Assurance usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for
testing. - Input normalization: shape incoming data so
Testreceives stable contracts. - Core execution: run the main logic branch and propagate intermediate state through
test. - Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit logs/metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
Use the following upstream sources to verify implementation details while reading this chapter:
- OpenHands Repository
Why it matters: authoritative reference on
OpenHands Repository(github.com). - OpenHands Docs
Why it matters: authoritative reference on
OpenHands Docs(docs.openhands.dev). - OpenHands Releases
Why it matters: authoritative reference on
OpenHands Releases(github.com).
Suggested trace strategy:
- search upstream code for
testingandTestto map concrete implementation paths - compare docs claims against actual runtime/config code before reusing patterns in production