Chapter 10
Team Workflows with Cursor
Individual productivity gains mean nothing if teams can't scale them. A developer using Cursor might ship features twice as fast—but if that code creates review bottlenecks, breaks conventions, or bypasses testing standards, the team actually slows down. Velocity without coordination creates chaos.
This chapter shifts focus from personal workflows to team systems: how to structure pair programming with AI, run code reviews that catch AI risks, track velocity honestly when AI contributes significant code, and build shared prompt libraries that scale expertise across the organization.
The goal isn't just faster coding. It's sustainable, high-quality delivery at team scale—where AI amplifies coordination rather than fragmenting it.
Pair Programming with Cursor
Traditional pair programming meant two humans at one keyboard: a navigator directing strategy and a driver writing code. With Cursor, that dynamic shifts fundamentally. The AI becomes the tireless driver, generating implementations across multiple files instantly. We become architects, validators, and domain experts—responsible for correctness, security, and business alignment.
The Test-First Iteration Loop
The most successful Cursor teams adopt a disciplined cycle that prevents hallucination and maintains control:
1. Define Intent (Set Direction)
Before opening Composer, write a 2-3 sentence summary describing what we're building and why. This becomes our context anchor.
// INTENT: Add rate limiting to login endpoint to prevent brute force.
// Must allow 5 attempts per minute per IP, return 429 with Retry-After.
// Integrate with existing Redis cache for distributed tracking.
2. Write Tests First (Define Success)
We don't ask AI to write implementation yet. We start with tests describing expected behavior:
Write Jest tests for rate limiter middleware:
- Allows 5 requests within a minute
- Blocks 6th request with 429 status
- Resets counter after 60 seconds
- Tracks requests by IP correctly
- Returns Retry-After header
Use Redis mock for isolation.
Why tests first? They force us to think through requirements clearly and give AI a precise specification.
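As a concrete illustration, a first cut of that test file might look like the sketch below, covering two of the listed cases to keep it short. It assumes the middleware is exposed as a `rateLimiter(options)` factory and that Redis access goes through a small `redisClient` module we can mock; those names and file paths are illustrative, not prescribed.

```typescript
// tests/rateLimiter.test.ts -- illustrative sketch, not a prescribed implementation.
// Assumes a hypothetical redisClient wrapper (incr/expire/ttl) and a
// rateLimiter middleware factory; adjust names to your codebase.
import { rateLimiter } from '../src/middleware/rateLimiter';
import { redisClient } from '../src/utils/redisClient';

jest.mock('../src/utils/redisClient', () => ({
  redisClient: { incr: jest.fn(), expire: jest.fn(), ttl: jest.fn() },
}));

// Minimal fake Express response with chainable methods.
const mockRes = () => {
  const res: any = {};
  res.status = jest.fn().mockReturnValue(res);
  res.set = jest.fn().mockReturnValue(res);
  res.json = jest.fn().mockReturnValue(res);
  return res;
};

describe('rate limiter middleware', () => {
  const middleware = rateLimiter({ limit: 5, windowSeconds: 60 });
  const req: any = { ip: '203.0.113.7', path: '/login', headers: {} };

  beforeEach(() => jest.clearAllMocks());

  it('allows requests under the limit', async () => {
    (redisClient.incr as jest.Mock).mockResolvedValue(3); // 3rd attempt in window
    const next = jest.fn();
    await middleware(req, mockRes(), next);
    expect(next).toHaveBeenCalled();
  });

  it('blocks the 6th request with 429 and Retry-After', async () => {
    (redisClient.incr as jest.Mock).mockResolvedValue(6);  // over the limit
    (redisClient.ttl as jest.Mock).mockResolvedValue(42);  // seconds until reset
    const res = mockRes();
    const next = jest.fn();
    await middleware(req, res, next);
    expect(res.status).toHaveBeenCalledWith(429);
    expect(res.set).toHaveBeenCalledWith('Retry-After', '42');
    expect(next).not.toHaveBeenCalled();
  });
});
```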
3. Generate Implementation (AI Writes Code)
Implement rate limiter middleware to pass these tests:
@tests/rateLimiter.test.js
Requirements:
- Use Redis with key pattern: rate_limit:login:{ip}
- Set TTL to 60 seconds
- Follow patterns in @middleware/auth.js
- Add structured logging with request_id
Keep under 50 lines.
4. Verify and Refine (Validate)
We run tests locally. If they fail, we paste errors back to Composer with context. The rhythm becomes: Generate → Test → Refine → Repeat. We never accept code without running it.
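For reference, the kind of implementation this loop converges on might resemble the following sketch. The Redis key pattern, status code, and logging fields come from the prompts above; the `redisClient` and `logger` modules and their method names are assumptions, not Cursor output.

```typescript
// src/middleware/rateLimiter.ts -- illustrative sketch under the assumptions above.
import type { Request, Response, NextFunction } from 'express';
import { redisClient } from '../utils/redisClient'; // hypothetical Redis wrapper
import { logger } from '../utils/logger';           // hypothetical structured logger

interface RateLimiterOptions {
  limit?: number;          // max requests per window
  windowSeconds?: number;  // window length in seconds
  keyPrefix?: string;      // e.g. "rate_limit:login"
}

export function rateLimiter(options: RateLimiterOptions = {}) {
  const { limit = 5, windowSeconds = 60, keyPrefix = 'rate_limit:login' } = options;

  return async (req: Request, res: Response, next: NextFunction) => {
    const key = `${keyPrefix}:${req.ip}`;            // rate_limit:login:{ip}
    const count = await redisClient.incr(key);       // atomic per-IP counter
    if (count === 1) await redisClient.expire(key, windowSeconds); // start the window

    if (count > limit) {
      const retryAfter = await redisClient.ttl(key); // seconds until the window resets
      logger.warn('rate limit hit', {
        request_id: req.headers['x-request-id'],
        ip: req.ip,
        endpoint: req.path,
        remaining: 0,
      });
      return res
        .status(429)
        .set('Retry-After', String(Math.max(retryAfter, 0)))
        .json({ error: 'Too many attempts, try again later.' });
    }
    next();
  };
}
```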
5. Add Instrumentation (Enhance Observability)
Add structured logging to rate limiter:
- Log when limit hit (WARN level)
- Include IP, endpoint, remaining quota
- Use logger from @utils/logger
- Include request_id for tracing
Role Clarity in Human-AI Pairing
| Role | Responsibility | Examples |
|---|---|---|
| Human (Navigator) | Strategy, architecture, validation | Write tests, define requirements, review security, make design decisions |
| AI (Driver) | Implementation, refactoring, boilerplate | Generate code, suggest patterns, write docs, add logging |
Critical principle: The navigator never codes blindly. Every AI output must be read, understood, and validated before acceptance.
Context Management Best Practices
Start Fresh for New Features
Long conversations accumulate irrelevant context. For each new feature, start a new Composer chat to keep the AI focused.
Tag Files Explicitly
While Cursor can auto-fetch context, explicit tagging ensures precision:
Refactor @components/UserProfile.tsx to use new auth hook:
@hooks/useAuth.ts
Ensure compatibility with @types/User.ts interface.
Use Memory for Recurring Corrections
If Cursor repeatedly makes the same mistake, correct it once explicitly and it will apply the correction throughout the session.
Common Pitfalls and Guardrails
Pitfall: Blind Acceptance
Risk: Silent bugs, security holes, code that works until edge cases hit production.
Solution: Always ask: "Can I explain this code to a teammate?" If not, request Composer explain it or refactor for clarity.
Pitfall: Over-Delegation
Risk: Asking AI to "build the entire authentication system" produces plausible-looking code with subtle flaws.
Solution: Break large tasks into small, testable pieces. Implement incrementally with validation at each step.
Pitfall: Skipping Security Reviews
Risk: Vulnerabilities like SQL injection, XSS, CSRF, or insecure defaults.
Solution: Apply a security checklist to all AI code:
- ✅ Input validation and sanitization present?
- ✅ SQL queries use parameterized statements?
- ✅ Authentication checks in place?
- ✅ Rate limiting for sensitive endpoints?
- ✅ Error messages don't leak sensitive info?
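To make the first two items concrete, here is a minimal sketch in TypeScript using node-postgres; the table, validation rule, and function name are illustrative.

```typescript
// Illustrative sketch: input validation plus a parameterized query.
// Table name, pool configuration, and validation rule are hypothetical.
import { Pool } from 'pg';

const pool = new Pool(); // reads connection settings from standard PG* env vars

// Unsafe pattern -- string interpolation invites SQL injection:
// pool.query(`SELECT * FROM users WHERE email = '${email}'`);

export async function findUserByEmail(email: string) {
  // Validate input before it reaches the database.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    throw new Error('Invalid email address');
  }
  // Parameterized statement: the driver escapes the value, not our code.
  const result = await pool.query(
    'SELECT id, email FROM users WHERE email = $1',
    [email],
  );
  return result.rows[0] ?? null;
}
```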
Code Reviews in an AI-First World
AI-generated code moves fast. Reviews must keep pace without sacrificing quality. The challenge: how do we review 300 lines of AI code as thoroughly as 50 lines of human code, without missing critical issues?
The Human-AI Review Partnership
What AI Reviews Well
- Style consistency (formatting, naming, conventions)
- Basic logic errors (null pointers, type mismatches)
- Security scanning (SQL injection, XSS, hardcoded secrets)
- Documentation gaps (missing docstrings, unclear names)
What Humans Must Review
- Architectural fit within the system
- Business logic correctness
- Complex edge cases (race conditions, distributed issues)
- Context-specific security threats
The model: AI does the first pass, humans focus on high-leverage concerns.
Best Practices for AI-Augmented Reviews
1. Security-First Checklist
For AI-generated code touching sensitive areas:
Authentication & Authorization:
- ✅ Auth checks present and correct?
- ✅ Role/permission validation implemented?
- ✅ Session management secure?
Data Handling:
- ✅ SQL queries parameterized?
- ✅ Input validated and sanitized?
- ✅ Output encoded to prevent XSS?
Secrets & Config:
- ✅ No hardcoded credentials?
- ✅ Environment variables used?
- ✅ Secrets not logged?
2. The "Explain This" Test
If an AI-generated block looks complex, verify understanding:
Explain the security implications of this code:
@auth/sessionManager.ts lines 45-67
Specifically:
- How are sessions invalidated?
- What prevents session fixation?
- How are race conditions handled?
3. PR Template for AI-Assisted Development
## What Changed
[3-5 bullet summary]
## AI Involvement
- **Tool**: Cursor Composer v0.42
- **AI-Generated**:
- src/auth/rateLimiter.ts (lines 1-89)
- tests/auth/rateLimiter.test.ts (lines 1-120)
- **Suggestion Acceptance**: 87% accepted, 13% modified
## Validation
- **Tests**: 12 unit, 3 integration (all passing)
- **Reviewed By**: @senior-dev
- **Security Scans**: ✅ Passed
- **Manual Testing**: ✅ Verified in staging
## Risks
[Known limitations or follow-up work]
This provides traceability, helps measure AI contribution accurately, and creates accountability for human validation.
Sprint Velocity Tracking with AI Contribution
When AI writes 30-40% of a team's code, traditional velocity metrics break. Story points estimated for "human effort" no longer predict capacity. We face a choice: recalibrate metrics or watch forecasts drift into fantasy.
The Velocity Inflation Problem
With AI assistance:
- Tasks estimated at 5 points now take 2 points of human effort
- Teams complete more story points without increasing capacity
- Velocity numbers climb, but forecasting accuracy collapses
⚠️ The trap: teams celebrate the climbing point totals as productivity gains while their sprint forecasts quietly stop predicting delivery.
Moving to Flow Metrics
Instead of tracking story points, we adopt DORA (DevOps Research and Assessment) metrics:
| Metric | Definition | What It Reveals |
|---|---|---|
| Lead Time | Commit to production | End-to-end efficiency |
| Cycle Time | Work start to completion | Dev + review speed |
| Deployment Frequency | How often code ships | Team throughput |
| Change Failure Rate | % of deployments causing failures | Quality under velocity |
| MTTR | Time to fix production incidents | Resilience |
Why these work with AI: They measure outcomes, not effort. Whether human or AI wrote code, what matters is how quickly it reaches users and how reliably it works.
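As a sketch of how lightweight this tracking can be, the following computes two of these metrics from deployment records; the `Deployment` shape and field names are assumptions about your data source, not a standard schema.

```typescript
// Sketch: deriving two flow metrics from deployment records (assumed data shape).
interface Deployment {
  commitAt: Date;        // first commit in the release
  deployedAt: Date;      // when it reached production
  causedFailure: boolean; // did it trigger an incident or rollback?
}

// Median commit-to-production time, in hours (Lead Time).
export function leadTimeHours(deploys: Deployment[]): number {
  if (!deploys.length) return 0;
  const hours = deploys
    .map(d => (d.deployedAt.getTime() - d.commitAt.getTime()) / 36e5)
    .sort((a, b) => a - b);
  const mid = Math.floor(hours.length / 2);
  return hours.length % 2 ? hours[mid] : (hours[mid - 1] + hours[mid]) / 2;
}

// Share of deployments that caused a failure (Change Failure Rate).
export function changeFailureRate(deploys: Deployment[]): number {
  if (!deploys.length) return 0;
  return deploys.filter(d => d.causedFailure).length / deploys.length;
}
```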
Measuring AI Contribution Without Gaming Metrics
1. AI Contribution Rate
Definition: Percentage of accepted code that was AI-generated.
git commit -m "Add rate limiting to login endpoint
AI-generated: src/auth/rateLimiter.ts (lines 1-89)
Human-written: src/auth/rateLimiter.ts (lines 90-110)
AI-assisted: tests/auth/rateLimiter.test.ts"Parse commit messages in CI to calculate contribution rate per sprint. Target: 30-50% is typical for teams actively using Cursor.
2. Quality Index (Composite Metric)
Track quality alongside velocity:
Quality Index = (Test Coverage × 0.3) +
(Static Analysis × 0.2) +
(Security Pass Rate × 0.3) +
(Review Approval Time × 0.2)
Goal: Maintain Quality Index >75% even as velocity increases.
Building Prompt Repositories
A team without shared prompts is like a team without code standards. Everyone reinvents solutions, quality drifts, and onboarding becomes trial-and-error. Prompt repositories solve this by capturing, versioning, and distributing collective AI expertise.
Structure of an Effective Repository
prompt-library/
├── README.md
├── backend/
│ ├── api/
│ │ ├── crud-endpoint.yaml
│ │ └── graphql-resolver.yaml
│ ├── security/
│ │ ├── rate-limiter.yaml
│ │ └── input-validation.yaml
│ └── database/
│ └── migration.yaml
├── frontend/
│ ├── components/
│ │ └── form-validation.yaml
│ └── hooks/
│ └── api-integration.yaml
├── testing/
│ ├── unit-test.yaml
│ └── integration-test.yaml
└── refactoring/
└── clean-code.yaml
Example Prompt Templates
1. Code Generation
id: prompt-rate-limiter
title: "Generate rate limiting middleware"
category: backend/security
prompt_text: |
Create Express.js rate limiting middleware:
Requirements:
- Redis key pattern: rate_limit:{endpoint}:{ip}
- Default: 5 requests/minute per IP
- Return 429 with Retry-After header
- Support custom limits via config
- Structured logging with request_id
- Follow patterns from @middleware/auth.js
Include middleware function, unit tests, and docs.
2. Security Review
id: prompt-security-review
prompt_text: |
Review for security vulnerabilities:
@[target-file]
Focus:
- OWASP Top 10 issues
- Input validation gaps
- SQL injection risks
- XSS vulnerabilities
- Auth/authz gaps
- Secrets exposure
For each: explain vulnerability, show exploit,
provide secure example.
Governance: Treat Prompts as Code
Version prompts like we version code:
# Create branch for new prompt
git checkout -b prompts/add-api-endpoint-generator
# Add prompt file
vim backend/api/crud-endpoint.yaml
# Test the prompt
npm run test-prompt backend/api/crud-endpoint.yaml
# Commit with description
git commit -m "Add CRUD endpoint generator
- Generates RESTful endpoints with validation
- Includes Zod schemas and auth middleware
- Tested with User resource"
# Create PR for review
gh pr create
Review criteria:
- ✅ Produces consistent, high-quality output?
- ✅ Includes examples and tests?
- ✅ Follows team standards?
- ✅ Metadata complete (owner, tags, last-reviewed)?
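The `npm run test-prompt` step above is whatever check the team wires in. A minimal version might simply validate prompt metadata against these criteria, along the lines of this sketch (the script name, required fields, and use of the `yaml` package are assumptions):

```typescript
// scripts/test-prompt.ts -- minimal metadata check for prompt files.
// Required fields mirror the review criteria; extend with output tests as needed.
import { readFileSync } from 'node:fs';
import { parse } from 'yaml';

const REQUIRED = ['id', 'title', 'category', 'prompt_text', 'owner', 'tags', 'last-reviewed'];

const file = process.argv[2];
if (!file) {
  console.error('Usage: test-prompt <prompt-file.yaml>');
  process.exit(1);
}

const doc = parse(readFileSync(file, 'utf8')) as Record<string, unknown>;
const missing = REQUIRED.filter(field => !(field in doc));

if (missing.length) {
  console.error(`${file}: missing metadata fields: ${missing.join(', ')}`);
  process.exit(1); // fail the PR check so incomplete prompts never merge
}
console.log(`${file}: metadata OK`);
```

Wired into CI, a check like this keeps prompt PRs honest in the same way linting keeps code PRs honest.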
Key Takeaways
✅ Pair Programming: the human navigates (tests, requirements, security review) while the AI drives implementation, and no output is accepted without being run and understood.
ℹ️ Code Reviews: let AI handle the first pass on style and obvious defects; reserve human attention for architecture, business logic, and context-specific security.
✅ Velocity Tracking: story points inflate under AI assistance; track flow metrics (DORA) and pair an AI contribution rate with a quality index.
ℹ️ Prompt Libraries: capture proven prompts in a versioned repository and review changes like code, so expertise scales beyond individual developers.
The meta-pattern: AI amplifies coordination when teams build shared systems.
Without structure, AI accelerates chaos. With discipline, it unlocks unprecedented team velocity—while maintaining the quality and resilience that matter most.