Chapter 10

Team Workflows with Cursor

Individual productivity gains mean nothing if teams can't scale them. A developer using Cursor might ship features twice as fast—but if that code creates review bottlenecks, breaks conventions, or bypasses testing standards, the team actually slows down. Velocity without coordination creates chaos.

This chapter shifts focus from personal workflows to team systems: how to structure pair programming with AI, run code reviews that catch AI risks, track velocity honestly when AI contributes significant code, and build shared prompt libraries that scale expertise across the organization.

The goal isn't just faster coding. It's sustainable, high-quality delivery at team scale—where AI amplifies coordination rather than fragmenting it.

Pair Programming with Cursor

Traditional pair programming meant two humans at one keyboard: a navigator directing strategy and a driver writing code. With Cursor, that dynamic shifts fundamentally. The AI becomes the tireless driver, generating implementations across multiple files instantly. We become architects, validators, and domain experts—responsible for correctness, security, and business alignment.

The Test-First Iteration Loop

The most successful Cursor teams adopt a disciplined cycle that curbs hallucination and keeps humans in control:

1. Define Intent (Set Direction)

Before opening Composer, write a 2-3 sentence summary describing what we're building and why. This becomes our context anchor.

// INTENT: Add rate limiting to login endpoint to prevent brute force.
// Must allow 5 attempts per minute per IP, return 429 with Retry-After.
// Integrate with existing Redis cache for distributed tracking.

2. Write Tests First (Define Success)

We don't ask AI to write implementation yet. We start with tests describing expected behavior:

Write Jest tests for rate limiter middleware:
- Allows 5 requests within a minute
- Blocks 6th request with 429 status
- Resets counter after 60 seconds
- Tracks requests by IP correctly
- Returns Retry-After header

Use Redis mock for isolation.

Why tests first? They force us to think through requirements clearly and give AI a precise specification.
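
A minimal sketch of what that prompt might produce, covering two of the five cases to stay brief. The createRateLimiter factory, its options shape, and the injected Redis mock are assumptions that anticipate the implementation sketch under step 3:

// tests/rateLimiter.test.ts (illustrative only; adapt paths and names)
import { createRateLimiter } from "../middleware/rateLimiter";

describe("login rate limiter", () => {
  // In-memory stand-in for Redis so the tests stay fast and isolated.
  const store = new Map<string, number>();
  const redisMock = {
    incr: async (key: string) => {
      const count = (store.get(key) ?? 0) + 1;
      store.set(key, count);
      return count;
    },
    expire: async (_key: string, _seconds: number) => 1,
  };

  const limiter = createRateLimiter({ redis: redisMock, limit: 5, windowSeconds: 60 });

  // Run the middleware once against fake req/res objects and record what happened.
  async function attempt() {
    const outcome: { allowed: boolean; status?: number; retryAfter?: string } = { allowed: false };
    const req: any = { ip: "203.0.113.7" };
    const res: any = {
      setHeader: (name: string, value: string) => {
        if (name === "Retry-After") outcome.retryAfter = value;
      },
      status: (code: number) => {
        outcome.status = code;
        return { json: () => undefined };
      },
    };
    await limiter(req, res, () => { outcome.allowed = true; });
    return outcome;
  }

  beforeEach(() => store.clear());

  it("allows 5 requests within a minute", async () => {
    for (let i = 0; i < 5; i++) expect((await attempt()).allowed).toBe(true);
  });

  it("blocks the 6th request with 429 and a Retry-After header", async () => {
    for (let i = 0; i < 5; i++) await attempt();
    const blocked = await attempt();
    expect(blocked.status).toBe(429);
    expect(blocked.retryAfter).toBe("60");
  });
});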

3. Generate Implementation (AI Writes Code)

Implement rate limiter middleware to pass these tests:
@tests/rateLimiter.test.js

Requirements:
- Use Redis with key pattern: rate_limit:login:{ip}
- Set TTL to 60 seconds
- Follow patterns in @middleware/auth.js
- Add structured logging with request_id

Keep under 50 lines.
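
Under those constraints, the generated middleware might look roughly like the sketch below. It assumes an ioredis-style client with async incr/expire and Express types; the fixed-window approach (INCR plus a TTL on the first hit) is one common implementation, not necessarily what Composer will choose:

// middleware/rateLimiter.ts (sketch)
import type { Request, Response, NextFunction } from "express";

interface RateLimiterOptions {
  redis: {
    incr(key: string): Promise<number>;
    expire(key: string, seconds: number): Promise<number>;
  };
  limit?: number;          // max attempts per window
  windowSeconds?: number;  // window length for the counter's TTL
}

export function createRateLimiter({ redis, limit = 5, windowSeconds = 60 }: RateLimiterOptions) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = `rate_limit:login:${req.ip}`;

    // Count this attempt; start the TTL on the first hit in the window.
    const attempts = await redis.incr(key);
    if (attempts === 1) {
      await redis.expire(key, windowSeconds);
    }

    if (attempts > limit) {
      res.setHeader("Retry-After", String(windowSeconds));
      return res.status(429).json({ error: "Too many login attempts" });
    }

    next();
  };
}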

4. Verify and Refine (Validate)

We run tests locally. If they fail, we paste errors back to Composer with context. The rhythm becomes: Generate → Test → Refine → Repeat. We never accept code without running it.

5. Add Instrumentation (Enhance Observability)

Add structured logging to rate limiter:
- Log when limit hit (WARN level)
- Include IP, endpoint, remaining quota
- Use logger from @utils/logger
- Include request_id for tracing
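
The resulting call might look like this sketch, which assumes a pino-style structured logger (object first, message second) exported from utils/logger; the field names are illustrative:

// Illustrative WARN entry for the limiter's blocked branch.
import type { Request } from "express";
import { logger } from "../utils/logger";

export function logRateLimitHit(req: Request, remaining: number): void {
  logger.warn(
    {
      event: "rate_limit_exceeded",
      ip: req.ip,
      endpoint: req.path,
      remaining,                               // quota left in the current window
      request_id: req.headers["x-request-id"], // correlation id for tracing
    },
    "Login rate limit hit"
  );
}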

Role Clarity in Human-AI Pairing

Role | Responsibility | Examples
Human (Navigator) | Strategy, architecture, validation | Write tests, define requirements, review security, make design decisions
AI (Driver) | Implementation, refactoring, boilerplate | Generate code, suggest patterns, write docs, add logging

Critical principle: The navigator never codes blindly. Every AI output must be read, understood, and validated before acceptance.

Context Management Best Practices

Start Fresh for New Features

Long conversations accumulate irrelevant context. For each new feature, start a new Composer chat to keep the AI focused.

Tag Files Explicitly

While Cursor can auto-fetch context, explicit tagging ensures precision:

Refactor @components/UserProfile.tsx to use new auth hook:
@hooks/useAuth.ts

Ensure compatibility with @types/User.ts interface.

Use Memory for Recurring Corrections

If Cursor repeatedly makes the same mistake, correct it once explicitly and it will apply the correction throughout the session.

Common Pitfalls and Guardrails

Pitfall: Blind Acceptance

Risk: Silent bugs, security holes, code that works until edge cases hit production.

Solution: Always ask: "Can I explain this code to a teammate?" If not, ask Composer to explain it or refactor it for clarity.

Pitfall: Over-Delegation

Risk: Asking AI to "build the entire authentication system" produces plausible-looking code with subtle flaws.

Solution: Break large tasks into small, testable pieces. Implement incrementally with validation at each step.

Pitfall: Skipping Security Reviews

Risk: Vulnerabilities like SQL injection, XSS, CSRF, or insecure defaults.

Solution: Apply a security checklist to all AI code (one item is made concrete in the sketch after the list):

  • ✅ Input validation and sanitization present?
  • ✅ SQL queries use parameterized statements?
  • ✅ Authentication checks in place?
  • ✅ Rate limiting for sensitive endpoints?
  • ✅ Error messages don't leak sensitive info?
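
To make the parameterized-query item concrete, here is a sketch using a node-postgres (pg) style client; the table and column names are assumptions, not project code:

// Illustrative only: the vulnerable pattern versus the parameterized fix.
import { Pool } from "pg";

const pool = new Pool();

// Risky: string interpolation lets a crafted email rewrite the query.
//   pool.query(`SELECT * FROM users WHERE email = '${email}'`)

// Safer: the driver sends the value separately from the SQL text.
export async function findUserByEmail(email: string) {
  const result = await pool.query(
    "SELECT id, email FROM users WHERE email = $1",
    [email]
  );
  return result.rows[0] ?? null;
}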

Code Reviews in an AI-First World

AI-generated code moves fast. Reviews must keep pace without sacrificing quality. The challenge: how do we review 300 lines of AI code as thoroughly as 50 lines of human code, without missing critical issues?

The Human-AI Review Partnership

What AI Reviews Well

  • Style consistency (formatting, naming, conventions)
  • Basic logic errors (null pointers, type mismatches)
  • Security scanning (SQL injection, XSS, hardcoded secrets)
  • Documentation gaps (missing docstrings, unclear names)

What Humans Must Review

  • Architectural fit within the system
  • Business logic correctness
  • Complex edge cases (race conditions, distributed issues)
  • Context-specific security threats

The model: AI does the first pass, humans focus on high-leverage concerns.

Best Practices for AI-Augmented Reviews

1. Security-First Checklist

For AI-generated code touching sensitive areas:

Authentication & Authorization:

  • ✅ Auth checks present and correct?
  • ✅ Role/permission validation implemented?
  • ✅ Session management secure?

Data Handling:

  • ✅ SQL queries parameterized?
  • ✅ Input validated and sanitized?
  • ✅ Output encoded to prevent XSS?

Secrets & Config:

  • ✅ No hardcoded credentials?
  • ✅ Environment variables used?
  • ✅ Secrets not logged?

2. The "Explain This" Test

If an AI-generated block looks complex, verify understanding:

Explain the security implications of this code:
@auth/sessionManager.ts lines 45-67

Specifically:
- How are sessions invalidated?
- What prevents session fixation?
- How are race conditions handled?

3. PR Template for AI-Assisted Development

## What Changed
[3-5 bullet summary]

## AI Involvement
- **Tool**: Cursor Composer v0.42
- **AI-Generated**:
  - src/auth/rateLimiter.ts (lines 1-89)
  - tests/auth/rateLimiter.test.ts (lines 1-120)
- **Suggestion Acceptance**: 87% accepted, 13% modified

## Validation
- **Tests**: 12 unit, 3 integration (all passing)
- **Reviewed By**: @senior-dev
- **Security Scans**: ✅ Passed
- **Manual Testing**: ✅ Verified in staging

## Risks
[Known limitations or follow-up work]

This provides traceability, helps measure AI contribution accurately, and creates accountability for human validation.

Sprint Velocity Tracking with AI Contribution

When AI writes 30-40% of a team's code, traditional velocity metrics break. Story points estimated for "human effort" no longer predict capacity. We face a choice: recalibrate metrics or watch forecasts drift into fantasy.

The Velocity Inflation Problem

With AI assistance:

  • Tasks estimated at 5 points now take 2 points of human effort
  • Teams complete more story points without increasing capacity
  • Velocity numbers climb, but forecasting accuracy collapses

The trap: Management sees higher velocity and expects it to continue indefinitely, creating unsustainable pressure.

Moving to Flow Metrics

Instead of tracking story points, we adopt DORA (DevOps Research and Assessment) metrics:

Metric | Definition | What It Reveals
Lead Time | Commit to production | End-to-end efficiency
Cycle Time | Work start to completion | Dev + review speed
Deployment Frequency | How often code ships | Team throughput
Change Failure Rate | % of deployments causing failures | Quality under velocity
MTTR | Time to fix production incidents | Resilience

Why these work with AI: They measure outcomes, not effort. Whether human or AI wrote code, what matters is how quickly it reaches users and how reliably it works.

Measuring AI Contribution Without Gaming Metrics

1. AI Contribution Rate

Definition: Percentage of accepted code that was AI-generated.

git commit -m "Add rate limiting to login endpoint

AI-generated: src/auth/rateLimiter.ts (lines 1-89)
Human-written: src/auth/rateLimiter.ts (lines 90-110)
AI-assisted: tests/auth/rateLimiter.test.ts"

Parse commit messages in CI to calculate contribution rate per sprint. Target: 30-50% is typical for teams actively using Cursor.
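
A sketch of that CI step, assuming the commit trailer convention shown above. It counts commits that mention AI involvement rather than individual lines, which is a rougher but much easier proxy:

// ci/ai-contribution.ts (sketch; adjust the markers to your convention)
import { execSync } from "node:child_process";

function aiContributionRate(since: string): number {
  // NUL-separate full commit messages so multi-line bodies parse cleanly.
  const log = execSync(`git log --since="${since}" --pretty=format:%B%x00`, { encoding: "utf8" });
  const commits = log.split("\0").filter((msg) => msg.trim().length > 0);
  const aiTouched = commits.filter((msg) => /^AI-(generated|assisted):/m.test(msg));
  return commits.length === 0 ? 0 : (aiTouched.length / commits.length) * 100;
}

console.log(`AI contribution this sprint: ${aiContributionRate("2 weeks ago").toFixed(1)}%`);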

2. Quality Index (Composite Metric)

Track quality alongside velocity:

Quality Index = (Test Coverage × 0.3) +
                (Static Analysis × 0.2) +
                (Security Pass Rate × 0.3) +
                (Review Approval Time × 0.2)

Goal: Maintain Quality Index >75% even as velocity increases.
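
A minimal sketch of the composite, assuming each component has already been normalized to a 0-100 scale (with review approval time inverted, so faster approvals score higher). The field names are illustrative:

interface QualityInputs {
  testCoverage: number;        // e.g. 82 (% of lines covered)
  staticAnalysis: number;      // e.g. 90 (% of checks passing)
  securityPassRate: number;    // e.g. 95 (% of scans passing)
  reviewApprovalScore: number; // e.g. 70 (faster approvals score higher)
}

export function qualityIndex(q: QualityInputs): number {
  return (
    q.testCoverage * 0.3 +
    q.staticAnalysis * 0.2 +
    q.securityPassRate * 0.3 +
    q.reviewApprovalScore * 0.2
  );
}

// Example: qualityIndex({ testCoverage: 82, staticAnalysis: 90, securityPassRate: 95, reviewApprovalScore: 70 }) = 85.1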

Building Prompt Repositories

A team without shared prompts is like a team without code standards. Everyone reinvents solutions, quality drifts, and onboarding becomes trial-and-error. Prompt repositories solve this by capturing, versioning, and distributing collective AI expertise.

Structure of an Effective Repository

prompt-library/
├── README.md
├── backend/
│   ├── api/
│   │   ├── crud-endpoint.yaml
│   │   └── graphql-resolver.yaml
│   ├── security/
│   │   ├── rate-limiter.yaml
│   │   └── input-validation.yaml
│   └── database/
│       └── migration.yaml
├── frontend/
│   ├── components/
│   │   └── form-validation.yaml
│   └── hooks/
│       └── api-integration.yaml
├── testing/
│   ├── unit-test.yaml
│   └── integration-test.yaml
└── refactoring/
    └── clean-code.yaml

Example Prompt Templates

1. Code Generation

id: prompt-rate-limiter
title: "Generate rate limiting middleware"
category: backend/security
prompt_text: |
  Create Express.js rate limiting middleware:
  
  Requirements:
  - Redis key pattern: rate_limit:{endpoint}:{ip}
  - Default: 5 requests/minute per IP
  - Return 429 with Retry-After header
  - Support custom limits via config
  - Structured logging with request_id
  - Follow patterns from @middleware/auth.js
  
  Include middleware function, unit tests, and docs.

2. Security Review

id: prompt-security-review
prompt_text: |
  Review for security vulnerabilities:
  @[target-file]
  
  Focus:
  - OWASP Top 10 issues
  - Input validation gaps
  - SQL injection risks
  - XSS vulnerabilities
  - Auth/authz gaps
  - Secrets exposure
  
  For each: explain vulnerability, show exploit,
  provide secure example.

Governance: Treat Prompts as Code

Version prompts like we version code:

# Create branch for new prompt
git checkout -b prompts/add-api-endpoint-generator

# Add prompt file
vim backend/api/crud-endpoint.yaml

# Test the prompt
npm run test-prompt backend/api/crud-endpoint.yaml

# Commit with description
git commit -m "Add CRUD endpoint generator
- Generates RESTful endpoints with validation
- Includes Zod schemas and auth middleware
- Tested with User resource"

# Create PR for review
gh pr create

Review criteria:

  • ✅ Produces consistent, high-quality output?
  • ✅ Includes examples and tests?
  • ✅ Follows team standards?
  • ✅ Metadata complete (owner, tags, last-reviewed)?

Key Takeaways

Pair Programming

Treat AI as a tireless driver while staying in the navigator's seat. Test-first workflows keep hallucination in check. Context management is critical.

Code Reviews

Use AI (Bugbot/CLI) for first-pass reviews, humans for architecture and security. Track false positives to improve accuracy over time.

Velocity Tracking

Move from story points to flow metrics (DORA). Measure AI contribution separately. Adjust capacity forecasting, not estimates.

Prompt Libraries

Centralize team expertise, enforce standards through prompts, version control like code. Measure ROI through time savings and quality improvements.

The meta-pattern: AI amplifies coordination when teams build shared systems.

Without structure, AI accelerates chaos. With discipline, it unlocks unprecedented team velocity—while maintaining the quality and resilience that matter most.