A Year With GitHub Copilot in Production
December 2, 2022 · 4 min read · by Muhammad Amal

TL;DR: One year of daily Copilot use. Estimated ~25% overall productivity gain, driven by boilerplate-heavy code (tests, CRUD, glue) at ~40-50%; close to 0% on novel architecture. Most-misleading suggestions: APIs that look real but don’t exist. The workflow shift is real; the language for it (“AI-native engineer”) only stuck mid-2022.

December starts the AI-augmented dev theme. After a year of using GitHub Copilot, this post is my honest retrospective.

I started using Copilot in November 2021, while it was still in technical preview. That’s ~13 months of daily use across Go, Python, TypeScript, PHP, and Rust. This isn’t a Copilot promotional post, and it isn’t a Copilot hate piece either. It’s what I’ve actually observed.

What I use it for, ranked by value

Highest value:

  • Boilerplate: handler signatures, struct definitions, repetitive CRUD. Exactly the kind of work AI is good at.
  • Test scaffolding: “given this function, generate three test cases.” About 80% correct; I edit the assertions.
  • DTO mappings: convert a domain entity to a JSON response. Mechanical; Copilot handles it.
  • Documentation comments: it starts a docstring; I correct it.
  • Repetitive refactors: “do this same change to 12 functions.” Copilot suggests each next one.
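
As an illustration of the DTO-mapping and test-scaffolding items, here is a minimal Python sketch. The User entity, its fields, and the user_to_response name are all hypothetical, invented for this example. The mapper is the mechanical part; the last check is the kind of assertion that still needs a human eye.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class User:
    """Hypothetical domain entity; field names are assumptions for this sketch."""
    id: int
    email: str
    display_name: str
    password_hash: str
    created_at: datetime


def user_to_response(user: User) -> dict:
    """Map the entity to a JSON-ready dict, leaving out internal fields."""
    return {
        "id": user.id,
        "email": user.email,
        "displayName": user.display_name,
        "createdAt": user.created_at.isoformat(),
    }


def test_user_to_response() -> None:
    user = User(
        id=1,
        email="a@example.com",
        display_name="A",
        password_hash="not-a-real-hash",
        created_at=datetime(2022, 12, 2, tzinfo=timezone.utc),
    )
    body = user_to_response(user)
    assert body["id"] == 1
    assert body["displayName"] == "A"
    # The assertion worth writing by hand: secrets must not leak into the response.
    assert "password_hash" not in body


if __name__ == "__main__":
    test_user_to_response()
    print("ok")
```

The happy-path assertions are easy to generate; the "must not contain" check is the kind of edit the 80% figure leaves to the reviewer.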

Medium value:

  • Algorithm starters: I describe what I want in a comment; Copilot gives a starting point. Usually 60% right.
  • Regex: I write // match phone numbers; it generates one. ~50% right.
  • Library API discovery: it suggests function calls that often exist in the library. Saves docs lookup.
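
To show what "~50% right" can look like for regex, here is a small Python sketch. The pattern is a hypothetical first-pass suggestion written for this example, not actual Copilot output, and which phone formats should count as valid is an assumption that depends on the application.

```python
import re

# Hypothetical first-pass pattern of the kind a "match phone numbers" comment might yield.
PHONE_NAIVE = re.compile(r"^\d{3}-\d{3}-\d{4}$")

# Sample inputs; treating the first three as valid is an assumption for this sketch.
SAMPLES = [
    "555-123-4567",     # dashed format
    "(555) 123-4567",   # parenthesised area code
    "+1 555 123 4567",  # country code with spaces
    "555-1234-567",     # malformed grouping
]


def check(pattern: re.Pattern, candidates: list) -> dict:
    """Return which candidates the pattern accepts."""
    return {c: bool(pattern.match(c)) for c in candidates}


if __name__ == "__main__":
    for text, accepted in check(PHONE_NAIVE, SAMPLES).items():
        print(f"{text!r}: {accepted}")
```

The pattern accepts the dashed format and correctly rejects the malformed one, but it also rejects the two other common formats. A short list of sample inputs like this is usually enough to see whether a suggested regex is worth keeping.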

Low value / negative:

  • Anything architecturally novel: it pattern-matches against what’s typical; what’s typical is sometimes wrong for my situation.
  • Anything security-sensitive: it sometimes suggests insecure patterns (SQL string concat, weak crypto).
  • Anything domain-specific: business logic for my specific app is wrong without exception.
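
The SQL string concatenation case is easy to show. Below is a minimal sketch using Python's standard sqlite3 module; the users table and the function names are hypothetical. The first function is the insecure shape to watch for in a suggestion, the second is the parameterized form.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Insecure: user input is spliced directly into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print("unsafe:", find_user_unsafe(conn, payload))
    print("safe:  ", find_user_safe(conn, payload))
```

With the injection payload, the concatenated query returns every row while the parameterized one returns none. Both versions look equally tidy in a diff, which is why this category needs deliberate review.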

Estimated productivity gain

Hard to measure precisely. My best estimate:

Task type                            Time savings
Boilerplate (CRUD, mappers, tests)   ~40-50%
Code reviews                         ~10-15%
New feature design                   ~5-10%
Debugging                            ~5%
Novel architecture                   ~0%

Weighted by how much time I spend on each: maybe 25% overall.
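
The weighting can be sketched as a time-weighted average. In the Python below, the savings figures are midpoints of the ranges listed above, but the time shares are purely hypothetical: the post does not give a breakdown, so these are illustrative numbers chosen only to show the arithmetic.

```python
# Midpoints of the per-task savings estimates listed above.
savings = {
    "boilerplate": 0.45,        # ~40-50%
    "code_review": 0.125,       # ~10-15%
    "feature_design": 0.075,    # ~5-10%
    "debugging": 0.05,          # ~5%
    "novel_architecture": 0.0,  # ~0%
}

# ASSUMPTION: hypothetical share of working time per task type.
# These are not from the post; substitute your own breakdown.
time_share = {
    "boilerplate": 0.50,
    "code_review": 0.15,
    "feature_design": 0.15,
    "debugging": 0.15,
    "novel_architecture": 0.05,
}


def weighted_savings(savings: dict, time_share: dict) -> float:
    """Time-weighted average of per-task savings."""
    total_time = sum(time_share.values())
    return sum(savings[task] * time_share[task] for task in savings) / total_time


if __name__ == "__main__":
    print(f"overall: {weighted_savings(savings, time_share):.0%}")
```

Under these assumed shares the overall figure comes out at about 26%, in line with the "maybe 25%" estimate. The result is dominated by the boilerplate row, so the overall number moves mostly with how much of the week is boilerplate.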

That’s significant. Not 10× (“AI will replace developers”). Not 0% (“AI is hype”). Somewhere in the middle.

What Copilot consistently gets wrong

Three patterns I see repeatedly:

1. Hallucinated APIs. Generates code that calls a function under the wrong name or namespace (for example lodash.flatMapDeep in a file where the library is imported as _, so the call should be _.flatMapDeep), and sometimes invents APIs that don’t exist at all. The code looks right; it doesn’t run. ~20% of my “Tab to accept” sessions need correction.

2. Wrong context. Suggests Python 3 syntax in a Python 2 codebase. Suggests React 18 patterns in a React 16 codebase. It leans on statistical familiarity; it doesn’t check which versions the project actually uses.

3. Subtle bugs. Off-by-one errors. Wrong sign on a comparison. Missing await. The code compiles and looks fine; misbehavior surfaces during testing.

For each, the workflow is: accept, scan, often correct. Faster than typing from scratch; not free.
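
As a concrete instance of the third pattern, here is a Python sketch of a completion that reads fine and passes a quick happy-path check but has an edge-case bug. The function names are hypothetical and the buggy version is written for illustration, not captured Copilot output.

```python
def last_n_buggy(items: list, n: int) -> list:
    # Plausible-looking one-liner. Works for n >= 1, but items[-0:] is the
    # same as items[0:], so n == 0 returns the whole list instead of nothing.
    return items[-n:]


def last_n(items: list, n: int) -> list:
    """Return the last n items; an n of zero or less yields an empty list."""
    if n <= 0:
        return []
    return items[-n:]


if __name__ == "__main__":
    data = [1, 2, 3]
    print(last_n_buggy(data, 2), last_n(data, 2))
    print(last_n_buggy(data, 0), last_n(data, 0))
```

Both versions agree for n = 2; only the n = 0 case separates them. That is the sort of difference a skim misses and a boundary test catches.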

The workflow shift

Some real changes in how I work:

More planning out loud. I write more comments BEFORE writing code, because Copilot’s suggestion quality scales with context. “// this function takes a userID and returns…” is now my normal opening. Side effect: better code structure.

Less Stack Overflow. I haven’t searched “how to do X in Python” in months. Copilot autocompletes the syntax I’m trying to remember.

More skim-reading. Less time typing; more time eyeballing what was generated. Skimming code is a real skill now.

More test-writing. Trivial cost to add 5 more test cases. I write more tests than before.

What’s NOT covered by Copilot

Copilot helps within a function or across nearby functions. It doesn’t:

  • Suggest architectural patterns
  • Find bugs in existing code (that’s a different product)
  • Refactor across modules
  • Generate PRs from descriptions
  • Plan work

For those, GPT-4 in ChatGPT (released March 2023, so out of scope for this retrospective) starts being useful but is its own product. Copilot is autocomplete++.

When I turn it off

Three contexts where I disable Copilot:

  • Security-sensitive code (auth, crypto, payment). The risk of an insecure pattern slipping through outweighs the speed gain.
  • Code I’m trying to learn from. When I’m working through Rust ownership for the first time, autocomplete robs me of the learning.
  • Long writing sessions in markdown. Constant suggestions distract.

It has a button. I use it.

The license thing

In November 2022, a class-action lawsuit over Copilot’s training-data licensing was filed against GitHub, Microsoft, and OpenAI. As of December 2022 it is still in litigation. Companies with strict open-source compliance requirements have policies on this; mine doesn’t have an issue with Copilot.

I’ll cover the legal angle separately. For most teams in 2022: the practical impact is “your code may resemble training data.” Audit accordingly.

What this month covers

12 more posts on AI-augmented dev.

Wrapping Up

Copilot is good, not magic, sometimes wrong. A year in, I won’t go back. Monday: what it’s specifically good at and what it isn’t.