Commit 70886b9: chapter 16

1 parent 8ecafda

12 files changed

Lines changed: 4022 additions & 44 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -28,8 +28,8 @@ Over the past years working in AI/ML, I filled notebooks with intuition first, r
 | 12 | [Graph Neural Networks](chapter%2012%3A%20graph%20neural%20networks/01.%20geometric%20deep%20learning.md) | geometric deep learning, graph theory, GNNs, graph attention, Graph Transformers, 3D equivariant networks | Available |
 | 13 | [Computing & OS](chapter%2013%3A%20computing%20and%20OS/01.%20discrete%20maths.md) | discrete maths, computer architecture, operating systems, concurrency, parallelism, programming languages | Available |
 | 14 | [Data Structures & Algorithms](chapter%2014%3A%20data%20structures%20and%20algorithms/00.%20foundations.md) | Big O, recursion, backtracking, DP, arrays, hashing, linked lists, stacks, trees, graphs, sorting, binary search | Available |
-| 15 | Production Software Engineering | Linux fundamentals, Git fundamentals, codebase design patterns, testing | Coming |
-| 16 | SIMD & GPU Programming | ARM & NEON, X86 chips, RISC ships, GPUs, TPUs, triton, CUDA, Vulkan | Coming |
+| 15 | [Production Software Engineering](chapter%2015%3A%20production%20software%20engineering/01.%20linux%20and%20CMD.md) | Linux, Git, codebase design, testing, CI/CD, Docker, model serving, MLOps, monitoring, best way to use coding agents | Available |
+| 16 | [SIMD & GPU Programming](chapter%2016%3A%20SIMD%20and%20GPU%20programming/00.%20why%20C%2B%2B%20and%20how%20ML%20frameworks%20work.md) | C++ for ML, how frameworks work, hardware fundamentals, ARM NEON/I8MM/SME2, x86 AVX, GPU/CUDA, Triton, TPUs, RISC-V, Vulkan, WebGPU | Available |
 | 17 | ML Systems Design | systems design fundamentals, cloud computing, large scale infra, ML systems design examples | Coming |
 | 18 | AI Inference | quantisation, streaming LLMs, continuous batching, edge inference | Coming |
 | 19 | Applied AI | AI in finance, healthcare, protein, drug discovery | Coming |

chapter 15: production software engineering/03. codebase design.md

Lines changed: 78 additions & 0 deletions
@@ -303,3 +303,81 @@ pip install -e ".[dev]" # install in editable mode with dev dependencies

- **Editable install** (`-e`): changes to your source code are immediately reflected without reinstalling. Essential during development.

- **Pinning dependencies**: `requirements.txt` with exact versions (`torch==2.2.1`, not `torch>=2.0`) ensures reproducibility. Use `pip freeze > requirements.txt` to capture your current environment. For more sophisticated dependency management, use `uv`, `poetry`, or `pip-tools`.
## Working with AI Coding Agents
- AI coding agents (Claude Code, GitHub Copilot, Cursor, etc.) are now part of the professional engineering workflow. Used well, they dramatically accelerate development. Used poorly, they introduce subtle bugs, erode your understanding of your own codebase, and create a false sense of productivity.

- The right mental model: **an agent is a fast but inexperienced pair programmer**. It can write code quickly, knows syntax and standard patterns, and has read more documentation than you ever will. But it does not understand your specific system, your business constraints, your edge cases, or the *why* behind your design decisions. You are the senior engineer; the agent is the junior. You direct, review, and take responsibility.

### When Agents Excel
- **Boilerplate and scaffolding**: generating Dockerfiles, CI configs, test fixtures, data class definitions, argparse setups. These follow well-known patterns and are tedious to write by hand. Let the agent generate them, then review for correctness.
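As a sketch of what that scaffolding looks like, here is a typical argparse setup an agent can generate in seconds (the flag names are illustrative, not from any particular project):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Routine scaffolding: tedious to type, quick to review.
    # Flag names here are illustrative placeholders.
    parser = argparse.ArgumentParser(description="Train a model")
    parser.add_argument("--data-dir", required=True, help="path to training data")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--lr", type=float, default=3e-4)
    parser.add_argument("--seed", type=int, default=0)
    return parser

args = build_parser().parse_args(["--data-dir", "data/", "--epochs", "5"])
print(args.epochs, args.lr)  # 5 0.0003
```

The review step is fast precisely because the pattern is familiar: you check flag names, defaults, and required arguments rather than re-deriving the boilerplate.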
- **Writing tests**: describe the function's behaviour, and the agent generates test cases. It often catches edge cases you would miss (empty input, negative values, Unicode). Always read the generated tests: each assertion encodes an assumption about the intended behaviour, and an unexamined wrong assumption quietly becomes the spec.
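A minimal sketch of what reviewing generated tests involves; `word_count` is a hypothetical function under test, not from the text:

```python
def word_count(text: str) -> int:
    # Hypothetical function under test.
    return len(text.split())

# Edge cases an agent typically proposes. Read each one: every assert
# encodes an assumption about intended behaviour (e.g. should runs of
# whitespace count as a single separator?).
def test_empty_string():
    assert word_count("") == 0

def test_whitespace_runs():
    assert word_count("a   b") == 2

def test_unicode():
    assert word_count("héllo wörld") == 2

test_empty_string(); test_whitespace_runs(); test_unicode()
```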
- **Refactoring**: "extract this block into a function," "convert this class to use dataclasses," "add type hints to this module." Mechanical transformations where the intent is clear and the risk of subtle errors is low.
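The dataclass conversion, for instance, is mechanical enough to delegate safely; the `Config` class below is a made-up example:

```python
from dataclasses import dataclass

# Before: hand-written boilerplate (a made-up example class).
class ConfigOld:
    def __init__(self, lr: float, batch_size: int):
        self.lr = lr
        self.batch_size = batch_size

# After the mechanical refactor: same fields, but __init__, __repr__,
# and __eq__ are generated automatically by the decorator.
@dataclass
class Config:
    lr: float
    batch_size: int

print(Config(lr=3e-4, batch_size=32))  # Config(lr=0.0003, batch_size=32)
```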
- **Exploration and prototyping**: "write a quick script to benchmark inference latency" or "show me how to use the HuggingFace tokeniser API." The agent gets you a working starting point faster than reading documentation.
- **Documentation and docstrings**: the agent can generate documentation from your code structure. Review for accuracy, but the grunt work is automated.
- **Debugging assistance**: paste an error traceback and ask for diagnosis. The agent can often identify the root cause and suggest a fix, especially for common issues (shape mismatches, import errors, CUDA out of memory).
### When Not to Rely on Agents

- **Novel architecture decisions**: if you are designing a new training pipeline, the agent will give you a generic answer. It does not know your data constraints, latency requirements, or team expertise. Use the agent to implement the design you have already thought through.

- **Security-critical code**: authentication, encryption, input sanitisation. The agent may generate code that looks correct but has subtle vulnerabilities (SQL injection, insecure defaults, timing attacks). Security code should be written by someone who understands the threat model, and reviewed by someone else.

- **Performance-critical inner loops**: the agent will write correct but naive code. For GPU kernels, memory-critical data structures, or latency-sensitive serving paths, you need to understand the hardware constraints (chapter 13, chapter 16) and optimise deliberately.

- **Code you don't understand**: if the agent generates 200 lines and you cannot explain what each line does, do not commit it. You are now maintaining code you do not understand, and when it breaks (it will), you cannot debug it. This is the most common and most dangerous failure mode.

### The Review Discipline

- **Always read every line** of generated code before committing. This is not optional. The agent's code is a draft, not a finished product. Treat it exactly like a pull request from a colleague: review it critically.
- **What to check**:
  - **Correctness**: does it actually do what you asked? Agents often solve a subtly different problem than the one you intended.
  - **Edge cases**: does it handle empty inputs, None values, negative numbers, very large inputs? Agents frequently omit edge case handling.
  - **Hallucinated APIs**: the agent may call functions or use parameters that do not exist, especially for newer or less common libraries. Verify that every API call is real.
  - **Over-engineering**: agents tend to produce more code than necessary. A 50-line solution to a 10-line problem adds complexity without benefit. Simplify ruthlessly.
  - **Security**: hardcoded secrets, unsanitised user input, insecure defaults. The agent does not think adversarially.
  - **Style consistency**: does the generated code match your project's conventions (naming, patterns, error handling)?
### How to Write Good Prompts
- The quality of the agent's output is directly proportional to the quality of your instruction. Vague prompts get vague code.

- **Bad**: "write a data loader"
- **Good**: "write a PyTorch DataLoader for a CSV file with columns 'text' and 'label'. Tokenise the text using the HuggingFace tokeniser 'bert-base-uncased' with max_length=512. Return input_ids, attention_mask, and label as tensors. Handle the case where the CSV has missing values in the label column by skipping those rows."
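The tokenisation half of that prompt needs the real libraries, but the missing-label handling it specifies can be sketched with the standard library alone. Column names follow the prompt; the rest is a simplified stand-in, not a full DataLoader:

```python
import csv
import io

def load_rows(csv_text: str) -> list[dict]:
    # Keep only rows with a non-empty 'label', as the prompt specifies.
    # Tokenisation (bert-base-uncased, max_length=512) is omitted here.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if (row.get("label") or "").strip()]

data = "text,label\nhello,1\nworld,\nfoo,0\n"
print([r["text"] for r in load_rows(data)])  # ['hello', 'foo']
```

Note how the precise prompt makes this behaviour checkable: "skip rows with missing labels" turns directly into a reviewable filter condition.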
- **Provide context**: tell the agent about your project structure, existing code, constraints, and conventions. The more context, the better the output.

- **Specify constraints**: "use only the standard library," "must work with Python 3.10," "do not use global variables," "follow the existing pattern in `src/models/transformer.py`."

- **Ask for explanations**: "implement X and explain the key design decisions." This forces the agent to articulate its reasoning, making it easier for you to spot flawed assumptions.
### Using Quality Gates to Catch Agent Mistakes

- Your existing quality infrastructure (file 04) catches agent errors just as well as human errors:

- **Type checking (mypy)**: catches hallucinated API signatures and type mismatches.
- **Linting (ruff)**: catches unused imports, undefined variables, and style violations.
- **Tests (pytest)**: if the agent's code passes your test suite, it is more likely correct. If you do not have tests, write them *before* asking the agent to implement the feature (test-driven development works especially well with agents).
- **CI pipeline**: runs all of the above automatically on every commit.
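Test-driven development with an agent can be as lightweight as writing the spec first; `slugify` here is a hypothetical feature, not from the text:

```python
# Step 1: you write the spec as tests, before prompting the agent.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  trim me  ") == "trim-me"
    assert slugify("") == ""

# Step 2: the agent implements against the spec. If the tests pass,
# the generated code meets your stated behaviour (and only that).
def slugify(text: str) -> str:
    return "-".join(text.lower().split())

test_slugify()
```

Because you wrote the assertions yourself, a passing suite means the agent satisfied *your* intent, not its own guess at it.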
- The combination of "agent writes code" + "quality gates verify it" is more productive than either alone. The agent is fast but sloppy; the gates are thorough but do not write code. Together, you get speed and correctness.
### The Productivity Trap

- The biggest risk of coding agents is **the illusion of productivity**. You can generate 500 lines of code in 10 minutes. But if you spend 2 hours debugging those 500 lines because you did not understand them, you were slower than writing 200 lines yourself in 30 minutes.
- True productivity with agents comes from:
  1. **Staying in control**: you decide the architecture, the agent fills in the implementation.
  2. **Understanding what is generated**: if you cannot explain it, rewrite it or ask the agent to simplify.
  3. **Investing in quality gates**: tests, types, and linting amortise their cost across every agent interaction.
  4. **Using the agent for your weaknesses**: if you are great at algorithms but slow at writing tests, let the agent write tests. If you are fast at UI code but unfamiliar with database queries, let the agent draft the SQL. Play to your strengths, delegate your gaps.

- The engineers who get the most out of coding agents are the ones who already know how to code well. The agent amplifies your existing skill; it does not replace it. Understanding data structures, algorithms, system design, and software engineering (this entire chapter) is what lets you direct the agent effectively and evaluate its output critically.
