First impressions of Pocket Flow’s tutorial generator§

These are my initial notes on Tutorial-Codebase-Knowledge (TCK) by Pocket Flow. I first heard about this project here: Show HN: I built an AI that turns GitHub codebases into easy tutorials.

Summary§

With its default settings, the output from TCK was frankly unusable. It did not produce a tutorial, and the writing was not geared towards codebase contributors. BUT! With very little tweaking, I was able to get content that is well suited to codebase contributors. I still could not get it to produce genuine tutorial content, though.

Background§

In its README, TCK describes itself like this:

Ever stared at a new codebase written by others feeling completely lost? This tutorial shows you how to build an AI agent that analyzes GitHub repositories and creates beginner-friendly tutorials explaining exactly how the code works.

As a technical writer, I have a specific understanding of tutorials. A tutorial gives you hands-on experience in building up a specific skill. You start from a very specific point A and end at a very specific point B. I believe that most technical writers agree on this definition but I know that other roles (e.g. software engineers) use different definitions. See Optimize website speed and Tour of Pigweed for examples of tutorials, written by yours truly!

The “explaining exactly how the code works” part of the description suggests that TCK is specifically intended to help onboard new codebase contributors. In other words, it’s not intended to create end user tutorials. For example, if I provide it the React codebase, I expect to get a tutorial that teaches me how to contribute bug fixes or new features to the React codebase, not how to build websites with React.

First attempt§

Let’s start with the Sphinx codebase. I use Sphinx in most of my docs projects. One of my top goals this year is to contribute to the upstream Sphinx codebase more.

Setup§

I like the simple setup process:

$ fish  # The activate command below assumes a fish shell.
$ mkdir pocketflow
$ cd pocketflow
$ git clone git@github.com:sphinx-doc/sphinx.git
$ git clone https://github.com/The-Pocket/Tutorial-Codebase-Knowledge.git
$ cd Tutorial-Codebase-Knowledge
# Edit utils/call_llm.py to use your API key.
# The language model can also be configured from this file (sketch below).
$ python3 -m venv venv
$ . venv/bin/activate.fish
$ python3 -m pip install -r requirements.txt
$ time python main.py --dir ../sphinx
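
To make that call_llm.py step more concrete, here’s roughly the shape my edit took. This is a sketch that assumes the google-genai Python client and a GEMINI_API_KEY environment variable; defer to whatever your checkout of utils/call_llm.py actually contains.

# utils/call_llm.py (sketch, assuming the google-genai SDK)
import os
from google import genai

# Read the API key from the environment instead of hardcoding it.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def call_llm(prompt: str) -> str:
    # Swap this model string to change models (more on model choice below).
    response = client.models.generate_content(
        model="gemini-2.5-pro-exp-03-25",
        contents=prompt,
    )
    return response.text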

The default model is gemini-2.5-pro-exp-03-25. I quickly hit rate limits with that one. The rate limit error message told me to use gemini-2.5-pro-preview-03-25 instead but that one also hit rate limits. I then tried gemini-2.5-flash-preview-04-17 but again, rate limits. I finally downgraded all the way to gemini-2.0-flash. That resolved the rate limit issues but another problem popped up:

The input token count (1366296) exceeds the maximum
number of tokens allowed (1000000).

The fact that we’re hitting an input limit here suggests that TCK feeds in all of the source files as input to the language model.
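
A quick way to sanity-check that theory, and to figure out what to exclude, is to estimate the token footprint of the tree yourself. Here’s a rough, hypothetical helper, assuming the common “~4 characters per token” heuristic; the real tokenizer will count differently, but it’s close enough to decide what to cut.

# estimate_tokens.py (hypothetical helper, ~4 chars/token heuristic)
import os
from collections import Counter

def estimate_tokens(root: str) -> Counter:
    """Rough token estimate per file extension."""
    totals = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip VCS data
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1] or "(no extension)"
            try:
                size = os.path.getsize(path)
            except OSError:
                continue
            totals[ext] += size // 4  # ~4 characters per token
    return totals

if __name__ == "__main__":
    for ext, tokens in estimate_tokens("../sphinx").most_common(10):
        print(f"{ext:>12} ~{tokens:,} tokens")

Grouping the estimate by extension makes it obvious which file types dominate, which leads directly to the exclusion experiment below.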

To proceed, I guess I either need to reduce the amount of input, or choose a smaller project. Let’s try excluding all of Sphinx’s documentation (*.rst) files. In real-world usage, I imagine that codebase owners will use TCK to kickstart their own docs. In other words, their codebase won’t have any docs, and they will use TCK to get a first draft of the docs going.

$ time python main.py --dir ../sphinx --exclude "*.rst"

Still hitting input limits:

The input token count (1569286) exceeds the maximum
number of tokens allowed (1000000).

I’ll give up on Sphinx for now and try the microbit Rust crate instead. This crate lets you write embedded software for the BBC micro:bit in Rust.

$ cd ..
$ git clone https://github.com/nrf-rs/microbit
$ cd Tutorial-Codebase-Knowledge
$ time python main.py --dir ../microbit

This time it worked:

Starting tutorial generation for: ../microbit in English language
Crawling directory: ../microbit...
Fetched 2 files.
Identifying abstractions using LLM...
Identified 5 abstractions.
Analyzing relationships using LLM...
Generated project summary and relationship details.
Determining chapter order using LLM...
Determined chapter order (indices): [0, 1, 2, 3, 4]
Preparing to write 5 chapters...
Writing chapter 1 for: microbit (crate)
 using LLM...
Writing chapter 2 for: Board
 using LLM...
Writing chapter 3 for: Display
 using LLM...
Writing chapter 4 for: GPIO (General Purpose Input/Output) Pins
 using LLM...
Writing chapter 5 for: HAL (Hardware Abstraction Layer)
 using LLM...
Finished writing 5 chapters.
Combining tutorial into directory: output/microbit
  - Wrote output/microbit/index.md
  - Wrote output/microbit/01_microbit__crate__.md
  - Wrote output/microbit/02_board_.md
  - Wrote output/microbit/03_display_.md
  - Wrote output/microbit/04_gpio__general_purpose_input_output__pins_.md
  - Wrote output/microbit/05_hal__hardware_abstraction_layer__.md

Tutorial generation complete! Files are in: output/microbit

________________________________________________________
Executed in   62.90 secs      fish           external
   usr time  575.98 millis  998.00 micros  574.99 millis
   sys time   57.69 millis  256.00 micros   57.43 millis

Evaluation§

Here are my notes about each generated chapter. Clicking the link to a chapter takes you to the generated Markdown for that page.

Index

Pros:

  • I like that the page is concise.

  • The Mermaid diagram is attractive.

  • The components of the diagram seem to be in the correct places.

Cons:

  • The first paragraph claims that the crate provides an “operating system” for the micro:bit. Considering that this is supposed to be a tutorial for codebase contributors, that sounds very misleading. I’m pretty sure this crate gives you bare metal control of the micro:bit. The micro:bit is not powerful enough to run most full-fledged operating systems. I don’t think the crate even uses an RTOS.

  • The diagram seems incomplete. The crate provides examples of interfacing with the micro:bit’s ADC, magnetometer, random number generator, serial, servo, microphone, and speaker. I would expect some of those to be covered in the diagram.

  • Usually, tutorials start by declaring what you’ll learn. After reading this page I have a sense of what the codebase does, but I’m still not sure about what I’ll accomplish by the end of the tutorial.

Chapter 1: microbit (crate)

Pros:

  • It calls out the 2 versions of the micro:bit upfront.

  • It generated a timing diagram!

Cons:

  • The writing is heavily geared towards beginners. This writing style does not seem appropriate for codebase contributors who are usually assumed to be proficient programmers.

  • The first code example is incomplete. There should be more indication that this code won’t work.

  • The display_character code example is useless.

Chapter 2: Board

Cons:

  • The “main character in a video game” analogy is strange.

  • One of the diagrams does not render. Given that GitHub rendered the other Mermaid diagrams fine, I presume that TCK itself generated invalid Mermaid code.

  • The simplified internal implementation seems pretty far removed from the real implementation. If our goal is to onboard new codebase contributors, I’m not sure that’s a good idea.

  • At this point I’m pretty sure that none of the code examples are actually going to work. The lack of #![no_main] and #![no_std] is a giveaway that none of the code is complete.

Chapter 3: Display

Pros:

  • The “key concepts” section seems like a decent overview.

Chapter 4: GPIO (General Purpose Input/Output) Pins

Pros:

  • The code examples are starting to look more fleshed out.

  • If I were brand new to GPIO, this seems like a decent introduction.

Cons:

  • The content is definitely not geared towards codebase contributors.

Chapter 5: HAL (Hardware Abstraction Layer)

Pros:

  • The conceptual explanation of HALs looks solid.

  • The level of technical depth is starting to look more aligned with what new codebase contributors would need.

Cons:

  • The page ends by saying “In the next chapter, we'll explore further concepts.” but there is no next chapter. This is the last chapter.

Conclusions§

The stated goal of the project is to create tutorials that help onboard new codebase contributors. The default TCK logic running on gemini-2.0-flash does not accomplish this goal. It does not generate tutorials, and the writing is not targeted at codebase contributors.

However! I’m not done. It gets very interesting, very quickly.

Second attempt§

An exciting thing about this project is that it’s all open source and the TCK repo itself is quite simple. I’m also personally enjoying the “f*ck you simplicity” of core Pocket Flow itself. Check the Pocket Flow docs to see what I mean.

Setup§

Let’s try to customize TCK to fix the issues that we encountered in the first evaluation. The only file that we need to touch is nodes.py. The creation of the tutorial happens through a series of tasks (“nodes”) that run in a certain order (the “flow”). To start, we don’t even need to mess with the tasks or their ordering; we just tweak some of the prompting inside a few of the tasks.
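
To picture the shape of that pipeline, here’s a hypothetical Python sketch of the node/flow pattern. It is not Pocket Flow’s actual API; the stage names just mirror what the log output from the first attempt showed.

# Hypothetical sketch of the node/flow pattern, not Pocket Flow's real API.
# Each "node" reads from and writes to a shared dict; the "flow" is simply
# the order in which the nodes run. The prompts live inside the nodes.

def identify_abstractions(shared: dict) -> None: ...
def analyze_relationships(shared: dict) -> None: ...
def order_chapters(shared: dict) -> None: ...
def write_chapters(shared: dict) -> None: ...
def combine_tutorial(shared: dict) -> None: ...

FLOW = [
    identify_abstractions,
    analyze_relationships,
    order_chapters,
    write_chapters,
    combine_tutorial,
]

def run_flow(shared: dict) -> dict:
    for node in FLOW:
        node(shared)  # each node mutates the shared state for the next one
    return shared

With that mental model in place, here’s a diff of the prompts that I changed: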

diff --git a/nodes.py b/nodes.py
index 67ab034..0efa6d1 100644
--- a/nodes.py
+++ b/nodes.py
@@ -117,11 +134,12 @@ Codebase Context:
 {context}
 
 {language_instruction}Analyze the codebase context.
-Identify the top 5-10 core most important abstractions to help those new to the codebase.
+Identify the core abstractions. Our goal is to help onboard
+new contributors into this codebase. Assume that they are proficient software programmers.
 
 For each abstraction, provide:
 1. A concise `name`{name_lang_hint}.
-2. A beginner-friendly `description` explaining what it is with a simple analogy, in around 100 words{desc_lang_hint}.
+2. A concise, technical `description` explaining the abstraction in 100-300 words{desc_lang_hint}.
 3. A list of relevant `file_indices` (integers) using the format `idx # path/comment`.
 
 List of file indices and paths present in the context:
@@ -255,7 +296,7 @@ Context (Abstractions, Descriptions, Code):
 {context}
 
 {language_instruction}Please provide:
-1. A high-level `summary` of the project's main purpose and functionality in a few beginner-friendly sentences{lang_hint}. Use markdown formatting with **bold** and *italic* text to highlight important concepts.
+1. A high-level `summary` of the abstraction's main purpose and functionality{lang_hint}. Use markdown formatting with **bold** and *italic* text to highlight important concepts.
 2. A list (`relationships`) describing the key interactions between these abstractions. For each relationship, specify:
     - `from_abstraction`: Index of the source abstraction (e.g., `0 # AbstractionName1`)
     - `to_abstraction`: Index of the target abstraction (e.g., `1 # AbstractionName2`)
@@ -263,7 +304,7 @@ Context (Abstractions, Descriptions, Code):
     Ideally the relationship should be backed by one abstraction calling or passing parameters to another.
     Simplify the relationship and exclude those non-important ones.
 
-IMPORTANT: Make sure EVERY abstraction is involved in at least ONE relationship (either as source or target). Each abstraction index must appear at least once across all relationships.
+IMPORTANT: Make sure EVERY abstraction is involved in at least ONE relationship (either as source or target).
 
 Format the output as YAML:
 
@@ -379,6 +446,7 @@ Abstractions (Index # Name){list_lang_note}:
 Context about relationships and project summary:
 {context}
 
+A tutorial is a practical activity, in which the student learns by doing something meaningful, towards some achievable goal.
 If you are going to make a tutorial for ```` {project_name} ````, what is the best order to explain these abstractions, from first to last?
 Ideally, first explain those that are the most important or foundational, perhaps user-facing concepts or entry points. Then move to more detailed, lower-level implementation details or supporting concepts.
 
@@ -542,12 +641,14 @@ class WriteChapters(BatchNode):
-{language_instruction}Write a very beginner-friendly tutorial chapter (in Markdown format) for the project `{project_name}` about the concept: "{abstraction_name}". This is Chapter {chapter_num}.
+{language_instruction}Write a tutorial chapter (in Markdown format) for the project `{project_name}` about the concept: "{abstraction_name}". This is Chapter {chapter_num}.
+The tutorial must walk the user through a guided, hands-on learning experience. The goal is to help new codebase contributors onboard into our codebase. By the end of the chapter they should be able to contribute code to our codebase. You can assume that the reader is a proficient software programmer.
 
 Concept Details{concept_details_note}:
 - Name: {abstraction_name}
@@ -568,48 +669,48 @@ Instructions for the chapter (Generate content in {language.capitalize()} unless
 
 - If this is not the first chapter, begin with a brief transition from the previous chapter{instruction_lang_note}, referencing it with a proper Markdown link using its name{link_lang_note}.
 
-- Begin with a high-level motivation explaining what problem this abstraction solves{instruction_lang_note}. Start with a central use case as a concrete example. The whole chapter should guide the reader to understand how to solve this use case. Make it very minimal and friendly to beginners.
+- Begin with a high-level motivation explaining what problem this abstraction solves{instruction_lang_note}. Start with a central use case as a concrete example. The whole chapter should guide the reader to understand how to solve this use case. Make it minimal but complete.
 
-- If the abstraction is complex, break it down into key concepts. Explain each concept one-by-one in a very beginner-friendly way{instruction_lang_note}.
+- If the abstraction is complex, break it down into key concepts. Explain each concept one-by-one{instruction_lang_note}.
 
 - Explain how to use this abstraction to solve the use case{instruction_lang_note}. Give example inputs and outputs for code snippets (if the output isn't values, describe at a high level what will happen{instruction_lang_note}).
 
-- Each code block should be BELOW 20 lines! If longer code blocks are needed, break them down into smaller pieces and walk through them one-by-one. Aggresively simplify the code to make it minimal. Use comments{code_comment_note} to skip non-important implementation details. Each code block should have a beginner friendly explanation right after it{instruction_lang_note}.
+- Simplify the code to make it minimal but it must remain technically accurate. Use comments{code_comment_note} to skip non-important implementation details. Each code block should have a concise explanation right after it{instruction_lang_note}.
 
-- Describe the internal implementation to help understand what's under the hood{instruction_lang_note}. First provide a non-code or code-light walkthrough on what happens step-by-step when the abstraction is called{instruction_lang_note}. It's recommended to use a simple sequenceDiagram with a dummy example - keep it minimal with at most 5 participants to ensure clarity. If participant name has space, use: `participant QP as Query Processing`. {mermaid_lang_note}.
+- Describe the internal implementation to help understand what's under the hood{instruction_lang_note}. Provide a concise walkthrough on what happens step-by-step when the abstraction is called{instruction_lang_note}. Use a sequenceDiagram when appropriate. If participant name has space, use something like this: `participant QP as Query Processing`. {mermaid_lang_note}.
 
-- Then dive deeper into code for the internal implementation with references to files. Provide example code blocks, but make them similarly simple and beginner-friendly. Explain{instruction_lang_note}.
+- Then dive deeper into code for the internal implementation with references to files. Provide example code blocks. Explain{instruction_lang_note}.
 
 - IMPORTANT: When you need to refer to other core abstractions covered in other chapters, ALWAYS use proper Markdown links like this: [Chapter Title](filename.md). Use the Complete Tutorial Structure above to find the correct filename and the chapter title{link_lang_note}. Translate the surrounding text.
 
-- Use mermaid diagrams to illustrate complex concepts (```mermaid``` format). {mermaid_lang_note}.
-
-- Heavily use analogies and examples throughout{instruction_lang_note} to help beginners understand.
+- Use Mermaid diagrams to illustrate complex concepts (```mermaid``` format). {mermaid_lang_note}.
 
 - End the chapter with a brief conclusion that summarizes what was learned{instruction_lang_note} and provides a transition to the next chapter{instruction_lang_note}. If there is a next chapter, use a proper Markdown link: [Next Chapter Title](next_chapter_filename){link_lang_note}.
 
-- Ensure the tone is welcoming and easy for a newcomer to understand{tone_note}.
+- Ensure the tone is concise, friendly, and professional{tone_note}.
 
 - Output *only* the Markdown content for this chapter.
 
-Now, directly provide a super beginner-friendly Markdown output (DON'T need ```markdown``` tags):
+Now, directly provide Markdown output (DON'T need ```markdown``` tags):
 """

Everything else was the same. I’m still using gemini-2.0-flash.

Evaluation§

Index

Pros:

  • The summary looks completely correct now.

Cons:

  • The diagram is a lot more confusing now.

Chapter 1: microbit V1 and V2 crates

Pros:

  • There are some new technical details that are very relevant for new codebase contributors, such as the fact that the v1 and v2 boards use different MCUs and cross-compilation toolchains. (I haven’t fact-checked that.)

  • There’s an end-user-focused section on how to use the crates, but I actually think it’s appropriate to ground new codebase contributors in how end users are expected to get value from the codebase. “Focus on the user” etc.

  • The implementation section is grounded in real code now.

Cons:

  • This tutorial is supposed to introduce the most important concepts first. It’s debatable whether the distinction between the v1 and v2 hardware editions is the most important concept.

  • The cited implementation file is incorrect. TCK mentions microbit-v2/src/lib.rs but the code seems to actually exist in microbit-common/src/v2/board.rs.

Chapter 2: Board struct

Pros:

  • The motivation sections seem to provide the kind of context that new codebase contributors would need.

  • The “using the board struct” code example looks complete.

Cons:

  • TCK seemed to generate incorrect Mermaid diagram code again.

  • It got the file path wrong again.

Chapter 3: Non-Blocking Display

Pros:

  • It’s very cool that TCK is now suggesting ways for the reader to contribute code.

Chapter 4: GPIO Module

Cons:

  • The “Contributing Code” section is light on details regarding the actual mechanics of contributing code. E.g. it says to follow existing code style but doesn’t tell you how to auto-format and lint your code.

Chapter 5: HAL (Hardware Abstraction Layer)

By now it’s very clear that each chapter of a TCK-generated tutorial follows a very specific structure, i.e. a template. That is both a pro and a con: predictable structure can aid learning, but it can also become boring.

Chapter 6: RTIC (Real-Time Interrupt-driven Concurrency) Integration

Pros:

  • This seems like a topic that is relevant to new codebase contributors, and it was not covered in the first generated tutorial.

  • This time, TCK correctly states that this is the end of the “tutorial”.

Conclusions§

The writing style and content are markedly improved, and much more geared towards new codebase contributors.

However, TCK still did not produce a tutorial, even though I defined that term pretty clearly in the prompts. This “book” is more of an architectural overview of the codebase. Which could still be super valuable! But the fact remains that this workflow did not generate a hands-on learning experience with specific start and stop points.

So, short story long, Pocket Flow has potential for bona fide documentation automation!

Open questions§

  • How do I hack TCK to work with large codebases?

  • How does TCK perform on a truly unknown codebase? It’s hard to evaluate TCK against a codebase like requests because requests is an extremely popular library. The underlying language model has been trained on lots of content related to proper requests usage.

  • How does TCK perform with more powerful models? Remember that I used gemini-2.0-flash for both of my attempts. The result in the second attempt is already pretty promising. How much better will it be with the new hotness known as Gemini 2.5 Pro??

  • I didn’t manipulate the nodes or the flow graph at all. Is that easy to do? Will it change the generated output as dramatically as I expect?

  • Currently, it can only generate codebase architecture overviews. Can we actually get it to generate proper tutorials and how-to guides?

  • What the heck does “pocket flow” mean????