I just spent $13.5 for type annotations on 2.8k lines of code and I think you should, too

I just type-annotated the entire codebase of openedx/code-annotations. The pull request is here: "chore: add strict type annotations on the entire codebase" by regisb (Pull Request #169, openedx/code-annotations).

It took me ~4h, during which I used Claude Code and spent $13.5 in API usage. Most of that time went into adding tooling on top of the repository (make targets, requirements compilation, etc.) and reviewing the changes made by Claude.

That last step was absolutely necessary, because (surprise surprise) AI models make mistakes. Still, generative AI allowed me to reach 100% coverage about 3-5x faster than if I had done it manually. After this last verification step, I feel pretty happy with the result.

I’m not going to share the exact Claude prompt I used, because it was very basic. It was something along these lines:

Annotate the code_annotations/ directory with types such that mypy --strict is successful. Don’t use typing.List/Dict/Tuple, but use the core list/dict/tuple classes instead.
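As an illustration of the second half of that prompt, here is a minimal sketch of the annotation style it asks for: built-in `list`/`dict`/`tuple` generics rather than `typing.List`/`Dict`/`Tuple`. The function names are hypothetical and do not come from code-annotations; the `from __future__ import annotations` line is an assumption that keeps the `X | None` syntax working on Python versions older than 3.10.

```python
from __future__ import annotations


def count_tokens(lines: list[str]) -> dict[str, int]:
    """Count whitespace-separated tokens across all lines (hypothetical helper)."""
    counts: dict[str, int] = {}
    for line in lines:
        for token in line.split():
            counts[token] = counts.get(token, 0) + 1
    return counts


def first_and_last(items: list[str]) -> tuple[str, str] | None:
    """Return the first and last items, or None for an empty list."""
    if not items:
        return None
    return items[0], items[-1]
```

Code written this way should be checkable with `mypy --strict` without importing anything from `typing`.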

And then rinse and repeat with the tests/ and test_utils/ subdirectories. Each time, I committed the changes locally so that I could roll them back. Sometimes Claude was digging itself into a hole, especially with Sphinx annotations; in those cases, I had to intervene manually.

Some of the mistakes included things like:

result = some_function_that_can_return_none()
do_something_with(result)

that were changed to:

result = some_function_that_can_return_none()
if result is not None:
    do_something_with(result)

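which is very wrong: the added check keeps the type checker quiet, but it also silently skips the call whenever the result is None, so the behaviour of the program changes.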
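To make the problem with that kind of change concrete, here is a small sketch. All names (`find_user`, `record`, `guarded`, `explicit`) are hypothetical stand-ins for the placeholders above. Raising an exception is only one possible alternative and an assumption on my part; the right fix depends on what the original code was meant to do with `None`.

```python
from __future__ import annotations

calls: list[str] = []


def find_user(name: str) -> str | None:
    """Hypothetical lookup that may return None."""
    return name.upper() if name else None


def record(user: str) -> None:
    """Hypothetical consumer that requires a real value."""
    calls.append(user)


def guarded(name: str) -> None:
    # The AI-generated pattern: passes the type checker,
    # but silently does nothing when the lookup fails.
    result = find_user(name)
    if result is not None:
        record(result)


def explicit(name: str) -> None:
    # One alternative: make the None case loud instead of hiding it.
    result = find_user(name)
    if result is None:
        raise ValueError(f"no user found for {name!r}")
    record(result)
```

Both versions satisfy the type checker, but only the second one makes the failed lookup visible to the caller.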

I’m a big fan of type annotations in Python, and especially in Open edX. I really wish we could fully annotate the entire Python codebase, which would give us a lot more confidence in using the APIs, refactoring and testing changes. So this is a step forward that makes me super happy.

If the community feels good about this, and I find some free time, I’d like to apply the same technique to the other repositories that I maintain: edx-toggles (600 lines of code), django-config-models (1k LOC) and edx-django-utils (4.7k LOC).

10 Likes

Nice! I think this sort of thing is a great use of AI. I might try it out with JSX → TSX conversions.

1 Like

Here’s my first attempt. It cost about $8 and required more hand-holding than I’d hoped, but I think it did a decent job: "Convert the "Taxonomy" app to 100% TypeScript" by bradenmacdonald (Pull Request #2025, openedx/frontend-app-authoring).

In retrospect it could have worked better if I’d given it better prompts and/or if we had a Claude.md file with instructions about the best way to perform certain tasks in that repo. I also wish I could interact with it on GitHub via regular code review instead of running it locally and having it push commits as me.

2 Likes

I’d be curious how this stacks up against tools people have already written for annotating existing Python codebases with types.

1 Like

I recently came across this demo of the Zed editor, which I really like because:

  • Tasks can be described in plain language without much up-front specification (Claude.md, .cursorrules, etc.). The editor provides the tool calls needed to read the relevant files in context.
  • You can follow along live in the editor or review the diffs afterwards.

I tried it for generating tests for a TypeScript component and found that mocking data and setting up boilerplate is the best help it can offer at the moment. The test implementation itself had to be written by hand, even though the component was fully implemented. I guess it speaks to the “doesn’t understand” part of LLMs.

I have yet to try it on a Python codebase, and typing seems to be a really good use case: predictable, but mundane for humans to do by hand.

I wasn’t aware such tools existed :sweat_smile: Looks like the most mature of the bunch is Instagram’s MonkeyType. If I had known, I would have used both MonkeyType and Claude Code together to iterate on the code base. Out of the box, it seems that MonkeyType would not be sufficient by itself:

  • Some annotations are missing
  • Some annotations are too verbose
  • MonkeyType can’t help resolve the typing inconsistencies that mypy reports

I believe that using both tools would have resulted in a better, faster and cheaper result than with just one of the two.
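For readers unfamiliar with the idea behind runtime type collection, here is a toy sketch of the general technique: record the types seen at call time, then use them as annotation candidates. This is not MonkeyType's implementation or API; the `record_types` decorator, the `observed` store, and the `join` function are all hypothetical names made up for this illustration.

```python
from __future__ import annotations

import functools
import inspect
from collections import defaultdict
from typing import Any, Callable, TypeVar, cast

F = TypeVar("F", bound=Callable[..., Any])

# Function name -> set of observed signatures, each a tuple of "name: type" strings.
observed: defaultdict[str, set[tuple[str, ...]]] = defaultdict(set)


def record_types(func: F) -> F:
    """Record the runtime types of arguments and return value on every call."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # include parameters left at their default values
        result = func(*args, **kwargs)
        seen = tuple(
            f"{name}: {type(value).__name__}"
            for name, value in bound.arguments.items()
        )
        observed[func.__name__].add(seen + (f"return: {type(result).__name__}",))
        return result

    return cast(F, wrapper)


@record_types
def join(parts, sep=", "):  # deliberately unannotated: types are observed instead
    return sep.join(parts)
```

After calling `join(["a", "b"])`, `observed["join"]` holds one entry recording `parts` as `list`, `sep` as `str` and the return value as `str`. Note that the element type of the list is lost here, which hints at why collected annotations can be incomplete and still need a pass with mypy and a human (or an LLM) afterwards.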

1 Like

Just to follow up on this post: Axim has just released guidance for working with AI code, and folks should be aware of it and of what their responsibilities are when using these tools in the Open edX org. :heart:

3 Likes