Jun 21, 2025

Vibe-coding Minecraft mods

and the lessons learned

If you aren't familiar with the extent of the Minecraft modding scene, it's big. There are mods with more thought put into them than many AAA games.

The best example is Create:

So I was pretty lost when I started playing for the first time in a long time.

The Problem

As some friends and I started playing the Divine Journey 2 modpack, I was overwhelmed by the number of modded blocks and items.

To give you a taste, there are 54 "generators" from 16 mods:

Finding them around our base was a pain. My friends knew where everything was and would tell me things like "put coal in the hopper that feeds the Coke Oven into the Blast Furnace by the biodiesel setup."

There's one obvious solution:

The Solution: more mods

I ended up creating two mods, with a mix of OpenAI Codex (the CLI), Claude Code, and Cursor to help me out:

  • TileFinder (GitHub, CurseForge): Search, filter, and easily locate TileEntities (fancy/complex blocks) from all mods
  • Unnamed Project Management mod: a multiplayer project management mod

I wrote no Java in the making of this project. More information on the models and the coding experience is below; you can collapse the details if you aren't interested in the mods.

TileFinder

Unnamed Project Management mod

Learnings for vibecoding

  • o3 is a great agentic model. Its price reduction makes it cheaper than Sonnet 4.
  • Sonnet did a good job when I tried it too, but I felt it was generally worse at using tools, for example to explore the code base. The exception was Claude Code, where Sonnet did great work.
  • Write rules files when you see common mistakes. I noticed all models kept grepping the code base and getting back a lot of built or otherwise unnecessary files, which needlessly takes up context. Adding rules files (files automatically injected into the context) lets you address these issues.
  • OpenAI Codex as a CLI is good-not-great. I didn't try their web offering, which is probably better, but I'm not sure you can set up an entire environment for it to have a real agentic feedback loop.
  • I could have made every mod in its entirety with any of the tools I tried. But, perhaps obviously, Cursor + o3 was by far the most effective combo for me as a developer.
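
As a rough illustration of the rules-file tip above, something like the following could steer an agent away from build output. Everything in it is an assumption rather than the file I actually used: each tool looks for its own file (for example AGENTS.md for Codex CLI, CLAUDE.md for Claude Code, or files under .cursor/rules for Cursor), and the directory names are the usual Gradle ones.

```text
# Search rules (hypothetical example)
- When searching the code base, only look in src/.
- Skip generated or downloaded files: build/, .gradle/, run/, and *.jar.
- Target Minecraft 1.12.2 only; do not use APIs from newer versions.
```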

Learnings for developing

  • Errors need to suggest fixes.
    • If there is no single clear fix, they can suggest multiple solutions, or explain in plain text why the issue occurred. They should explain the cause either way.
  • More (AI) tooling is needed around targeting specific package versions.
    • Minecraft is tricky because it has many versions with many breaking changes, and the LLMs are trained across all of them. I think one reason I had a lot of success is that I was targeting a fairly old version of Minecraft known for its mods (1.12.2).
    • I've had this same issue at work with npm packages and libraries. Tools/MCPs like Context7 can help substantially with this.
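
To make the "errors need to suggest fixes" point concrete, here is a minimal Java sketch of an exception that always explains the cause and optionally lists fixes. The class name `ActionableException` and its message layout are hypothetical, made up for illustration; they do not come from either mod or from any library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical exception that carries a plain-text cause and suggested fixes. */
class ActionableException extends RuntimeException {
    private final List<String> suggestions;

    ActionableException(String problem, String reason, List<String> suggestions) {
        // Build the full message once so logs and stack traces show it as-is.
        super(format(problem, reason, suggestions));
        this.suggestions = Collections.unmodifiableList(new ArrayList<>(suggestions));
    }

    /** The suggested fixes, in the order they were given. */
    List<String> suggestions() {
        return suggestions;
    }

    /** Always explains why; lists fixes only when there are any. */
    static String format(String problem, String reason, List<String> suggestions) {
        StringBuilder sb = new StringBuilder(problem);
        sb.append("\nWhy: ").append(reason);
        if (!suggestions.isEmpty()) {
            sb.append("\nPossible fixes:");
            for (String suggestion : suggestions) {
                sb.append("\n  - ").append(suggestion);
            }
        }
        return sb.toString();
    }
}
```

Throwing it with a problem, a reason, and two suggestions yields a message with the problem on the first line, a "Why:" line, and a bulleted "Possible fixes:" list. That gives a human or a coding agent something to act on without digging through the stack trace.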

In total, the two mods cost me about $8 in usage.