New on Data Points: A Workflow for LLM-Augmented Codebook Generation

March 3, 2026

Workflow for LLM-Assisted Codebook Generation

New on Data Points: Yuxin Liang, data scientist at DDDI, shares a practical workflow developed as part of "The Immigration Courts: Processing and Analyzing Data from The Executive Office for Immigration Review," a project in our Data Science for Social Good (DSSG) program.

The post walks through how to combine a manually curated codebook with targeted LLM inference to fill in missing variable metadata for EOIR immigration court records. The approach feeds the model structured context and explicit guardrails rather than asking it to generate a codebook from scratch.
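To give a rough sense of the idea, here is a minimal Python sketch, not the workflow from the post itself. Every name in it is an assumption made for illustration: the variable names are placeholders rather than real EOIR fields, `build_prompt` and `fill_missing` are hypothetical helpers, and `ask_model` stands in for whatever LLM client you use. Manually curated entries are kept untouched, and the model is asked only about variables whose description is missing.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical curated codebook: variable name -> description (None = missing).
# These names are placeholders, not actual EOIR fields.
CURATED: Dict[str, Optional[str]] = {
    "case_id": "Unique identifier for a case.",
    "hearing_date": "Date of the scheduled hearing.",
    "custody_code": None,
}

GUARDRAILS = (
    "Describe only the variable named in the context. "
    "Use only the context provided. "
    'If the context is insufficient, answer exactly "UNKNOWN".'
)


def build_prompt(variable: str,
                 codebook: Dict[str, Optional[str]],
                 sample_values: List[str]) -> str:
    """Combine guardrails with structured context for one variable."""
    # Only already-documented variables are passed along as context.
    known = {k: v for k, v in codebook.items() if v is not None}
    context = {
        "variable": variable,
        "known_variables": known,
        "sample_values": sample_values,
    }
    return GUARDRAILS + "\n\nContext:\n" + json.dumps(context, indent=2, sort_keys=True)


def fill_missing(codebook: Dict[str, Optional[str]],
                 samples: Dict[str, List[str]],
                 ask_model: Callable[[str], str]) -> Dict[str, dict]:
    """Keep manual entries as-is; query the model only for missing ones."""
    filled: Dict[str, dict] = {}
    for name, desc in codebook.items():
        if desc is not None:
            filled[name] = {"description": desc, "source": "manual"}
            continue
        answer = ask_model(build_prompt(name, codebook, samples.get(name, []))).strip()
        if not answer or answer == "UNKNOWN":
            # The model declined to guess: leave the gap for human review.
            filled[name] = {"description": None, "source": "unresolved"}
        else:
            filled[name] = {"description": answer, "source": "llm"}
    return filled
```

Tagging each entry with its source (`manual`, `llm`, or `unresolved`) keeps inferred descriptions distinguishable from curated ones, so they can be reviewed before anyone relies on them.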

Check out the full post here!

Interested in sharing your work? Contributions welcome!

Data Points showcases the work of Penn’s data science community through concise, engaging articles. We can help you craft your post with technical and editorial guidance so it's as impactful as the research it represents. To get started, just send us an email.