Demystifying Code Generation: Building Programs That Build Programs
July 17, 2023

Keith Casey
ngrok

Code generation is the art of writing programs that write other programs. Its most common use is generating libraries. In those scenarios, you have a fixed specification in a domain-specific language (DSL), and a code generator uses it to create the functions or API calls that represent individual requests or pieces of functionality. While code generation seems simple at first, there are many sharp corners and hidden surprises in anything beyond the most trivial scenarios.

But before you decide whether to build a code generator or not, exploring the depths, complexity, and tradeoffs is key. Sometimes, this exploration will lead you to realize your time is better spent extending common tools to meet your unique use case. But there are many scenarios where creating a custom code generator is your best option. Before you explore, let’s consider one concrete scenario where you’d want a code generator and the best practices to make it useful in the long term.

When to Write a Custom Code Generator

Imagine creating a web service that requires email address validation. The "correct" regular expression for validating email addresses is wildly complex and obscure. That said, a handful of simpler problems are easy to check for, like too many characters or a missing "@", so you can start there. Initially, you rely on standard library functions like fmt.Errorf to report errors, but you find that they don’t give the user enough information about the error. To counter that, you add two unique error codes, "email too long" and "invalid email," to make it easier for the user to understand why their email address won’t validate.

You could embed the unique error codes as ad hoc strings within existing strings, but this lacks structure and depends on downstream developers or the user to understand your new pseudo-convention. Alternatively, you could create a function that takes a unique error code as a parameter, but that still relies on strings and will lead to inconsistent implementations. So how can you improve on the implementation?

One solution is to generate unique functions for each error you want to handle. This allows for specialized error types, facilitating observability and enabling higher-level code components to work with specific error conditions. However, manually defining all these functions is time-consuming, especially for programs with hundreds or thousands of errors.

This is where writing a custom code generator comes in handy. By defining errors in a reusable format like a YAML file, you can generate the necessary functions. This approach simplifies error management, promotes consistency, lets you reuse these checks across projects, and enables the generation of functions for multiple programming languages. You can create a unified error repository for the software, ensuring clear communication and streamlined error handling.

Code Generation Best Practices

When writing a custom code generator, first consider the scope and complexity of creating it. There are always tradeoffs between customization and simplicity, and you should understand them up front before diving in. Once you decide to write a custom code generator, keep these best practices in mind.

1. Use comments to prevent edits: You should use comments to give other developers instructions on how to use and edit the code. Comments can tell other teammates not to modify the generated code directly and point them to the source of truth where they should make changes.
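In Go, one common way to apply this is the "Code generated ... DO NOT EDIT." marker line described in the go generate documentation. The sketch below writes such a header and checks for it. The Header and IsGenerated helpers, and the errgen and errors.yaml names, are hypothetical, and the detection is simplified (it ignores block comments).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the marker pattern documented for go generate.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// Header builds the comment block for the top of a generated file.
// It names the tool and points readers to the source of truth.
func Header(tool, source string) string {
	return fmt.Sprintf("// Code generated by %s from %s. DO NOT EDIT.\n", tool, source) +
		fmt.Sprintf("// To change this file, edit %s and re-run %s.\n\n", source, tool)
}

// IsGenerated reports whether the marker appears before any code.
func IsGenerated(src string) bool {
	for _, line := range strings.Split(src, "\n") {
		if generatedRE.MatchString(line) {
			return true
		}
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !strings.HasPrefix(trimmed, "//") {
			return false // reached code before finding the marker
		}
	}
	return false
}

func main() {
	file := Header("errgen", "errors.yaml") + "package apierrors\n"
	fmt.Print(file)
	fmt.Println(IsGenerated(file)) // prints "true"
}
```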

2. Isolate generated files: Developers should separate generated files from human-written code by using distinct suffixes or separate directories. Isolation enhances developer experience and makes it easier to identify machine-generated files, simplifying tasks like file filtering or removing outdated generated files.

3. Use a consistent template structure: A consistent template file is crucial for writing an effective code generator because it promotes a standardized format, reduces duplication, and allows for easy modification and maintenance. It facilitates collaboration among developers and enables extensibility for future enhancements.

How to Leverage Code Generation

If you’re considering code generation, start small. Write custom tooling that you never plan on releasing to the world. Explore, play, break things, figure out strengths and weaknesses and find the bounds of what’s possible. That way, when you inevitably consider code generation in practice, you’ll better understand the effort it takes to write and maintain a custom code generator or tailor something that already exists.

With the right tools and practices, code generation is an accessible and valuable tool in your toolbox. It offers a straightforward approach to automating repetitive tasks and establishing conventions. It’s one of the many ways you can solve complex problems, and it just might help you do it in fewer lines of code.

Keith Casey is Director of Product Marketing at ngrok