The C strcpy function is a common sight in typical C programs. It’s also a source of buffer overflow defects, so linters and code reviewers commonly recommend alternatives such as strncpy (difficult to use correctly; mismatched semantics), strlcpy (non-standard), or C11’s optional strcpy_s (no correct or practical implementations). Besides their individual shortcomings, these answers are incorrect. strcpy and friends are, at best, incredibly niche, and the correct replacement is memcpy.

If strcpy is not easily replaced with memcpy then the code is fundamentally wrong. Either it’s not using strcpy safely or it’s doing something dumb and should be rewritten. Highlighting such problems is part of what makes memcpy such an effective replacement.

Note: Everything here applies just as much to strcat and friends.
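As a minimal sketch of the replacement argued for above, assuming the caller already knows the source length and the destination capacity: the helper name `copy_string` and its signature are hypothetical, chosen only for illustration. The point is that the length check has to be written down before `memcpy` can be called at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the article): copy `len` bytes of `src`
 * into `dst`, which has room for `cap` bytes, and null-terminate.
 * Returns 1 on success, 0 if the string plus terminator would not fit. */
static int copy_string(char *dst, size_t cap, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len >= cap) {
        return 0;            /* no room for len bytes plus the terminator */
    }
    memcpy(dst, src, len);   /* length is known, so no scanning for '\0' */
    dst[len] = '\0';
    return 1;
}
```

A call such as `copy_string(buf, sizeof buf, "hello", 5)` succeeds for an 8-byte `buf`, while a 9-byte source is rejected and leaves `buf` untouched.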


Whenever we speak about indexes, especially in the PostgreSQL context, there is a lot to talk about: B-tree, Hash, GiST, SP-GiST, GIN, BRIN, RUM. But what if I told you that even the first item in this list alone hides an astonishing number of interesting details and years of research? In this blog post I’ll try to prove this statement, and we will be concerned mostly with the B-tree as a data structure.

OpenTTD is a free, open-source recreation of the Chris Sawyer masterpiece: Transport Tycoon. I lost many hours playing this after opening it for Christmas in 1994 (alongside Outpost and Global Domination – thanks, dad!).

The three goals for this project:

  • Share steps that beginners can follow to take apart any large open-source project

  • Do it by example with OpenTTD

  • Add an active, intermediate-difficulty project to my library of decoded work

This project deviates from my usual practice of documenting every line of code. It’s not practical to take apart over 200,000 lines, especially for an actively developed project. Instead, I will pick and choose the key ideas that an aspiring contributor may want to dig in to. Above all, this exercise should be transferable to other projects.

Here’s the plan of attack:

  1. The setup

  2. Code overview

  3. Organization

  4. Game Object Model

  5. Engine core

  6. Startup & Initialization

  7. Game loop

  8. Honorable Mentions

  9. Appendix


Welcome to part 1 of Modern C++ for C Programmers, please see the introduction for the goals and context of this series.

In this part we start with C++ features that you can use to spice up your code ‘line by line’, without immediately having to use all 1400 pages of ‘The C++ Programming Language’.

Various code samples discussed here can be found on GitHub.

Reflection is often presented as a feature that makes software harder to understand. In this article, I will present ways to approximate some level of static reflection in pure C++, thanks to C++17 and C++20 features, show how that tool can considerably simplify a class of programs and libraries, and more generally enable ontologies to be specified and implemented in code.

Have you ever envisioned the daily C preprocessor as a tool for some decent metaprogramming?

Have you ever envisioned the C preprocessor as a tool that can improve the correctness, clarity, and overall maintainability of your code, when used sanely?

I did. And I have done everything in my power to make it real.

Meet Metalang99, a simple functional language that allows you to create complex metaprograms. It is a header-only macro library, so all you need to set it up is -Imetalang99/include and a C99 compiler. However, today I shall focus only on two accompanying libraries – Datatype99 and Interface99. Implemented atop Metalang99, they unleash the potential of preprocessor metaprogramming at full scale, and are therefore more useful for an average C programmer.

I shall also address a few captious questions regarding compilation times, compilation errors, and applicability of my method to the real world.

Nuff said, let us dive into it!


An important feature of transactional databases like SQLite is “atomic commit”. Atomic commit means that either all database changes within a single transaction occur or none of them occur. With atomic commit, it is as if many different writes to different sections of the database file occur instantaneously and simultaneously. Real hardware serializes writes to mass storage, and writing a single sector takes a finite amount of time. So it is impossible to truly write many different sectors of a database file simultaneously and/or instantaneously. But the atomic commit logic within SQLite makes it appear as if the changes for a transaction are all written instantaneously and simultaneously.

SQLite has the important property that transactions appear to be atomic even if the transaction is interrupted by an operating system crash or power failure.

This article describes the techniques used by SQLite to create the illusion of atomic commit.
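To make the all-or-nothing idea concrete, here is a deliberately simplified sketch. It is not SQLite's mechanism: it swaps in a different and much cruder technique, writing a complete new copy to a temporary file and then renaming it over the original, and it assumes a POSIX system where `rename` replaces the target atomically. The function name `replace_file_atomically` and both path parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, NOT SQLite code: replace the contents of `path`
 * with `data` so that a reader sees either the old file or the new one.
 * Returns 0 on success, -1 on failure (the old file is left in place). */
static int replace_file_atomically(const char *path, const char *tmp_path,
                                   const void *data, size_t len)
{
    assert(path != NULL && tmp_path != NULL);

    FILE *f = fopen(tmp_path, "wb");
    if (f == NULL) {
        return -1;
    }

    /* Write everything, then push it out of stdio buffers and OS caches. */
    int ok = fwrite(data, 1, len, f) == len
          && fflush(f) == 0
          && fsync(fileno(f)) == 0;
    if (fclose(f) != 0) {
        ok = 0;
    }
    if (!ok) {
        remove(tmp_path);
        return -1;
    }

    /* The switch from old contents to new contents happens in this one step. */
    if (rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

This only covers replacing one whole file; full durability after a power failure would typically also need the containing directory to be synced, which the sketch omits.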


Shell languages such as Bash excel at certain tasks, such as gluing programs together or quickly automating a set of command line steps. In contrast to those strengths, using a shell to parse an INI config file is a bit like writing a poem in mud: you might succeed, but the result will probably be inscrutable and your swear jar will be full! As this wonderful Stack Overflow post attests, there are many different ways to parse an INI file in Bash, but few of the answers provided are elegant.

So if you have a task poorly suited to Bash, what are your options?

  1. Choose another language for the task? (Perhaps sensible, but not always fun.)

  2. Write a custom Bash builtin to extend Bash for the task? (Spoiler, this is the route we will choose!)


Cyclomatic Complexity was initially formulated as a measurement of the “testability and maintainability” of the control flow of a module. While it excels at measuring the former, its underlying mathematical model is unsatisfactory at producing a value that measures the latter. This white paper describes a new metric that breaks from the use of mathematical models to evaluate code in order to remedy Cyclomatic Complexity’s shortcomings and produce a measurement that more accurately reflects the relative difficulty of understanding, and therefore of maintaining methods, classes, and applications.


Many years ago, Peter Norvig wrote a beautiful article about creating a lisp interpreter in Python. It’s the most fun tutorial I’ve seen, not just because it teaches you about my favorite language family (Lisp), but because it cuts through to the essence of interpreters, is fun to follow and quick to finish.

Recently, I had some time and wanted to learn Rust. It’s a beautiful systems language, and I’ve seen some great work come out from those who adopt it. I thought, what better way to learn Rust, than to create a lisp interpreter in it?

Hence, Risp — a lisp in rust — was born. In this essay you and I will follow along with Norvig’s Lispy, but instead of Python, we’ll do it in Rust 🙂.


We wrote a Brainfuck compiler back in April. That one developed a simple architecture to support multiple backends. Today, I want to write a new Brainfuck compiler. One that can only run on a single system, OpenBSD/amd64. And one that is as small as I can possibly make it. I broke into the sub-256 bytes club with my compiler, so I thought it was worthwhile sharing and talking about how I was able to pull it off.

While this works on my machine, subtle differences in how your machine works may cause this compiler not to work. Make sure to check your machine’s behavior if there are any differences!

We are optimizing exclusively for size. That means that we may make decisions that are less performant than other ways of writing a Brainfuck compiler. That’s OK.

The assembly code will be written using AT&T syntax. Last time I had a blog post with assembly code, someone complained that it was not Intel syntax. I am not much bothered by either and I teach my undergraduates both syntaxes because I think it is important to be able to read both. If you are an Intel syntax person, take this post as an opportunity to improve your AT&T syntax skills.


The core message that I want people to take away is that there is potentially a huge amount of value to be unlocked by replacing SQL, and more generally in rethinking where and how we draw the lines between databases, query languages and programming languages.

In the title, I used the word “const” because const is a keyword every C++ developer knows (I hope). However, the generic concept this article is about is actually called “immutability”, not “constness”. const is a keyword used in some languages (C++, JavaScript, etc.), but the concept of immutability exists in other languages too (in some of them, such as Rust, there is not even a keyword attached to the concept).

Over several years I’ve had several conversations with people about static integer types in programming languages. Given the number of subtleties in the design, and pitfalls in the use, of integer types it’s not surprising that most programmers are unaware of at least some of them. However, what is perhaps more surprising is that most programming language designers and implementers aren’t fully aware of some of the subtleties and pitfalls either. That leads to languages and their implementations baking in unintended, and sometimes unfortunate, design decisions. At the end of conversations on this topic, I’ve sometimes said “this stuff is hard and someone should think more about this and write it up” but it’s an unsatisfying conclusion. So even though I don’t have all the answers – indeed, I’m fairly sure that I don’t know all the questions – I’m going to jot down my current thinking on this subject.
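A few of the better-known pitfalls can be shown in C. This is only an illustrative sketch, not a list taken from the article, and the function names are hypothetical. It assumes nothing beyond standard C: narrow operands are promoted to `int` before arithmetic, unsigned arithmetic wraps modulo 2^N, and mixing `int` with `unsigned` converts the signed operand.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Both uint8_t operands are promoted to int before the addition,
 * so the sum does not wrap at 8 bits. */
static int add_u8_promoted(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Converting the int result back to uint8_t reduces it modulo 256. */
static uint8_t add_u8_truncated(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Unsigned arithmetic is defined to wrap, so UINT_MAX + 1 is 0. */
static unsigned wrap_unsigned_max(void)
{
    return UINT_MAX + 1u;
}

/* In a mixed comparison the int is converted to unsigned, so -1
 * becomes UINT_MAX and the comparison is false (returns 0). */
static int minus_one_less_than_one_unsigned(void)
{
    int s = -1;
    unsigned u = 1u;
    return s < u;
}
```

With `a = 200` and `b = 100`, the promoted sum is 300 while the truncated one is 44, even though both functions perform the same addition.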
