A few months back (at the start of this blog) I was thinking about interesting things you can find in C++, then I realized one thing: the keyword
class doesn’t really have a good reason to exist!
This may seem a bit harsh put that way, but I will explain my statement later in the article.
Anyway, studying the reason for the word class to exist led me to look into the history of the language. It was very interesting, and I really enjoyed it.
So I decided to write a mini-series of articles about the history of C++ aiming to present concepts that may be outdated today, in C++20, and explain why some strange or debatable things were made that way in the past.
Hello! I was talking to a friend yesterday who was studying for a programming interview and trying to learn some algorithms basics.
The topic of quadratic-time vs linear-time algorithms came up, and I thought it would be fun to write about here, because avoiding quadratic-time algorithms isn’t just important in interviews – it’s sometimes good to know about in real life too! I’ll explain what a “quadratic-time algorithm” is in a minute :)
Here are the 3 things we’ll talk about:
quadratic time functions are WAY WAY WAY slower than linear time functions
sometimes you can make a quadratic algorithm into a linear algorithm by using a hashmap
this is because hashmap lookups are very fast (basically instant!)
I’m going to try to keep the math jargon to a minimum and focus on real code examples and how fast/slow they are.
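To make the first two points concrete, here is a sketch (my own example, not from the post) of the same question – “does this list contain a duplicate?” – answered both ways:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Quadratic time: compare every element against every later element.
// For n items this does on the order of n*n comparisons.
bool has_duplicate_quadratic(const std::vector<std::string>& items) {
    for (size_t i = 0; i < items.size(); ++i)
        for (size_t j = i + 1; j < items.size(); ++j)
            if (items[i] == items[j]) return true;
    return false;
}

// Linear time: one pass, remembering what we've already seen in a hash set.
// Each insert/lookup is (on average) constant time, so the whole thing is O(n).
bool has_duplicate_linear(const std::vector<std::string>& items) {
    std::unordered_set<std::string> seen;
    for (const auto& item : items)
        if (!seen.insert(item).second)  // insert fails -> already seen
            return true;
    return false;
}
```

Both functions give the same answer; the difference only shows up as the input grows. At a million items, the quadratic version does roughly a trillion comparisons while the linear one does a million hash lookups.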
Engineering alone can’t fix what’s wrong with the internet - a short reflection on the lessons we learn by caring for the software we create.
This article attempts to give a sort of ‘orientation tour’ for people whose previous programming background is in high (ish) level languages such as Java or Python, and who now find that they need or want to learn C.
C is quite different, at a fundamental level, from languages like Java and Python. However, well-known books on C (such as the venerable Kernighan & Ritchie) tend to have been written before Java and Python changed everyone’s expectations of a programming language, so they might well not stop to explain the fundamental differences in outlook before getting into the nitty-gritty language details. Someone with experience of higher-level languages might therefore suffer a certain amount of culture shock when picking up such a book. My aim is to help prevent that, by warning about the culture shocks in advance.
This article will not actually teach C: I’ll show the occasional code snippet for illustration and explain as much as I need to make my points, but I won’t explain the language syntax or semantics in any complete or organised way. Instead, my aim is to give an idea of how you should expect C to differ from languages you previously knew about, so that when you do pick up an actual C book, you won’t be distracted from the details by the fundamental weirdness.
I’m mostly aiming this article at people who are learning C in order to work with existing C programs. So I’ll discuss ways in which things are commonly done, and things you’re likely to encounter in real-world code, but not things that are theoretically possible but rare. (I do have other articles describing some of those.)
Agilebits recently caused a stir with their announcement that they’ve rewritten 1Password 8 as a cross-platform Electron app, replacing their well-loved native Mac app. The takes came hot and fast. Like many developers, I love and appreciate a well-crafted native UI, and I’ve been somewhat skeptical of the consistent trend towards cross-platform app UIs.
Now, the discourse around cross-platform app technologies has traditionally revolved around a simple idea: cross-platform development is cheaper, while native development leads to better apps.
I recently read Ruben Schade’s I’m not sure that UNIX won (via) and had a number of reactions to it. One of them is about the portability of programs among Unixes, which is one of the issues that Schade sees as a problem today. Unfortunately, I have bad news for people who are yearning for the (good) old days. The reality is that significant Unix programs have never been really portable between Unix variants, and if anything today is at an all-time high for program portability by default between Unixes.
Back in the day (the late 1980s and early 1990s specifically), one of the things that Larry Wall was justly famous for was his large, intricate, and comprehensive configure scripts, which made rn and Perl build on pretty much any Unix and Unix-like system that you could name. Wall’s configure script approach was generalized and broadened by GNU Autoconf, GNU Autotools, and so on. These tools did not automatically make your complex programs portable between different Unixes, but they gave you the tools you could use to sort out how to achieve that, and to automatically detect various things you needed to do to adapt to the local Unix (and if you used some of them, you automatically got pointed to the right include directories and the right libraries to link with).
It is not difficult to find some incredibly shitty takes on Electron, and every time it boils down to: It’s slow. Downloads are huge, and it uses a lot of memory. Electron apps are just websites. Developers who use Electron are taking the lazy or easy approach to cross-platform development. Native apps are just better in every single way.
And somehow, these arguments often come from Apple fans when they discover one of their apps isn’t “native” or when a macOS favourite is considering moving to Electron. How dare they!
And on the surface, I agree with pretty much everything that people say about Electron. And at the same time, I don’t care at all. And neither should you.
Let’s take a look at some of these arguments:
VLA is an acronym for variable-length array, which is an array (an actual array, not just a block of memory that can act like one) whose size (in at least one dimension) is determined at runtime rather than at compile time.
Languages supporting VLAs in one form or another include: Ada, Algol 68, APL, C, C#, COBOL, Fortran, J and Object Pascal. As you may notice, besides C and C#, those aren’t languages one would call mainstream nowadays.
VLAs were introduced with the C99 revision of the C standard. At first glance they seem convenient and efficient, but it’s just an illusion. In reality they are a source of constant issues. Truly a stain on an otherwise really good revision.
As you could have guessed from the quote at the beginning, the project which used to rely on VLAs quite extensively is none other than the Linux kernel. Maintainers spent a lot of effort getting rid of every VLA, and as of version 4.20 (released in 2018) the kernel is completely VLA-free.
In some domains of programming it’s common to want to write a data structure or algorithm that can work with elements of many different types, such as a generic list or a sorting algorithm that only needs a comparison function. Different programming languages have come up with all sorts of solutions to this problem: from just pointing people to existing general features that can be useful for the purpose (e.g. C, Go) to generics systems so powerful they become Turing-complete (e.g. Rust, C++). In this post I’m going to take you on a tour of the generics systems in many different languages and how they are implemented. I’ll start from how languages without a special generics system like C solve the problem and then I’ll show how gradually adding extensions in different directions leads to the systems found in other languages.
One reason I think generics are an interesting case is that they’re a simple case of the general problem of metaprogramming: writing programs that can generate classes of other programs. As evidence I’ll describe how three different fully general metaprogramming methods can be seen as extensions from different directions in the space of generics systems: dynamic languages like Python, procedural macro systems like Template Haskell, and staged compilation like Zig and Terra.
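As a taste of one point in that design space, here is a small C++ function template (my own toy example, not taken from the post) showing the monomorphization approach the tour covers:

```cpp
#include <vector>

// A function template: the compiler generates ("monomorphizes") a separate,
// fully typed copy of this function for each T it is instantiated with,
// so largest<int> and largest<double> end up as distinct compiled functions.
template <typename T>
T largest(const std::vector<T>& xs) {
    T best = xs.front();  // precondition: xs is non-empty
    for (const T& x : xs)
        if (best < x) best = x;
    return best;
}
```

The only requirement on T is implicit: it must support operator<. Making such requirements explicit and checked is exactly where the more elaborate generics systems in the tour come in.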
Say you’re building a new (compiled) programming language from scratch. You’ll inevitably have to debug programs written in it, and worse, many of these problems will lead you into deep magic, as you uncover problems with your compiler or runtime. And as you find yourself diving into the arcane arts, your tools may be painfully lacking: how do you debug code written in a language for which debuggers and other tooling simply has not been written yet?
In the implementation of my own programming language, I have faced this problem many times, and developed, by necessity, some skills around debugging with crippled tools that may lack an awareness of your language. Of course, the ultimate goal is to build out first-class debugging support, but we must have a language in the first place before we can write tools to debug it. If you find yourself in this situation, here are my recommendations.
Everyone uses math libraries (i.e., libm), which provide approximations for elementary functions. Surprisingly (or not), mainstream math libraries (i.e., Intel’s and GCC’s libm) do not produce correct results for several thousand inputs! Developers of modern software systems are seldom aware of these errors, which affects the reproducibility and portability of software systems. This blog post describes our new method for synthesizing elementary functions that produces correct results for all inputs but is also significantly faster than the state-of-the-art libraries, which have been optimized for decades.
Modern C++ has a lot of cool features. Move semantics means passing around structs in functions is cheap.
std::shared_ptr means I don’t have to manage any memory; no more delete! (But try as I might to understand std::unique_ptr, I’m just not there yet.) The syntax has also gotten some treatment with auto and tuple destructuring.
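The features listed above can all be seen in a few lines; this is my own illustrative snippet, not code from the JSON library itself:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Exercises the modern C++ features mentioned above;
// returns true if everything behaved as expected.
bool modern_cpp_demo() {
    // Move semantics: transfer b the buffer that a owned, instead of copying.
    std::vector<int> a = {1, 2, 3};
    std::vector<int> b = std::move(a);  // a is left valid but unspecified
    if (b.size() != 3) return false;

    // std::shared_ptr: reference-counted ownership, no manual delete.
    auto p = std::make_shared<std::string>("hello");
    auto q = p;  // refcount is now 2; freed when the last owner goes away
    if (p.use_count() != 2) return false;

    // auto + structured bindings ("tuple destructuring").
    std::map<std::string, int> m = {{"answer", 42}};
    for (const auto& [key, value] : m)
        if (key != "answer" || value != 42) return false;

    return true;
}
```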
In order to test out this modern C++ I wanted a small but meaningful project that operates on very dynamic data. The two that always come to mind are JSON parsers or Lisp interpreters.
This post walks through writing a basic JSON library from scratch using only the standard library. The source code for the resulting library is available on GitHub.
The biggest simplification we’ll make is that rather than full JSON numbers, we’ll only allow integers.
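To make the target shape concrete before diving in, here is one possible in-memory representation of a JSON value under that simplification; the names below are my own sketch, not necessarily the types the library ends up with:

```cpp
#include <string>
#include <utility>
#include <vector>

// A JSON value is exactly one of: null, bool, integer (our simplification,
// instead of full JSON numbers), string, array, or object.
// Modeled here as a tagged struct so the type can refer to itself.
struct JsonValue {
    enum class Type { Null, Bool, Int, String, Array, Object };
    Type type = Type::Null;

    bool boolean = false;
    long integer = 0;
    std::string string;
    std::vector<JsonValue> array;  // std::vector tolerates an incomplete element type here
    std::vector<std::pair<std::string, JsonValue>> object;  // key/value pairs, in document order
};
```

A parser then just needs to turn text into a tree of these, and a serializer walks the tree back out to text.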
Mafia: Definitive Edition (2020) is a remake of the much-loved gangster classic Mafia (2002), originally released for PS2 and Xbox. The game is relatively linear and very story-focused; I personally found its narrative gripping and worthy of comparison to Scarface or Goodfellas. Hangar 13 use their own technology for open worlds and stories, previously used for Mafia III, to bring Tommy and the Salieri family to life. It is a DX11 deferred engine on PC, and RenderDoc 1.13 was used to capture and analyze.
C++20 introduces concepts as a way to write powerful self-documenting templates. At their core, concepts are logical binary expressions that can be used to constrain the template parameters of any class or function template. These logical expressions are evaluated at compile time to determine whether the template parameters satisfy the given constraints.
The purpose of this tutorial is to be a definitive introduction point for C++20 concepts.
Unicode definitely added a lot of complexity to string handling, and people who use languages with ASCII alphabets exclusively may think it’s unjustified. However, I’m a non-ASCII language speaker who’s old enough to remember the alternatives, and the alternatives are far worse than the complexity of Unicode.
For the record, I got started with computers in early-2000s Russia. The Russian language uses an alphabet based on the Cyrillic script. As far as non-ASCII languages go, it’s relatively simple: 33 letters, each has uppercase and lowercase versions, and the upper/lower case transform is invertible. Since it’s just 33 letters, you can fit it in an 8-bit encoding and still have room for pseudographics.
The catch is that, for a time, Russian had three almost equally common encodings.
A few years ago, I came across the blog post Why Why Functional Programming Matters Matters. It was a major factor in my decision to invest heavily into learning functional programming, so it’s safe to say it’s had a huge impact on my development as a software engineer. In fact, I highly recommend reading everything Reginald Braithwaite has ever written; I can’t claim to have read it all, but I can say that I’ve never regretted any minute spent doing it.
Today, however, I want to talk about the paper that blog post talks about. It is titled “Why Functional Programming Matters”, and if you have never read it, you should go read it right now. Don’t worry, I’m not going anywhere. As a piece of static text, I have all the patience in the world.
As a brief summary, the paper argues that functional programming yields better programs because it adds constraints on programs, which in turn lets the programmer make assumptions about program segments (i.e. functions), which increases reasoning ability, and thus composability, and therefore makes it possible to use new types of “glue” between code segments, thereby improving code reuse. I’m not going to repeat all of the arguments, you just read them. (Seriously, go read the paper if you haven’t.)
A couple of weeks ago I decided to reread the paper, to check whether everything I’ve learned in the dozen years since I first read it would let me gather any new insight. I decided that a superficial reading would not be enough, and that to force myself to really read everything carefully, I would reimplement the code samples. I also thought this would be a nice test of how Clojure holds up to Hughes’ definition of functional programming.
This has led me to a few realizations that I had not made while simply reading the paper, which is the subject of this blog post. If you’re interested in more details than are presented here, you can take a look at the entire Clojure code here.
There was a SEV review meeting once upon a time, and in it, we had reviewed some incident where something bad happened involving something that ran as root. I don’t remember the finer points of that one any more, but the resident VP who ran the show wondered aloud how much “root exposure” we had in our infra in general. That got my attention.
I decided to try to get an answer to it. While in the meeting, I hit up my little dataset of everything running on any machine in production, then narrowed it down to anything running as root (that is, uid 0), and also with some (TCP) network listening ports. I figured those would be the easiest to “pop”. Think “SELECT … FROM … WHERE uid = 0 AND ports IS NOT NULL” type of thing.