Why The Forum Was Better In The Old Days

You may have noticed, if you’ve participated in online communities or forums*, that the old guard always thinks the forum used to be better, and simultaneously the newer members often think that it’s pretty great as it is now. I propose that this is a necessary consequence of having a community.

The Beginning

You want to start a forum or blog or whatever to talk about something that’s interesting to you and doesn’t have a lot of interesting discussion elsewhere that you can find. Let’s say, for the sake of argument, the topic you’re really interested in is Shakespeare’s poetry (and, again for the sake of argument, let’s pretend there aren’t a thousand other places that people like to discuss this). So you start up a blog and you post really interesting thoughts about the Sonnets, and you engage with the academic criticism, and so on. People start noticing that you’re insightful and knowledgeable, and you start getting regular readers commenting on your posts, saying things like, “hey, that was really interesting – it reminded me of this paper I read recently, what do you think?” and in general a conversation is started and a community begins to form.

Then one day, one of your community members links to a blog post they found interesting about Edmund Spenser. Spenser is also an important part of the canon of 16th century English poetry, so you let it slide, and people start talking about Spenser. They bring up interesting points and are friendly with each other, so all is well.

The Spiral

Unfortunately, these people were too interesting, and now you have a regular “Spenser thread” commented on each of your new posts where people congregate to discuss The Faerie Queene or whatever. This isn’t the end of the world — most of the people in your community are still mostly discussing Shakespeare, after all, and if they’re getting to know each others’ tastes a bit and finding more common interests to discuss, isn’t that a good thing?

Before you know it, people who are only interested in Spenser are popping by your comment threads, and they’ve only ever read Hamlet and none of the sonnets! You’re a little miffed by this, but your Shakespeare-focused fans who also like Spenser think they’re adding a lot to the conversations, so you let it slide yet again.

And so on. You get someone talking about John Donne, and you get threads derailed by conversations on T.S. Eliot’s criticism of Hamlet, and then someone starts talking about how great Eliot’s poetry was, and suddenly your community isn’t about Shakespeare anymore! It’s a bunch of relatively like-minded people just talking about poetry in general.

And There We Have It

Now the people who joined at the beginning, who are truly fanatical about Shakespeare, start bemoaning that there’s hardly any good discussion of Henry IV anymore, and who gives a shit about Rembrandt or whatever has come up the last two weeks, and why is this community such a shithole these days?

And the answer is: it’s become a community, and communities are tied together by friendship and conversation and humour and so on, not just shared devotion to a particular topic. The small number of people who really just want to discuss Shakespeare constantly are massively outnumbered by the large number of people who do genuinely love Shakespeare but want to talk about lots of other stuff, too, with other people who “get” them.

So? What’s wrong with communities?

Nothing! But I think this pattern is approximately inevitable unless you take a very strict stance from day one in terms of what people can discuss in your forum, and then you end up with a very small group of people who don’t feel especially connected to one another. I think your choices are basically either to have a small, focused group that is extremely insular and not socially connected, or to have a sprawling, loosely-connected group of people that just starts to feel like any regular community over time.


* I imagine this occurs in meatspace communities, too, but I’m going to stick to online ones for this post.

Posted in sociology, Uncategorized | Leave a comment

Rap Quines

A Quine is a form of poetry in which a poem, through formalism or sometimes more literally, states that it is going to state part of itself, and then does so. This is related to the modern programming art form of a quine, which is a computer program that outputs itself when run.

Rap in particular offers some notable quines and quine-forms, some of which I would like to present today. The simplest and perhaps most illuminating example is the following, a construction taken from a not-entirely-respectful caricature of rap to its obvious conclusion; that is:

My name is Alexander, and I’m here to say:

“My name is Alexander, and I’m here to say!”

In this short rap verse, line 1 is a promise to the audience that the narrator has something to say; in line 2 the tension is momentarily broken by revealing what it is Alexander set out to say. However, then we wonder: has Alexander indeed said, “My name is Alexander and I’m here to say,” or is this a promise of future effusion? But we think back to the first line, and realize the narrator has indeed said what he promised to, in addition to promising to say it. Thus the quine is complete, and the effect is one of whiplash between future promises and their already-past satisfaction.

We will scrutinize one more example, this one from popular culture: Taio Cruz’s spectacular self-referential and obviously heavily quine-influenced song, “Dynamite.” In it, Cruz masterfully intertwines promises to act or explain with both previous and past references to performing that act or giving the promised explanation. To start with, the song opens with the following:

I throw my hands up in the air sometimes,

Saying AYO! Gotta let go!

This is already a complex lyrical dance. We have what begins as a simple opening clause: “I throw my hands up in the air,” a relatively standard gesture of celebration. This immediately comes down with the addition of the word “sometimes” — is Cruz throwing his hands up now, and shouting the declaration “gotta let go,” or merely letting us know that he sometimes does? Is this describing an action, or foreshadowing later revelations?

To skip ahead somewhat (Cruz did need to get this masterful formalist poetry on the airwaves after all; some of it is straightforward), we find the following in the chorus:

‘Cause we gon’ rock this club, we gon’ go all night,
We gon’ light it up like it’s dynamite!
‘Cause I told you once, now I told you twice,
We gon’ light it up like it’s dynamite!

The chorus on its own is very nearly a quine! The first line is somewhat out of place without the context of the rest of the song serving as setting, but otherwise this holds its own amongst historical quines. It inverts the normal poetical structure of a quine somewhat: normally, as in the first example we saw, a quine explains what it’s going to say before saying it (or simultaneously). In this case, however, we have a symmetry around line 3, reminiscent of classic presentation advice: first Cruz says what he’s going to say, then he explains very clearly what’s up next: he’s going to repeat himself. And he faithfully does.

Interestingly, this refrain is repeated word for word later in the song. Cruz is trying to get us to carefully evaluate the number of times he’s told us we’re going to “light it up like it’s dynamite,” by saying it for the third time, then claiming he “told [us] once, now [he’ll tell us] twice”, but actually following up by telling us for the fourth time! In one sense, this is inaccurate, but since when is poesy supposed to be factual? He is both telling us another two times, and at the same time in a sense resetting us to the earlier appearance of the lines, reminding us of the cyclical, symmetric nature of his proclamations.

I don’t want to ruin the fun of analyzing all the depth in this poem for you, but let me leave you with a final observation. Harken back to that ambiguity from the introduction: is Cruz declaring that he is raising his hands, or suggesting his intention to, at a later point? While I cannot claim a final answer to this enigma, I will say that whatever his earlier status was, he surely raises his hands during the triumphant bridge:

I’m gonna put my hands in the air!
Hands, hands in the air!
Put your hands in the air!

This is, while not the end of the song, the moment where the potential energy set up at the beginning (and repeated before each chorus) is definitively resolved, allowing the listener to release her breath. Cruz insists repeatedly that he “sometimes” puts his hands in the air. Here, though, it is realized: in another masterful use of three-line symmetry, Cruz is first reaching tentatively, as if unsure of himself, to raise his own hands. In the middle line, his hands are clearly in the air, and in the final line he turns it around on us: now you, too, must put your hands on the air and join in the aura of celebration.

It is worth noting that, enhancing the reflection about the middle line here, the middle line itself can be read as a part with either of the others: “I’m gonna put my hands in the air/ [my] hands, [my] hands [are] in the air!” reads just as naturally as “Hands, hands in the air!” being in the imperative voice, the way a police officer might say it, which aligns with the unambiguous exultation of “put your hands in the air!”

The allusion to police language in a rap song by a black man could not possibly be accidental, but I will leave it to the interested reader to look further into it.



Posted in jokes | Leave a comment

On Trolley Problems

If you choose not to decide, you still have made a choice

There’s a philosophical question that people seem to struggle with, commonly called the trolley problem, that I never really got the difficulty of until fairly recently. The original form, the one I was first introduced to, is formulated like so:

You are on a train, and coming up on a junction. The tracks are currently set such that the course the train will follow will run over 5 people tied to the tracks. However, you are able to switch the tracks to a side channel. This alternative course will run over 1 person tied to the tracks. Do you pull the lever to change to the alternate route?

This is excruciatingly clear to me: of course you change routes. The problem reduces to “is it better to kill one person or five?” which has a pretty obvious answer. And so I never understood why there’s any struggle over it. (Incidentally, I am deeply suspicious of philosophers finding ways to argue about nonsense, which perhaps biased me to miss important aspects of the question).

There are extensions of the trolley problem that all retain the basic form: should you kill one person, or kill five? One of these I really struggled with, which is a formation like the following:

A train is barreling, undivertably, toward five people tied on the tracks. You are on an overpass that looks over the tracks. You are standing next to an impossibly fat person who, if you pushed them off the ledge in front of the train, will slow it down sufficiently to save the five trapped people, but at the cost of the fat person’s life.

The question — and, thus, the answer — are the same: five people are worth more than one person, all else being equal (and despite the weird contrivances to make this situation seem vaguely plausible, we assume all else is equal). But this one made me recoil and struggle with my own understanding of the problem: imagining pushing someone over the tracks to save five lives makes me viscerally uncomfortable.

This brewed at the back of my mind for some time: the well-understood belief that killing the one to save the five is the correct course of action, alongside the deeply held feeling that it was wrong. Feeling so strongly repulsed by an action you thoroughly believe is morally correct is an uncomfortable thing, and so I tried to understand the disconnect here: surely there is something I am missing (I notice I am confused).

I had a few ideas about this — perhaps the net decrease in happiness caused by the pushing outweighs the net increase of +4 lives, perhaps I am using as a crutch some pre-cached idea that pushing people off of bridges is never the right answer, etc — but they all felt somewhat hollow.

Recently I’ve come to what I think is a more insightful explanation for the discomfort. I think that, as a matter of human nature, a bug in our programming (be it social or neural; irrelevant here), humans categorize “do something” and “do nothing” separately, which is completely factually incorrect.

The idea of “no action” as being a thing a person can take is actually incoherent. You are at all times doing something; what is meant by “do nothing” has to be in reference to a particular situation. I can choose to do nothing about a scenario (not pushing a person off a bridge) but in that case I am merely doing something else (standing around feeling unhappy) instead.

Inaction towards a particular scenario is not a different category from action. It is merely a human perception of inertia, or of some “default.” If I am in the “fat person” trolley scenario, I have two choices: push a person off a bridge, or stand idly by. These choices have to be considered as fully equal in terms of agency. And it is only by incorrectly considering one choice as “nothing” and the other choice as “intervening,” or similar, that I can even start to absolve myself of the moral guilt of the murder of four people.


When choosing your actions, do not forget that inaction is a choice on the exact same level as any other choice. When you start to treat inaction as a separate category, you wind up in scenarios that lead to such famous quotes as “The only thing necessary for the triumph of evil is for good men to do nothing” and “then they came for the Jews, and I did not speak out— because I was not a Jew.”

Posted in morality, philosophy | Leave a comment

ORMs model databases, databases don’t model objects

I was discussing the merits of a couple different ways to abstract some data common to multiple models at work, and my coworkers asked some questions that forced me to articulate exactly why I have such strong opposition to things like “abstract models.” My conclusion, basically, is that I think ORMs serve as models of the database, not as arbitrary app-level objects, and should be designed as such.

Let’s make up some models here to demonstrate what was going on. We start with a model for cats, of course, called Cat, of course, that looks like this*:

class Cat(Model):
    # cat-specific
    fur = ChoiceField('short', 'medium', 'long')
    breed = CharField()
    # vitals
    heartrate = IntegerField()
    core_temp = FloatField()

but now we’ve decided we want to model frogs, too (god knows why!).

We have a couple different ways of modeling this. Coming at it from the code viewpoint, you might have something like this:

class AnimalMixin(object):
    heartrate = Integerfield()
    core_temp = FloatField()

class Cat(Model, AnimalMixin):
    fur = ...

class Frog(Model, AnimalMixin):
    jump_height = ...

This is the kind of thing that your average OO programmer would reach for (well, perhaps AnimalMixin would merely be a base class called Animal) in app code. And, in that context, it’s probably even defensible!** But this is an extremely code-centric view of things that basically ignores the database. What happens in this case? What tables and columns get created?

Well, it sort of depends. In Django, anyway, there’s a couple ways to do table inheritance: there’s a way where the base class just adds fields to all inherited classes; i.e. you would have a Cat table with heartrate and core_temp columns, and you would have a Frog table with heartrate and core_temp columns. The other way is to do a bunch of joins; i.e. you have a table with heartrate and core_temp columns, and your Cat class has a foreign key back to that table, but in the code you just have access to cat.heartrate and it doesn’t feel like a join.

These two options both have significant drawbacks: the first duplicates columns across multiple tables, and the second turns innocent-looking field access in your code into joins! It is highly non-obvious from reading the code whether an attribute access is going to reference a column of the object (fast, already in memory) or a join to a different table (slow, goes to the DB unless prefetched).

I always sort of flinch when I see these approaches used or recommended, but I’ve never really clarified my thinking around why until today. The issue is that in both cases you’re trying to hide the database from the programmer, and that is not a good thing! Don’t get me wrong, there’s lots of aspects of the database I want hidden: connection strings spring to mind, spelling out long JOIN statements, etc, but the fundamental structure of the tables is extremely relevant to thinking about your application.

So enter the third way of doing this, which gives you slightly-less-convenient app code in exchange for significantly-more-obvious database patterns:

class Vitals(Model):
    heartrate = IntegerField()
    core_temp = FloatField()

class Cat(Model):
    vitals = ForeignKey('Vitals')
    fur = ...

class Frog(Model):
    vitals = ForeignKey('Vitals')
    jump_height = ...

Now what’s a join vs what’s a field is obvious at first glance. If you call my_frog.vitals.heartrate you know you’re going through a foreign key, and if you just access my_frog.jump_height, it’s obvious that was an existing field. Obvious to someone who has some experience with Django, anyway.

I think the difference is which you think of first when you’re coming up with persistent models: your database and the data in it, or your app code and the operations of that data. I give the data primacy, and I think the code should model that. The alternative approach, where the code should be maximally convenient and the database should be molded to make it work, makes me frustrated. Give me nice, clean, well-organized data and I’ll swallow a lot of code duplication for it.

The ORM is in service of the database, not the other way around, and don’t you forget it.

* All examples made up and not checked for correctness.

** I find inheritance loathsome. Do not @ me.

Posted in computer science, software | Leave a comment

How To DDoS Your Own Website With Python

Have you ever thought, “huh, I wonder how easily I could DDoS myself despite all the hard work of library and framework authors?” In that case, have I got news for you!

In the vein of Uses and Abuses Of Python’s Excessive Dynamism, I have found a way to DDoS yourself via hash collision in a measly 3-ish lines of python!


Since I am an unbearable pedant and this blog post would be about 10 lines long if I didn’t write a bunch of unnecessary exposition, I am going to start out with a brief explanation of hash tables/sets!

Alternatively, you could just read Wikipedia’s explanation.

A hash table is a data structure that is commonly used to implement maps, which you learned about last time. It’s also used to implement sets, which are like maps except, instead mapping keys to values, they just store a bunch of values.

The basic way a hash table works is that we start out with an array of some size, and we consider each index of the array a “bucket.” Then, each time we insert a key (or value, in the case of a set), we take a hash of the key and use that to determine what bucket to insert it in. How a hash is calculated is not important (especially not for this post, as you’ll see!), but it is a deterministic one-way function from any object to an integer. Deterministic means if I ask you for the hash of an object multiple times, it must return the same value, or everything will be ruined for ever.

Once we have the hash, we find the value of that hash modulo the length of the array. For example, if we were working with a backing array of length 10, and our hash was 1238, we would select bucket 8. Then we find what bucket 8 points at, which is generally a linked-list of values of whatever type we’re putting in.

Okay, back up, for concreteness, let’s assume we’re looking at a hash set (not table) of strings. So our backing array is a length-ten array of linked-lists of strings.

That’s better. So we’re at the linked-list that bucket 8 of our array points to. Now we iterate through this list, and make sure that the string we’re putting in isn’t already in there. This means that our string has to be compared with every other string we’ve already put into bucket 8. So every string for which our hash function ends in an 8 will go in this bucket, and we have to compare against each of those every time we add a new string with hash ending in 8.

Now this sounds somewhat scary, right? The whole point of hash sets is that they have O(1) (that is, constant time, also known as “screaming fast”) insertion time and membership test (i.e. “is this string already in my set). But if I have a gazillion strings that hash to something that is 8 mod 10, I have to do a gazillion string comparisons to verify whether or not this particular string that goes in this bucket is, in fact, new or not!

All that is to say that if your hash function has “bad” behaviour (where bad behaviour means hashing a lot of different objects to the same bucket) and/or your backing array is too small for the size of the set you’re working with, all this stuff you expect to be screaming fast will be super slow instead, which is just the worst if you’re a developer. There of course has been a lot of work done by serious computer scientists to come up with great hashing algorithms and heuristics for when it’s time to resize your array to accommodate more objects (strings, in our case), but nothing is stopping you from defining a hash function that is super terrible.

Okay, can we ruin everything now?

Here’s a great way to ruin everything:

class Foo:
  def __hash__(self):
    return 1

Now every time you try to add a new Foo object to a hash set, you’re gonna get them all in the same bucket (this is what we call a hash collision). So just throw a ton of those into a set every time your website loads and your customers will all hate you.

But wait… we can do better than that, right? If we can only make our own objects degenerate, we’ll have to go out of our way to slow our website to a crawl. Couldn’t we improve that somehow? Couldn’t we make our website terrible even in the face of all the hard work computer scientists and framework authors have done to keep things running smoothly?

Of course we can! This is Python! With adequate cleverness, we can ruin anything!

Just stick something like this into your codebase:

  for name, thing in globals().items():
    if isinstance(thing, type):
      thing.__hash__ = lambda self: 1

Now all user-defined types will have horrible hashing algorithms, guaranteeing that anywhere they’re used in sets or as keys in a set, every insertion and membership test will be O(n) instead of O(1)! Take that, helpful, thoughtful library authors!

Another terrible option

Try this one on for size:

class Foo:
  def __hash__(self):
    return 1
f = Foo()
s = {f}
print(f in s)
Foo.__hash__ = lambda self: 2
print(f in s)

You can turn every dict and set in your codebase into a liar in one swell foop!

Merry self-destruction, everyone.

Unfortunate addendum

You can run something like __builtins__.hash = lambda *args: 1, but it does not appear to affect the algorithm that’s used to hash strings. This is presumably some implementation detail of Python internals that we can’t mess with. str.__hash__ = lambda *args: 1 throws an error. Sadly, we cannot ruin the most common type of map, string -> x, so easily 😦

Posted in computer science, pedantry, software | Leave a comment

Common Data Structures You Already Secretly Understand

Let’s talk about data structures that have close analogies to things you already do or know! This post will not be educational or entertaining, I promise. Enjoy!

Linked Lists

What is it?

A linked list is a way of storing related information consecutively, such that you can only navigate forwards.

Where do I already use it?

Most people store the alphabet in their heads this way! You can easily recite the alphabet forwards (it’s even got a catchy tune), but few people can fluently recite the alphabet backwards. That’s exactly what a linked list is like: going forward is easy, going backwards is hard.


If you can cite the alphabet backwards, you probably remember it analogously to what we call a doubly-linked list. It’s easy to go forwards and backwards (quick, what letter comes before v?) but it’s much more difficult to tell me what the nth letter is without counting (quick, what is the 17th letter?)

There’s a fairly uncommon but cool data structure called a “skip list,” too. This is basically a linked list, but where you don’t always want to navigate through every element to move forward because it’s a lot of elements. So you have your linked list, but then you also have a linked list that skips bunches of the list. The only example that’s coming to mind for me here is learning to speedrun a video game: You have a linked list of what you’d call the “route,” e.g. the order in which you go from place to place doing stuff. Then, at each place you go, there’s a bunch of smaller steps to do said stuff. So here you have the base linked list which is very detailed and tedious (e.g. walk 3 squares left, slash your sword then move down, etc etc) but you have a list on top of that that’s much coarser, and you can sort of go up and down between the two. So you’d think of it like: go to dungeon 1 (skip list), collect item 2 (skip list), (now drop down to the more detailed linked list) then move down two squares before wiggling away from some enemy. If this explanation makes no sense, don’t worry! I told you this wouldn’t be educational or entertaining, didn’t I?


What is it?

An array is a lot like a linked list in these abstract, analogy-driven terms: it’s easy to navigate forward. It’s also easy to navigate backwards (like a doubly-linked list). However, it’s also very easy to access the nth element.

Where do I already use it?

Probably in how you think of months! You can tell me the months in order, and you could go backwards (it might be a bit stilted but you wouldn’t have any real trouble), but you can also probably tell me offhand exactly what the 5th month is.

You probably also think of days of the week this way, with some possible complications (e.g. is the first day of the week sunday or monday?). Although that’s hard to really say; 7 elements is tiny.


These are all terms that usually refer to the same thing. Sometimes they hint at underlying implementations, which are far too educational for this blog.

What is it?

It’s a lot like what it sounds like: a mapping from something to something else. A mapping allows very fast access to any value based on its key. It’s not very good at going backwards, though!

Where do I already use it?

Probably something similar is happening all over your brain! But I can’t convince myself any of that is sufficiently map-like, because cognition is basically a black box.

Some more concrete examples: if I ask you to tell me what the word “alopecoid” means, you can go look it up in a dictionary easily*. But if I ask you to tell me a word that means “like a fox,” that’s a lot harder! Maybe you come up with “vulpine,” and I say, oh, no, that’s not the one I was thinking of, can you give me another? There’s no way get to a word based on its definition besides reading the whole dictionary, so you’re out of luck.


What is it?

A tree is a way of storing hierarchical data; that is, there is some “root” to the tree that has some number of “children,” each of which can have its own children (or not), etc. It’s useful for lots of stuff: it can make various searching procedures faster, it’s (generally) quite fast for inserting new values while maintaining sortedness, it’s the only sensible way of representing certain kinds of data at all.

Where do I already use it?

Company hierarchies work like this! There’s some big boss CEO at the top, who has some number of direct reports, and they have direct reports, etc. Presumably you know at least a substantial portion of the tree that includes you: you should know your boss, and probably their boss, and on up. You should also know all your direct reports, should you have any. But who cares about the hierarchy in departments you’re not in? I sure don’t!

Another example is family trees. Given a sufficiently correct family tree, you can go up and down and left and right to define relationships! For example, instead of your “boss” being above you, it would be your parents. Your children are your… well, children (one of those references is to human children, of course; either will do). If you go up a level and over a level, you get an aunt or an uncle**; if you go up a level, right a level, up another level (but not to the same place you’d get to going up two from the start!) and then down some more, you’ll end up with a 9th step-cousin π times removed (I think).


What is it?

A graph is a lot like a tree, except the connections between elements don’t have to go straight up and down***. They can go anywhere! Anything can be connected to anything else!!

Where do I already use it?

This is how Facebook works! I’m friends with a bunch of people, and they’re friends with a bunch of people, and some of those people are friends with me, etc, etc. Just lots and lots of people and lots of connections between them.

For an example that might make it more clear why graphs are interesting, consider playing the “Six degrees of separation” game, where you try to figure out how you’re connected to <cool celebrity X>. You start at yourself, and then you figure out the next connection: maybe your mom knew someone who grew up in X’s hometown. Then from there, you find out that this person knew X’s childhood barber. Now you’ve gotten there in just four jumps: you -> your mom -> person from X’s hometown -> X’s childhood barber -> X! This is a useful (if somewhat silly) way to use a graph.


Some graphs are what we call “Directed Acyclic Graphs,” or DAGs. These are an important subset of all graphs, with some restrictions on the connections between elements: the connections have a direction (such as parent-child or boss-employee) and there are no cycles (that is, if we have people A, B, and C, and we have that A is B’s boss, and B is C’s boss, we know for sure that C can’t be A’s boss, because that would form the cycle A->B->C->A).

Do you play video games (or maybe pen and paper RPGs)? If so, the “skill tree” you may have encountered in that setting are named pretty appropriately! There’s a “base skill” (or more than one, in which case you can think of the true root of the tree being a skill you have from the beginning), and after you’ve obtained that maybe you get a couple options for new skills. But sometimes you need more than one prerequisite — these skill “trees” should really be called skill DAGs! In this example, the direction is a “prerequisite” relationship, and there can’t be any cycles. If there were a cycle in a skill DAG, all the skills in that cycle would be unlearnable, since they all depend on each other (directly or indirectly).

Maybe you went to college? College degree requirements can be thought of as a DAG. Taking topology requires completing calc 3 and also algebra, which don’t depend on each other!

Lots of things are graphs. You can think of roads as connections on a graph, and intersections as the elements. Graphs show up everywhere! Some graphs can be directed and cyclic (think of a city neighbourhood with a lot of one way streets). Some graphs can be undirected and acyclic (I presume, I can’t actually think of a real-life example at the moment).

Bloom Filter

What is it?

A way to check whether we’ve already seen a certain element, without using much storage space (memory) to keep track. But there’s a twist… sometimes it lies! It might tell us we’ve seen something that we have not (but never the opposite).

Where do I already use it?

Hahahahaha, just kidding, you don’t.

* “Easily” is a substantial overstatement here! For years it wasn’t defined in any online dictionary I could find, and missing from many text ones. Even now, check out the Merriam Webster page for it, namely:

Wait, there’s more! This word doesn’t usually appear in our free dictionary, but we’ve shared just a bit of the information that appears in our premium Unabridged Dictionary. There’s more definition detail there.

** Is there a gender-neutral term for “sibling of parent?”

*** I sort of pretended that “right” and “left” are givens in trees, since they’re often represented directly in family trees. In general, that’s not the case, and navigating to a sibling is a process of going up to the shared parent and then back down to the sibling; they generally aren’t stored, in a computer, with a direct sibling connection. This does differentiate them from graphs importantly.

Posted in computer science, pedantry | 2 Comments

Why Is No One Writing Language Runtimes?

A conversation with a coworker recently lead me to wonder why none of the innovation in writing software recently has been in terms of runtimes, or at least not pluggable ones.  Before I get into that in any detail, I’ll go over a bit of what a “runtime” even is, and how it’s distinct from other components of software.

This article is probably appropriate reading for someone who has written code in at least one language, is vaguely aware of some of the differences in the ways other languages run, and is willing to take some things on faith. Any more knowledge will just mean more and more of it, up to the last two headings, will be a review.

How Software Runs

There are two major, coarse-grained ways of defining how software is actually run: what we call “compiled” and “interpreted” code.

Compiled code is written, like all* code, in plain text. You open up a text file, write some human-readable characters (and some *s and ->s and so forth). Then when you want to actually run it, you use what you’ve written as input to a program called, unsurprisingly, a compiler, which takes the plain text you’ve written and turns it into “machine code,” which is just a whole lot of bits that your computer can actually run. The compiler outputs an executable, and then you just run that on its own. This looks something like gcc hello.c -o hello ; ./hello – first you compile (gcc stands for “GNU C Compiler”) some source code (in the file you write, “hello.c”) into an executable named hello (specified by the “-o hello” part). Then you run it (“./hello” means execute the program in the current directory named hello).

Interpreted code is, of course, written in plain text as well. There is no compilation step separate from running the code, though; instead you feed your code as input to an interpreter directly. This looks something like python hello.py. What’s going on under the hood is that python is a program that’s already compiled and runs on your machine successfully, and it opens up your file and executes specific instructions for each line that it reads.

There’s also a third, sort of in-between way of executing code. Some programming languages (perhaps most famously Java) have a separate compilation step but also still run their own interpreter on the compiled output. What’s happening is that the Java compiler does not compile to machine code, it compiles to what we usually refer to as bytecode. Java then reads in the bytecode and executes machine code based on it. In keeping with the last two sections, this looks something like javac Hello.java ; java Hello – you’ll note it has aspects of both of the prior examples.


No program is an island. Every program has to hook into the operating system at some point, if nothing else. The environment in which an executing program operates is sometimes called its “runtime environment,” or just runtime. When you write a C program, your runtime environment is very bare – the C compiler turns your code into machine code and really adds very little beyond that. When you write a Python program, however, the runtime is rich: the interpreter is the thing that’s actually running, and does things like managing your memory and doing function dispatch for you.

Other languages are in between. Since Java compiles your source text to bytecode before being executed, the only thing that’s running when you execute your code is the Java Virtual Machine (JVM), which is not a Java source compiler. It does things like manage memory allocations and deletions for you, which C won’t, but you can’t so much add a new data type at runtime, or change the object hierarchy around, like you can in Python.**

Different runtimes make different guarantees, and different languages provide their guarantees in different ways. For example, Python won’t leak memory because the interpreter creates all the objects for you and can’t lose track of them. Rust programs, however, won’t leak memory because the compiler makes sure you didn’t write code that could leak memory, and then writes down the parts about freeing memory into the final executable; the safety guarantees in the Rust case are not provided by the program actually running in the end, but by the other program (the Rust compiler) that created it. Yes, this is very difficult to understand at first, and I am not the greatest teacher; please don’t feel bad if your head is hurting.

Why Is Bytecode Important?

Normally, when you’re creating a new programming language, you have a whole lot of work to do: you need to define how the language looks (syntax) and works (semantics). Once you have your ideas for that, you need to write some programs that will actually take source text that humans write and run it. There’s a lot of different ways to do this! More now than ever. The major ways are basically writing a compiler or an interpreter. But writing compilers is really fucking hard, and writing interpreters isn’t a ton easier (and it makes the resultant code much slower, in general). So what if there were a way to get around having to do all that work, but still end up with a fast, correct executable at the end?

The key insight here is that you can write a compiler from your language’s source code to someone else‘s bytecode. If you can just get that one pass done, from your language into, say, Java bytecode, you get the whole JVM for free. The JVM gives you garbage collection and very fast execution (along, of course, with great library support, among other things, which isn’t really relevant here but is to many language designers). Instead of writing a compiler that takes human-readable text and turns it into machine-executable code, you can do the much easier task of turning that text into JVM-executable code.

Beyond Bytecode: Intermediate Representations

Bytecode is similar in some ways to a thing called intermediate representation, or IR, that many compilers use. This is basically breaking down the source text -> machine executable into an extra step, but then instead of leaving it as code for a runtime to execute, you finish up: source text -> IR -> executable.

Let’s summarize here, with examples of how some languages do things:

C Source text -> machine executable
Objective-C Source text -> LLVM IR -> machine executable
Java Source text -> JVM bytecode (bytecode executed by the runtime)
Python Source text is executed directly by the runtime

Every language starts with source text; some move through an IR or bytecode step, and some go all the way to an executable, while the others have an interpreter to run as the last step.

Plugging In

All the languages I’ve used as examples so far are self-contained within the parse-compile-execute zone. Many languages are in fact built as I alluded to a moment ago – they compile from their source down to an IR or bytecode format that is then compiled or executed by a compiler or interpreter the language authors did not write. The example on my mind today is Rust, which compiles down to LLVM IR, which is then compiled by LLVM into machine code. The Rust authors don’t have to write their own compilation to machine code, and can instead leverage the expertise of the LLVM team. Other examples include languages like Clojure and Scala, which compile to Java bytecode and are then run on the JVM. Notably, LLVM and JVM are by far the most prominent targets in this scenario: they’re both open-source and well-documented targets that provide a ton of firepower, and there frankly aren’t a ton of other options.***

Where Are We Going?

Currently we’ve talked about a few approaches. You can own the entire source-code-to-running-program pipeline, or you can abandon it somewhere before the end. If you want machine code, you can compile it yourself or only compile to LLVM IR and then let the LLVM toolchain finish the job. If you want the entire weight (both in terms of heaviness and in terms of force, here) of the JVM, you can compile to Java bytecode.

What I don’t see is anyone writing a runtime that can be compiled to. The JVM is very powerful, but it has its own potential issues (type erasure and garbage collection come to mind). But what if you want a runtime with some features? what if you want, say, a runtime with very powerful coroutine support?

The answer is: you write Go.

The answer should be: you write a language that compiles to Go’s IR.

From some quick googling, it seems like there in fact is a separate IR for go, but I was unable to dig up information about how to compile that IR. I also found no evidence that anyone has tried one way or the other (compiling it or compiling to it). The fact that the Go team seems dead set on naming everything as inconveniently as possible for googling**** (the language is Go, and they call their IR “assembly,” which means that googling for “go assembly” mostly finds questions about using assembly language inside Go, not using Go’s “assembly” language) makes it hard to be confident in my assertion that no one is doing it, but it is at best niche and rare.

So Let’s Write Runtimes!

From everything I’ve heard on the internet and my coworkers, the Go language is mostly pretty crappy, or, at least, has several very serious flaws, but the performance is good and it’s excellent for high-concurrency web stuff, mostly because you get coroutines for free. Unfortunately, the two are shackled together, and I will have to wait for someone who actually has the time, energy, and skill to write a runtime with first-class coroutines as a pluggable backend that I can compile to. I mean, realistically I’ll keep writing slow, non-concurrent, interpreted Python, but perhaps you, dear reader, have an actual need for such a thing.

More than that, I’m partly just curious why no one is working on this. The JVM was never written to be a generic runtime, but it’s been wildly successful in that role. LLVM was written on purpose to be a generic compiler backend, and it’s also wildly successful. Why is no one looking at reproducing the success of the JVM, without all the baggage? There is a project called Parrot VM that was an attempt to do just this, provide a generic IR format for executing dynamic languages, but it never took off, probably because it was originally tied to the tragic development of Perl 6, and thus didn’t gain traction outside of that community. Given the number of new languages that leverage the JVM or other virtual machines (Elixir is a notable example that runs on the Erlang VM), there’s clearly a demand for this kind of thing.


* Very, very nearly all

** Probably not strictly true. I don’t know the exact limits of the JVM runtime. Last time I looked at it, you could certainly look up methods by name, which would be very difficult in C, but I don’t think you could create new ones at runtime. Even if this stuff is strictly possible though, the point is that Java bytecode is somewhere in-between Python and C in terms of the runtime environment.

*** One important other option is using another programming language as your IR! If you can compile (or perhaps more accurately transpile) your language into C, then you can leverage gcc (or clang, etc) into compiling that all the way down into machine code. This isn’t relevant to the post, though, really, despite being interesting. There’s also CIL, which is basically Microsoft’s version of Java bytecode, interpreted by the CLR. This is not different enough from the JVM situation for me to include it much elsewhere.

**** Is this on purpose? They work at Google! Maybe they want it to take several queries instead of one, which then allows them to put ads in front of your face more times in a row? I can’t believe anyone is that nefarious; given all the dumb decisions behind Go, I’ll have to assume Hanlon’s Razor here.

Posted in software | 13 Comments

Programming Language Misfeatures With Little Comment

(An incomplete and unordered list)

  1. inheritance
  2. !=
  3. ===
  4. not in  from python
  5. .. and ... (flip-flop operators)
  6. implicit type coercion
    1. (except arguably short -> long style)
  7. null
  8. case fallthroughs
  9. non-exhaustive case statements
  10. the switch/case construct in general
  11. lambda‘s restrictions in python
  12. ruby having 3 (? maybe more?) types of function objects
  13. memory-location equality comparisons
    1. ok, these are sometimes fine, but should never be the default
  14. arguments that can be either positional or named when called
  15. anything that exposes the difference between co- and contravariance because of a type error (lookin’ at you, Java arrays)
  16. constructs whose behaviour is partly controlled by the parser (e.g. php’s instanceof)
  17. constructs that affect the runtime behaviour of the interpreter (e.g. php’s __halt_compiler or ruby’s __END__)
  18. operator overloading where you have to (can?) define your own precedence/associativity (no, seriously)
  19. compiler warnings
    1. ESPECIALLY warnings that cannot be turned into errors
  20. function scoping
  21. closing over variables
  22. closing over values
    1. why is it never the one i want!!
  23. hygienic macros
  24. unhygienic macros
    1. why is it never the one i want!!
  25. raw-text replacement as “macros” (e.g. the C preprocessor)
Posted in software | Leave a comment

Probabilistic Morality, or: Ghosting On Spherical Cows

A Lighthearted Preamble

I have, of late, you’ll notice, been trying to find a date. In said pursuit, I have also begun using Tinder in an effort to hopefully delude some young woman into believing I am worth spending time with*.

I recently went on a nice date with a a nice young woman. Unfortunately, she… either doesn’t like talking to me, or is the world’s worst texter. We set up a tentative date, and I said “I can be there at 6?” — the question mark indicates that I was opening it up for her to confirm, deny, or choose a different time. She didn’t respond! When it came time for me to leave work, and I had to choose whether to go to the train or home, I finally had to follow up** and ask if that worked — and she quickly responded yes, we met at the decided-upon time, and had, I think, a nice evening.

So the next day, I followed up, saying I had a good time and asking about something she had mentioned she was working on while we were out. She didn’t reply! For over a whole day, she has not replied. Normally, I would take this as a clear sign that she was not interested, and in fact was doing what those damn Millennials call “ghosting.” But, as I said, she is already a forerunner for the World’s Worst Texter trophy. So maybe she did have a good time — she actually said as much while we were out! — and just does not reply to texts in a timely manner. Or, sometimes, at all.

A Bitter Aside

Dating sucks. Texting sucks. Finding oneself strategizing about texts to send hours or days in advance sucks. Everything sucks. Eat Arby’s.

Getting Towards My Actual Point, Insofar As There Is One

I was talking about all this with a friend of mine who has also been dipping her feet intermittently into the online dating game. I also mentioned that one time, after a date that I had enjoyed, I sent a similar followup text to the one described above, along the lines of: Hey, I enjoyed meeting you last night, would you like to hang out again sometime? <insert reference to a thing we talked about here***>. And she replied with something like: Hey, it was nice meeting you, but I’m not really feeling it, sorry. And I was all, Okay cool thanks for letting me know; good luck going forward!

And my friend here — the one at the beginning of last paragraph — was taken aback that a woman actually said that! It’s like getting a rejection letter after applying to a company. Nobody does that! She even thought that, hey, maybe she should try doing that too, when she’s not interested in someone, rather than ghosting.

But for a woman to actually straight-up turn a man down is often unpleasant, and sometimes even dangerous. When should a woman give a date an “official rejection” versus just ghosting on them? It’s clear that the woman should protect her own safety and wellbeing first of all, so texting a man who she did not feel safe with to begin with is a bad idea compared to disappearing. And giving a gentle but clear let-down to a man who is nice enough but you’re not interested in seeing again is a pretty clear good: you get to feel benevolent for being up front, and he gets to know where he stands and not to waste his energy pining over you. The advice seems natural: Err on the side of caution, but give a straightforward rejection whenever it feels safe.

Do You Even Remember The Title By This Point?

I started thinking about exactly what to do here. I started wondering what the correct course of action would be based on my poorly understood, utilitarian philosophy of morality. There’s two variables here: the woman’s (henceforth Alice) and the man’s (henceforth Bob) suffering****. We assume some things:

  • Alice has already decided not to see Bob again.
  • Bob would like to see Alice again.
  • Bob’s suffering is reduced by getting an explicit rejection as compared to being ghosted.
  • Alice’s suffering is reduced by offering an explicit rejection.
  • Alice’s suffering is, after deciding to offer the rejection or not, exclusively controlled by Bob’s response
    • If Bob replies in a positive or neutral manner, her suffering does not change (after the bump from magnanimously being straightforward)
    • The more negatively Bob replies, the more Alice’s suffering increases

I don’t think any of those assumptions are controversial: The first two are just an exposition of our hypothesis, and the rest seem straightforward.

For the purpose of getting anywhere, I am going to posit a final constraint that moves us into the world of spherical cows:

  • Alice has no priors to help predict Bob’s reaction

I think the idea holds up fine even if we allow Alice to have priors, but let’s assume she doesn’t for now. Bob is a spherical cow.

Maybe I’ll Present A Model Under This Header

Our assumptions let us assume fairly naturally that that Alice’s suffering will be increased by the value of some simple function*****.


The x-axis is the negativity of Bob’s response, and the y-axis is the effect on Alice’s suffering. So if he is positive or neutral, there’s no effect, then as he ramps up from “Okay, thanks” to “Wow, rude,” through “Fuck you, you’re ugly anyway” and on up to “You think you can just turn me down? I’ll fucking kill you,” her suffering increases. It’s not linear, its shape is not well known in general except that it is non-decreasing, but we can assume that it is “known” insofar as for any given response from Bob, we know the exact amount of suffering it causes in Alice.

But this function is not the end of the story. Because, see, what we actually have here is a probability distribution! Bob, for any given value of Bob, will only have one of these reactions. Over a large population of Bobs, however, we will find ourselves with some probability distribution over negativity of responses.

The Much-Awaited Conclusion

Once we have this probability distribution, we can wave our hands****** and come up with an expected value of Alice sending out a rejection text. The formula looks something like this:


Where S stands for suffering. So the total change in suffering, which is the sum of Alice and Bob’s changes in suffering, is equal to the suffering decrease of giving a rejection, plus the suffering decrease in receiving a response (these two are constants for our purposes), plus the suffering change based on the response that Bob gives.

Our setup gets us to this:


Which is to say: the expected suffering from Bob’s response is equal to the sum of suffering deltas of all responses times the probability of that response, multiplied by some normalization factor. So all we need to know is, given some suffering function from Alice and some probability distribution over possible responses from all Bobs, if


then Alice should send a rejection text. If the inequality is false, she should not.

To wrap up, I’d just like to say that I think it’s interesting that (for some model, anyway) a woman’s choice to do a rejected suitor the courtesy of actually telling him he’s rejected depends on two things: her own sensitivity to negative replies, and the probability distribution of how negative young men in the online dating world are when faced with rejection. That is to say, it is quite possible that today it is immoral to send a rejection text, but tomorrow the world will be a brighter place and men won’t be so fucking crazy, and it will become moral to reject them properly.


* It is not going great, so far. Thank you for your concern.

** Double-texting, despite being the Ultimate Sin of Thirst and Neediness, was my only option here besides giving up, so I reluctantly went with it.

*** I always worry that if I just say “hey let’s hang out again” I sound like I just wanna hook up, when, I mean, that’s something that’s totally cool, but not my primary goal, so I try to provide some sort of evidence that I was paying enough attention to be interested in my date as a person and not just a body, you know? I have no idea if this is effective; there are too many confounding reasons a woman might not want to go on a second date with me.

**** Suffering can be negative (what laymen refer to as “happiness”) but it’s easier to work with just one name for the quantity.

***** Simple here meaning: continuous, non-negative, real valued.

****** Do some calculus.☨

☨ I should really format my endnotes better.


Posted in math, morality | 2 Comments

On Speed Dating

I went Speed Dating this week. I utilized the services of this company.

The event’s location was moved the morning of.

I left my apartment half an hour before the event. I noticed I felt very nervous during the cab ride. I arrived at the event 20 minutes early. They told us to be fifteen minutes early to check in. No organizers were there when I arrived. No one was there to check anyone in until five minutes before the event started. The event started late.

There were more women than men. The men sat and the women moved between spots. I sat on a couch and women sat next to me. There were two groups of friends. One group was all medical students. One group was all mental health volunteers. Everyone was nice.

I was very tired by the end. I was less interesting in my last dates. Being nervous is very tiring. So is talking to strangers. The event was on a weekday. Speed Dating after work is even more tiring.

I turned in my match card after the dates. This matches people who match each other. The organizers never emailed me. People who match each other get introduced in an email. If you match no one they do not email you.

Posted in dating | Leave a comment