Oneliner - A Dialog DSL for Godot (and a mini postmortem of how it came to be through memento mori)

[Image: memento mori, with some delinquents] memento mori pictured, in its completed? form

During development for our AGBIC 2018 game, memento mori, a small domain specific language (DSL) [1] and VM [2] for narrative scripting grew organically, following the needs of the project’s writer and my partner-in-crime, Jammybread (the other one being the excellent composer Conciliator, who did the music!).

We had only a few requirements for this initially: it should be easy to write (ideally hard to mess up); its syntax should be close to writing a traditional script (or just plain text) but still flexible enough to accommodate things like expressions for characters in dialog, as well as triggers for things such as camera movements; and lastly, it should allow adding other miscellaneous functionality we might need as development matured.

The goal was to keep it as simple as possible.

What we will be discussing

So that this article is more digestible as a whole, nitty-gritty details of the virtual machine and its implementation are mostly left out (although they may get an article of their own in the future if there’s interest). Instead we shall be focusing on the language and its development/evolution.

What are/were the options then?

As a reader intending to maybe make your own system, you should first ask yourself: is there already a solution out there that accommodates your needs?

Do you even need all this in the first place?

If your dialog needs are even simpler than ours, you may not need a full DSL and VM at all. You might be just fine with an array of strings/objects set up in code and an index/program counter that keeps track of which piece of dialog you’re currently on. Even just using coroutines straight, if available in your language [3] (yield in the case of Godot + GDScript), may be sufficient.
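To make the "array plus index" idea concrete, here is a minimal sketch in Python (not GDScript, and not anything from our game). The class name SimpleDialog and its methods are made up for illustration; the point is only that an index into a list is already a tiny program counter.

```python
from dataclasses import dataclass


@dataclass
class SimpleDialog:
    """A list of lines plus an index acting as a tiny program counter."""
    lines: list
    index: int = 0  # which piece of dialog we are currently on

    def current(self):
        """Return the current line, or None once the dialog has finished."""
        if self.index < len(self.lines):
            return self.lines[self.index]
        return None

    def advance(self):
        """Move to the next line (e.g. when the player presses 'next')."""
        if self.index < len(self.lines):
            self.index += 1
        return self.current()


dialog = SimpleDialog(["Sup, new kid. You skipping?", "It's good, ain't it?"])
shown = []
while dialog.current() is not None:
    shown.append(dialog.current())  # a real game would display the line here
    dialog.advance()
```

Because the whole state is a single integer, saving and restoring a conversation in progress is trivial, which is one advantage this has over suspended coroutines.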

In our specific case

However, we found that there were not many dialog systems written at this point that were easily usable through GDScript. There were a few systems used in other engines that might have been reasonable to use through C# in Godot, but we considered it far too much effort to both include Mono [4] in our by then relatively lean project, and then also write an integration into Godot for some (most of them rather bulky) existing system.

A few systems for Godot usable from GDScript did exist, but many of them involved directly writing JSON or similar to set up dialog, which was immediately written off as unergonomic (the situation has since changed, but at the time it was as such). Moreover, support for easily adding in custom functionality such as triggers/calls into game logic was uncommon.

As a whole, the options were “considered” and the NIH [5] continued at full speed!
At one point there might even have been the genuine belief that one might actually do a better job of making a dialog system than any of the other thousands or tens of thousands that may have come before… Yeah we’ll get to the details of that later.

Let us begin by enumerating the functional requirements we had when we started making the game (and we will return to this again later as well).

Defining our (initial) requirements

[Image: festival] some choices, for flavor

The game, and how that shaped what we needed

At the start, the game was basically going to be a little visual novel/walking-simulator type thing where you walk around and talk to characters. Any choices you saw in the actual dialog would not be consequential ones resulting in huge branching hierarchies within a single dialog.

[Image: incomplete game flow diagram] crude diagram of game flow, developed 75% through dev (to help tiredbrain devs)

We eliminated conditional choices almost entirely: the only “conditional choices” you had were whether or not to talk to a character, with time advancing when you do and characters being available at different times of day. During a conversation itself, you would not have much effect on its flow.

[Image: Persona 5 choices] pictured, how Persona 5 does it

Having no choices at all might of course seem a little drab, so similarly to how Persona does it, we added little flavour-type choices to all conversations, which the characters respond to directly. This was so effective that some players ended up believing they had a greater impact on the dialog flow than they actually did. Whether this is positive or not is a little ambiguous (are people misled as to where in the game the meaningful choices are made, causing confusion about which parts matter?).

We also figured that for the choices, we’d fix it to 3 choices per group.

2 is too few for it to seem like you have a variety in choice, 4 is too many to write unique responses for - jammybread

Functional Requirements

For our initial prototype the following functional requirements needed to be fulfilled:

  • Our not-very-programmery jammybread should be able to write it.

i’ve dabbled in Lua, BASIC, Python and Java >:U i’m just not very good at it - jammybread

  • Should largely read like a script (like you might see for a tv show).
  • We should be able to have groups of choices.
  • Have a straightforward way to add comments.
  • Only require a line-by-line parser.
  • Text should just be text.

Our Initial Prototype

Given our above (fairly vague) requirements, a dialog script in the initial prototype ended up looking something like this:

> Looks like he's skipping class.
> Talk to him?
Sup, new kid. You skipping?
It's only been like a week since you transferred too.
Already showing your true colours.
Too bad everyone else at this school sucks, man.
Fuckin' no-one can hold up a fight.
They're all weaklings.
And boring, man.
You're the only dude that's come up here.
Well, bro, you made the right choice.
Fuck bein' in class, am I right?
[Right.] Yeah, man.
[I think they just don't come up here while you're here.] ...... 'cause I'm too scary, yeah?
[I just didn't feel like it.] I get you. You're just sick of it, yeah?
No teachers. Nobody tellin' you what to do. 
No funny looks. Just the sky and a light breeze.
It's good, ain't it?
I feel like I could chill here forever, man.
...

first prototype dialog with pompadour-san in memento mori

In the prototype we can already see the first construct: the first two lines, denoted by starting with > and ending with ?, contain the text in the interaction popup when you approach a character. If you said no, it would cancel and you could carry on; if you said yes, it would continue the dialog script from that line.

This element persisted in some form throughout the iterations of the script language.

The second assumption made is that any text that was not prefaced with > was spoken by the character you’d initiated conversation with.

… This was before realising that we might want multiple characters in a single conversation, which we will get to eventually.

So in short our current elements here are:

  • > ... text ... ? the initial “do you want to initiate this dialog” command; if you respond positively the dialog continues, if you do not, it exits at that line.
  • > ... text ... character thoughts/player speaking.
  • ... text ... the “spoken to” character speaking.
  • 3x [Choice] Response (one on each line).

As this is parsed on a line-by-line basis, the parser is very simple. However, you can’t easily stick two “commands” on a single line [6] (without a special case in the parser), but we didn’t even have anything like commands/function calls by then, so that wasn’t a problem... yet. Nesting constructs essentially wasn’t a thing.

The Parser/Lexer then?

Given that our requirements were not particularly performance sensitive, and that we were trying to keep the design as simple as possible, we ended up parsing the dialog line-by-line. There is no separate lexing step: due to the simplicity of the language, lexing and parsing are one and the same (for various reasons one may come to regret this a bit later). Each parsed line may spit out one or more instructions.
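As a rough illustration of what "one line in, zero or more instructions out" can look like, here is a sketch in Python covering only the prototype constructs described above. The instruction names (PROMPT, THOUGHT, CHOICE, SAY) and the tuple layout are assumptions made for this example, not the format our actual GDScript implementation used.

```python
import re

# "[Choice] Response" on a single line.
CHOICE_RE = re.compile(r"^\[(?P<choice>[^\]]*)\]\s*(?P<response>.*)$")


def parse_line(line):
    """Turn one line of the prototype script into zero or more instructions."""
    text = line.strip()
    if not text:
        return []  # blank lines produce nothing
    if text.startswith(">"):
        body = text[1:].strip()
        # "> ... ?" asks whether to start the dialog; other "> ..." lines
        # are the player's own thoughts.
        kind = "PROMPT" if body.endswith("?") else "THOUGHT"
        return [(kind, body)]
    match = CHOICE_RE.match(text)
    if match:
        return [("CHOICE", match.group("choice"), match.group("response"))]
    # Anything else is spoken by the character being talked to.
    return [("SAY", text)]


def parse_script(source):
    """Parse a whole script, one line at a time, into a flat instruction list."""
    instructions = []
    for line in source.splitlines():
        instructions.extend(parse_line(line))
    return instructions


program = parse_script("""\
> Looks like he's skipping class.
> Talk to him?
Sup, new kid. You skipping?
[Right.] Yeah, man.
""")
```

Every case is decided by looking at the first character or two of the line, which is what keeps the parser trivial, and also what makes it awkward to extend later.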

Conditionals, Flags, Jumps, oh dear!

As we moved on with the game, we inevitably encountered cases where conditional events were interesting or even necessary. Let’s say you have a choice to feed a cat or not; this cat being the linchpin of your universe. So now everything must change… Okay, maybe not quite that dramatic, but you get the need.

A way to have conditional statements was then necessary. We wanted to keep our line-based structure and avoid descending into making a full programming language, so I decided that a simple form of unconditional goto, as well as conditional goto should be added.

Now at this stage you might think: well, you’ve not yet introduced any concept of variables or flags? And yes, you’d be quite right. In this step we also introduced what we ended up calling “flags” [7] in our implementation, which in our case are booleans: a binary state of true or false.

This state would initially be kept exclusively in a sort of global registry, which in our case is a simple dictionary.

[Image: conbini cat, in the conbini] conbini bear, doin what they do (from the finished game)

Let’s look at what some dialog looks like at this point:

# if cat is fed, go to dialog instead of shop menu
![CAT_FED, AFTER_FEEDING_CAT]
# talk: Welcome! 
# buy: Thank you for your purchase!
# leave: Have a nice day!
# talk again: Is there something else you wanted, sir?
# talk again again: Sorry, is something wrong, sir?
# 3rd talk: ... Are you bored or something?
AFTER_FEEDING_CAT:
Back again?
Did you buy that just for the cat outside?
He's a local. Comes here every day.
...

Looking at the dialog above, we can see two foreign constructs:

  • ![SOME_FLAG, SOME_LABEL] which is our new conditional goto construct
    • (the unconditional jump form looks like ![SOME_LABEL])
  • AFTER_FEEDING_CAT: which is our label that we jump to!
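For the curious, a conditional goto like this needs very little machinery. The following Python sketch (again, not our GDScript implementation) records each label as an index into the instruction list and keeps flags in a plain dictionary. The instruction names and the END label in the usage below are invented for the example.

```python
import re

COND_JUMP_RE = re.compile(r"^!\[\s*(\w+)\s*,\s*(\w+)\s*\]$")  # ![FLAG, LABEL]
JUMP_RE = re.compile(r"^!\[\s*(\w+)\s*\]$")                   # ![LABEL]
LABEL_RE = re.compile(r"^([A-Z][A-Z0-9_]*):$")                # LABEL:


def compile_script(source):
    """Return (instructions, labels); a label maps to an instruction index."""
    instructions, labels = [], {}
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if m := COND_JUMP_RE.match(line):
            instructions.append(("JUMP_IF", m.group(1), m.group(2)))
        elif m := JUMP_RE.match(line):
            instructions.append(("JUMP", m.group(1)))
        elif m := LABEL_RE.match(line):
            # A label emits no instruction; it names the next one.
            labels[m.group(1)] = len(instructions)
        else:
            instructions.append(("SAY", line))
    return instructions, labels


def run(instructions, labels, flags):
    """Execute to completion and return the lines that were spoken."""
    spoken, pc = [], 0
    while pc < len(instructions):
        op = instructions[pc]
        pc += 1
        if op[0] == "JUMP":
            pc = labels[op[1]]
        elif op[0] == "JUMP_IF":
            if flags.get(op[1], False):  # unset flags count as false
                pc = labels[op[2]]
        else:
            spoken.append(op[1])
    return spoken


SCRIPT = """\
# if cat is fed, go to dialog instead of shop menu
![CAT_FED, AFTER_FEEDING_CAT]
Welcome!
![END]
AFTER_FEEDING_CAT:
Back again?
END:
"""
instructions, labels = compile_script(SCRIPT)
not_fed = run(instructions, labels, {})
fed = run(instructions, labels, {"CAT_FED": True})
```

Collecting labels in the same pass as the instructions means forward jumps just work, since labels are only looked up at run time.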

As you can see, there’s a bunch of dialog sketched out by jammybread that was commented out at the time (because at this point, we didn’t yet have constructs powerful enough to present the pseudo rpg shop interface imagined, but now we would have the tools!).

Also introduced at this point

Given the new constructs, a way to set flags in dialog and a complementary addition to our question/response structure were also introduced.

@[SOME_FLAG_TO_SET] for setting a flag.

[Question][TARGET_LABEL] for jumping to a label given a choice, complements the [Question] Response form.
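One wrinkle with having both choice forms is that they start identically, so a line-based parser has to try the more specific pattern first. A small Python sketch of that ordering follows; the instruction names are assumptions for illustration only.

```python
import re

SET_FLAG_RE = re.compile(r"^@\[\s*(\w+)\s*\]$")               # @[FLAG]
CHOICE_JUMP_RE = re.compile(r"^\[([^\]]*)\]\[\s*(\w+)\s*\]$")  # [Question][LABEL]
CHOICE_SAY_RE = re.compile(r"^\[([^\]]*)\]\s*(.+)$")           # [Question] Response


def parse_line(line):
    """Classify one line; the order of the checks below matters."""
    line = line.strip()
    if m := SET_FLAG_RE.match(line):
        return ("SET_FLAG", m.group(1))
    # Must come before CHOICE_SAY: "[Q][LABEL]" would otherwise be read
    # as the choice "Q" with the literal response "[LABEL]".
    if m := CHOICE_JUMP_RE.match(line):
        return ("CHOICE_JUMP", m.group(1), m.group(2))
    if m := CHOICE_SAY_RE.match(line):
        return ("CHOICE_SAY", m.group(1), m.group(2))
    return ("SAY", line)


# Setting a flag is then just a write into the global flag dictionary.
flags = {}
op = parse_line("@[CAT_FED]")
if op[0] == "SET_FLAG":
    flags[op[1]] = True
```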

… moving on.

Multiple Characters? Multiple Characters.

Moving on a little bit, we had managed to mangle together a shop interface by abusing the aforementioned conditional gotos and labels, but soon hit yet another roadblock, which required us to introduce yet another feature.

Until now we’d assumed every dialog would have only two speakers, the player and whoever they’re talking to. This had held up for most of the game, but now that we wanted a scene where the player would meet a group of characters, treating the group as one “character” seemed insufficiently expressive.

If you paid attention at the very start of the article, you might already have an idea of what’s coming… and indeed.

Problematic Characters

[Image: the stooges] our problem, in the (pixelated) flesh

Our Solution

So now we needed a way for any given line of dialog to have associated with it which character was speaking it, for any number of speakers.

Having had a look at our syntax and the existing constructs (and which character each construct started with; remember, our parser is very simple), along with the idea of representing characters as a set of uppercase codenames (sort of like the labels in our DSL), the solution we came up with looked like so:

> There's a group of three delinquents.
> Talk to them?
<BIRB> What do you want, loser? Buzz off.
<DAIKON> Hey man...
<YOU> What's up?
<BIRB> Oi, you, don't ignore me!
<BIRB> And don't talk to him, you idiot!
<DAIKON> Hmm... Ok...
<BIRB> I'm down here! Move the damn camera down!

A little bit after introducing this new construct we also realised that the old construct with > ... text could be removed entirely in favour of this new construct, also replacing > ... text ... ? for the dialog-initiating question with a variant that took a subject but otherwise worked the same as before.

Another choice made in the dialog system was to consider the <SPEAKER_NAME> HELLO WORLD?! construct as changing the current speaker state when <SPEAKER_NAME> is included in the line of text. Knowing these two things the above text could be simplified to the following:

<BRAIN> There's a group of three delinquents.
<BRAIN> Talk to them?

<BIRB> What do you want, loser? Buzz off.
<DAIKON> Hey man...
<YOU> What's up?
<BIRB> Oi, you, don't ignore me!
And don't talk to him, you idiot!
<DAIKON> Hmm... Ok...
<BIRB> I'm down here! Move the damn camera down!
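The "sticky speaker" behaviour is easy to picture as a single variable that a <SPEAKER> prefix overwrites. Here is a Python sketch under that assumption; the function name attribute_speakers is made up, and the real system lives in GDScript.

```python
import re

SPEAKER_RE = re.compile(r"^<\s*(\w+)\s*>\s*(.*)$")  # <SPEAKER> text


def attribute_speakers(lines):
    """Pair every line of text with whoever is speaking it."""
    current, result = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if m := SPEAKER_RE.match(line):
            # A <SPEAKER> prefix changes the current speaker state...
            current, line = m.group(1), m.group(2)
        # ...and lines without a prefix reuse whoever spoke last.
        result.append((current, line))
    return result


attributed = attribute_speakers([
    "<BIRB> Oi, you, don't ignore me!",
    "And don't talk to him, you idiot!",
    "<DAIKON> Hmm... Ok...",
])
```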

Letting characters ~express themselves~

[Image: delinquent sheet] the spritesheet for our dear delinquent

A little later, we realised we had all these characters with different expressions but no way to have a character change expression mid-dialog, or in our case, whenever a given new line of dialog is executed.

As we figured this might be a common thing to do wherever a <SUBJECT> TEXT type expression occurs, we simply extended the syntax to accommodate a second form as well: <SUBJECT, EXPR> TEXT.

(EXPR is passed into the dialog system wherein it emits a signal to change the sprite if it finds an expression matching EXPR in the expression dictionary for the character SUBJECT in the character dictionary).
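Sketched in Python, with a plain callback standing in for the Godot signal, the lookup described above might look as follows. The CHARACTERS table, its frame numbers and the handle_say name are all hypothetical, invented for this example.

```python
import re

# <SUBJECT> text   or   <SUBJECT, EXPR> text
SAY_RE = re.compile(r"^<\s*(\w+)\s*(?:,\s*(\w+)\s*)?>\s*(.*)$")

# Hypothetical character dictionary: character -> expression -> sprite frame.
CHARACTERS = {
    "POMPADOUR": {"NORMAL": 0, "SMUG": 1},
}


def handle_say(line, on_expression_changed):
    """Parse a spoken line; notify the callback if the expression is known."""
    m = SAY_RE.match(line.strip())
    if m is None:
        return None
    subject, expr, text = m.groups()  # expr is None for the <SUBJECT> form
    expressions = CHARACTERS.get(subject, {})
    if expr is not None and expr in expressions:
        # Stand-in for emitting a signal that changes the sprite.
        on_expression_changed(subject, expressions[expr])
    return subject, text


events = []
record = lambda subject, frame: events.append((subject, frame))
smug = handle_say("<POMPADOUR, SMUG> It's only been like a week.", record)
plain = handle_say("<POMPADOUR> Sup, new kid. You skipping?", record)
unknown = handle_say("<POMPADOUR, ANGRY> Hey!", record)
```

Unknown expressions are silently ignored here, which is forgiving for the writer but also hides typos; a stricter version could report them instead.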

At this point, a bit of dialog might look a bit like this:

<BRAIN> Looks like he's skipping class.
<BRAIN> Talk to him?

# if we're past "Talk to him?", then set flag for having talked to him to true? or maybe just do it at end, do it here for now
@[TALKED_TO_POMPADOUR]

<POMPADOUR> Sup, new kid. You skipping?
<POMPADOUR, SMUG> It's only been like a week since you transferred too.
Already showing your true colours.
<POMPADOUR, NORMAL> Too bad everyone else at this school sucks, man.

some of pompadour’s dialog about 50 % through the project

The somewhat expected need for function calls

A little while later we’d come to see the need for camera adjustments and similar while in dialog, at which point it also became clear that creating specialised constructs for each and every little need in the dialog was going to be a fool’s errand.

(until this point, the camera would zoom in on a predefined point set in the scene in Godot once the dialog had started, but this proved inflexible should we want the timing to change, or further camera movements to occur)

So the time was then clearly ripe to create a generic function call syntax, so we could bind functions in GDScript-land and call them inline in our dialog script, for moving cameras or similar. Earlier we’d ripped out the syntax for the “initiate dialog” question which looked a bit like > Start Me?, which then forms a good entry point for using > as the leading character in our function call construct (as none of our other constructs start with >, it is then trivial for us to parse).

> CAMERA [POMPADOUR]
<BRAIN> Looks like he's skipping class.
<BRAIN> Talk to him?

example of moving the camera to focus on POMPADOUR when the dialog runs

Introduced constructs

> FUNCTION [ARG1, ARG2, ARG3, ...] functions with arguments

> FUNCTION functions which take no arguments

Introducing this construct opened the floodgates and made it possible to do many other things we’d otherwise have to implement a special case for in the DSL.

(... however, it did also introduce a fair few more possible failure cases and brought it ever closer to a “full” programming language, which is something we’d been trying to avoid, albeit unsuccessfully).
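Binding host functions to names in the script boils down to a lookup table. The Python sketch below shows the general shape; FunctionTable, bind and call_line are names invented for this example, and arguments are passed through as plain strings, which is an assumption rather than a description of our implementation.

```python
import re

# "> NAME" or "> NAME [ARG1, ARG2, ...]"
CALL_RE = re.compile(r"^>\s*(\w+)\s*(?:\[(.*)\])?\s*$")


class FunctionTable:
    """Maps names used in dialog scripts to functions in the host language."""

    def __init__(self):
        self._functions = {}

    def bind(self, name, func):
        self._functions[name] = func

    def call_line(self, line):
        m = CALL_RE.match(line.strip())
        if m is None:
            raise ValueError(f"not a function call: {line!r}")
        name, raw_args = m.groups()
        if name not in self._functions:
            # One of the new failure cases: the script names an unbound function.
            raise KeyError(f"unbound dialog function: {name}")
        args = [a.strip() for a in raw_args.split(",")] if raw_args else []
        return self._functions[name](*args)


log = []
table = FunctionTable()
table.bind("CAMERA", lambda char: log.append(("camera", char)))
table.bind("CAMERA_MID", lambda a, b: log.append(("camera_mid", a, b)))
table.bind("DISABLE_DIALOG", lambda: log.append(("disable_dialog",)))

table.call_line("> CAMERA [POMPADOUR]")
table.call_line("> CAMERA_MID [BIRB, DAIKON]")
table.call_line("> DISABLE_DIALOG")
```

A wrong argument count surfaces as an ordinary TypeError from the host language at run time, which is exactly the kind of error a smarter editor or a validation pass could catch earlier.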

To demonstrate the mileage we ended up getting out of introducing this in the end, presented are the functions we ended up exposing to the dialog script language for the end product (point is it came in useful, details aside):

# in game.gd, functions which are accessible by all kinds of dialog script,
#   and usually act on global/current scene related data.
PRINT [ARGS...]
ADVANCE_SUBPERIOD
EXIT_TO_OVERWORLD [SCENE]
ENTER_FROM_OVERWORLD [SCENE, TEXT]
SIMPLE_TRANSITION_RIGHT [SCENE, TEXT]
SIMPLE_TRANSITION [TARGET_SCENE, TEXT]
SIMPLE_TRANSITION [TARGET_SCENE, TEXT, FROM_SCENE]
PLAY_IDLE_THEME
PLAY_THEME [THEME_NAME]

# in dialog.gd, functions which are accessible only in the dialog,
#  usually act on the dialog object and the local scene associated.
SET_EXPR [CHAR]
SET_ZOOM [N]
SET_CAMERA [CHAR]
OFFSET_LIMIT_Z [N]
RESET_LIMIT_Z
CAMERA_MID [CHAR_ONE, CHAR_TWO]
CAMERA [CHAR]
ZOOM [N]
FLIP [CHAR]
HIDE [CHAR]
SHOW [CHAR]
WAIT [SECONDS]
WAIT_DOTFREE [SECONDS]
PLAY_ANIM [ANIM_NAME]
PLAY_ANIM_WITH_WAIT [ANIM_NAME, SECONDS]

# in popup.gd, functions which are accessible only in the popup form
ORIGIN [CHAR]
DISABLE_DIALOG
SET_WILL_TAKE_TIME

Moreover, grepping for the construct through the source tree, where the English dialog [8] lives in dialog/en, with grep -R "^>" dialog/en/* | wc -l gives us 151 occurrences out of 1234 total lines of dialog script source, so it does seem we got a decent return on investment!

Function calls that the Dialog VM must yield on

Moving on even further, we had started using the function call syntax in scripts to play back animations in Godot-land. Anyone familiar with Godot and its animation system can see how useful this might be in combination with the dialog system.

The problem you run into at this point is that the dialog does not necessarily advance at an even pace: it depends on the talking speed of the character you are talking to, whether the user is fast-forwarding, and so on.

The animation meanwhile simply plays when you fire it off (notwithstanding any trickery in the animation itself), and the user is free to continue clicking on to the next piece of dialog, which might in turn fire off another animation or event.

This is bad.

What was our first “solution”?

It was bad. Our first solution was introducing an explicit wait function, > WAIT [N], which blocked the dialog system from continuing for N seconds. However, N then needed to be hardcoded to every animation’s actual length and explicitly changed should the animation change.

Moreover, if it was an event which didn’t necessarily have a fixed length, > WAIT [N] was basically useless (and it was also too tightly coupled to the dialog system implementation).

Thinking it over, and with it a better solution

So we’d run into a situation where we wanted some action to finish before the user would be allowed to move on in the dialog.

Straight up blocking everything on this is obviously not an option even if it’d be the easiest one logically, but what if we can accomplish the same thing logically as far as our dialog script is concerned? We want to suspend execution at our current position and resume when it is appropriate, or in our case, when an animation/action has completed.

If you’re familiar with coroutines you might see what we’re getting at here and indeed, we’re basically looking at something similar here (those of you who are from a Godot context, see yield there).

Implementation Details

In our VM implementation, the next instruction is read by calling a step sort of function, which advances one instruction at a time. As each instruction means something we need to read back into our UI implementation, and maybe even display incrementally, it makes sense for the UI layer to drive when a new instruction is read (the UI calling into the VM, rather than the VM driving the UI).

Because of this, we can check what the return value of our step function on the VM instance is, and if it is still yielding on a function, we get nothing back until it is ready again.

Thus, a new construct was created:

>> SOME_YIELDING_FUNCTION executes the function, after which the Dialog VM suspends execution at its current position until the bound GDScript function emits a signal indicating it has completed.

>> SOME_YIELDING_FUNCTION [ARGS...] ditto, but with arguments
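To show the suspend/resume idea without any Godot machinery, here is a small Python sketch in which a callback plays the role of the completion signal. The DialogVM class, its instruction tuples and the way the callback is handed to the bound function are assumptions for illustration, not a description of our actual VM.

```python
class DialogVM:
    """Tiny VM: the UI calls step(); yielding calls suspend it until resumed."""

    def __init__(self, instructions, functions):
        self.instructions = instructions
        self.functions = functions
        self.pc = 0
        self.waiting = False

    def resume(self):
        """Called by the bound function (our 'signal') when it has completed."""
        self.waiting = False

    def step(self):
        """Return the next piece of dialog, or None if suspended or finished."""
        while not self.waiting and self.pc < len(self.instructions):
            op, *args = self.instructions[self.pc]
            self.pc += 1
            if op == "SAY":
                return args[0]
            if op == "CALL":          # "> NAME [ARGS]": fire and forget
                self.functions[args[0]](*args[1:])
            elif op == "CALL_YIELD":  # ">> NAME [ARGS]": wait for completion
                # Mark as waiting first, in case the function finishes at once.
                self.waiting = True
                self.functions[args[0]](self.resume, *args[1:])
        return None


pending, log = [], []
functions = {
    # The "animation" just remembers how to report that it has finished.
    "PLAY_ANIM": lambda done, name: pending.append(done),
    "HIDE": lambda char: log.append(("hide", char)),
}
vm = DialogVM(
    [("SAY", "Bye!"),
     ("CALL_YIELD", "PLAY_ANIM", "bear_drive_off"),
     ("CALL", "HIDE", "BEAR")],
    functions,
)
first = vm.step()        # the line of dialog
second = vm.step()       # animation started; VM is now suspended
third = vm.step()        # still suspended, nothing comes back
hidden_early = bool(log)
pending.pop()()          # the animation "emits its signal"
fourth = vm.step()       # HIDE runs only now
```

The UI can keep calling step as often as it likes while the animation plays; it simply gets nothing back until the VM has been resumed.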

In a snippet of real dialog script from the game it currently looks as such:

I'm gonna go home.
See if I can find my old telescope.
You cool?
...
...
<BEAR, HEH> See you again?
[We'll see.] ...Yeah.
[Give me free food from the conbini and I'll think about it.] <BEAR, SHOCK> You're supposed to be bribing me, not the other way round!
[Show me your telescope next time.] My-? ... Sure.
<BEAR, NORMAL> Cool.
Uh.
Bye!
>> PLAY_ANIM [bear_drive_off]
> HIDE [BEAR]
> DISABLE_DIALOG

Closing remarks

[Image: conbini, outside at night] this too, uses the dialog system

Would we do it again?

In hindsight, would we build the same thing again knowing what we do now?

… If it was back then, probably yes (with some improvements as discussed below)

… If it is today, then probably not: excellent tools [9] like Ink by Inkle Studios now exist, along with a Godot integration which is usable from both GDScript and C#!

But what if you still really want to build it though?

The massive waste of time aside, building it again today I would most likely write a full recursive-descent parser [10] and drop the line-by-line requirement, formalize the grammar in EBNF, and treat it like I would putting together a normal programming language. I would keep the grammar relatively close to the existing one but add Python-like indented blocks, à la Ren’Py, to make the goto construct much less necessary for nested operations and make branching narratives a lot more reasonable to write.

At least for choices, this could displace the clunky [Choice][TARGET_LABEL]

(potentially with something like the following)

[That goat sure is ugly.]:
  <GOAT> Wow, you sure chose well!
  > GOAT_KICK
[That goat sure is elegant.]:
  <GOAT> I see you understand culture when you see it.

instead of what we would have to do right now:

# jump to goat choices, past the block of conditional stuff
![GOAT_CHOICES]

GOAT_UGLY:
<GOAT> Wow, you sure chose well!
> GOAT_KICK
![AFTER_GOAT]

GOAT_ELEGANT:
<GOAT> I see you understand culture when you see it.
![AFTER_GOAT]

GOAT_CHOICES:
[That goat sure is ugly.][GOAT_UGLY]
[That goat sure is elegant.][GOAT_ELEGANT]
[...] <GOAT> ... ?

AFTER_GOAT:
# continue with whatever dialog here

(as a result of not thinking we needed it much initially)
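To give a feel for how little it takes to get nested blocks once the strict line-by-line rule is dropped, here is a Python sketch that handles only the indentation part of such a design. It is nowhere near a full recursive-descent parser with a formal grammar, and the node shapes (CHOICE, LINE) are invented for the example.

```python
def next_indent(lines, pos):
    """Indentation of the next non-blank line, or -1 at end of input."""
    for line in lines[pos:]:
        if line.strip():
            return len(line) - len(line.lstrip(" "))
    return -1


def parse_block(lines, pos=0, indent=0):
    """Parse the lines at `indent` into nodes; return (nodes, next_pos)."""
    nodes = []
    while pos < len(lines):
        line = lines[pos]
        if not line.strip():
            pos += 1
            continue
        depth = len(line) - len(line.lstrip(" "))
        if depth < indent:
            break  # dedent: this block is finished
        if depth > indent:
            raise SyntaxError(f"unexpected indent on line {pos + 1}")
        text = line.strip()
        pos += 1
        if text.startswith("[") and text.endswith("]:"):
            # A choice header owns the more deeply indented lines below it.
            child_indent = next_indent(lines, pos)
            if child_indent > indent:
                body, pos = parse_block(lines, pos, child_indent)
            else:
                body = []
            nodes.append(("CHOICE", text[1:-2], body))
        else:
            nodes.append(("LINE", text))
    return nodes, pos


source = """\
[That goat sure is ugly.]:
  <GOAT> Wow, you sure chose well!
  > GOAT_KICK
[That goat sure is elegant.]:
  <GOAT> I see you understand culture when you see it.
"""
tree, _ = parse_block(source.splitlines())
```

Each choice now carries its own body, so neither the labels nor the trailing jumps of the goto version are needed to express the same thing.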

But at the same time, with this would likely soon come the want/need to introduce more complex conditionals, and again, at this point you’re already better off just using something pre-existing.

Perhaps one could lean on allowing users to introduce arbitrary GDScript in their dialog scripts, but again, at this point you can be digging forever and at some point you do want to ship the game.

“If I could I would” type improvements/dreams

Another thing I wanted to do, but quickly realised was grossly out of scope, was to write a visual editor à la what Ink has for its language, where you can instantly see errors as you are writing the dialog. Ideally this would also validate that you are referencing characters/objects that really do exist, so as to avoid as many runtime errors as possible.

Other possible approaches include a node-based one, but personally I feel the flow of writing is very different when you’re staring at a page of text rather than a screen of floating GUI elements. This would probably be different if the game had needed more branching or interactive bits than it did.

What of the virtual machine itself and the implementation bits?

A second part may get written which talks about the dialog VM and more of the actual implementation details, somewhat separate of but related to the language, if that turns out to be something people want.

Thanks for reading, until next time!

Things we glossed over

There are many things in here that we glossed over merely to avoid creating a ten thousand word long tome, some of which include:

  • Most of the Dialog VM and its implementation details.

  • The actual structure of the (very simple) line-by-line parser.

  • The relation between the different pieces of the system and features such as fast-forwarding/automatic advance in game, which has more to do with the actual implementation/UI layer (the dialog controller, as well as the dialog UI implementation). The Dialog DSL + VM itself is concerned only with the discrete set of dialog pieces, as if they were instructions in a programming language; the UI layer extracts these pieces and advances to the next instruction in the VM once it is ready (for example if you want a typewriter-type effect).

  • How the dialog system (and its VM) was refactored continuously throughout development as a result of all these changes.

    • In fact, the Parser, VM, and UI implementation all lived in one file for the first half of development, until we needed to reuse parts in both the fullscreen dialog and the “popup text”. You can see the two different forms in the screenshots above.

[Image: conbini, annoyed bear] the random jump operator in use

  • The random jump operator we added to the language, yes really. Basically just jumped randomly to one of N labels passed as arguments!

    • Where did we use it, you ask? In the conbini: when you talked to them and asked too many questions, it’d jump to one of the available responses once you’d exhausted the options, so it seemed a little more organic!
  • Probably other things I forgot, if there’s anything obvious I’ve glossed over you think I should cover, do let me know and I will amend it!
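Since the random jump is so small, here is roughly what its core amounts to, sketched in Python. The operator's actual syntax is not shown in this article, so only the behaviour (pick one of N labels, continue from there) is modelled; the function name, the label names and the injectable rng parameter are assumptions.

```python
import random


def random_jump(labels, targets, rng=random):
    """Pick one of the target labels at random; return its instruction index."""
    if not targets:
        raise ValueError("random jump needs at least one target label")
    return labels[rng.choice(targets)]


# Hypothetical label table, as built by a parser that maps labels to indices.
labels = {"BORED_ONE": 3, "BORED_TWO": 7, "BORED_THREE": 9}
rng = random.Random(2018)  # seeded only so the example is repeatable
picks = [random_jump(labels, ["BORED_ONE", "BORED_TWO", "BORED_THREE"], rng)
         for _ in range(20)]
```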

Further Reading

1. Domain Specific Language, a language specialized for a domain, see c2 for discussion: http://wiki.c2.com/?DomainSpecificLanguage (Wikipedia also offers some insight but little discussion)

2. Virtual Machine, see the discussion on c2, as the definition is somewhat contentious: http://wiki.c2.com/?VirtualMachine

3. One of the bigger reasons this may not be a great idea is the difficulty of serializing state involving suspended coroutines. Usually the suspended state is not something you as a programmer can inspect and save/restore, outside of some systems that can image the entire running program (hello Smalltalk). If there are any others I’ve glossed over, I’m happy to hear about them!
(caveat: if you are working in a language with continuations you may be able to build coroutines on top of them, and achieve the ability to serialize them this way, but most languages don’t let you go here)

4. Godot offers a version with Mono integrated, but you’ll pay for it with an extra 30MB+, and only desktop platforms may be supported as of writing, whereas one of our goals was to target at least Android.

5. Not Invented Here, a popular pastime in the Godot community amongst core devs and users alike. See also on c2: http://wiki.c2.com/?NotInventedHere.

6. As we hadn’t written a proper parser (like my personal goto, a recursive descent parser) and a proper grammar, as one might for a proper programming language with expressions, statements etc, we basically just explicitly defined every possible case of what each line could look like. Recursive cases didn’t really exist past what we explicitly designed for, whereas a normal programming language might say “well, on the RHS goes some kind of expression”, which can reduce your burdens considerably (assuming your design is sound).

7. In Visual Novel contexts usage of the word flag is quite common (but really I just like the term), as tvtropes having a page on it clearly proves.

8. We originally planned to localize the game at least to English and Spanish, but things fell apart and the person we intended to help us out with the localization couldn’t, so we shipped with only English. The game is still built to handle localization though, so it may still happen! (unlikely though)

9. Ink from Inkle Studios, Twine, Yarn from NITW are but a selection of the many excellent tools available.

10. See 6.