The code behind Yandere Simulator from r/ProgrammerHumor
I am only going to talk about one commonly proposed solution to the excessive
use of if
statements in the code of Yandere Simulator and nothing else about its
development or any controversy surrounding Yandere Simulator.
Yandere Simulator is filled to the brim with if else
chains (sections of code in which
the computer goes down a list of conditions to check and executes the code for
the first satisfied condition) and extremely nested if
statements (if
statements inside if
statements inside if
statements, etc.), both of which
are bad practice.
With the exception of one or two people I've seen on the internet, most
criticisms that mention the excessive uses of if
statements in Yandere Simulator follow
the general idea that "long
if else
chains are a major
performance problem and they
should be replaced with switch
statements to
improve the performance." To be fair, some of these critics propose much
better solutions than switch
statements (specifically DarkDax and the Reddit
post), but I will focus specifically on the idea that using switch statements
will improve the performance significantly. For the same reason, I'll also
ignore subjective arguments about style and readability.
In the code for the function UpdateRoutine()
, the largest and most complicated function
that will run for around a hundred students every frame in the largest file in
the project, StudentScript.cs
, there are around
if
statements (not including else if
and the if
statements in loops)
else if
statements
else
statements
if
statement each
Within UpdateRoutine()
, there are many instances of code that look like
if (this.EatingSnack) { if (this.SnackPhase == 0) { this.CharacterAnimation.CrossFade(this.EatChipsAnim); this.SmartPhone.SetActive(false); this.Pathfinding.canSearch = false; this.Pathfinding.canMove = false; this.SnackTimer += Time.deltaTime; if (this.SnackTimer > 10f) { UnityEngine.Object.Destroy(this.BagOfChips); if (this.StudentID != this.StudentManager.RivalID) { this.StudentManager.GetNearestFountain(this); this.Pathfinding.target = this.DrinkingFountain.DrinkPosition; this.CurrentDestination = this.DrinkingFountain.DrinkPosition; this.Pathfinding.canSearch = true; this.Pathfinding.canMove = true; this.SnackTimer = 0f; } this.SnackPhase++; } } else if (this.SnackPhase == 1) { this.CharacterAnimation.CrossFade(this.WalkAnim); if (this.Persona == PersonaType.PhoneAddict && !this.Phoneless) { this.SmartPhone.SetActive(true); } if (this.DistanceToDestination < 1f) { this.SmartPhone.SetActive(false); this.Pathfinding.canSearch = false; this.Pathfinding.canMove = false; this.SnackPhase++; } } else if (this.SnackPhase == 2) { this.CharacterAnimation.cullingType = AnimationCullingType.AlwaysAnimate; this.CharacterAnimation.CrossFade(this.DrinkFountainAnim); this.MoveTowardsTarget(this.DrinkingFountain.DrinkPosition.position); base.transform.rotation = Quaternion.Slerp(base.transform.rotation, this.DrinkingFountain.DrinkPosition.rotation, 10f * Time.deltaTime); if (this.CharacterAnimation[this.DrinkFountainAnim].time >= this.CharacterAnimation[this.DrinkFountainAnim].length) { this.CharacterAnimation.cullingType = AnimationCullingType.BasedOnRenderers; this.DrinkingFountain.Occupied = false; this.EquipCleaningItems(); this.EatingSnack = false; this.Private = false; this.Routine = true; this.StudentManager.UpdateMe(this.StudentID); this.CurrentDestination = this.Destinations[this.Phase]; this.Pathfinding.target = this.Destinations[this.Phase]; } else if (this.CharacterAnimation[this.DrinkFountainAnim].time > 0.5f && this.CharacterAnimation[this.DrinkFountainAnim].time < 1.5f) { this.DrinkingFountain.WaterStream.Play(); } } }
With just the if
, else
, and else if
statements, you get a structure that
looks like
if (this.EatingSnack) { if (this.SnackPhase == 0) { // Code if (/*time in range*/) { // Code if (/*check if student is rival*/) { // Code } // Code } } else if (this.SnackPhase == 1) { // Code if (/*Check details of current object*/) { // Code } if (/*Check distance in range*/) { // Code } } else if (this.SnackPhase == 2) { // Code if (/*time in range*/) { // Code } else if (/*time in range*/) { // Code } } }
A problem with the above code might jump out at you: if this.SnackPhase
is 2, then it
can't also equal 0 or 1, but you have to check all of them. You might be
thinking it may make more sense to get the value of this.SnackPhase
and then execute the
code that corresponds to the case where this.SnackPhase
is 2. You can do using the
titular switch statement, like so
if (this.EatingSnack) { switch (this.SnackPhase) { case 0: // Code if (/*time in range*/) { // Code if (/*check if student is rival*/) { // Code } // Code } break; case 1: // Code if (/*Check details of current object*/) { // Code } if (/*Check distance in range*/) { // Code } break; case 2: // Code if (/*time in range*/) { // Code } else if (/*time in range*/) { // Code } break; default: break; } }
A switch
statement will tell the computer to map its input to relevant
locations (memory addresses of specific instructions) in the code through a jump
table. For example, the computer sees this.SnackPhase
is 2, looks at the third entry in
the jump table (0 is the first entry), and moves to the specified location (i.e.
where the case 2:
line is, though I'm simplifying slightly).
In this specific case, you might not see much benefit because switch statements
have to do a little more work up front to do what they need to do, but they will
execute within a constant amount of time. On the other hand if else
chains
have little start up time (an instruction to compare and an instruction to
branch per if
statement), but the amount of time they take to run grows
linearly with the number of cases (double the cases means around double the
time).
Switch
Statements?As far as I can tell, none of the people who said to use switch
statements
instead of if else chains tested how much faster switch
statements are,
so I took it upon myself to get a decent estimate.
I wanted to see how much performance I could get out of a switch
statement vs
an if else
chain, so I wrote some C
code (I already have C
set up on my
computer and compilers don't need to do much to convert if
and switch
statements to optimal machine code, so it should be a good estimate) that ran a
large number of iterations in which the program would pick a random number from
a list of 16 specified numbers (partly to avoid the if
statement using branch
prediction), where the code looked like
int array[] = { 1, 3, 5, 6, 8, 12, 23, 30, 44, 56, 78, 88, 94, 97, 98, 99 }; int dummy_variable = 0; // start timer for (unsigned long long i = 0; i < num_iterations; i++) { value = array[rand() & 0b1111]; if (value == 1) { dummy_variable = 1; } else if (value == 3) { dummy_variable = 3; } ... } else if (value == 99) { dummy_variable = 99; } } // end timer // Do something with dummy_variable so it doesn't optimize the for loop out // entirely // start timer for (unsigned long long i = 0; i < num_iterations; i++) { value = array[rand() & 0b1111]; switch (value) { case 1: dummy_variable = 1; break; case 3: dummy_variable = 3; break; case 5: ... case 99: dummy_variable = 99; default: break; } } // end timer // Do something with dummy_variable so it doesn't optimize the for loop out // entirely // start timer for (unsigned long long i = 0; i < num_iterations; i++) { dummy_variable = rand() & 0b1111; } // end timer // Do something with dummy_variable so it doesn't optimize the for loop out // entirely
I also added another timed loop that just ran the random number generator in a
loop so that I could subtract out the cost of looping and random number
generation. After running the test, I found that the switch
statement
consistently ran around 20 times faster.
Switch
StatementsThe if else
chain took an average of around 24.5 nanoseconds to execute per
iteration and around 8 if
statements are hit on average per iteration (it's
equivalent to a linear
search so each individual if
statement took around 3 nanoseconds per case
while the switch
statement took around 1.25 nanoseconds for any number of
cases. For a game to run at 120 FPS, it has to do everything it needs to do for
the next frame within 8.3 milliseconds. For if
statements to take up a
significant amount of that time (say 5%), over 130,000 if
statements would
have to be executed on at least one thread every frame.o
As I said earlier, UpdateRoutine()
has around 1000 if
statements and there are around
100 students, so you might be thinking that we've accounted for around 100,000
if
statements. This reasoning fails to consider that the computer will execute
relatively few if
statements in the function. For example, if we look back at
the sample code from UpdateRoutine()
, you should notice that the inner if
statements
won't execute unless this.EatingSnack is true. If students are only eating a
snack 5% of the time, we've removed around 1,000 if
statements per frame.
Assuming each if
and else if
statement has a 50% of being true (a pretty
good estimate for an upper bound since I don't have any prior knowledge of the
probability and individual if
statements in an if else chain have a lower
chance of being true), I get around 63 if
and else if
statements evaluated
on average for the entire function, meaning the computer will only evaluate
6,300 if
statements per frame in UpdateRoutine()
for all the students. Given that
UpdateRoutine()
likely has the most if
statements out of all the functions that execute
frequently and it is guaranteed to run for around 100 different entities, I
doubt most other frequently executing functions would even come close to the
upper bound 6,300 if
statements of UpdateRoutine()
, meaning that replacing all the
if
statements with switch
statements wouldn't come close to netting a 5%
performance boost even at 120 FPS. At the current ~20-50 FPS Yandere Simulator runs at,
it wouldn't even increase the FPS by a single frame.
If your program was a long chain of if else
statements in a loop, then sure,
but most of your program's time isn't going to be in evaluating if
statements. If you want
to make your code run fast, you need to optimize the slowest parts of your
program first. As an example, if you have one function that takes up 1
second and another function that takes up an hour, a 50% speedup in the first
task will only save you half a second but a 50% speedup in the second task will
save you half an hour.
I would honestly go as far as to assume that using switch
statements for
performance is a premature optimization unless you could prove that you would
get significant performance benefits, and even then you can replace them with an
array or a map/dictionary most of the time. In my switch
vs if else
test
program, I had another test where I get the value directly from an array, like
so
// start timer for (unsigned long long i = 0; i < num_iterations; i++) { dummy_variable = array[rand() & 0b1111]; } // end timer // Do something with dummy_variable so it doesn't optimize the for loop out // entirely
The above code saves me 50 lines of code and the direct array access runs around
2.5x faster than the switch
statement.
If you need more proof, YouTuber dyc3 profiled the code (and
did an entire code review with deeper analysis and suggestions about coding
architecture) and proved that the entire StudentScript.Update()
function
(which includes the function we looked at, UpdateRoutine()
and every other
update function) took less than a millisecond, which would be around 5% of the
runtime at 50 FPS. Rendering poorly optimized assets took far more time than
anything else, with poorly optimized physics, pathfinding, and UI interactions.
dyc3 didn't stop there, however. He
went through and replaced as many of the of the if else
chains as possible
with switch
statements and only decreased the time by 80 microseconds.
The game would have to be running at around 600 FPS for that improvement to be
significant (above 5%).
To be clear, the overuse of if
statements is a major problem for
maintainability and architecture (specifically the unnecessary coupling of data
and code), not necessarily performance. For example, let's analyze a top
competitor for the most infamous section of code in Yandere Simulator, the main body of
SubtitleType TaskLineResponseType()
in StudentScript.cs
.
if (this.StudentID == 6) { if (!false) { return SubtitleType.TaskGenericLine; } return SubtitleType.Task6Line; } else { if (this.StudentID == 8) { return SubtitleType.Task8Line; } if (this.StudentID == 11) { return SubtitleType.Task11Line; } if (this.StudentID == 13) { return SubtitleType.Task13Line; } if (this.StudentID == 14) { return SubtitleType.Task14Line; } if (this.StudentID == 15) { return SubtitleType.Task15Line; } if (this.StudentID == 25) { return SubtitleType.Task25Line; } if (this.StudentID == 28) { return SubtitleType.Task28Line; } if (this.StudentID == 30) { return SubtitleType.Task30Line; } if (this.StudentID == 36) { return SubtitleType.Task36Line; } if (this.StudentID == 37) { return SubtitleType.Task37Line; } if (this.StudentID == 38) { return SubtitleType.Task38Line; } if (this.StudentID == 52) { return SubtitleType.Task52Line; } if (this.StudentID == 76) { return SubtitleType.Task76Line; } if (this.StudentID == 77) { return SubtitleType.Task77Line; } if (this.StudentID == 78) { return SubtitleType.Task78Line; } if (this.StudentID == 79) { return SubtitleType.Task79Line; } if (this.StudentID == 80) { return SubtitleType.Task80Line; } if (this.StudentID == 81) { return SubtitleType.Task81Line; } return SubtitleType.TaskGenericLine; }
Remember that the long chain of if
statements is equivalent to a chain of if
else
statements since the return
statements will exit out as soon as one of
the if
statements are satisfied.
It would be best to separate our analysis into different levels that take progressively more information into account.
Look for the specific feature of the language that would make this code faster
without considering anything else. In this case, replace the if else
chain with
a switch
statement, netting you a few microseconds (making this replacement a
literal micro-optimization) for maybe 30 seconds of your time. People who have
suggested using switch
statements without first proposing architectural issues
have gotten stuck here.
Look for the specific feature of the language that would make this code faster
without considering anything else. In this case, replace the if else
chain with
a switch
statement, netting you a few microseconds (making this replacement a
literal micro-optimization) for maybe 30 seconds of your time.
Add the intended goal of this code into your analysis. Since you just want each
student to have the proper SubtitleType
, get the rid of the IDs entirely and
create a class called Student
with the field task_line
with the default
value SubtitleType.TaskGenericLine
. When you want to know the SubtitleType
for the student, ask the student with student.getSubtitleType()
. Doing so
removes unnecessary coupling between the task_line
data for all students and
some external function and gets rid of any operation except fetching a value
from a known memory address. Alternatively, you could also use the fact that
enums can be converted to integers and vice versa to make this entire function
unnecessary.
Add the work of other people into your analysis. Third-level analysis is
sufficient for TaskLineResponseType()
, but the code of Yandere Simulator needs a small
amount of fourth-level analysis, which would get rid of most of the if
statements and clean up the code quickly and easily. Specifically, he's
implementing a Finite
State Machine, which have been implemented
countless
times
in almost every language (all three links are in C#
or work with C#
and have
existed for years) and explained in multiple tutorials. Unlike these other FSMs,
which generally use dictionaries/arrays that map states to functions and other
states, the FSM in Yandere Simulator is built implicitly with if statements, which can
easily become unmaintainable as it loses any sense of regularity and can often
lead to massive amounts of code duplication.
switch
or by manually unrolling loops. Instead, good code consists of using the right
tools for the job, maintaining a modular architecture, keeping data and code
separate, giving things descriptive names, etc.
if else
chains are usually an architectural issue, but rarely a performance
issue. switch
statements generally make minor improvements to performance
without fixing the underlying architectural issue, though they can be required
in languages without objects, such as C
.