
Gap Buffers
25 minute read
Posted by Ryan Large
July 18, 2024
Intro
Today, let's talk about gap buffers
If you have never before heard of a gap buffer than you're in the right place! I knew nothing at all about them years into my programming career, until one day I felt like re-inventing the wheel.
This all started a few days into a new blog application I was creating for one of my clients. Their demands were quite reasonable, and their main focus was having a great text editing experience.
Rich Text Editor Libraries
For starters I immediately went to libraries. I mean who wouldn't? My tech stack was something like this...
- React
- Sanity.io (a headless CMS)
- And that's it!!....
So immediately the search was on. I checked out React Quill, React Rich Text Editor, Draft.js and countless others before settling in on a library called jodit-react .
This was a great experience for me. I was learning a new library specific skill, and creating something cool. This library came with all of the fancy bells and whistles straight out of the box. Plus or minus a few paid version features.. And there was the problem.
Is it worth the time?
To be fair to myself, there were more problems than needing to pay money for specific library features I was looking for, but also customization such as custom handles when a button was clicked in the editor, or how the editor looked.
All of these things were possible. But I am now a week into this project at the time and need an entirely new set of hours that I can dedicate to learning how to customize the library further, time I simply did not have and if I did I thought, how much more time would it have taken to build one on my own?
What Did I Learn?
I also considered what I learned during the experience when it was finally all said and done. There are hundreds of libraries especially in the JavaScript ecosystem and some of them are very useful and powerful tool which can save you a ton of time. But not all of them.
Some of them you might end up using once and never again. This was the case for me. The knowledge I took with me after learning how to integrate jodit-react into a web app was very specific, non versatile and I continued my software journey without ever using that library specific knowledge again
My Own Text Editor
So the journey continued and I officially decided that I needed more control over what technology I am using in my web apps. If I could spend weeks at a time getting truly good at implementing a library into my apps, I can take the time to build me own and gain a great deal of understanding when it comes to the lower level logic behind a text editor.
At first I had no idea so much went into the workings of a text editor. Regardless of the language. I researched and watch a few videos here and there showing me how to turn an HTML5 text area into a custom text editor with JavaScript. But that just circles back around to the same issue I started with. Lack of control.
So instead I continued to look around at a few developer posts and I started hearing about this mysterious gap buffer. I had no idea what it was as I had never heard of it before. So, I began to research the gap buffer.
Data Structures && Algorithms
What the heck is a Gap Buffer!? I asked myself. Well, turns out it is like any other kind of object in coding. We use the tools we have to create a little machine or a small enclosed system dedicated to a specific task. Like a mini program or structure. A gap buffer I came to find out was exactly that. It was a collection of methods that worked in a specific way to manipulate a certain kind of data. It was not without a ton of reading and searching before I could really have a good idea of how the inner workings of this data structure worked together to become living program that reacted to events. No good You Tube videos out there or papers on the subject really explaining how it worked, but with a ton of hopping back and forth and patience I was able to figure out how to put it all together and realized how it could fit into the heart of a larger program such as a text editor.
You see one of the draw backs of computers is that naturally they like to do as much work as possible. It is easy for them to do too much work. I mean they don't have the skills of taking a step back to view the larger picture all at once and use deductive reasoning to reach an answer swiftly. No computers can only see one thing at a time. If you want to find the largest number in a list you need to look at every single number in that list! No matter what. One by one it needs to be compared with the current largest number to see if it beats it. Well, see with that in mind while considering the fact that we intend to build ourselves a text editor, we seem to have ran into a little problem. I mean luckily computers are fast as heck. But when you truly start to understand how much work they are doing for simple tasks you consider the fact that maybe reprinting two hunderd and fourty eight thousand nine hundered a thirty four letters to the screen each and every time a letter is typed might become a bit taxing for it if not done in a more efficient way.
But that is not all and not painting the best picture. Let's step back and consider what it might take to build a text editor. Now there is certainly more than one way to handle editing, saving and printing large amounts of text, but we are sticking with a specific way. . .
A list
I hope if you are reading about Gap Buffers you know how we can reprisent lists in programming. Such as with an array. That is what I would like to use for this text editor, because as we all know with a text editor we need to be able to create, read, update, and delete text. So we need a way to store our text in memory and maybe in groups or as a long text string, which is just an array or text under the hood right? So let's go with that. An array. We will store our text in an array that will hold our characters at each index. Then we can loop over that array to print our text and we can remove characters from the array when we delete them, and we can append characters when we type them to the array or even prepend characters when we need to edit. Arrays are very easy to work with and computers are fast. But what about when our array is 500,000 characters long and we needed to add a letter to the first word of our paper? What will happen? Does anyone know what the computer will need to do?
Did you come up with a few answers? Well, here is what I noticed. I noticed that in order for us to add a character to the begining of the array we would neeed to literally move all 500,000 letters after it forward one spot in memory. Wholey freaking cow. You mean we would literally need to move every single letter after the index we are editing to the next index of the array one by one? Yes that is what I am saying. Why can we not just move the characters on the left over instead? Well, that sounds like a very intellegent solution accept, the fundamentals or every programming language and how they work with the very bare bones of our memeory system do not allow for that. We cannot then index into our first element with. . .
array [ -1 ]; .js
It does not work, and neither does -2 or -3 and so on. So we are literally foreced to phisycally move each character in memory over that is to the right of us. That is un-believable! And so much work! Each 8 bit memory slot will be wiped clean just to be replaced by the 8 bits to the left and god forbid the allocated space did not end right were a currently used up chunck of memory is. That means our computer will have no choice but to find a new spot to allocate our array to in memory that does have a long enough stretch of contiguous space that we can fit our new array and oh my gosh that means we need to copy the old one with our new update and paste it into new memory slot and then delete the old one so that memory can be freed up again!
Now we are getting into some serious computational weight and we can notice it because of the lag we experience as we edit. I mean come on, we are writting novels with our text editor right? How can we make something useful when we advertise that our program can only handle a few pages without beginning to lag and cause frustration!?
That is when we decide that an array is still a great option, but maybe we can turn it into a super array or an array of super arrays that have some special properties so our performance issues significantly decrease or maybe even dissapear completely.
Gap Buffer
Now, Gap Buffers will solve this exact problem for us while also allowing for an easy way to implement some other fancy things into our program as well but in a more convenient and safe way than otherwise. But, we will cover more on that later. For now, let us discuss what a Gap Buffer actually looks like and then we can dive into how it solves our performance problem.
What Is A Gap Buffer?
Before I paste the crappy code I created from scratch as my own first Gap Buffer implementation I will give you a quick overview. The issue with a single array holding all of our information is that quickly we have problems with memory allocation involving relatively large amounts of data and very often. This causes lagging and bad performance. So we consider this problem and use the Gap Buffer to mitigate it mainly by its power to manage a large list of characters as a whole even when split up into multiple lists and allocated into different parts of memory so now any given part of our text being edited is only one of many pieces of the whole puzzle significantly reduced in size so taxing operations are done only on a small portion of the string rather than the whole 500,000 characters in the worst case. Let's visualize with code.
text = ["R", "y", "a", "n"]; .js
Above is our array of character which will be printed as "Ryan" in our editor. Let's add "Hello,"" to the beggining so our text editor would then print out "Hello, Ryan" instead. Of course using code that is quite simple and we have many ways of doing it. Let us keep it simple.
const addition = "Hello,";
for (let i = 0; i < addition.length; i++) {
text.shift(addition[i]); } .js
In this example code we are manually inserting the characters that make up "Hello, " one by one as we would if we were to simply type this string into a program such as a text editor. The value of viewing this small snippet is simply to show us that using arrays is easy and with that we can quickly create ineffecient programs. What we do not see in this code unless you have spent time in a deeper level of programing is that behind the scense we are moving each and every character in memeory to a new location, basically copying and pasting the data making up the current string at each iteration and deleting old data.
We must remember that this is simply how computers do things. They take certain steps a certain size and a certain way every single time to create a very logical, reliable and consistent computation. It is our job as programmers to make sure that the process is carried out responsibly and is managed in a way that we can keep our computational work load managable for the computer even if it can in fact process billions of tasks a second, it would suprise you how quickly that adds up.
So our job in the task of building a text editor is not so much how to change the fundamentals of computation and how our operating system allocates memory, but instead to understand how it allocates memory and know our limitations. So now we know its not the computer or memory allocation logic that is the problem it is that when our string becomes production size or extraordinarily long we will have massive slow downs when making edits such as in the above code because of the way our operating system needs to manage memory. So the solution for our issue here is to split up the string into managable sizes so any given edit only manages a string that the computer is built to handle in a reasonable time.
Implementation
Now, I am going to paste in some crappy code. This code is my own implementation of a Gap Buffer. And below I will go into details about how each method fits together as a whole to make what we call the Gap Buffer data structrue.
class Buffer {
constructor() {
this.buffer = new Array(10).fill(" ");
this.gapStart = 0;
this.indexState = 0;
this.gapEnd = 10;
this.newLines = 0;
this.currentLineCount = 0;
}
moveGap(index) {
if (index < this.gapStart) {
const offset = this.gapStart - index;
this.buffer.splice(
index,
0,
...this.buffer.splice(this.gapStart, this.gapEnd - this.gapStart)
);
this.gapStart = index;
this.gapEnd -= offset;
this.indexState = index;
} else if (index > this.gapStart) {
const offset = index - this.gapStart;
this.buffer.splice(
index,
0,
...this.buffer.splice(this.gapStart, this.gapEnd - this.gapStart)
);
this.gapStart = index;
this.gapEnd += offset;
this.indexState = index;
}
}
expandBuffer() {
const addSize = 10;
const newBuffer = new Array(addSize).fill(" ");
this.buffer.splice(this.gapEnd, 0, ...newBuffer);
this.gapEnd += addSize;
}
insert(c, index) {
if (index < 0 || index > this.buffer.length) {
return;
}
if (this.gapStart !== this.gapEnd) {
this.moveGap(index);
}
if (this.gapStart === this.gapEnd) {
this.expandBuffer();
}
if (c === "\n") {
this.currentLineCount = 0;
this.newLines += 1;
} else {
this.currentLineCount += 1;
}
this.buffer[this.gapStart] = c;
this.gapStart++;
this.indexState++;
}
erase(index) {
if (index <= 0 || index > this.gapEnd) {
return 0;
}
if (this.gapStart != this.gapEnd) {
this.moveGap(index);
}
const c = this.buffer[this.gapStart - 1];
if (c === "\n") {
this.newLines -= 1;
let currentLineLength = 0;
for (let i = this.indexState; i > 0; i--) {
currentLineLength++;
if (this.buffer[i] === "\n") {
break;
}
}
this.currentLineCount = currentLineLength;
} else {
this.currentLineCount -= 1;
}
this.buffer[this.gapStart - 1] = " ";
this.gapStart--;
this.indexState--;
return 1;
}
print(withCursor) {
let beforeGap = this.buffer.slice(0, this.gapStart).join("");
let afterGap = this.buffer.slice(this.gapEnd).join("");
if (withCursor) {
beforeGap += `<span class="cursor">|</span>`;
}
return beforeGap + afterGap;
}
printRaw() {
return this.buffer;
}
printSelection(start, end) {
console.log(this.buffer.slice(start, end));
}
getCurrentPos() {
return this.indexState;
}
getGapEnd() {
return this.gapEnd;
}
movePosBack(amount) {
if (this.indexState - 1 < 0) {
this.indexState = 0;
return;
}
this.moveGap(this.getCurrentPos() - amount);
}
movePosForward(amount) {
if (
this.indexState + amount >
this.buffer.length - (this.gapEnd - this.gapStart)
) {
return;
}
this.moveGap(this.getCurrentPos() + amount);
}
getSize() {
return this.buffer.length;
}
}
export default Buffer; .js
First thing is first, especially if you are not the most patient or advid code reader on the Earth, do not let this long piece of code scare you off from Gap Buffers. Let us break it down from top to bottom, method by method and wrap it all up with the big picture or in other words, the Gap Buffer class.
That large block of code up there is a class. A JavaScript class to be exact. This class acts as a reusable container. And if you are familiar with code we can use this container of code and access all of the methods in it by creating a new Gap Buffer container like so.
gapBufferOne = new GapBuffer; .js
That was easy, I must say. Now what is gapBufferOne? We set it to a new instance of our Gap Buffer class. This means that we litterally have managed to take all of that Gap Buffer code above and put it inside of our gapBufferOne variable. Heck let's do it again. . .
gapBufferOne = new GapBuffer;
gapBufferTwo = new GapBuffer; .js
Now we have two sets of that long code snippet above stored in two different variables!! This is cool. It's almost like we built a mini program and packaged it into those two names, "gapBufferOne" and another copy of the program into "gapBufferTwo". Like two chrom browser instances open at the same time! Are you picking up what I am putting down here?
Gap Buffer Class Constructor
Inside of this class constructor at the top of the class we have a few variables mostly for keeping track of certain information and quick access to data that pertains to our string array. This data includes the array we will be keeping our characters in. Observe that we initialize it to have 10 8-bit memory slots preserved to us for this array which will virtually garuntee us the space required to fill it with 10 characters and have no need to reallocate memory from attempting to occupy an already occupied piece of memeory
this.buffer = new Array(10).fill(" "); .js
Underneither this line of code we have a few more variables.
this.gapStart = 0;
this.indexState = 0;
this.gapEnd = 10;
this.newLines = 0;
this.currentLineCount = 0;
.js
Let us quickly review these as they are important for tracking information about how, where and how much of the array we are editing.. eg. creating, updating, deleting characters within the array.
gapStart
gapStart tells us where in the array our cursor is currently at, where our users updates will take place at
indexState
indexState is similar to gapStart by storing the information of where our cursor currently is within the string array of characters in our buffer. But we give it another name so we can use it for clarifying purposes within our implementation
gapEnd
gapEnd tells us the length of the gap buffer array we created
newLines
newLines variable is there to help us keep track of new line characters within out buffer for ease of use when rendering the buffer. not relavent
currentLineCount
Another variable that is not relavent in this article for rendering purposes in my text editor code.
So we understand now that the important information we need to have for a gap buffer implementation is an array with a specified amount of space allocated in memory. Where our cursor is, and how long the array is. I would say so far that is pretty simple. This is all the information that is needed for us to create some amazing methods for manipulating the character array depending on what action the user takes!
Move Gap
moveGap(index) {
if (index < this.gapStart) {
const offset = this.gapStart - index;
this.buffer.splice(
index,
0,
...this.buffer.splice(this.gapStart, this.gapEnd - this.gapStart)
);
this.gapStart = index;
this.gapEnd -= offset;
this.indexState = index;
} else if (index > this.gapStart) {
const offset = index - this.gapStart;
this.buffer.splice(
index,
0,
...this.buffer.splice(this.gapStart, this.gapEnd - this.gapStart)
);
this.gapStart = index;
this.gapEnd += offset;
this.indexState = index;
}
}
.js
Here is one of the most important methods of the gap buffer. This method allows us to move around an imaginary cursor in our array of character the "gap" and if we were to make some kind of edit, where in our array of characters this update would take place.