As part of my QtPlex project, I've learned a ton about how computers handle key events at a lower level, specifically on X11-based systems like most Linux distributions - although many of the concepts carry across operating systems and window systems. In the pursuit of that knowledge, I also discovered some interesting tidbits about the history of X11 and key event handling. This article provides a brief introduction to key events, how they're handled in X11 and the long history of their implementation and development.
What Are Key Events?
In layman's terms, a key event is just a signal that the user has pressed some button, or combination thereof. How we translate the physical button to the signal is where things get weird.
Each key press, regardless of whether it was sent by a programmable mouse, keyboard or Bluetooth headset, eventually maps over to some specific, pre-defined encoding. If you ever make the foolish decision to create a Linux application which listens for a specific key to be pressed, you'll first need to find the code (or codes) which correspond to that key of interest. Peripheral key encoding is like the more annoying stepbrother of character encoding, and I'd advise you to just stop reading now.
A Passing Glance at Key Events
Which system is responsible for defining, handling and emitting key events varies across environments. For most Linux-based systems, we can look to the X Window System, version 11, aka X11.
The X Window System standard was first conceived at MIT in the mid-1980s. It set out to provide an open-source and standard way for applications to interact with users. This interaction was mostly through graphical user interfaces, but the standard also defined interactions with peripherals such as keyboards and mice. And this is where the "fun" begins!
On any given system, physical keys on the keyboard map over to specific, pre-defined keycodes. These are relatively dumb values that essentially treat the keyboard as a grid of physical key positions. For instance, compare German (QWERTZ) and US (QWERTY) keyboards, where the 'Y' and 'Z' keys swap places.
In the X11 world, the English 'Z' key would have the exact same keycode as the German 'Y' key, as they are both in the same physical position. These keycodes are single-byte numbers that are baked into the X11 server, and on their own don't do us much good. What we're really after is what are known as keysyms - the logical mapping of keycode events into their "actual" meaning to the end user. These are the events we think of as key strokes, such as space or alt. Combined with the modifier state, they also give us combinations like ctrl+c and ctrl+v.
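The keycode/keysym split can be sketched in a few lines of Python. This is a toy model, not how Xlib actually performs the lookup: the layout tables and the keycode_to_keysym helper are hypothetical, and the keycodes 29 and 52 are assumed values of the kind xev typically reports on evdev-based setups (your machine may differ). The keysym values for y and z are the ones from keysymdef.h.

```python
# Toy model of the keycode -> keysym lookup (not the real Xlib code).
# The keycodes below are assumptions; check your own values with xev.
KEYCODE_RIGHT_OF_T = 29   # physical key to the right of 'T'
KEYCODE_LEFT_OF_X = 52    # physical key to the left of 'X'

# Keysym values from keysymdef.h (Latin letters match their ASCII codes).
XK_y = 0x0079
XK_z = 0x007A

# Hypothetical per-layout tables: same keycodes, different keysyms.
LAYOUTS = {
    "us": {KEYCODE_RIGHT_OF_T: XK_y, KEYCODE_LEFT_OF_X: XK_z},
    "de": {KEYCODE_RIGHT_OF_T: XK_z, KEYCODE_LEFT_OF_X: XK_y},
}


def keycode_to_keysym(layout: str, keycode: int) -> int:
    """Translate a physical keycode into a keysym for the given layout."""
    return LAYOUTS[layout][keycode]


if __name__ == "__main__":
    for layout in ("us", "de"):
        keysym = keycode_to_keysym(layout, KEYCODE_LEFT_OF_X)
        print(f"{layout}: keycode {KEYCODE_LEFT_OF_X} -> keysym 0x{keysym:04x}")
```

The physical key never changes its keycode; only the table consulted for the active layout decides which keysym the application ends up seeing.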
The History of Keysyms
The Birth
The most basic of these key event definitions on X11 are "keysyms," defined in keysymdef.h. If you have the x11proto-core-dev package on your machine, you can head over to /usr/include/X11/keysymdef.h to take a peek at the source. Look at them, all cute, cozy and orderly! There is absolutely no indication of technical debt or 33-year-old software atrophy to be found.
If you don't believe me, just read the multi-paragraph comment at the top of the 2,500-line header file. It tells the treacherous tale of deprecation, Unicode compliance and one-to-many keycode mappings.
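To give a flavor of the Unicode part of that tale, here is a minimal sketch of the two conversion rules described in that header comment, as I understand them: printable Latin-1 keysyms equal their code points, and other Unicode characters can be encoded as the code point plus 0x01000000. The function name keysym_to_char is my own, and legacy keysyms (which need a lookup table) are deliberately left out.

```python
from typing import Optional

# Offset used for keysyms that directly encode a Unicode code point.
UNICODE_KEYSYM_OFFSET = 0x01000000


def keysym_to_char(keysym: int) -> Optional[str]:
    """Return the character for a keysym, or None if no simple rule applies."""
    # Printable Latin-1 keysyms are numerically equal to their code points.
    if 0x0020 <= keysym <= 0x007E or 0x00A0 <= keysym <= 0x00FF:
        return chr(keysym)
    # Directly encoded Unicode keysyms: code point + 0x01000000.
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return chr(keysym - UNICODE_KEYSYM_OFFSET)
    # Function keys and legacy keysyms would need a lookup table instead.
    return None


if __name__ == "__main__":
    for value in (0x0020, 0x010020AC, 0xFF0D):
        print(f"0x{value:08x} -> {keysym_to_char(value)!r}")
```

Anything that returns None here, such as Return (0xff0d), is either not a character at all or one of the older keysyms that predate the Unicode rule.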
The Angsty Teen Years
The implementation of X11 was, at first, a major battleground in the so-called Unix Wars. However, in the 1990s, a champion arose! XFree86 took the throne as an open-source port of the X Window System standard to Intel x86, and later many other chips. XFree86 kept the original free license of the X Window System standard, positioning it as a clear leader.
Most Linux distributions started to ship with XFree86, which provided a lot of great things. One of those great things was an extended definition of keysyms! You can take a look at the source for this in the XF86keysym.h file in the same directory as the standard keysymdef.h file.
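If you'd rather not scroll through the header by hand, a few lines of Python can pull the names and values out of it. This is only a sketch: the sample lines are modeled on the define format used in XF86keysym.h (the three audio values are the ones I believe the header uses), and parse_keysym_defines is a hypothetical helper of my own.

```python
import re
from typing import Dict

# Sample lines modeled on the format of XF86keysym.h (assumed, not copied).
SAMPLE_HEADER = """\
#define XF86XK_AudioLowerVolume  0x1008FF11   /* Volume control down        */
#define XF86XK_AudioMute         0x1008FF12   /* Mute sound from the system */
#define XF86XK_AudioRaiseVolume  0x1008FF13   /* Volume control up          */
"""

# Matches "#define XF86XK_<name> <hex value>" at the start of a line.
DEFINE_RE = re.compile(r"^#define\s+(XF86XK_\w+)\s+(0x[0-9A-Fa-f]+)", re.MULTILINE)


def parse_keysym_defines(header_text: str) -> Dict[str, int]:
    """Collect keysym names and their numeric values from header text."""
    return {name: int(value, 16) for name, value in DEFINE_RE.findall(header_text)}


if __name__ == "__main__":
    for name, value in parse_keysym_defines(SAMPLE_HEADER).items():
        print(f"{name} = 0x{value:08X}")
```

To try it on a real system, pass in the contents of /usr/include/X11/XF86keysym.h instead of SAMPLE_HEADER. Defines that don't use a plain hex literal are simply skipped by this pattern.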
The Old, Senile Years
At this point, you're probably wondering what cool things XF86keysym.h brings to the table. Well, nobody really knows! Which is absolutely and horrifyingly hilarious. If you don't believe me, pop open the source and read for yourself:
"XFree86 never properly commented these [and] has removed their mail archives of the period that might have shed more light on some of these definitions. Until/unless we resurrect these archives, these are from memory and usage."
And if you just thought that passage referred to XFree86 in the past tense, you are correct! As shocking as it is, XFree86 fell from grace once it began restricting contributors, ignoring its userbase and neglecting to track bugs. Fun fact: it wasn't until 2003 that one of the four people allowed to commit to the source code decided it'd be a good idea to start keeping note of bug reports.
So the community responded like they always do: fork the project and let the original one die. And thus, X.Org Server was born! And from that moment on, keysym definitions became an orderly affair. Just four additional files and 450 more lines of header definitions...
Summary
Key events have an interesting history, and this history has resulted in an increasingly convoluted API. Numerous tools (such as xev and showkey) can help you identify, track and trace these events. Unfortunately, this doesn't even begin to scratch the surface once we add things like MPRIS into the equation. Hopefully this dive into keycodes has been funny and enlightening, and has made you a bit more curious about the computers we interact with. Happy computing!