The X Window System

Reprinted from Byte, issue 1/89, pp. 353-360.

Born as a means to network graphics workstations, MIT’s X Window is gaining ground as a windowing system for Unix

Windowing user interfaces are now the accepted way of interacting with computers. People may still argue about whether they prefer icons or filenames, pull-down or pop-up menus, but no one disputes the usefulness of splitting the display screen into several areas that clearly separate different software functions.

Though purely character-based windows are quite workable (consider pop-up utilities like Borland’s SideKick), the industry trend is to adopt the full overlapping-windows metaphor that treats the screen like a bit-mapped graphics image with “soft” typefonts and a mouse-driven pointer that can move around by single-pixel increments. Windows then appear to be active objects that can obscure one another, can be moved and resized, and can contain pointer-activated controls for scrolling and zooming. Special kinds of windows (e.g., menus and dialog boxes) present you with choices from which to select by pointing rather than by typing commands.

Examples of such interfaces familiar to us are the Apple Macintosh interface; Microsoft Windows for IBM PC-compatible systems; Digital Research’s GEM, used on the Atari ST and some PC compatibles; and the Intuition interface of the Commodore Amiga.

Two features common to all these interfaces are that they are for single-user systems and that they are closely tied to the hardware of the computer on which they run. This is partly because such graphical displays impose a much larger computational burden than traditional character-based systems do; thus, their implementation tends to be highly optimized for speed by using direct video memory accesses and even, in the case of the Amiga, custom hardware assistance in the shape of a blitter chip.

In the world of engineering workstations, windowing interfaces have been the norm for several years now. In that world, the almost universal adoption of the Unix operating system combined with the need to share data over networks has generated more pressure for standardization than in the personal computer world.

The goal has been a network-transparent, device-independent way for a program running on a network workstation to create windows on the screen of another workstation that might have been made by a different manufacturer.

Despite the emergence of some proprietary systems, such as Sun’s NeWS, it looks as though the workstation world is settling on the X Window System that was developed at MIT.

The significance for personal computer users is that the worlds of the PC and the workstation are very rapidly converging (see the article “Sun’s Newest Workstation: the Sun386i” by Tom Thompson in the July 1988 BYTE). Top-end PCs already use the same processors (i.e., the 80386 or the 68020/30) as leading workstations. Networking is now widespread among larger PC users.

Meanwhile, CAD and desktop publishing applications have created a need for true high-resolution graphics. These latter two application areas are also beginning to make people want a portable, device-independent way of exchanging graphical information. So far, that demand has been met by Adobe’s PostScript Page Description Language, which has now spawned Display PostScript as a possible video graphics standard in apparent competition with X Window (but see below).

X history

The X Window System has sprung up in a mere 4 years, thanks to the enthusiasm of a group of programmers at MIT and elsewhere. It arose in 1984 out of an MIT project called Athena, which investigated the use of networked graphics workstations as a teaching aid for students in various disciplines.

The idea was that each student should have a windowing graphics workstation on which he or she could run local tools like word processors and spreadsheets while simultaneously being able to call up library pictures and documents from remote sources.

Since MIT has a mix of hardware from Digital Equipment Corp., IBM, and other manufacturers, it was clear that the students needed a hardware-independent protocol for sending graphics around the network. The development of this protocol by Bob Scheifler, along with work by Jim Gettys, Ralph Swick, and others, led to the X Window System. It has progressed in those 4 years from version 4 up to the current release, which is version 11.2.

In 1986, the Athena team decided to release version X10.4 on tape to other interested parties for a nominal charge (reminiscent of the way Unix was spread in its early days). The positive response was overwhelming. Hewlett-Packard and DEC even designed new workstations around X Window.

Finally, in January 1988, MIT formed a consortium with most of the leading workstation manufacturers to develop X Window further and have it adopted as an ANSI standard. The members of the X Consortium included Apollo, Apple, AT&T, DEC, HP, Sun, IBM, Televideo, and Tektronix. The copyright for X Window is held by the consortium members, but permission for its use is granted to any party interested in implementing it.

What is X Window?

The MIT team designed X Window as a distributed, network-transparent, device-independent, multitasking windowing and graphics system. It permits you to display multiple applications on the same screen, and it lets one application use many windows. It supports overlapping and hidden windows, text with soft fonts, and two-dimensional graphics drawing.

X Window achieves device independence by splitting the job of drawing windows into two parts, using the increasingly familiar client/server model (see the article “A Personal Transputer” in the June 1988 BYTE). The client is an application program making requests of the server to draw windows, text, and other objects. The server program runs on each workstation, drawing the required objects on the display.

Figure 1: The client program sends packets of instructions to the server, which contains the hardware-dependent drivers for that workstation. An X server controls not only the screen but also the keyboard and a pointing device with up to five buttons.
The client communicates with the server by sending packets of instructions conforming to the X Protocol, which is, in effect, a high-level graphics-description language. Each workstation has its own server, which contains the hardware-dependent drivers for that workstation. An X server controls not only the screen but also the keyboard and a pointing device with up to five buttons (see figure 1). The application programmer links the client program with X Window using Xlib, a library of graphics and windowing functions.

The client and server might be resident on the same workstation, as when a single user is executing a program locally, or they may be very widely separated. For example, a person with a graphics terminal in London, England, could execute a program on a Cray-2 in Berkeley, California, by using a satellite link in a wide-area network. The physical means of communication is immaterial to X Window since the X Protocol is the same in all cases, that is, network-transparent. Communication schemes used with X Window include shared memory, message passing, and transputer links at the local workstation level and RS-232C, Ethernet, token ring, and many others between workstations.

Though X Window has been developed mainly under Unix, it is not dependent on any particular operating system. It can be implemented on top of any operating system, as it has been for VAX/VMS. However, where possible, X Window uses operating system calls to establish the connections between the Xlib functions and the network, and between the network and the X server. This implies that some operating systems, notably multitasking ones with network support, will be far more suitable than others. It is also possible to create mixed operating system networks by implementing an X server for the “foreign” system. MS-DOS microcomputers have been added as X Window terminals to Unix-based networks by just this kind of hybridization.

The X server

The chain of communication in opening a window under X Window has seven links, depicted in figure 1 but summarized as follows:

Application → Xlib → OS → X Protocol → OS → X server → Screen

An X server program controls the display screen, keyboard, and mouse on the workstation. A single workstation might have several screens driven by the same server (like those big screens for the Macintosh), or a single computer might run more than one server with different graphics terminals attached. More likely, each workstation will have its own X server.

Since a single X server can service requests from many client applications, the screen might have several windows containing the output from different programs. The client programs might be running on the server machine or on several others in the network. Equally, programs running on the local machine or workstation can open windows on other workstations. This means you can create very sophisticated electronic mail systems under X Window.

For example, the leader of a workgroup might pop up a menu window on the screen of each member of the group. Each member would click a preference from the menu, and the results of this “ballot” would be returned to the application program on the leader’s workstation, where they would be displayed in a table. This is an illustration of two-way communication in which the X server returns user input from a keyboard or mouse to the client program.

The X server’s primary job is to share scarce resources among the client applications that request them. The two principal resources are processor time, for drawing and text manipulation, and screen space. An intermediary program, the window manager (see below), doles out the screen space. The server is responsible for scheduling work performed on behalf of the client programs, for memory management, and for such subsidiary processes as maintaining the communications links with each client.

The server performs all these functions by using the services of the underlying operating system. X Window can be used for a truly distributed system; when it opens a window on another workstation screen, it is the remote CPU that is doing the drawing.

The current X11 version of the server can perform two-dimensional drawing of lines, rectangles, circles, arcs, text, and arbitrary bit maps on monochrome or color displays with up to 32 bits per pixel. The X server also loads new fonts from operating system files, stores them in memory, and makes them available for text writing.

From a structural point of view, an X server consists of a device-independent layer that receives and translates client request messages in the X Protocol format, an operating system-dependent layer that interfaces to a particular operating system, and a device-dependent layer that is a collection of device drivers for the specific hardware supported. To port X Window to a new system, only the latter two layers need rewriting.
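The split between the device-independent and device-dependent layers can be sketched in a few lines of C. This is only an illustration of the layering idea, not code from the X server: the DisplayDriver structure, the two toy frame-buffer formats, and the 16-by-8 screen size are all assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SCREEN_W 16
#define SCREEN_H 8

/* Hypothetical device-dependent layer: each display type supplies these hooks. */
typedef struct {
    const char *name;
    void (*put_pixel)(unsigned char *fb, int x, int y, int on);
    int  (*get_pixel)(const unsigned char *fb, int x, int y);
} DisplayDriver;

/* Driver 1: one byte per pixel. */
static void byte_put(unsigned char *fb, int x, int y, int on) {
    fb[y * SCREEN_W + x] = on ? 0xFF : 0x00;
}
static int byte_get(const unsigned char *fb, int x, int y) {
    return fb[y * SCREEN_W + x] != 0;
}

/* Driver 2: monochrome, eight pixels per byte, leftmost pixel in bit 7. */
static void mono_put(unsigned char *fb, int x, int y, int on) {
    unsigned char *cell = &fb[y * (SCREEN_W / 8) + x / 8];
    unsigned char mask = (unsigned char)(0x80u >> (x % 8));
    if (on) *cell |= mask; else *cell &= (unsigned char)~mask;
}
static int mono_get(const unsigned char *fb, int x, int y) {
    return (fb[y * (SCREEN_W / 8) + x / 8] >> (7 - x % 8)) & 1;
}

static const DisplayDriver BYTE_DRIVER = { "byte-per-pixel", byte_put, byte_get };
static const DisplayDriver MONO_DRIVER = { "packed-mono",    mono_put, mono_get };

/* Device-independent layer: clips the rectangle to the screen and fills it
   through whatever driver it is handed. Returns the number of pixels set. */
static int fill_rect(const DisplayDriver *drv, unsigned char *fb,
                     int x, int y, int w, int h) {
    int painted = 0;
    assert(drv != NULL && fb != NULL);
    for (int row = y; row < y + h; row++) {
        for (int col = x; col < x + w; col++) {
            if (row >= 0 && row < SCREEN_H && col >= 0 && col < SCREEN_W) {
                drv->put_pixel(fb, col, row, 1);
                painted++;
            }
        }
    }
    return painted;
}
```

Porting this toy to a new display would mean writing one more pair of hooks; fill_rect itself would not change, which is the point of the layering.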

Window hierarchies, events, and window managers

Figure 2: X Window treats overlapping windows as a hierarchy.
When you open a window under X Window, it becomes part of a hierarchy just like the DOS subdirectory structure. Each screen has its own hierarchical structure and a “root” window that fills the whole screen. The root window can have “child” windows that occupy part of the screen. These in turn can have further children. The overlap and visibility of windows is controlled by the stack order of siblings of the same level, but children always stay in front of their parents. (Figure 2 illustrates this principle.) The number of windows you can create (and destroy) is almost limitless. Each window has attributes such as foreground, background, and border color, cursor shape, and a color map.

Pop-up menus, radio buttons, and dialog boxes are implemented as trees of child windows, since windows are used for all screen interactions. The X server can only output via a window, and it can allow more than one client to output via the same window.
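The hierarchy and stacking rules described above can be modeled with a small tree. The sketch below is an illustration only: the Window structure, the fixed-size child array, and the function names are invented for the example and are not the server’s own data structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

typedef struct Window {
    const char *name;
    int x, y, width, height;               /* position is relative to the parent */
    struct Window *parent;
    struct Window *children[MAX_CHILDREN]; /* stacking order: index 0 is the bottom */
    int child_count;
} Window;

/* Attach a child; a newly added window starts on top of its siblings. */
static void add_child(Window *parent, Window *child) {
    assert(parent->child_count < MAX_CHILDREN);
    child->parent = parent;
    parent->children[parent->child_count++] = child;
}

/* Move a window to the top of its siblings' stacking order. */
static void raise_window(Window *w) {
    Window *p = w->parent;
    int i = 0;
    if (p == NULL) return;                 /* the root has no siblings */
    while (i < p->child_count && p->children[i] != w) i++;
    if (i == p->child_count) return;       /* not found; nothing to do */
    for (; i < p->child_count - 1; i++) p->children[i] = p->children[i + 1];
    p->children[p->child_count - 1] = w;
}

/* Return the front-most window containing the point (px, py), which is
   given in the coordinate system of w's parent; NULL if w does not contain it. */
static Window *window_at(Window *w, int px, int py) {
    int lx = px - w->x, ly = py - w->y;    /* translate into w's own coordinates */
    if (lx < 0 || ly < 0 || lx >= w->width || ly >= w->height) return NULL;
    for (int i = w->child_count - 1; i >= 0; i--) {  /* topmost sibling first */
        Window *hit = window_at(w->children[i], lx, ly);
        if (hit != NULL) return hit;       /* children sit in front of their parent */
    }
    return w;
}
```

Calling window_at on the root with a pointer position gives the window a click would land in; raising a window changes the answer wherever siblings overlap.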

Input from the keyboard must also go into a window, normally the one in which the cursor currently resides. However, X Window has an “input focusing” feature that allows a client program to specify some other window as the source for input. In addition, a client can grab the mouse pointer under certain circumstances.

X Window applications, like those of Microsoft Windows or the Macintosh, are event driven. The main part of a window application program is a loop that waits for an event to happen and then jumps to the appropriate action. The X server recognizes many event types including pointer motion, key press, button press and release, window entry and exit, input focus switching, exposure of previously covered windows, color map event, and status change. Also, communications from the client programs can cause events.
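The shape of such an event loop can be shown without an X server at all. In the sketch below, the event types, the Event structure, and the next_event function are stand-ins invented for the example (a real Xlib client would block in XNextEvent and examine an XEvent instead), and the “actions” merely count what arrived.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_KEY_PRESS, EV_BUTTON_PRESS, EV_EXPOSE, EV_QUIT } EventType;

typedef struct {
    EventType type;
    int window;   /* id of the window the event is reported against */
    int detail;   /* key code or button number; unused for other events */
} Event;

/* Stand-in for the stream of events arriving from the server. */
typedef struct {
    const Event *events;
    size_t count, next;
} EventQueue;

typedef struct { int keys, clicks, redraws; } AppState;

/* Stand-in for a blocking read; returns 0 when the stream is exhausted. */
static int next_event(EventQueue *q, Event *out) {
    if (q->next >= q->count) return 0;
    *out = q->events[q->next++];
    return 1;
}

/* The main loop: wait for an event, then jump to the matching action. */
static void run_event_loop(EventQueue *q, AppState *app) {
    Event ev;
    assert(q != NULL && app != NULL);
    while (next_event(q, &ev)) {
        switch (ev.type) {
        case EV_KEY_PRESS:    app->keys++;    break;
        case EV_BUTTON_PRESS: app->clicks++;  break;
        case EV_EXPOSE:       app->redraws++; break; /* repaint the uncovered area */
        case EV_QUIT:         return;                /* leave the loop */
        }
    }
}
```

Everything the program does happens inside one of the switch cases, which is why an exposure event (a request to repaint) is handled in the same place as a key press.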

The stated philosophy of the X Window System is to provide only the mechanism for drawing windows, not the policy for using them. This differentiates it from Microsoft Windows, Macintosh, GEM, and other systems that provide both. Under X Window, the policy must be provided by a separate program called a “window manager,” which is just an ordinary client program.

Client programs have to negotiate with the window manager, which has the last say in all matters of screen usage. The proper protocol is for clients to offer the manager “window hints” of their wishes (something like “I want a 20-row-by-30-column window at row 10 column 10 in the foreground”). The manager can then use any available algorithms to satisfy these requests as fairly as it can. It can, for example, resize existing windows or alter their stacking order. Windows can be restacked, moved, resized, closed, or reduced to an icon via the attentions of a window manager. A window manager can also alter the way events are delivered, grab the mouse pointer, and change the input focus.

In fact, a window manager can impose any policy its implementer can dream up. It might forbid the overlapping of windows altogether and send a rude message to any client program that asks for too much screen.
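The negotiation between client and window manager comes down to a function from the geometry a client asks for to the geometry it actually gets. The sketch below assumes one arbitrary policy, invented for the example: no window may exceed half the screen in either dimension, and every window must lie wholly on screen. The Geometry type and manage_request are hypothetical names, not part of any real window manager.

```c
#include <assert.h>

typedef struct { int x, y, width, height; } Geometry;

static int clamp_int(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical policy: take the client's hint, then shrink and shift it
   until it is at most half the screen in each dimension and fully visible. */
static Geometry manage_request(Geometry hint, int screen_w, int screen_h) {
    Geometry g;
    assert(screen_w > 1 && screen_h > 1);
    g.width  = clamp_int(hint.width,  1, screen_w / 2);
    g.height = clamp_int(hint.height, 1, screen_h / 2);
    g.x = clamp_int(hint.x, 0, screen_w - g.width);
    g.y = clamp_int(hint.y, 0, screen_h - g.height);
    return g;
}
```

A modest request passes through untouched, while a greedy one comes back smaller and repositioned; swapping in a different manage_request is all it would take to impose a different policy on the same clients.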

More sensibly, a window manager can emulate other windowing systems. If you have several window managers present in your system, you can switch from the Macintosh look, complete with scroll bars, to the Microsoft Windows look just by running a new manager. Several development firms are presently working on window managers that emulate the OS/2 Presentation Manager.

Xlib, X toolkits, and X protocol

A programmer wishing to write applications that run under X Window must perform all windowing and drawing by using only procedures from the Xlib library, which is available in C, Pascal, FORTRAN, Modula-2, and Ada. If you write your program this way, you should be able to port it to any hardware that supports an X server by simply recompiling without altering the code.

Xlib contains more than 200 procedures, many of which resemble those found in any graphics library. For example, the drawing primitives include XDrawPoint, XDrawRectangle, XFillRectangle, and XDrawArc.

Xlib supports clipping, stippling, and tiling operations as well as the manipulation of raw bit maps. It supplies other procedures to create and configure windows (XCreateWindow, XResizeWindow, XDestroyWindow), and still others concerned with events, queries, font manipulation, keyboard, pointer, and color control.

Opening an X Window application involves eight steps in sequence:

1. Open a connection to the server with XOpenDisplay.
2. Create a top-level window with XCreateWindow.
3. Set standard properties for the top-level window, including hints for the window manager.
4. Create window resources such as graphics contexts.
5. Create any other windows needed.
6. Select the desired events for these windows.
7. Map the windows.
8. Enter the event loop.

The “graphics context” referred to in step 4 is a data structure that contains information about a drawing: the foreground and background colors, line width, and clipping region. Mapping is an initialization process that makes a window viewable. The C source code in listing 1 shows how this initial sequence looks for a simple application that prints the traditional “first” program of any language, “Hello World.”

Closing an application properly involves killing all windows, freeing all resources, and then calling XCloseDisplay. Should you fail to do this and merely exit, the server will eventually notice and close the windows down itself, since it is responsible for maintaining the client connection.

Figure 3: X Protocol request packets contain a 4-byte header and as many additional bytes as needed to communicate the request. The PolyLine request contains the information to draw a series of lines with a given line style and thickness within the graphics context.
If you have linked your application to the required Xlib routines, at run time X Window will generate the equivalent X Protocol requests to send to the X server. These requests correspond quite closely with the Xlib routines that generate them (though there are fewer of them, since many routines generate the same request). For example, the XDrawRectangle routine generates a PolyRectangle request, and XCreateWindow generates a CreateWindow request. X Protocol requests are variable-length data packets that begin with a 4-byte header containing an 8-bit op code that identifies the type of request and a 16-bit field specifying length, followed by one or more bytes of additional data (see figure 3). The added data might be numeric parameters and coordinates, text strings to write, or raw bit-map data in scan line order.
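To make the packet layout concrete, here is a sketch that builds a PolyLine request by hand. It follows the X11 protocol’s layout for this request as I understand it: an op code of 65, a coordinate-mode byte, a 16-bit length counted in 4-byte units, the 32-bit drawable and graphics-context identifiers, and then one 4-byte point per vertex. The function and type names are invented for the example, and it assumes the connection was set up for least-significant-byte-first order; a real client never does this itself, since Xlib builds the packets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OPCODE_POLY_LINE = 65, COORD_MODE_ORIGIN = 0 };

typedef struct { int16_t x, y; } Point;

/* Store values least significant byte first (an assumption of this sketch;
   the byte order is actually chosen when the connection is opened). */
static void put16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}
static void put32(uint8_t *p, uint32_t v) {
    put16(p, (uint16_t)(v & 0xFFFF));
    put16(p + 2, (uint16_t)(v >> 16));
}

/* Encode a PolyLine request into buf.
   Returns the number of bytes written, or 0 if it does not fit. */
static size_t encode_poly_line(uint8_t *buf, size_t cap,
                               uint32_t drawable, uint32_t gc,
                               const Point *pts, uint16_t npts) {
    size_t words = 3u + npts;      /* header + drawable + gc + one word per point */
    size_t bytes = words * 4u;
    assert(buf != NULL);
    if (bytes > cap || words > 0xFFFF) return 0;
    buf[0] = OPCODE_POLY_LINE;     /* 8-bit op code */
    buf[1] = COORD_MODE_ORIGIN;    /* points are relative to the window origin */
    put16(buf + 2, (uint16_t)words);   /* request length in 4-byte units */
    put32(buf + 4, drawable);
    put32(buf + 8, gc);
    for (uint16_t i = 0; i < npts; i++) {
        put16(buf + 12 + 4u * i,     (uint16_t)pts[i].x);
        put16(buf + 12 + 4u * i + 2, (uint16_t)pts[i].y);
    }
    return bytes;
}
```

A two-point line therefore costs 20 bytes on the wire: 4 for the header, 8 for the two identifiers, and 4 per point.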

Requests that are queries (e.g., QueryPointer, which asks for the current location of the mouse pointer) return a 32-byte reply packet back to the client. All the requests sent during a particular connection are sequentially numbered so replies can be linked to the request with which they belong. Because network communication is such a slow process, X Window programmers try to minimize the number of these “round trips” (i.e., replies from the server to the client). Most requests do not require a reply and receive one only if they terminate with an error. Events are transmitted back to the client as 32-byte packets that contain an 8-bit code specifying the type of event.
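What a client must do with each 32-byte packet coming back from the server can be sketched as follows. The layout assumed here is the X11 one as I understand it: the first byte is 0 for an error and 1 for a reply, any other value is an event code (with the top bit marking events sent by another client), and bytes 2 and 3 carry the sequence number. The type and function names are invented for the example, and least-significant-byte-first order is again assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PKT_ERROR, PKT_REPLY, PKT_EVENT } PacketKind;

typedef struct {
    PacketKind kind;
    uint8_t  code;      /* error code, or event type with the top bit cleared; 0 for replies */
    uint16_t sequence;  /* sequence number carried in the packet */
} PacketInfo;

/* Read a 16-bit value stored least significant byte first (an assumption). */
static uint16_t get16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decide what kind of 32-byte packet has arrived from the server. */
static PacketInfo classify_packet(const uint8_t pkt[32]) {
    PacketInfo info;
    assert(pkt != NULL);
    info.sequence = get16(pkt + 2);
    if (pkt[0] == 0) {
        info.kind = PKT_ERROR;
        info.code = pkt[1];
    } else if (pkt[0] == 1) {
        info.kind = PKT_REPLY;
        info.code = 0;
    } else {
        info.kind = PKT_EVENT;
        info.code = (uint8_t)(pkt[0] & 0x7F);
    }
    return info;
}

/* A reply belongs to the pending request whose sequence number it carries. */
static int reply_matches(const PacketInfo *info, uint16_t pending_sequence) {
    return info->kind == PKT_REPLY && info->sequence == pending_sequence;
}
```

Because errors and events arrive on the same connection as replies, a client waiting for the answer to a query has to set aside anything that is not the reply it is looking for and deal with it later.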

Above the level of Xlib lies the realm of X Toolkits. These are sets of prefabricated routines, built out of Xlib functions, that allow rapid program development. Several toolkits already exist in the public domain. One, called Xtk, is supplied on the X Window tape. Xtk lets you quickly prototype user interfaces by bolting together ready-made components.

There is a well-defined style for writing X Toolkits, set down in the Standard Supplement of the X Window System Manual Set. The style is a model of good modern software engineering practice, being object-oriented in a very strong sense.

The basic data type is a structure called a widget (which, in Smalltalk parlance, would correspond to an object) that holds information about the state of graphical objects. Widgets belong to classes, with a full inheritance mechanism. Applications are built out of instances of widget classes.

Widgets are active entities; a widget can take input from the user and alter its display appearance by using procedures common to its class. Pop-up widgets are for representing dialog boxes and other interactive components.

Toolkits are written in plain, ordinary C. The object-oriented structure of the widget is implemented solely by careful design of the data structure and meticulous naming conventions. This demonstrates that object-oriented programming can be as much a frame of mind as a property of the language. The only drawback to the scheme is that the manual reads like a sequel to Through the Looking Glass. You may have to save a child from a cascade of widgets. Even pickled widgets make an appearance.
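How classes and inheritance can be had from plain C structures is easiest to see in miniature. The sketch below is not taken from the toolkit, and the toolkit’s own mechanism may well differ: the WidgetClass and Widget structures, the two classes, and the rule that a null method pointer means “inherit from the superclass” are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Widget Widget;

typedef struct WidgetClass {
    const char *class_name;
    const struct WidgetClass *superclass; /* NULL for the root class */
    void (*redisplay)(Widget *w);         /* NULL means "inherit from superclass" */
    void (*handle_click)(Widget *w);      /* NULL means "inherit from superclass" */
} WidgetClass;

struct Widget {
    const WidgetClass *widget_class;      /* every instance points at its class */
    int redisplay_count;
    int pressed;
};

static void core_redisplay(Widget *w) { w->redisplay_count++; }
static void core_click(Widget *w)     { (void)w; }  /* the base class ignores clicks */
static void button_click(Widget *w)   { w->pressed = !w->pressed; }

/* Button overrides handle_click and inherits redisplay from Core. */
static const WidgetClass CORE_CLASS   = { "Core",   NULL,        core_redisplay, core_click };
static const WidgetClass BUTTON_CLASS = { "Button", &CORE_CLASS, NULL,           button_click };

/* Walk up the class chain until some class supplies the method. */
static void widget_redisplay(Widget *w) {
    const WidgetClass *c = w->widget_class;
    while (c != NULL && c->redisplay == NULL) c = c->superclass;
    assert(c != NULL);
    c->redisplay(w);
}

static void widget_click(Widget *w) {
    const WidgetClass *c = w->widget_class;
    while (c != NULL && c->handle_click == NULL) c = c->superclass;
    assert(c != NULL);
    c->handle_click(w);
}

/* True if class c is `ancestor` or is derived from it. */
static int is_subclass(const WidgetClass *c, const WidgetClass *ancestor) {
    for (; c != NULL; c = c->superclass)
        if (c == ancestor) return 1;
    return 0;
}
```

Nothing in the language enforces any of this; the discipline lies entirely in always going through widget_redisplay and widget_click rather than calling a class’s functions directly.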

Competitive and complementary standards

Anyone who uses BIX regularly must surely, at some time, have bemoaned the fact that you can’t send graphics in your messages. As communications become more sophisticated with the eventual introduction of public ISDNs, protocols for sending graphics material over a network will take on great importance. With X Window, it looks as if, for once, we have a chance of achieving a standard quite early in the cycle, rather than after the customary bloody war of attrition between competing proprietary systems.

The main competition to X Window appears to be Adobe’s Display PostScript; but, in fact, the two systems are not exclusive and might even complement each other. It’s possible to write X servers that generate PostScript output to drive PostScript devices, and it’s equally possible to write interpreters that translate PostScript output into X Protocol requests so that X Window and PostScript clients can address the same server. Because PostScript offers far more powerful typographical functions than does X Window, the two could prove to be synergistic.

Sun, whose NeWS windowing system is based on Display PostScript, has indicated its intention to build in X Window support. Microsoft has also put out feelers about a possible X Window implementation of the Presentation Manager.

As for the future of the X Window System itself, the X Consortium has agreed to freeze the core specification at the X11 level for at least 3 years. This will allow software developers to work unhindered by upgrades. The principal development activity until then will be bug fixing, internationalization, an ANSI standard, extensions such as three-dimensional graphics support to the PHIGS standard, and inclusion of live video in X Windows.

Dick Pountain

Dick Pountain is a BYTE contributing editor, a technical author, and a software consultant living in London, England. You can contact him on BIX as “dickp.”

Further Information:
IXI Ltd.
62-74 Burleigh St.
Cambridge CB1 1OJ, U.K.

MIT Software Distribution Center
MIT E32-300
77 Massachusetts Ave.
Cambridge, MA 02139

Sidebars:
“Managing the X Window desktop”
Listing of an X Window program