An unwilling dive in xfce4 internals

I've always liked text consoles more than graphical ones. This at least until some time in 2005, when I realized I was spending a large chunk of my time in front of a browser, and elinks, lynx, links and friends did not seem that attractive anymore.

Nonetheless, I've kept things simple: at first I started X manually, with startx, on a need by need basis. I used ion (yes! ion) for a while, until it stopped working during some upgrade. Than I decided it was time to boot in a graphical interface, and started using slim. Despite some quirks, I've been happy since.

In terms of window managers, I really don't like personalizing or tweaking my graphical environment. I see it as a simple tool that should be zero overhead, require no maintenance, and not get in the way of what I want to do with a computer. I don't want to learn which buttons to click on, how to do transparency, which icons mean what, or where the settings I am looking for were moved to in the latest version.

So I started using xfce. Not because of a particularly well informed choice, just because it worked out of the box with a reasonably minimal interface, and was fast to load. And if all you need is a browser opened on a pane, and gnome-terminal on the other, this is a really good choice.

Today, however, it broke :( for the first time since I have installed it, and through years of unattended upgrades, I'm forced to write this post from a tiny window on the left top corner of my monitor. And to fix it, I had to find out much more than I ever wanted to know about a window manager and xfce.

So, here are the symptoms: rebooted the laptop after a long long time (usually I just put it to sleep), got to the login prompt with slim, entered my username and password, and... sadness, I get a tiny terminal in my leftmost top corner, nothing else, my mouse looks like that X I hadn't seen in a long time, on black and white background. No sign of the usual desktop, tray or windows, which I can't even resize.

I really didn't want to deal with this, had other things I wanted to do on my laptop. So, what to do? For a while, I just used this tiny xterm. But this got boring pretty quickly: if I opened the browser, I had to close it to go back to the terminal. No alt+tab, no panes, couldn't move or resize windows. I finally decided it was time to fix the issue.

Here's the things I did and unwillingly learned in the process, just in case you end up in a similarly tragic situation. Note that each debugging session is about 30 minutes long, the time it takes me to get home and get to work on public transport, minus some time to read emails or interact with other people around me.

First round...

I don't believe in reboots as the one size fits it all solution, but my first hope was that this was some sort of transient failure. Maybe something just went wrong during startup, so I tried the following things:

  • /etc/init.d/slim restart - to restart the login manager, which in turn would start xfce4 again. It did not help, no useful message on the console, no error whatsoever in the logs. Clean as a bottle of grappa in the winter.
  • /etc/init.d/slim stop, and startxfce4 - to start xfce4 manually. Same problem as above, but at least ruled out a problem in slim, which is a good start.

From the tiny terminal, I then started xfce4-session, which supposedly is the component that starts up xfce4. Unfortunately, it just bailed out with error:

xfce4-session: Another session manager is already running

Which at least told me xfce4-session had been started already. I could confirm with ps -C xfce4-session, but ps faux only showed my terminal as a child of xfce4-session, while from my past memories I believe I would see many more xfce components.

So.. why did xfce4-session not start anything else? I started poking around in log files, with no luck. Nothing logged at all. strace -fp `pidof xfce4-session` also showed it was just sitting there waiting for some syscall to complete.

Maybe one of the components was not started properly? So I started manually fidgeting with the various pieces of xfce4.

Running xfwm4 manually gave me the naked window manager. At least I could now resize and move windows, victory! Still no start menu, still a single panel.

xfce4-panel gave me, well, the panels, and the "start buttons" at the bottom of the screen.

At this point, my graphical interface was in good enough shape to be usable again. Good, I could stop worrying about it, and do some real work :).

Second round...

Second day, second round. Let's assume that xfce4-session is not starting everything it should. How is it configured? according to the man page, beside a few caches it reads its configurations by using xfconf.

xfce4-session reads its configuration from Xfconf. xfce4-session stores its session data into $XDG_CACHE_HOME/sessions/.

man page also refers to a "sessions" subdirectory of $XDG_CACHE_HOME, by default in ~/.cache/, and a set of subdirectories in $XDG_CONFIG_HOME, by default in ~/.config/.

Let's start poking at xfconf-query. A simple:

$ xfconf-query
Channels:
  thunar-volman
  xfce4-mixer
  keyboards
  xfce4-desktop
  xfwm4
  xfce4-power-manager
  xfce4-settings-manager
  xfce4-panel
  xsettings
  xfce4-keyboard-shortcuts
  thunar
  pointers
  xfce4-session

returns a list of channels, and after reading xfconf-query --help, I tried:

$ xfconf-query -R -c xfce4-session -l
/general/FailsafeSessionName
/general/SaveOnExit
/general/SessionName
/sessions/Failsafe/Client0_Command
/sessions/Failsafe/Client0_PerScreen
[...]

which gave me the list of settings for xfce4-session. To fetch a variable, I can do:

$ xfconf-query -c xfce4-session -p /sessions/Failsafe/Count
5

But nothing particularly interesting turned out here. So let's poke at the $XDG_CACHE_HOME, and $XDG_CONFIG_HOME.

$XDG_CACHE_HOME seems just a collection of cached files, ranging from chrome to duplicity, and well, xfce.

In ~/.cache/sessions, aka $XDG_CACHE_HOME/sessions, referenced earlier, I see a list of files:

Thunar-xxxx-33af-41b8-80c9-xxxx
xfce4-session-joshua:0
xfce4-session-joshua:0.bak
xfwm4-xxxx-1ecf-41cf-8d02-yyyyy
xfwm4-xxxx-1ecf-41cf-8d02-xxxxx.state

let's look at xfce4-session-joshua:0 to start with, cat xfce4-session-joshua:0. Turns out it's a simple text file, maybe providing settings for each program I had started during the last session of xfce4? seems like a plausible idea (some stuff replaced by xxx and yyy):

[Session: Default]
Client0_ClientId=xxxx
Client0_Hostname=local/joshua
Client0_CloneCommand=xfwm4,--display,:0.0
Client0_DiscardCommand=rm,-rf,/home/yyy/.cache/sessions/xfwm4-xxx.state
Client0_RestartCommand=xfwm4,--display,:0.0,--sm-client-id,xxxx
Client0_CurrentDirectory=/home/yyy
Client0_Program=xfwm4
Client0_UserId=yyy
Client0_Priority=15
Client0_RestartStyleHint=2
Client1_ClientId=zzzz
Client1_Hostname=local/joshua
Client1_CloneCommand=Thunar
[...]

Note the CloneCommand line above shows a command line to run. Let's look at it in more details:

$ grep CloneCommand ./xfce4-session-joshua\:0
Client0_CloneCommand=xfwm4,--display,:0.0
Client1_CloneCommand=Thunar
Client2_CloneCommand=xfce4-panel
Client3_CloneCommand=xfdesktop,--display,:0.0
Client4_CloneCommand=xfce4-settings-helper,--display,:0.0
Client5_CloneCommand=gnome-terminal

Note that 2 out of 6 commands (xfwm4, xfce4-panel) are the ones I had to run manually to get back some of the normal features of a desktop environment. Let's try to run some of the others:

  • Thunar - a file manager kind of window appears. Given that I had never seen it before, I just close it. Useless, you can do the same with a shell.
  • xfdesktop - yay! my background (a solid reddish thing) appears. Together with 3 icons. Overall, I can do without, but it's nice to have a familiar and uniform color as a background.
  • xfce4-settings-helper - doesn't seem to be installed on my system. Weird.

So.. what is xfce4-settings-helper? and what happened to it?

$ apt-cache search xfce4-settings-helper

turns out empty. Let's go with apt-file:

$ apt-file search xfce4-settings-helper
xfce4-settings: /usr/bin/xfce4-settings-helper

So it should be part of xfce4-settings. Let's look at it:

$ dpkg -L xfce4-settings |grep bin
/usr/bin/xfsettingsd
/usr/bin/xfce4-settings-manager
/usr/bin/xfce4-display-settings
/usr/bin/xfce4-mime-settings
/usr/bin/xfce4-mouse-settings
/usr/bin/xfce4-settings-editor
/usr/bin/xfce4-accessibility-settings
/usr/bin/xfce4-keyboard-settings
/usr/bin/xfce4-appearance-settings

Looks like xfce4-settings-helper has been replaced by something else recently? My apt-file index is probably a few months old at this point. xfsettingsd seems useful, by the name of it. But turns out it's already running:

$ ps -C xfsettingsd
  PID TTY          TIME CMD
 4990 ?        00:00:03 xfsettingsd

and it's been running since my first attempt at fixing the system. If I run the xfce4-... commands in xfce4-settings manually, I see some sort of control panels to change the settings of keyboard, mouse, ... Not surprising :).

So, what about the other files in ~/.cache/sessions? I am not interested in Thunar.*, so let's look at the xfwm4 files:

$ cat xfwm4-2d1adf3c0-1ecf-41cf-8d02-0f70f2f2f5eb
[CLIENT] 0x2400004
  [CLIENT_ID] 2c734204f-71b3-4e34-916d-a3367d9c329f
  [CLIENT_LEADER] 0x2400001
  [WINDOW_ROLE] gnome-terminal-window-3521-307132299-1301003369
  [RES_NAME] gnome-terminal
  [RES_CLASS] Gnome-terminal
  [WM_NAME] ccontavalli@joshua: /var/log
  [WM_COMMAND] (1) "gnome-terminal"
  [GEOMETRY] (0,15,1280,785)
  [GEOMETRY-MAXIMIZED] (3,15,577,335)
  [SCREEN] 0
  [DESK] 1
  [FLAGS] 0x10300
[CLIENT] 0x1a0006a
  [CLIENT_ID] 2cde246f4-a13f-4060-bc80-638880912489
  [CLIENT_LEADER] 0x1a00001
  [WINDOW_ROLE] browser
  [RES_NAME] Navigator
  [RES_CLASS] Iceweasel
  [WM_NAME] slim xfce4 consolekit debian - Google Search - Iceweasel
  [WM_COMMAND] (1) "firefox-bin"
  [GEOMETRY] (0,15,1280,785)
  [GEOMETRY-MAXIMIZED] (0,15,1280,785)
  [SCREEN] 0
  [DESK] 0
  [FLAGS] 0x10300

This looks an awful lot like the screen I had when I last used xfce4, this is probably where xfwm4 stores my last session. Overall not very interesting.

Let's move back to exploring $XDG_CONFIG_HOME, ~/.config/. Here there seems to be a directory for each software I've used in X in the last few months. Not surprisingly, there is a xfce4 and xfce4-session subdirectory. Let's explore them.

The main directories I recognize are:

[...]
./xfce4/xfconf/xfce-perchannel-xml/xfce4-session.xml
./xfce4/xfconf/xfce-perchannel-xml/pointers.xml
./xfce4/xfconf/xfce-perchannel-xml/thunar.xml
./xfce4/xfconf/xfce-perchannel-xml/xfce4-keyboard-shortcuts.xml
./xfce4/xfconf/xfce-perchannel-xml/displays.xml
./xfce4/xfconf/xfce-perchannel-xml/xsettings.xml
[...]

Those seem the settings shown by xfconf earlier. Opening those files confirms that they are likely the same settings, stored in .xml.

[...]
./xfce4/panel/launcher-12533521212.rc
./xfce4/panel/systray-4.rc
./xfce4/panel/tasklist-12533520341.rc
./xfce4/panel/launcher-9
./xfce4/panel/launcher-9/13094492190.desktop
[...]

this looks an awful lot like what I have in my "menu bar" at the bottom of the screen. This is probably where xfce4 stores the buttons I have configured.

./xfce4-session turns out to be empty :(. Nothing here again. It's probably time to get to the next level, let's look at the xfce4-session source code.

First thing I notice in main are:

/* check that no other session manager is running */
sm = g_getenv ("SESSION_MANAGER");
if (sm != NULL && strlen (sm) > 0)
  {
    g_printerr ("%s: Another session manager is already running\n", PACKAGE_NAME);
    exit (EXIT_FAILURE);
  }

/* check if running in verbose mode */
if (g_getenv ("XFSM_VERBOSE") != NULL)
  xfsm_enable_verbose ();

So by removing the SESSION_MANAGER environment variable and setting XFSM_VERBOSE, hopefully I can run xfce4-session manually and see what's happening.

Let's try:

$ unset SESSION_MANAGER
$ export XFSM_VERBOSE=foo
$ xfce4-session
xfce4-session: Another session manager is already running

Argh, there is another check further down in the code:

if (DBUS_REQUEST_NAME_REPLY_PRIMARY_OWNER != ret)
  {
    g_printerr ("%s: Another session manager is already running\n",
                PACKAGE_NAME);
    exit (EXIT_FAILURE);
  }

so, no luck. And well, I am out of time for today.

Last and final round...

I'd really like at this point to run xfce4-session under strace or ltrace, to see what's happening under the hood. Given I can't just run xfce4-session from my shell easily, let's try to exit the graphical interface, and run strace -f startxfce4 2>/tmp/log. If I am lucky, I will see something failing after the exec(... xfce4-session ...) somewhere in the middle of the trace. If not, it will be too noisy, and will need to find a better way to trace the problem.

As soon as I exit the graphical interface, I notice on my console some messages that look like:

(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfwm4": Failed to change to directory '/home/xxx' (No such file or directory)
(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfwm4": Failed to change to directory '/home/xxx' (No such file or directory)
(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfce4-panel": Failed to change to directory '/home/xxx' (No such file or directory)
(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfce4-panel": Failed to change to directory '/home/xxx' (No such file or directory)
(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfdesktop": Failed to change to directory '/home/xxx' (No such file or directory)
(xfce4-session:16876): xfce4-session-WARNING **: Unable to launch "xfdesktop": Failed to change to directory '/home/xxx' (No such file or directory)

YAY! This is probably the culprit: a few months ago I moved my home directory to /opt, as my root partition (where /home used to live), was full. I changed the record in passwd, and naively assumed that programs would still find their data in the new location, using wordexp() for tilde expansion, environment variables or getpwnam().

I bet nobody ever changes his home directory, or has different NFS mount points on different machines (right, who would use /home/u/user, for example, on a crowded server? and /home/user in his own desktop? surely a home directory is visible from the same path on every computer where one might access it).

Conclusion

At this point, I first tried to create a symlink from the old home location to the new one, and everything appeared to work:

ln -s /home/xxx /opt/home/xxx

To fix the problem forever, I used something like:

find ~/.config ~/.cache/sessions -print0 |xargs -0 sed -i -e "s@/home/xxx@/opt/home/xxx@"

which updated all the paths in every config and sessions file.

Updates: as pointed out on the Debian bug I filed, I should have started looking from ~/.xsession-errors. That would have made things easier :)


Other posts

Technology/System Administration