Let's say you want to add a search box to your web site to find words within your published content. Or let's say you want to display a list of articles published on your blog, together with snippets of what the article looks like, or a short summary.

In both cases you probably have an .html page or some content from which you want to generate a snippet, just like on this blog: if you visit http://rabexc.org, you can see all articles published recently. Not the whole article, just a short summary of each.

Or if you look to the right of this page, you can see snippets of articles in the same category as this one.

It turns out that generating those snippets in python, using flask and the basic python library, is extremely easy.

So, here are a few ways to do it...

Before starting to talk about python, I should mention that doing this in javascript should be extremely easy and straightforward. In fact, you can find many sites that load entire articles and then magically hide portions of them using javascript.
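Back to python: here is a minimal sketch of the idea, using only the standard library. It strips the tags from a piece of HTML and keeps roughly the first few hundred characters of text. The names TextExtractor and make_snippet, and the 200 character default, are made up for this illustration, and the sketch does not try to skip the content of script or style tags.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text found between tags.
        self.chunks.append(data)


def make_snippet(html, max_chars=200):
    """Return about max_chars characters of text, cut at a word boundary."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the snippet reads as a single line.
    text = " ".join("".join(parser.chunks).split())
    if len(text) <= max_chars:
        return text
    # Cut at the last space before the limit to avoid splitting a word.
    cut = text.rfind(" ", 0, max_chars)
    if cut == -1:
        cut = max_chars
    return text[:cut] + "..."
```

In a flask view you would call something like make_snippet on each article body before passing the result to the template.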

[ ... ]

With my last laptop upgrade I started using awesome as a Window Manager.

I wasn't sure of the choice at first: I have never liked graphical interfaces, and the thought of having to write lua code to get my GUI to provide even basic functionalities wasn't very appealing to me.

However, I have largely enjoyed the process so far: even complex changes are relatively easy to make, and the customizability has improved my productivity while making the interface more enjoyable to use.

The switch, however, has forced me to change several things in my setup. Among others, I ended up abandoning xscreensaver for i3lock and xautolock, while changing a few things on my system to better integrate with the new environment.

In this article, you will find:

  • A description of how to use xautolock together with i3lock to automatically lock your screen after X minutes of inactivity and when the laptop goes to sleep via ACPI.

  • My own recipe to display the battery status on the top bar of Awesome. This is very similar to existing suggestions on the Awesome wiki, except that it supports displaying the status of multiple batteries at the same time. However rare this may sound, it is something my laptop supports and that I regularly use (x230 with 19+ cell slice battery).

[ ... ]

Jun 21, 2014 | Technology/Linux

Let's say you want to make the directory /opt/test on your desktop machine visible to a virtual machine you are running with libvirt.

All you have to do is:

  • virsh edit myvmname, edit the XML of the VM to have something like:

    <domain ...>
      ...

      <devices ...>
        <filesystem type='mount' accessmode='passthrough'>
          <source dir='/opt/test'/>
          <target dir='testlabel'/>
        </filesystem>
      </devices>
    </domain>
    

    where /opt/test is the path you want to share with the VM, and testlabel is just a mnemonic of your choice.

    Make sure to set accessmode to something reasonable for your use case. According to the libvirt documentation, you can use:

    mapped
    To have files created and accessed as the user running kvm/qemu. Uses extended attributes to store the original user credentials.
    passthrough
    To have files created and accessed as the user within kvm/qemu.
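If you would rather generate that fragment than type it by hand, here is a rough sketch in python using the standard xml.etree module. The function name filesystem_element is made up for this example, and the default of mapped is just a choice made here, not something libvirt mandates. It only builds the fragment: pasting it into the VM definition is still up to you.

```python
import xml.etree.ElementTree as ET


def filesystem_element(source_dir, target_label, accessmode="mapped"):
    """Build the <filesystem> element shown above for one shared directory."""
    fs = ET.Element("filesystem", {"type": "mount", "accessmode": accessmode})
    # Directory on the host that should be visible inside the VM.
    ET.SubElement(fs, "source", {"dir": source_dir})
    # Mnemonic label of your choice.
    ET.SubElement(fs, "target", {"dir": target_label})
    return fs


element = filesystem_element("/opt/test", "testlabel", "passthrough")
print(ET.tostring(element, encoding="unicode"))
```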

[ ... ]

All the libvirt related commands, like virsh, virt-viewer or virt-install, take a connect URI as a parameter. The connect URI can be thought of as specifying which set of virtual machines you want to control with that command, which physical machine to control, and how.

For example, I can use a command like:

virsh -c "xen+ssh://admin@corp.myoffice.net" start web-server

to start the web-server virtual machine on the xen cluster running at myoffice.net, by connecting as admin via ssh to the corresponding server.

If you don't specify any connect URI to virsh (or any other libvirt related command), by default libvirt will try to start a VM running as your username on your local machine (e.g., qemu:///session). That is, unless you are running as root, in which case libvirt will try to run the image as a system image, not tied to any specific user (e.g., qemu:///system).

I generally run most of my VMs as system VMs, and systematically forget to specify which connect URI to pass to commands like virsh or virt-install. What is more annoying is that some of those commands take the URI as -c while others take it as -C.
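To make the default described above concrete, here is a small python sketch that mirrors it and always spells the URI out when building a virsh command line. The helper names default_connect_uri and virsh_command are invented for this illustration, and this is only a model of the behavior described here, not code taken from libvirt.

```python
import os


def default_connect_uri(euid=None):
    """Mirror the default described above: system VMs for root, session VMs otherwise."""
    if euid is None:
        euid = os.geteuid()
    return "qemu:///system" if euid == 0 else "qemu:///session"


def virsh_command(args, uri=None):
    """Build a virsh argument list with the connect URI always made explicit."""
    return ["virsh", "-c", uri or default_connect_uri()] + list(args)


# Example: always talk to the system VMs, no matter who runs the command.
print(virsh_command(["start", "web-server"], uri="qemu:///system"))
```

The resulting list can be handed to subprocess.run as is.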

[ ... ]

While traveling, I have been asked a few times by security agents at airports to turn on my laptop, and well, show them it did work, and looked like a real computer.

Although they never searched the content and nothing bad ever happened, every time I cross the border or go through security I am worried about what might happen, especially given recent stories of people being searched and their laptops taken away for further inspection.

The fact I use full disk encryption does not help: if I were asked to boot, my choice would be to either enter the password and log in, thus disclosing most of the content of the disk, or refuse and probably have my laptop taken away for further inspection.

So.. for the first time in 10 years, I decided to keep Windows on my personal laptop. Even more, leave it as the default operating system in GRUB, and well, not show GRUB at all during boot.

Not because I think it is safer this way, but just to create as few pretexts or excuses as possible for anyone to further poke at my laptop, in case I need to show it or they need to inspect it.

[ ... ]

I am writing a small python script to keep track of various events and messages. It uses a flat file as an index, each record being of the same size and containing details about each message.

This file can get large, in the order of several hundreds of megabytes. The python code is trivial, given that each record is exactly the same size, but what is the fastest way to access and use that index file?

With python (or any programming language, for what it's worth), I have plenty of ways to read a file:

  • I can just rely on read and io.read in python having perfectly good buffering, and just read (or io.read) a record at a time.
  • I can read it in one go, and then operate in memory (eg, a single read in a string, followed by using offsets within the string).
  • I can do my own buffering, read a large chunk at a time, and then operate on each chunk as a set of records (eg, multiple read of some multiple of the size of the record, followed by using offsets within each chunk).
  • I can use fancier libraries that allow me to mmap or use some other crazy approach.
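To make the options above concrete, here is a rough sketch of each one. The 16 byte RECORD_SIZE, the function names and the sample file are all made up for this illustration, and nothing here says which approach is fastest: that still has to be measured.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 16  # hypothetical record size, in bytes


def read_one_at_a_time(path):
    """Rely on python's own buffering: one read() call per record."""
    with open(path, "rb") as f:
        while True:
            record = f.read(RECORD_SIZE)
            if len(record) < RECORD_SIZE:
                break
            yield record


def read_all_at_once(path):
    """Single read of the whole file, then offsets within the bytes object."""
    with open(path, "rb") as f:
        data = f.read()
    for offset in range(0, len(data) - RECORD_SIZE + 1, RECORD_SIZE):
        yield data[offset:offset + RECORD_SIZE]


def read_in_chunks(path, records_per_chunk=1024):
    """Own buffering: read a multiple of the record size, then use offsets."""
    chunk_size = RECORD_SIZE * records_per_chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for offset in range(0, len(chunk) - RECORD_SIZE + 1, RECORD_SIZE):
                yield chunk[offset:offset + RECORD_SIZE]


def read_with_mmap(path):
    """Map the file in memory and slice records out of it (file must not be empty)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for offset in range(0, len(mapped) - RECORD_SIZE + 1, RECORD_SIZE):
                yield mapped[offset:offset + RECORD_SIZE]


def write_sample_index(count):
    """Write count dummy records to a temporary file, return (path, records)."""
    records = [bytes([i % 256]) * RECORD_SIZE for i in range(count)]
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"".join(records))
    return tmp.name, records


path, records = write_sample_index(5)
for reader in (read_one_at_a_time, read_all_at_once, read_in_chunks, read_with_mmap):
    print(reader.__name__, len(list(reader(path))))
os.unlink(path)
```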

[ ... ]

Just a few days ago I realized that the Raspberry Pi I use to control my irrigation system was dead. Could not get to the web interface, pings would time out, could not ssh into it.

The first thing I tried was a simple reboot. The raspberry is in a black box in my backyard, maybe the hot summer days were... too hot? I have a cron job that shuts it down if the temperature goes above 70 degrees. Or maybe the shady wireless card and its driver stopped working? I have another cron job to restart it, so this seems less likely.

So.. I reboot it by physically unplugging it, but still nothing happens. The red led on the board, next to the ethernet plug, is on, which means it is getting power. The green led next to it flashes only once. From what I read online, this led can flash to report an error, or to indicate that the memory card is being read.

There is no error corresponding to one, single, flash, so I assume it means that it tried to read the flash, and somehow failed. It is supposed to be booting now, so I would expect much more activity from the memory card.

[ ... ]

When it comes to HTML, CSS, and graphical formatting, I feel like a daft noob.

Even achieving the most basic formatting seems to take longer than it should. Giving up on reasonable compromises is often more appealing to me than figuring out the right way to achieve the goal.

Anyway, tonight I am overjoyed! I wanted to have a <pre> block, with code that:

  • had a horizontal scroll bar.
  • but only when there are lines too long.
  • and well, long lines did not wrap.

I first fiddled with the white-space property, which has a nowrap value and various other ones. None of them seemed to do what I wanted: the only valid value to preserve white spacing was pre.

overflow-x: auto was easy to find. It would do the right thing except... the text was wrapping, so the scroll bar never showed up.

It took me a while to discover that a word-wrap: normal would do exactly what I wanted.

So, here is the final CSS:

pre {
  word-wrap: normal;
  overflow-x: auto;
  white-space: pre;
}

And here is what it looks like rendered:

This is a really really really really really really really really really really really really really really really really really really really really really really really long line

It's amazing how happiness at times can come from very little things.

[ ... ]

Jul 11, 2013 | Technology/Web

I've always liked text consoles more than graphical ones. This at least until some time in 2005, when I realized I was spending a large chunk of my time in front of a browser, and elinks, lynx, links and friends did not seem that attractive anymore.

Nonetheless, I've kept things simple: at first I started X manually, with startx, on an as-needed basis. I used ion (yes! ion) for a while, until it stopped working during some upgrade. Then I decided it was time to boot in a graphical interface, and started using slim. Despite some quirks, I've been happy since.

In terms of window managers, I really don't like personalizing or tweaking my graphical environment. I see it as a simple tool that should be zero overhead, require no maintenance, and not get in the way of what I want to do with a computer. I don't want to learn which buttons to click on, how to do transparency, which icons mean what, or where the settings I am looking for were moved to in the latest version.

[ ... ]

If you like hacking and have a few machines you use for development, chances are you know what I am about to talk about here. You start from this new idea, install a few tools, peek at some existing source code, try to compile it, get something running... and eventually move on to the next project.

At least until your laptop becomes a giant meatball of services running for who knows what reason, you can't remember which machine you were actually using for that test, or half assed scripts you have no memory of keep creeping up in your PATH.

My first attempt at finding a solution was based on chroots. The idea was simple: only develop on my laptop, but create a self contained environment for each project in which to install all the needed dependencies and tools, and in which to run all my crazy experiments. The holy grail of the time were chroots, and during those years, I became good friends with rsync, debootstrap, mount --rbind and sometimes even pivot_root.

This worked well for a while. Until, well, I ran into the limitations of chroots: they can't really simulate networking or run different kernels (or OSes), and they don't help much if you need to work on something boot related or that has to do with userspace and kernel interactions.

[ ... ]

Just a few days ago I finally got a new server to replace a good old friend of mine which has been keeping my data safe since 2005. I was literally dying to get it up and running and move my data over when I realized it had been 8 years since I last set up dmcrypt on a server I only had ssh access to, and had no idea of what the current best practices were.

So, let me start first by describing the environment. Like my previous server, this new machine is set up in a datacenter somewhere in Europe. I don't have any physical access to this machine, I can only ssh into it. I don't have a serial port I can connect to over the network, I don't have IPMI, nor something like intel kvm, but I really want to keep my data encrypted.

Having a laptop or desktop with your whole disk encrypted is pretty straightforward with modern linux systems. Your distro will boot up, kernel will be started, your scripts in the initrd will detect the encrypted partition, stop the boot process, ask you for a passphrase, decrypt your disk, and happily continue with the boot process.

[ ... ]

Let's say you have a CSS with a few thousand selectors and many many rules. Let's say you want to eliminate the unused rules, how do you do that?

I spent about an hour looking online for some tool that would easily clean up CSS files. I've ended up trying a few browser extensions:

  • CSS Remove and combine, for chrome, did not work for me. It would only parse the very first web site in my browser window, and seemed to refuse file:/// urls. I later discovered that chrome natively supports this feature: just go in developer tools (ctrl + shift + i), click the audits tab, click run, and you will find a drop down with the list of unused rules in your CSS.

  • Dust-me Selectors, for firefox, worked like a charm: it correctly identified all the unused selectors.

In both cases, however, the list was huge: I had thousands of unused selectors. I was really not looking forward to going through my CSS by hand, considering also that many styles had multiple selectors, and I could only remove the unused ones.

[ ... ]

Let's say you have a regression test or fuzz testing suite that relies on generating a random set of operations, and verifying their results (like ldap-torture).

You want this set of operations to be reproducible, so if you find a bug, you can easily get to the exact same conditions that triggered it.

There are many ways to do this, but one simple way is to use one of many pseudo random generators, one that given the same starting seed generates the same sequence of random numbers. Example?

Let's look at perl:

# Seed the random number generator.
srand($seed);

# Generate 100 random numbers.
for (my $count = 0; $count < 100; $count++) {
  print rand() . "\n";
}

Given the same $seed, the sequence of random numbers will always be the same. Not surprising, right?

Now, let's go back to our original problem: you want your test to be reproducible, but still be random. Something you can do is get rid of $seed, and just call srand(). srand will return the seed it generated, which you can helpfully print on the screen and reuse if you need to. The final code would look like:

if ($seed) {
  # Use an existing seed to reproduce a failing test.
  srand($seed);
} else {
  # Let srand pick a seed to start a newly randomized test.
  $seed = srand();
}

print "TO REPRODUCE TEST, USE SEED: " . $seed . "\n";

Now, where is the problem? Well, the problem is that before perl 5.14 (~2011, in case you are wondering), srand() did not return the seed it set. Just doing $seed = srand() did not work.
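One way to sidestep this kind of problem, in any language, is to never depend on the seeding function telling you which seed it used: pick the seed yourself, print it, and then seed the generator with it. Here is a sketch of that idea in python rather than perl. The function name make_rng is made up for this example, and this is just one possible workaround, not necessarily the best one for perl.

```python
import random


def make_rng(seed=None):
    """Return (rng, seed), picking a fresh seed when none is given."""
    if seed is None:
        # Choose the seed ourselves, so we always know what it is.
        seed = random.SystemRandom().randrange(2**32)
    return random.Random(seed), seed


rng, seed = make_rng()
print("TO REPRODUCE TEST, USE SEED:", seed)

# Generate 100 random numbers; passing the printed seed back to
# make_rng() yields exactly the same sequence again.
numbers = [rng.random() for _ in range(100)]
```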

[ ... ]

While trying to get ldap torture back in shape, I had to learn again how to get slapd up and running with a reasonable config. Here are a few things I had long forgotten and that I have learned this morning:

  1. The order of the statements in slapd.conf is relevant. Don't be naive, even though the config looks like a normal key value store, some keys can be repeated multiple times (like backend, or database), and can only appear before / after other statements.
  2. My good old example slapd.conf file no longer worked with slapd. Some of it is because the setup is just different, some of it because I probably had a few errors to begin with, some of it is because a few statements moved around or are no longer valid. See the changes I had to make.
  3. Recent versions of slapd support having configs in the database itself, or at least represented in LDIF format and within the tree. Many distros ship slapd with the new format. To convert from the old format to the new one, you can use:

    slapd -f slapd.conf -F /etc/ldap/slapd.d
    
  4. I had long forgotten how quiet slapd can be, even when things go wrong. Looking in /var/log/syslog might often not be enough. In fact, my database was invalid, configs had errors, and there was very little indication that when I started slapd, it was sitting there idle because it couldn't really start. To debug errors, I ended up running it with:

    slapd -d Any -f slapd.conf
    
  5. slapd will not create the initial database by itself. To do so, I had to use:

    /usr/sbin/slapadd -f slapd.conf < base.ldiff
    

    with base.ldiff being something like this.

[ ... ]

Have you ever been lost in conversations or threads about one or the other file system? Which one is faster? Which one is slower? Is that feature stable? Which file system to use for this or that payload?

I was recently surprised by seeing ext4 as the default file system on a new linux installation. Yes, I know, ext4 has been around for a good while, and it does offer some pretty nifty features. But when it comes to my personal laptop and my data, well, I must confess switching to something newer always sends shivers down my back.

Better performance? Are you sure it's really that important? I'm lucky enough that most of my coding & browsing can fit in RAM. And if I have to recompile the kernel, I can wait that extra minute. Is the additional slowness actually impacting your user experience? and productivity?

Larger files? Never had to store anything that ext2 could not support. Even with a 4GB file limit, I've only rarely had problems (no, I don't use FAT32, but when dmcrypt/ecryptfs/encfs and friends did not exist, I used for years the good old CFS, which turned out to have a 2GB file size limit). Less fragmentation? More contiguous blocks? C'mon, how often have you had to worry about the fragmentation of your ext2 file system on your laptop?

What I generally worry about is the safety of my data. I want to be freaking sure that if I lose electric power, forget my laptop in suspend mode or my horrible wireless driver causes a kernel panic I don't lose any data. I don't want no freaking bug in the filesystem to cause any data loss or inconsistency. And of course, I want a good toolset to recover data in case the worst happens (fsck, debug.*fs, recovery tools, ...).

[ ... ]

Back in 2004 I was playing a lot with OpenLDAP. Getting it to run reliably turned out more challenging than I had originally planned for:

  1. BerkeleyDB performance was terrible if the proper tunings were not provided. Nowhere in the docs was it mentioned that this was necessary. The way to do it was to drop a DB_CONFIG file in the top level directory of the database. Not a feature of openldap, rather a feature of BerkeleyDB.
  2. Not only would performance be terrible, but even the latest BerkeleyDB versions at the time had a bug (feature?) by which, with the indexes used by openldap, the database would deadlock if certain parts of the index did not fit in memory. I don't remember the details of the problem, it's been too long, but I do remember it was painful, and I ended up submitting changes to the openldap package in debian to make sure this was mentioned in the documentation, and that a reasonable default would be provided.
  3. At the time, OpenLDAP supported two kinds of backends: BDB and HDB, both based on BerkeleyDB. The first, older, did not support operations like 'movedn', which had been standardized in the LDAP protocol for a while, and a few other features that HDB had. HDB, though, was marked as experimental. During our use, we found several bugs.

[ ... ]