Monday, January 23, 2012

SVN Revert (but only *some* changes)

Sometimes I embark on complicated refactorings and, instead of committing to a branch when at least a part is working, I keep editing until I lose control of the source and wish I could go back to that previous working state.

The solution is, luckily, not too difficult:

  1. save the edits to a patch
    svn diff > great_mother_of_all_patches.patch
  2. revert the project to the last commit:
    svn revert -R .
  3. apply the patch edit by edit (I know this is tedious) using the Eclipse "Team Synchronizing" perspective: click on the Synchronize button and choose the "Synchronize with patch" option
Disclaimer: I don't use the Eclipse subversion plugin because, on Linux at least, it frequently crashes Eclipse.
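If you'd rather skip Eclipse altogether, the same save/revert/re-apply cycle also works with plain diff and patch, since svn diff emits ordinary unified diffs. Here is a toy roundtrip using only coreutils (no svn involved, file names made up):

```shell
# simulate the workflow: save edits as a patch, revert, re-apply
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/file.txt"
cp "$workdir/file.txt" "$workdir/file.orig"      # stands in for the last commit
printf 'hello\nworld\n' > "$workdir/file.txt"    # our risky edits
diff -u "$workdir/file.orig" "$workdir/file.txt" > "$workdir/work.patch" || true
cp "$workdir/file.orig" "$workdir/file.txt"      # stands in for "svn revert -R ."
patch "$workdir/file.txt" "$workdir/work.patch"  # re-apply the saved edits
```

With a real working copy you would produce the patch with svn diff, revert with svn revert -R ., and run patch -p0 from the project root, since svn diff uses paths relative to it.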

Friday, January 20, 2012

OTRS-hacking: log all sql queries for use in external reports

OTRS is a great ticket-tracking/helpdesk software. It even has cool statistics built in, but sometimes the PHB goes crazy and asks for impossibly detailed statistics.

For those situations when the stats module cannot help us, there are a couple of (Open Source) tricks up our sleeve:
  1. install the Eclipse Birt web reporting application (we will need java and tomcat for that)
  2. create a custom report using the report designer
  3. deploy the report on the server and send the link to the boss
To make sure the SQL queries used in the report are the same used by OTRS we can temporarily have OTRS log all queries (remember to disable logging after you're done or your log files will grow out of control) by editing the Kernel/System/ file as documented here:

# 0=off; 1=updates; 2=+selects; 3=+Connects;
# $Self->{Debug} = $Param{Debug} || 0; 
# leave the original around for later
$Self->{Debug} = 2; 

Tuesday, January 17, 2012

Grails, blobs and postgres 'bytea_output'

Grails has great support for storing binary data into any database, Postgres included. It is usually simple to use but today it gave me problems.

I have an application originally developed on pg 8.4 and grails 1.3.6 which stores images in a database table and used to work fine until I upgraded the database to 9.1. It turns out that with version 9 Postgres by default returns data to the client using a new hex format, instead of the escape format used in versions < 9. This, of course, confuses the client, which returns garbage to the browser: hence the corrupted images I was seeing.

The solution is, luckily, quite simple: just tell Postgres to revert to the old behaviour. This behaviour can be tuned per database, so you could have one database using the new format and another the old one:

ALTER DATABASE dbname SET bytea_output='escape';

I should point out that I could also have updated the jdbc driver (now I'm using postgresql-8.3-603.jdbc3.jar), but that seems more risky and would probably break backwards compatibility with 8.x.
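For the record, bytea_output can be tuned at other scopes besides the database; the statements below are standard Postgres syntax, with dbname and appuser as placeholders:

```sql
-- per database, as above
ALTER DATABASE dbname SET bytea_output = 'escape';
-- per role, e.g. only for the legacy application's user
ALTER ROLE appuser SET bytea_output = 'escape';
-- per session, handy for a quick test from psql
SET bytea_output = 'escape';
SHOW bytea_output;
```

The per-role form is useful when only one old application on the server is affected by the format change.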

Saturday, January 14, 2012

Using SQLite to keep state in shell scripts

Bash shell scripting is one of the things I miss most when I work on Windows. But as good as shell scripting is, sometimes I wish it were easier to keep track of state across script executions, for instance when a script runs rsync at short intervals. In that case I don't care if one particular rsync fails, but I definitely care if it fails, say, ten times in a row or for more than a day.

To do this I need some way to keep state and record each run's exit status. The simplest approach I could think of is SQLite. Copied straight from the SQLite site: SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. SQLite is the most widely deployed SQL database engine in the world. The source code for SQLite is in the public domain.

As I said, in this example I'll show how to track the status of a series of rsync operations, all run from a central node and pushed to remote nodes. The script is scheduled to run every 30 minutes.

The first thing the shell script does is create and initialize the database if it does not exist yet. This is pretty easy (notice the unique constraint, which I'll explain below):

function createdb {
 echo "create table stat(host varchar(20) unique, failcount smallint, lastsuccess datetime);" | sqlite3 db/stats.db
}

mkdir db > /dev/null 2>&1
test -e db/stats.db || createdb

I put this code at the beginning of the script. If the database gets corrupted I just delete the file and it gets recreated with the next run.

The table structure is easy enough: for each host I want to know when the last successful replication occurred and how many failures since then.

The second important step is to make sure there is a row for each host, because in SQL data must first be inserted before it can be updated. This is also easy since SQLite supports INSERT OR IGNORE: we add a unique constraint on the host column and then, at every run, attempt to insert an initial row for each host. On a traditional db (like Oracle or Postgres) the second insert would trigger an error because the duplicate value violates the unique constraint. On a db that supports OR IGNORE, instead, we can tell the db to skip the duplicate and keep the existing row. This way I don't have to test whether the row exists and insert it if it doesn't, with the bonus that the script stays small and simple:

echo "insert or ignore into stat(host,failcount,lastsuccess) values ('$ip', 0, null);" | sqlite3 db/stats.db  

SQL statements can be fed on stdin or as the second argument of the sqlite3 program. I'll show you how in the last fragment. In this case $ip holds the ip address or hostname used to identify the host in the script.
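To see the OR IGNORE behaviour in isolation, here is a tiny demo you can run anywhere sqlite3 is installed (the table and values mirror the ones above; the ip is made up):

```shell
db=$(mktemp)
sqlite3 "$db" "create table stat(host varchar(20) unique, failcount smallint, lastsuccess datetime);"
# the first insert creates the row
sqlite3 "$db" "insert or ignore into stat(host,failcount,lastsuccess) values ('10.0.0.1', 0, null);"
# the second hits the unique constraint on host and is silently skipped
sqlite3 "$db" "insert or ignore into stat(host,failcount,lastsuccess) values ('10.0.0.1', 0, null);"
sqlite3 "$db" "select count(*) from stat;"   # prints 1
```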

After the rsync command has run I capture the exit code and store it into the db:

if [ "$status" == "0" ]; then
   echo "update stat set failcount=0, lastsuccess=datetime('now') where host='$ip';" | sqlite3 db/stats.db
else
   echo "update stat set failcount=failcount+1 where host='$ip';" | sqlite3 db/stats.db
fi

At the end of the script I write the failure stats to a file that is later piped into a mail notification:

echo "The following hosts NEVER completed a sync successfully:" > stats.txt
sqlite3 db/stats.db "select host from stat where lastsuccess is null" >> stats.txt
echo "" >> stats.txt
echo "The following hosts did not sync in the last 24 hours:" >> stats.txt
sqlite3 db/stats.db "select host from stat where lastsuccess < (datetime('now','-1 day')) and failcount>0" >> stats.txt
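The whole bookkeeping can be tried end to end on any machine with sqlite3 installed; the host name and exit statuses below are made up:

```shell
db=$(mktemp)
sqlite3 "$db" "create table stat(host varchar(20) unique, failcount smallint, lastsuccess datetime);"
sqlite3 "$db" "insert or ignore into stat(host,failcount,lastsuccess) values ('node1', 0, null);"
for status in 1 1 0; do   # simulate two failed rsyncs, then a success
  if [ "$status" = "0" ]; then
    sqlite3 "$db" "update stat set failcount=0, lastsuccess=datetime('now') where host='node1';"
  else
    sqlite3 "$db" "update stat set failcount=failcount+1 where host='node1';"
  fi
done
sqlite3 "$db" "select failcount from stat where host='node1';"   # prints 0
```

After the simulated success the counter is back to zero and lastsuccess carries a timestamp, so neither of the two report queries above would flag node1.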

Thursday, January 12, 2012

Rsync (or any other cygwin daemon) as a Windows service

Today I was asked to quickly put together a script to replicate files/directories from a central server to a number of XP clients. This would be dead simple if it were not for Windows.
The customer originally thought of robocopy/xcopy but I don't really trust them: error reporting is kind of difficult to read and copying over CIFS is slow and unreliable. So I set out to build it on top of rsync, but to do that I had to find a way to get rsync installed as a service on all the clients.

Turns out it's quite easy. I started from a clean cygwin install to which I added rsync and cygrunsrv. The latter is the native cygwin tool that can be used to run any traditional *nix daemon as a native Windows service.
Even a basic install like this will be on the order of 50MB, which is too large for quick deployments. I then started trimming all the stuff that was not essential to the rsync service.

At the end of the trimming process I was left with a mere 10MB, which after being compressed in a nice nsis setup (source here) went down to 5MB. That should be small enough.

This is the complete list of files that I kept from the original cygwin install. I deleted all unneeded programs (like awk, groff, etc.), man pages, static libraries, header files, timezones and even language translations (except a few basic ones). Please note that the rsyncd.conf was created by me with the settings needed on this specific occasion:
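For reference, a minimal rsyncd.conf along these lines could look like the following; the module name, path and logging options are placeholders, not the settings from the actual deployment:

```
# minimal illustrative rsyncd.conf for the cygwin service
use chroot = false
log file = /var/log/rsyncd.log

[files]
    path = /cygdrive/c/replica
    read only = false
```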


Now the only thing missing is a bat file with the commands used to install the service:

cd c:\cygwin\bin\
cygrunsrv -I "RSYNC" -d "Rsync server" -p /bin/rsync.exe -a "--daemon --no-detach"
net start RSYNC
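Should you ever need to undo the installation, the counterpart commands are (RSYNC being the service name chosen above):

```
net stop RSYNC
cygrunsrv -R "RSYNC"
```

cygrunsrv -R only unregisters the service; the cygwin files themselves must be deleted separately.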

For ssh, telnet, amanda or anything else the steps are much the same: start from the list above and add the needed dlls and binaries. A nsis script that you can use as a template is available here.

Thursday, January 05, 2012

JTS processing Grails servlet

Sometimes it is convenient to make certain topological operations available in a web-based gis. For a gis that makes heavy use of Javascript (like OpenLayers-based ones) it might be worth looking at jSTS, a Javascript port of JTS.

For all the rest and for those who don't want to load yet another library in the browser you can always write a Grails controller that encapsulates common JTS operations like buffer, intersection, union, etc.

Assuming you are familiar with Grails the steps are as follows:
  1. drop the jts jar in the lib directory
  2. create a controller and define the relevant methods
  3. define a url mapping to prettify the calls
Step 1 is trivial, so we'll go straight to 2. Create a controller and call it JtsController, then open the source file and paste this code:

import com.vividsolutions.jts.geom.*
import com.vividsolutions.jts.io.WKTReader
import com.vividsolutions.jts.operation.overlay.snap.*
import grails.converters.JSON

import grails.plugins.springsecurity.Secured

class JtsController {
 def exec = {
  def pm = new PrecisionModel(PrecisionModel.FLOATING_SINGLE)
  def fact = new GeometryFactory(pm)
  def wktRdr = new WKTReader(fact)
  def text = request.reader.text
  if (params.operation) {
   def geometries = text.split("\\*")
   Geometry A = selfSnap(wktRdr.read(geometries[0]))
   Geometry B = null
   Geometry C = null
   if (geometries.length == 2)
    B = selfSnap(wktRdr.read(geometries[1]))
   if ("area".equalsIgnoreCase(params.operation))
    C = A
   else if ("intersection".equalsIgnoreCase(params.operation))
    C = A.intersection(B)
   else if ("union".equalsIgnoreCase(params.operation))
    C = A.union(B)
   else if ("buffer".equalsIgnoreCase(params.operation)) {
    // defaults to 25
    C = A.buffer(25)
   } else if (params.operation.startsWith("buffer")) {
    // parametric buffer, e.g. "buffer50"
    def distance = (String) params.operation.substring(6)
    C = A.buffer(Double.parseDouble(distance))
   } else {
    render text: "${params.operation} not supported.", status: 400
    return false
   }
   render(contentType: "text/json") {
    // assumed response body: the resulting geometry as WKT
    wkt C.toText()
   }
  } else {
   render text: "Please supply an operation to be performed.", status: 400
   return false
  }
 }

 def selfSnap(Geometry g) {
  double snapTol = GeometrySnapper.computeOverlaySnapTolerance(g)
  GeometrySnapper snapper = new GeometrySnapper(g)
  Geometry snapped = snapper.snapTo(g, snapTol)
  // need to "clean" snapped geometry - use buffer(0) as a simple way to do this
  Geometry fix = snapped.buffer(0)
  return fix
 }
}
The relevant points to note are:
  • the geometries (up to two) are sent in the POST body in WKT format, separated by a * (you may change that, I just happened to like the *)
  • both geometries are 'cleaned' with a self-snap operation to prevent invalid geometries from blocking the operation (in my case I had many, and fixing the source data was not an option as I am not the owner of the dataset)
  • the operation is specified as part of the url, thanks to a custom url mapping
The url mapping (step 3) is as follows:

"/jts/$operation"(controller: "jts") {
   action = [GET: "exec", POST: "exec"]
}
A JTS buffer operation can then be invoked in Sproutcore as follows:

SC.Request.postUrl("/app/jts/buffer25")
     .notify(this, 'didPerformGeoOperation')
     .send(geom1.toString() + "*");
A JTS intersection operation in jQuery:

$.ajax({
  type: 'POST',
  url: "/app/jts/intersection",
  data: geom1.toString() + "*" + geom2.toString(),
  success: didPerformGeoOperation
});

Tuesday, January 03, 2012

Managing a Mapserver farm with Salt

Mapserver is probably the most popular Open Source web mapping platform (even though Geoserver is getting much of the limelight nowadays).

One of the advantages that Mapserver has over Geoserver is that its configuration is pretty easy because it consists of a flat text file (Geoserver instead uses an xml-backed repository).
Because of this kind of repository, managing a Geoserver farm becomes complicated when changes have to be replicated across all hosts and the services restarted to pick up the changes. To address this issue there have been recent efforts to build a multi-master replication mechanism plugged into Geoserver. While this is pretty cool (and it's done by an Italian company, of which I'm proud, being Italian myself) I think it's even cooler to see how easy it is to manage Mapserver configuration files in a similar cluster environment.

The Mapserver setup is as follows:
  1. a cluster of mapserver servers serving WMS through a number of maps (more than one, otherwise it's pointless)
  2. a master node managing the cluster (can be one of the nodes in the cluster)
  3. data is stored in a shareable source like a db
When a map needs to be changed the map file is edited on the master node (tested, if necessary) and then the changes are replicated to all nodes. For the sake of simplicity we assume that all map files are stored under /srv/maps/ (optionally in subdirs) and are referenced in the WMS requests with the usual ?map=/srv/maps/ parameter. Since the map file is read with every request there is no need to restart anything.

With Mapserver the only tool required for the job is Salt. Salt is a remote execution and configuration manager. It works in more or less the same way as Puppet, but it's free. Puppet is actually more sophisticated (read: expensive ;-) than Salt, but in our case all the extra sophistication does not change the outcome.

Installing Salt is a piece of cake and on Ubuntu it is only a matter of adding the repo and then apt-get install. The details are here. The same install must be done on all nodes and on the master node.

When you're done simply start the salt daemon on the master:

/etc/init.d/salt-master start

On the nodes, edit the /etc/salt/minion configuration file, find the master option and set it to the master's address or dns name, then start the client with the command:

/etc/init.d/salt-minion start

(clients are called minions in Salt parlance). Check that all minions are communicating with the master by issuing this command on the master:

salt-key -L

This command will report the keys of all minions that have communicated with the master. Before the master can issue commands to the minions the master must accept the minions' keys. Let's do it with this command:

salt-key -A

Now let's check communications again by asking all the minions to ping the master and report back:

salt '*'

If everything is ok it's time to configure the mapserver replication on the master.
Edit the /etc/salt/master file on the master and uncomment the default file root in the File Server settings. It should read like this:

file_roots:
  base:
    - /srv/salt

Restart the master and create the following files in /srv/salt:

bash# cat top.sls
base:
  '*':
    - mapserver

bash# cat mapserver.sls
/srv/maps:
  file:
    - recurse
    - source: salt://mapserver/srv/maps

Now let's create the directory /srv/salt/mapserver/srv/maps and copy the mapfiles (along with dependencies like symbols, fonts, etc.) into it.

Restart the salt master (/etc/init.d/salt-master restart) and if there are no syntax errors we should be ready to go.

WARNING: the following commands will cause the files in the /srv/maps directory of the minions to be overwritten by those served by the master. As of 0.9.4 files that exist on the minions but are not on the master will not be modified. Do not proceed further on a live system unless you know what you're doing. You have been warned.

With this command we tell the salt master to push state changes to all minions. The state changes include a full replica of the contents of /srv/salt/mapserver/srv/maps.

salt -t 60 '*' state.highstate

The replication might take some time but will eventually complete. Now check on the minions that the files have been correctly transferred. Every time you must push changes again just drop the files on the master and then run the state.highstate command.
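The check itself can also be run through Salt with the cmd.run execution module, instead of logging into each minion by hand:

```
salt '*' cmd.run 'ls -l /srv/maps'
```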

Congrats, you're done.