Tuesday, June 26, 2007

Individuals and interactions over processes and tools



Today I heard about a project team that is transitioning to Agile and was looking to better prioritize their feature backlog. The team had been using a voting system to determine priorities, but they decided that a better solution was required. After some discussion, the project team decided to use business value points to prioritize the work. This is a good thing! The team recognized a deficiency in their system and adapted to improve their effectiveness.

Unfortunately I think the story takes a bit of a downward turn from then on.

The team decided to build a "backlog management/prioritization tool" and subordinate most of the backlog items to building it. Now, I don't think that backlog management tools are bad per se. However, I would prefer an existing tool such as Microsoft Excel, or even VersionOne or Rally, to a hand-rolled solution, and I would hand-roll one only when the demands of the project really require it. From what I've heard so far, I don't believe that to be the case. Let me assume for a minute that it will take 1-2 months to develop a basic software package. Given 2 developers at an average cost of $60 per hour (base pay + benefits) and assuming roughly 160 working hours per month, we're looking at around $19,200 for one month of work and twice that for two, not including the lost time on higher business value features sitting on the backlog. Is that worth the cost?

My other thoughts revolve around committing to build a tool before having ever estimated business value points. Is this really a good idea? Looking to Lean for inspiration, "defer commitments until the last possible moment" comes to mind. What if the organization spends even half of my estimated $20k on this tool and the business users decide that business value points are too ambiguous to be used effectively and want to go back to voting?
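The cost estimate above can be checked with a few lines of shell arithmetic. This is only a rough sketch: the 160 working hours per month is my assumption (it is the figure that makes 2 developers at $60 per hour come out to $19,200), and the function name tool_cost is made up for illustration.

```shell
#!/bin/sh
# Back-of-the-envelope cost of hand-rolling the tool.
developers=2
rate=60               # dollars per hour, base pay + benefits
hours_per_month=160   # assumption: about 40 hours a week, 4 weeks a month

# tool_cost MONTHS: print the labor cost in dollars for MONTHS of work.
tool_cost() {
    echo $(( developers * rate * hours_per_month * $1 ))
}

echo "1 month:  \$$(tool_cost 1)"
echo "2 months: \$$(tool_cost 2)"
```

The one-month figure matches the estimate in the text; a two-month effort would cost twice as much, before counting the features that wait on the backlog in the meantime.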

It is an interesting story and I don't know if it will have a happy ending. One thing is for sure: I will be in touch with my colleague to see how things shake out. More to come, I'm sure...

Sunday, June 17, 2007

CURL is your friend


One of our systems requires that we download a file from a website and import the file's contents. When we first started importing the file we did it manually (which we all know is not the best use of a developer's time). Unfortunately, there were a few bumps in the road to automation:

1) No FTP site, HTTP access only.
2) No RSS/e-mail/notification of any kind that the file has been updated and a new one is available. (A Last Modified date existed on the page, but we found it was out of sync with the actual file modifications.)
3) The file is large (~25MB).

Fortunately HTTP and the CURL utility make overcoming these limitations pretty easy.

Step 1: Use CURL to download the file

CURL is a command-line program, available on Linux and most other platforms, that will retrieve the contents of a given URL. We can use CURL to easily get around limitation #1.

curl http://example.com/data-file.dat > data-file.dat

This works great. We can now incorporate the downloading of the file into a scheduled script. This leaves us with limitations #2 and #3. As it turns out, CURL's HTTP support lets us check for changes without paying the overhead of downloading a 25MB file every time.
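As a rough sketch of what the scheduled script might contain, the function below wraps the download so that a failed request never clobbers the last good copy. The function name fetch_file and the ".tmp" suffix are my own choices for illustration, not part of our actual script.

```shell
#!/bin/sh
# fetch_file URL DEST: download URL to DEST, replacing DEST only on success.
fetch_file() {
    url=$1
    dest=$2
    tmp="$dest.tmp"
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx) instead of
    # saving the error page. --silent with --show-error keeps the output
    # quiet unless something goes wrong, which suits a scheduled job.
    if curl --fail --silent --show-error --output "$tmp" "$url"; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call (not executed here):
#   fetch_file http://example.com/data-file.dat data-file.dat
```

Downloading to a temporary file first means the import step never sees a half-written file if the connection drops partway through 25MB.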

Step 2: Mix in a little HTTP goodness

We've all heard of GET and POST, two of the methods defined in the HTTP spec. There is a lesser-known HTTP method called HEAD that can be used to get information about a resource on the web without returning its contents. That seems to do the trick for limitation #3, but how does it help us with #2? Let's try it and see:

curl --head http://example.com/data-file.dat

Which returns something like this:

HTTP/1.1 200 OK
Content-Type: application/rdf+xml
Last-Modified: Sun, 17 Jun 2007 06:43:12 GMT
Expires: Sun, 17 Jan 2038 19:14:07 GMT
Server: Apache
Content-Length: 26214400
Date: Mon, 18 Jun 2007 00:35:28 GMT

The interesting item in this response is the "Last-Modified" header. It gives the time at which the file on the server last changed. In our script we can now compare it with the timestamp of the last file we downloaded and fetch the new file only when necessary.
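One way to sketch that comparison in a script is shown below. The helper names (extract_last_modified, remote_changed) and the idea of keeping the last seen value in a small "stamp" file are assumptions made for illustration. The sketch compares the two header values as plain strings, which avoids date parsing entirely: any difference counts as a change.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print the Last-Modified value.
extract_last_modified() {
    # HTTP header lines end in CRLF, so strip the carriage returns first.
    tr -d '\r' | awk 'tolower($0) ~ /^last-modified:/ {
        sub(/^[^:]*:[ \t]*/, "")
        print
        exit
    }'
}

# remote_changed URL STAMP_FILE
# Succeeds (exit 0) when the server's Last-Modified value differs from the
# one saved in STAMP_FILE, including when nothing has been saved yet.
remote_changed() {
    remote=$(curl --silent --head "$1" | extract_last_modified)
    saved=$(cat "$2" 2>/dev/null || true)
    [ "$remote" != "$saved" ]
}

# Example (not executed here):
#   if remote_changed http://example.com/data-file.dat data-file.stamp; then
#       echo "new file available"
#   fi
```

After a successful download the script would write the new Last-Modified value into the stamp file so the next run has something to compare against. Note that a failed HEAD request yields an empty value, which this sketch also treats as a change whenever a stamp is already saved; a production script would want to check curl's exit status as well.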

Step 3: Even more HTTP goodness

If you don't like to do date comparisons in script, HTTP offers another option: the If-Modified-Since header. Sending it turns the request into a conditional GET. If the resource specified has not been modified since the date passed in the header, the server should respond with a 304 (Not Modified) and no body. If the resource has been modified, the server will return a 200 (OK) along with the contents of the file.
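To see the conditional GET at the HTTP level, the header can be sent by hand and the status code inspected. This is a minimal sketch for experimenting; the function names are hypothetical, and the body of a 200 response is simply thrown away here.

```shell
#!/bin/sh
# conditional_status URL HTTP_DATE
# Send a GET with an If-Modified-Since header and print only the status code.
conditional_status() {
    curl --silent --output /dev/null \
         --header "If-Modified-Since: $2" \
         --write-out '%{http_code}' "$1"
}

# describe_status CODE: translate the status code into plain words.
describe_status() {
    case $1 in
        304) echo "not modified" ;;
        200) echo "modified" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example (not executed here):
#   code=$(conditional_status http://example.com/data-file.dat \
#          "Sun, 17 Jun 2007 06:43:12 GMT")
#   describe_status "$code"
```

Not every server honors If-Modified-Since, so it is worth running a check like this against the real site before relying on it.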

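So, to clean this up even more, simply run curl with the --time-cond option and the timestamp from the last file. Note that the command below uses --output rather than shell redirection: the shell empties the target of > before curl even starts, so a 304 response would leave us with an empty file, whereas with --output curl leaves the existing file untouched when the condition is not met. Our final curl command looks like this:

curl --time-cond "Sun, 17 Jun 2007 06:43:12 GMT" --output data-file.dat http://example.com/data-file.dat

This request will download the contents of the file only if it has been modified since the given timestamp; otherwise the server answers with a 304 and nothing is transferred.

Using CURL enables us to completely automate the downloading of that file in a standard way, even though the vendor didn't make it the easiest for us.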
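Putting the pieces together, a scheduled job might look something like the sketch below. It relies on two documented curl behaviors: --time-cond accepts a file name and uses that file's modification time as the date, and --remote-time stamps the downloaded file with the server's Last-Modified time. The function name sync_file, the ".part" suffix, and the "updated"/"unchanged" messages are my own inventions.

```shell
#!/bin/sh
# sync_file URL DEST
# Download URL to DEST only if the server copy is newer than DEST.
# Prints "updated" or "unchanged"; returns non-zero if curl fails.
sync_file() {
    url=$1
    dest=$2
    tmp="$dest.part"
    rm -f "$tmp"
    if [ -f "$dest" ]; then
        # Conditional GET based on the modification time of the local copy.
        curl --fail --silent --show-error --remote-time \
             --time-cond "$dest" --output "$tmp" "$url" || return 1
    else
        # First run: nothing to compare against, so fetch unconditionally.
        curl --fail --silent --show-error --remote-time \
             --output "$tmp" "$url" || return 1
    fi
    # A 304 response has no body, so the temporary file is missing or empty.
    if [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
        echo "updated"
    else
        rm -f "$tmp"
        echo "unchanged"
    fi
}

# Example (not executed here):
#   sync_file http://example.com/data-file.dat data-file.dat
```

The import step can then run only when the function prints "updated". One caveat of this sketch: a legitimately empty file on the server would be reported as "unchanged".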