Excluding log4j dependencies

Posted: February 2nd, 2009 | Author: David | Filed under: Programming

It has been noted in a few other places that log4j pulls a small world of unnecessary transitive Sun dependencies into a Maven 2 project, and that the easiest fix is to exclude them. I thought I’d get it down here too, as this particular trouble has come knocking at my door again.

<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.15</version>
    <optional>false</optional>
    <exclusions>
        <exclusion>
            <groupId>com.sun.jdmk</groupId>
            <artifactId>jmxtools</artifactId>
        </exclusion>
        <exclusion>
            <groupId>com.sun.jmx</groupId>
            <artifactId>jmxri</artifactId>
        </exclusion>
        <exclusion>
            <groupId>javax.jms</groupId>
            <artifactId>jms</artifactId>
        </exclusion>
    </exclusions>
</dependency>

Evaluating Feeds

Posted: February 2nd, 2009 | Author: David | Filed under: Programming

A not so uncommon situation I’m finding is that a website will have more than one feed associated with it. This is sometimes just to point to alternative markup (e.g. different versions of RSS spec, or a site offering both RSS and Atom feeds, or combinations thereof), or to hook up with feed aggregation services (Feedburner easily being the most prevalent), but the content of the feed can also sometimes be quite different.

Initially, I had made the crude assumption that for me, RSS is more useful than Atom (as I had written a very lightweight RSS parser). Now that I’m incorporating the ROME Java API for feed processing, I’m not so bothered about the choice of tech, or the spec of that tech, but I am quite interested in hooking up with the best feed for my purposes. I also don’t want to have to approve a few hundred feeds manually.

So what’s the best feed for my purposes? Assuming that these feeds cover the same subjects (i.e. new posts to the blog), the best feed is most likely going to be the one with the most content.

A really simple algorithm for deriving the feed with the most content

The first task is to pre-process each feed to derive a value for every post: the number of words in its description, and the largest number of words found in any representation of its content, counted once all markup has been removed.
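As a minimal sketch of that pre-processing step, the Java below counts the words in a post once markup is stripped. The class and method names are hypothetical, the regex-based tag removal is a crude stand-in for a proper HTML parser, and adding the description count to the largest content count is just one reading of the rule; none of it relies on the ROME API.

```java
import java.util.List;

// Hypothetical helper, not part of ROME or any other library.
final class PostWordCount {

    // Crude markup removal: replaces anything that looks like a tag with a space.
    // This is an assumption for illustration; a real HTML parser is more robust.
    static String stripMarkup(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // Counts whitespace-separated words once markup has been removed.
    static int countWords(String text) {
        String plain = stripMarkup(text).trim();
        return plain.isEmpty() ? 0 : plain.split("\\s+").length;
    }

    // Value of one post: words in the description plus the largest word count
    // among the representations of the content (assumed interpretation).
    static int postValue(String description, List<String> contents) {
        int largest = 0;
        for (String content : contents) {
            largest = Math.max(largest, countWords(content));
        }
        return countWords(description) + largest;
    }
}
```

Character entities such as `&amp;` are counted as ordinary words here, which is probably close enough for comparing feeds against each other.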

We’re then left with a mapping from each feed to a list of word counts for its respective posts, such as:

feed_1..n → { words_post1, words_post2, ..., words_postx }

Since the number of posts in each feed could vary (and the number of posts a feed covers shouldn’t be a discriminating factor), we take the minimum length of all the word count lists, and sum the word counts within that range for each feed. We can then select the feed which has the highest word count as the preferred feed to use.

This method assumes that the feed entries are in the same order and about the same posts in each feed, on the basis that each feed most likely originates from the same blog management system and is therefore either produced dynamically or published at a similar time.
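The selection step can be sketched roughly as follows, assuming the word counts have already been computed and are held in a map keyed by a feed identifier. The class name, method name and map layout are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the selection step.
final class FeedSelector {

    // Returns the key of the feed with the highest total word count over the
    // first n posts, where n is the length of the shortest word-count list.
    // Returns null for an empty map; on a tie the first feed iterated wins.
    static String preferredFeed(Map<String, List<Integer>> wordCounts) {
        // Find the range shared by every feed.
        int range = Integer.MAX_VALUE;
        for (List<Integer> counts : wordCounts.values()) {
            range = Math.min(range, counts.size());
        }

        String best = null;
        int bestTotal = -1;
        for (Map.Entry<String, List<Integer>> entry : wordCounts.entrySet()) {
            // Sum only the posts inside the shared range.
            int total = 0;
            for (int i = 0; i < range; i++) {
                total += entry.getValue().get(i);
            }
            if (total > bestTotal) {
                bestTotal = total;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

With counts of { 120, 80, 200 } for one feed and { 300, 250 } for another, only the first two posts of each are compared (200 against 550), so the second feed is preferred even though it lists fewer posts.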


Resurrection

Posted: January 27th, 2009 | Author: David | Filed under: ...and everything else

After almost two years of nothing, I’ve brought this blog back from death with fresh intentions of regular writings. I’m primarily hoping to be writing about technology and programming, in particular using this space as a diary for my MSc project and a place for interesting things I want to remember I’ve done. But I’m hoping to structure this in such a way that all of that can be ignored if necessary.

A good weeding has removed a lot of the old drivel, so apologies in the unlikely event you’ve been longing for the return of a cherished post from internet past.


Iceland

Posted: April 30th, 2007 | Author: David | Filed under: Travel

I don’t think they have Sesame Street in Iceland, but if they did it’d be pretty cool. “Today, Sesame Street was brought to you by the letter ‘ð’” would be a great way to close the show. Anyway. Iceland was great. You can pretty much get an idea of what it was like by checking out the photos here.


Watercress and Potato Soup

Posted: February 25th, 2007 | Author: David | Filed under: Food

Yesterday, I made soup. The recipe I adapted is based around 100g of watercress, but Sainsbury’s sell it in 85g bags, so I’ve reworked it for two of those bags. This is a very tasty soup, and the recipe should make between 6 and 8 servings.

Chop 3 onions and fry off with a small splosh of olive oil in a large saucepan for 5ish minutes. Add 3 large potatoes, chopped (should be around 780g), and cook, covered, for another 5 minutes. Chop the two 85g packs of watercress, and add with a litre of vegetable stock. Bring to the boil and cook until the potatoes are tender, which should take about 15 minutes.

Blend the mixture, while adding 500ml of semi-skimmed milk, a freshly grated nutmeg and some black pepper. Reheat to serve.