
UKHousePriceVisualisation

Illustrating The Point - Visualising UK House Price Data


This is a retcon of some work I put together a couple of years ago that I thought might have a home here. In the spring of 2017 I was supposed to be studying for some industry exams, but instead found myself playing with F# visualisations. I'd started learning the language the year before, having already spent a lot of time using C# and .NET, and was looking for small analysis and visualisation projects to cut my teeth on.

I was interested in the house price bubble in London, where I'm based. Those of you who have moved to big cities will have experienced, like me, the inflated costs that come with them. I'd moved from a university town and was shocked (£5 a pint!), but I still couldn't believe that property prices were so different from those in other UK cities. How exactly did London compare?

I realised that this would actually be a perfect topic for a small visualisation project, so set about looking for data.

Measuring Up

Because property transactions are public domain, there's a surprising amount of open data available. For this project I was mostly interested in the value of each property, regional labels for that property, and coordinates to help map those properties.

For the first of these, the sale price, there is public access to the UK Land Registry's website. This is a really useful resource when looking at property-related data, and I encourage anyone who's interested in data science, or in buying or selling a house, to take a look for themselves:

This transaction data contains information about sales of property in the UK, including Date, Price, and information about the property itself.

Gathering coordinates was more difficult. However, the Land Registry data includes each property's postcode, so I could use postcode-to-coordinate lookups to approximate each house's position in the UK. Postcode data is available from a number of different sources, but for this I used the ONS (Office for National Statistics) portal to find what I wanted:

In [1]:
#load "Paket.fsx"
Paket.Package [
    "MathNet.Numerics"
    "MathNet.Numerics.FSharp"
    "XPlot.Plotly"
    ]
#load "Paket.Generated.Refs.fsx"
#load "XPlot.Plotly.fsx"

#load ".\DataAccess.fs"
#load ".\Statistics.fs"
#load ".\PGraphs.fs"
#load ".\Analyses.fs"
In [2]:
open System
open HousePriceAnalysis
open Statistics
open DataSet
open PGraphs
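
The postcode-to-coordinate lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in DataAccess.fs (which isn't shown here): the record types, the `normalisePostcode` and `withCoordinates` helpers, and the sample values are all assumptions made up for the example.

```fsharp
// Hypothetical record shapes; the notebook's real types live in DataAccess.fs.
type Transaction = { Price: decimal; Postcode: string; TownCity: string }
type Coordinate = { Latitude: float; Longitude: float }

/// Normalise a postcode so that "sw1a 1aa" and "SW1A1AA" map to the same key.
let normalisePostcode (postcode: string) =
    postcode.Replace(" ", "").ToUpperInvariant()

/// Pair each transaction with its coordinates, dropping any transaction
/// whose postcode is missing from the lookup.
let withCoordinates (lookup: Map<string, Coordinate>) (transactions: seq<Transaction>) =
    transactions
    |> Seq.choose (fun t ->
        lookup
        |> Map.tryFind (normalisePostcode t.Postcode)
        |> Option.map (fun c -> t, c))

// Made-up sample data, for illustration only.
let lookup =
    Map.ofList [ "SW1A1AA", { Latitude = 51.501; Longitude = -0.142 } ]

let sample =
    [ { Price = 500000m; Postcode = "sw1a 1aa"; TownCity = "LONDON" }
      { Price = 150000m; Postcode = "ZZ99 9ZZ"; TownCity = "NOWHERE" } ]

let located = withCoordinates lookup sample |> Seq.toList
```

Because a postcode covers several addresses, every property in it gets the same point, which is why the positions are only approximate.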

A First Look

As a first dive I attempted to map house prices by their coordinates, expecting to see a dense cluster of high property prices around London in particular. I wasn't prepared for just how clear this would be, so I was very happy with the graph below:

(Note: because of the large number of data points, I've limited the chart to 2016 only, but it can still make your browser a bit unhappy.)

In [3]:
//DISTRIBUTION
Analyses.RawHousePriceMap(fun h -> h.Date.Year=2016)

Out[3]:

This map looks excellent: you can clearly see individual population centres, as well as increased house prices in the South East.

The house price transaction data also included the name of the nearest town or city, so I was able to isolate a few key areas. In the graph below, I have filtered out everything except the UK's three most populous cities (London, Manchester and Birmingham), plus Oxford and Cambridge, which also have significantly inflated house prices, mainly due to their status as commuter towns:

In [4]:
Analyses.RawHousePriceMap(fun h -> h.Date.Year=2016 &&
                                    (h.TownCity.ToLower()="london"
                                        || h.TownCity.ToLower()="manchester"
                                        || h.TownCity.ToLower()="birmingham"
                                        || h.TownCity.ToLower()="oxford"
                                        || h.TownCity.ToLower()="cambridge"))
Out[4]:
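
As an aside, the chain of `ToLower()` comparisons in the filter above could be expressed as a set membership test. The sketch below is one possible way to do that, not what the notebook actually runs: the `HouseSale` record is a hypothetical stand-in for the real transaction type, and `cities` and `inCities` are names invented for the example.

```fsharp
open System
open System.Collections.Generic

// Hypothetical stand-in for the notebook's transaction record.
type HouseSale = { Date: DateTime; TownCity: string; Price: decimal }

// Case-insensitive set of the cities of interest, so no ToLower() calls are needed.
let cities =
    HashSet<string>(
        [ "london"; "manchester"; "birmingham"; "oxford"; "cambridge" ],
        StringComparer.OrdinalIgnoreCase)

/// True for sales in the given year whose town is one of the chosen cities.
let inCities (year: int) (h: HouseSale) =
    h.Date.Year = year && cities.Contains h.TownCity
```

A predicate of this shape, written against the notebook's own record type, could then be passed to both the map and the box plot analyses, so the list of cities only lives in one place.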

In More Depth: City House Price Distributions

The charts above look great, and you can see that, in general, London, Oxford, and Cambridge are more expensive places to buy property. You can also see that there are fewer transactions in Oxford and Cambridge, possibly due to higher levels of renting (I'd have to check this, but they are student towns).

However, I wanted an easier way to compare price levels across the individual towns and cities, rather than having to rely on colour.

For this I divided up the transactions by city, and looked at the quartiles of the price distribution. In the graphs below you can see box plots representing house prices in each city:

In [5]:
Analyses.BoxPlotAnalysis(fun h -> h.Date.Year=2016 &&
                                            (h.TownCity.ToLower()="london"
                                                || h.TownCity.ToLower()="manchester"
                                                || h.TownCity.ToLower()="birmingham"
                                                || h.TownCity.ToLower()="oxford"
                                                || h.TownCity.ToLower()="cambridge"))
                        (fun h -> h.TownCity.ToLower())
Out[5]:
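
For anyone curious about the numbers behind a box plot, here is a rough sketch of grouping sales by town and computing a five-number summary for each group. It is not the code in Statistics.fs or Analyses.fs: the function names are invented for the example, the quartiles use simple linear interpolation (other conventions give slightly different values), and it assumes, as the `q.[2]` sort key used later in this post suggests, that index 2 of the summary holds the median.

```fsharp
/// Linear-interpolation quantile of a sorted, non-empty array (p between 0 and 1).
let quantile (sorted: float[]) (p: float) =
    let pos = p * float (sorted.Length - 1)
    let lo = int (floor pos)
    let hi = min (lo + 1) (sorted.Length - 1)
    sorted.[lo] + (pos - float lo) * (sorted.[hi] - sorted.[lo])

/// Five-number summary: [| min; Q1; median; Q3; max |].
let fiveNumberSummary (prices: seq<float>) =
    let sorted = prices |> Seq.sort |> Seq.toArray
    [| 0.0; 0.25; 0.5; 0.75; 1.0 |] |> Array.map (quantile sorted)

/// Group (town, price) pairs by lower-cased town name, summarise each group,
/// and order the towns from highest to lowest median price.
let summariseByTown (sales: seq<string * float>) =
    sales
    |> Seq.groupBy (fun (town, _) -> town.ToLower())
    |> Seq.map (fun (town, group) -> town, fiveNumberSummary (Seq.map snd group))
    |> Seq.sortByDescending (fun (_, q) -> q.[2])
    |> Seq.toList

// Made-up prices, for illustration only.
let summary =
    summariseByTown [ "London", 100.0; "london", 300.0; "Oxford", 50.0 ]
```

The box in each plot spans Q1 to Q3, with the median drawn as the line inside it.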

Going Further

With this information at my fingertips, it's easy to see how I could get carried away, but I just wanted to see how stark the differences between areas really were, rather than relying on the headline-grabbing extremes that I'd been presented with up to that point. I think the box plots above show the level and distribution of house prices in some depth.

As a handle for further potential study, I took a look at the most and least expensive areas in the country. I've included the two plots below to show the levels in these areas.

I hope this has been interesting! And as always, you can find the source code on GitHub.

In [11]:
Analyses.BoxAnalysis(fun h -> h.Date.Year=2016)(fun h -> h.TownCity.ToLower())
    |> Seq.take(10)
    |> BoxPlot("Most Expensive UK Towns")("House Price (£)")
Out[11]:
In [13]:
Analyses.BoxAnalysis(fun h -> h.Date.Year=2016)(fun h -> h.TownCity.ToLower())
    |> Seq.sortBy(fun (c,q) -> q.[2])
    |> Seq.take(10)
    |> BoxPlot("Least Expensive UK Towns")("House Price (£)")
Out[13]: