The Pretty Charts thread


#21

I’m not sure it would cost 3k. That fee is to cover transactions (searches) of the ECAF data on the website by customers. It’s not clear to me that it’s necessary for backend or batch processing.

I’ll bow to your experience on its usefulness, though.

I have thought about putting together an app to crowd-tag historic PPR data with eircodes by hand, but laziness and procrastination have stymied progress.

Presumably the PPR people will be doing this for new transactions at some point…?


#22

Don’t hold your breath. The PPR is dependent on Revenue e-Stamping. I think they’ve finally realised that people are deliberately obfuscating their addresses on the PPR, so their self-service system now gets you to provide an address id instead of typing it yourself. Unfortunately Revenue are using their own internal LPT id … might be useful if we could get our hands on that catalogue.


#23

v0.2 now up. Now works on Firefox and Chrome. Should be at least viewable in Safari, but there may be some sizing bugs. (Would appreciate it if someone could check it out in Safari on Mac to see if it differs from the iPad). Doesn’t work on MS Edge or IE. Map text and tooltips are next. Comments, suggestions and bug reports welcome.

Click the image.
i.imgur.com/3JKspDF.png?1

EDIT: Code, resources and data here.


#24

Very good, really like this.

So the actual figures for the square footage per area/region are coming in the next version of the map?


#25

Yes (if I can figure out the bugs in browser SVG implementations), but they will be the same as you can already see on the “Tabular” view.


#26

Tabular…now why didn’t I click on that!
Thanks


#27

:smiley: I might make the Tabular view the default one in future.


#28

Hey all – I’m the author of daftdrop.com, and I have all daft.ie property info since 2011, including price changes, GPS co-ords! Have any use for the data? I don’t store stuff like square footage or estate agents, however…


#29

Could be interesting to lob it onto some sort of heat map at a more granular level than the county data I have. How big is the data and do you mind sharing it?
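
Something like ggmap’s density overlay would probably do for a first pass. A rough sketch of what I have in mind, assuming a data frame of listings with long/lat columns (the names are just placeholders, not necessarily what’s in your data):

[code]#Rough sketch of a heat map: density of listings over a Dublin basemap with ggmap
#(assumes a data frame "listings" with long/lat columns - names are placeholders)
library(ggmap)

basemap<-get_map(location="Dublin, Ireland", zoom=11)
ggmap(basemap)+
  stat_density2d(data=listings, aes(x=long, y=lat, fill=..level..),
                 geom="polygon", alpha=0.3)+
  scale_fill_gradient(low="yellow", high="red")[/code]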


#30

Hey credmond, does your data table have a direct link to the daft entry?

I’m pretty much done scraping the PPR for Dublin 2015 and matching it against electoral areas, which makes for some nice mean-price and turnover charts, but square footage for each property is the next challenge. A list of links to daft entries might do it, but I have a few other ideas if that doesn’t work.


#31

Was just logging on to my LPT account - Revenue is storing the Eircode now as well. Not sure if for all properties, but the data is there in any case - don’t know if it is sent to the PPR though.


#32

So this is Dublin’s housing market in 2015

There were 15,205 houses sold in Dublin during that time period (PPR) and I managed to map 88% of them, which are represented on these two charts.


#33

That’s brilliant jess. How did you parse the addresses to get the breakdown by ED?


#34

I use R, which is good for this sort of thing. I learnt it as part of my PhD.

The scraping of google maps against the PPR was carried out using code literally copied and pasted from here:
shanelynn.ie/massive-geocodi … ogle-maps/

(Thank god, because that would have taken me a few days to work out, I’m rusty).
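
For anyone who doesn’t fancy chasing the link, the approach there boils down to looping over the address column, calling ggmap’s geocode() one row at a time and saving as you go. Here’s a minimal sketch of that step (not the exact code I ran; the Address column name is just an example):

[code]#Minimal sketch of the geocoding step: one geocode() call per address, saving as we go
#so a crash or the daily quota doesn't lose everything ("Address" is an example column name)
library(ggmap)

infile<-"PPR.csv"
data<-read.csv(infile)

data$long<-NA
data$lat<-NA

for (i in 1:nrow(data)) {
  result<-tryCatch(geocode(as.character(data$Address[i]), output="latlon", source="google"),
                   error=function(e) data.frame(lon=NA, lat=NA))
  data$long[i]<-result$lon[1]
  data$lat[i]<-result$lat[1]

  #Write out every 100 rows so partial progress is kept
  if (i %% 100 == 0) write.csv(data, paste0(infile,"_geocoded.csv"), row.names=FALSE)
}
write.csv(data, paste0(infile,"_geocoded.csv"), row.names=FALSE)[/code]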

The code block below is an abbreviated version, since my full script contains everything from start to finish.
I’ll do the 2014 & 2013 Dublin files next, then maybe I can look at changes.

I could really do with getting square footage for each house, but google doesn’t seem to keep cached daft entries as I was hoping.

[code]rm(list=ls(all=TRUE))

#Didn’t use all of these, but I tend to stick em in
library(ggmap)
library(maps)
library(mapdata)
library(rgdal)
library(rms)
library(gmodels)
library(sp)
library(animation)
library(proj4)

#Read Shape Files

ED<-readOGR("C:/R/Mapping/Census2011_Electoral_Divisions_generalised20m.shp",
layer="Census2011_Electoral_Divisions_generalised20m",input_field_name_encoding="utf8")

AdminCounties<-readOGR("C:/R/Mapping/Census2011_Admin_Counties_generalised20m.shp",layer="Census2011_Admin_Counties_generalised20m",input_field_name_encoding="utf8")

#Take only the Dublin Part
ED<-ED[ED@data$NUTS3=="IE021",]

#This long string looks complicated but it's copied and pasted from one of the attributes contained in the ED file. Use summary(ED) to get it from any R shape file
#(The name shadows sp's proj4string() function, but calls like proj4string(ED) further down still find the function)

proj4string <- paste("+proj=tmerc +lat_0=53.5 +lon_0=-8 +k=1.000035 +x_0=200000 +y_0=250000",
"+datum=ire65 +units=m +no_defs +ellps=mod_airy",
"+towgs84=482.530,-130.596,564.557,-1.042,-0.214,-0.631,8.15")

#This takes the Dublin co-ordinates from ED and puts them into GPS so I can set limits for the map later

XY<-coordinates(ED[ED@data$NUTS3=="IE021",])
XY<- project(XY, proj4string,inverse=TRUE)

LongLim<-c(min(XY[,1]),max(XY[,1]))
LatLim<-c(min(XY[,2]),max(XY[,2]))

###########################

infile<-"PPR.csv"
data<-read.csv(infile)

data$index<-1:nrow(data)

geocoded<-read.csv(file=paste0(infile,"_geocoded.csv"))
geocoded<-geocoded[c(1:531,533:nrow(geocoded)),]

data$long<-0
data$lat<-0
data$long<-geocoded$long
data$lat<-geocoded$lat

#Source data now has GPS co-ordinates at this point

#Pick only the points that actually ended up in Dublin

data<-data[complete.cases(data),]
data<-data[data$long>LongLim[1] & data$long<LongLim[2] & data$lat>LatLim[1] & data$lat<LatLim[2],]

#Transform data to Irish Grid

XY<-list(x=data$long, y=data$lat)  #project() takes x/y components and returns the same format

XY<- project(XY, proj4string)

data$long<-XY$x
data$lat<-XY$y

#project() returned a list, so bind x and y back into a matrix for SpatialPoints
XY <- SpatialPoints(cbind(XY$x,XY$y),proj4string=CRS(proj4string(ED)))

#Link the points in the list XY with their appropriate GEOGID (this is the electoral district) and add that Data to the datafile.
XY1<-over(XY,ED)
data$GEOGID<-XY1$GEOGID

#The ED file also has the number of dwellings per ED for 2011 which I used for turnover

data$HS2011<-XY1$HS2011

[/code]

The other relevant bits are probably generating the colour ramp:

COL<-colorRampPalette(c(rgb(0,0,0),rgb(1,0,0),rgb(1,1,1))) (max(summary$P2)+1)

(at this point I had created a data frame called summary, which pulled only the relevant columns from data; bad practice to reuse a function name, but ’twas done)

and adding the colours to the original ED shape file so that they can be plotted

ED@data<-data.frame(ED@data, summary[match(ED@data$GEOGID, summary$GEOGID),])
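
Plotting is then just a matter of handing that colour vector to plot() on the shape file. A minimal sketch, assuming P2 is an integer count per ED (e.g. number of sales) so it can index straight into the ramp:

[code]#Minimal plotting sketch (assumes summary$P2 is an integer count per ED,
#so P2+1 indexes straight into the COL colour vector)
plot(ED, col=COL[ED@data$P2+1], border="grey60")
title("Dublin 2015")[/code]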

#35

Is there nothing to be said for another BXLE?


#36

ps200306 do you have a file that lists address and square footage of properties for sale?

I’m thinking we can fuzzy match them and round out the data.


#37

I have that data from 3-monthly scrapes of myhome.ie. What sort of fuzzy matching are you thinking of? Would we not be better off firing it against the Google geocoding API and matching it up to your ones by location?

P.S. I mucked around with R for spectroscopic emission line profile fitting, but I can’t say I ever fell in love with it. :wink:


#38

Could do the google API and then proximity matching, but if the addresses are slightly different then the GPS co-ords might be slightly different too, or you could end up with multiples at the same point (e.g. the same apartment block).

Also, I was thinking of trying to avoid the google API again, since I can only run 2,500 queries a day.

Multiple ways to approach it!
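
If we do go the fuzzy route, base R’s adist() would probably do for a first pass: normalise both address strings, take the nearest match, and only keep it if the edit distance is small relative to the address length. A rough sketch (the column names are made up at this point, and you’d want to block by county or postcode first to keep the distance matrix manageable):

[code]#Rough fuzzy-matching sketch using base R's adist() - column names are illustrative only
norm<-function(x) gsub("[^a-z0-9 ]","",tolower(trimws(x)))

ppr_addr<-norm(ppr$Address)            #geocoded PPR addresses
myhome_addr<-norm(myhome$fullAddress)  #myhome.ie scrape

#Edit distance between every pair - fine for a few thousand rows per block
d<-adist(ppr_addr,myhome_addr)

best<-apply(d,1,which.min)
best_d<-apply(d,1,min)

#Only keep matches within ~20% of the address length
ok<-best_d<=0.2*nchar(ppr_addr)
ppr$area_sqft<-NA
ppr$area_sqft[ok]<-myhome$area.sqft[best[ok]][/code]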


#39

Here’s the csv output from my most recent scrape of 01-Feb-2016:

drive.google.com/file/d/0BxllNJ … sp=sharing

… if you figure out what to do with it I can give you the quarterly ones going back several years. Columns are:

  • address – county only
  • bedNo – number of bedrooms
  • houseType – ‘H’=house, ‘A’=apartment, ‘S’=site
  • saleType – ‘S’=private treaty sale, ‘A’=auction
  • area sqft – area in sq. feet (integer), zero if not available
  • price – price (integer)
  • houseType2 – a longer description, e.g. semi-detached house
  • price sqft – price per sq. ft (integer), zero if not available
  • href – myhome short code
  • ber – BER code if available
  • fullAddress – full address

Note that area and price per sq. ft. are only available on about a third of the properties.
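
To get a quick feel for it once it’s read in (read.csv will turn the spaced column names into area.sqft and price.sqft), something like:

[code]#Quick look at the scrape: keep rows where floor area is known
#(the filename is just whatever you save the download as)
myhome<-read.csv("myhome_scrape_2016-02-01.csv", stringsAsFactors=FALSE)

with_area<-subset(myhome, area.sqft>0)
nrow(with_area)/nrow(myhome)    #roughly the third mentioned above

#Median asking price per sq. ft. by county (the address column is county only)
aggregate(price.sqft~address, data=subset(with_area, price.sqft>0), FUN=median)[/code]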


#40

This thread is full of love.