Property listed at 642 million euro, etc.

Moving on,

All properties advertised in the calendar year 2008 were excluded. Seems like an oddly arbitrary exclusion. It defines Jan 1 2008 - Dec 31 2008 as the bubble period. There is no justification for these dates being picked.

However, the letting ads were excluded for a different period: H2 2008 and H1 2009. Why? No explanation.

Also, 2 years “pre-bubble” are compared with 3 years “post-bubble”. No justification for that choice of periods.

Edit: I’m not arguing that these are wrong, but we seem to be missing some of that “Level 7 analysis” (or any analysis at all) in the paper around why these decisions were taken.

Marching onwards

120k properties were excluded because of accuracy problems with the address. No data on how the ad was analysed for address so it’s not clear whether their algorithm for extracting the data was just poor or what. This is probably not hugely significant, but it would be interesting to see if certain types of properties were more likely to be excluded than others; for example new estates where the address might not yet be in the geo matching system? Or random stuff? I’m sure an analysis of this was done at Level 7 of the Ronan Analysis Level at least.

OK here’s the most interesting one:

Some ambiguity over how ads with changed prices are treated – the paper does recognise that a significant portion of sales ads (41%) were for existing ads where the asking price had changed, but no methodology is actually given for how this was determined. From how it is phrased, it sounds like they used the ad ID only; so when something is removed and re-listed, this would not be detected. Is that correct?

In addition, and most significantly, it appears that the model is based on the data from when the property is first listed, and subsequent changes are excluded. So if I list a property for €1m and drop it down in steps to €400k, Ronan’s model appears to count it as a €1m listing throughout.

I’m open to correction on this, but that’s what the model appears to do. If my understanding is correct, that’s a significant flaw in the model, so let’s hope I’m wrong and it’s just my puny Level 1 brain that’s unable to comprehend the awesome Level 7-8 brains behind this.

I think that’s it. Let’s hope Ronan can take time out from analysing the Higgs Boson data to reply.

If you are right then I would expect that Ronan L will acknowledge the flaw and make sure that correction notices are published wherever the research has been quoted or relied upon.

I would be shocked if the price-change factor was not accounted for somehow, so I suspect it has been taken into account – but it’s quite hard to figure that out from the document (at least for my Level 1 brain).

Thanks for at l(e)ast engaging! And as I said at the time, I didn’t want to come across as bitchy or defensive or any other pejorative adjective you care to throw at me. I just really don’t have the buckets of free time that might allow me to engage in keyboard wars with someone who - at least until earlier today - was quick to cast asparagus but didn’t really have any meat on the bone (if you don’t mind mangled metaphors).

As it is, you’ve done a good job of impersonating a discussant at an academic conference (although they normally have at least one or two nice things to say - still, I got you to read the darn thing so I should probably be grateful for small miracles). Anyway, bearing in mind what’s on my plate aside from the pin, hopefully I can do a reasonably good job at responding:

(1) On 2006-2007 non-listings/duplicates, this was a tech issue, not a Daft report issue. As you can imagine, any serious outfit would be aware as best it can of every last property for sale in the country and how many it had versus hadn’t. I generally leave the tech guys to this - 7+ years of working with them convinces me these guys are definitely on top of their game - but in preparation for academic papers using this dataset, I checked up on this particular issue. It seems that, of the properties put up for sale 2006-2012 that have not been on Daft, the vast, vast majority were pre-2008.

Given the nature of those listings in 2006-2007 (myhome agents principally, and your priors should be telling you something about where their properties lay in the bubble distribution), you would hopefully be among the first to criticise me for making conclusions about the change in distribution of prices in Ireland if I were missing an important chunk of that distribution.

(2) Speaking about 2008, yes, it could be argued that this is a relatively arbitrary exclusion. In fact, the research probably doesn’t need this exclusion as the right degree of interacted variables would allow distinction between bubble and crash periods (another paper of mine does this while tackling a different question). However, the purpose of the paper (and maps) was to compare bubble and crash periods so even if those observations hadn’t been excluded (which by the way would only give the model more observations to play with), boundaries have to be set between them, and to prevent criticism in relation to arbitrary date choosing, the boundary was set at a year wide. The boundaries have been set bearing in mind the overall price indices for prices and rents respectively: give each segment a year to adjust to falling values before starting to look at the price structure as reflective of the crash period. You could include anything with falling values as crash but I think it’s hard to argue (particularly bearing in mind point (4) below) that the market goes overnight from bubble to crash, let alone, if one did argue that, how one might pick the day.
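To make the boundaries concrete, here is a minimal sketch of the period split as it has been described in this thread: sales ads from calendar year 2008 are dropped, letting ads from H2 2008 and H1 2009 are dropped, and anything before or after the window counts as bubble or crash respectively. The names `EXCLUSION_WINDOWS` and `classify_period` are made up for illustration, and this is not the code behind the paper.

```python
from datetime import date

# Exclusion windows as described in the thread (inclusive bounds).
# Sales: calendar year 2008. Lettings: H2 2008 and H1 2009.
EXCLUSION_WINDOWS = {
    "sales": (date(2008, 1, 1), date(2008, 12, 31)),
    "lettings": (date(2008, 7, 1), date(2009, 6, 30)),
}


def classify_period(segment: str, listed_on: date) -> str:
    """Label an ad as 'bubble', 'excluded' or 'crash' from its listing date.

    Hypothetical helper: the labels and the idea of keying on the listing
    date are assumptions made for this sketch.
    """
    start, end = EXCLUSION_WINDOWS[segment]
    if listed_on < start:
        return "bubble"
    if listed_on <= end:
        return "excluded"
    return "crash"


if __name__ == "__main__":
    # The same date falls in different periods for the two segments.
    print(classify_period("sales", date(2008, 3, 15)))     # excluded
    print(classify_period("lettings", date(2008, 3, 15)))  # bubble
```

The sketch only encodes where the two windows sit; it says nothing about where the sample starts or ends on either side.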

(3) This is based on addresses and GeoDirectory. As is obvious, and becoming annoying in relation to the house price register, if the seller/agent/whoever refuses to put in a meaningful address, it’s hard to do any analysis that is geographic in nature and relies on address. Fortunately, roughly two thirds of properties can be identified to their street or better, with most of the rest being identifiable at estate or village level, so while we would clearly want 100% accuracy, for a paper that is working with geographical units that are quite large, it is unlikely that this is a structural issue affecting the conclusions.

(4) I think you’re assuming that, in the eyes of the model, once a property is up, it’s up for ever. This is, in fairness, the myhome approach for their quarterly reports - every Leitrim property on the site, even those up since 2007 or 2009, goes into the Q2 2012 figures reported. However, as is clear from the methodological notes on every Daft report, I’m not a fan of that approach. List prices have their uses and their limitations - their primary uses are as a measure of seller expectations and as a proxy for sales price. But, due to human behaviour, they are remarkably nominally rigid. Hence a McMansion listed for €1m in Q1 2007 and eventually revised down to €400,000 in Q3 2009 will make two appearances (and two only) in the model: once in Q1 2007 and again in Q3 2009, as these are the only times we can be clearest on what the seller’s expectations are. For a different study, one might want to focus in on these re-listings (and indeed I plan to study rigidities in list behaviour once my next working paper is complete) but for the purpose of this study, their re-inclusion is treatment enough.
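A rough sketch of that rule, for anyone who prefers to see it spelled out: an ad contributes an observation in the quarter it is first listed and again in any quarter where its asking price is revised, and nothing in between. This is not the actual Daft code. The function name `model_observations` and the (quarter, price) input format are invented for illustration, and keeping one observation per revision when there are several revisions is an assumption rather than something stated above.

```python
from typing import List, Tuple

Observation = Tuple[str, int]  # (quarter label, asking price in euro)


def model_observations(history: List[Observation]) -> List[Observation]:
    """Keep only the quarters where the asking price is set or changed.

    `history` is a chronological list with one entry per quarter in which
    the ad was live. Hypothetical helper, written for this sketch only.
    """
    observations: List[Observation] = []
    last_price = None
    for quarter, price in history:
        if price != last_price:
            # First listing, or a revision of the asking price.
            observations.append((quarter, price))
            last_price = price
    return observations


if __name__ == "__main__":
    # The McMansion example: listed at 1m in Q1 2007, cut to 400k in Q3 2009.
    mcmansion = [
        ("2007Q1", 1_000_000),
        ("2007Q2", 1_000_000),
        ("2008Q4", 1_000_000),
        ("2009Q3", 400_000),
        ("2009Q4", 400_000),
    ]
    print(model_observations(mcmansion))
    # [('2007Q1', 1000000), ('2009Q3', 400000)]
```

The quarters where the ad sat unchanged at €1m drop out, which is the difference from the "once it's up, it's up for ever" approach.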

Hopefully that answers your queries, gives 23rdbuchan something to watch and also addresses Devilsbit’s comment (I’m not sure which flaw he is referring to).

Given you did such a good job as an academic discussant, can I ask what you liked about the paper? Was there anything you would take away from it? Or would you still rather have a blank canvas in relation to what’s happened the structure of prices, rents and yields in Ireland, awaiting the day we get different data?

And now back to my analysis of Higgs’ Bosom. Yours haughtily from Level 7,


(PS. I very much doubt the above will be the last contribution on the subject. Unfortunately, I suspect this will be my last post on the Pin for at least a week and possibly a month.)

What happens if the property drops multiple times, not just once?
i.e. (with completely arbitrary figures)
2007, 1m
2008, 800,000
2009, 600,000
2010, 400,000

Does this give 4 entries in your model? More curious than anything. I’m far from having enough knowledge to know what the impact would be.

Regarding the paper, I’ve only read part of it so far but it seems easy enough to read (most are far from it). I would suggest expanding out the methodology section. At a minimum include something on how the data set is cleaned. I too looked for something to see how you removed false properties, and there definitely are some on daft, but not to the extent of myhome. Based on your quote above, the problem ones on myhome would have been excluded by using the geolocation vs address data anyway but it wouldn’t hurt to have this in there.