Google and the "right to be forgotten" - EUCJ


What about someone who scans a newspaper using an OCR system, analyzes the text, and stores both the text and the results of the analysis in a database?
I can’t see how Google is not a data controller.
Edit: Here’s a lengthy description of what constitutes a data controller.

I wouldn’t describe it as a victory; rather it’s obliging Google to take some more active responsibility for its data collection.

I’m not clear on how that would work?
There’d have to be a new directive, wouldn’t there?

That’s what I’d do.

Google’s main cause for upset is that it has to do a bit more work; meanwhile society at large is teasing out issues around data protection that do have to be addressed. I don’t think it’s a ‘bad ruling’ - it’s just a ruling that moves the discussion forward.

I’m reminded of this joke.


That document focuses on the roles of the actors, but not on what data triggers being a data controller in the first place. That’s the issue here – the data is already in the public domain, and already on the internet. The weirdness is the treatment of the person who reproduces existing public data as being a data controller, even when the person publishing the original data is not a data controller. That’s somewhat counterintuitive (or “crazy”).

I’m not familiar enough with EU law to know if it has to be a Directive/Regulation or whether it can be done more easily.


Okay, I think I understand your perspective. I suppose the first issue I’d raise is that Google isn’t simply reproducing existing public data; it’s collecting personal (public) data and repurposing that data.
If the newspaper concerned maintained up-to-date profiles of individuals, it would be acting as a data controller, and the data subject could request that links to out-of-date information be removed from their profile.
As it is, the newspaper has an archive, presumably with dates of publication apparent, but because it doesn’t process or repurpose this it’s not a data controller in its treatment of the archive.

Do you accept that there’s a difference between what Google does and what the newspaper does?

I think it has to be initiated by either parliament or the commission and reviewed by both, then passed on to the council of ministers. It’s slow.


When a newspaper publishes a story about them, it is usually an up-to-date profile of that individual, compiled from other sources which have been repurposed.

Yes, but I draw the opposite conclusion from you – because the newspaper actively writes stories about the individual, I would want them subjected to more scrutiny than a computer programme that simply aggregates references to someone and links you to the content.

Yes, I believe so, either way. I meant it’ll be fixed quickly by EU standards :)


It was only last year that I went to look up a barrister on Google, and as I typed in his name the predictive text suggested “Manfred Jones gay”.

I tried to do that now and it doesn’t come up any more.

It is entirely sinister that other people’s searches are stored by Google and re-used. Google’s bleatings are tiresome.


They are subjected to more scrutiny.
The newspaper wrote a story at time [t - 3] stating that Data Subject was a tax defaulter.
The paper made a digital archive available online at time [t - 2], in which the story appeared unchanged.
Google’s spider accessed this particular page at time [t - 1], adding the information about Data Subject being a tax defaulter to the Google search index.

If the newspaper publishes a new story on Data Subject at time [t], it may include mention of his having been a tax defaulter at time [t - 3]. Crucially, a human has thought about this and made the decision to include the information.
Google’s search index, on the other hand, presents results on Data Subject at time [t] that have probably not been validated by a human.
However ephemeral it might appear, Google’s search index meets most of the criteria for a publication.


Neither that time lag nor the lack of human intervention is the basis for the judgement. If they were, it would cure the problem for a Google employee to look at the link and say “Yep, he was a tax defaulter at t-3” once per month (I’m not suggesting this would be feasible of course, just talking theoretically).

Also, the argument is defeated by the fact that the newspaper’s own archive does not have to remove the article at time [t] or any other time. It can remain there, searchable, forever.


Change in recruitment SOP coming up.
Find out all the places where they have lived during their teenage and adult lives, then specifically search the local papers’ court reports!


Lack of human intervention - I agree, I merely noted it because you want the newspaper data to be subject to greater scrutiny and it already is.
The time lag is of course important to the judgment because the static historical archive is not subject to revision whereas Google’s search index (personal database) is subject to revision.

I fear we’re starting to go in circles here, so I’ll try one last time to explain my understanding:

  1. The newspaper has an archive that doesn’t change. This is accessible to the public and the data is effectively static.
    The data subject does not have the right to require that the historical record be revised.

  2. Google maintains a database that does change. This is accessible to the public and the data is subject to frequent revision. The data is published dynamically on request.
    The data subject does have the right to require that historical information about him be removed.

Google claims to be nothing more than an indexer or an aggregator, but it clearly does a whole lot more than that.
It collects, maintains, analyses, adapts, and publishes personal data; as such its activities are coming under data protection laws.


I do understand the reasoning now I think, but I respectfully disagree.

If you use the search on the IT home page, this is essentially a search of their archive. It changes over time as articles are added, revised and removed, in the same way Google does. It is simply limited to their archive rather than aggregating across sites.

I should point out that some websites actually use custom Google searches as their own internal search engine. It’s simply a Google-hosted search limited to or whatever. So where do you propose that these should fall?

Really, I cannot see this holding up to logical scrutiny. It appears to be a misguided, fuddled attempt by a non-technical court to stick it to the Google Man. The Google Man does deserve a good smacking on privacy grounds, but not for this.


I disagree. I don’t think that a regular search feature on a web site covers what Google does. It’s part of what Google does, but it’s not the full extent of Google’s data management.

Do you consider Google to be a publisher of information or just a conduit?

If it’s just a conduit, why do we see things like caching, summaries, similar pages, filtered pages, predictive text, suggested spellings, targeted advertisements, reading level, time range, and customizations based on search activity?
There’s a whole lot more going on than ‘crawl, index, search’…

Do you mean arrangements where Google is crawling the company site and Google is hosting the search index, or arrangements where a custom Google engine and index are hosted by the company site? I’ve understood both to be possible.

I don’t know what the answer is if Google is hosting the search index and pushing the results to the IT website. Probably the data subject has the right to request content be kept out of Google’s main databases and only available in the database that feeds the search on the IT domain.


Yes, but that appears to be incidental to this decision. As I understand it, the decision applies to anyone who does “crawl, index, search”. Except not the newspaper’s own search engines.

To me, this makes no sense.

I think that shows the arbitrariness of the decision.