Internet Copyright – Are Search Giants Playing Fair?

Internet search engines play a vital role, enabling us to navigate the web and locate other people’s information in an organised way. But in doing so, are they unfairly lining their pockets with advertising revenue by copying material without permission and breaching copyright?

More specifically, do search giants such as Google, Yahoo or AltaVista breach copyright when they copy others’ content, store it for future access in a cache on their own servers, and publish parts of it in search results?

The law, as usual, is playing ‘catch up’ with technological developments online. Initial skirmishes have been fought in various parts of the world by those with the money and inclination to sue.

Courts are just beginning to adjudicate on alleged infringement by search giants. So far, American courts have adopted a relatively permissive attitude toward alleged infringers, European courts less so.

Search engines work by sending out automated programs (called ‘robots’) to crawl across the web, capturing others’ data.

The robots visit billions of webpages, making copies for their search engine’s index. The elements of each page are broken down, categorised, and cross-referenced in an index that is then interrogated to provide a list of responses to users’ searches.
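The indexing step described above can be illustrated with a toy inverted index. This is only a minimal sketch, not a description of how any real search engine works: the function names `build_index` and `search`, the sample pages, and the example.com URLs are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Break text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of page URLs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages that a robot might have copied.
PAGES = {
    "https://example.com/a": "Copyright law online",
    "https://example.com/b": "Search engines and copyright",
}
INDEX = build_index(PAGES)
```

A query such as `search(INDEX, "copyright law")` is answered from the index alone, without revisiting the original pages, which is why the robot's copy has to be made beforehand.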

Generally, each visited webpage is copied in its entirety to the search engine’s cache (its ‘own’, easy-access copy) so that if the target website’s current page is ever unavailable, users can request the page as it looked when the search engine’s robot last visited it.

Websites can keep the robots out by following a fairly simple internet protocol, called the robots exclusion standard, which was established more than 10 years ago. This enables websites to exclude robots by placing a file called ‘robots.txt’ at the root of the site.
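As a rough illustration of how the opt-out works in practice, the sketch below uses Python's standard `urllib.robotparser` module to check a robots exclusion file the way a well-behaved robot would. The file contents, the robot name `ExampleBot`, and the example.com URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion file: shuts out one named robot entirely and
# keeps every other robot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the exclusion rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/stories/"))  # False
print(may_crawl(ROBOTS_TXT, "OtherBot", "https://example.com/stories/"))    # True
print(may_crawl(ROBOTS_TXT, "OtherBot", "https://example.com/private/x"))   # False
```

Note that compliance is voluntary: the file only states the site's wishes, and it is up to each robot to consult it before copying.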

They can also prevent unauthorised caching by placing a ‘NOARCHIVE’ directive in a robots meta tag in their own webpages’ code.
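The directive is typically written as a robots meta tag, for example `<meta name="robots" content="noarchive">`. The sketch below shows one way a robot could detect it using Python's standard `html.parser` module; the class and function names and the sample pages are hypothetical, and real search engines may interpret the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def allows_caching(html: str) -> bool:
    """Return False if the page carries a NOARCHIVE directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


# Hypothetical pages: one opts out of caching, one says nothing.
OPTED_OUT = '<html><head><meta name="ROBOTS" content="NOARCHIVE"></head></html>'
SILENT = "<html><head><title>Stories</title></head></html>"
```

Here `allows_caching(OPTED_OUT)` returns `False` and `allows_caching(SILENT)` returns `True`: a page that says nothing is treated as cacheable, which is the opt-out default at issue.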

But the contentious issue, from the copyright perspective, is that the onus is placed on websites to opt out, rather than on search engines to obtain prior consent for access and copying.

For its part, Google apparently takes the view that those who do not positively opt out are deemed to have given implied consent, and opted in. This argument found favour in two recent American cases.

In a case decided in January 2006, author and lawyer Blake Field sued Google in the federal district court in Nevada over its caching of stories that he had published on his website. He alleged Google’s cache feature enabled people to access copies of his copyright works without his permission.

However, the judge ruled that Google’s use of his copyright work amounted to ‘fair use’ under US copyright law. He decided Google had taken a passive role in the copying process and that, in reality, it is the user downloading the cached web page who makes the unauthorised copy.

The judge described Google’s processes as “non-volitional” and “automated”. He decided Field had granted an implied licence by not taking steps himself to disable Google’s caching feature. The judge appears to have been influenced by utilitarian considerations and the fact that search engines are the great facilitators of the web.

He said: “Google’s use of entire web pages in its cached links serves multiple transformative and socially valuable purposes.”

An equally permissive ruling was made in May 2007 by an appeal court in San Francisco, which lifted an injunction previously slapped on Google for displaying without permission various copyright thumbnail-size images from a subscription-only porn site. The court held that the claimant website was unlikely to be able to overcome Google’s ‘fair use’ defence in US copyright law.

It is difficult to predict how the point would be decided in the UK, where English law allows material to be copied without permission so long as it does not amount to a “substantial part” of the copyright work.

However, it is fair to say that in Europe, Google has experienced a less permissive attitude than in America. In February 2007, it lost a copyright case in Belgium against newspaper group Copiepresse, which represents French- and German-language Belgian newspapers.

Copiepresse wanted copyright content from its members’ newspaper websites to be accessible via Google News, but it wanted Google to pay for copying that content.

Copiepresse took a stand on a point of principle. It could easily have prevented Google’s robot from accessing its members’ sites, by using the exclusion protocol, but it wanted Google’s methods to be scrutinised by a court.

It argued that Google was breaching copyright by copying newspapers’ content to its cache and allowing users to access that rather than visiting the newspapers’ own sites.

The court ruled that Google’s procedure breached copyright under Belgian law. In particular, it noted that Google did not merely index and cache, but went so far as to copy headlines and text excerpts from the newspapers’ websites.

Google was reportedly considering an appeal.

Potentially, the Belgian court’s approach could be replicated in other EU countries, including the UK, although it is unclear how an English court would deal with possible defences of implied consent, fair dealing, and insubstantial copying.

Rather than suing on a point of principle, however, copyright owners may find it more cost-effective to use the established ‘opt out’ protocol.

After all, Copiepresse’s victory was a Pyrrhic one. Following the judgment, Google deleted all its coverage of the newspapers’ websites from Google News Belgium, as well as from its main search index and cache, thereby reducing the flow of traffic to the individual sites. That cannot have been good for the newspapers’ e-circulation stats, or advertising revenue.

Clearly, legal judgments on the acceptability of search engines’ methods are still crystallising, with different outcomes being reached in different jurisdictions.

However, copyright owners can take some comfort from the fact that they themselves may benefit if unauthorised copying serves to increase their audience. And robot exclusion protocols provide a simple alternative to the uncertainties of copyright litigation.





