What is web scraping and is it illegal under the Computer Fraud and Abuse Act (CFAA)?

April 09, 2018

Web scraping is the automated process of loading web pages, extracting large amounts of data from them, and saving that data to a file on your computer or to a database.
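To make the load, extract, and save steps concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular scraper: the sample page, the "product", "name", and "price" class names, and the helper names are all hypothetical, and the page is a hard-coded string so the example runs without contacting any website.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content. A real scraper would download this over HTTP;
# a fixed string keeps the sketch self-contained.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects one dict per <li class="product"> element (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # extracted records
        self._current = {}    # record being built
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self._current = {}
        elif tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that appears inside a name or price span.
        if self._field:
            previous = self._current.get(self._field, "")
            self._current[self._field] = (previous + data).strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append(self._current)
            self._current = {}


def scrape(html):
    """Extract step: turn raw HTML into a list of records."""
    parser = ProductParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Save step: serialize the records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

Pointing the same extraction logic at live pages and repeating it across many URLs is what turns a sketch like this into scraping at scale, which is where the legal questions discussed in this post arise.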

Scraping can serve legitimate business purposes, such as powering price comparison sites and market research, and illegitimate ones, such as copyright theft and price undercutting, which can cause the targeted site to suffer financial losses.

Website operators have asserted various civil claims against “web scrapers” (or “scrapers”), including copyright claims, trespass to chattels claims, contract claims, and Computer Fraud and Abuse Act (“CFAA”) claims. This blog post will focus on claims under the CFAA.

The CFAA was originally intended as an anti-hacking statute, so its application to scraping, which usually involves accessing publicly available data on a publicly available website, is not always intuitive. Congress passed the CFAA in 1986 to criminalize and counteract computer hacking. Section 1030(c) catalogs the criminal penalties for CFAA offenses, which range from fines to prison terms of as much as 20 years or life. In 1994, Congress expanded the act to also permit civil actions by victims of conduct the act prohibits. The CFAA protects computers in which there is a federal interest: federal computers, bank computers, and computers used in or affecting interstate or foreign commerce. The statute shields these types of computers from trespass, damage, and being used as instruments of espionage or fraud.

Plaintiffs asserting CFAA claims against scrapers usually allege a violation of subsection 1030(a)(2). This subsection prohibits intentionally accessing a computer without authorization, or exceeding authorized access, and thereby obtaining information from a protected computer. The determinative question in assessing the viability of a CFAA claim for scraping is whether the scraper accessed the website without authorization or in excess of its authorization.

In deciding the issue of authorization, courts often look to whether the website gave users sufficient notice that access was not authorized. For example, courts have found that a CFAA claim may exist for scraping where the website took security measures to limit access, such as requiring a password, and the users bypassed those measures in accessing the website. Similarly, courts have found that a valid claim may exist where the website took affirmative steps to prevent a user’s access (for example, by blocking the user’s IP address or sending the user a cease-and-desist letter) once it discovered the user’s scraping activities, and the user continued scraping. In contrast, courts have dismissed CFAA claims against scrapers where the website and information were publicly available and did not require any login, password, or other individualized grant of access.

Plaintiffs asserting CFAA claims against scrapers sometimes allege that the data scraping was unauthorized because it was prohibited by the website’s terms of use. Courts are split, however, as to whether access or use of a website in a manner prohibited by its terms of use is without authorization.

To decide the issue, courts occasionally consider the nature and visibility of the terms of use. For example, in Cvent Inc. v. Eventbrite, Inc., the District Court for the Eastern District of Virginia found that there was no CFAA violation where there were no other technical barriers to access and the terms of use were not sufficiently visible because the link was “buried” at the bottom of the first page in fine print, such that users had to scroll down to the bottom to see the link. In contrast, in Southwest Airlines Co. v. Farechase, Inc., the District Court for the Northern District of Texas refused to grant defendant’s motion to dismiss the CFAA claims where the terms of use agreement was accessible from all pages on the website. Similarly, in Facebook, Inc. v. Power Ventures, Inc., the District Court for the Northern District of California granted plaintiff’s motion for summary judgment on the CFAA claims where users had to affirmatively agree to the terms of use before logging in and accessing certain features. However, in both of the latter two cases, the plaintiffs had also made direct warnings and requests to defendants to stop the scraping activity, which bolstered plaintiffs’ claims that access was unauthorized.

Overall, the recent trend appears to be for courts to focus on the original purpose of the statute and reject broad theories that allow terms of use violations to be used as a basis to establish liability under the CFAA. In narrowing the scope of liability under the CFAA, courts often emphasize the distinction between unauthorized access to data and unauthorized use of data. Only unauthorized access constitutes a violation of the CFAA. Courts also sometimes differentiate between technical barriers to access and contractual limitations on access in narrowing liability, reasoning that the CFAA was only meant to prevent hacking (especially considering the CFAA imposes criminal penalties).

For example, in United States v. Nosal, the Ninth Circuit held that access without authorization under the CFAA “does not extend to violations of use restrictions,” but concerns “hacking—the circumvention of technological access barriers.” In reaching its decision, the court emphasized the legislative history of the CFAA, noting that it was enacted primarily to address the growing problem of computer hacking. The court stated that applying the CFAA to use violations would “transform the CFAA from an anti-hacking statute into an expansive misappropriation statute.” The court also noted the absurd results that would follow from potentially criminalizing violations of website use restrictions, which, the court noted, nearly everyone who uses a computer is guilty of.

However, at least one district court has interpreted Nosal narrowly and has implied that violations of certain terms of use can still lead to CFAA liability. In Weingand v. Harland Financial Solutions, Inc., the District Court for the Northern District of California held that “although Nosal clearly precluded applying the CFAA to violating restrictions on use, it did not preclude applying the CFAA to rules regarding access.” Applying this reasoning, the court proved willing to allow a contract limiting access to information to give rise to CFAA liability when the contract is violated. Weingand suggests that CFAA liability could turn on whether the limitations in a website’s terms of use are framed as use restrictions or access restrictions. If a term is framed as a use restriction, violating it may not create liability, but if it is framed as an access restriction, violating it could still create CFAA liability.

Takeaways

  • Determining whether web scraping can support a viable claim under the CFAA requires a fact-intensive inquiry into whether the user accessed the website “without authorization” or “exceed[ing] authorization.”
  • CFAA claims are more likely to hold up in court if the relevant computer or website is protected from unauthorized access either by technical measures or by explicit warnings.
  • CFAA claims are less likely to hold up in court if the relevant computer or website is publicly accessible and not protected by any security measures or explicit warnings.
  • It is unclear whether web scraping in violation of a website’s terms of use would support a viable CFAA claim. Relevant factors include the location and accessibility of the terms of use, whether there are any additional warnings or barriers to access, and whether the prohibition is phrased as an “access” restriction or a “use” restriction.

If you have any questions about data scraping, contact clientservices@opendoorgc.com.

Written by Stephanie Kostiuk - Founder, OpenDoor GC