Operators of public-facing websites are typically concerned about the unauthorized, technology-based extraction of large volumes of information from their sites, often by competitors or others in related businesses. The practice, usually referred to as screen scraping, web harvesting, crawling or spidering, has been the subject of many questions and a fair amount of litigation over the last decade.
However, despite the litigation in this area, the state of the law on this issue remains somewhat unsettled: neither scrapers looking to access data on public-facing websites nor website operators seeking remedies against scrapers that violate their posted terms of use have very concrete answers as to what is permissible and what is not.
In the latest scraping dispute, the e-commerce site QVC objected to the Pinterest-like shopping aggregator Resultly’s scraping of QVC’s site for real-time pricing data. In its complaint, QVC claimed that Resultly “excessively crawled” QVC’s retail site (purportedly sending search requests to QVC’s website at rates ranging from 200-300 requests per minute to up to 36,000 requests per minute), causing a crash that wasn’t resolved for two days and resulting in lost sales. (See QVC Inc. v. Resultly LLC, No. 14-06714 (E.D. Pa. filed Nov. 24, 2014).) The complaint alleges that the defendant disguised its web crawler to mask its source IP address and thus prevented QVC technicians from identifying the source of the requests and quickly repairing the problem. QVC brought some of the causes of action often alleged in this type of case, including violations of the Computer Fraud and Abuse Act (CFAA), breach of contract (QVC’s website terms of use), unjust enrichment, tortious interference with prospective economic advantage, conversion and negligence. Of these and other causes of action typically alleged in these situations, the breach of contract claim is often the clearest source of a remedy.
This is a particularly interesting scraping case because QVC is seeking damages for the unavailability of its website, which QVC alleges was caused by Resultly. This is an unusual theory of recovery in these types of cases. For example, this past summer, LinkedIn settled a scraping dispute with Robocog, the operator of HiringSolved, a “people aggregator” employee recruiting service, over claims that the service employed bots to register false accounts in order to scrape LinkedIn member profile data and thereafter post it to its service without authorization from LinkedIn or its members. LinkedIn brought various claims under the DMCA and the CFAA, as well as state law claims of trespass and breach of contract, but did not allege that its service was unavailable due to the defendant’s activities. The parties settled the matter, with Robocog agreeing to pay $40,000, cease crawling LinkedIn’s site and destroy all LinkedIn member data it had collected. (LinkedIn Corp. v. Robocog Inc., No. 14-00068 (N.D. Cal. Proposed Final Judgment filed July 11, 2014)).
However, in one of the early, yet still leading, cases on scraping, eBay, Inc. v. Bidder’s Edge, Inc., 100 F. Supp. 2d 1058 (N.D. Cal. 2000), the district court touched on the foreseeable harm that could result from screen scraping activities, at least when taken in the aggregate. In that case, the defendant Bidder’s Edge operated an auction aggregation site and accessed eBay’s site about 100,000 times per day, accounting for between 1 and 2 percent of the information requests received by eBay and a slightly smaller percentage of the data transferred by eBay. The court rejected eBay’s claim that it was entitled to injunctive relief because of the defendant’s unauthorized presence alone, or because of the incremental cost the defendant had imposed on operation of the eBay site, but found sufficient proof of threatened harm in the potential for others to imitate the defendant’s activity.
It remains to be seen whether the parties will reach a resolution or the court will have a chance to interpret QVC’s claims, and whether QVC can provide sufficient evidence of a causal link between Resultly’s activities and the website outage.
Companies concerned about scraping should make sure that their website terms of use are clear about what is and isn’t permitted, and that the terms are positioned on the site to support their enforceability. In addition, website owners should ensure they are using “robots.txt,” crawl delays and other technical means to communicate their intentions regarding scraping. Companies that are interested in scraping should evaluate the terms at issue and other circumstances to understand the limitations in this area.
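As a rough illustration of how “robots.txt” and crawl delays communicate a site’s intentions, the sketch below uses Python’s standard urllib.robotparser module to read a set of rules the way a well-behaved crawler might. The robots.txt content, the example.com URLs and the “ExampleBot” user agent are all hypothetical and are not taken from any site discussed above. Note also that Crawl-delay is a widely used but non-standard directive, so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and the delay are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /checkout/
Disallow: /search
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A compliant crawler checks each URL before requesting it.
search_allowed = rules.can_fetch("ExampleBot", "https://www.example.com/search?q=lamp")
product_allowed = rules.can_fetch("ExampleBot", "https://www.example.com/products/123")

# Seconds the site asks crawlers to wait between requests (None if unspecified).
delay_seconds = rules.crawl_delay("ExampleBot")

print(search_allowed, product_allowed, delay_seconds)
```

These directives are only a request: they signal what the operator permits but do not technically block a crawler that ignores them, which is one reason clear and enforceable terms of use still matter.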