Yes, scraping public data in New York is generally permissible, provided the scraper complies with applicable federal, state, and local law. Publicly accessible information, such as government records, court filings, and open datasets, may be scraped unless terms of service, privacy laws, or contractual obligations restrict it. The New York State Department of State’s 2024 Digital Privacy Guidelines emphasize transparency, and legislative proposals such as the federal Algorithmic Accountability Act would add scrutiny to automated systems. Violations of the Computer Fraud and Abuse Act (CFAA) or General Business Law § 349 (deceptive acts and practices) can trigger liability.
Key Regulations for Scraping Public Data in New York
- Computer Fraud and Abuse Act (CFAA): Prohibits accessing a computer without authorization or exceeding authorized access. In hiQ Labs v. LinkedIn (9th Cir. 2022), the Ninth Circuit held that scraping publicly accessible data likely does not violate the CFAA. That decision is not binding on New York courts, and circumventing technical barriers such as login walls or CAPTCHAs may still expose a scraper to CFAA claims.
- New York’s SHIELD Act (2019): Requires businesses that hold New York residents’ private information to maintain reasonable data security safeguards and to give notice of breaches. It does not regulate scraping directly, but anyone who stores scraped personal data must protect it against unauthorized exposure.
- Local Open Data Laws: NYC’s Local Law 11 of 2012 (the Open Data Law) requires city agencies to publish certain datasets on the NYC OpenData portal. Published datasets are generally offered for reuse, and the portal’s API is usually a better route than scraping. Any conditions on use are set out in the portal’s terms of use and the open data provisions of NYC Admin. Code Title 23.
Practical Compliance Notes:
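- Terms of Service: Ignoring a website’s terms of service or API terms (e.g., LinkedIn’s user agreement) may support a breach-of-contract claim under New York contract law. A robots.txt file is not itself a contract, but disregarding it can be cited as evidence that the site did not welcome automated access.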
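- Public vs. Private: Data from government portals (e.g., NYC OpenData) is safer to scrape than data from third-party platforms. Aggregating datasets can still re-identify individuals, which may trigger privacy obligations such as HIPAA for health information held by covered entities, or GDPR where people in the EU are involved.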
- Enforcement: The NY Attorney General’s office, through its Bureau of Internet and Technology, enforces data privacy and security laws. Scraping that involves sensitive data, particularly biometric or geolocation data, is more likely to draw scrutiny.
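As a minimal sketch of the robots.txt check mentioned under Terms of Service, the Python snippet below uses the standard library's urllib.robotparser to test whether a URL is permitted before requesting it. The sample rules, the "example-bot" user agent, the example.com URLs, and the helper names are hypothetical and chosen only for illustration. A passing check shows what the site's robots.txt requests and nothing more; the site's terms of service still apply.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def _parser_for(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_path_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules allow user_agent to fetch url."""
    return _parser_for(robots_txt).can_fetch(user_agent, url)


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    return _parser_for(robots_txt).crawl_delay(user_agent)


if __name__ == "__main__":
    for url in ("https://example.com/public/data",
                "https://example.com/private/data"):
        allowed = is_path_allowed(SAMPLE_ROBOTS, "example-bot", url)
        print(url, "allowed" if allowed else "disallowed")
    print("crawl delay:", crawl_delay_for(SAMPLE_ROBOTS, "example-bot"))
```

In practice the rules would come from the target site's own /robots.txt (RobotFileParser can also load them with set_url and read), and a scraper would wait at least the reported crawl delay between requests.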