GDPR120 comprises 39,834 annotations over 120 recent privacy policies, using 108 possible labels. To date, it is the only open-source large-scale dataset dealing with privacy issues under European law. GDPR120 was annotated under the close supervision of senior researchers and professors specialised in privacy and data science.

GDPR120 is available in two formats: GDPR120Q is formatted for training question-answering (Q&A) models, while GDPR120TC is suited to multi-label token classification tasks.
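
For readers new to multi-label token classification, the sketch below shows the general idea: each token can carry zero, one, or several labels, usually encoded as a fixed-width 0/1 vector. It does not reflect the actual GDPR120TC schema. The label names, the example sentence, and the `multi_hot` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical label names for illustration only; NOT the GDPR120 label set.
LABELS: List[str] = ["DATA_CONTROLLER", "PURPOSE", "RETENTION"]
LABEL_TO_INDEX: Dict[str, int] = {name: i for i, name in enumerate(LABELS)}


def multi_hot(token_labels: List[Set[str]]) -> List[List[int]]:
    """Turn per-token label sets into fixed-width 0/1 vectors."""
    vectors = []
    for labels in token_labels:
        row = [0] * len(LABELS)
        for name in labels:
            row[LABEL_TO_INDEX[name]] = 1
        vectors.append(row)
    return vectors


# Made-up example: one label set per token, possibly empty or overlapping.
tokens = ["We", "keep", "data", "for", "billing"]
token_labels = [
    set(),
    {"RETENTION"},
    {"RETENTION"},
    {"PURPOSE"},
    {"PURPOSE", "RETENTION"},
]

encoded = multi_hot(token_labels)
for token, row in zip(tokens, encoded):
    print(token, row)
```

With the real dataset, the vector width would presumably match the 108 possible labels, and the token and label fields would follow whatever layout GDPR120TC actually uses.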

GDPR120 is curated and maintained by the Smart Law Hub to support Natural Legal Language Processing (NLLP) research in the field of privacy compliance.

DATASET Code

Let's keep in touch!