Policy date: 23 May 2007
This statement summarizes the ANT project's position on data privacy for our data providers and consumers.
Our goal in this policy is that ANT should not fundamentally change the privacy of users of monitored networks. We follow best current practices in network data collection (for example, see RFC 1262).
To that end:
We do not collect any user data (packet data contents sent between users).
We do collect packet headers; however, we capture only the first fraction of each packet (enough to get the header). We immediately run a program (the “scrubber”) that examines each packet and deletes any partially captured user data. (This program runs automatically, immediately after capture and before any user examines the data.)
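As an illustration of the scrubbing step described above, here is a minimal Python sketch. It is not the ANT scrubber itself: the function name `scrub` is hypothetical, and the sketch assumes each captured record starts at an IPv4 header (no link-layer framing) and handles only TCP and UDP specially. Anything it cannot parse is cut back to the IP header or dropped, so that no payload bytes survive.

```python
def scrub(packet: bytes) -> bytes:
    """Return only the IPv4 and transport headers of a captured packet.

    Hypothetical sketch: assumes `packet` begins at the IPv4 header.
    """
    # Require a complete minimal IPv4 header; otherwise keep nothing.
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return b""
    ip_len = (packet[0] & 0x0F) * 4          # IHL field, in bytes
    if ip_len < 20 or len(packet) < ip_len:
        return b""
    proto = packet[9]                        # IPv4 protocol field
    if proto == 6:                           # TCP
        if len(packet) < ip_len + 13:
            return packet[:ip_len]           # TCP header truncated
        tcp_len = (packet[ip_len + 12] >> 4) * 4   # data offset, in bytes
        if tcp_len < 20:
            return packet[:ip_len]           # malformed TCP header
        return packet[:ip_len + tcp_len]
    if proto == 17:                          # UDP has a fixed 8-byte header
        return packet[:ip_len + 8]
    return packet[:ip_len]                   # unknown transport: IP header only


# Usage: a 20-byte IP header, a 20-byte TCP header, then user data.
ip_hdr = bytes([0x45]) + bytes(8) + bytes([6]) + bytes(10)
tcp_hdr = bytes(12) + bytes([0x50]) + bytes(7)
scrubbed = scrub(ip_hdr + tcp_hdr + b"secret")
```

Failing closed (returning less rather than more when a header cannot be parsed) is the design choice that matters here: a parsing mistake then costs some header data, never user data.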
We ensure that all user IP addresses collected with packet headers are anonymized.
We scramble (with prefix-preserving anonymization using a cryptographic key) the lowest 8 bits of any IP addresses in the packet headers. The cryptographic keys used for this are rotated and discarded regularly. Once these keys are discarded it is impossible to determine the low-order 8 bits of the original IP address.
In addition we scramble (with prefix-preserving anonymization using a second cryptographic key) the complete 32 bits of any IP addresses in the packet headers. The cryptographic keys used for this are rotated and discarded regularly. Once these keys are discarded it is impossible to determine either the low-order 8 bits or the full original IP address.
(In some cases, when we are confident IP addresses are for infrastructure, not users, we report them without anonymization.)
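To make the idea of keyed, prefix-preserving anonymization concrete, the following Python sketch flips each address bit according to a keyed function of the original bits that precede it. It is a sketch under stated assumptions, not the tool ANT uses: it substitutes HMAC-SHA256 for whatever cipher the real system employs, the names `anonymize` and `common_prefix_len` are hypothetical, and the way the two passes are chained at the end is an assumption, since the policy does not say how they compose.

```python
import hashlib
import hmac
import ipaddress


def _prf_bit(key: bytes, prefix: int, length: int) -> int:
    """One pseudorandom bit derived from the key and an address prefix."""
    msg = length.to_bytes(1, "big") + prefix.to_bytes(4, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[0] & 1


def anonymize(addr: str, key: bytes, first_bit: int = 0) -> str:
    """Scramble bits first_bit..31 of an IPv4 address (bit 0 is the top bit).

    Each flip depends only on the original bits above it, so two addresses
    that share a k-bit prefix still share exactly a k-bit prefix afterwards.
    """
    original = int(ipaddress.IPv4Address(addr))
    result = original
    for i in range(first_bit, 32):
        prefix = original >> (32 - i)        # top i bits of the original
        result ^= _prf_bit(key, prefix, i) << (31 - i)
    return str(ipaddress.IPv4Address(result))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


# Usage with two illustrative keys (real keys would be random and secret).
key_low8, key_full = b"example-key-1", b"example-key-2"
low8 = anonymize("192.0.2.17", key_low8, first_bit=24)  # top 24 bits kept
full = anonymize(low8, key_full)                        # assumed chaining
```

Because the output depends on the key, discarding the key removes the only practical way to reverse the mapping, which is the property the policy relies on when keys are rotated and discarded.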
We use the data for networking research in a privacy-respectful manner.
We ensure that:
We make our datasets available publicly to researchers who need these kinds of datasets for research, currently through direct requests to our project at USC. All researchers using our data must agree to follow our privacy policy. Details of these requirements are in a memorandum of agreement, but a summary of those requirements is:
Under specific conditions, such as with explicit user consent and after IRB review, we may relax these policies for specific datasets or analyses.