The MR-Net project ran from 2008 to 2010 and is now complete. Its research results are now used by several other projects at ISI and elsewhere. For follow-on work, please see current work by the ANT Lab.
The MR-Net project is exploring the use of parallel algorithms to process very large network datasets. Our goal is to be able to analyze 2.7 billion pings to map the Internet address space, process 6 months of flow records to understand traffic trends, and search a week’s worth of packet headers to understand attack characteristics. Each of these tasks requires efficient and economical processing of datasets ranging in size from 50GB to several terabytes. This 100- to 1000-fold or greater leap in dataset size requires fundamentally different ways of handling network data than running today’s tcpdump and ethereal on a workstation.
Two enablers make this leap in size possible. First, we will exploit map/reduce-style parallelism, which runs efficiently over clusters of commodity PCs (as shown at Google, and now at Yahoo and in academia). Second, programs such as PREDICT and our LANDER project are collecting large network datasets.
The goal of this project is to fill the remaining gap: understanding how to apply parallelism to networking problems. Our maps and browsable versions of the Internet address space are our first steps toward this understanding.
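As a rough illustration of what map/reduce-style parallelism can look like for a networking problem, the sketch below counts responsive addresses per /24 block from ping results. It is not the project's code: the record format (an address string paired with a responded flag), the function names, and the single-process `run_map_reduce` driver are all assumptions made for this example. A real deployment would hand the same map and reduce functions to a cluster framework instead.

```python
from collections import defaultdict

def map_ping(record):
    """Map step: emit (/24 block, 1) for each address that answered a ping.

    `record` is a hypothetical (dotted-quad address, responded) pair.
    """
    address, responded = record
    if responded:
        block = ".".join(address.split(".")[:3]) + ".0/24"
        yield block, 1

def reduce_counts(block, values):
    """Reduce step: total the responsive addresses seen in one block."""
    return block, sum(values)

def run_map_reduce(records, mapper, reducer):
    """Single-process stand-in for a cluster: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# Illustrative input using documentation address ranges, not real data.
pings = [
    ("192.0.2.1", True),
    ("192.0.2.7", True),
    ("192.0.2.9", False),
    ("198.51.100.4", True),
]
responsive_per_block = run_map_reduce(pings, map_ping, reduce_counts)
print(responsive_per_block)
```

Because each record is mapped independently and each block is reduced independently, both steps can in principle be spread across many machines, which is the property that makes this style attractive for multi-terabyte datasets.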
MR-Net is supported by the National Science Foundation’s Networking Technology and Systems (NeTS) program, grant number CNS-0823774.
For related publications, please see the ANT publications web page.
See also the ANT software web page.
See the ANT traces page.