LANDER:geoloc university ground truth-20151001

README version: 5185, last modified: 2016-01-26.

This file describes the trace dataset "geoloc_university_ground_truth-20151001" provided by the LANDER project.

LANDER Metadata

dataSetName                geoloc_university_ground_truth-20151001
status                     usc-web-and-predict
shortDesc                  Geolocation ground truth data
longDesc                   This dataset contains latency measurements and location information for 85 /24
                           blocks that belong to universities around the world. The IP addresses of each
                           block are believed to be closely co-located. The measurements are leveraged from
                           the IP Geolocation datasets: InternetTopologyData/IP_Address_Geolocation_Data
                           provided by the USC/LANDER project (http://www.isi.edu/ant/lander). The location
                           information is obtained from Google Geocode API and MaxMind GeoLite City database.
                           The dataset also provides artificial dual-location blocks made by merging IP
                           addresses from two different single-location /24 blocks. The two merged blocks are
                           VP-compatible (probed by the same set of VPs). The distance between the
                           combined blocks varies across the artificial blocks. This dataset can be useful
                           for geolocation and IP block co-locality research.
datasetClass               Quasi-Restricted
commercialAllowed          true
requestReviewRequired      true
productReviewRequired      false
ongoingMeasurement         false
submissionMethod           Upload
collectionStartDate        2015-10-01
collectionStartTime        00:00:00
collectionEndDate          2015-10-02
collectionEndTime          00:00:00
availabilityStartDate      2016-01-28
availabilityStartTime      15:07:05
availabilityEndDate        2030-01-01
availabilityEndTime        00:00:00
anonymization              none
archivingAllowed           false
keywords                   category:internet-topology-data, subcategory:ip-address-geolocation-data,
                           ip-address, geolocation
format                     text
access                     https
hostName                   USC-LANDER
providerName               USC
groupingId
groupingSummaryFlag        false
retrievalInstructions      download
byteSize                   2097152
expirationDays             14
uncompressedSize           1718137
impactDoi                  10.23721/109/1354081
useAgreement               dua-ni-160816
irbRequired                false
privateAccessInstructions  See http://www.isi.edu/ant/traces/index.html#getting_datasets for information on
                           obtaining this dataset.
                           See

Dataset Contents

The dataset contains files of latency measurements and location information for university /24 blocks.

geoloc_university_ground_truth-20151001.README.txt
    copy of this README
single_location_blocks/single_location_blocks_RTT_data.fsdb
    single-location blocks latency measurements
single_location_blocks/single_location_blocks_geolocation_data.fsdb
    single-location blocks geolocation data
dual_locations_dataset/dual_location_artificial_blocks_RTT_data.fsdb
    dual-location artificial blocks latency measurements
dual_locations_dataset/dual_location_artificial_blocks_distance_geolocation_data.fsdb
    dual-location artificial blocks distance and geolocation data

Data Format

File "single_location_blocks_RTT_data.fsdb" has the RTT measurements (in microseconds) for the IP addresses of the single-location blocks. The RTT measurements are observed from 10 different vantage points (VPs).
Each record (row) has the following data:

#fsdb block block_prefix ip RTT1 RTT2 RTT3 RTT4 RTT5 RTT6 RTT7 RTT8 RTT9 RTT10
c242d000 194.66.208.0/24 194.66.208.235 10081 10121 10057 9571 9562 6757 5304 5265 5183 5167
c242d000 194.66.208.0/24 194.66.208.246 10032 10044 10148 9549 9562 7882 5218 5291 5172 5188

In these samples, the block field is the hexadecimal form of the block's network address (c242d000 is 194.66.208.0).

File "single_location_blocks_geolocation_data.fsdb" has the geolocation data for the 85 /24 blocks from two sources: the Google Geocoding API and the MaxMind GeoLite City public database. Each record has the following data (MM refers to MaxMind GeoLite data):

#fsdb block block_prefix googleAPI_lat googleAPI_lon MM_lat MM_lon MM_country MM_state_or_region MM_city MM_postal_code
c242d000 194.66.208.0/24 51.277940 1.090660 51.279 1.0799 GB G5 Canterbury CT1
c13c4e00 193.60.78.0/24 51.483316 -0.003960 51.4776 -0.0104 GB E7 Greenwich SE10

File "dual_location_artificial_blocks_RTT_data.fsdb" has the same layout as "single_location_blocks_RTT_data.fsdb", but for the artificial blocks (each a combination of two single-location /24 blocks). The combination_ID replaces the block and block_prefix fields and identifies the IP addresses that belong to the same artificial block.

#fsdb combination_ID ip RTT1 RTT2 RTT3 RTT4 RTT5 RTT6 RTT7 RTT8 RTT9 RTT10
Combination_1 193.60.78.60 4649 4647 4660 4205 4258 3058 1817 1699 1384 1574
Combination_1 193.60.78.153 4785 4891 4933 4392 4364 3081 2099 1852 1718 1613

File "dual_location_artificial_blocks_distance_geolocation_data.fsdb" has the Google Geocoding API location (latitude/longitude) of the two merged /24 blocks in each artificial block, and the distance (in miles) between them.
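The record layout described in this section can be read with a few lines of Python. The sketch below is only an illustration under stated assumptions: it assumes a header line that starts with "#fsdb" followed directly by the column names and whitespace-separated fields, as in the samples in this README. The helpers parse_fsdb, haversine_miles, and check_distances are hypothetical names, not part of the dataset or of any fsdb tool. This README does not say how the distance column was computed; a great-circle (haversine) distance with a mean Earth radius of 3958.8 miles is one plausible choice, used here as a sanity check on the two sample records of the distance and geolocation file.

```python
import math

# Assumption: mean Earth radius in miles; the README does not specify one.
EARTH_RADIUS_MILES = 3958.8


def parse_fsdb(text):
    """Parse whitespace-separated fsdb text into a list of dicts.

    Assumes a '#fsdb' header line followed directly by column names, as in
    the README samples. Other lines starting with '#' are skipped.
    """
    columns = None
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#fsdb"):
            columns = line.split()[1:]
            continue
        if line.startswith("#"):
            continue
        if columns is None:
            raise ValueError("data line found before the #fsdb header")
        records.append(dict(zip(columns, line.split())))
    return records


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# The two sample records of the distance and geolocation file.
SAMPLE = """\
#fsdb combination_ID block_prefix_1 block_prefix_2 distance block1_lat block1_lon block2_lat block2_lon
Combination_1 193.60.78.0/24 194.66.208.0/24 49.29 51.483316 -0.003960 51.277940 1.090660
Combination_2 134.173.112.0/24 134.173.237.0/24 0.53 34.104669 -117.704739 34.103824 -117.713872
"""


def check_distances(text):
    """Return (combination_ID, listed distance, recomputed distance) tuples."""
    results = []
    for rec in parse_fsdb(text):
        recomputed = haversine_miles(
            float(rec["block1_lat"]), float(rec["block1_lon"]),
            float(rec["block2_lat"]), float(rec["block2_lon"]),
        )
        results.append((rec["combination_ID"], float(rec["distance"]), recomputed))
    return results


if __name__ == "__main__":
    for combo, listed, recomputed in check_distances(SAMPLE):
        print(f"{combo}: listed {listed:.2f} mi, recomputed {recomputed:.2f} mi")
```

For the two sample records, the recomputed values come out close to the listed 49.29 and 0.53 miles. The same parse_fsdb helper can be applied to the RTT files, converting the RTT1 to RTT10 fields to integers. If fields in the real files contain spaces (a multi-word MM_city, for instance), splitting on the file's actual field separator would be needed instead of splitting on any whitespace.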
Each record of the distance and geolocation file has the following data:

#fsdb combination_ID block_prefix_1 block_prefix_2 distance block1_lat block1_lon block2_lat block2_lon
Combination_1 193.60.78.0/24 194.66.208.0/24 49.29 51.483316 -0.003960 51.277940 1.090660
Combination_2 134.173.112.0/24 134.173.237.0/24 0.53 34.104669 -117.704739 34.103824 -117.713872

Dataset Generation

The measurements in this dataset are drawn from the IP Geolocation datasets: InternetTopologyData/IP_Address_Geolocation_Data provided by the USC/LANDER project (http://www.isi.edu/ant/lander). The location information is obtained from the Google Geocoding API and the MaxMind GeoLite City database.

The blocks in this dataset are selected according to several criteria to ensure that they are locally hosted by their organizations and that the IP addresses of each block are geographically co-located. First, the initial set of /24 blocks is selected such that each block contains the IP address of a university's website. Such blocks are more likely to be locally hosted. Second, the blocks are checked for self-hosting by using whois information to detect outsourcing. Third, the physical geographic location of the university, obtained using the Google Maps Geocoding API, is compared with the block location reported by the MaxMind database. Only blocks whose two reported locations are 10 miles or less apart are kept in the dataset.

Although we believe all the academic institution blocks in the dataset are single-location, the clustering method used in the related work (see section "Results Using This Dataset") identifies 2 clusters in the following 7 blocks: 194.81.33.0/24, 139.222.128.0/24, 132.198.101.0/24, 149.222.20.0/24, 134.245.12.0/24, 141.89.68.0/24, 202.229.120.0/24. Multiple clusters indicate a potentially multi-location block (or at least heterogeneity in the latency measurements observed for the block).

The dual-location artificial blocks are created by combining two blocks at two different locations from the single-location dataset.
Only blocks identified as single-cluster are used to create the artificial dual-location blocks. In addition, only blocks probed by the same set of VPs (VP-compatible blocks) can be combined to form an artificial block. Some of our single-location blocks are almost co-located, so the distance between the two merged blocks varies; for each artificial block we provide the location of each /24 block and the distance between the combined blocks.

Citation

If you use this trace to conduct additional research, please cite it as:

University blocks geolocation ground truth and latency measurements, PREDICT ID: USC-LANDER/geoloc_university_ground_truth-20151001. Provided by the USC/LANDER project http://www.isi.edu/ant/lander.

Results Using This Dataset

Manaf Gharaibeh, Han Zhang, Christos Papadopoulos, and John Heidemann. Assessing Co-Locality of IP Blocks. (Submitted to IEEE Global Internet 2016, currently under review.) The work is released as a technical report and can be found at: http://www.cs.colostate.edu/TechReports/Reports/2015/tr15-103.pdf