rangerBlog

Jul. 21st, 2007

05:33 am - Unreal Estate (IP Borders, part 1)

The 25 countries with the 'largest' IPv4 Footprint
    Country                  Size      %
 1) UNITED STATES           48.59  46.90
 2) JAPAN                    6.11   5.89
 3) AUSTRALIA                2.42   2.33
 4) CHINA                    6.24   6.03
 5) UNITED KINGDOM           3.95   3.81
 6) GERMANY                  4.31   4.16
 7) FRANCE                   3.08   2.97
 8) CANADA                   2.76   2.66
 9) REPUBLIC OF KOREA        3.18   3.06
10) NETHERLANDS              2.39   2.30
11) ITALY                    2.04   1.97
12) SPAIN                    1.26   1.22
13) SWEDEN                   1.24   1.19
14) BRAZIL                   1.15   1.11
15) SWITZERLAND              1.13   1.09
16) TAIWAN                   1.10   1.06
17) MEXICO                   0.97   0.94
18) RUSSIAN FEDERATION       0.89   0.86
19) NORWAY                   0.76   0.73
20) FINLAND                  0.73   0.71
21) POLAND                   0.69   0.67
22) AUSTRIA                  0.56   0.54
23) DENMARK                  0.53   0.51
24) INDIA                    0.51   0.49
25) BELGIUM                  0.47   0.46

I'm doing some work for the Science Museum of Minnesota this summer- one of the projects involves visualizations of globalization; we wanted to show how the world is connected-- a map of the internet.

I'll be honest, it was a modest goal. Surely there has to be a map out there somewhere showing how the world would be rearranged if it were laid out the way IP addresses are... right? Just crib from that and make a pretty version!

Apparently not. There's lots of maps of the internet, some beautiful traceroute visualizations, network topology, even physical density of IP addresses (which is approaching a 1:1 correlation with human physical density- regardless of where you are in the world.)

These are some beautiful visualizations, and I love them all, but none of them are what we want and what I so callously committed myself to. I don't even have a clue what 'if IP addresses determined geographical layout' means! I suppose I could just kludge something together and declare it fits, but this is for the NOAA's exhibit- I'd like it to have some actual scientific basis.

...which is how I found myself running regular expressions over a datafile containing IP address tuples and their corresponding countries. A couple of hours re-teaching myself PHP classes (I know, I know, I never learned to run a database locally, so I'm scratch-writing stats analysis on a web server...) yielded a list with some useful numbers, the top 25 of which I've listed above.
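For the curious, here is roughly what that regex pass looks like, sketched in Python rather than the PHP I actually used. The line layout (start address, end address, country name, comma-separated) is an assumption for illustration, and `LINE_RE`, `parse_line` and the sample rows are made-up names and data, not the real datafile.

```python
import re

# Hypothetical line layout (an assumption; the real datafile isn't shown):
#   start address, end address, country name
DOTTED = r"\d{1,3}(?:\.\d{1,3}){3}"
LINE_RE = re.compile(
    r"^\s*(" + DOTTED + r")\s*,\s*(" + DOTTED + r")\s*,\s*(.+?)\s*$"
)

def parse_line(line):
    """Return (start, end, COUNTRY) for a matching line, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    start, end, country = match.groups()
    return start, end, country.upper()

# Illustrative rows only.
sample = [
    "12.0.0.0, 12.255.255.255, United States",
    "# a comment line the pattern should skip",
    "133.0.0.0, 133.255.255.255, Japan",
]

# Keep only the lines that actually parsed.
records = [rec for rec in map(parse_line, sample) if rec is not None]
for rec in records:
    print(rec)
```

Anything that doesn't look like two dotted addresses plus a name just falls out as `None`, which is about the level of error handling a one-off script like this deserves.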

the names for the bytes of an IPv4 address
(which I just made up)
255.     255.       255.     255
Ambit    Windward   Dative   Squib
<-- General            Specific -->

IP addresses are like phone numbers for every machine physically connected to the internet. (If you're on wireless, the modem you're connecting through has an IP address.) Example: 52.149.33.204. That's 4 numbers ('bytes'), each valued between 0 and 255. I gave the IP bytes names while I was working on them, it amused me so. Put together there are about 4.3 billion combinations (256 x 256 x 256 x 256). This is actually a problem, since there are already 6 billion people on Earth. IPv6, still in development, is expected to fix this potential 'crunch' with plenty of time to spare before the entire planet gets wired.
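A minimal sketch of that arithmetic in Python, using the example address from the paragraph above. The function names `split_bytes` and `to_integer` are mine, made up for illustration; the byte names are the silly ones from this post.

```python
def split_bytes(address):
    """Split a dotted IPv4 address into its four byte values."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("not a dotted IPv4 address: " + address)
    return parts

def to_integer(address):
    """Pack the four bytes into one number, most general byte first."""
    ambit, windward, dative, squib = split_bytes(address)
    return (ambit << 24) | (windward << 16) | (dative << 8) | squib

# Four bytes with 256 possible values each.
TOTAL_IPV4 = 256 ** 4

ambit, windward, dative, squib = split_bytes("52.149.33.204")
print("Ambit:", ambit, "Windward:", windward, "Dative:", dative, "Squib:", squib)
print("As one number:", to_integer("52.149.33.204"))
print("Total combinations:", TOTAL_IPV4)
```

Treating the address as a single number is what makes ranges easy to work with later: a block of adjacent addresses is just everything between two integers.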

IP addresses are a hierarchy. The first value is the broadest and the final one is the narrowest. Like genus and species, an IP address pares the options down until only one remains.

For my purposes, I wanted to know about the Ambit, the broadest upper category of IP addresses. There are 256 Ambit, each representing about 16.8 million addresses (256 x 256 x 256). I want to know which Ambit 'belong' to which country, and how they're divided if they belong to multiple countries.
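One way to get at that question, sketched in Python: walk each allocated range, chop it at Ambit boundaries, and add up how many addresses each country holds inside each Ambit. This is an illustration and not the script I actually ran. `ambit_tally` is a made-up name, and the sample ranges and their owners are invented.

```python
from collections import defaultdict
import ipaddress

AMBIT_SIZE = 256 ** 3  # addresses under a single first byte

def ambit_tally(ranges):
    """Count addresses per country inside each Ambit.

    ranges: iterable of (start, end, country), where start and end are
    dotted IPv4 strings and the range is inclusive at both ends.
    Returns {ambit: {country: address_count}}.
    """
    tally = defaultdict(lambda: defaultdict(int))
    for start, end, country in ranges:
        lo = int(ipaddress.IPv4Address(start))
        hi = int(ipaddress.IPv4Address(end))
        while lo <= hi:
            ambit = lo // AMBIT_SIZE
            # Stop at the end of the range or of this Ambit, whichever is first.
            chunk_end = min(hi, (ambit + 1) * AMBIT_SIZE - 1)
            tally[ambit][country] += chunk_end - lo + 1
            lo = chunk_end + 1
    return tally

# Invented sample data, for illustration only.
sample = [
    ("58.0.0.0", "58.63.255.255", "CHINA"),
    ("58.64.0.0", "58.127.255.255", "JAPAN"),
    ("60.255.0.0", "61.0.255.255", "AUSTRALIA"),  # straddles two Ambit
]

tally = ambit_tally(sample)
for ambit in sorted(tally):
    for country, count in sorted(tally[ambit].items()):
        print(ambit, country, count, "%.2f%%" % (100.0 * count / AMBIT_SIZE))
```

The splitting loop matters because a single range in the data can spill across a first-byte boundary, and it should count toward both Ambit.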

Note: This addresses IP allocation, not utilization. That's fine, because I'm treating it like a natural resource- tapped or not, which countries are 'rich' in IPs? And utilization is heading toward the maximum anyway. One of the reasons to change to the 16-byte IPv6 system is that with more than 300 undecillion addresses, vast swaths can be reserved for special purposes regardless of utilization. (An IP address then will convey context information beyond simple location on a network.)
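If you want to check the undecillion figure yourself, here is a quick sketch in Python. The sample address comes from the range set aside for documentation examples; the variable names are mine.

```python
import ipaddress

IPV4_TOTAL = 2 ** 32    # 4 bytes of 8 bits each
IPV6_TOTAL = 2 ** 128   # 16 bytes of 8 bits each

# One undecillion is 10**36 on the short scale.
undecillions = IPV6_TOTAL / 10 ** 36

# An IPv6 address really is 16 bytes long when packed.
example = ipaddress.IPv6Address("2001:db8::1")
byte_count = len(example.packed)

print("IPv4 addresses:", IPV4_TOTAL)
print("IPv6 addresses:", IPV6_TOTAL)
print("That is roughly %.0f undecillion" % undecillions)
print("Bytes in an IPv6 address:", byte_count)
```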

bytes of an IPv6 address
Squib           255   1
Dative          255   2
Windward        255   3
Ambit           255   4
Capchaw         255   5
Demonstrative   255   6
Midwife         255   7
Remorse         255   8
Shenlong        255   9
Exoletus        255  10
Déshabillé      255  11
Xabungle        255  12
Egregore        255  13
Ill Solace      255  14
Gabwhacker      255  15
Bleed           255  16

Staring at numbers with lots of caffeine and no sleep is a dangerous thing. Somewhere along the way I decided that a group of adjacent IP addresses was a Theory, like a pride of lions, and I took the time to name all 16 bytes in the upcoming IPv6 protocol, which I inflict on you above. I gave three of them genders too, but you have to guess which ones.

So armed with this purpose, and a database of 76,000 IP tuples from around the world, I went to work and did just that. I discarded block allocations that appear unused, compensated for the unused allocations I found mentioned in other places... and ended up being able to account for 94% of all IP space. The fringes of IP allocation are constantly shifting- China never had a chunk of territory assigned to it; it's slowly nibbled a huge chunk of IP space out of blocks it shares with other countries. This chart is imperfect- the fudging to compensate for those allocated-but-largely-unused blocks makes it appear Japan only has 2/3 of an IP address per resident- they actually have ~1.1. On the one hand Japan is wireless, which makes IPs less important. On the other it's wired to the hilt- thus near-full-utilization.

The US, no surprise, has about half of all IPv4 territory worldwide. In the final map, that will translate as area. Clustering (continents?) and borders... are more problematic. There will have to be some sort of node-analysis of which countries 'share' a border based on their overlap, and how strong that overlap is. Ideally, with some artistic statistical smudges I'll be able to get the resulting 'map' to break up into 5-10 'continents' I can position via Photoshop to reflect what the statistics tell me about which nations are actually 'neighbors'.
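I haven't settled on how to measure 'overlap' yet, so treat this Python sketch as one possible starting point and not the method. It assumes a per-Ambit tally of addresses by country, and scores each pair of countries by how much of each shared Ambit the smaller holder occupies. The scoring rule, the name `border_weights` and the sample numbers are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

AMBIT_SIZE = 256 ** 3  # addresses under a single first byte

def border_weights(tally):
    """Score country pairs by their shared presence in the same Ambit.

    tally: {ambit: {country: address_count}}
    Returns {(country_a, country_b): weight}, with names in sorted order.
    The weight (an assumed measure) sums, over every Ambit both countries
    appear in, the smaller of the two holdings as a fraction of the Ambit.
    """
    weights = defaultdict(float)
    for holdings in tally.values():
        for a, b in combinations(sorted(holdings), 2):
            weights[(a, b)] += min(holdings[a], holdings[b]) / AMBIT_SIZE
    return dict(weights)

# Invented tally, for illustration only.
sample_tally = {
    58: {"CHINA": 8388608, "JAPAN": 4194304},
    61: {"CHINA": 4194304, "JAPAN": 4194304, "AUSTRALIA": 2097152},
}

weights = border_weights(sample_tally)
# Strongest 'borders' first.
for pair, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(pair, round(weight, 3))
```

The result is a weighted graph with countries as nodes. Clustering that graph is where the 'continents' would come from, and the edge weights would suggest which countries end up drawn next to each other.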

So, yeah. China has about 6% of the internet and 20% of the population. America has about 47% (about 1/3 of which is unused) and 5% of the population.

Mappy mappy!

Current Mood: busy