Posted on 2022-02-01
In this article, I'll explain how and why I acquired an Autonomous System Number and some IPv6 addresses.
In late 2020, I read https://blog.dave.tf/post/new-kubernetes/. In this post, the author said if they were to build something new, they would focus on "IPv6 only, mostly". This post got me thinking about having some IPv6 connectivity again.
Before I got to Montreal, I used to have access to IPv6 Internet. I can't remember for sure, but I think it was through a Hurricane Electric tunnel.
My Internet Service Provider (ISP) is a small ISP. They don't own the last mile. They provide native IPv6 on some of their other subscriptions, where they can. However, on the service I'm subscribed to, the last mile owner is still in the process of deploying IPv6 (always-have-been-meme.png).
No native IPv6 means I'll need to set up some tunnels, one way or another.
The first thing I looked at was Hurricane Electric, since it was the only provider I knew at the time. Unfortunately, they only offer GRE tunnels, which means no encryption. One could argue that in 2020+, with HTTPS and DoT/DoH, there is little unencrypted traffic, but to that I'll reply "meh".
I thought I could rent a virtual machine (preferably, since tunnels require few resources and VMs are way cheaper than dedicated servers) and run my own tunnel with the IPv6 it provides.
As mentioned in my infrastructure blog post, I have multiple networks (VLAN) at home. Because I didn't want to do some unholy things, I needed to have a /64 per network, meaning multiple /64s for my home.
I went on the hunt for a provider that offers something like a /56 (or the option to get multiple /64s). Unfortunately, I didn't find anything reasonable. I eventually found some high-end servers that came with a /48, but since they cost nearly as much as my rent, I passed. Most providers give at best a /64, but it can also be a /128 (lol) or nothing (yeah who cares about IPv6).
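To put numbers on those prefix sizes, here is a small sketch using Python's standard ipaddress module. The prefix 2001:db8::/48 is the IPv6 documentation range, used only as a placeholder, and the helper name count_subnets is made up for this example.

```python
import ipaddress


def count_subnets(prefix: str, new_prefixlen: int) -> int:
    """Return how many subnets of length new_prefixlen fit inside prefix."""
    net = ipaddress.ip_network(prefix)
    if new_prefixlen < net.prefixlen:
        raise ValueError("new prefix length must not be shorter than the parent")
    # Each extra bit of prefix length doubles the number of subnets.
    return 2 ** (new_prefixlen - net.prefixlen)


if __name__ == "__main__":
    # 2001:db8::/48 is a documentation prefix, not a real allocation.
    print("/56s in a /48:", count_subnets("2001:db8::/48", 56))
    print("/64s in a /56:", count_subnets("2001:db8::/56", 64))
    print("/64s in a /64:", count_subnets("2001:db8::/64", 64))
```

A single /64 covers exactly one network, which is why a provider handing out only a /64 (or a /128) cannot serve several VLANs without tricks.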
I asked a network engineer friend if he knew of any hosting service providing more than a /64 with a cheap machine and, well, he gave a network-engineer type of answer: "just get your IPv6 addresses and announce them".
I asked for more details, he kindly answered, and I decided to proceed with this.
While I don't qualify as a network engineer, I'm not completely ignorant network-wise. I used to work for a network operator (so I'm no stranger to BGP) and I used to be a volunteer for a couple of non-profit ISPs back in France.
Disclaimer: Keep in mind what follows is my own interpretation. Go read the relevant parties' websites and agreements to form your own opinion.
Following my friend's advice, I set out to get some IPv6 addresses and an ASN to announce them. I could then create my own (encrypted of course) tunnels to get IPv6 at home.
I would also be able to achieve what I had wanted for years: play with anycast.
IP addresses and an ASN can be obtained through a RIR.
Because of my personal situation (which I won't get into), there are two RIRs I could ask: ARIN and the RIPE.
ARIN is the RIR for Corporatist America. If you're not a corporation, well you're not going to go very far.
I considered creating my own corporation, but the cost exceeded what I was ready to spend on the project. As affordable as it would have been for a corporation, it would not be for me.
RIPE is the RIR for Socialist Europe. You're an individual and you want some resources? That's totally fine, go ask for some. Well, not directly. RIPE doesn't talk to peasants, you'll have to ask a LIR. If they can provide it directly, they do. Otherwise, they act as a proxy between you and the RIPE.
I went for this option. From my time volunteering, I know quite a lot of people in quite a lot of LIRs.
I chose Grifon for no particular reason.
My initial plan was to get a /48 to get IPv6 at home and a /48 to play with anycast (because it is the smallest network you can announce on the Internet). I couldn't do anything else with the /48 I would anycast, by design.
So after completing my membership, I requested a /48 IPv6 from the RIPE (through my LIR, as explained). A few days after the request and with some follow-up questions, I got my first prefix. Now that I had some address space, I could justify the need for an ASN. I made the request and got it.
I then requested a second /48 from my LIR, out of its own resources. Alarig kindly carved my second /48 out of the address space the LIR reserves for this purpose.
(For the readers not versed in the RIPE-world technicalities, the first /48 is a PI, the second is a PA).
Shortly after I set up IPv6 at home, I noticed Google believed I was in France. Given that even huge networks struggle to fix such problems, I had no hope for myself. I thought that maybe using a netblock from ARIN would solve my issue.
At first, I went to ask a non-profit I contribute to, but it didn't work because we hit a technical limitation from a common provider.
Then, I found the Nato Internet Service. They provide a /48 (or more if you can justify the need) out of a netblock called feda (because it comes from 2602:feda::/36).
Unfortunately, this didn't solve my geolocation problem with Google. I even had a new problem: my FEDA block was geolocated in China. I easily fixed it in the MaxMind database, though, and that seems to have been enough.
However, as the quote says, "Everybody has a testing netblock. Some people are lucky enough to have a totally separate netblock to run production in." I now had a /48 I could use to test stuff for anycast.
Are you into IPAM porn? Because if you're into IPAM porn, you're in for a treat!
Now that I had 3 netblocks that I was going to cut into smaller networks, I would need a tool to track usage. Nowadays, most people use NetBox. I thought I was going to use it, but after reading the author of sidekiq a couple of times, I realized I didn't need such a complex tool.
For shits and giggles, I initially thought "wouldn't it be nice to use tree(1) to see everything??". I created directories for blocks, and files for addresses. Here's what it looked like:
~/git/git.chown.me/ipam/ipv6 (master=)$ tree
.
├── 2001:67c:291c::-48
│   └── 2001:67c:291c::1
└── 2a0e:f43::-48
    ├── 2a0e:f43:0:100:-56
    │   └── NEXT-ONE
    ├── 2a0e:f43:0:fd00::-56
    │   ├── 2a0e:f43:0:fd00::1
    │   └── INTERCO-WG1
    ├── 2a0e:f43:0:fe00::-56
    │   ├── 2a0e:f43:0:fe00::254
    │   └── INTERCO-WG0
    ├── 2a0e:f43:0:ff00::-56
    │   └── 2a0e:f43:0:ff00::1
    └── 2a0e:f43::-56
        ├── 2a0e:f43:0:10::-64
        │   └── 2a0e:f43:0:10::1
        ├── 2a0e:f43:0:40::-64
        │   └── 2a0e:f43:0:40::1
        ├── 2a0e:f43:0:60::-64
        │   └── 2a0e:f43:0:60::1
        ├── 2a0e:f43:0:70::-64
        │   └── 2a0e:f43:0:70::1
        └── 2a0e:f43:0:80::-64
            └── 2a0e:f43:0:80::1
Note: This predates the move to the feda netblock.
However, in the end, editing files was not easy because I had to escape all the : in my shell. I had a lot of fun creating this directory tree, but it was time to move on to something more practical.
I went for a single text file in a JSON-inspired format. Here's what it looks like:
$ head -n 30 ipam.txt
ANNOUNCED BY BGP-YYZ
2001:67c:291c::/48 {
    2001:67c:291c::1 { anycast.chown.me }
}
ANNOUNCED BY BGP-YYZ, NS4
2602:feda:b8e::/48 {
    ANNOUNCED BY pancake
    2602:feda:b8e::/56 {
        2602:feda:b8e:10::/64 {
            LAN
            2602:feda:b8e:10::1 { pancake:vlan10 }
        }
        2602:feda:b8e:40::/64 {
            PHONE
            2602:feda:b8e:40::1 { pancake:vlan40 }
        }
        2602:feda:b8e:60::/64 {
            WINDOWS
            2602:feda:b8e:60::1 { pancake:vlan60 }
        }
        2602:feda:b8e:80::/64 {
            RTBH
            2602:feda:b8e:80::1 { pancake:vlan80 }
        }
    }
[...]
Note that here RTBH is only the name I gave the network; it's not related to actual RTBH.
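The addresses in that file suggest a simple numbering convention: the VLAN ID, read as hexadecimal digits, becomes the subnet part of the /64 (vlan10 lives in 2602:feda:b8e:10::/64, and so on). That convention is inferred from the excerpt rather than stated anywhere, so treat the following Python sketch as an assumption. The function name vlan_subnet is invented for illustration.

```python
import ipaddress

# The home /56 as it appears in the IPAM excerpt.
HOME_BLOCK = ipaddress.ip_network("2602:feda:b8e::/56")


def vlan_subnet(vlan_id: int, block=HOME_BLOCK):
    """Map a VLAN ID to a /64 by reading its decimal digits as hex.

    Assumed convention (inferred, not confirmed): VLAN 10 -> ...:10::/64.
    """
    index = int(str(vlan_id), 16)        # "10" -> 0x10
    slots = 2 ** (64 - block.prefixlen)  # number of /64s available in the block
    if not 0 <= index < slots:
        raise ValueError(f"VLAN {vlan_id} does not fit in {block}")
    base = int(block.network_address)
    # The subnet index sits just above the 64 interface-identifier bits.
    return ipaddress.ip_network((base + (index << 64), 64))


if __name__ == "__main__":
    for vlan in (10, 40, 60, 80):
        net = vlan_subnet(vlan)
        # net[1] is the first address in the subnet, used by the router in the file.
        print(f"vlan{vlan}: {net} router {net[1]}")
```

With a /56 there are 256 slots, so any VLAN ID whose digits read as a hex value below 0x100 fits under this scheme.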
I manage the file with vim and I can easily (un)fold any level whether I want an overview or a detailed view. Also this may not be entirely up to date haha.
My initial plan was to get some VMs around the world and announce the /48 on each. Easier said than done, because my requirements are to find a provider which:
I thought "anycast is easy, you just announce your IP everywhere, and done". Well, yes, but actually no. At least if you don't want to abide by RFC 7511. Proper routing requires a lot of work.
I currently have 4 VMs in this anycast network:
This is a work in progress that probably deserves its own blog post when it's fully done, so I won't go further into details.
As you just read, I have two VMs in Toronto. I wish I could have a provider in Montreal to reduce latency, unfortunately I've not been able to find one quite yet.
I had to choose some tunnelling technology. I picked WireGuard® because it had recently made it into the OpenBSD kernel (see wg(4)) and my experience with IPsec is as "good" as the next person's.
My current setup is:
~/git/git.chown.me/ipam (master=)$ cat schema.txt
Upstream 1        Upstream 2
    |                 |
    |                 |
    R1------ wg ------R2
    |                 |
    wg                wg
    |                 |
    -------- R3 -------
R1 and R2 are my VMs in Toronto, and R3 is my router at home. Yes, my router at home uses BGP, both to announce its own netblock over BGP and to choose the best route between R1+Upstream 1 and R2+Upstream 2. Isn't that super cool??! :D
R1 and R2 both announce my /48 to their provider. They do so with my public ASN.
They have a wg link between each other. The goal is twofold:
Case 1 isn't actually a problem. Once the session with the upstream fails, the router won't get the full view anymore, which means R3 won't get the full view from that router, and it will send traffic only to the other. Traffic to me will switch automatically provided the upstream stops announcing my route (it should, but sometimes it doesn't).
I prepend that path with my ASN 15 times (picked by "should be good enough lol") to avoid using it under normal conditions.
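To illustrate why prepending works, here is a toy Python sketch of one step of BGP best-path selection: all else being equal, the shorter AS path wins. Real implementations compare other attributes (such as local preference) before path length, so this is a simplification. The ASNs 64496 and 64497 are documentation ASNs standing in for real networks, and the function names are invented for this example.

```python
MY_ASN = 211935  # public ASN visible in the bgpctl output


def prepend(as_path, asn, times):
    """Return a new AS path with `asn` added `times` times at the front."""
    return [asn] * times + list(as_path)


def shortest_as_path(candidates):
    """Return the name of the candidate route with the shortest AS path."""
    return min(candidates, key=lambda name: len(candidates[name]))


if __name__ == "__main__":
    upstream_path = [64496, 64497]  # hypothetical path to some destination
    candidates = {
        "direct-upstream": upstream_path,
        "via-wg-backup": prepend(upstream_path, MY_ASN, 15),
    }
    for name, path in candidates.items():
        print(f"{name}: length {len(path)}")
    print("preferred:", shortest_as_path(candidates))
```

The backup path stays in the table and only takes over when the direct one disappears.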
This simple link was actually quite a big change because until then, R1 and R2 used to do some stateful firewalling (in addition to the one done on R3). However, this change meant traffic could flow asymmetrically, so I had to switch to a stateless firewall (which I restricted to the specific network; the rest of the traffic is still checked by pf(4) with stateful rules).
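The problem with asymmetric paths can be shown with a deliberately tiny model in Python. It is not how pf(4) tracks state (real states include ports, protocol, and more); the class and host names are made up to show the idea only.

```python
class StatefulFirewall:
    """Toy stateful filter: replies pass only if this box saw the request."""

    def __init__(self):
        self.states = set()

    def outbound(self, src, dst):
        # Seeing the outgoing packet creates a state entry for the flow.
        self.states.add((src, dst))

    def inbound_allowed(self, src, dst):
        # A reply is allowed only when it matches an existing state.
        return (dst, src) in self.states


if __name__ == "__main__":
    r1, r2 = StatefulFirewall(), StatefulFirewall()
    r1.outbound("home-host", "remote-host")  # request leaves via R1
    print("reply via R1:", r1.inbound_allowed("remote-host", "home-host"))
    print("reply via R2:", r2.inbound_allowed("remote-host", "home-host"))
```

The reply arriving through the router that never saw the request has no matching state, which is why rules for that traffic need to be stateless.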
R3 announces the /56 I have at home over BGP to R1 and R2. "But this is inter AS, why didn't you use an IGP???". Well wg(4) doesn't support multicast, and ospf6d (and even eigrpd) needs it. You can do without buuuut... I tried and struggled with ospf6d, so sticking with bgpd was way easier.
Fun fact: I even began to write my own igpd, but I quickly realized I was just reimplementing bgpd poorly so I aborted.
I actually use a private ASN to announce the /56. I picked 4200211935, so it's obviously both "it's my ASN", and "it's not my ASN":
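The joke in that number can be checked with a few lines of Python. The ranges below are the private-use ASN ranges reserved by RFC 6996; the function name is made up for this example.

```python
# Private-use ASN ranges from RFC 6996 (both ends inclusive).
PRIVATE_16BIT = range(64512, 65534 + 1)
PRIVATE_32BIT = range(4200000000, 4294967294 + 1)

PUBLIC_ASN = 211935
HOME_ASN = 4200211935


def is_private_asn(asn: int) -> bool:
    """Return True if asn falls in a private-use range."""
    return asn in PRIVATE_16BIT or asn in PRIVATE_32BIT


if __name__ == "__main__":
    print("home ASN is private:", is_private_asn(HOME_ASN))
    print("public ASN is private:", is_private_asn(PUBLIC_ASN))
    # The private ASN is the start of the 32-bit private range plus the public ASN.
    print("offset from range start:", HOME_ASN - PRIVATE_32BIT.start)
```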
danj@bgp-yyz:~$ bgpctl sh
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
pancake-6          4200211935      17289    2134334     0 5d22h44m      1
ns4-6                  211935    1213381    1930550     0 5d23h30m 134718
xenyth-6                62513    1945805      17297     0 6d00h06m 138770
Of course, since I announce a /56 from a private ASN, I needed to stop checking RPKI for this particular host. Fortunately, bgpd's rules system is really easy to work with.
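To see why that announcement would fail validation, here is a rough Python sketch of route origin validation in the spirit of RFC 6811. The ROA below (a /48 with max length 48 for AS211935) is a guess at what such a ROA might look like, not the actual published data, and the names are invented for this example.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix, origin_asn, roas):
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs that do not cover the route at all.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches only if length and origin both agree.
        if route.prefixlen <= roa.max_length and origin_asn == roa.asn:
            return "valid"
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    # Hypothetical ROA set, for illustration only.
    roas = [ROA("2602:feda:b8e::/48", 48, 211935)]
    print(validate("2602:feda:b8e::/48", 211935, roas))
    print(validate("2602:feda:b8e::/56", 4200211935, roas))
    print(validate("2001:db8::/48", 64496, roas))
```

Under these assumptions, the /56 is covered by the ROA but is both too specific and from the wrong origin, so a router that drops invalid routes would reject it unless told otherwise.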
Of course everything runs OpenBSD! It has a lovely bgpd in base. OpenBSD ships rpki-client which one can use to validate ROA ("improve the routing security" in layman's terms).
OpenBSD developers changed OpenBGPD's config since I last used it. The thing I worry the most about is messing up what I announce to my peers. They must have filters, but I don't want to be that guy. OpenBGPD's config file is designed so that it's hard to mess up, thanks to sane defaults and clean logic.
It ships with an excellent example config file, making it easy to get started! For that reason, I'm not going to detail mine.
OpenBGPD uses little memory:
danj@ns4:~$ bgpctl show rib nei vultr-6 in | wc -l
  135254
danj@ns4:~$ bgpctl show rib nei bgp-yyz-6 in | wc -l
  139312
danj@ns4:~$ bgpctl show rib memory
RDE memory statistics
    139583 IPv6 unicast network entries using 7.5M of memory
    279161 rib entries using 17.0M of memory
    823926 prefix entries using 101M of memory
    156446 BGP path attribute entries using 10.7M of memory
           and holding 823926 references
    138180 BGP AS-PATH attribute entries using 11.6M of memory
           and holding 156446 references
       819 entries for 6470 BGP communities using 178K of memory
           and holding 823926 references
      6803 BGP attributes entries using 266K of memory
           and holding 41980 references
      6802 BGP attributes using 54.1K of memory
    306537 as-set elements in 280152 tables using 10.9M of memory
    511038 prefix-set elements using 21.6M of memory
RIB using 148M of memory
Sets using 32.5M of memory

RDE hash statistics
    path hash: size 131072, 156446 entries
        min 0 max 8 avg/std-dev = 1.194/0.759
    aspath hash: size 131072, 138180 entries
        min 0 max 8 avg/std-dev = 1.054/0.943
    comm hash: size 16384, 819 entries
        min 0 max 3 avg/std-dev = 0.050/0.000
    attr hash: size 16384, 6803 entries
        min 0 max 5 avg/std-dev = 0.415/0.000
Most VMs have only 1 GB of RAM and 1 CPU.
danj@ns4:~$ top -b -ores
load averages:  0.01,  0.05,  0.02    ns4.chown.me 20:35:57
65 processes: 1 running, 63 idle, 1 on processor  up 13 days,  3:58
CPU states:  2.9% user,  0.0% nice,  2.1% sys,  0.0% spin,  0.1% intr, 94.9% idle
Memory: Real: 411M/713M act/tot Free: 256M Cache: 152M Swap: 192M/512M

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
90287 _bgpd      2    0  231M  238M sleep     poll     36:22  0.00% bgpd
88230 _bgpd      2    0   26M   30M idle      poll      8:29  0.00% bgpd
61228 root       2    0   20M   22M sleep     poll     16:59  0.00% bgpd
[...]
I didn't want to run rpki-client on each and every router. I couldn't either because it uses a truckload of inodes and my /var/ partitions couldn't afford it.
I considered using RTR, however it meant running more software (e.g. gortr/stayrtr).
Also, bgpd doesn't support (yet?) encrypted RTR, so it would have meant either doing RTR unencrypted (yuck), or running even more software.
What I ended up doing is running rpki-client on my web server (on which I added a special partition with way more inodes).
42 * * * * -n rpki-client -v && \
    cp /var/db/rpki-client/openbgpd /var/www/static.chown.me/pub/rpki/openbgpd && \
    gzip -f /var/www/static.chown.me/pub/rpki/openbgpd
And on my bgpd routers:
57 * * * * -n ftp -o /var/db/rpki-client/openbgpd.gz https://static.chown.me/pub/rpki/openbgpd.gz && \
    gunzip -f /var/db/rpki-client/openbgpd.gz && \
    bgpd -n && bgpctl reload
15 minutes ought to be enough. It used to run in 5 minutes, but apparently it now runs in around 8 minutes. I guess I should set up some monitoring haha.
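One possible shape for that monitoring is a freshness check on the fetched file. The Python sketch below assumes the file's modification time reflects the last successful update, and the two-hour threshold is an arbitrary choice. The path is the one from the crontab above; everything else is made up for illustration.

```python
import os
import time

ROA_FILE = "/var/db/rpki-client/openbgpd"  # path used in the crontab
MAX_AGE = 2 * 60 * 60                      # assumed threshold: two hourly runs


def is_stale(path, max_age_seconds, now=None):
    """Return True if path is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return True  # a missing file is as bad as an outdated one
    return now - mtime > max_age_seconds


if __name__ == "__main__":
    if is_stale(ROA_FILE, MAX_AGE):
        print(f"WARNING: {ROA_FILE} is missing or older than {MAX_AGE} seconds")
    else:
        print(f"{ROA_FILE} looks fresh")
```

A script like this could run from cron on each router and send mail when it prints a warning.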
Of course, I found some improvements for the software I use through this project. Here are some fixes that made it into the OpenBSD trees because of my playing around:
Of course this weird hobby of mine costs money. I'm however very happy with how low I could keep my expenses.
Here's what I paid Grifon:
Out of 4 VMs I run BGP on, I've been using one for other things, so I'm not counting it since I would pay for it regardless of this project.
Here's what I pay for the host:
Even though I had messed around with BGP before, I hadn't really gone deeper than the surface. Since I had a lot to learn network engineering-wise, I read a lot of stuff. Among everything, I highly recommend the BGP For All playlist from NSRC.
The "Providers that offer BGP sessions" Google Doc was incredibly helpful.
Probably no.
While the resources I'm using are plentiful (32-bit ASNs, 128-bit IP addresses), the TCAM in people's routers is not.
My 'experiment' is 3 netblocks out of the ~130k in the DFZ.
Note that I'm definitely not the first person to get an ASN for personal use. Once you begin looking into ASNs, you'll find plenty.
If you really want to play with BGP, you can look into dn42!
I've been doing this project for a bit over a year now.
There were some boring tasks (the perpetual quest to find hosters who don't suck, administrative things to get the resources, etc), but overall, this project has been incredibly fun!
Yeah sex is good, but have you tried running mtr(8) while shutting down a BGP session or remotely triggering a black hole, and watching the traffic change?