Automate Around A Country (Part 1)

Have you found yourself using this argument?

I’d learn some automation techniques, but it’s so much easier to log into the router and make my changes. I don’t really need to learn the API to do the work. I’ll spend 3 months for something I can do in 10 minutes. 

I think some of this is a problem for enterprise network engineers because it really is easy to just fire up that SSH client and make the required change. Most enterprise engineers aren’t doing massive moves-changes-adds-deletes (MACD) where scripting helps. Most don’t even have a DevOps staff to help with scripting, nor do these poor souls have time to learn how to leverage systems like Ansible or Gluware. It’s the time-resources-money triangle, and something is going to suffer: a project deadline, budget, or staff time. For years I was in the same boat until a pretty cool challenge came across my desk.

Every network technologist in a global environment knows about the Great Firewall of China. If you’ve never heard of it, you will. The Chinese government is serious about squashing dissident thought and the best way to do that is to filter Internet connectivity. They don’t like things like alternate views or IPSEC/SS. You may have a tunnel working one day, and the next, kah-blewey. In my particular instance, my APAC regional offices use a default route to Hong Kong and Internet access is broken out from there. With this topology the problem is when APAC offices need to access things like banking sites in China. The application performance from the “untrusted” to “trusted” side of the China firewall tanks, or is blocked altogether.

How do I get around it?

“PAPERS, PLEASE! PAPERS!”

Build Standards, Automate

Think of the China network like this…two Internets. There’s the Intra-China-Internet and then there’s everyone else. It’s mainland and off-site. My company uses a regional MPLS for APAC for sites like Tokyo, Sydney, and Shanghai. The MPLS default route is through Hong Kong, so if it changes to the China office (as suggested by China Telecom), then all of my internal traffic not required to be filtered by the Chinese government will de facto be filtered by the Chinese government.

Why not BGP peer with a Chinese telco and distribute mainland China routes into MPLS? Small problem. ChiComs apparently hate BGP peering to Non-Government Organizations. It’s SUPER DOOPER expensive to get a TCP 179 session. Ugh! Guess that’s how they force the default route onto businesses in the country. Hopefully you’re starting to see the issue and why automation is going to help address it.

To improve my corporate customer experience, the idea was to drop in a DIA circuit from a Chinese provider (think China Telecom, or CT) and ingest the mainland-China routes from that peer. We would then redistribute those summaries into the internal MPLS network. Longest-prefix rule wins: mainland traffic egresses through China Telecom while everything else follows the default to the unfettered Internet. Well, that was the idea until we saw the cost. So how can we do the same thing without BGP to the provider?
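If the longest-prefix rule is new to you, here is a minimal Python sketch of it. The route table, the next-hop labels, and the `next_hop` function are all made up for illustration, and the prefixes are documentation ranges, not real China Telecom routes.

```python
import ipaddress

# Hypothetical route table: a default route toward Hong Kong plus one
# mainland summary pointing at the China-side circuit.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "hong-kong-default",
    ipaddress.ip_network("203.0.113.0/24"): "china-telecom-dia",
}


def next_hop(destination):
    """Return the next hop of the most specific route that contains destination."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if address in net]
    # The longest prefix (largest prefixlen) is the most specific match.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


print(next_hop("203.0.113.10"))  # china-telecom-dia
print(next_hop("198.51.100.7"))  # hong-kong-default
```

The default route matches everything, so it only wins when nothing more specific is in the table. That is the whole trick behind injecting mainland prefixes alongside the Hong Kong default.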

Regional Internet Registries (RIRs) keep the address assignments for their regions. What if we can scrape that information from the RIR, looking glass servers, or route servers via an API or script and ingest it into the firewall as static routes? Then it’s a simple redistribution to the internal network. I don’t want to be the guy verifying the routes every day and doing a MACD for every deviation in the static route entries, nor do I want to saddle the service desk with it. Dumb! A little automation to the rescue!

There are a couple of websites I’ve found that provide clear RIR information. These can be the source of route-truth for your device. Once you have this, adding the static routes is a simple API call whose syntax varies from vendor to vendor. Route servers are publicly available but that means you’d have to log into the server and grab the routes. Then you have to figure out which are specific to your region. I have been looking at Hurricane Electric’s (HE) peering with China Telecom (among others) as well as Country IP Blocks (CIPB).

Planning Time

Let’s start with using HE’s tool as our IP prefix source of truth for mainland-China. While the routes are available, they are unfortunately BGP ASN specific. That means you have to have a report of all ASNs within China, another lookup at APNIC for ASNs, search the HE peer report for each ASN, scrape all routes for that ASN, and then combine the list into one inventory file. Some of you may even be thinking, “What about the companies that are multi-homed? Couldn’t you get duplicate entries in your report?” Exactly. In your script you’ll have to build in some duplicate entry checking and it’ll have to match a CIDR mask, not just an exact match. If you have a 10.1.0.0/17 from one ASN, and a 10.1.4.0/24 from another ASN, your script will have to account for that.
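One way to handle the duplicate and overlapping entries described above is Python’s standard `ipaddress` module. This is only a sketch, not the script I’m promising for the follow-up; `merge_prefixes` is a hypothetical helper name and the input is the example from the paragraph above.

```python
import ipaddress


def merge_prefixes(prefixes):
    """Drop exact duplicates and any prefix already covered by a shorter one."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses removes duplicates and covered subnets. It also
    # merges adjacent sibling networks into their common supernet.
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


scraped = ["10.1.0.0/17", "10.1.4.0/24", "10.1.0.0/17", "10.2.0.0/16"]
print(merge_prefixes(scraped))  # ['10.1.0.0/17', '10.2.0.0/16']
```

Note that `collapse_addresses` raises a `TypeError` if IPv4 and IPv6 prefixes are mixed in one call, so a real script would probably keep the two families in separate lists.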

For a seasoned developer (which I am definitely not) this may not be an issue. Some folks may be able to knock this out in a few days. For those guys, think about the ongoing support. There are a lot of moving parts to account for and if this doesn’t concern you, awesome. Just make sure the script is well documented for the next guy! Remember that this will poll HE at a certain interval. You need some error checking, notification of success/failure, and with all the comparison logic it still needs to run relatively quickly.
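To give a feel for the error checking, here is a rough sketch of comparing the last known prefix list against a freshly polled one. The `plan_changes` name and the `max_churn` threshold are my own assumptions, not something HE or any vendor prescribes. The idea is to refuse to act when the poll comes back empty or wildly different, so a bad scrape can’t wipe out the static routes.

```python
def plan_changes(current, fetched, max_churn=0.5):
    """Return (to_add, to_remove), or raise if the fetched list looks wrong."""
    current, fetched = set(current), set(fetched)
    if not fetched:
        raise ValueError("fetched prefix list is empty; keeping current routes")
    to_add = sorted(fetched - current)
    to_remove = sorted(current - fetched)
    # Assumed safety valve: a large swing is more likely a broken poll
    # than a real change, so hand it to a human instead of pushing it.
    if current and (len(to_add) + len(to_remove)) / len(current) > max_churn:
        raise ValueError("change exceeds churn threshold; needs human review")
    return to_add, to_remove


installed = ["10.1.0.0/17", "10.2.0.0/16", "10.3.0.0/16", "10.4.0.0/16"]
polled = ["10.1.0.0/17", "10.2.0.0/16", "10.3.0.0/16", "10.5.0.0/16"]
print(plan_changes(installed, polled))  # (['10.5.0.0/16'], ['10.4.0.0/16'])
```

Set comparisons like this stay fast even with thousands of prefixes, and the raised exception is a natural hook for the success/failure notification.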

Does HE seem like a complex solution? Yep. I thought so too.

What about using CIPB as your prefix provider? This site eliminates some of the issues you’ll have with HE, which in turn makes your coding simpler. For example, they have already aggregated the country IP prefixes so you don’t need to worry about searching through each ASN. CIPB can provide the list in several formats ranging from Apache htaccess to web.config allow/deny and everything in between. Inventory files are cleaner now.

The trial license of CIPB allows for up to 100 queries per day and their IP prefixes are updated every 90 days. The licensed version (as of the time of this writing, $399/year) updates every 4 hours and removes that query limit. The nice thing about this service is you can download IP subnets in CIDR format, which is a common syntax supported by most vendors. This makes it an easy matter to prepend the command to the prefix variable sucked in from CIPB. HE offers an API to grab this kind of information, but in most instances it’s prohibitively expensive. HE sales said it was in the neighborhood of $20,000/month. Frankly, for API access, that neighborhood is full of Bentleys and too rich for my blood.
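Prepending the command to each prefix can be sketched in a few lines of Python. I’m assuming the download is plain text with one CIDR per line and `#` comment lines, which may not match the real file exactly. The output uses Cisco IOS-style `ip route` syntax purely as an example, since your firewall’s syntax will differ. The `static_routes` helper, the sample prefixes, and the next-hop address are all hypothetical.

```python
import ipaddress


def static_routes(cidr_text, next_hop):
    """Turn one-CIDR-per-line text into IOS-style static route commands."""
    commands = []
    for line in cidr_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # ip_network raises ValueError on malformed input, so bad data
        # fails loudly instead of becoming a bad route.
        net = ipaddress.ip_network(line)
        commands.append(f"ip route {net.network_address} {net.netmask} {next_hop}")
    return commands


sample = "# sample list\n203.0.113.0/24\n\n198.51.100.0/25\n"
for command in static_routes(sample, "192.0.2.1"):
    print(command)
# ip route 203.0.113.0 255.255.255.0 192.0.2.1
# ip route 198.51.100.0 255.255.255.128 192.0.2.1
```

Swapping the f-string for another vendor’s syntax, or for an API payload, is the only part that should need to change.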

Remember, most automation scripts require an “inventory” list. This contains things like hostnames, subnets, and/or SNMP communities: the common attributes in your network that can be imported and then acted on. What’s good/bad about HE or CIPB? In this scenario, I like the possibility that HE will be around longer than CIPB, but I also like the benefit of using CIPB for its IP prefix inventory format.

Wrapping It Up

Just a few things to say in summary. With the problem I’m trying to overcome, the hardest part was finding a source of truth that could be grabbed quickly. I thought about checking the RIRs, whois, route servers, and looking glass servers. None of these really gave me a simple, scalable, or even long-term option for pulling regional routes. With this exercise, I’ve found myself trying to think like a developer even though it’s not something I’ve had a lot of experience with. But, here are just a few items rattling around the ol’ melon:

  • What would “I” need if I were to automate this task? Forget the script, what information is required?
  • What are my variables? How much do they change and in what timeframe?
  • If I grab data from an external source, how long will it be there? How accurate is it? Can I get this same information internally?
  • What if my external source goes out of business? Will it impact my operations? Do I have another option?
  • Great, I’ve got a reliable source but there’s no API. Can I scrape this data from the PHP page?

I’m planning to write an article discussing the coding required to make this happen, but first I wanted to go over some of the hurdles. You don’t want to just start scripting without a plan. That’s just a recipe for disaster and frustration. I’m a big fan of planning out what you’re doing before doing it. The activation should take the shortest amount of time if you’ve properly planned.

All of these are important questions. If you’re new to the discipline of automation, it’s going to be really important to have a resource to help you through these things. A lot of guys say automate your basic tasks. Some of us didn’t have that option before a bigger project was tossed in our laps. My follow-up to this article will be the code required to ingest external data to internal devices. At this time, still developing…

I am a full-time Network Engineer for a leading high tech manufacturing company in Austin, TX. My career in the IT industry has stretched over 20 years, primarily in the enterprise space across the financial, biomedical, and high tech sectors. Check the About page.