peering_filters: checking "is ip in subnet" takes 26% execution time #33

Open
mrngm opened this issue Dec 26, 2020 · 0 comments
mrngm commented Dec 26, 2020

After another optimization (see #31), I looked into other parts of the code that could be optimized. I ran peering_filters without the all argument (so no calls to bgpq3), using locally cached copies of the 2 PeeringDB JSON files and the ColoClue peers YAML. pprofile reports the following:

Command line: ./peering_filters
Total duration: 42.9381s
File: /home/mrngm/.local/lib/python3.9/site-packages/ipaddr.py
File duration: 11.2164s (26.12%)

The responsible call in peering_filters:

for asn in peerings:
    # [..]
    for session in sessions:
        session_ip = ipaddr.IPAddress(session)
        for ixp in ixp_map:
            for subnet in ixp_map[ixp]:
                if session_ip in subnet: # this call

For another project, I came across PyTricia, which can efficiently determine whether an IP address (IPv4 or IPv6) falls within a given subnet. I've patched it into peering_filters as follows:

diff --git a/peering_filters b/peering_filters
index ee6c03a..7666d83 100755
--- a/peering_filters
+++ b/peering_filters
@@ -23,6 +23,8 @@ import sys
 import time
 import yaml
 
+import pytricia
+
 
 def download(url):
     try:
@@ -94,8 +96,11 @@ with open("./cc-peers.yml") as f:
 ixp_map = {}
 router_map = {}
 for ixp in generic['ixp_map']:
-    ixp_map[ixp] = [ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv4_range']),
-                    ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv6_range'])]
+    #ixp_map[ixp] = [ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv4_range']),
+    #                ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv6_range'])]
+    ixp_map[ixp] = pytricia.PyTricia()
+    ixp_map[ixp].insert(generic['ixp_map'][ixp]['ipv4_range'], ixp)
+    ixp_map[ixp].insert(generic['ixp_map'][ixp]['ipv6_range'], ixp)
     router_map[ixp] = []
     for router in generic['ixp_map'][ixp]['present_on']:
         router_map[ixp].append(router)
@@ -287,11 +292,10 @@ for asn in peerings:
     else:
         continue
 
-    for session in sessions:
-        session_ip = ipaddr.IPAddress(session)
+    for session_ip in sessions:
         for ixp in ixp_map:
-            for subnet in ixp_map[ixp]:
-                if session_ip in subnet:
+            for im_circumventing_fixing_this_large_indentation_block in [1]:
+                if session_ip in ixp_map[ixp]: # pytricia lookup
                     print("found peer %s in IXP %s" % (session_ip, ixp))
                     print("must deploy on %s" % " ".join(router_map[ixp]))
                     description = peerings[asn]['description']
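
To illustrate the kind of lookup the patch relies on, here is a minimal sketch of a longest-prefix-match trie in pure Python. It is not PyTricia's implementation: it uses the standard library `ipaddress` module instead of `ipaddr`, the `PrefixTrie` class is hypothetical, and the prefixes and IXP names are made-up documentation values. The point is only that one lookup walks at most one bit per prefix-length step, rather than testing every subnet in turn.

```python
import ipaddress


class PrefixTrie:
    """Toy binary trie keyed on address bits (illustration only)."""

    def __init__(self):
        # Separate roots so IPv4 and IPv6 bit strings never mix.
        self._roots = {4: {}, 6: {}}

    def insert(self, prefix, value):
        net = ipaddress.ip_network(prefix)
        node = self._roots[net.version]
        bits = int(net.network_address)
        width = net.max_prefixlen
        # Walk (and create) one node per bit of the prefix.
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["value"] = value

    def lookup(self, address):
        """Return the value of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        node = self._roots[addr.version]
        bits = int(addr)
        width = addr.max_prefixlen
        best = node.get("value")
        for i in range(width):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.get(bit)
            if node is None:
                break
            best = node.get("value", best)
        return best


# Hypothetical IXP ranges (documentation prefixes, not real peering LANs).
trie = PrefixTrie()
trie.insert("192.0.2.0/24", "ixp-a")
trie.insert("2001:db8:a::/64", "ixp-a")
trie.insert("198.51.100.0/24", "ixp-b")
trie.insert("2001:db8:b::/64", "ixp-b")

print(trie.lookup("192.0.2.10"))     # ixp-a
print(trie.lookup("2001:db8:b::1"))  # ixp-b
print(trie.lookup("203.0.113.5"))    # None
```

Because every prefix here maps to its IXP name, a single trie can answer "which IXP is this session on?" directly. The patch above keeps one container per IXP and still loops over `ixp_map`, so this is a possible follow-up rather than a description of what the patch does.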

After profiling again:

Command line: ./peering_filters
Total duration: 27.7758s
File: ./peering_filters
File duration: 6.03189s (21.72%)
[..]
File: /home/mrngm/.local/lib/python3.9/site-packages/ipaddr.py
File duration: 0.525969s (1.89%)

Total duration dropped from 42.9 s to 27.8 s, and most of the remaining time is now spent parsing YAML.

Is this something you would consider an interesting optimization? If there is any way I can verify that this patch does not break configuration, please let me know!
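
One way to check this might be to run the old and the new membership test side by side on the same session IPs and report any disagreement. The sketch below assumes the standard library `ipaddress` module instead of `ipaddr`, and it uses a precomputed integer-range index as a stand-in for the PyTricia lookup so that it runs without extra dependencies. All function names, prefixes, and session IPs are hypothetical.

```python
import ipaddress


def find_ixp_linear(session, ixp_ranges):
    """Reference: nested-loop membership test, as in the original code."""
    ip = ipaddress.ip_address(session)
    for ixp, prefixes in ixp_ranges.items():
        for prefix in prefixes:
            if ip in ipaddress.ip_network(prefix):
                return ixp
    return None


def build_range_index(ixp_ranges):
    """Candidate: precompute (version, first, last, ixp) integer bounds."""
    index = []
    for ixp, prefixes in ixp_ranges.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            index.append((net.version, int(net.network_address),
                          int(net.broadcast_address), ixp))
    return index


def find_ixp_indexed(session, index):
    """Candidate lookup; in practice this would be the PyTricia check."""
    ip = ipaddress.ip_address(session)
    value = int(ip)
    for version, first, last, ixp in index:
        if version == ip.version and first <= value <= last:
            return ixp
    return None


def compare(sessions, ixp_ranges):
    """Return (session, expected, actual) for every disagreement."""
    index = build_range_index(ixp_ranges)
    mismatches = []
    for session in sessions:
        expected = find_ixp_linear(session, ixp_ranges)
        actual = find_ixp_indexed(session, index)
        if expected != actual:
            mismatches.append((session, expected, actual))
    return mismatches


# Hypothetical sample data (documentation prefixes only).
ixp_ranges = {
    "ixp-a": ["192.0.2.0/24", "2001:db8:a::/64"],
    "ixp-b": ["198.51.100.0/24", "2001:db8:b::/64"],
}
sessions = ["192.0.2.10", "198.51.100.7", "2001:db8:a::1",
            "2001:db8:b::ffff", "203.0.113.5", "2001:db8:c::1"]

print(compare(sessions, ixp_ranges))  # [] means both lookups agree
```

IPv6 sessions are worth including in such a check, because the patch stores both address families in one container per IXP. Diffing the generated output of the patched and unpatched script on the same cached input would be the end-to-end counterpart.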
