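This appears to be caused by a double quote character whose UTF-8 bytes do not decode to the same character under cp1252, which is typically the default encoding for open() on Windows. The fix is on line 72, where the file is now opened with an explicit encoding:
soup = BeautifulSoup(open(os.path.join("doc.qt.io", c[-1]), encoding="utf8"), "lxml")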
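
For anyone who wants to reproduce the mismatch, here is a minimal sketch. It assumes the offending character is a curly double quote (U+201C or U+201D); the actual character in the Qt docs was not identified, so treat this as an illustration rather than a diagnosis.

```python
# Assumption: the problem character is a curly double quote.
left_quote = "\u201c"   # left double quotation mark
right_quote = "\u201d"  # right double quotation mark

# UTF-8 stores each of these as three bytes.
left_bytes = left_quote.encode("utf-8")    # b'\xe2\x80\x9c'
right_bytes = right_quote.encode("utf-8")  # b'\xe2\x80\x9d'

# Read back as cp1252, the left quote silently turns into mojibake...
left_mojibake = left_bytes.decode("cp1252")

# ...while the right quote fails outright, because byte 0x9d is
# unassigned in Python's cp1252 codec.
try:
    right_bytes.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Decoding with the encoding the data was written in round-trips cleanly,
# which is what encoding="utf8" does for open().
roundtrip = right_bytes.decode("utf-8")
```

Passing the encoding explicitly also makes the script behave the same on every platform, instead of depending on the locale default.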