Hiatus
If anyone was watching for changes on the blog, there wasn’t anything to see from sometime in January 2025 until now. There is a tale there.
In January of 2025, I was browsing domains I host, and there were redirects I hadn’t added. Getting on the server confirmed that some hacker/cracker had compromised multiple sites, and, indeed, root-accessible content on the host. I contacted Marc, and spent about three hours getting content off the VM partition, and then we nuked the whole VM. (“Nuke it from space” “It’s the only way to be sure.”)
That left me with a problem. I host a bunch of domains, including this one, and it was obvious that the technical debt bill had come due. Simply setting up Apache, MySQL, and the rest was just asking for the same thing to happen yet again.
I used OpenAI’s ChatGPT for a solution that would help to minimize the blast radius on compromised sites in the future. What it came up with was a system where all the domain hosting is handled via Docker containers. Each web app gets its own stack. Every static source gets its own server. Routing and host-wide security are run from their own Docker containers. So long as I maintain separate credentials for everything, that should mean that compromise of one site no longer threatens the host and every other domain I have.
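The isolation idea above can be sketched in a few lines of Python. This is only an illustration of the principle, not the configuration actually running on the host: the `proxy` network name, the `DB_PASSWORD` variable, and the helper names `site_stack` and `isolation_problems` are all assumptions made up for the example. It builds a Compose-style description for each site and then checks that no two sites share a private network or a credential.

```python
import secrets

PROXY_NET = "proxy"  # assumed name for the shared network the routing container sits on


def site_stack(domain: str, dynamic: bool) -> dict:
    """Return a Compose-style description of one site's isolated stack."""
    slug = domain.replace(".", "_").replace("-", "_")
    private_net = f"{slug}_private"  # reachable only by this site's containers
    services = {
        f"{slug}_web": {"networks": [PROXY_NET, private_net], "environment": {}},
    }
    if dynamic:
        # Each dynamic site gets its own database container and its own password.
        password = secrets.token_urlsafe(24)
        services[f"{slug}_web"]["environment"]["DB_PASSWORD"] = password  # hypothetical variable name
        services[f"{slug}_db"] = {
            "networks": [private_net],  # never attached to the proxy network
            "environment": {"DB_PASSWORD": password},
        }
    return {"services": services, "networks": [private_net]}


def isolation_problems(stacks: dict[str, dict]) -> list[str]:
    """List any private network or credential shared between two sites."""
    problems = []
    seen_nets: dict[str, str] = {}
    seen_secrets: dict[str, str] = {}
    for domain, stack in stacks.items():
        for net in stack["networks"]:
            if net in seen_nets:
                problems.append(f"{domain} shares network {net} with {seen_nets[net]}")
            seen_nets[net] = domain
        for service in stack["services"].values():
            for value in service["environment"].values():
                if seen_secrets.get(value, domain) != domain:
                    problems.append(f"{domain} reuses a credential from {seen_secrets[value]}")
                seen_secrets[value] = domain
    return problems


if __name__ == "__main__":
    stacks = {
        "talkorigins.org": site_stack("talkorigins.org", dynamic=False),
        "austringer.net": site_stack("austringer.net", dynamic=True),
    }
    print(isolation_problems(stacks) or "no shared networks or credentials")
```

The point of the check is the one made above: only the web-facing container of each site touches the shared routing network, and a database password leaked from one site is useless against any other.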
So, containers were the way forward. The next step was what to put in them. Simply restoring possibly compromised dynamic content was out, so that left static sites. A couple of my domains already hosted mostly static content, so those worked for a simple transition. That included the TalkOrigins Archive. The total downtime for the Archive due to the cracking incident was under 24 hours, counting from recognition of the problem through backups, VM nuking, VM creation, host configuration with Docker Compose, building and deploying the new Docker-based software stack, and provisioning data. Not great, but also not as bad as it might have been.
Doing something about the dynamic sites was more difficult. I didn’t really have tooling then to work through getting those re-established from their state there at the end of the old VM. But ChatGPT pointed out that there was a Ruby tool to recover pages out of the Internet Archive (https://archive.org), ‘wayback-machine-downloader’. Let’s call it WMD for short. I hadn’t been aware of it before, but the WMD offered a stopgap: recover recently-served pages and make a static representation of the original dynamic site. Some experimentation showed that this worked, more or less. There were some issues with recovery of images and assets, which I addressed with some additional code as a separate effort, helping fill in missing graphical elements that WMD left blank. So for the preponderance of domains I had, I applied the WMD and fixup scripting and got static versions out, which allowed me to get at least some form of the remainder of sites serving again, including TalkDesign.org and this blog, austringer.net/wp.
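For a sense of what that kind of fixup scripting involves, here is a minimal sketch, not the actual code used. It assumes the recovered site sits in a local directory laid out by URL path, and it only reports which same-domain images, scripts, and stylesheets the pages reference but the directory lacks. Fetching the missing files is left out, and the names `AssetCollector` and `missing_assets` are made up for the example.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse


class AssetCollector(HTMLParser):
    """Collect the URLs a page references through img, script, and link tags."""

    ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self) -> None:
        super().__init__()
        self.assets: list[str] = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.assets.append(value)


def missing_assets(site_root: Path, domain: str) -> list[str]:
    """Return same-domain asset URLs referenced by pages but absent on disk."""
    missing = set()
    for page in site_root.rglob("*.html"):
        collector = AssetCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        # Assumes the directory layout mirrors the URL paths of the original site.
        page_url = f"https://{domain}/{page.relative_to(site_root).as_posix()}"
        for ref in collector.assets:
            url = urlparse(urljoin(page_url, ref))
            if url.netloc != domain:
                continue  # third-party assets are out of scope here
            if not url.path or url.path.endswith("/"):
                continue  # directory-style URLs are pages, not asset files
            if not (site_root / url.path.lstrip("/")).is_file():
                missing.add(url.geturl())
    return sorted(missing)
```

Calling `missing_assets(Path("recovered/example.org"), "example.org")` on a hypothetical recovered directory would give the list of files to go looking for in the archive.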
Other things intervened, and the hurdle of figuring out simultaneous software upgrades with security fixes always made it easier to defer fully restoring dynamic content here.
More recently, tools like Anthropic’s Claude Code and OpenAI’s Codex have made more complex software problems and development approachable. Getting acquainted with this class of tool saw some immediate use in doing things like migrating the Avida-ED website from its old host at MSU to simpler GitHub Pages hosting (https://avida-ed.github.io). With that experience, I started tackling some personal projects of greater complexity. And that led to the first major dynamic content recovery project here, getting this blog operating again.
After launching the tool, I prompted with a summary description of the issues, including the need to check for and mitigate any residual problems from the compromise, pointing to the static site content and the backup taken from the last day of the old VM. Over the course of about two hours, it worked through getting a new instance of WordPress running on a different subdomain, invalidating old logins and cookies, disabling old plugins, writing a compatibility layer to handle shortcodes and post content affected by the absence of some truly ancient plugins, and finding and fixing remaining unresolved content links. After various checks on the live site at its temporary subdomain, the blog was ready to cut over to its usual place. There were a few trailing effects regarding exact edit-update paths, but this post will test those fixes.
So as I post this, there is a chunk of information learned from this process to be applied to other dynamic sites, especially the WordPress-based ones. I will probably aim to do one each weekend for a while. And I hope to be able to resume at least some posting here now that the software is back in place to support that.
Other web-hosting work has happened as well. I’ve made a site that is largely static, but has interactive elements, and which is aimed at learning and teaching evolutionary biology, ecology, and evolutionary computation: https://evo-edu.org . There is work on the TalkOrigins Archive (TOA), including a new feedback system (https://talkorigins.org/fb) and a site redesign effort (demo sample at https://www2.talkorigins.org) that uses responsive design for use on mobile devices as well as the desktop. I’m working on software to support an update of the TOA’s keyword-indexed bibliography; Marty Leipzig contributed the content of the bibliography back in 1993, so there is scope to add to that, plus a variety of keywords that are useful now that weren’t even coined back then. I have a repo for that as a project called ‘CiteGeist’ at https://git/cns.fyi/welsberr/CiteGeist . There’s a search capability in development that will allow selectable searches of the various TalkOrigins Foundation site content: TalkOrigins, TalkDesign, and PandasThumb. There’s work being done to make the PT pre-2016 comment base a searchable resource.
It took a while, but I think the hosting basis has improved, and certainly the ability to make progress on tasks has advanced considerably since the wake-up call last year.