Five rounds of hardening on a working site
RouteLog.wiki is a small wiki for USPS mail carriers. Each route gets a page. Carriers document stops, CBUs, gate codes, hazards, dog locations, whatever would have saved them time on their first day running that route cold. It went live in March. Word of mouth brought 21 real carriers through registration in the first ten days.
Then I ran the registration flow myself and it broke. A missing parameter in a SQL INSERT had been there since launch. New signups had been silently failing for several days, and the only reason I knew was that a carrier sent me a screenshot and asked if the site was up.
That was the moment that turned a toy into a site that needed to be audited like infrastructure. What follows is the actual five rounds of work, shipped against the live site, in the order they happened.
Round 1: stop the bleeding
The registration fix was one line. The rest of Round 1 was everything I should have done before launch and had not. Secrets came out of the Flask config and into /etc/routelog.env, mode 600 and owned by root:root, loaded by systemd via EnvironmentFile. The hardcoded fallback SECRET_KEY came out too: the app now crashes if the key is missing, which is the behavior I want on a server I might rebuild in a hurry some day.
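A minimal sketch of the fail-fast load, assuming the key reaches the process as an environment variable named SECRET_KEY (which is what EnvironmentFile provides). The helper name load_secret_key is hypothetical, not the site's actual code.

```python
import os


def load_secret_key(environ=os.environ):
    """Return SECRET_KEY from the environment or refuse to start.

    There is deliberately no fallback value: a missing key should stop
    the app at boot rather than let it run with a guessable default.
    """
    key = environ.get("SECRET_KEY", "").strip()
    if not key:
        raise RuntimeError("SECRET_KEY is not set; refusing to start")
    return key
```

In a Flask app this would be called once at startup, for example app.config["SECRET_KEY"] = load_secret_key(), so a misconfigured rebuild fails immediately instead of serving sessions signed with a known key.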
I found a guessable metrics token in a Git-tracked config and replaced it with secrets.token_urlsafe(32). I found CSV exports of user emails and IPs sitting in an exports/ directory that was being served by Apache, and deleted them. I enabled UFW with only ports 22, 80, and 443 open, and added a 512 MB swap file so the 1 GB Lightsail instance could survive a Gunicorn fork without running out of memory.
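For reference, this is roughly what the token replacement looks like. The constant-time comparison is an addition of mine for illustration, not something the work above is confirmed to include, and both function names are hypothetical.

```python
import hmac
import secrets


def new_metrics_token():
    """Generate a URL-safe token from 32 random bytes (43 characters)."""
    return secrets.token_urlsafe(32)


def token_matches(presented, expected):
    """Compare tokens in constant time so timing does not leak a prefix match."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

The generated value belongs in the environment file alongside the other secrets, not back in a tracked config.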
Round 2: lock the doors
Logout was a GET. Any page on the internet could log a user out just by embedding the URL, for example as an image source. It is now a POST with a CSRF token, and the GET returns a 405.
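The logic can be sketched without the framework. This assumes the session behaves like a dict and the token is stored under a key named csrf_token; the function names and the 400 status for a bad token are my assumptions, while the 405 for GET comes from the description above.

```python
import hmac
import secrets


def issue_csrf_token(session):
    """Store a fresh random token in the session and return it for the form."""
    session["csrf_token"] = secrets.token_urlsafe(32)
    return session["csrf_token"]


def handle_logout(method, session, form_token):
    """Return an HTTP status code for a logout request.

    Any non-POST method gets 405. A POST whose token is missing or does
    not match the session gets 400 (assumed status). A valid POST clears
    the session and gets 200.
    """
    if method != "POST":
        return 405
    expected = session.get("csrf_token", "")
    presented = form_token or ""
    if not expected or not hmac.compare_digest(presented.encode(), expected.encode()):
        return 400
    session.clear()
    return 200
```

In Flask the method restriction is usually expressed by declaring the route with methods=["POST"], which makes the framework answer GET with 405 on its own.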
Any logged-in user could edit any office, any route, any map marker. Those endpoints now check ownership: creator or admin only, 403 JSON on the API endpoints so the map client can surface the error cleanly. Authorization at the route layer is where this should always have been; having it at the UI layer only was security theater.
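A sketch of the creator-or-admin rule as a plain function, so it can be called from every edit endpoint. The field names id, is_admin, and created_by are hypothetical, and the 401 for anonymous requests is an assumption; only the 403 JSON response is described above.

```python
def can_edit(user, resource):
    """Creator or admin only."""
    return bool(user.get("is_admin")) or user["id"] == resource["created_by"]


def authorize_edit(user, resource):
    """Return (status, json_body) for an API edit request."""
    if user is None:
        return 401, {"error": "login required"}
    if not can_edit(user, resource):
        return 403, {"error": "only the creator or an admin can edit this"}
    return 200, {"ok": True}
```

Keeping the check in one function means the office, route, and map marker endpoints all enforce the same rule, and the map client gets a JSON body it can display.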
Round 3: verification and backups
New registrations now require email verification before first login. 24-hour token, single use, hashed in the database, delivered via SES. The 21 existing users were grandfathered as verified because they were real and I knew them.
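The token lifecycle described above (24 hours, single use, stored hashed) can be sketched with the standard library. The table and column names are hypothetical, SHA-256 is an assumed choice of hash, and the SES delivery step is left out.

```python
import hashlib
import secrets
import sqlite3
import time

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24-hour lifetime


def _hash(token):
    return hashlib.sha256(token.encode()).hexdigest()


def create_token_schema(db):
    db.execute(
        "CREATE TABLE IF NOT EXISTS email_tokens ("
        " token_hash TEXT PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " expires_at REAL NOT NULL,"
        " used INTEGER NOT NULL DEFAULT 0)"
    )


def issue_token(db, user_id, now=None):
    """Return the raw token to email; only its hash is stored."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    db.execute(
        "INSERT INTO email_tokens (token_hash, user_id, expires_at)"
        " VALUES (?, ?, ?)",
        (_hash(token), user_id, now + TOKEN_TTL_SECONDS),
    )
    db.commit()
    return token


def redeem_token(db, token, now=None):
    """Return the user_id for a valid, unexpired, unused token, else None."""
    now = time.time() if now is None else now
    # The UPDATE matches only a live, unused token, so a replay changes no rows.
    cur = db.execute(
        "UPDATE email_tokens SET used = 1"
        " WHERE token_hash = ? AND used = 0 AND expires_at > ?",
        (_hash(token), now),
    )
    db.commit()
    if cur.rowcount != 1:
        return None
    row = db.execute(
        "SELECT user_id FROM email_tokens WHERE token_hash = ?",
        (_hash(token),),
    ).fetchone()
    return row[0]
```

Storing only the hash means a leaked database dump does not hand out working verification links.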
Backups were local only, which is not a backup. I set up S3 backups to a dedicated bucket with versioning and a 30-day retention policy. The daily cron now gzips the SQLite file and ships it to S3 as well as keeping it on disk.
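A sketch of the local half of that job. It uses SQLite's online backup API to take a consistent copy before compressing, which is a swapped-in technique on my part; the post only says the file is gzipped. The function and file names are hypothetical.

```python
import gzip
import shutil
import sqlite3
from pathlib import Path


def snapshot_and_gzip(db_path, out_dir):
    """Write a consistent gzipped copy of a SQLite file and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    raw_copy = out_dir / "snapshot.sqlite3"
    gz_path = out_dir / "snapshot.sqlite3.gz"

    # The backup API copies a consistent snapshot even if the app is
    # writing to the database while the cron job runs.
    src = sqlite3.connect(str(db_path))
    dst = sqlite3.connect(str(raw_copy))
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    with open(raw_copy, "rb") as f_in, gzip.open(gz_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    raw_copy.unlink()  # keep only the compressed copy
    return gz_path
```

The upload is not shown. It could be done with the AWS CLI (aws s3 cp) or boto3's upload_file; which one the cron job actually uses is not stated here.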
Round 4: scale-adjacent fixes
Rate limiting was in-memory with a defaultdict. That works for one worker. Gunicorn runs three, so each worker kept its own counts and a client could get up to three times the intended limit. I moved the limiter into a SQLite table so it is shared across workers and persists across restarts. Password policy went from a four-character minimum to eight characters with at least one letter and one number, enforced in registration, profile change, and password reset.
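A minimal sketch of a limiter backed by a SQLite table, assuming a sliding window keyed by an arbitrary string such as an IP address. The table name, column names, and the allow function are hypothetical; the site's real schema may differ.

```python
import sqlite3
import time


def create_rate_schema(db):
    db.execute(
        "CREATE TABLE IF NOT EXISTS rate_hits ("
        " key TEXT NOT NULL,"
        " ts REAL NOT NULL)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_rate_hits ON rate_hits (key, ts)")


def allow(db, key, limit, window_seconds, now=None):
    """Return True and record the hit if `key` is under `limit` for the window.

    Returns False, recording nothing, once the limit has been reached.
    """
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    with db:  # one transaction: prune old hits, count, then insert
        db.execute(
            "DELETE FROM rate_hits WHERE key = ? AND ts <= ?", (key, cutoff)
        )
        (count,) = db.execute(
            "SELECT COUNT(*) FROM rate_hits WHERE key = ?", (key,)
        ).fetchone()
        if count >= limit:
            return False
        db.execute("INSERT INTO rate_hits (key, ts) VALUES (?, ?)", (key, now))
        return True
```

Because every worker opens the same database file, all three see the same counts, and the rows survive a Gunicorn restart.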
Round 5: Wikipedia for route knowledge
The original wiki editor saved the latest version and overwrote everything else. That is terrifying in a collaborative context. Every route now has a wiki_revisions table. Every edit inserts a new row with content, editor, timestamp, revision number, char delta, and optional edit summary. The wiki page has a History button. The History page shows every revision with who, when, and a diff. Any revision can be reverted in one click. Reverts are themselves revisions, tagged as such, and they trigger notifications to subscribers of that route.
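An append-only revision table along these lines can be sketched as follows. The table name wiki_revisions is from the description above; the column names, the reverted_from tag, and the function names are hypothetical, and subscriber notifications are left out.

```python
import sqlite3
import time


def create_revision_schema(db):
    db.execute(
        "CREATE TABLE IF NOT EXISTS wiki_revisions ("
        " route_id INTEGER NOT NULL,"
        " rev_num INTEGER NOT NULL,"
        " content TEXT NOT NULL,"
        " editor_id INTEGER NOT NULL,"
        " created_at REAL NOT NULL,"
        " char_delta INTEGER NOT NULL,"
        " summary TEXT,"
        " reverted_from INTEGER,"  # set only when the row is a revert
        " PRIMARY KEY (route_id, rev_num))"
    )


def latest(db, route_id):
    """Return (rev_num, content) of the newest revision, or (0, '') if none."""
    row = db.execute(
        "SELECT rev_num, content FROM wiki_revisions"
        " WHERE route_id = ? ORDER BY rev_num DESC LIMIT 1",
        (route_id,),
    ).fetchone()
    return row if row else (0, "")


def save_edit(db, route_id, content, editor_id, summary=None, reverted_from=None):
    """Append a new revision; earlier rows are never updated or deleted."""
    with db:
        prev_num, prev_content = latest(db, route_id)
        db.execute(
            "INSERT INTO wiki_revisions VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (route_id, prev_num + 1, content, editor_id, time.time(),
             len(content) - len(prev_content), summary, reverted_from),
        )
    return prev_num + 1


def revert_to(db, route_id, rev_num, editor_id):
    """Restore an old revision by writing its content as a new, tagged revision."""
    row = db.execute(
        "SELECT content FROM wiki_revisions WHERE route_id = ? AND rev_num = ?",
        (route_id, rev_num),
    ).fetchone()
    if row is None:
        raise ValueError(f"route {route_id} has no revision {rev_num}")
    return save_edit(db, route_id, row[0], editor_id,
                     summary=f"Revert to revision {rev_num}",
                     reverted_from=rev_num)
```

The composite primary key also acts as a guard: if two workers try to write the same revision number for a route, the second insert fails instead of silently overwriting the first.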
What the five rounds were really about
The fixes were necessary. The more useful output was the habit. Every round was shipped on a live site with real users. Every change landed with a commit message that said what and why. The audit exists, in the open, because the commits exist. A carrier who cares can see the history. A carrier who does not care is still protected by it.
When the tool grows past the point where I personally know every contributor, the audit trail is what has to do the trust work. Writing it down now is cheaper than reconstructing it later.
The wiki is at routelog.wiki. The about page shows the live counts.