Universally-applicable pip design decisions

pkgsrc lists these features of pip, emphasis added:

  • All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
  • Care is taken to present useful output on the console.
  • The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
  • Error messages should be useful.
  • The code is relatively concise and cohesive, making it easier to use programmatically.
  • Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
  • Native support for other version control systems (Git, Mercurial and Bazaar)
  • Uninstallation of packages.
  • Simple to define fixed sets of requirements and reliably reproduce a set of packages.

I’d say the emphasised lines would be useful for any tool, not least a package manager.

Add MJ12bot to your web server spam lists

Multiple sites I administer saw a huge uptick in spam from a so-called mj12bot last Monday afternoon, from at least a dozen separate IP addresses. I’m assuming the bot was named for the Majestic 12, a UFO conspiracy theory which may hold as much useful information as their site.

My FreeBSD and Debian cloud VMs all had hundreds of lines of these:

[22/Jun/2020:16:55:16 +1000] "GET / HTTP/1.1" 200 29503 "-" \
"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"

I figured it was another Semalt situation, but just for fun I checked their site. It was immediately telling that most of the text went into describing how to reduce the requests from their bots. They explained how their service worked:

MJ12bot will make an up to 20 seconds delay between requests to your site - note however that while it is unlikely, it is still possible your site may have been crawled from multiple MJ12bots at the same time.

I had the same IP crawling my site more than three times a minute, so their 20-second delay claim is a lie. But even if it wasn’t, such a claim is meaningless if multiple bots are free to do this concurrently. We limit the number of people that can be on each bus to this nature reserve, but we don’t limit the number of buses.
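If you want to check a claim like this against your own logs, here is a minimal Python sketch. It assumes the standard "combined" access log layout with the client IP as the first field (the excerpt above has that part trimmed), and `requests_per_minute` is a name I made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: "combined" log format, e.g.
# 203.0.113.7 - - [22/Jun/2020:16:55:16 +1000] "GET / HTTP/1.1" 200 29503 "-" "agent"
LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def requests_per_minute(lines, agent="MJ12bot"):
    """Count requests per (IP, minute) for user agents containing `agent`."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match or agent.lower() not in match["ua"].lower():
            continue  # unparseable line, or a different client
        stamp = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z")
        counts[(match["ip"], stamp.strftime("%Y-%m-%d %H:%M"))] += 1
    return counts
```

A 20-second delay allows at most three requests a minute from one crawler, so any (IP, minute) pair with a count above three contradicts it.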

I’m curious to understand the design decisions behind bots like this, and what they technically hope to achieve with these frequent requests. It’d be trivial to diff a day’s worth of requests, realise that these kinds of sites are largely static and only change twice a day at most, and become more strategic with future requests. For bonus points they could observe the site only changes when the author is awake and writing in an Asia-Pacific timezone, so requests after those hours are pointless. They could check the TTL on the site’s RSS feed. There’s so much low-hanging fruit here, but instead they decide to spam requests to such an extent that their activities light up logs like a Christmas tree.
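To show how little effort the "diff and back off" idea would take, here is a rough Python sketch. It is not how MJ12bot or any real crawler works; `next_interval` and its default limits are hypothetical, chosen only for illustration.

```python
import hashlib


def next_interval(previous_hash, body, interval, shortest=3600, longest=86400):
    """Return (new_hash, new_interval_in_seconds) after one fetch of a page.

    `body` is the fetched page as bytes. The limits are made-up defaults.
    """
    new_hash = hashlib.sha256(body).hexdigest()
    if new_hash == previous_hash:
        # Nothing changed since last time: wait twice as long, up to a ceiling.
        interval = min(interval * 2, longest)
    else:
        # The page changed: go back to the shortest interval.
        interval = shortest
    return new_hash, interval
```

A site that changes twice a day would quickly settle into a handful of requests a day, rather than several a minute.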

Don’t bother following their advice and adding their bot to robots.txt; just block their user agents entirely, or better still tarpit them with nginx 444s. Some third-party nginx blocklists also include mj12bot.

Selling your fire extinguisher

Fire extinguisher

I already can’t remember where I read this, but it’s so great:

Selling your fire extinguisher because you’ve never used it.

Microsoft Word would have once told us, without a trace of irony, that “it’s a fragment, reconsider”. I’m so glad I live in LaTeX world now, but I digress.

It reminds me of someone I met complaining that his Australian taxes pay for Medicare, given he’d rarely had to use it. I said I agreed that not getting violently sick or injured was a real shame.

It’s so easy, and politically expedient, to deprioritise health and safety when things go well. But it’s during that time when we’re the best positioned to prepare for that unavoidable eventuality. Only fools think good times last forever.

Fire extinguisher image by TPaign on Wikimedia Commons.

Two-factor auth codes need to be chunked better

Two-factor authentication over SMS is problematic and increasingly insecure, but it’s still in wide use, and better than one-factor authentication. So while we still get these codes sent to us, how can we make them easier to use?

Most sites send a code like this:

832416

My favourite maths teacher in high school taught us that chunking is the best way to remember numbers. She said most of us would read the above number in our heads as this:

Eight Three Two Four One Six

That’s six discrete numbers to remember, and potentially mess up. So she challenged us to read the entire thing out loud as a class, as a single number:

Eight hundred and thirty-two thousand, four hundred and sixteen.

We all laughed at how absurd this long sentence was, but we soon realised she’d done it on purpose. Sneaky Ms Harris. By stringing together numbers like this, we’d read out 832 and 416. Drop the word thousand, and you only have two numbers to remember. 2FA developers, please put a space in your numbers.

And while we’re at it, I find three two-digit numbers even easier to remember:

83 24 16

Eighty-three, twenty-four, and sixteen. Donezo.

Curiously, most 2FA tokens I get today are just a single long number. It would be trivial to present these as two or three numbers, and it would make a huge difference to accessibility and usability. Heck, you could probably even extend it to eight digits if this technique were applied.
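To back up the "trivial" claim, here is a minimal Python sketch of the formatting step. `chunk_code` is a name I made up, not part of any 2FA library.

```python
def chunk_code(code, size=3):
    """Split a numeric code into space-separated groups of `size` digits."""
    # Keep only the digits, so codes that already contain spaces still work.
    digits = "".join(ch for ch in code if ch.isdigit())
    return " ".join(digits[i:i + size] for i in range(0, len(digits), size))


print(chunk_code("832416"))          # 832 416
print(chunk_code("832416", size=2))  # 83 24 16
```

The code is passed around as a string rather than an integer so leading zeros survive.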

“Your unpopular opinion about Re:Zero”

Key visual from the second season.

Ram > Rem, and not just because Rie Murakawa voices her! Via @kyoajin.

I finally got around to watching the whole first season during my morning commute last year, but never got around to reviewing it. There were aspects I shamelessly enjoyed, though one of the repeating plot devices did get old, and the gore was a bit much. One day I’ll actually review it here properly. 3/5?

Goodbye Home-Fix, and Singapore’s DIY scene

COVID has unfortunately made retail and restaurant closures seem inevitable, but I missed that Singaporean DIY outlet Home-Fix shut down in December last year. Tiffany Fumiko Tay wrote for the Straits Times:

Once a familiar name in many a Singapore shopping mall—and with more than 20 outlets at its peak—home-grown hardware chain Home-Fix will be shutting its last store here by the end of the week.

Among the challenges that the do-it-yourself (DIY) chain had to grapple with were high mall rentals, competition from e-commerce and neighbourhood stores that sell the same wares at lower prices and a sluggish economy.

We lived for a period of time in Balmoral Park, which was walking distance from their Tanglin Mall store pictured in the article. The staff were friendly and knowledgeable, and they invariably had exactly what we needed despite the store’s small size. But I have to be honest and admit I’m not surprised that online shopping snuffed them out.

Jalelah Abu Baker wrote for Channel NewsAsia that the stores were looking to reinvent themselves. But six months later, Home-Fix’s website still just has a landing page, and the footer says it’s “under judicial management”.

Photos from Home-Fix's Instagram account

Singapore doesn’t have as much of a DIY culture as other countries, though I noticed an uptick in interest just before we moved. Australia has Bunnings Warehouse which has become a cultural institution, and the US has hardware stores large enough to host Olympic events. Even Malaysia has branches of ACE Hardware that my dad and I used to spend hours wandering around in Mutiara Damansara.

I’m not sure if it’s still around either, but my other favourite hardware store in the area was in Shaw Centre. I used to go there so often when I built custom computers, I got to calling the owners uncle and auntie. Lovely people.

The inline photo was from Home-Fix’s dormant Instagram page. I wonder how many company social media accounts will be abandoned over the next year.

More FreeBSD HPE Microserver homelab answers

My homelab post generated a ton of questions and comments, most of them specific to running FreeBSD on a Gen8 HPE Microserver. I answered a question about how to boot off the optical drive SATA port, then promptly forgot to answer any more!

This post will be a grab-bag summary of some other questions that were asked a few times.

Is the budget spec’d G8 Microserver with a Celeron any good?

Short answer, yes. Here’s the specs of my lowest-end unit:

$ sysctl hw.model hw.ncpu
hw.model: Intel(R) Celeron(R) CPU G1610T @ 2.30GHz
hw.ncpu: 2

This may not seem like much, but it’s still a server board with ECC memory for your ZFS arrays, which I care about more than performance. It even has enough power for serving multiple file shares while transcoding Plex to our Apple TV at 1080p, and all under 35W TDP so the fans stay nice and quiet.

The only two things I don’t like about this CPU: no AES-NI offloading, and while it has VT-x, it doesn’t have VT-d. The former surprised me given how easily it delivers data off GELI-encrypted drives to our Macs with netatalk, but I suppose that shows the bottleneck there isn’t the crypto.

My other Microserver uses an old 4-core Xeon E31260L which has all the above features and lets me tinker with hypervisors. It also only hits 45W, so the fans don’t take off late at night while we’re trying to sleep.

Have you made any modifications to the hardware?

Save for running an SSD in the optical drive bay and adding a few more DIMMs, not really. The previous owner of my Xeon Microserver upgraded the CPU from its original Celeron.

The one other minor change was the addition of a passive heatsink on the Broadcom NIC chip as John Stutsman described. In Sydney summers and Singapore all year round, this one chip regularly exceeded 70°C which didn’t measurably impact performance, but still had me worried. I should do a proper post about this one day.

Cooling the Broadcom Chip at Location 13-LOM on my Gen8 MicroServer

What Linux distros have you used on it?

At work we use Debian with Xen, so I ran it on my Xeon Microserver too with OpenZFS and btrfs for testing, the former of which worked fine and the latter isn’t the fault of the hardware! I still do use Debian on one now, albeit as a bhyve guest.

While I’m talking about OSs, they’ve also run Oracle Solaris 11, VMware ESXi, and Windows Server 2012 R2. One day I intend to try OpenBSD as well.

Where did your hostnames come from?

Naming schemes are an incredibly important and detailed topic for which a dedicated post is warranted and forthcoming. For my Microservers, mio.lan is named for Akiyama Mio, the timid bassist from the K-On! franchise. And aino.lan was named for Minako Aino from Sailor Moon. They weren’t my favourite characters from their respective shows, but they played important supporting roles… like servers do.

As an aside, it’s amazing how anime art has changed over the years. And K-On! is already almost a decade old itself, which makes me feel incredibly old.

What NICs does it have?

Mine have older Mellanox MT26428 QDR InfiniBand cards, after we upgraded hardware at work. But the two built-in NICs are Broadcom NetXtreme Gigabit Ethernet that Bill Paul’s excellent bge(4) drivers support out of the box on FreeBSD.

In a past life one of the servers also ran a 4-port Intel Gigabit PCIe card when I was using it as a glorified router and VPN end-point with some storage attached. I thought I had a dmesg saved, but I can’t remember.

I still owe people some answers about what my GELI and ZFS pool layouts look like, and how iLO works, but that’s it for now. Feel free to ping me at @Rubenerd on Twitter or my email if you need more info.

Music Monday: The Vanilla Bean Situation

It’s time for another Music Monday, that weekly audible series of posts written about music on a specific day.

These completely legitimate tunes that definitely exist and weren’t just comedic cutaways may still be the shortest I’ve ever reviewed. This is the second of three videos by Bon Appétit’s Claire and Brad attempting to make sour-dough-nuts:

Brad and Claire Make Doughnuts Part 2: The Disaster | It’s Alive | Bon Appétit

And from their third video attempting to make sour-dough-nuts:

Brad and Claire Make Doughnuts Part 3: Redemption | It's Alive | Bon Appétit

The first video didn’t include any, but we’ll forgive them for the stress they incurred on the second.

Brad and Claire Make Doughnuts Part 1: The Beginning | It's Alive | Bon Appétit

Still getting mail from a leaked database

A famous pizza chain in Australia had their customer account database stolen a few years ago. I was living in Mascot with Clara at the time, a suburb just south of Sydney. I was lucky that I never use my real name on these services, and I use one-off passwords with KeePassXC which you should also use because it’s great.

Years later I still get regular emails like this:

Subject: Jeff, Mascot?

are you in Mascot?

These kinds of social engineering attacks are far more dangerous than general spam. Your location is a piece of information an attacker would need to know in advance, which unsuspecting or trusting email users could interpret as adding legitimacy. Like my hat.

A related, widely-discussed scam involves sending a leaked password you once used to scare you into sending them money:

Some time ago your computer was infected with my private software, RAT (Remote Administration Tool). I know your password is ce#Dz!7oy]m(Fc$. My malware gave me access to all your accounts, contacts and it was possible to spy on you over your webcam.

This is unrelated, but I thought it was funny that my long passphrase of gibberish was truncated at the first dollar sign. Some scammer’s suspect software must have parsed it as regex.

Next time you have another video call or catchup with family, it might be a great opportunity to bring up what they know about email and web scams. Education is our best defence against these kinds of attacks. Attackers making mistakes may be #2.

Bringing my own git in-house

GitHub generated a lot of positive press for their renaming of Master branches to Main. I think they missed an opportunity to call it Trunk, but either way it’s an entirely hollow gesture for those of us who care about human dignity and rights.

Regardless of where you stand there though, this has reminded me that I need to bring more stuff in-house. Hosting your own git on a cloud instance or VPS is fairly simple with tools like Gitea, though I’m thinking of just using straight-up git with GitWeb for publishing a web frontend for others to view. I don’t need pull requests and other workflow tools for my personal projects, and GitWeb would give people visibility.

It’s also got me thinking about where Subversion fits. I still prefer it for a few reasons, but with FreeBSD potentially moving to git (IIRC), and almost all of my work being git now, dare I say it makes sense to standardise on it and make my life easier. The site you’re reading now has been on git since I moved off WordPress seven years ago, as have my dotfile configs and lunchbox.

Now that I think about it, all my public-facing repos and work are on git, either from the start or having been migrated from hg. It’s my private stuff that’s on Subversion. Maybe that could be a useful separation to maintain.