Design engineering in practice: releasing an app to prod

Jakub picks up the Pockets story where the MVP left off and walks through everything it took for a designer to ship a real, notarized, self-updating macOS app to production: the Apple paperwork nobody warns you about, code signing and notarization, a Sparkle update channel, the empty-states and accessibility work that makes a build feel finished, and the on-device AI that shipped as v1.1.

AI Disclosure

Some parts of these notes were clarified and refined using AI tools such as ChatGPT, Gemini, and Claude to enhance clarity and readability. All ideas are my own, and I reviewed and edited all texts myself.

Category

🎓 MUNI: Practise III.

Terminal uploading the Pockets app to the cloud

For months, Pockets ran beautifully on exactly one Mac in the world: mine. I could press Cmd+R, drag a file into a floating window, and watch on-device AI sort it in a second. It felt finished. It was not finished. The distance between an app that works for its author and an app a stranger can download, open without a scary warning, and trust with their files turned out to be the hardest part of the whole project.

The previous case study ended on a high note that was also, quietly, a cliffhanger. Pockets had reached a working MVP, survived user testing, and produced alpha builds I could hand to a few people. But those builds were unsigned. Handing someone an unsigned Mac app in 2026 means handing them a Gatekeeper wall: the one that says Pockets cannot be opened because Apple cannot check it for malicious software. The honest workaround was telling friends to right-click, hold their breath, and override their own operating system's safety advice. That is not shipping. That is a favor you ask of people who already like you.

This case study is about closing that gap. It is the story of taking Pockets from works-on-my-machine to a real production release, v1.0.0, that anyone can download and run, told from a place most product designers never stand: owning the entire pipeline myself.

Where we left off

I want to start with an honest recap, because the plan I wrote at the end of the last case study and what the git history actually did are two different stories.

If you read the last installment (pockets-praxe-iii), here is the recap in one breath. I had a menu-bar app that creates floating pockets for files, text, links, images, colors, and code, with automatic categorization. User testing told me the core idea held up. The plan I wrote down at the end was modest and specific: a TestFlight beta in April 2026, then a real release.

Plans are easy to write. The git history is more honest about what actually happened. By the time v1.0.0 shipped, the project had crossed 500 commits, with only one earlier tag, v0.0.1, before the release push. Between MVP and a-stranger-downloads-it, a long list of unglamorous things had to be true at once: a paid Apple Developer Program membership, a Developer ID Application certificate, an app-specific password so Apple's notary service would accept my builds, code signing and notarization so Gatekeeper would step aside, and an update channel that could actually deliver new versions without me emailing people a file.

One early decision set the tone for all of it. I kept the Pockets repository private. That single choice quietly broke the easy distribution path, because you can't offer anonymous downloads straight from a private repo. So the release had to publish its installer and its update feed somewhere public instead, which is how the whole thing ended up living at updates.jholec.com, a Netlify subdomain. None of that is a feature. It is the price of admission.

What design engineering actually means here

There is a tidy version of design engineering where a designer who can code is just a designer who makes nicer prototypes. This project taught me the messier version.

Owning the pipeline means the polish and the plumbing turn out to be the same job, and the plumbing has no Figma frame. A few moments made that concrete. The build that compiled perfectly in Debug crashed in Release because of a SIL optimizer bug in the Swift 26.5 compiler, so I had to ship a deliberate workaround for a problem in Apple's toolchain, not in my code. The auto-updater silently refused to verify updates until I signed the appcast with the right EdDSA key instead of the legacy DSA one, a one-flag difference that decided whether users got updates at all. Notarization, certificates, a Team ID that I now know by heart, an update feed that has to be signed and hosted and correct, all of it sitting between my finished design and a person's Downloads folder.

The same instinct shaped a product call I'm proud of. I wanted Pockets to run AI privately, on-device, with no account and no API key. But the on-device migration to Apple Foundation Models and Vision was the riskier change, and the bring-your-own-key (BYOK) path was already safe, so I refused to let the migration hold the release hostage. v1.0.0 shipped first, and the on-device AI landed right after as v1.1. Progress over perfection, made literal in a version number.

That is the angle for everything that follows. Not "here is how I designed a screen," but "here is what it takes for a designer to get a real app into a stranger's hands, and trusted there."

What a designer actually needs to ship to production

Before any of the clever automation, there's a boring list of accounts, certificates, and passwords. Skip one and Gatekeeper quietly refuses to open your app on someone else's Mac. This is the part I wish someone had written down for me.

If you want a Mac app that opens with a double-click on a stranger's machine, not a right-click-Open-anyway dance, here is the real prerequisite list. None of it is hard, but it is easy to miss one piece. One thing first: enrollment isn't instant. Apple verifies your identity, which can take days, so start before you think you need to.

  1. A paid Apple Developer Program membership, 99 USD a year. The free account signs apps for your own Mac only. To notarize for other people, you pay, and it's a recurring dependency: let the membership lapse and your notarization ability stops.

  2. A Developer ID Application certificate. This is the identity that signs the app for distribution outside the App Store, and it's distinct from the Apple Distribution cert you'd use for the App Store and TestFlight. You export it as a .p12 so CI can use it.

  3. A way to authenticate to Apple's notary service. Easiest to start with: an app-specific password tied to your Apple ID. For automation, graduate to a .p8 App Store Connect API key with notarytool, which is what I use in CI because it's cleaner than juggling an Apple ID and password in a workflow.

  4. Code signing then notarization, in that order. Sign with Developer ID, send the build to Apple's notary service, wait for the all-clear, then staple the ticket onto the DMG (the distributable), not just the .app inside it. Without the staple, the first launch needs an internet round-trip to Apple, and offline machines stall.

  5. A provisioning profile if your app uses entitlements like iCloud. Pockets is sandboxed for the CloudKit sync feature, so the build embeds a "Pockets Developer ID" profile to keep iCloud and push authorised.

  6. A distribution path. For Pockets that's a notarized DMG plus a Sparkle appcast for in-app updates, with TestFlight as a separate channel for beta testers.

The gotcha that cost me the most: signing is not one step. The app is signed, then the DMG wrapping it is signed again, then the whole thing is notarized and stapled. Miss the DMG signature or the staple and Gatekeeper still complains even though the app inside is perfectly signed. Skip notarization entirely and the very first thing a new user meets is the "unidentified developer" dialog, the worst possible first impression for a no-account app asking to be trusted.
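The order is easier to keep straight when it is written down as data. This is a minimal sketch, not the Pockets workflow: the `releaseSteps` function, the file names, the signing identity, the key ID and the issuer are all placeholders, and using `hdiutil` to build the DMG is an assumption the text doesn't confirm. The function only assembles the command lines; it doesn't run them.

```swift
import Foundation

/// One command in the release chain, kept as data so the order is explicit.
struct Step {
    let name: String
    let command: [String]
}

/// Builds the sign -> package -> sign -> notarize -> staple sequence.
/// Every argument value is a placeholder supplied by the caller.
func releaseSteps(app: String, dmg: String, identity: String,
                  keyPath: String, keyID: String, issuer: String) -> [Step] {
    [
        Step(name: "sign app",
             command: ["codesign", "--force", "--options", "runtime", "--timestamp",
                       "--sign", identity, app]),
        // Assumption: hdiutil builds the DMG; the text doesn't name the tool.
        Step(name: "package dmg",
             command: ["hdiutil", "create", "-volname", "Pockets", "-srcfolder", app,
                       "-ov", "-format", "UDZO", dmg]),
        Step(name: "sign dmg",
             command: ["codesign", "--force", "--timestamp", "--sign", identity, dmg]),
        Step(name: "notarize",
             command: ["xcrun", "notarytool", "submit", dmg,
                       "--key", keyPath, "--key-id", keyID, "--issuer", issuer, "--wait"]),
        // The ticket goes onto the DMG, the file people actually download.
        Step(name: "staple",
             command: ["xcrun", "stapler", "staple", dmg]),
    ]
}

let steps = releaseSteps(app: "Pockets.app", dmg: "Pockets.dmg",
                         identity: "Developer ID Application: Example (TEAMID)",
                         keyPath: "AuthKey.p8", keyID: "KEYID", issuer: "ISSUER-UUID")
for step in steps {
    print("\(step.name): \(step.command.joined(separator: " "))")
}
```

Keeping the DMG signature and the staple as their own named steps makes it harder to drop either one by accident.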

The release pipeline: push a tag, get a notarized DMG and a signed update

I did not want releasing to be a ritual I'd get wrong at 2am. A release a solo maker performs by hand eventually ships an unsigned or un-notarized build.

So the whole thing is a GitHub Actions workflow (.github/workflows/release.yml) that fires on a version tag. Pushing v1.1.0 runs the entire chain on a fresh macos-26 runner: build, sign, notarize, package, sign the update, regenerate the appcast, publish.

A few decisions matter more than they look. The archive uses manual signing with an explicitly installed provisioning profile, not Xcode's automatic signing, because CI has no signing UI and the embedded profile is what keeps CloudKit authorised. After export the workflow asserts that embedded.provisionprofile and Sparkle.framework are actually present, then runs codesign --verify --deep --strict. I'd rather the build fail loudly than ship an app that won't sync.

Notarization runs with an App Store Connect .p8 API key via xcrun notarytool submit --wait, then stapler staple so the DMG works offline. The canonical download is the Netlify update channel (a versioned DMG plus a Pockets-latest.dmg alias); the GitHub Release exists mostly as an archive. Every secret (the .p12 and its password, the keychain password, the provisioning profile, the ASC key, the Sparkle private key, the Netlify tokens) lives in GitHub secrets, and the workflow refuses to run if a required one is missing.
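The "refuse to run if a secret is missing" guard is small enough to sketch. This is an illustration only: the secret names below are hypothetical, not the ones in the Pockets repository, and the real check lives in the workflow rather than in Swift.

```swift
import Foundation

/// Returns the required names that are absent or empty in `environment`.
func missingSecrets(required: [String], in environment: [String: String]) -> [String] {
    required.filter { (environment[$0] ?? "").isEmpty }
}

// Hypothetical secret names, for illustration only.
let required = ["DEVELOPER_ID_P12", "P12_PASSWORD", "KEYCHAIN_PASSWORD",
                "PROVISIONING_PROFILE", "ASC_API_KEY", "SPARKLE_PRIVATE_KEY",
                "NETLIFY_AUTH_TOKEN"]

let missing = missingSecrets(required: required, in: ProcessInfo.processInfo.environment)
if missing.isEmpty {
    print("All required secrets are present.")
} else {
    print("Missing secrets: \(missing.joined(separator: ", "))")
    // A real CI step would stop here with a non-zero exit code.
}
```

Treating an empty value the same as a missing one matters in CI, where an unset secret often arrives as an empty string rather than as nothing at all.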

Sparkle, and the bugs that actually broke shipping

Two things genuinely broke. One stopped the app from compiling at all in Release. The other was worse: the auto-updater looked fine, shipped twice, and never worked. Progress over perfection means writing down the embarrassing ones too.

The compiler crash that only happened in Release

Debug built fine for months. Then xcodebuild archive -configuration Release started crashing inside Swift 26.5's SIL optimizer. The EarlyPerfInliner pass died while inlining into the implicit deinit of two generic NSHostingView subclasses I use for window drag and drop (DragBlockingHostingView and DropEnabledHostingView). The app was correct; the optimizer wasn't. The fix is three lines and ugly in the honest way: give each class an explicit empty deinit pinned with @_optimize(none) so the crashing pass skips it (commit a14bc1b, in Pockets/Utilities/WindowDragHelpers.swift). Debug never saw it because it only surfaces under -O.

Sparkle: signed, hosted, and completely non-functional

Sparkle 2.8.0 handles in-app updates. Because Pockets is sandboxed (App Sandbox is required for iCloud sync), it runs Sparkle inside the sandbox: SUEnableInstallerLauncherService = true plus two narrow mach-lookup entitlements for the -spks and -spki services. The app checks https://updates.jholec.com/appcast.xml every 24 hours (SUScheduledCheckInterval = 86400), with a manual Check for Updates… menu item for the impatient. CI signs each update ZIP with EdDSA and the app verifies it against the public key in Info.plist. That part is right.

What wasn't right, and shipped in both v1.0 and v1.1:

  1. Wrong signing flag. I first ran generate_appcast with -f, the legacy DSA key flag. EdDSA needs --ed-key-file. Different algorithm entirely (commit de0170d).

  2. Unsigned enclosures. In CI, generate_appcast 2.8 didn't embed the EdDSA signature at all. Sparkle rejects unsigned updates when a public key is set, so the workflow now signs every archive with sign_update and injects sparkle:edSignature into each enclosure, with a hard-fail guard if any are still unsigned (commit 6d05347).

  3. The Info.plist never shipped. The app target had GENERATE_INFOPLIST_FILE=YES and no INFOPLIST_FILE, so my real Info.plist, the one with SUFeedURL and SUPublicEDKey, was ignored and a minimal generated one took its place. The shipped apps had no feed URL. They literally could not find updates. v1.0 and v1.1 can't self-update; those users need a manual reinstall of v1.1.1 (commit 9ba8283).

  4. The controller was always nil. Under @NSApplicationDelegateAdaptor, NSApp.delegate as? AppDelegate returns nil because the delegate is a SwiftUI wrapper, so "Check for Updates" did nothing. Fixed with an AppDelegate.shared, plus activating the menu-bar app first so Sparkle's dialog isn't hidden behind the dashboard.

  5. Non-monotonic build numbers. Sparkle compares updates by CFBundleVersion. I derived it from git rev-list --first-parent --count, which isn't monotonic across branches: a tag cut on a side branch had fewer ancestors, so v1.0.0 got build 122 but v1.1.0 got 118. Sparkle concluded the older 1.0.0 was the newest available. Switched to the HEAD commit timestamp (git show -s --format=%ct): strictly increasing in time, branch-independent.
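The build-number bug is easy to reproduce on paper. The sketch below reduces the updater's comparison to plain integers, which is a simplification of how Sparkle compares CFBundleVersion strings. The `Release` type and `newest` function are hypothetical, the commit-count builds are the ones quoted above, and the two timestamps are made up.

```swift
/// A release as the updater sees it: a marketing version plus a build number.
struct Release {
    let version: String
    let build: Int
}

/// Picks the release with the highest build number.
/// Simplified: real comparison happens on CFBundleVersion strings.
func newest(_ releases: [Release]) -> Release? {
    releases.max(by: { $0.build < $1.build })
}

// Build numbers derived from commit counts, as reported in the text.
let byCommitCount = [
    Release(version: "1.0.0", build: 122),
    Release(version: "1.1.0", build: 118),
]

// Build numbers derived from commit timestamps. These two values are made up.
let byTimestamp = [
    Release(version: "1.0.0", build: 1_770_000_000),
    Release(version: "1.1.0", build: 1_771_000_000),
]

print(newest(byCommitCount)?.version ?? "none")  // 1.0.0
print(newest(byTimestamp)?.version ?? "none")    // 1.1.0
```

With commit counts the older release wins, so no update is ever offered. With timestamps the later commit always carries the larger number, whichever branch it was tagged on.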

The other obligation, the one you only notice when you treat updates as a service, is the signing key itself. An auto-updater is by design a thing that downloads code and runs it on your machine. If someone could swap the file at that URL, they'd own every Pockets install. The EdDSA public half lives in the app; the private half never leaves my control, and it lives in a password manager and an offline backup, not just on my laptop. The slightly scary footnote: lose that one file and I can never sign an update for existing users again, which is exactly why it's backed up in two places.

The lesson I keep relearning as a designer who ships: an update system that looks correct in code and even validates its own signatures can still be totally inert in the hands of a real user. The only proof is installing the shipped artifact and watching it update itself.
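Short of a full install test, a cheap guard can at least catch the missing-Info.plist case. This is a sketch under assumptions: the `missingSparkleKeys` function is hypothetical, the public key is a placeholder, and in the app the dictionary would come from `Bundle.main.infoDictionary` rather than from literals.

```swift
import Foundation

/// Returns the Sparkle keys that are missing or empty in an Info.plist dictionary.
func missingSparkleKeys(in info: [String: Any]) -> [String] {
    ["SUFeedURL", "SUPublicEDKey"].filter { key in
        guard let value = info[key] as? String else { return true }
        return value.isEmpty
    }
}

// Roughly what a generated plist looks like: no Sparkle keys at all.
let generatedPlist: [String: Any] = ["CFBundleName": "Pockets"]

// What the hand-written plist is meant to contain (key value is a placeholder).
let realPlist: [String: Any] = [
    "CFBundleName": "Pockets",
    "SUFeedURL": "https://updates.jholec.com/appcast.xml",
    "SUPublicEDKey": "BASE64-PUBLIC-KEY-PLACEHOLDER",
]

print(missingSparkleKeys(in: generatedPlist))  // ["SUFeedURL", "SUPublicEDKey"]
print(missingSparkleKeys(in: realPlist))       // []
```

A check like this does not prove that an update installs. It only proves the shipped bundle knows where to look, which is the failure v1.0 and v1.1 actually had.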

The states nobody screenshots

A green build proves the code compiles. It says nothing about the first thirty seconds a stranger spends in your app, when there's no data yet, no muscle memory, and no benefit of the doubt. That gap is where most of my road-to-1.0 work actually went.

When I shipped the MVP, the Dashboard rendered nothing when you had no pockets. Technically correct: an empty grid is an empty grid. But an empty rectangle reads as broken, not as a fresh start. So I gave every empty path an explicit state. The Dashboard now uses ContentUnavailableView with a tray.fill icon, the line "Create a new pocket to get started," and a prominent New Pocket button right there in the void. Search has its own branch: when a query returns nothing it shows ContentUnavailableView.search(text:), so "no pockets exist" and "nothing matched what you typed" never get confused for each other. The Todo editor does the same, a checklist glyph with "No todos yet" and "Add your first todo above."

These are six lines of SwiftUI each. They are also the difference between an app that feels finished and one that feels like a prototype someone handed you early.
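For a sense of scale, here is roughly what the three Dashboard branches look like. This is a sketch, not the Pockets source: the `DashboardContent` view, its string-array model, and the "No Pockets" title are assumptions, while the icon, the description line, and the New Pocket button come from the text. `ContentUnavailableView` requires macOS 14 or later.

```swift
import SwiftUI

/// Hypothetical dashboard body showing the three branches described above.
struct DashboardContent: View {
    let pockets: [String]          // stand-in for the real pocket model
    let query: String
    let onNewPocket: () -> Void

    private var matches: [String] {
        query.isEmpty ? pockets : pockets.filter { $0.localizedCaseInsensitiveContains(query) }
    }

    var body: some View {
        if pockets.isEmpty {
            // Nothing exists yet: invite the first action.
            ContentUnavailableView {
                Label("No Pockets", systemImage: "tray.fill")
            } description: {
                Text("Create a new pocket to get started")
            } actions: {
                Button("New Pocket", action: onNewPocket)
                    .buttonStyle(.borderedProminent)
            }
        } else if matches.isEmpty {
            // Pockets exist, but the query matched none of them.
            ContentUnavailableView.search(text: query)
        } else {
            List(matches, id: \.self) { Text($0) }
        }
    }
}
```

Checking `pockets.isEmpty` before `matches.isEmpty` is what keeps "no pockets exist" from being mistaken for "nothing matched what you typed".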

The harder, less visible work was accessibility. A floating menu-bar app that's all custom views is exactly the kind of thing VoiceOver falls off a cliff on, because none of it is standard controls. I took the todo row as the proving ground. The checkbox is a styled ZStack of two circles, not a real checkbox, so to VoiceOver it was invisible. I rebuilt it as one accessibility element: accessibilityElement(children: .ignore), a label of "Toggle completion for [todo text]," the .isButton trait, and a value that announces "checked" or "unchecked." The todo text carries a custom accessibilityAction(named: "Edit") with the hint "Activates edit mode," so a VoiceOver user gets the same double-click-to-edit affordance a sighted user gets, through a route that makes sense without a trackpad.
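The modifiers named above fit together roughly like this. It is a sketch: the `TodoRow` view, its properties, and the drawing of the two circles are assumptions, and the explicit default accessibility action is an addition so that VoiceOver activation toggles the checkbox the way a tap does.

```swift
import SwiftUI

/// Hypothetical todo row: a custom-drawn checkbox exposed to VoiceOver as one button.
struct TodoRow: View {
    let text: String
    @Binding var isDone: Bool
    let onEdit: () -> Void

    var body: some View {
        HStack {
            ZStack {
                Circle().stroke(.secondary, lineWidth: 1.5)
                Circle().fill(isDone ? Color.accentColor : Color.clear).padding(4)
            }
            .frame(width: 18, height: 18)
            .contentShape(Circle())
            .onTapGesture { isDone.toggle() }
            // Collapse the two circles into one element VoiceOver can describe.
            .accessibilityElement(children: .ignore)
            .accessibilityLabel("Toggle completion for \(text)")
            .accessibilityValue(Text(isDone ? "checked" : "unchecked"))
            .accessibilityAddTraits(.isButton)
            // Assumption: the default VoiceOver action mirrors the tap.
            .accessibilityAction { isDone.toggle() }

            Text(text)
                .onTapGesture(count: 2, perform: onEdit)
                .accessibilityHint("Activates edit mode")
                // Named action: the non-pointer route to double-click-to-edit.
                .accessibilityAction(named: "Edit", onEdit)
        }
    }
}
```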

"Production-ready, for a designer, is not the moment the build is green. It's the moment the worst-case state, no data, no sight, no patience, is still legible."

The same logic shaped how analysis failures surface. On-device models and network fetches fail in ordinary ways, and the old behavior was a silent empty description. Now each failure mode maps to a plain-language line in the expanded item: "Cannot analyze - only http:// and https:// links are supported," "network error (will retry)," "timed out (will retry)," with the technical detail deliberately withheld. A loading item shows a sparkles glyph and a shimmer; a failed one shows a warning triangle and auto-retries. None of this is glamorous and none of it shows up in a hero shot. It's precisely the layer a designer is uniquely positioned to own, because we're the ones who notice when the absence of a state is itself a state.
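The mapping from failure mode to message can be sketched as a single enum. The case names and the `AnalysisFailure` type are hypothetical, and so is the choice not to retry an unsupported link; the three messages are quoted from the text.

```swift
/// Hypothetical failure cases; the names are illustrative, the messages come from the text.
enum AnalysisFailure {
    case unsupportedScheme
    case network
    case timeout

    /// Plain-language line shown in the expanded item; technical detail stays out.
    var userMessage: String {
        switch self {
        case .unsupportedScheme:
            return "Cannot analyze - only http:// and https:// links are supported"
        case .network:
            return "network error (will retry)"
        case .timeout:
            return "timed out (will retry)"
        }
    }

    /// Assumption: only transient failures are retried automatically.
    var retries: Bool {
        switch self {
        case .unsupportedScheme: return false
        case .network, .timeout: return true
        }
    }
}

print(AnalysisFailure.timeout.userMessage)  // timed out (will retry)
```

Because the switch is exhaustive, adding a new failure case won't compile until it has a message, so a silent empty description can't come back by accident.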

Tucked pockets and choosing how the AI works

The features people remember aren't the big ones. They're the small ones that feel considered. Two of those carried real polish weight on the way to 1.0: how a pocket looks when it's tucked away at the screen edge, and how someone decides, on first launch, whether AI runs on their Mac, in the cloud, or not at all.

Two ways to disappear

A tucked pocket hides behind a pull tab at the edge of the screen: hover to reveal, click to use. The question was what that tab should look like, and I didn't think one answer fit every desk. So the tucked-pocket appearance is a real choice in Settings, under a Tucked Pocket Style picker.

Glass is the default: the pull tab fills with .ultraThinMaterial and casts a soft colored shadow tinted to the pocket's own color, offset away from the screen edge so the tab reads as floating inward. Cutout is the opposite instinct: the tab fills with pure Color.black, no shadow, deliberately mimicking the iPhone notch so a tucked pocket reads as part of the hardware rather than a window stuck to it. Picking Cutout reveals a second picker, Icon Style. Monochrome draws the glyphs white at rest and flips each one to the pocket's color the instant it's selected; Colored keeps every icon in its own color all the time. That second control only appears when it's relevant, and the whole section animates in with .smooth(duration: 0.3) so the panel doesn't jump.

The detail I'm fondest of is the smallest. Whatever style you pick, selecting a tucked pocket springs its icon up to 1.15 scale with a .spring(response: 0.35, dampingFraction: 0.55), a low damping fraction on purpose, so it overshoots and settles with a tiny bounce. In Cutout-Monochrome that bounce lands at the same moment the white glyph flips to the pocket's color and a colored border appears, three signals firing together to say "this one's live now." The single-tuck tab and the tuck-away tray share the same color rule, so turning on Cutout-Monochrome looks consistent whether you've tucked one pocket or all of them.

Letting people decide where their data goes

Pockets categorizes content with AI, and the honest, unavoidable design problem is that "AI" means very different privacy trade-offs depending on where it runs. I refused to bury that in a settings sub-menu. It's a full step in onboarding, "Choose How AI Works," with three cards.

  • On-device Intelligence, the default: Apple Foundation Models and Vision running locally, subtitled "Private, free, works offline, no account needed."

  • Use My Own OpenAI Key, for people who want cloud-quality image analysis and have a GPT-4o key.

  • No AI: deterministic categorization by file type and keyword, with an honest note that custom sorting instructions are ignored in this mode.

The on-device card doesn't just claim availability, it checks. On appear it queries FoundationModelsAnalyzer().availability() and shows a live badge: green "Available," or an orange reason you can act on, "Requires Apple Silicon," "Enable Apple Intelligence in System Settings," "Model downloading…," "Requires macOS 26+." If Apple Intelligence isn't ready, the card says so plainly and explains the keyword-heuristic fallback rather than letting someone pick a mode that silently won't work. Whatever you choose in onboarding is the same AIPreferences the Settings AI Mode picker reads later, so the decision is never trapped on one screen.

A passing build would have shipped with AI on and a key field hidden three menus deep. Treating the privacy choice as a first-class, honest, self-diagnosing step is the part I'd call design engineering, the spot where owning the whole pipeline let me make a product decision and the code that backs it in the same afternoon.

Privacy as a default, and a fallback chain with a floor

The previous case study ended at an MVP that needed an OpenAI key to think. The version I shipped needs nothing. No account, no key, no internet. That sounds like a feature line. It's actually a promise the architecture has to keep on every device, including the ones that can't keep it.

The headline of the production app is that categorization runs privately on your Mac: Apple Foundation Models for text, URLs, code, and files; Apple Vision for images. Private, offline, free by default. For a tool that quietly reads everything you drop into it, receipts, screenshots, links, code, "your content never leaves the device" isn't a nice-to-have, it's the difference between a thing people trust with their stuff and a thing they don't. Making that the default, with no key to paste and no signup, is the single biggest service decision in the whole release.

A promise like that only holds if it degrades gracefully, because a lot of Macs can't run the on-device model. Foundation Models needs macOS 26 with Apple Intelligence enabled and eligible hardware. So the promise is kept by a provider fallback chain rather than a single dependency. In On-Device mode the coordinator tries Foundation Models, then Vision for images, then a deterministic heuristic. The heuristic, categorization by file type and content patterns, is always last and never fails to produce a result. That last clause is the whole reliability story: whatever the device, whatever the OS, dropping something into a pocket always returns something. The AI got rebuilt around this exact shape during the migration. AIService stopped being an HTTP/URLSession client and became a thin coordinator over a ContentAnalyzer protocol, picking the first provider that's both capable and available and falling through on any failure, with the old hand-rolled JSON parsing replaced by @Generable guided generation over a fixed 13-label PocketCategory enum.
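The shape of that chain fits in a few lines. This is a sketch under loud assumptions: only the `ContentAnalyzer` name comes from the text. Its members, the `ContentKind` enum, both analyzer types, the synchronous API, and the plain string labels (standing in for the 13-label PocketCategory enum) are all invented for illustration.

```swift
/// Hypothetical shape of the protocol named in the text; the members are assumptions.
protocol ContentAnalyzer {
    var name: String { get }
    var isAvailable: Bool { get }
    func canHandle(_ kind: ContentKind) -> Bool
    func categorize(_ content: String) throws -> String
}

enum ContentKind { case text, image }
struct AnalyzerUnavailable: Error {}

/// Stand-in for an on-device model that may be missing on this Mac.
struct ModelAnalyzer: ContentAnalyzer {
    let name = "foundation-models"
    let isAvailable: Bool
    func canHandle(_ kind: ContentKind) -> Bool { kind == .text }
    func categorize(_ content: String) throws -> String {
        guard isAvailable else { throw AnalyzerUnavailable() }
        return "Notes"
    }
}

/// The floor of the chain: always available, never throws.
struct HeuristicAnalyzer: ContentAnalyzer {
    let name = "heuristic"
    let isAvailable = true
    func canHandle(_ kind: ContentKind) -> Bool { true }
    func categorize(_ content: String) -> String {
        content.hasPrefix("http") ? "Links" : "Other"
    }
}

/// Tries providers in order and falls through on unavailability or failure.
func categorize(_ content: String, kind: ContentKind,
                using chain: [any ContentAnalyzer]) -> String {
    for analyzer in chain where analyzer.isAvailable && analyzer.canHandle(kind) {
        if let label = try? analyzer.categorize(content) { return label }
    }
    // Unreachable while the heuristic is last, but keeps the function total.
    return "Other"
}

let chain: [any ContentAnalyzer] = [ModelAnalyzer(isAvailable: false), HeuristicAnalyzer()]
print(categorize("https://example.com", kind: .text, using: chain))  // Links
```

The function returns a plain `String`, not an optional and not a throwing result. That return type is the "always returns something" promise expressed in code.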

BYOK is the deliberate escape hatch, not the main road. In Settings → AI you can switch to Advanced and paste your own OpenAI key, stored in the Keychain. I kept it for one honest reason: images. On-device Vision is genuinely good at OCR, documents, receipts, screenshots, anything with text; its RecognizeTextRequest is comparable to gpt-4o there. But its scene recognition (ClassifyImageRequest) is a fixed taxonomy of around a thousand labels, so it can't read the meaning of an abstract photo the way gpt-4o can. The preferCloudForImages toggle (off by default) exists for exactly that gap. For text, URLs, and code, on-device quality is enough and BYOK is just an advanced lever. The design choice is to be candid about where the free, private path is fully sufficient and where a power user might reach for cloud, rather than pretending on-device wins everywhere.

The one place a real feedback loop exists is TestFlight (.github/workflows/testflight.yml, bundle id com.holecjakub.pockets, a MAC_APP_STORE provisioning profile, Apple Distribution and Mac Installer Distribution certs, build numbers auto-incremented from the latest App Store Connect build). A notarized DMG plus a Sparkle channel gets the app to people, but it's a one-way street: it tells me nothing about how a build behaves on hardware I don't own. TestFlight is the closest a solo maker gets to a QA team, a way to find out that the on-device model isn't ready, or that an update didn't land, before it reaches everyone.

Which is the quiet truth underneath all of this. For a designer who owns the whole pipeline, shipping isn't a finish line, it's the start of ownership. There's an appcast that has to stay valid, a signing key I can't lose, a notarization credential that expires, an on-device promise that has to keep degrading gracefully as Apple changes the model out from under me, and two distribution channels, DMG and TestFlight, to keep coherent. None of it is visible in the product. All of it is the product, in the sense that a tool people leave running in their menu bar is a tool they're trusting to keep working when I'm not looking.

Pockets is out in the world. You can download the latest macOS build (v1.1.1, signed and notarized); once it’s installed it keeps itself current through the Sparkle feed described above. The source is on GitHub.

Take-aways

  1. The prerequisites are paperwork, not engineering. A paid Apple Developer Program membership, a Developer ID Application certificate, a way to authenticate to the notary service, and sign-then-notarize-then-staple are most of what stands between "works on my machine" and "a stranger can double-click it." None of it is hard, but it is easy to miss one piece, and enrollment alone can take days.

  2. Signing is several steps, not one. The app is signed, the DMG is signed again, then the DMG is notarized and the ticket is stapled onto it. Miss the DMG signature or the staple and Gatekeeper still blocks a perfectly-signed app, and offline machines stall on first launch.

  3. An update system can look correct and be inert. v1.0 and v1.1 validated their own signatures and still couldn't self-update, because the real Info.plist never shipped and the build numbers weren't monotonic. The only proof is installing the released artifact and watching it update itself.

  4. Ship the safe thing first. v1.0.0 went out on the safe BYOK path; the riskier on-device AI migration landed right after as v1.1 instead of holding the release hostage. Progress over perfection, written into a version number.

  5. The polish a designer owns is the absent states. Empty states, VoiceOver labels on custom controls, and plain-language failure messages don't appear in a hero shot, but they're the difference between a build that passes and an app a stranger trusts. Noticing that the absence of a state is itself a state is the designer's job.

  6. Default to private, and let it degrade gracefully. On-device AI with no account or key is the headline, but it only holds because a deterministic heuristic is always last in the chain and always returns something. The promise is kept by the architecture, not the marketing line.

  7. Shipping is the start of ownership. A shipped v1.0.0 is one moment. Staying shipped (a valid appcast, a key I can't lose, an expiring notarization credential, two distribution channels) is the part nobody warns a designer about.