
I tried to write a standalone post about Notarisation and its ecosystem, only to realise that it’s practically impossible. I ended up producing enough pages to fill an entire book chapter — which isn’t particularly useful for this medium. The problem is the topic itself: it’s huge, and cutting parts out would make it even worse.

So, the solution? I decided to split what I wrote into several posts and publish them here under the tag Apple Security 101.

Introduction

It’s a well-known fact that Apple has always invested heavily in its image. In recent years, though, the focus on security — in the broadest sense of the word — has become impossible to miss. As discussed in Apple Defences, Apple implements a long list of security controls to protect users, and one of the most important among them is Code Signing.

Code Signing is the foundation on which Apple has built the entire macOS security ecosystem (and all the other Apple OSes, for that matter).

Code signing is rooted in a universally accepted security paradigm: Accept Known Good. In cybersecurity, this means denying everything by default and allowing only what you explicitly trust. This is, for example, how a firewall should be configured — not least because the opposite paradigm, Reject Known Bad, tends to produce fragile systems.

In simple terms: macOS considers code unsafe by default; unless you provide hard evidence to the contrary, that code won’t run. More specifically, a non-notarised program can’t even be loaded — unless the user explicitly overrides Gatekeeper, which is exactly the social engineering vector most macOS malware relies on. We’ll look at concrete examples later.

There’s only one type of evidence that Apple accepts: the code-signature chain of trust.

Accept Known Good, in this context, solves a simple dilemma: if you don’t know who wrote the code, or if the code was modified after it was signed, you can’t make informed security decisions. All the remaining controls — sandboxing, policy enforcement, and the rest — stop being effective.

And this is, in a nutshell, what code signing does: it answers two questions:

  • Identity: who signed this binary? (Developer ID, Team, Authority)
  • Integrity: was the binary modified after it was signed?

As is often the case in security, simplicity brings effectiveness. Code Signing doesn’t care about the security posture of the code (does this code contain vulnerabilities?), whether it’s well written, or even what it does. It’s not designed to prevent malware from running either, as long as the signatures are present and valid.

From this perspective, Code Signing is a trust mechanism, not a safety one.

What is Code Signing

Cryptography alert: if you’re not comfortable with basic cryptographic principles, you may struggle here. If so, take a quick look at how asymmetric encryption and hash digests work — even Wikipedia will do.

The best way to understand Code Signing is to look at how it works.

It all starts with a Mach-O file

Any executable entity on any Apple OS can be signed:

  • app main executables
  • frameworks and dynamic libraries
  • plugins and extensions
  • XPC services
  • command-line tools
  • kernel or system extensions
  • helper binaries inside bundles

All of these rely on the same underlying Code Signing architecture.

The Integrity Model: Closer to a Merkle Tree Than People Realize

One of the biggest misconceptions is the “hash → sign → done” mental model.

That’s not how Apple does it, and thank God, because that model breaks as soon as a binary grows beyond a toy.

Apple instead builds something structurally similar to a Merkle Tree, even if nobody in Cupertino ever uses that word:

  • every 4 KB executable page is hashed individually
  • special metadata slots (entitlements, Info.plist…) are hashed separately
  • everything is collected into a single structure called the CodeDirectory, which acts as the root of this little forest

You change a single byte in a single page? That leaf hash changes → the CodeDirectory is invalidated → the whole signature chain breaks.

The Signature: Apple Signs the Root, Not the Binary

Once the CodeDirectory is assembled, Apple signs that, not the file itself.

This is done via CMS (Cryptographic Message Syntax), using your Developer ID or Apple certificate.

It’s a bit like notarising the index of a book rather than every single page: if the index changes, you know the book has changed.

This has an unspoken advantage: the signature is actually proof of identity, not just integrity.

If you break the signature, macOS doesn’t ask philosophically “Why?”. It just refuses to run the binary.

The SuperBlob: The Suitcase Where Everything Is Packed

All of this—root hash, page hashes, entitlements, requirements, signature—is stored in a single structure at the end of the Mach-O, called the SuperBlob, and referenced by the LC_CODE_SIGNATURE load command.

The structure is surprisingly elegant for something you rarely see unless you enjoy opening binaries at 1 AM “just to check one thing.” And yes: modifying anything, either inside the binary or inside the SuperBlob, breaks the chain.

There’s no “almost signed.”

It’s like being “almost pregnant” - it doesn’t work that way.

How the System Actually Verifies It

In theory, anyone can validate a signature using the public certificate chain. In practice, Apple does much more. This signature is crucial to the whole process, and hence to earning users’ trust.

Gatekeeper (when the file arrives):
  • Checks the signature
  • Checks the certificate chain
  • Checks the notarization ticket
  • Checks if Apple has revoked your certificate because you did something too creative
Kernel + AMFI (when the process starts):
  • Validates the CodeDirectory
  • Validates page hashes lazily
  • Verifies entitlements
  • Enforces hardened runtime
  • Looks at your Team ID and decides how much it trusts you
Runtime (dyld, SIP, TCC):
  • Library validation: you cannot load random dylibs
  • Sandbox: entitlements decide what you can touch
  • TCC: wants to see your Team ID before giving you the microphone
  • SIP: laughs at root unless you have very specific signatures

Every privilege you get—debugging, JIT, camera access, Network Extensions— is ultimately tied back to your signature.

It’s not integrity for its own sake: it’s control.

Why This Matters More Than People Think

In essence, macOS is a BSD with a bunch of restrictions.

The thing that makes it secure rather than merely pretty is the fact that Code Signing sits under:

  • Gatekeeper
  • Notarisation
  • Hardened Runtime
  • TCC
  • SIP
  • AMFI
  • Sandboxing
  • and half of what makes modern macOS different from “Linux with taste”

Remove Code Signing and the whole structure becomes optional.

The LC_CODE_SIGNATURE load command

I am not assuming you know what a load command is, how the loader uses it to supply resources to the executable, or any of the surrounding details.

For now, you can think of a Load Command as a piece of metadata used by the Mach-O binary format (on macOS, iOS, tvOS, watchOS, and whateverOS) to describe a specific behaviour or requirement of the binary.

The definition of LC_CODE_SIGNATURE looks deceptively simple:

struct linkedit_data_command {
    uint32_t    cmd;            /* LC_CODE_SIGNATURE, LC_SEGMENT_SPLIT_INFO,
                                   LC_FUNCTION_STARTS, LC_DATA_IN_CODE,
                                   LC_DYLIB_CODE_SIGN_DRS, LC_ATOM_INFO,
                                   LC_LINKER_OPTIMIZATION_HINT,
                                   LC_DYLD_EXPORTS_TRIE,
                                   LC_FUNCTION_VARIANTS, LC_FUNCTION_VARIANT_FIXUPS, or
                                   LC_DYLD_CHAINED_FIXUPS. */
    uint32_t    cmdsize;        /* sizeof(struct linkedit_data_command) */
    uint32_t    dataoff;        /* file offset of data in __LINKEDIT segment */
    uint32_t    datasize;       /* file size of data in __LINKEDIT segment  */
};

Several load commands share this exact structure. The underlying idea is always the same:

  • a relative offset (dataoff), meaning the number of bytes from the beginning of the file (or the start of the FAT slice), and
  • a size (datasize), meaning how many bytes make up the relevant information.

In this specific case, the “information” is much more involved than the structure suggests. Apple itself calls it a SuperBlob — essentially, a blob containing other blobs.

We cannot be sure why Apple envisioned such an intricate structure, but let’s analyse the notarisation problem.

Why does notarisation even exist? Well, Apple had three critical problems to address:

Problem 1: Integrity of executable code

  • How do I know the binary I’m about to execute is exactly what the developer shipped?
  • How do I prevent malicious modifications (malware patching binaries)?

Problem 2: Identity & Trust

  • Who created this binary?
  • Can I trust this entity?
  • How do I implement and enforce a chain of trust?

Problem 3: Capabilities & Sandboxing

  • What permissions does this binary have? (camera, microphone, keychain?)
  • How do I enforce sandboxing without hardcoding rules in the kernel?

Apple did not solve everything with a “digital signature” (far too simplistic). Instead, Apple created a SuperBlob containing specialised blobs, each with a precise role.

The reason is simple: Separation of Concerns. Every blob solves a specific problem.

The 5 Fundamental Blobs (and why they exist)

Blob 1: CodeDirectory (Slot 0)

The problem
  • How do I verify the integrity of a 100MB binary without loading it fully into memory?
  • If I hash the whole file, I must load it all, which is too slow.
Solution
  • The binary is split into pages (typically 4KB)
  • Each page has its own hash (SHA-256)
  • The CodeDirectory contains:
    • Array of per-page hashes
    • Metadata (version, flags, hash type)
    • Identifier (bundle ID such as com.apple.ls)
    • Team ID (who signed the binary)
    • CDHash (hash of the CodeDirectory itself — this is the “identity” Gatekeeper uses)
Why this architecture?
  • Lazy verification: macOS only verifies the pages it loads (on-demand paging)
  • If only 10 pages of a 1000-page binary are executed, only those 10 are verified

Variant: the Alternate CodeDirectory (slots 0x1000 and up) lets multiple hash algorithms coexist (legacy SHA-1 alongside SHA-256).

Blob 2: Entitlements (Slot 5) and DER Entitlements (Slot 7)

The problem
  • How do I declare “this binary can access the camera” without kernel hardcoding?
  • How do I implement sandboxing with flexible rules?
Solution
  • An XML (or, for modern binaries, DER) blob declaring capabilities
    • Example: `com.apple.security.device.camera`

Why XML AND DER?

  • XML (Slot 5): legacy, human-readable
  • DER (Slot 7): compact ASN.1 binary, faster to parse
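
For reference, the XML form (slot 5) is an ordinary property list. A minimal entitlements file granting camera access would look roughly like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- grants access to the camera under the hardened runtime -->
    <key>com.apple.security.device.camera</key>
    <true/>
</dict>
</plist>
```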

Where does macOS check them?

  • During execve(), the kernel calls cs_entitlements_blob_get()
  • Gatekeeper validates them before allowing execution

Blob 3: Requirements (Slot 2)

The problem
  • “Signed” is not enough. You need to know:
    • signed by WHOM?
    • signed under WHICH CONDITIONS?
  • How do I enforce rules like “only binaries signed by Apple”?
Solution
  • A mini-bytecode language expressing conditions
  • Types:
    • Designated requirement (main rule — e.g. “Team ID = EQHXZ8M8AV”)
    • Library requirement
    • Host requirement, Guest requirement, etc.

Example:

identifier "com.apple.Safari" and anchor apple

Meaning: “Accept only if identifier = Safari AND the signer chains to Apple’s root CA.”

Why a custom language?

  • Flexibility: Apple updates rules without kernel changes
  • Compactness: complex logic in few bytes

Blob 4: CMS Signature (Slot 0x10000)

The problem
  • Given all these blobs, how do I guarantee none were modified?
  • How do I implement a trust chain (root CA → intermediate → developer cert)?
Solution
  • A PKCS#7/CMS blob containing:
    • Certificate chain (developer → intermediate → Apple root)
    • Digital signature of the CDHash
    • Timestamp

Why sign only the CDHash?

  • Performance: 32 bytes vs 100MB
  • Indirection: the CDHash “covers” all page hashes through the CodeDirectory
  • Chain: CMS → CDHash → CodeDirectory → page hashes → binary

Ad-hoc signatures:

  • Missing CMS blob means ad-hoc signature
  • No certificates, just local hashes
  • Used for development/testing

Blob 5: Notarization Ticket (Slot 0x10002)

The problem
  • Developer signature does not mean safe software
  • How do I add a layer meaning “Apple scanned this and found it clean”?
Solution
  • The developer uploads the binary after signing
  • Apple scans it for malware and security issues
  • If the software is clean, Apple staples a signed ticket
  • The ticket means “Apple has seen this CDHash and approves it”

Why separate from CMS?

  • Timing: the developer signs, then distributes the software; notarisation can come later
  • The CMS signature cannot be modified after the fact
  • The ticket is stapled on afterwards, without invalidating the existing signature

Gatekeeper checks:

  1. Verify CMS (developer signature)
  2. Verify staple (notarisation)
  3. If no staple, then perform an online lookup

How the blobs interact

Below is the validation order when macOS executes a binary:

  1. The kernel reads the LC_CODE_SIGNATURE load command.
  2. It parses the SuperBlob and locates all embedded blobs.
  3. It reads the CodeDirectory, identifying the hash type and page size.
  4. It verifies the CMS Signature:
    • Extracts the certificate chain
    • Validates it against Apple’s root CA
    • Verifies the signature of the CDHash
  5. It compares the CDHash:
    • Computes the hash of the CodeDirectory just read
    • Compares it with the one signed in the CMS
    • If they differ, then REJECT
  6. It reads the Entitlements blob and loads them into the kernel for enforcement.
  7. It validates the Requirements blob — executes the bytecode and checks conditions.
  8. It verifies the Notarization Ticket, if present.
  9. During execution: for each page loaded:
    • Computes the page hash
    • Compares it with the hash in the CodeDirectory
    • If they differ, the process is killed (SIGKILL)

What’s Next

If you’ve made it this far, you now understand the architecture of Code Signing.

The bad news? That’s the easy part.

The hard part is knowing what to do with it. How do you actually inspect a signature when you’re staring at a suspect binary at 2 AM? What does codesign -dv --verbose=4 spit out, and why should you care? What’s the difference between ad-hoc signing and Developer ID beyond “one works, one doesn’t”? And when do you escalate a sample because something smells wrong?

That’s the next post: Code Signing in Practice. Less theory, more “here’s what you actually do.”

Before you ask: yes, this series is going to drag on for weeks. Code Signing alone could fill a book nobody would read. And we still have Notarization, Hardened Runtime, Entitlements, and XProtect waiting in line — each one deceptively large, each one impossible to cover properly in 10 minutes.

Cutting corners would make it worse, not shorter.

So settle in. We’re nowhere near done.

Want the deep dive?

If you’re a security researcher, incident responder, or part of a defensive team and you need the full technical details (labs, YARA sketches, telemetry tricks), email me at info@bytearchitect.io or DM me on X (@reveng3_org). I review legit requests personally and will share private analysis and artefacts to verified contacts only.

Prefer privacy-first contact? Tell me in the first message and I’ll share a PGP key.


Subscribe to The Byte Architect mailing list for release alerts and exclusive follow-ups.


Gabriel(e) Biondo
ByteArchitect · RevEng3 · Rusted Pieces · Sabbath Stones