Top 10 web hacking techniques of 2022

James Kettle

Welcome to the Top 10 Web Hacking Techniques of 2022, the 16th edition of our annual community-powered effort to identify the most important and innovative web security research published in the last year.

Since publishing our call for nominations in January, you’ve submitted a record 46 nominations, and cast votes to single out 15 final-round candidates. Over the last two weeks, an expert panel of researchers Nicolas Grégoire, Soroush Dalili, Filedescriptor, and myself have analysed, conferred and voted on the 15 finalists, to bring you the final top 10 new web hacking techniques of 2022. As usual, we haven’t excluded our own research, but panellists can’t vote for anything they’re affiliated with.

This year, for the third year running, there’s been a noticeable improvement in the number of quality nominations. While outright novel techniques and class-breaks have gotten rarer, there are more people pushing at the boundaries and sharing their findings than ever. It’s great to see the research ecosystem flourishing, even if it makes it harder to select a top ten!

Before we begin the countdown, I should note that any attempt to compress a year of research into a top ten list is going to leave valuable techniques overlooked. If you’re hungry for knowledge, I highly recommend reading the entire nomination list.

From the final ten, two key themes stand out – single sign-on, and request smuggling. Let’s take a closer look!

10 – Exploiting Web3’s Hidden Attack Surface: Universal XSS on Netlify’s Next.js Library

As we find ways to shovel ever more complexity into our software, even websites that look static can hide serious vulnerabilities. In Exploiting Web3’s Hidden Attack Surface, Sam Curry tears apart numerous cryptocurrency sites with a blend of XSS, SSRF and cache poisoning originating from Netlify’s Next.js library. We’re eager to see if the methodology and cryptocurrency ecosystem insights fuel further discoveries in this field.

9 – Practical client-side path-traversal attacks

Sometimes a vulnerability class can be quite visible, but remain overlooked for years due to low apparent severity.

In Practical client-side path-traversal attacks, Medi explores a website behaviour that’s very common – placing user input inside a request path – and demonstrates a clear pathway to real impact. This behaviour has surfaced in exploit chains a few times over the years but this post shows it’s time to recognise it as a vulnerability in its own right. This cousin of client-side parameter pollution has already inspired a follow-up that uses it for CSRF, and we’re sure more will come.

8 – Psychic Signatures in Java

The catchily-named Psychic Signatures in Java by Neil Madden shows a critical and really very simple attack using the number 0 to forge ECDSA signatures, undermining the cryptographic foundation of numerous core web technologies including JWT and SAML.

This use of an ancient crypto attack to topple modern web tech is a great reminder that you don’t always need a complex attack to achieve massive impact – and as nice as abstractions are, sometimes it’s worth looking further down the stack.

7 – Worldwide Server-side Cache Poisoning on All Akamai Edge Nodes

Back in 2019, one of the nominations for the top 10 was an article theorising about the exploitation potential of HTTP hop-by-hop headers, and calling for further research on the topic. Three years later, Worldwide Server-side Cache Poisoning on All Akamai Edge Nodes makes use of this concept for massive impact and a whole lot of bug bounties, establishing the technique as essential knowledge for web hackers and server implementers alike.

We also appreciate how Jacopo Tediosi throws some rare light into the world of pain people using advanced techniques can encounter when trying to get their bug bounty reports triaged.

6 – Making HTTP header injection critical via response queue poisoning

Ever wondered why people sometimes inject an HTTP response header, but then refer to it as ‘Response Splitting’ even though they never actually split the response? In Making HTTP header injection critical via response queue poisoning, I explore the long-forgotten response-splitting technique with a high-impact, high-payout case study.

As noted by filedescriptor, “It seems like there’s an infinite amount of anomalies among proxies that bring you an infinite amount of Request Smuggling techniques”… more on that later.

5 – Bypassing .NET Serialization Binders

In 2020, the .NET SerializationBinder documentation changed the statement “SerializationBinder can also be used for security” to “SerializationBinder can not be used for security”.

From this simple teaser, Bypassing .NET Serialization Binders by Markus Wulftange delves into why, ultimately building exploits for DevExpress and Microsoft Exchange as case-studies. This research cites an exceptional amount of documentation and related research, making it a great gateway into .NET serialization innards.

Quality research like this often inspires action, and it turns out Soroush has already built on it and integrated the results into YSoSerial.Net. You’ll be left wondering what security-critical documentation has changed since you last read it.

4 – Hacking the Cloud with SAML

Just like OAuth, you can’t discuss SSO without discussing SAML. While some SAML vulnerabilities are all too well-known, Hacking the Cloud with SAML by Felix Wilhelm shines in showing how terrifyingly expansive SAML’s attack-surface is. This culminates with an XML document that uses an integer truncation bug to trigger arbitrary bytecode execution when Java attempts to verify its signature.

This is must-watch research. The presentation recording makes the topic exceptionally accessible and provides valuable context, but if you’re already familiar with SAML exploitation you might get by with the slides or condensed writeup.

3 – Zimbra Email – Stealing Clear-Text Credentials via Memcache injection

Application-layer caching is widely used but only rarely exploited. In Zimbra Email – Stealing Clear-Text Credentials via Memcache injection, Simon Scannell takes this rare bug class to an unseen level by triggering the memcache equivalent of response queue poisoning, causing protocol-level chaos.

This innovative research is a superb demonstration of what can be achieved with deep knowledge of a target. We’ll be keeping watch for both server-side cache injection and alternate-protocol desync attacks in future.

2 – Browser-Powered Desync Attacks: A New Frontier in HTTP Request Smuggling

In Browser-Powered Desync Attacks, I explore new vectors for HTTP Request Smuggling, compromising targets ranging from Amazon to Apache and ultimately taking the attack client-side into victims’ browsers.

I found this research seriously technically challenging, but thankfully the Web Security Academy team was able to provide labs to help readers practise, and we’ve seen people having success with the technique in the wild since. The panel had numerous nice things to say, including “The creativity from desync worm (reminiscence of XSS worm) to client-side desync is off the chart”, “New concepts, numerous use-cases, impressive as usual” and “also gives the audience/reader things to explore”.

Request smuggling has proved itself an unstoppable source of novel threats for several years running now, and with plenty of unexplored avenues this will likely continue until HTTP/1 has been fully stamped out, which I imagine will take a while.

1 – Account hijacking using dirty dancing in sign-in OAuth-flows

OAuth has become the foundation of modern SSO, and attacks on it have been a hacker staple ever since. Recently, browsers introduced referrer-stripping mitigations which thwarted one of the most popular techniques – or so we thought. In Account hijacking using dirty dancing in sign-in OAuth-flows, Frans Rosen shows that modern web stacks provide ample alternatives.

This must-read research provides a masterclass in chaining OAuth quirks with low-impact URL-leak gadgets including promiscuous postMessages, third-party XSS and URL storage. Many of these bugs would previously have been dismissed as having no significant security impact, so they’ve had years to proliferate.

The sheer potential of these attacks quickly inspired us to take a stab at enabling automated URL-leak detection, and we’re also exploring integrating these techniques into labs for our OAuth Academy topic to help the community get practical experience applying them.

This is an outstanding piece of research that we expect to yield fruit for years to come. Congratulations to Frans for a well deserved win!

Conclusion

In 2022 the community published more outstanding research than ever, leading to a broad vote spread, and fierce competition for the top 10 slots. Condensing 40+ nominations into a top 10 list does an inevitable injustice to the 30 runners-up, so we recommend enthusiasts read the entire nomination list. Also, if your own favourite didn’t make it, feel free to tweet us to let us know!

Part of what lands an entry in the top 10 is its expected longevity, so it’s well worth getting caught up with past years’ top 10s too. If you’re interested in getting a preview of what might win in 2023, you can subscribe to @PortSwiggerRes, r/websecurityresearch, and @[email protected]

If you’re interested in doing this kind of research yourself, I’ve shared a few lessons I’ve learned over the years in Hunting Evasive Vulnerabilities, and So you want to be a web security researcher? And if you’re already putting out this kind of work, I should mention that we’re looking to hire another researcher.

Thanks again to everyone who took part! Without your nominations, votes, and most importantly research, this wouldn’t be possible.

Till next time!

How John Li. Cracked His CISSP Exam

studynotesandtheory.com – How I passed the CISSP without ever picking up a book I passed the CISSP exam. 100 questions with 96 minutes left on the clock. It was hard. Tips: focus on process steps, phases, lifecycles. If it’s …

Tweeted by @sectest9 https://twitter.com/sectest9/status/1623331648161947651

NIST Standardizes Ascon Cryptographic Algorithm for IoT and Other Lightweight Devices

Feb 08, 2023 | Ravie Lakshmanan | Encryption / IoT Security

The U.S. National Institute of Standards and Technology (NIST) has announced that a family of authenticated encryption and hashing algorithms known as Ascon will be standardized for lightweight cryptography applications.

“The chosen algorithms are designed to protect information created and transmitted by the Internet of Things (IoT), including its myriad tiny sensors and actuators,” NIST said. “They are also designed for other miniature technologies such as implanted medical devices, stress detectors inside roads and bridges, and keyless entry fobs for vehicles.”

Put differently, the idea is to adopt security protections via lightweight cryptography in devices that have a “limited amount of electronic resources.”

Ascon is credited to a team of cryptographers from the Graz University of Technology, Infineon Technologies, Lamarr Security Research, and Radboud University.

The suite comprises authenticated ciphers ASCON-128, ASCON-128a, and a variant called ASCON-80pq that comes with resistance against quantum key-search. It also offers a set of hash functions ASCON-HASH, ASCON-HASHA, ASCON-XOF, and ASCON-XOFA.

It’s primarily aimed at constrained devices, and is said to be “easy to implement, even with added countermeasures against side-channel attacks,” according to its developers. This means that even if an adversary manages to glean sensitive information about the internal state during data processing, it cannot be leveraged to recover the secret key.

Ascon is also engineered to provide authenticated encryption with associated data (AEAD), which makes it possible to bind ciphertext to additional information, such as a device’s IP address, to authenticate the ciphertext and prove its integrity.

“The algorithm ensures that all of the protected data is authentic and has not changed in transit,” NIST said. “AEAD can be used in vehicle-to-vehicle communications, and it also can help prevent counterfeiting of messages exchanged with the radio frequency identification (RFID) tags that often help track packages in warehouses.”

Implementations of the algorithm are available in different programming languages, such as C, Java, Python, and Rust, in addition to hardware implementations that offer side-channel protections and energy efficiency.

Reverse Engineering Yaesu FT-70D Firmware Encryption

Written by @landaire

This article dives into my full methodology for reverse engineering the tool mentioned in this article. It’s a bit longer but is intended to be accessible to folks who aren’t necessarily advanced reverse-engineers.

Background

Ham radios are a fun way of learning how the radio spectrum works, and more importantly: they’re embedded devices that may run weird chips/firmware! I got curious how easy it’d be to hack my Yaesu FT-70D, so I started doing some research. The only existing resource I could find for Yaesu radios was someone who posted about custom firmware for their Yaesu FT1DR.

The Reddit poster mentioned that if you go through the firmware update process via USB, the radio exposes its Renesas H8SX microcontroller and can have its flash modified using the Renesas SDK. This was a great start and looked promising, but the SDK wasn’t trivial to configure and I wasn’t sure if it could even dump the firmware… so I didn’t use it for very long.

Other Avenues

Yaesu provides a Windows application on their website that can be used to update a radio’s firmware over USB:

The zip contains the following files:

1.2 MB  Wed Nov  8 14:34:38 2017  FT-70D_ver111(USA).exe
682 KB  Tue Nov 14 00:00:00 2017  FT-70DR_DE_Firmware_Update_Information_ENG_1711-B.pdf
8 MB  Mon Apr 23 00:00:00 2018  FT-70DR_DE_MAIN_Firmware_Ver_Up_Manual_ENG_1804-B.pdf
3.2 MB  Fri Jan  6 17:54:44 2012  HMSEUSBDRIVER.exe
160 KB  Sat Sep 17 15:14:16 2011  RComms.dll
61 KB  Tue Oct 23 17:02:08 2012  RFP_USB_VB.dll
1.7 MB  Fri Mar 29 11:54:02 2013  vcredist_x86.exe

I’m going to assume that the file specific to the FT-70D, “FT-70D_ver111(USA).exe”, will likely contain our firmware image. A PE file (.exe) can contain binary resources in the .rsrc section — let’s see what this file contains using XPEViewer:

Resources fit into one of many different resource types, but a firmware image would likely be put into a custom type. What’s this last entry, “23”? Expanding that node we have a couple of interesting items:

RES_START_DIALOG is a custom string the updater shows when preparing an update, so we’re in the right area!

RES_UPDATE_INFO looks like just binary data — perhaps this is our firmware image? Unfortunately looking at the “Strings” tab in XPEViewer or running the strings utility over this data doesn’t yield anything legible. The firmware image is likely encrypted.
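
A rough way to repeat this check in code is to count runs of printable ASCII, which is roughly what the strings utility reports. This is only a sketch: count_printable_runs is a made-up name, and the minimum run length of 4 in the example matches the GNU strings default rather than anything specific to this updater. A very low count for a multi-hundred-kilobyte blob hints at encryption or compression, but doesn't prove either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count runs of at least `min_len` consecutive printable ASCII bytes
 * (0x20..0x7E). Hypothetical helper, not part of the updater. */
size_t count_printable_runs(const uint8_t *data, size_t len, size_t min_len)
{
    assert(data != NULL || len == 0);

    size_t runs = 0;
    size_t current = 0; /* length of the run we are currently inside */

    for (size_t i = 0; i < len; i++) {
        if (data[i] >= 0x20 && data[i] <= 0x7E) {
            current++;
        } else {
            if (current >= min_len)
                runs++;
            current = 0;
        }
    }
    /* A run that extends to the end of the buffer still counts. */
    if (current >= min_len)
        runs++;
    return runs;
}
```

Calling count_printable_runs(resource, resource_len, 4) on the raw RES_UPDATE_INFO bytes and again on the decrypted output should give very different numbers if the guess about encryption is right.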

Reverse Engineering the Binary

Let’s load the update utility into our disassembler of choice to figure out how the data is encrypted. I’ll be using IDA Pro, but Ghidra (free!), radare2 (free!), or Binary Ninja are all great alternatives. Where possible in this article I’ll try to show my rewritten code in C since it’ll be a closer match to the decompiler and machine code output.

A good starting point is the string we saw above, RES_UPDATE_INFO. Windows applications load resources by calling one of the FindResource* APIs. FindResourceA has the following parameters:

  1. hModule, a handle (HMODULE) to the module to look for the resource in.
  2. lpName, the resource name.
  3. lpType, the resource type.

In our disassembler we can find references to the RES_UPDATE_INFO string and look for calls to FindResourceA with this string as an argument in the lpName position.

We find a match in a function which happens to find/load all of these custom resources under type 23.

We know where the data is loaded by the application, so now we need to see how it’s used. Doing static analysis from this point may be more work than it’s worth if the data isn’t operated on immediately. To speed things up I’m going to use a debugger’s assistance. I used WinDbg’s Time Travel Debugging to record an execution trace of the updater while it updates my radio. TTD is an invaluable tool and I’d highly recommend using it when possible. rr is an alternative for non-Windows platforms.

The decompiler output shows this function copies the RES_UPDATE_INFO resource to a dynamically allocated buffer. The qmemcpy() is inlined and represented by a rep movsd instruction in the disassembly, so we need to break at this instruction and examine the edi register’s (destination address) value. I set a breakpoint by typing bp 0x406968 in the command window, allow the application to continue running, and when it breaks we can see the edi register value is 0x2be5020. We can now set a memory access breakpoint at this address using ba r4 0x2be5020 to break whenever this data is read.

Our breakpoint is hit at 0x4047DC — back to the disassembler. In IDA you can press G and enter this address to jump to it. We’re finally at what looks like the data processing function:

We broke when dereferencing v2 and IDA has automatically named the variable it’s being assigned to as Time. The Time variable is passed to another function which formats it as a string with %Y%m%d%H%M%S. Let’s clean up the variables to reflect what we know:

1bool __thiscall sub_4047B0(char *this)
2{
3 char *encrypted_data; // esi
4 BOOL v3; // ebx
5 char *v4; // eax
6 char *time_string; // [esp+Ch] [ebp-320h] BYREF
7 int v7; // [esp+10h] [ebp-31Ch] BYREF
8 __time64_t Time; // [esp+14h] [ebp-318h] BYREF
9 int (__thiscall **v9)(void *, char); // [esp+1Ch] [ebp-310h]
10 int v10; // [esp+328h] [ebp-4h]
11
12 // rename v2 to encrypted_data
13 encrypted_data = *(char **)(*((_DWORD *)AfxGetModuleState() + 1) + 160);
14 Time = *(int *)encrypted_data;
15 // rename this function and its 2nd parameter
16 format_timestamp(&Time, (int)&time_string, "%Y%m%d%H%M%S");
17 v10 = 1;
18 v7 = 0;
19 v9 = off_4244A0;
20 sub_4082C0(time_string);
21 v3 = sub_408350(encrypted_data + 4, 0x100000, this + 92, 0x100000, &v7) == 0;
22 v4 = time_string - 16;
23 v9 = off_4244A0;
24 v10 = -1;
25 if ( _InterlockedDecrement((volatile signed __int32 *)time_string - 1) <= 0 )
26 (*(void (__stdcall **)(char *))(**(_DWORD **)v4 + 4))(v4);
27 return v3;
28}
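
As a minimal sketch of what lines 14 to 16 appear to do, the snippet below reads the first 4 bytes of the resource as a little-endian integer and formats it with the same format string. The name format_update_timestamp is made up for illustration, and using UTC via gmtime is an assumption on my part: the updater may well use local time. The decompiled code also reads the value as a signed int, which only matters for timestamps past 2038.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read the leading 4 bytes as a little-endian Unix timestamp and format it
 * as YYYYmmddHHMMSS. Returns the number of characters written (14 on
 * success) or 0 on failure. Assumption: UTC, not local time. */
size_t format_update_timestamp(const uint8_t *update_data, char *out, size_t out_len)
{
    assert(update_data != NULL && out != NULL);

    uint32_t raw = (uint32_t)update_data[0]
                 | ((uint32_t)update_data[1] << 8)
                 | ((uint32_t)update_data[2] << 16)
                 | ((uint32_t)update_data[3] << 24);

    time_t t = (time_t)raw;
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return 0;

    return strftime(out, out_len, "%Y%m%d%H%M%S", utc);
}
```

The encrypted payload then starts at update_data + 4, which matches the encrypted_data + 4 argument on line 21.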

The timestamp string is passed to sub_4082C0 on line 20 and the remainder of the update image is passed to sub_408350 on line 21. I’m going to focus on sub_408350 since I only care about the firmware data right now. Based on how this function is called, I’d wager its signature is something like:

status_t sub_408350(uint8_t *input, size_t input_len, uint8_t *output, size_t output_len, size_t *out_data_processed);

Let’s see what it does:

int __stdcall sub_408350(char *a1, int a2, int a3, int a4, _DWORD *a5)
{
  int v5; // edx
  int v7; // ebp
  int v8; // esi
  unsigned int i; // ecx
  char v10; // al
  char *v11; // eax
  int v13; // [esp+10h] [ebp-54h]
  char v14[64]; // [esp+20h] [ebp-44h] BYREF

  v5 = a2;
  v7 = 0;
  memset(v14, 0, sizeof(v14));
  if ( a2 <= 0 )
  {
LABEL_13:
    *a5 = v7;
    return 0;
  }
  else
  {
    while ( 1 )
    {
      v8 = v5;
      if ( v5 >= 8 )
        v8 = 8;
      v13 = v5 - v8;
      for ( i = 0; i < 0x40; i += 8 )
      {
        v10 = *a1;
        v14[i] = (unsigned __int8)*a1 >> 7;
        v14[i + 1] = (v10 & 0x40) != 0;
        v14[i + 2] = (v10 & 0x20) != 0;
        v14[i + 3] = (v10 & 0x10) != 0;
        v14[i + 4] = (v10 & 8) != 0;
        v14[i + 5] = (v10 & 4) != 0;
        v14[i + 6] = (v10 & 2) != 0;
        v14[i + 7] = v10 & 1;
        ++a1;
      }
      sub_407980(v14, 0);
      if ( v8 )
        break;
LABEL_12:
      if ( v13 <= 0 )
        goto LABEL_13;
      v5 = v13;
    }
    v11 = &v14[1];
    while ( 1 )
    {
      --v8;
      if ( v7 >= a4 )
        return -101;
      *(_BYTE *)(a3 + v7++) = v11[6] | (2
                            * (v11[5] | (2
                            * (v11[4] | (2
                            * (v11[3] | (2
                            * (v11[2] | (2
                            * (v11[1] | (2
                            * (*v11 | (2 * *(v11 - 1))))))))))))));
      v11 += 8;
      if ( !v8 )
        goto LABEL_12;
    }
  }
}

I think we’ve found our function that starts decrypting the firmware! To confirm, we want to see what the output parameter’s data looks like before and after this function is called. I set a breakpoint in the debugger at the address where it’s called (bp 0x404842) and put the value of the edi register (0x2d7507c) in WinDbg’s memory window.

Here’s the data before:

After stepping over the function call:

We can dump this data to a file using the following command:

.writemem C:\users\lander\documents\maybe_deobfuscated.bin 0x2d7507c L100000

010 Editor has a built-in strings utility (Search > Find Strings…) and if we scroll down a bit in the results, we have real strings that appear in my radio!

At this point if we were just interested in getting the plaintext firmware we could stop messing with the binary and load the firmware into IDA Pro… but I want to know how this encryption works.

Encryption Details

Just to recap from the last section:

  • We’ve identified our data processing routine (let’s call this function decrypt_update_info).
  • We know that the first 4 bytes of the update data are a Unix timestamp that’s formatted as a string and used for an unknown purpose.
  • We know which function begins decrypting our firmware image.

Data Decryption

Let’s look at the firmware image decryption routine with some renamed variables:

int __thiscall decrypt_data(
        void *this,
        char *encrypted_data,
        int encrypted_data_len,
        char *output_data,
        int output_data_len,
        _DWORD *bytes_written)
{
  int data_len; // edx
  int output_index; // ebp
  int block_size; // esi
  unsigned int i; // ecx
  char encrypted_byte; // al
  char *idata; // eax
  int remaining_data; // [esp+10h] [ebp-54h]
  char inflated_data[64]; // [esp+20h] [ebp-44h] BYREF

  data_len = encrypted_data_len;
  output_index = 0;
  memset(inflated_data, 0, sizeof(inflated_data));
  if ( encrypted_data_len <= 0 )
  {
LABEL_13:
    *bytes_written = output_index;
    return 0;
  }
  else
  {
    while ( 1 )
    {
      block_size = data_len;
      if ( data_len >= 8 )
        block_size = 8;
      remaining_data = data_len - block_size;

      // inflate 1 byte of input data to 8 bytes of its bit representation
      for ( i = 0; i < 0x40; i += 8 )
      {
        encrypted_byte = *encrypted_data;
        inflated_data[i] = (unsigned __int8)*encrypted_data >> 7;
        inflated_data[i + 1] = (encrypted_byte & 0x40) != 0;
        inflated_data[i + 2] = (encrypted_byte & 0x20) != 0;
        inflated_data[i + 3] = (encrypted_byte & 0x10) != 0;
        inflated_data[i + 4] = (encrypted_byte & 8) != 0;
        inflated_data[i + 5] = (encrypted_byte & 4) != 0;
        inflated_data[i + 6] = (encrypted_byte & 2) != 0;
        inflated_data[i + 7] = encrypted_byte & 1;
        ++encrypted_data;
      }
      // do something with the inflated data
      sub_407980(this, inflated_data, 0);
      if ( block_size )
        break;
LABEL_12:
      if ( remaining_data <= 0 )
        goto LABEL_13;
      data_len = remaining_data;
    }
    // deflate the data back to bytes
    idata = &inflated_data[1];
    while ( 1 )
    {
      --block_size;
      if ( output_index >= output_data_len )
        return -101;
      output_data[output_index++] = idata[6] | (2
                                  * (idata[5] | (2
                                  * (idata[4] | (2
                                  * (idata[3] | (2
                                  * (idata[2] | (2
                                  * (idata[1] | (2 * (*idata | (2 * *(idata - 1))))))))))))));
      idata += 8;
      if ( !block_size )
        goto LABEL_12;
    }
  }
}

At a high level this routine:

  1. Allocates a 64-byte scratch buffer.
  2. Checks if there’s any data to process. If not, sets the output variable bytes_written (out_data_processed in the guessed signature earlier) to the number of bytes processed and returns 0x0 (STATUS_SUCCESS).
  3. Loops over the input data in 8-byte chunks and inflates each byte to its bit representation.
  4. After the 8-byte chunk is inflated, calls sub_407980 with the scratch buffer and 0 as arguments.
  5. Loops over the scratch buffer and reassembles 8 sequential bits as 1 byte, then sets the byte at the appropriate index in the output buffer.

Lots going on here, but let’s take a look at step #3. If we take the bytes 0xAA and 0x77, which have bit representations of 0b1010_1010 and 0b0111_0111 respectively, and inflate them to a 16-byte array using the algorithm above, we end up with:

| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |    | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|----|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |    | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |

This routine does this process over 8 bytes at a time and completely fills the 64-byte scratch buffer with 1s and 0s just like the table above.
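
To make steps 3 to 5 concrete, here's a minimal sketch of the same inflate / transform / deflate loop without the gotos. All of the names (inflate_block, deflate_byte, process_blocks, block_fn) are mine rather than anything from the binary, and the transform callback is only a stand-in for sub_407980, which we haven't looked at yet. One deliberate difference: the decompiled loop always reads 8 input bytes even for a short final chunk, while this sketch zero-pads instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 8
#define BLOCK_BITS  (BLOCK_BYTES * 8)

/* Hypothetical stand-in for sub_407980: operates on the 64 inflated bits. */
typedef void (*block_fn)(uint8_t bits[BLOCK_BITS]);

/* Step 3: expand each byte into 8 entries of 0 or 1, most significant bit first. */
void inflate_block(const uint8_t in[BLOCK_BYTES], uint8_t bits[BLOCK_BITS])
{
    for (size_t i = 0; i < BLOCK_BYTES; i++)
        for (size_t b = 0; b < 8; b++)
            bits[i * 8 + b] = (uint8_t)((in[i] >> (7 - b)) & 1);
}

/* Step 5: pack 8 consecutive 0/1 entries back into one byte. */
uint8_t deflate_byte(const uint8_t *bits)
{
    uint8_t value = 0;
    for (size_t b = 0; b < 8; b++)
        value = (uint8_t)((value << 1) | (bits[b] & 1));
    return value;
}

/* Returns 0 on success or -101 if the output buffer is too small,
 * mirroring the return values seen in the decompiled routine. */
int process_blocks(const uint8_t *in, size_t in_len,
                   uint8_t *out, size_t out_len,
                   size_t *bytes_written, block_fn transform)
{
    assert(in != NULL && out != NULL && bytes_written != NULL);

    size_t written = 0;
    while (in_len > 0) {
        size_t chunk = in_len < BLOCK_BYTES ? in_len : BLOCK_BYTES;
        uint8_t padded[BLOCK_BYTES] = {0};
        uint8_t bits[BLOCK_BITS];

        memcpy(padded, in, chunk);      /* zero-pad a short final chunk */
        inflate_block(padded, bits);    /* step 3 */
        if (transform != NULL)
            transform(bits);            /* step 4 (placeholder) */

        for (size_t i = 0; i < chunk; i++) {   /* step 5 */
            if (written >= out_len)
                return -101;
            out[written++] = deflate_byte(&bits[i * 8]);
        }
        in += chunk;
        in_len -= chunk;
    }
    *bytes_written = written;
    return 0;
}
```

Passing NULL as the transform round-trips the input unchanged, which is a handy sanity check that the inflate and deflate halves agree before plugging in a real block function.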

Now let’s look at step #4 and see what’s going on in sub_407980:

1_BYTE *__thiscall sub_407980(void *this, _BYTE *a2, int a3)
2{
3 // long list of stack vars removed for clarity
4
5 v3 = (int)this;
6 v4 = 15;
7 v5 = a3;
8 v32[0] = (int)this;
9 v28 = 0;
10 v31 = 15;
11 do
12 {
13 for ( i = 0; i < 48; *((_BYTE *)&v33 + i + 3) = v18 )
14 {
15 v7 = v28;
16 if ( !v5 )
17 v7 = v4;
18 v8 = *(_BYTE *)(i + 48 * v7 + v3 + 4) ^ a2[(unsigned __int8)byte_424E50[i] + 31];
19 v9 = v28;
20 *(&v34 + i) = v8;
21 if ( !v5 )
22 v9 = v4;
23 v10 = *(_BYTE *)(i + 48 * v9 + v3 + 5) ^ a2[(unsigned __int8)byte_424E51[i] + 31];
24 v11 = v28;
25 *(&v35 + i) = v10;
26 if ( !v5 )
27 v11 = v4;
28 v12 = *(_BYTE *)(i + 48 * v11 + v3 + 6) ^ a2[(unsigned __int8)byte_424E52[i] + 31];
29 v13 = v28;
30 *(&v36 + i) = v12;
31 if ( !v5 )
32 v13 = v4;
33 v14 = *(_BYTE *)(i + 48 * v13 + v3 + 7) ^ a2[(unsigned __int8)byte_424E53[i] + 31];
34 v15 = v28;
35 v38[i - 1] = v14;
36 if ( !v5 )
37 v15 = v4;
38 v16 = *(_BYTE *)(i + 48 * v15 + v3 + 8) ^ a2[(unsigned __int8)byte_424E54[i] + 31];
39 v17 = v28;
40 v38[i] = v16;
41 if ( !v5 )
42 v17 = v4;
43 v18 = *(_BYTE *)(i + 48 * v17 + v3 + 9) ^ a2[(unsigned __int8)byte_424E55[i] + 31];
44 i += 6;
45 }
46 v32[1] = *(int *)((char *)&dword_424E80
47 + (((unsigned __int8)v38[0] + 2) | (32 * v34 + 2) | (16 * (unsigned __int8)v38[1] + 2) | (8 * v35 + 2) | (4 * v36 + 2) | (2 * v37 + 2)));
48 v32[2] = *(int *)((char *)&dword_424F80
49 + (((unsigned __int8)v38[6] + 2) | (32 * (unsigned __int8)v38[2] + 2) | (16
50 * (unsigned __int8)v38[7]
51 + 2) | (8
52 * (unsigned __int8)v38[3]
53 + 2) | (4 * (unsigned __int8)v38[4] + 2) | (2 * (unsigned __int8)v38[5] + 2)));
54 v32[3] = *(int *)((char *)&dword_425080
55 + (((unsigned __int8)v38[12] + 2) | (32 * (unsigned __int8)v38[8] + 2) | (16
56 * (unsigned __int8)v38[13]
57 + 2) | (8 * (unsigned __int8)v38[9]
58 + 2) | (4 * (unsigned __int8)v38[10] + 2) | (2 * (unsigned __int8)v38[11] + 2)));
59 v32[4] = *(int *)((char *)&dword_425180
60 + (((unsigned __int8)v38[18] + 2) | (32 * (unsigned __int8)v38[14] + 2) | (16
61 * (unsigned __int8)v38[19]
62 + 2) | (8 * (unsigned __int8)v38[15] + 2) | (4 * (unsigned __int8)v38[16] + 2) | (2 * (unsigned __int8)v38[17] + 2)));
63 v32[5] = *(int *)((char *)&dword_425280
64 + (((unsigned __int8)v38[24] + 2) | (32 * (unsigned __int8)v38[20] + 2) | (16
65 * (unsigned __int8)v38[25]
66 + 2) | (8 * (unsigned __int8)v38[21] + 2) | (4 * (unsigned __int8)v38[22] + 2) | (2 * (unsigned __int8)v38[23] + 2)));
67 v32[6] = *(int *)((char *)&dword_425380
68 + (((unsigned __int8)v38[30] + 2) | (32 * (unsigned __int8)v38[26] + 2) | (16
69 * (unsigned __int8)v38[31]
70 + 2) | (8 * (unsigned __int8)v38[27] + 2) | (4 * (unsigned __int8)v38[28] + 2) | (2 * (unsigned __int8)v38[29] + 2)));
71 v32[7] = *(int *)((char *)&dword_425480
72 + (((unsigned __int8)v38[36] + 2) | (32 * (unsigned __int8)v38[32] + 2) | (16
73 * (unsigned __int8)v38[37]
74 + 2) | (8 * (unsigned __int8)v38[33] + 2) | (4 * (unsigned __int8)v38[34] + 2) | (2 * (unsigned __int8)v38[35] + 2)));
75 v19 = (char *)(&unk_425681 - (_UNKNOWN *)a2);
76 v20 = &unk_425680 - (_UNKNOWN *)a2;
77 v33 = *(int *)((char *)&dword_425580
78 + (((unsigned __int8)v38[42] + 2) | (32 * (unsigned __int8)v38[38] + 2) | (16
79 * (unsigned __int8)v38[43]
80 + 2) | (8
81 * (unsigned __int8)v38[39]
82 + 2) | (4 * (unsigned __int8)v38[40] + 2) | (2 * (unsigned __int8)v38[41] + 2)));
83 result = a2;
84 if ( v4 <= 0 )
85 {
86 v30 = 8;
87 do
88 {
89 *result ^= *((_BYTE *)v32 + (unsigned __int8)result[v20] + 3);
90 result[1] ^= *((_BYTE *)v32 + (unsigned __int8)v19[(_DWORD)result] + 3);
91 result[2] ^= *((_BYTE *)v32 + (unsigned __int8)result[&unk_425682 - (_UNKNOWN *)a2] + 3);
92 result[3] ^= *((_BYTE *)v32 + (unsigned __int8)result[byte_425683 - a2] + 3);
93 result += 4;
94 --v30;
95 }
96 while ( v30 );
97 }
98 else
99 {
100 v29 = 8;
101 do
102 {
103 v24 = result[32];
104 v22 = *result ^ *((_BYTE *)v32 + (unsigned __int8)result[v20] + 3);
105 result += 4;
106 result[28] = v22;
107 *(result - 4) = v24;
108 v25 = result[29];
109 result[29] = *(result - 3) ^ *((_BYTE *)v32 + (unsigned __int8)result[(_DWORD)v19 - 4] + 3);
110 *(result - 3) = v25;
111 v26 = result[30];
112 result[30] = *(result - 2) ^ *((_BYTE *)v32 + (unsigned __int8)result[&unk_425682 - (_UNKNOWN *)a2 - 4] + 3);
113 *(result - 2) = v26;
114 v27 = result[31];
115 result[31] = *(result - 1) ^ *((_BYTE *)v32 + (unsigned __int8)result[byte_425683 - a2 - 4] + 3);
116 *(result - 1) = v27;
117 --v29;
118 }
119 while ( v29 );
120 }
121 v5 = a3;
122 v3 = v32[0];
123 v4 = v31 - 1;
124 v23 = v31 - 1 <= -1;
125 ++v28;
126 --v31;
127 }
128 while ( !v23 );
129 return result;
130}

Oof. This is substantially more complicated but looks like the meat of the decryption algorithm. We’ll refer to this function, sub_407980, as decrypt_data from here on out. We can see what may be an immediate roadblock: this function takes in a C++ this pointer (line 5) and performs bitwise operations on one of its members (line 18, 23, etc.). For now let’s call this class member key and come back to it later.

This function is the perfect example of decompilers emitting less than ideal code as a result of compiler optimizations/code reordering. For me, TTD was essential for following how data flows through this function. It took a few hours of banging my head against IDA and WinDbg to understand, but this function can be broken up into 3 high-level phases:

  1. Build a 48-byte buffer containing our key material XOR’d with input data selected through a static index table.
    int v33;
    unsigned __int8 v34; // [esp+44h] [ebp-34h]
    unsigned __int8 v35; // [esp+45h] [ebp-33h]
    unsigned __int8 v36; // [esp+46h] [ebp-32h]
    unsigned __int8 v37; // [esp+47h] [ebp-31h]
    char v38[44]; // [esp+48h] [ebp-30h]

    v3 = (int)this;
    v4 = 15;
    v5 = a3;
    v32[0] = (int)this;
    v28 = 0;
    v31 = 15;
    do
    {
        // The end statement of this loop is strange -- it's writing a byte somewhere? come back
        // to this later
        for ( i = 0; i < 48; *((_BYTE *)&v33 + i + 3) = v18 )
        {
            // v28 Starts at 0 but is incremented by 1 during each iteration of the outer `while` loop
            v7 = v28;
            // v5 is our last argument which was 0
            if ( !v5 )
                // overwrite v7 with v4, which begins at 15 but is decremented by 1 during each iteration
                // of the outer `while` loop
                v7 = v4;
            // left-hand side of the xor, *(_BYTE *)(i + 48 * v7 + v3 + 4)
            // v3 in this context is our `this` pointer + 4, giving us *(_BYTE *)(i + (48 * v7) + this->maybe_key)
            // so the left-hand side of the xor is likely indexing into our key material:
            // this->maybe_key[i + 48 * loop_multiplier]
            //
            // right-hand side of the xor, a2[(unsigned __int8)byte_424E50[i] + 31]
            // a2 is our input encrypted data, and byte_424E50 is some static data
            //
            // this full statement can be rewritten as:
            // v8 = this->maybe_key[i + 48 * loop_multiplier] ^ encrypted_data[byte_424E50[i] + 31]
            v8 = *(_BYTE *)(i + 48 * v7 + v3 + 4) ^ a2[(unsigned __int8)byte_424E50[i] + 31];

            v9 = v28;

            // write the result of `key_data ^ input_data` to a scratch buffer (v34)
            // v34 looks to be declared as the wrong type. v33 is actually a 52-byte buffer
            *(&v34 + i) = v8;

            // repeat the above 5 more times
            if ( !v5 )
                v9 = v4;
            v10 = *(_BYTE *)(i + 48 * v9 + v3 + 5) ^ a2[(unsigned __int8)byte_424E51[i] + 31];
            v11 = v28;
            *(&v35 + i) = v10;

            // snip

            // v18 gets written to the scratch buffer at the end of the loop...
            v18 = *(_BYTE *)(i + 48 * v17 + v3 + 9) ^ a2[(unsigned __int8)byte_424E55[i] + 31];

            // this was probably the *real* last statement of the for-loop
            // i.e. for (int i = 0; i < 48; i += 6)
            i += 6;
        }
  2. Building a 32-byte buffer from an 0x800-byte static table, using indices derived from the buffer in step #1, then combining this 32-byte buffer with the 48-byte buffer from step #1.
    // dword_424E80 -- some static data
    // (unsigned __int8)v38[0] + 2) -- the original decompiler output has this wrong.
    // v33 should be a 52-byte buffer which consumes v38, so v38 is actually data set up in
    // the loop above.
    // (32 * v34 + 2) -- v34 should be some data from the above loop as well. This looks like
    // a binary shift optimization
    // repeat with different multipliers...
    //
    // This can be simplified as:
    // size_t index = ((v34 << 5) + 2)
    //     | ((v38[1] << 4) + 2)
    //     | ((v35 << 3) + 2)
    //     | ((v36 << 2) + 2)
    //     | ((v37 << 1) + 2)
    //     | (v38[0] + 2)
    // v32[1] = *(int*)&(((char*)&dword_424e80)[index])
    v32[1] = *(int *)((char *)&dword_424E80
        + (((unsigned __int8)v38[0] + 2) | (32 * v34 + 2) | (16 * (unsigned __int8)v38[1] + 2) | (8 * v35 + 2) | (4 * v36 + 2) | (2 * v37 + 2)));
    // repeat 7 times. each time the reference to dword_424e80 is shifted forward by 0x100.
    // note: if you do the math, the next line uses dword_424e80[64]. The 64 is misleading:
    // dword_424e80 is declared as an int array, not a char array, so index 64 is byte offset 0x100.
  3. Iterating over the next 8 bytes of the output buffer. For each byte index of the output buffer, index into yet another static 32-byte buffer and use that as the index into the table from step #2. XOR this value with the value at the current index of the output buffer.
    // Not really sure why this calculation works like this. It ends up just being `unk_425681`'s address
    // when it's used.
    v19 = (char *)(&unk_425681 - (_UNKNOWN *)a2);
    v20 = &unk_425680 - (_UNKNOWN *)a2;

    // v4 is a number that's decremented on every iteration -- possibly bytes remaining?
    if ( v4 <= 0 )
    {
        // Loop over 8 bytes
        v30 = 8;
        do
        {
            // Start XORing the output bytes with some of the data generated in step 2.
            //
            // Cheating here and doing the "draw the rest of the owl", but if you observe that
            // we use `unk_425680` (v20), `unk_425681` (v19), `unk_425682`, and byte_425683,
            // the decompiler generated suboptimal code. We can simplify to be relative to just
            // `unk_425680`
            //
            // *result ^= step2_bytes[unk_425680[output_index] - 1]
            *result ^= *((_BYTE *)v32 + (unsigned __int8)result[v20] + 3);

            // result[1] ^= step2_bytes[unk_425680[output_index + 1] - 1]
            result[1] ^= *((_BYTE *)v32 + (unsigned __int8)v19[(_DWORD)result] + 3);

            // result[2] ^= step2_bytes[unk_425680[output_index + 2] - 1]
            result[2] ^= *((_BYTE *)v32 + (unsigned __int8)result[&unk_425682 - (_UNKNOWN *)a2] + 3);

            // result[3] ^= step2_bytes[unk_425680[output_index + 3] - 1]
            result[3] ^= *((_BYTE *)v32 + (unsigned __int8)result[byte_425683 - a2] + 3);
            // Move our pointer to the output buffer forward by 4 bytes
            result += 4;
            --v30;
        }
        while ( v30 );
    }
    else
    {
        // loop over 8 bytes
        v29 = 8;
        do
        {
            // grab the byte at 0x20, we're swapping this later
            v24 = result[32];

            // v22 = *result ^ step2_bytes[unk_425680[output_index] - 1]
            v22 = *result ^ *((_BYTE *)v32 + (unsigned __int8)result[v20] + 3);

            // I'm not sure why the output buffer pointer is incremented here, but
            // this really makes the code ugly
            result += 4;

            // Write the byte generated above to offset 0x20 (0x1c relative to the advanced pointer)
            result[28] = v22;
            // Write the byte at 0x20 to offset 0
            *(result - 4) = v24;

            // rinse, repeat with slightly different offsets each time...
            v25 = result[29];
            result[29] = *(result - 3) ^ *((_BYTE *)v32 + (unsigned __int8)result[(_DWORD)v19 - 4] + 3);
            *(result - 3) = v25;
            v26 = result[30];
            result[30] = *(result - 2) ^ *((_BYTE *)v32 + (unsigned __int8)result[&unk_425682 - (_UNKNOWN *)a2 - 4] + 3);
            *(result - 2) = v26;
            v27 = result[31];
            result[31] = *(result - 1) ^ *((_BYTE *)v32 + (unsigned __int8)result[byte_425683 - a2 - 4] + 3);
            *(result - 1) = v27;
            --v29;
        }
        while ( v29 );
    }

I think the inner loop in the else branch above is kind of nasty, so here it is reimplemented in Rust:

    for _ in 0..8 {
        // we swap the `first` index with the `second`
        for (first, second) in (0x1c..=0x1f).zip(0..4) {
            let original_byte_idx = first + output_offset + 4;

            let original_byte = outbuf[original_byte_idx];

            let constant = unk_425680[output_offset + second] as usize;

            let new_byte = outbuf[output_offset + second] ^ generated_bytes_from_step2[constant - 1];

            let new_idx = original_byte_idx;
            outbuf[new_idx] = new_byte;
            outbuf[output_offset + second] = original_byte;
        }

        output_offset += 4;
    }
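Phase 1 can be sketched in the same style. This is only a rough illustration of the XOR described in the decompiler comments, not the author's implementation: `phase1_xor`, `round_keys`, `select_table` and `reverse_keys` are hypothetical names, `select_table` stands in for `byte_424E50`, and the 16 x 48 byte key layout is assumed from the `48 * v7` indexing.

```rust
/// Phase 1 sketch: XOR one 48-byte round key with 48 bytes selected from the
/// inflated (one bit per byte) input block.
///
/// `round_keys`   : 16 * 48 bytes, the key member filled in by set_key (assumed layout).
/// `select_table` : stand-in for byte_424E50; entries are assumed to be 1..=32.
/// `reverse_keys` : mirrors the `a3 == 0` case, which walks the keys from 15 down to 0.
fn phase1_xor(
    round_keys: &[u8],
    round: usize,
    reverse_keys: bool,
    block_bits: &[u8; 64],
    select_table: &[u8; 48],
) -> [u8; 48] {
    let key_round = if reverse_keys { 15 - round } else { round };
    let mut scratch = [0u8; 48];
    for i in 0..48 {
        let key_byte = round_keys[48 * key_round + i];
        // Corresponds to a2[byte_424E50[i] + 31] in the decompiler output.
        let data_byte = block_bits[select_table[i] as usize + 31];
        scratch[i] = key_byte ^ data_byte;
    }
    scratch
}
```

Note that the `+ 31` means only the second half of the 64-byte block (indices 32 to 63) is ever read here, assuming the table really is 1-based.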

Key Setup

We now need to figure out how our key is set up for usage in the decrypt_data function above. My approach here is to set a breakpoint at the first instruction to use the key data in decrypt_data, which happens to be xor bl, [ecx + esi + 4] at 0x4079d3. I know this is where we should break because in the decompiler output the left-hand side of the XOR operation, the key material, will be the second operand in the xor instruction. As a reminder, the decompiler shows the XOR as:

v8 = *(_BYTE *)(i + 48 * v7 + v3 + 4) ^ a2[(unsigned __int8)byte_424E50[i] + 31];

The breakpoint is hit and the address we’re loading from is 0x19f5c4. We can now lean on TTD to help us figure out where this data was last written. Set a 1-byte memory write breakpoint at this address using ba w1 0x19f5c4 and press the Go Back button. If you’ve never used TTD before, this operates exactly as Go would except backwards in the program’s trace. In this case it will execute backward until either a breakpoint is hit, interrupt is generated, or we reach the start of the program.

Our memory write breakpoint gets triggered at 0x4078fb — a function we haven’t seen before. The callstack shows that it’s called not terribly far from the decrypt_update_info routine!

  • set_key (we are here — function is originally called sub_407850)
  • sub_4082c0
  • decrypt_update_info

What’s sub_4082c0?

Not a lot to see here except the same function called 4 times: initially with the timestamp string as the argument in position 0 plus a 64-byte buffer, and then with each call using the return value of the previous one as its input. The function our debugger just broke into takes only 1 argument, which is the 64-byte buffer used across all of these function calls. So what’s going on in sub_407e80?

There are bitwise operations that look suspiciously similar to the byte-to-bit inflation we saw above with the firmware data. After renaming things and performing some loop unrolling, things look like this:

    // sub_407e80
    char *inflate_timestamp(void *this, char *timestamp_str, uint8_t *output, uint8_t *key) {
        for (size_t output_idx = 0; output_idx < 8; output_idx++) {
            // Once the NUL terminator is reached the pointer stops advancing,
            // so any remaining bytes are treated as zero.
            uint8_t ts_byte = *timestamp_str;
            if (ts_byte) {
                timestamp_str += 1;
            }

            // Inflate the byte to 8 bit-bytes (most significant bit first) and
            // XOR them into the output buffer.
            for (int bit_idx = 0; bit_idx < 8; bit_idx++) {
                uint8_t bit_value = (ts_byte >> (7 - bit_idx)) & 1;
                output[(output_idx * 8) + bit_idx] ^= bit_value;
            }
        }

        set_key(this, key);
        decrypt_data(this, output, 1);

        // Return the unconsumed remainder of the string for the next call.
        return timestamp_str;
    }

    // sub_4082c0
    void set_key_to_timestamp(void *this, char *timestamp_str) {
        uint8_t key_buf[64];
        memset(key_buf, 0, sizeof(key_buf));

        char *str_ptr = inflate_timestamp(this, timestamp_str, key_buf, &static_key_1);
        str_ptr = inflate_timestamp(this, str_ptr, key_buf, &static_key_2);
        str_ptr = inflate_timestamp(this, str_ptr, key_buf, &static_key_3);
        inflate_timestamp(this, str_ptr, key_buf, &static_key_4);

        set_key(this, key_buf);
    }
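For comparison, the inflation half of `inflate_timestamp` might look like this in Rust. It is a sketch under assumptions: `xor_inflate_chunk` is a hypothetical name, the string is passed as a byte slice without its NUL terminator, and the `set_key`/`decrypt_data` calls are left out.

```rust
/// XOR up to 8 bytes of `ts` into `bits` (one bit per byte, most significant
/// bit first) and return the part of `ts` that was not consumed.
fn xor_inflate_chunk<'a>(ts: &'a [u8], bits: &mut [u8; 64]) -> &'a [u8] {
    let mut rest = ts;
    for byte_idx in 0..8 {
        // When the input runs out we keep feeding zero bytes, matching the C
        // code, which stops advancing once it reaches the NUL terminator.
        let byte = match rest.split_first() {
            Some((&b, tail)) => {
                rest = tail;
                b
            }
            None => 0,
        };
        for bit_idx in 0..8 {
            bits[byte_idx * 8 + bit_idx] ^= (byte >> (7 - bit_idx)) & 1;
        }
    }
    rest
}
```

Returning the remaining slice plays the role of the returned `timestamp_str` pointer, so four chained calls can walk a string of up to 32 characters.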

The only mystery now is the set_key routine:

    int __thiscall set_key(char *this, const void *a2)
    {
        _DWORD *v2; // ebp
        char *v3; // edx
        char v4; // al
        char v5; // al
        char v6; // al
        char v7; // al
        int result; // eax
        char v10[56]; // [esp+Ch] [ebp-3Ch] BYREF

        qmemcpy(v10, a2, sizeof(v10));
        v2 = &unk_424DE0;
        v3 = this + 5;
        do
        {
            v4 = v10[0];
            qmemcpy(v10, &v10[1], 0x1Bu);
            v10[27] = v4;
            v5 = v10[28];
            qmemcpy(&v10[28], &v10[29], 0x1Bu);
            v10[55] = v5;
            if ( *v2 == 2 )
            {
                v6 = v10[0];
                qmemcpy(v10, &v10[1], 0x1Bu);
                v10[27] = v6;
                v7 = v10[28];
                qmemcpy(&v10[28], &v10[29], 0x1Bu);
                v10[55] = v7;
            }
            for ( result = 0; result < 48; result += 6 )
            {
                v3[result - 1] = v10[(unsigned __int8)byte_424E20[result] - 1];
                v3[result] = v10[(unsigned __int8)byte_424E21[result] - 1];
                v3[result + 1] = v10[(unsigned __int8)byte_424E22[result] - 1];
                v3[result + 2] = v10[(unsigned __int8)byte_424E23[result] - 1];
                v3[result + 3] = v10[(unsigned __int8)byte_424E24[result] - 1];
                v3[result + 4] = v10[(unsigned __int8)byte_424E25[result] - 1];
            }
            ++v2;
            v3 += 48;
        }
        while ( (int)v2 < (int)byte_424E20 );
        return result;
    }

This function is a bit more straightforward to reimplement:

    void set_key(void *this, uint8_t *key) {
        uint8_t scrambled_key[56];
        memcpy(scrambled_key, key, sizeof(scrambled_key));

        for (size_t round = 0; round < 16; round++) {
            size_t swap_rounds = 1;
            if (((uint32_t*)GLOBAL_KEY_ROUNDS_CONFIG)[round] == 2) {
                swap_rounds = 2;
            }

            // Rotate each 28-byte half left by one position, once or twice.
            // Source and destination overlap, so memmove is used instead of memcpy.
            for (size_t n = 0; n < swap_rounds; n++) {
                uint8_t temp = scrambled_key[0];
                memmove(&scrambled_key[0], &scrambled_key[1], 27);
                scrambled_key[27] = temp;

                temp = scrambled_key[28];
                memmove(&scrambled_key[28], &scrambled_key[29], 27);
                scrambled_key[55] = temp;
            }

            // Pick 48 of the 56 bytes via the 1-based swap table.
            // `this->key` is the key member at offset 4 of the object.
            for (size_t swap_idx = 0; swap_idx < 48; swap_idx++) {
                size_t scrambled_key_idx = GLOBAL_KEY_SWAP_TABLE[swap_idx] - 1;

                size_t persistent_key_idx = swap_idx + (round * 48);
                this->key[persistent_key_idx] = scrambled_key[scrambled_key_idx];
            }
        }
    }
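The same routine is fairly compact in Rust, where `rotate_left` on a slice replaces the copy-and-patch dance. This is a minimal sketch rather than the code from the author's tool: `expand_key`, `shifts` and `pick` are hypothetical names, with `shifts` standing in for the 16 DWORDs at `unk_424DE0` and `pick` for the 1-based table at `byte_424E20`.

```rust
/// Sketch of set_key: derive 16 * 48 round-key bytes from a 56-byte key that
/// holds one bit per byte.
fn expand_key(key_bits: &[u8; 56], shifts: &[u32; 16], pick: &[u8; 48]) -> Vec<u8> {
    let mut state = *key_bits;
    let mut round_keys = Vec::with_capacity(16 * 48);
    for &shift in shifts {
        // Each 28-byte half is rotated left once, or twice when the table says 2.
        let n = if shift == 2 { 2 } else { 1 };
        state[..28].rotate_left(n);
        state[28..].rotate_left(n);
        // Select 48 of the 56 bytes; table entries are assumed to be 1..=56.
        round_keys.extend(pick.iter().map(|&p| state[p as usize - 1]));
    }
    round_keys
}
```

The rotations accumulate across rounds, so round N sees the key rotated by the sum of the first N + 1 shift values.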

Putting Everything Together

  1. Update data is read from resources
  2. The first 4 bytes of the update data are a Unix timestamp
  3. The timestamp is formatted as a string, has each byte inflated to its bit representation, and decrypted using some static key material as the key. This is repeated 4 times with the output of the previous run used as an input to the next.
  4. The resulting data from step 3 is used as a key for decrypting data.
  5. The remainder of the firmware update image is inflated to its bit representation 8 bytes at a time and uses the dynamic key and 3 other unique static lookup tables to transform the inflated input data.
  6. The result from step 5 is deflated back into its byte representation.

My decryption utility which completely reimplements this magic in Rust can be found at https://github.com/landaire/porkchop.

Loading the Firmware in IDA Pro

IDA thankfully supports disassembling the Hitachi/Renesas H8SX architecture. If we load our firmware into IDA and select the “Hitachi H8SX advanced” processor type, use the default options for the “Disassembly memory organization” dialog, then finally choose “H8S/2215R” in the “Choose the device name” dialog…:

We don’t have shit. I’m not an embedded systems expert, but my friend suggested that the first few DWORDs look like they may belong to a vector table. If we right-click address 0 and select “Double word 0x142A”, we can click on the new variable unk_142A to go to its location. Press C at this location to define it as Code, then press P to create a function at this address:

We can now reverse engineer our firmware 🙂
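The vector table guess can also be sanity-checked outside of IDA. The sketch below assumes 32-bit big-endian entries starting at offset 0 and an image mapped at address 0; those are assumptions on my part rather than something confirmed above, and `read_vector_table` and `plausible_entry_points` are hypothetical helpers.

```rust
/// Read the first `count` 32-bit big-endian words of a firmware image.
fn read_vector_table(firmware: &[u8], count: usize) -> Vec<u32> {
    firmware
        .chunks_exact(4)
        .take(count)
        .map(|w| u32::from_be_bytes([w[0], w[1], w[2], w[3]]))
        .collect()
}

/// Keep only the entries that point inside the image: a cheap plausibility
/// filter for candidate code addresses.
fn plausible_entry_points(vectors: &[u32], image_len: usize) -> Vec<u32> {
    let mut out: Vec<u32> = vectors
        .iter()
        .copied()
        .filter(|&v| v != 0 && (v as usize) < image_len)
        .collect();
    out.sort_unstable();
    out.dedup();
    out
}
```

If most of the leading words survive the filter, they are decent candidates for the same "define as code, create function" treatment.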

Sydney Man Sentenced for Blackmailing Optus Customers After Data Breach

Feb 08, 2023 | Ravie Lakshmanan | Cyber Crime / SMS Fraud

A Sydney man has been sentenced to an 18-month Community Correction Order (CCO) and 100 hours of community service for attempting to take advantage of the Optus data breach last year to blackmail its customers.

The unnamed individual, 19 when arrested in October 2022 and now 20, used the leaked records stolen from the security lapse to orchestrate an SMS-based extortion scheme.

The suspect contacted dozens of victims, threatening that their personal information would be sold to other hackers and “used for fraudulent activity” unless a payment of AU$2,000 was made to a bank account under their control.

The scammer is said to have sent the SMS messages to 92 individuals whose information was part of a larger cache of 10,200 records that was briefly published in a criminal forum in September 2022.

The Australian Federal Police (AFP), which launched Operation Guardian following the breach, said there is no evidence that any of the affected customers transferred the demanded amount.

Following his arrest, the offender pleaded guilty in November 2022 to two counts of using a telecommunications network with intent to commit a serious offense.

“The criminal use of stolen data is a serious offense and has the potential to cause significant harm to the community,” AFP Commander Chris Goldsmid said.

The Australian telecom service provider suffered a massive hack last year, with passport information and Medicare numbers pertaining to nearly 2.1 million of its current and former customers exposed.
