
Monday, December 28, 2020

Introducing ios_apt - iOS Artifact Parsing Tool

ios_apt is the new shiny companion to mac_apt

ios_apt is not a separate project; it's part of the mac_apt framework, serving as a launch script that processes iOS/iPadOS artifacts.

Why yet another iOS parsing tool? Don't we already have too many?

In addition to paid tools, we have iLEAPP, APOLLO and a few others, and I am an active contributor to some of them. ios_apt isn't meant to compete with them; rather, it utilizes the mac_apt framework to avoid duplicating work.

Many artifacts on iOS and macOS share common backend databases, configuration and artifact types. Among the artifacts that are almost identical are -

  • Spotlight
  • UnifiedLogging logs
  • Network usage database
  • Networking artifacts like hardware info and last IP leases
  • Safari
  • Notes
  • FSevents
  • ScreenTime

There are a few others too that aren't listed here, but you get the picture. Since mac_apt already parses all of them, it made sense to create an iOS variant that parses these from iOS extractions.

Also, many of these artifacts are fairly complex, and other FOSS tools don't have the architecture needed to handle them. APOLLO only gathers information from SQLite databases. iLEAPP is geared towards single-artifact parsing per plugin. Neither is designed for multiple layers of parsing, where information parsed from one artifact/file may be used as a key to jump to an artifact elsewhere on disk.

Limitations

In its first version, ios_apt only works on full file system images extracted out to a folder. No support yet for zip/tar/dar/7z/other archives.

Available Plugins / Modules

The following Plugins are available as of now -

  • APPS
  • BASICINFO
  • FSEVENTS
  • NETUSAGE
  • NETWORKING
  • NOTES
  • SAFARI
  • SCREENTIME
  • SPOTLIGHT
  • TERMSESSIONS
  • WIFI

Download the latest version of mac_apt to get ios_apt.

Sunday, July 19, 2020

KTX to PNG in Python for iOS snapshots

App snapshots on iOS are stored as KTX files; this is fairly well known at this point, thanks to the research by Geraldine Blay (@i_am_the_gia) and Alex Brignoni (@AlexisBrignoni) here and here. They even came up with a way to collect and convert them to PNG format. However, that solution was macOS-only, and hence this research.

KTX

KTX is a file format used to store textures, commonly used by OpenGL programs. The KTX file format specification is available here. There aren't many standalone utilities that work with KTX files, as the format is mostly used in games rather than for reading/distributing standalone textures. There are also no readily available python libraries to read it! The Khronos Group, which created the format, distributes libktx, but it is C++ only. Even so, it would not be able to read iOS-created ktx files (for reasons explained below). The few Windows applications I could find, like PicoPixel, would not recognize Apple-created KTX files as valid.

So what is different here? A quick glance over the file in a hex editor showed that the texture data is stored in LZFSE-compressed form, which currently only macOS/iOS can read.

Figure - ASCII view of 010 hex editor with KTX template

Now, using pyliblzfse, I could decompress the data and recreate a new KTX file with raw texture data. Even so, it would not render in KTX viewers other than macOS's Finder/Quick Look and Preview. So I tried a few different approaches to get to the data.

Attempt 1 - Rendering & Export

Textures are different from 2D images, so there is no direct conversion from a texture to an image format. From available research, it seemed the easiest way would be to use OpenGL to render the texture data (extracted from the KTX file), then use OpenGL to save a 2D image of the rendered result. There is some sample code available on the internet, but to get it to work, one would need to know how to use OpenGL to render textures, a learning curve that was too steep for me.

After spending several hours trying to get this to work in Python, I ultimately gave up. Python is not the platform where major OpenGL development takes place, so there is little to no support, and the libraries are platform dependent. I barely got one of the libraries to install correctly on Linux, and every step of the way produced more errors than I wanted to debug, so I threw in the towel.

Attempt 2 - Convert texture data to RAW image data

Reading the KTX file header, the glInternalFormat field is 0x93B0 for all iOS-produced KTX files (as seen in the screenshot above). This value is the enumeration for COMPRESSED_RGBA_ASTC_4x4. So now we know the format is ASTC (Adaptive Scalable Texture Compression), a lossy compressed format for storing texture data, here with a block size of 4x4 pixels. That simplifies our task to finding a way to convert ASTC data to raw image data. A bit of searching led me to the python library astc_decomp, which does precisely that. So what I needed now was to put the pieces together as follows:

  1. Read the KTX file and parse the header to get the LZFSE-compressed data and the Width and Height parameters
  2. Decompress LZFSE to get ASTC data
  3. Convert ASTC to RAW image stream
  4. Save RAW image as PNG using PIL library

Combining these steps, we can create a python script that converts KTX files to PNG files. Get it here:
https://github.com/ydkhatri/MacForensics/tree/master/IOS_KTX_TO_PNG
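
For illustration, here is a minimal sketch of that pipeline. It assumes a standard KTX1 header layout (little-endian) and that the LZFSE-compressed payload sits where the first mip level's image data normally does; the tested, more robust script is in the repository above.

import struct
import liblzfse          # pip install pyliblzfse
import astc_decomp       # registers an 'astc' decoder with PIL on import
from PIL import Image

def ktx_to_png(ktx_path, png_path):
    with open(ktx_path, 'rb') as f:
        data = f.read()
    # KTX1 header = 12-byte identifier followed by 13 uint32 fields
    gl_internal_format, = struct.unpack_from('<I', data, 28)
    width, height = struct.unpack_from('<2I', data, 36)
    kv_bytes, = struct.unpack_from('<I', data, 60)   # bytesOfKeyValueData
    if gl_internal_format != 0x93B0:                 # COMPRESSED_RGBA_ASTC_4x4
        raise ValueError('Not an ASTC 4x4 texture')
    # Skip the 64-byte header, the key/value data and the 4-byte imageSize field
    astc_data = liblzfse.decompress(data[64 + kv_bytes + 4:])
    # Decoder args are (block width, block height, sRGB flag)
    img = Image.frombytes('RGBA', (width, height), astc_data, 'astc', (4, 4, False))
    img.save(png_path, 'PNG')

ktx_to_png('snapshot.ktx', 'snapshot.png')   # file names are illustrative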

There is also a compiled Windows executable there if you need to do this on Windows without python. Alex Brignoni was kind enough to send me sample KTX files from multiple images to work with. The code also works with KTX files that are not really KTX, i.e., they have the .ktx extension but the header is 'AAPL'. The format is similar, however, and my code will parse them out too. If you do come across a file that does not work, send it to me and I can take a look.

A point to note is that not all KTX files use the COMPRESSED_RGBA_ASTC_4x4 format; only the iOS-created ones do. So you may come across many KTX files deployed or shipped with apps that can't be parsed with this tool, as it only handles the ASTC 4x4 format.

Enjoy!

Tuesday, June 9, 2020

Screentime Notifications in Catalina (10.15)

If you routinely perform Mac forensics, you've probably done a few macOS Catalina (10.15) examinations already. And if you are the kind that verifies your data, you may have noticed that, for ScreenTime notifications, the databases don't show the same strings you see in the actual displayed notification, and several forensic tools don't show them either.


Let's explore why.

To start with, let's review the format of the Notifications database. For macOS High Sierra (10.13) and above, it is located at:

/private/var/folders/<xx>/<yyyyyyy>/0/com.apple.notificationcenter/db2/db

where the <xx>/<yyyyyyy> portion might look like random strings, but it is not random. This folder path represents the DARWIN_USER_DIR for a specific user. For more details, read my old post here.
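
Incidentally, on a live macOS system you can resolve this path for the current user with the one-liner below (on a forensic image you will have to locate the folder manually):

getconf DARWIN_USER_DIR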

Inside the database, the record table holds the actual notification data (title, sub-title, body) and date of notification among other fields. A simple database query can get the useful data.

SELECT 
  (SELECT identifier FROM app WHERE app.app_id=record.app_id) as app,
  uuid, data, presented, delivered_date
FROM record

The actual notification data is within a plist stored in the column data. Inside this plist, you can easily navigate to the items titl, subt and body to get the title, sub-title and body. However for screentime notifications, the data looks different. Instead of individual strings in these values, they are lists.
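
As a quick sketch, here is how those strings could be pulled out for ordinary notifications, assuming the titl, subt and body keys sit under a top-level req dictionary (an assumption based on the plist layout; verify against your own data):

import plistlib
import sqlite3

conn = sqlite3.connect('db')   # the db2/db notifications database
query = ('SELECT (SELECT identifier FROM app WHERE app.app_id=record.app_id) AS app, '
         'data FROM record')
for app, data in conn.execute(query):
    req = plistlib.loads(data).get('req', {})      # data is a binary plist blob
    print(app, req.get('titl'), req.get('subt'), req.get('body'))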

Figure 2 - Embedded plist for screentime notification
ScreenTime uses format strings and a list of data, which need to be put back together. This is similar to how Event Logs on Windows or Unified Logging on macOS work. The format strings are located at the paths shown below (for English) and are available in other languages too:

/System/Library/UserNotifications/Bundles/com.apple.ScreenTimeNotifications.bundle/Contents/Resources/en.lproj/Localizable.strings
/System/Library/UserNotifications/Bundles/com.apple.ScreenTimeNotifications.bundle/Contents/Resources/en.lproj/InfoPlist.strings

These files are plists, each consisting of a single dictionary. So WeeklyReportNotificationNegativeDeltaBody, seen in the plist above, resolves to the message:
"Your screen time was down %@ last week, for an average of %@ a day." The %@ will be replaced with data provided (15% and 6 hours, 24 minutes) becoming:
"Your screen time was down 15% last week, for an average of 6 hours, 24 minutes a day."

Figure 3 - Snippet of Localizable.strings plist
Similarly WeeklyReportNotificationTitle becomes Weekly Report Available. So now, we are able to reconstruct the complete original message.
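
Reconstructing the message is then just a lookup and an ordered substitution. A minimal sketch, assuming Localizable.strings has been copied out of the bundle:

import plistlib

def resolve(key, args, strings_path='Localizable.strings'):
    with open(strings_path, 'rb') as f:
        fmt = plistlib.load(f)[key]    # each .strings file is one flat dictionary
    for arg in args:                   # replace each %@ placeholder in order
        fmt = fmt.replace('%@', str(arg), 1)
    return fmt

print(resolve('WeeklyReportNotificationNegativeDeltaBody', ['15%', '6 hours, 24 minutes']))
# Your screen time was down 15% last week, for an average of 6 hours, 24 minutes a day.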

mac_apt's NOTIFICATIONS plugin has now been updated with this functionality.

Monday, March 30, 2020

Parsing unknown protobufs with python

Protocol Buffers are quite popular; more and more apps and system files are storing data in this format on both iOS and Android. If you aren't familiar with Protocol Buffers, read this post. There I use Google's protoc.exe utility, as does everyone else who needs to view this data without the corresponding .proto file.

This is great! But the raw view/output has one big disadvantage: while this approach (--decode_raw) works fine if you just want to see the text strings stored in your data, it does not always apply the correct conversions for the raw data types!

According to Google, when the message (data) is encoded, only 6 different data types are allowed on the wire. These are known as wire types. Here are the allowed types (below).

Figure - Allowed wire types from https://developers.google.com/protocol-buffers/docs/encoding#structure
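
For reference, the six wire types from that page are:

Type  Meaning            Used For
0     Varint             int32, int64, uint32, uint64, sint32, sint64, bool, enum
1     64-bit             fixed64, sfixed64, double
2     Length-delimited   string, bytes, embedded messages, packed repeated fields
3     Start group        groups (deprecated)
4     End group          groups (deprecated)
5     32-bit             fixed32, sfixed32, float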

Unless you have the .proto file, you really don't know what the original data type was. Even protoc.exe just makes a best guess. For instance, all binary blobs are converted to strings by protoc, as both the string and bytes types use the Length-delimited wire type. There is also no way to tell whether a number should be interpreted as signed or unsigned, because both use the same underlying wire type (varint)!

Now to raw-decode a protobuf in python, there are a couple of libraries I have seen so far that do a decent job. I will list out the libraries, then demonstrate parsing with them, and compare.

The first, protobuf-decoder, seems to be more than 4 years old and is no longer maintained. It is also written in python2, though a python3 port exists somewhere. It makes several assumptions regarding data types and attempts to produce output similar to protoc.

The second, blackboxprotobuf, is a more mature library that provides much more functionality. It makes relatively few assumptions about data types. In addition to parsing the protobuf and returning a dictionary object, it also provides a type definition dictionary for the parsed data.

To demonstrate what I am talking about, I created a demo protocol buffer file called addressbook.proto and defined a protobuf message as shown below.

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
  required int64 id64 = 4;
  required uint64 uid64 = 5;
  optional double double = 6;
  optional bytes bytes = 7;
}

Then compiled it using protoc.exe.

protoc --python_out=. addressbook.proto

I then used a python script that imports the compiled python protobuf module and generates a binary protobuf file called tester_pb. The data contained in it is shown below.

Actual data
[
  name: "John Doe"
  id: 1234
  email: "[email protected]"
  id64: -22
  uid64: 13360317589766481554
  double: 4.5566
  bytes: b'\x00\x124V'
]
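
For completeness, here is a sketch of how tester_pb could have been generated. The outer "1 { ... }" in the decoded outputs suggests Person sits in field 1 of a wrapper message, e.g. message AddressBook { repeated Person people = 1; }, which is an assumption here; the email value is a placeholder since it is redacted above.

import addressbook_pb2

book = addressbook_pb2.AddressBook()       # hypothetical wrapper message
person = book.people.add()
person.name = 'John Doe'
person.id = 1234
person.email = 'john@example.com'          # placeholder value
person.id64 = -22
person.uid64 = 13360317589766481554
person.double = 4.5566
person.bytes = b'\x00\x124V'

with open('tester_pb', 'wb') as f:
    f.write(book.SerializeToString())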

Protoc output (protoc --decode_raw < ..\tester_pb)
1 {
  1: "John Doe"
  2: 1234
  3: "[email protected]"
  4: 18446744073709551594
  5: 13360317589766481554
  6: 0x401239f559b3d07d
  7: "\000\0224V"
}

protobuf-decoder output
{
  '01:00:string': 'John Doe',
  '02:01:Varint': 1234,
  '03:02:string': '[email protected]',
  '04:03:Varint': 18446744073709551594,
  '05:04:Varint': 13360317589766481554,
  '06:05:64-bit': 4.5566,
  '07:06:string': '\x00\x124V'
}

blackboxprotobuf output (includes data dictionary and types dictionary)
{'1':
 {
  '1': b'John Doe',
  '2': 1234,
  '3': b'[email protected]',
  '4': -22,
  '5': -5086426483943070062,
  '6': 4616816293942907005,
  '7': b'\x00\x124V'
 }
}
{'1': {'type': 'message', 'message_typedef':
 {
  '1': {'type': 'bytes', 'name': ''},
  '2': {'type': 'int', 'name': ''},
  '3': {'type': 'bytes', 'name': ''},
  '4': {'type': 'int', 'name': ''},
  '5': {'type': 'int', 'name': ''},
  '6': {'type': 'fixed64', 'name': ''},
  '7': {'type': 'bytes', 'name': ''}
 }, 'name': ''}
}

As seen in the outputs above, each decoder makes some default assumptions about the data types it encounters, and several fields (id64, uid64, double, bytes) come out wrong in one decoder or another. I like blackboxprotobuf because it lets you specify the real types via a types dictionary similar to the one it outputs. So once we have figured out the correct types, we can pass this into the decode_message() function to get the correct output. See the code snippet below.


import blackboxprotobuf

with open('tester_pb', 'rb') as f:
    pb = f.read()
    types = {'1': {'type': 'message', 'message_typedef':
              {
               '1': {'type': 'str', 'name': 'name'},
               '2': {'type': 'int', 'name': 'id'},
               '3': {'type': 'str', 'name': 'email'},
               '4': {'type': 'int', 'name': 'id64'},
               '5': {'type': 'uint', 'name': 'uid64'},
               '6': {'type': 'double', 'name': 'double'},
               '7': {'type': 'bytes', 'name': 'bytes'}
              }, 'name': ''}
         }
    values, _ = blackboxprotobuf.decode_message(pb, types)
    print(values)



That produces the desired output -


{'1':
 {
  'name': 'John Doe',
  'id': 1234,
  'email': '[email protected]',
  'id64': -22,
  'uid64': 13360317589766481554,
  'double': 4.5566,
  'bytes': b'\x00\x124V'
 }
}


In summary, I recommend the blackboxprotobuf library; note, however, that it is not exactly plug and play. Since it is not on pypi, you have to use it from code. Also, to use it with python3, I had to make one small tweak, and I added the 'str' type decode as that was not available. Since then I have tested it with numerous protobuf streams and it has not failed me! Get my updated version of the library here.

Update (8/2020): I made more bug fixes and published it to pypi, so you can install it via pip now. 
  pip install blackboxprotobuf

Saturday, March 28, 2020

Google Search & Personal Assistant data on android

The Google app, previously known as Google Now, is installed by default on most phones. From the app's description -

The Google app keeps you in the know about things that matter to you. Find quick answers, explore your interests, and stay up to date with Discover. The more you use the Google app, the better it gets.

Search and browse:
- Nearby shops and restaurants
- Live sports scores and schedules
- Movies times, casts, and reviews
- Videos and images
- News, stock information, and more
- Anything you’d find on the web


It is that ubiquitous bar/widget on the phone's home screen, sometimes called the Google Assistant search bar or just the Google Search widget.

Figure 1 - Google Search / Personal Assistant Bar 

The internal package goes by the name com.google.android.googlequicksearchbox. Its artifacts are found at /data/data/com.google.android.googlequicksearchbox/

There are many files and folders here, but the most interesting data is in the sub-folder files/recently

Your recent searches, along with some full-screen screenshots of search results, are stored here. The screenshots (saved with a .jpg extension) are actually in .webp format. The unique number in each file name is referenced by the data in the protobuf file (whose file name is the email address of the logged-in user account). If you are not logged in, nothing is populated in this folder. See the screenshots below.

 
Figure 2 - Folder 'recently' has no entries when no account was logged on.

Figure 3 - Folder 'recently' has files when searches were performed after logging in


The protobuf file ([email protected] in this case) when decoded has entries that look like this (see below) for a typical search. If you aren't familiar with protobuf decoding, read this.

1 {
  1: 15485946382791341007
  3: 0
  4: 1585414188066
  5: "dolphin"
  8 {
    1: "web"
    2: "google.com"
  }
  9: 10449902870035666886
  17: 1585413397978
}

In the protobuf data (decoded using protoc.exe), as seen above, we can easily distinguish the relevant fields:

Item  Description
1     session id
4     timestamp1 (unix epoch)
5     search query
8     dictionary
        1 = type of search (web, video, ..)
        2 = search engine
9     screenshot-id (needs conversion to int from uint)
17    timestamp2 (unix epoch)
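
Item 9's conversion from uint to int is just a reinterpretation of the unsigned varint as a signed 64-bit integer, for example:

import struct

def to_int64(u):
    # reinterpret an unsigned 64-bit value as signed (two's complement)
    return struct.unpack('<q', struct.pack('<Q', u))[0]

print(to_int64(10449902870035666886))   # -7996841203673884730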


Here is the corresponding screenshot saved in the same folder -
Figure 4 - Screenshot of search for "dolphin"

If you clicked on a recent news story in the app, the protobuf entry looks like this (below):

1 {
  1: 9016892896339717414
  3: 1
  4: 1572444614834
  5: ""
  7 {
    1: "https://9to5mac.com/2019/10/30/photos-of-airpods-pro/"
    2: "9to5mac.com"
    3: "Photos of AirPods Pro arriving in stores around the world - 9to5Mac"
  }
  9: 9016892896339717414
  10: 9
  17: 1572444614834
}
Figure 5 - Screenshot for news article clicked from link in google app


Last week, I added a plugin to ALEAPP to read these recent-search artifacts. This isn't all; there is actually more data to be read here.

The search widget can be used to make any kind of query, which may then be forwarded to the web browser or Android Auto or the Email or Messaging apps depending on what was queried for. This makes for an interesting artifact.

From my test data, all searches are stored in the app_session folder as protobuf files having the extension .binarypb. See screenshot below.

Figure 6 - .binarypb files
Each of these files is a protobuf that stores a lot of data about the searches, including searches from Android Auto. Josh Hickman did some excellent research on Android Auto and briefly addressed some of this in his talk here. No parser was available to read these, as the format of the data contained in the protobufs is unknown. I've attempted to reverse-engineer enough of it to get the useful bits of information out, such as the search queries. Some of them also contain mp3 recordings of the replies from Google Assistant. Parsers for these are being added to ALEAPP.

The format here is a bit too much to write about. Below is the raw protobuf structure (sans the binary blobs, replaced by ...). The search term here was "tom and jerry".

{
  1: 0x00000053b0c63c1b
  2: 0x11f0299e
  3: "search"
  132242267: ""
  132264001 {
    1: "..."
    2: 0x00000000
    3: 0
    4: 0x00000000000e75fe
  }
  132269388 {
    2: 0x0000000000000040
    3 {
      1: "..."
      2: ""
      3: "and.gsa.launcher.allapps.appssearch"
    }
  }
  132269847 {
    1 {
      1: "..."
      2: ""
      3: "and.gsa.launcher.allapps.appssearch"
    }
    2 [
      0: "...",
      1: "... tom and jerry ..."
      2: "..."
      3: 1
    ]
  }
  146514374 {
    1: "and.gsa.launcher.allapps.appssearch"
  }
  206022552 {
    1: 0
  }
}

After studying this and several other samples, here are the important pieces in the parsed protobuf dictionary:


Item        Description
1           session id (same number as in filename)
3           type of query (search, car_assistant, opa)
              car_assistant = Android Auto
              opa = personal assistant
132269388   dictionary
              1 = mp3 recording of response
132269847   dictionary
              1 = dictionary
                    2 = last query
              2 = list of session queries (in blobs)
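
As an illustration, extracting a few of these fields with blackboxprotobuf might look like the sketch below, based on the field numbers in the table (the file name is hypothetical):

import blackboxprotobuf

with open('1234567890.binarypb', 'rb') as f:
    values, _ = blackboxprotobuf.decode_message(f.read())

session_id = values.get('1')
query_type = values.get('3')     # search / car_assistant / opa
last_query = values.get('132269847', {}).get('1', {}).get('2')
print(session_id, query_type, last_query)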



For more details, refer to the module googleQuickSearchbox.py in ALEAPP. Below is a screenshot of the parsed out data.
Figure 7 - ALEAPP output showing Google App / Personal assistant queries

Wednesday, January 8, 2020

Usagestats on Android 10 (Q)

UsageStats

If you are unfamiliar with this artifact, Alex Brignoni explains the UsageStats artifact in his blog post here. Located at /data/system/usagestats/, this information can be useful in cases. Up until Android 9 (Pie), it was stored in XML format; since Android 10 (Q), it is in a different format, so the tool written by Alex didn't work for me or my students investigating this artifact a couple of months back.

The file name has the same format (unix millisecond time as integer) and below you can see what the new data looks like.

Figure 1 - File 1572840777639 - raw hex view (complete file not shown)
It appeared to be some sort of binary format, but without a standard, consistent header (after I compared a few files). Taking a cue from fellow DFIR researchers (Sarah Edwards and Phill Moore), I tested whether this was a protocol buffer. If you aren't familiar with Protocol Buffers, read these posts from Sarah and Phill. They are a Google creation and, as Google describes them, ...a language-neutral, platform-neutral extensible mechanism for serializing structured data.

To test for a protocol buffer (on Windows), you will need to download protoc.exe from here. Run protoc.exe as shown below, where 1572840777639 is the file name. If you get output, it's a protobuf.

W:\usagestats\0\daily>protoc --decode_raw < 1572840777639
1: 1862148
3: 1
4: 1
2 {
  1: 74
  2: "com.google.android.youtube"
  2: "com.google.android.ext.services"
  2: "com.android.providers.telephony"
  2: "com.android.dynsystem"
  2: "com.android.settings.CryptKeeper"
...
...output snipped...
...
22 {
  2: 23
  4: 60
  5: 1249881
  7: 23
  14: 92830887
  15: 23
  16: 60
}

OK, so we got some decoded data back. But it still did not look like anything we are used to seeing (see the XML snippet below).
Figure 2 - XML usagestats snippet
The way protocol buffers work, you need a .proto file that defines the structure and data types of the data contained in the buffer. So to decode this, we need the .proto file!

Android is open source, so why not peek at the AOSP source code? To avoid downloading the entire source, just browse the aosp-mirror on github.
Figure 3 - aosp source code on github

After a bit of searching, we find the file we are looking for at:
platform_frameworks_base/core/proto/android/server/usagestatsservice.proto

Figure 4 - usagestatsservice.proto file snippet
As seen above, this file references other .proto files. So we must get those too, along with any of their dependencies (recursively). We eventually end up with 7 files:

  • usagestatsservice.proto
  • configuration.proto
  • privacy.proto
  • locale.proto
  • rect.proto
  • protobuf_descriptor.proto
  • window_configuration.proto

Next, we need to transform (Google says compile) our .proto files into python modules. Use protoc.exe to do so. The syntax is:

protoc -I=$SRC_DIR --python_out=$DST_DIR $SRC_DIR/your_proto_file.proto

Do this for every .proto file. It will generate a .py file for each one. For example, the usagestatsservice.proto compiles to usagestatsservice_pb2.py. Now all that remains is to use these generated python files to read our raw protocol buffer from file. We will need to write some code to do so.

Peeking into the usagestatsservice.proto file, you get some idea of how this might work. I constructed a basic python script to read this (below).
import usagestatsservice_pb2

input_path = 'W:\\usagestats\\0\\daily\\1572840777639'
stats = usagestatsservice_pb2.IntervalStatsProto()

with open(input_path, 'rb') as f:
    stats.ParseFromString(f.read())

# GET PACKAGES
for usagestat in stats.packages:
    print('package = ' + stats.stringpool.strings[usagestat.package_index - 1])
    print(usagestat)

# GET CONFIGURATIONS
for conf in stats.configurations:
    print(conf)

# GET EVENT LOGS
for event in stats.event_log:
    print(event)

You can check for the existence of a field using the HasField() function. So here is what a package object consists of:
package = com.android.settings
package_index: 58
last_time_active_ms: 663647
total_time_active_ms: 4897
app_launch_count: 3
last_time_service_used_ms: -1572840673324
last_time_visible_ms: 673237
total_time_visible_ms: 25221

A configuration object consists of:
config {
  font_scale: 1.0
  locales {
    language: "en"
    country: "US"
  }
  screen_layout: 268435794
  color_mode: 5
  touchscreen: 3
  keyboard: 2
  keyboard_hidden: 1
  hard_keyboard_hidden: 1
  navigation: 1
  navigation_hidden: 2
  orientation: 1
  screen_width_dp: 411
  screen_height_dp: 659
  smallest_screen_width_dp: 411
  density_dpi: 560
  window_configuration {
    app_bounds {
      right: 1440
      bottom: 2392
    }
    windowing_mode: 1
    bounds {
      right: 1440
      bottom: 2560
    }
  }
}
last_time_active_ms: 662163
total_time_active_ms: 37
count: 1

An event log object contains:
package = com.google.android.apps.nexuslauncher
class = com.google.android.apps.nexuslauncher.NexusLauncherActivity
task root package = com.google.android.apps.nexuslauncher
task root class = com.google.android.apps.nexuslauncher.NexusLauncherActivity
type = MOVE_TO_FOREGROUND
time_ms: 34440

So now our protobuf is parsed, and the file read and interpreted successfully! That's it for now. On to the next artifact.