Saturday, March 28, 2020

Google Search & Personal Assistant data on Android

The Google app, previously known as Google Now, is installed by default on most phones. From the app's description -

The Google app keeps you in the know about things that matter to you. Find quick answers, explore your interests, and stay up to date with Discover. The more you use the Google app, the better it gets.

Search and browse:
- Nearby shops and restaurants
- Live sports scores and schedules
- Movies times, casts, and reviews
- Videos and images
- News, stock information, and more
- Anything you’d find on the web


It is that ubiquitous bar/widget, sometimes called the Google Assistant Search Bar or just the Google Search widget, found on the phone's home screen.

Figure 1 - Google Search / Personal Assistant Bar 

The internal package goes by the name com.google.android.googlequicksearchbox. Its artifacts are found at /data/data/com.google.android.googlequicksearchbox/

There are many files and folders here, but the most interesting data is in the sub-folder files/recently.

Your recent searches, along with some full-screen screenshots of search results, are stored here. The screenshots are saved with a jpg extension but are actually in WebP format. The unique number in each screenshot's name is referenced by the data in the protobuf file (whose file name is the email address of the logged-in user account). If you are not logged in, nothing is populated in this folder. See screenshots below.

 
Figure 2 - Folder 'recently' has no entries when no account was logged on.

Figure 3 - Folder 'recently' has files when searches were performed after logging in


The protobuf file ([email protected] in this case) when decoded has entries that look like this (see below) for a typical search. If you aren't familiar with protobuf decoding, read this.

1 {
  1: 15485946382791341007
  3: 0
  4: 1585414188066
  5: "dolphin"
  8 {
    1: "web"
    2: "google.com"
  }
  9: 10449902870035666886
  17: 1585413397978
}

In the protobuf data (decoded using protoc.exe), as seen above, we can easily distinguish the relevant fields:

Item  Description
1     session id
4     timestamp1 (unix epoch)
5     search query
8     dictionary
        1 = type of search (web, video, ..)
        2 = search engine
9     screenshot-id (needs conversion to int from uint)
17    timestamp2 (unix epoch)
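
The two conversions mentioned in the table can be sketched as follows. This is only an illustration, not the ALEAPP code: the helper names are hypothetical, the timestamps are assumed to be Unix epoch milliseconds (the sample values are 13 digits long), and the screenshot-id is assumed to be a 64-bit value that simply needs reinterpreting as signed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value

def ms_to_utc(ms):
    """Convert Unix epoch milliseconds (assumed unit) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)

# Values taken from the "dolphin" entry shown above.
screenshot_id = to_signed64(10449902870035666886)
searched_at = ms_to_utc(1585414188066)

print(screenshot_id)
print(searched_at.isoformat())
```

Integer arithmetic with timedelta is used instead of dividing by 1000 so that no floating-point rounding creeps into the millisecond part.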


Here is the corresponding screenshot saved in the same folder -
Figure 4 - Screenshot of search for "dolphin"

If you clicked on a recent news story in the app, the protobuf entry looks like this (below):

1 {
  1: 9016892896339717414
  3: 1
  4: 1572444614834
  5: ""
  7 {
    1: "https://9to5mac.com/2019/10/30/photos-of-airpods-pro/"
    2: "9to5mac.com"
    3: "Photos of AirPods Pro arriving in stores around the world - 9to5Mac"
  }
  9: 9016892896339717414
  10: 9
  17: 1572444614834
}
Figure 5 - Screenshot for news article clicked from link in Google app


Last week, I added a plugin for ALEAPP to read these recent search artifacts. This isn't all; there is more data to be read here.

The search widget can be used to make any kind of query, which may then be forwarded to the web browser or Android Auto or the Email or Messaging apps depending on what was queried for. This makes for an interesting artifact.

From my test data, all searches are stored in the app_session folder as protobuf files having the extension .binarypb. See screenshot below.

Figure 6 - .binarypb files
Each of these files is a protobuf that stores a lot of data about the searches. This includes searches from Android Auto too. Josh Hickman did some excellent research on Android Auto and addressed some of this briefly in his talk here. No parser is available to read these files, as the format of the data contained in the protobufs is unknown. I've attempted to reverse-engineer enough of it to get the useful bits of information out, such as the search queries. There are also mp3 recordings of the replies from Google Assistant stored in some of them. Support for parsing these is being added to ALEAPP.

The format here is a bit too much to write about. Below is the raw protobuf structure (sans the binary blobs, replaced by ...). The search term here was "tom and jerry".

{
  1: 0x00000053b0c63c1b
  2: 0x11f0299e
  3: "search"
  132242267: ""
  132264001 {
    1: "..."
    2: 0x00000000
    3: 0
    4: 0x00000000000e75fe
  }
  132269388 {
    2: 0x0000000000000040
    3 {
      1: "..."
      2: ""
      3: "and.gsa.launcher.allapps.appssearch"
    }
  }
  132269847 {
    1 {
      1: "..."
      2: ""
      3: "and.gsa.launcher.allapps.appssearch"
    }
    2 [
      0: "...",
      1: "... tom and jerry ..."
      2: "..."
      3: 1
    ]
  }
  146514374 {
    1: "and.gsa.launcher.allapps.appssearch"
  }
  206022552 {
    1: 0
  }
}

After studying this and several other samples, here are the important pieces in the parsed protobuf dictionary:


Item        Description
1           session id (same number as in filename)
3           type of query (search, car_assistant, opa)
              car_assistant = Android Auto
              opa = personal assistant
132269388   dictionary
              1 = mp3 recording of response
132269847   dictionary
              1 = dictionary
                    Item  Description
                    2     last query
              2 = List of session queries (in blobs)



For more details, refer to the module googleQuickSearchbox.py in ALEAPP. Below is a screenshot of the parsed data.
Figure 7 - ALEAPP output showing Google App / Personal assistant queries

Wednesday, January 8, 2020

Usagestats on Android 10 (Q)

UsageStats

If you are unfamiliar with this artifact, Alex Brignoni explains the UsageStats artifact in the blog post here. Located at /data/system/usagestats/, this information can be useful in forensic cases. Up until Android 9 (Pie), it was stored in XML format; since Android 10 (Q), it is stored in a different format. As a result, the tool written by Alex didn't work for me or my students when we investigated this artifact a couple of months back.

The file name has the same format (unix millisecond time as integer) and below you can see what the new data looks like.

Figure 1 - File 1572840777639 - raw hex view (complete file not shown)
It appeared to be some sort of binary format, but without a standard, consistent header (I compared a few files). Taking a cue from fellow DFIR researchers (Sarah Edwards and Phill Moore), I tested whether this was a protocol buffer. If you aren't familiar with Protocol Buffers, read these posts from Sarah and Phill. This is a Google creation, and as they describe it - ...a language-neutral, platform-neutral extensible mechanism for serializing structured data.

To test for protocol buffer presence (on Windows), you will need to download protoc.exe from here. Run protoc.exe as shown below. Here 1572840777639 is the filename. If you get decoded output, it's most likely a protobuf.

W:\usagestats\0\daily>protoc --decode_raw < 1572840777639
1: 1862148
3: 1
4: 1
2 {
  1: 74
  2: "com.google.android.youtube"
  2: "com.google.android.ext.services"
  2: "com.android.providers.telephony"
  2: "com.android.dynsystem"
  2: "com.android.settings.CryptKeeper"
...
...output snipped...
...
22 {
  2: 23
  4: 60
  5: 1249881
  7: 23
  14: 92830887
  15: 23
  16: 60
}

OK, so we got some decoded data back in a JSON-like text form. But it still did not look like anything we are used to seeing (see XML below).
Figure 2 - XML usagestats snippet
The way protocol buffers work, you need a .proto file that defines the structure and data types of the data contained in the buffer. So to decode this, we need the .proto file!
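
Without a .proto file, only field numbers and wire types can be recovered, which is roughly what the raw decode above shows. As a minimal sketch of that idea (not how protoc is implemented), the hypothetical helpers below walk the top-level fields of a buffer using the protobuf wire format and report whether the whole buffer parses cleanly. Group wire types (3 and 4) are deliberately not handled, and a clean parse is only a hint, not proof, that a file is a protobuf.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")

def walk_fields(buf):
    """Return top-level (field_number, wire_type, value) tuples or raise ValueError."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        if pos > len(buf):
            raise ValueError("field runs past end of data")
        fields.append((field_number, wire_type, value))
    return fields

def looks_like_protobuf(buf):
    """True if the whole buffer parses as a sequence of protobuf fields."""
    try:
        return len(walk_fields(buf)) > 0
    except ValueError:
        return False

# Synthetic two-field message: field 1 = varint 150, field 2 = "testing".
sample = b"\x08\x96\x01\x12\x07testing"
print(walk_fields(sample))
print(looks_like_protobuf(sample))
```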

Since Android is open source, why not peek at the source code of AOSP? To avoid downloading the entire source code, just browse the aosp-mirror on GitHub.
Figure 3 - aosp source code on github

After a bit of searching, we find the file we are looking for at:
platform_frameworks_base/core/proto/android/server/usagestatsservice.proto

Figure 4 - usagestatsservice.proto file snippet
As seen above, the file references other .proto files too. So we must get those too, and any dependencies in those as well (recursively). We eventually end up with 7 files:

  • usagestatsservice.proto
  • configuration.proto
  • privacy.proto
  • locale.proto
  • rect.proto
  • protobuf_descriptor.proto
  • window_configuration.proto

Next, we need to transform (google says compile) our .proto files into python libraries. Use protoc.exe to do so. The syntax is :

protoc -I=$SRC_DIR --python_out=$DST_DIR $SRC_DIR/your_proto_file.proto

Do this for every .proto file. It will generate a .py file for each one. For example, the usagestatsservice.proto compiles to usagestatsservice_pb2.py. Now all that remains is to use these generated python files to read our raw protocol buffer from file. We will need to write some code to do so.
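
Running the command once per file can be scripted. The sketch below is only one way to do it: the function names are hypothetical, it assumes protoc is on the PATH, and it assumes all seven files sit in one source directory with import statements that resolve relative to it (the imports inside the AOSP files may need adjusting for that).

```python
import subprocess
from pathlib import Path

# The seven files listed above.
PROTO_FILES = [
    "usagestatsservice.proto",
    "configuration.proto",
    "privacy.proto",
    "locale.proto",
    "rect.proto",
    "protobuf_descriptor.proto",
    "window_configuration.proto",
]

def build_protoc_command(src_dir, dst_dir, proto_file, protoc="protoc"):
    """Build the argument list for one protoc invocation."""
    src = Path(src_dir)
    return [
        protoc,
        f"-I={src}",
        f"--python_out={Path(dst_dir)}",
        str(src / proto_file),
    ]

def compile_all(src_dir, dst_dir, protoc="protoc"):
    """Compile every .proto file; raises CalledProcessError on failure."""
    for proto_file in PROTO_FILES:
        subprocess.run(
            build_protoc_command(src_dir, dst_dir, proto_file, protoc),
            check=True,
        )
```

Calling compile_all("protos", "out") would then be expected to leave one _pb2.py file per .proto file in the out folder (the folder names here are placeholders).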

Peeking into the usagestatsservice.proto file, you get some idea of how this might work. I constructed a basic python script to read this (below).
import usagestatsservice_pb2

input_path = "W:\\usagestats\\0\\daily\\1572840777639"
stats = usagestatsservice_pb2.IntervalStatsProto()

with open(input_path, 'rb') as f:
    stats.ParseFromString(f.read())

    # GET PACKAGES
    # package_index is 1-based, the string pool list is 0-based
    for usagestat in stats.packages:
        print('package = ' + stats.stringpool.strings[usagestat.package_index - 1])
        print(usagestat)

    # GET CONFIGURATIONS
    for conf in stats.configurations:
        print(conf)

    # GET EVENT LOGS
    for event in stats.event_log:
        print(event)

You can check for the existence of a field using the HasField() function. So here is what a package object consists of:
package = com.android.settings
package_index: 58
last_time_active_ms: 663647
total_time_active_ms: 4897
app_launch_count: 3
last_time_service_used_ms: -1572840673324
last_time_visible_ms: 673237
total_time_visible_ms: 25221
A configuration object consists of:
config {
  font_scale: 1.0
  locales {
    language: "en"
    country: "US"
  }
  screen_layout: 268435794
  color_mode: 5
  touchscreen: 3
  keyboard: 2
  keyboard_hidden: 1
  hard_keyboard_hidden: 1
  navigation: 1
  navigation_hidden: 2
  orientation: 1
  screen_width_dp: 411
  screen_height_dp: 659
  smallest_screen_width_dp: 411
  density_dpi: 560
  window_configuration {
    app_bounds {
      right: 1440
      bottom: 2392
    }
    windowing_mode: 1
    bounds {
      right: 1440
      bottom: 2560
    }
  }
}
last_time_active_ms: 662163
total_time_active_ms: 37
count: 1

An event log object contains:
package = com.google.android.apps.nexuslauncher
class = com.google.android.apps.nexuslauncher.NexusLauncherActivity
task root package = com.google.android.apps.nexuslauncher
task root class = com.google.android.apps.nexuslauncher.NexusLauncherActivity
type = MOVE_TO_FOREGROUND
time_ms: 34440
So now our protobuf file has been read, parsed and interpreted successfully! That's it for now. On to the next artifact...

Monday, October 28, 2019

macOS 10.15 Volumes & Firmlink magic

With macOS 10.15 - Catalina, Apple has introduced a change in the way system and user data is stored on disk. In prior versions, the root '/' volume was stored in a single volume usually named 'Macintosh HD'. This did not change with the update to APFS. However with Catalina, there are now two distinct volumes -
  • Macintosh HD
  • Macintosh HD - Data
The screenshot below shows the two different volumes -

Figure 1 - diskutil output showing a split Macintosh HD volume into two

The Macintosh HD volume stores the system files and is mounted as read-only, while the Macintosh HD - Data volume has all the other files on your system which include user profiles, system and user data, and user installed Applications.

However when booted, only a single logical volume is presented (as root /) that combines the contents of both. This is enabled through APFS using its Volume Role feature. This is mentioned in Apple's official APFS documentation but its usage or working is not documented. Each volume can be assigned a role in its volume Superblock structure (apfs_superblock_t). There are 8 possible roles documented. From the Apple docs:

  #define APFS_VOL_ROLE_NONE      0x0000
  #define APFS_VOL_ROLE_SYSTEM    0x0001
  #define APFS_VOL_ROLE_USER      0x0002
  #define APFS_VOL_ROLE_RECOVERY  0x0004
  #define APFS_VOL_ROLE_VM        0x0008
  #define APFS_VOL_ROLE_PREBOOT   0x0010
  #define APFS_VOL_ROLE_INSTALLER 0x0020
  #define APFS_VOL_ROLE_DATA      0x0040
  #define APFS_VOL_ROLE_BASEBAND  0x0080
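
For scripting, the role values above can be turned into a simple lookup. This is a minimal sketch: the dictionary and function names are hypothetical, and it only does exact matches against the values listed here rather than assuming anything about how other values are encoded.

```python
# Role values copied from the definitions listed above.
APFS_VOLUME_ROLES = {
    0x0000: "NONE",
    0x0001: "SYSTEM",
    0x0002: "USER",
    0x0004: "RECOVERY",
    0x0008: "VM",
    0x0010: "PREBOOT",
    0x0020: "INSTALLER",
    0x0040: "DATA",
    0x0080: "BASEBAND",
}

def role_name(role):
    """Return the role name for a value read from the volume superblock."""
    return APFS_VOLUME_ROLES.get(role, "UNKNOWN (0x%04x)" % role)

print(role_name(0x0001))
print(role_name(0x0040))
```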

The SYSTEM volume contains the folders /bin, /sbin and most of the /usr and /System folders. A few subfolders of /usr and /System are on the DATA volume. The volumes are joined using a new construct that Apple calls firmlinks. They describe it as a Bi-directional wormhole in path traversal. Firmlinks are used on the system volume to point to the user data on the data volume.

They are somewhat similar to the unix symlinks and hardlinks, but only directories can be linked (from one volume to another). The file that defines/lists the firmlinks resides on the SYSTEM volume at /usr/share/firmlinks. The following paths are defined by default.

  /AppleInternal AppleInternal
  /Applications Applications
  /Library Library
  /System/Library/Caches System/Library/Caches
  /System/Library/Assets System/Library/Assets
  /System/Library/PreinstalledAssets System/Library/PreinstalledAssets
  /System/Library/AssetsV2 System/Library/AssetsV2
  /System/Library/PreinstalledAssetsV2 System/Library/PreinstalledAssetsV2
  /System/Library/CoreServices/CoreTypes.bundle/Contents/Library  
              System/Library/CoreServices/CoreTypes.bundle/Contents/Library
  /System/Library/Speech System/Library/Speech
  /Users Users
  /Volumes Volumes
  /cores cores
  /opt opt
  /private private
  /usr/local usr/local
  /usr/libexec/cups usr/libexec/cups
  /usr/share/snmp usr/share/snmp

The linked volumes will have distinct inode numbers for files/folders. The only common inode numbers seen are inodes 1 (Parent of root), 2 (root) and 3 (private-dir). All other inodes will be unique; a simple but clever scheme is used to ensure that. For the SYSTEM volume, every inode number allocated will be OR'd with 0x0FFFFFFF00000000. Take a look at the inode numbers in the combined volume in the screenshot below. The very large numbers belong to files that reside on the SYSTEM volume, due to the upper bits being set by the mask.

Figure 2 - Contents of root showing files from both SYSTEM and DATA
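
The masking scheme described above can be sketched in a few lines. The function names are hypothetical, and the check is only a heuristic based on the observation in this post; inodes 1, 2 and 3 are shared by both volumes and are not covered by it.

```python
SYSTEM_INODE_MASK = 0x0FFFFFFF00000000

def to_system_inode(raw_inode):
    """Apply the mask that SYSTEM volume inode numbers are OR'd with."""
    return raw_inode | SYSTEM_INODE_MASK

def is_system_inode(inode):
    """True if all mask bits are set, suggesting the file is on the SYSTEM volume."""
    return (inode & SYSTEM_INODE_MASK) == SYSTEM_INODE_MASK

example = to_system_inode(0x14)
print(hex(example))
print(is_system_inode(example), is_system_inode(0x14))
```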
If you try to create a file or folder on the root volume (or one of its owned folders), it fails with an error: Read-only volume.

For accessing most files and folders, there should be no problem, as the stitched/combined volume works seamlessly and programs should not notice any difference. However, there are situations where you might want to explicitly access a folder from a specific volume, especially for forensics. For example, if you wanted to access /.fseventsd, you would always get the read-only volume's .fseventsd folder, which won't be too interesting as it's a read-only volume! To get the one on the DATA volume, there is still a way. Apple has also made the DATA volume available (mounted) at the mountpoint /System/Volumes/Data. This also means that if you have scripts that run across all files, they will need to be made aware that this location should be avoided to prevent duplication.
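
A script that walks the live file system could skip the DATA mountpoint along these lines. This is a minimal sketch with a hypothetical function name; it only prunes the directories it is told to skip and does nothing special about firmlinks themselves.

```python
import os

DATA_MOUNT = "/System/Volumes/Data"

def walk_files(root, skip_dirs=(DATA_MOUNT,)):
    """Yield file paths under root, without descending into skip_dirs."""
    skip = {os.path.normpath(p) for p in skip_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never enters the skipped directories.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(dirpath, d)) not in skip
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling walk_files("/") on a booted Catalina system should then visit files reachable through the combined root only once, rather than again under /System/Volumes/Data.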

According to Apple, you cannot opt-out of this, and it is a required feature for macOS 10.15. Forensic tools that operate on full disk images will have to adapt for this change, and so I've updated mac_apt to support macOS Catalina. If you use it, let me know of bugs/issues.

Saturday, October 26, 2019

Part 3 - ADB keyvalue backups - Wifi and System settings

This is Part 3 of the continuing blog series on ADB keyvalue backups. Today we focus on Wifi settings and other system configuration available from-
<Backup.adb>/apps/com.android.providers.settings/k/com.android.providers.settings.data
They are backed up only if you specified the -keyvalue option and are available on non-rooted devices too.

This file follows the Key-Value Data format as outlined earlier in part 1. There are 8 different types of data seen here. The Key name represents the type of data and Value represents either a single structure or a set of name-value pairs (both name and value are strings). The table below shows the data seen here.

Key Name          Description
system            settings about font sizes, screen brightness, hearing aids, haptic feedback among others
secure            more system settings on gestures, button behaviors, spell checker, screensaver, accessibility, etc..
global            Boolean settings that enable/disable options like wifi wakeup, auto_time, sounds enabled, call auto-retry, etc..
locale            a locale string like ‘en-US’
lock_settings     owner info for display on screen if enabled
softap_config     Access point settings for Mobile hotspot
network_policies  unknown
wifi_new_config   xml data having wifi settings for connected access points

Perhaps the most interesting aspect here is the presence of wifi passwords (WPA pre-shared keys) in the wifi_new_config data, as seen in the screenshot below. Yes, you can get wifi passwords from an adb backup now!

Figure 1 - Snippet of Wifi saved settings from com.android.providers.settings.data showing SSIDs & passwords
Here is a python script to read com.android.providers.settings.data and export the information to json files. Below you can see some of the data parsed by this script for one of my test devices.


Figure 2 - Data from 'global' key
Figure 3 - Data from 'system' key


Figure 4 - Data from 'secure' key





Figure 5 - Data from 'softap_config' key
Figure 6 - Data from 'lock_settings' key

Tuesday, October 15, 2019

Part 2 - ADB keyvalue backups - Call Logs

This is Part 2 of the continuing blog series on ADB keyvalue backups. Today we focus on Call Log Backups. 

Call logs are backed up under 
<Backup.adb>/apps/com.android.calllogbackup/k/com.android.calllogbackup.data
They are backed up only if you specified the -keyvalue option and are available on non-rooted devices too.

This file follows the Key-Value Data format as outlined earlier in part 1. The Keys here are the call ids or serial number of calls, starting at 1 and sequentially rising. The Values are the individual call log records.

Here are the structures used in the Call Log record. All fields here are stored as Big Endian.

1. Text_Record

Position  Type                Description
00        ushort              field_length (in bytes)
02        char[field_length]  field data (text)
2. Call_Log

Position  Type         Description
00        uint         version, 0x03EF (1007) or 1005 seen
04        int64        timestamp
12        uint64       call duration in seconds
20        byte         is_phone_number_present
21        Text_Record  present if is_phone_number_present = 1
..        uint         call type
                         1 = Incoming
                         2 = Outgoing
                         3 = Missed
                         4 = Voicemail
                         5 = Rejected / Declined
                         6 = Blocked
                         7 = Answered_Externally
..        uint         number presentation
                         1 = Allowed
                         2 = Restricted
                         3 = Unknown
                         4 = Payphone
..        byte         is_servicename_present
..        Text_Record  present if is_servicename_present = 1
..        byte         is_iccid_present
..        Text_Record  present if is_iccid_present = 1
..        byte         is_own_num_present
..        Text_Record  present if is_own_num_present = 1
..        byte[12]     unknown bytes, always 0
..        Text_Record  oem namespace string
..        byte[18]     unknown bytes
..        uint         block reason (only on version 1007)
                         1 = Screening service
                         2 = Direct to voicemail
                         3 = Blocked number
                         4 = Unknown number
                         5 = Restricted number
                         6 = Payphone
                         7 = Not in contacts
..        byte[18]     unknown bytes (only on version 1007)
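
As a rough illustration of the layout above (this is not the callparser.py script), the sketch below reads the start of a Call_Log record up to the number presentation field using big-endian struct formats. The function names are hypothetical, the text is assumed to be UTF-8, and the sample record is synthetic, built only to exercise the code.

```python
import struct

CALL_TYPES = {1: "Incoming", 2: "Outgoing", 3: "Missed", 4: "Voicemail",
              5: "Rejected / Declined", 6: "Blocked", 7: "Answered_Externally"}
PRESENTATIONS = {1: "Allowed", 2: "Restricted", 3: "Unknown", 4: "Payphone"}

def read_text_record(buf, pos):
    """Read a Text_Record (big-endian ushort length + text); return (text, next_pos)."""
    (length,) = struct.unpack_from(">H", buf, pos)
    start = pos + 2
    end = start + length
    if end > len(buf):
        raise ValueError("Text_Record runs past end of buffer")
    # UTF-8 is an assumption; the layout only says the field data is text.
    return buf[start:end].decode("utf-8", errors="replace"), end

def parse_call_log_start(buf):
    """Parse the fixed header, optional phone number, call type and presentation."""
    version, timestamp, duration, has_number = struct.unpack_from(">IqQB", buf, 0)
    pos = 21  # 4 + 8 + 8 + 1 bytes consumed so far
    number = None
    if has_number == 1:
        number, pos = read_text_record(buf, pos)
    call_type, presentation = struct.unpack_from(">II", buf, pos)
    return {
        "version": version,
        "timestamp": timestamp,
        "duration_seconds": duration,
        "number": number,
        "call_type": CALL_TYPES.get(call_type, call_type),
        "presentation": PRESENTATIONS.get(presentation, presentation),
    }

# Synthetic record for illustration only (not real device data).
sample = (struct.pack(">IqQB", 1007, 1571000000000, 42, 1)
          + struct.pack(">H", 7) + b"5551234"
          + struct.pack(">II", 3, 2))
record = parse_call_log_start(sample)
print(record)
```

The remaining fields follow the same pattern: a presence byte, then a Text_Record only when that byte is 1.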

The screenshot below shows a raw record in the hex editor.

Figure 1 - Call log record showing some important fields

Using an 010 template to parse this information, it looks like this (below).

Figure 2 - Call log record data parsed in 010 editor
The level of detail in these records is great. There are call status codes known as Call Type (Missed, Incoming, Outgoing, ..) as well as a number presentation code, which is usually 1 (Allowed), although there are a few other values. Calls that show up on your phone as 'Private' numbers will have presentation code 2 (Restricted). If you have enabled any call blocking features, then those show up too on blocked calls (as the block reason).

Code to automate this parsing

A python script has been created to parse call log records from the com.android.calllogbackup.data file, available here. The 010 template can be downloaded here.

Forensic Gems - Detecting Deleted call records

Since each call record has a key which is the call id or the serial number of the call, I performed an experiment to see if deleting call records from the middle of the sequence would change this number. It turns out that the numbers of the remaining records do not change, in effect allowing us to detect deleted call records. This is visible in the screenshot below, where you can see call IDs (serial numbers) of 1 through 8, but 4 and 7 are missing. Those are the ones I had manually deleted from the Call logs on the phone through the available feature in the Phone app. This was also tested on a real phone with several hundred call records going back several months, and it appears to hold true there too.

Figure 3 - Output of callparser.py, made pretty in Excel showing missing call ids.
It is useful to know that records may be missing, perhaps deleted intentionally.
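
Finding the gaps is easy to automate. A minimal sketch, assuming the call ids have already been extracted as integers and that numbering starts at 1 as described above (the function name is hypothetical):

```python
def find_missing_call_ids(call_ids):
    """Return the ids absent between 1 and the highest id present."""
    present = set(call_ids)
    if not present:
        return []
    return [i for i in range(1, max(present) + 1) if i not in present]

# Ids matching the experiment described above: 4 and 7 were deleted.
print(find_missing_call_ids([1, 2, 3, 5, 6, 8]))
```

Note that records deleted after the highest surviving id cannot be detected this way.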

Stay tuned for Part 3, there is more good stuff in these key-value backups.