Thursday, September 28, 2017

APFS timestamps

From APFS documentation it is revealed that the new timestamp has a nano second resolution. From my data I can see several timestamps in what appears to be the root directory for a test APFS volume I created. Armed with this information, it was pretty easy to guess the epoch - it is the Unix epoch.

ApfsTimeStamp = number of nano-seconds since 1-1-1970

If you like to use the 010 editor for analysis, put the following code in your or file:

// ApfsTime
//  64-bit integer, number of nanoseconds since 01/01/1970 00:00:00

typedef uint64 ApfsTime <read=ApfsTimeRead, write=ApfsTimeWrite>;
FSeek( pos ); ApfsTime _aft <name="ApfsTimes">;
string ApfsTimeRead( ApfsTime t )
    // Convert to FILETIME
    return FileTimeToString( t/100L + 116444736000000000L );
int ApfsTimeWrite( ApfsTime &t, string value )
    // Convert from FILETIME
    FILETIME ft;
    int result = StringToFileTime( value, ft );
    t = (((uint64)ft - 116444736000000000L)*100L);
    return result;

Now 010 can interpret APFS timestamps :)

010 interprets ApfsTimes

If you need to read this timestamp in python (and convert to python datetime), the following function will do this:

import datetime

def ConvertApfsTime(ts):
        return datetime.datetime(1970,1,1) + datetime.timedelta(microseconds=ts / 1000. )
  return None

Monday, September 18, 2017

Interpreting volume info from FXDesktopVolumePositions

On a mac, if you wanted to get a list of all connected volumes, you would typically lookup FXDesktopVolumePositions under the file (for each user). This is available under /Users/<USER>/Library/Preferences/  The volumes listed here can include anything mounted, a CD, a DMG file, a volume from external disk, network mapped volume or anything else that mac sees as a mounted volume/device.

Figure 1: Snippet of 'FKDesktopVolumePositions' from

The above snippet shows every volume connected to one of our test machines. What's strange is the way the volume information is stored. Ignoring the icon position data (which resides a level below not shown in that screenshot), all that's available is a string that looks like this:


Studying this a bit, it appears that the random looking hex number after the name is not really random. It represents the 'Created date' of the volume's root folder (for most but not all the entries!). The date format is Mac Absolute Time (ie, number of seconds since 2001).
The picture below shows the breakup.

Figure 2: Interpreting volume info from FXDesktopVolumePositions entry

The first part of the entry (till the last underscore) is the name as it appears in the Finder window. The next part ignoring the period is the created date. This is followed by p+ and then a decimal number (usually 0, 26, 27, 28 or 29). I believe that number represents a volume type. From what I can see in some limited testing, if volume type=28, then the hex number is always the created date.

Large external disks (USB) that identify as fixed disks and most DMG volumes will show up as type 28. All USB removable disks show up under type 29 and here the hex number is not a created date, it is unknown what this may be. Sometimes this number is negative for type 29, and a lot of volumes share the same number.

There is still some mystery about the hex number. For type=28, sometimes it is less than 8 digits long, and needs to be padded with zeroes at the end. This does produce an accurate date then. Also, sometimes it is longer than 8 digits! In these cases, truncating the number at 8 digits has again produced an accurate date in limited testing. It is unclear what those extra digits would denote.

Following this discovery, mac_apt has been updated to parse this date.

Sunday, September 3, 2017

Releasing mac_apt - macOS Artifact Parsing Tool

Over the last several months, I've been developing a tool to process mac full disk images. It's an extendible framework (in python) which has plugins to parse different artifacts on a mac. The idea was to make a tool that an examiner could run as soon as a mac image was obtained to have artifacts parsed out in an automated fashion. And it had to be free and open-source, that ran on windows/linux (or mac) without the requirement of needing a mac to process mac images. 

Here are the specifics:
  • INPUT - E01 or dd image or DMG(not compressed)
  • OUTPUT - Xlsx, Csv, Sqlite
  • Parsed artifacts (files) are also exported for later manual review.
  • lzvn compressed files (quite a few on recent OSX versions including key plist files) are supported too!

As of now, there are plugins to parse network information (ip, dhcp, interfaces), wifi hotspots, basic machine settings (timezone, hostname, HFS info, disk info,..), recent items (file, volume, applications,..) , local & domain user info, notifications, Safari history and Spotlight shortcuts. More are in the works.. Tested on OSX images from 10.9-10.12

The project is currently in alpha status. Let us know of bugs/issues, any feature requests, ..

The project is available on GitHub here.

Some motivations behind this project:
- Learn Python
- Learn more about OSX

Wednesday, August 2, 2017

Finding the Serial number of a Mac from disk image

On a mac (osx/macOS), the serial number is usually not stored on the disk, it is stored in the firmware and available either printed on the backside/underside of your mac/macbook computer or accessible via software on a booted system using 'About My Mac' or System Profiler.

On recent versions of OSX, there are however a few system databases that store this information and make it available for forensic investigators to use (or for verification). These are:
  • consolidated.db
  • cache_encryptedA.db
  • lockCache_encryptedA.db
All the above files are sqlite databases located in the 'root' user's Darwin user cache folder located under /private/var/folders/zz/zyxvpxvq6csfxvn_n00000sm00006d/C/. This location should be the same for all OSX/macOS installations (10.9 & above) because UID and UUID of root is same on all systems and does not change. 
For more information on Darwin folders, see this blog post. 

Screenshot 1 - Table 'TableInfo' inside consolidated.db showing Serial Number

In the above screenshot, the serial number is seen starting with 'VM'. It starts with VM since this was a virtual machine; for real machines, you will see the actual hardware serial number here. I was able to verify this on several macs running osx 10.9 to 10.12. 

In addition, other software might retrieve and store this information too. One such software, is KeyAccess, installed by Sassafras asset management system. KeyAccess leaves behind a binary file /Library/Preferences/KeyAccess/KeyAccess Prefs which also contains the serial number.

Another place where you might find the serial is sysinfo.cache. This is created by Apple Remote Desktop and is found at /var/db/RemoteManagement/caches/sysinfo.cache.

Thursday, April 27, 2017

The mystery of /var/folders on OSX

The /var/folders (which is actually /private/var/folders) has been a known but undocumented entity for a very long time. Apple does not document why its there, or what it is. But there are plenty of guides online that suggest it may be good to periodically clear those folders to recover disk space and perhaps speed up you system. Whether that is wise is beyond the scope of this blog post.

If you've ever done a MacOS (OSX) forensic examination, you've probably noticed some odd folders here, all two character folder names with long random looking subfolders. Something like /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl/. A level below are 3 folders C, T, 0 and you can then see familiar data files under those. The 3 folders represent Cache (C), Temporary files (T) and User files (0)

On a live system these can be queried using the command 'getconf VARIABLE' where VARIABLE can be DARWIN_USER_CACHE_DIR, DARWIN_USER_TEMP_DIR or DARWIN_USER_DIR.

These are locations where mac stores cache and temporary files from various programs and system utilities. They are further segregated on a per-user basis. So each of those folders (Example: /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl) represents a single user's temporary space. 

Whats in it for forensicators?

From a forensics examination perspective, there is not a lot of artifacts here. However some useful artifacts like Notifications databases and Quicklook thumbnailcache databases are located here. The Launchpad dock database is also here. Sometimes you can find useful tidbits of cache from other apps too.

Figure: /private/var/folders on MacOS 10.12 (Sierra) 

It would be nice to be able to determine which user owned a particular folder when analyzing the artifacts within it. This is actually really easy, as you can just look up the owner uid of the folder. But if you are more interested in how the name gets generated, read on.

Reverse engineering the Algorithm

There is an old forum post here that does not provide the algorithm but hints that its likely generated from uuid and uid. Both of which are available from the user's plist (under /var/db/dslocal/nodes/Default/users/<USER>.plist). From the narration, the character set used in the folder names would be '+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'. However that's not what is seen on newer macs (10.10 onwards)

After analyzing the folder names on my mac test machines, the data set was narrowed down to 0-9, underscore and all alphabets except vowels (aeiou). A little bit of research confirmed this, when the string '0123456789_bcdfghjklmnpqrstvwxyz' was seen in the libsystem_coreservices.dylib binary. The string is 32 char in size, so you would need 5 bits to represent an index which could point to a specific character in the string. After a bit of experimentation with creating a few user accounts and setting custom UUIDs, it was clear how this worked. The  algorithm takes the 128 bit UUID string as a binary bitstream, appends to it the binary bitstream of the UID (4 byte), then runs a single pass over that data. For each 5 bits it reads, it uses that as an index to get a single char from the charset array and copies that to output. A python implementation that generate these folder names (for both new osx versions and older ones) is provided here.


Thursday, January 5, 2017

Flexing SQL muscle for parsing an MS db on OSX

This post is about using recursive SQL queries (or rather a single recursive query) to parse the MicrosoftRegistrationDB.reg file created by Microsoft office on Mac OSX systems.

A little background..

On OSX (mac), there is no registry. Most apps just use plist files instead to save local information. Microsoft Office leaves cache and configuration information in plist files like every other OSX application. However it also keeps a copy in this file – microsoftRegistrationDB.reg. The file can be found here –

/Users/research/Library/Group Containers/xxxxxxxxxx.Office/MicrosoftRegistrationDB.reg

This is an sqlite database which is a flattened version of the registry tree that office would create in windows under HKCU\Software\Microsoft\Office, the format of which is quite straight-forward and documented. Some useful MRU artifacts and configuration settings reside here.

The sqlite database has the same fields as in the registry, namely - key, key_last_modified_time, value_name, value_type and value_data. This is nicely arranged in the following table structure.

Figure 1 - Database table schema

Pulling the data out is fairly simple in SQL. However, if you wish to recreate _all_ the registry paths from the flattened tree, then it’s a bit more involved. In the HKEY_CURRENT_USER table, each key has a single entry and along with the key name, you have the parent key reference. As an analyst, you would like to get full key path (i.e. HKCU\Software\Microsoft\...) for every value. There lies the problem. To recreate the paths for every single registry value, you would have to run several individual SQL queries, each query would fetch a key’s parent, and you keep doing that till you reach the ROOT of the tree. You could do this in a recursive function in python. Or you can let SQL do the recursion by running a recursive query. Sqlite supports recursive queries. You can read up about recursive queries in sqlite here and here.

The Final Query

SELECT t2.node_id, t2.write_time, path as Key, as valueName,
  HKEY_CURRENT_USER_values.value as value,
  HKEY_CURRENT_USER_values.type as valueType from
    under_software(path,name,node_id,write_time) AS 
      SELECT under_software.path || '\' ||,
        FROM HKEY_CURRENT_USER JOIN under_software ON
        ORDER BY 1
  SELECT name, path, write_time, node_id FROM under_software


Here the ‘WITH RECURSIVE’ part will perform the recursive querying for every value item. It will create the full key path for that value. The line:
SELECT under_software.path || '\' ||’ will concatenate the parent key path with the sub-key name using backslash as separator. The double-pipe ‘||’ is the concatenate operator. The 'ORDER BY 1' is not really necessary, but this makes it sort the output by the first parameter to the recursive function, i.e, path.

A python script to do this automatically is available here. This script will run the recursive query on the database and then provide both a csv and a plist as output. I chose to output as a plist too because this data is best viewed as a tree as shown below.

Figure 2 - Sample plist produced by script (viewed in plist Editor Pro)

Friday, October 28, 2016

WofCompressed streams in Windows 10

On windows 10, there is a new 'System Compression' option that compresses files using reparse points. This is not the NTFS-based compression that earlier versions of windows utilized, its different. This post is about the new compression scheme and how it affects forensic analysts.

With windows 10, a lot of details are automatically managed without user input and this is one of them. Windows can determine if the compression will be beneficial to the host system and automatically trigger it! This usually happens when you upgrade as opposed to clean installing the OS. Some users have reported seeing it as an option in 'Disk Cleanup' too.

Windows provides a utility called Compact.exe to do this processing manually. Using it, you can compress/decompress files and folders or simply query a system to determine if it will be beneficial at all on a specific volume. The compression algorithms are XPRESS (4K, 8K, 16K) or LZX. While the files are compressed on disk, if an application opens/reads such a file, it is still getting the original decompressed data and all decompression is handled on the fly automatically by windows 10.

Figure 1 - Compact.exe and its command usage info
The command 'compact /exe <file>' will compress any file (not just exe)

Lets get to the point, how does this impact forensics

Well, as of now, no tools will recognize and decompress these files. Hence, you can't read, keyword search or extract these files in their original uncompressed form.

Tools tested

Here is a list of tools tested so far:

Tool VersionSupport (as of 10/26/2016)
SIFT Workstation3No
Xways Forensic19.0No

How it works?

System compression utilizes reparse points and creates a new Alternate Data Stream (ADS) having the name 'WofCompressedData'. The compressed data is stored here. Reparse points are an NTFS feature that allow custom implementation like this. However this means that other applications that are not aware of this custom implementation will not be able to read/write to that file. In encase (or other forensic tools), you can see the file and the WofCompressedData stream. Clicking on the file just shows the contents to be all zeroes. Clicking on the stream, you can get the compressed data, but as of now, no automatic transparent decompression (as it does with NTFS compressed files). This is seen in screenshot below.

Note - This isn't to be confused with WOFF compression, which is a compression scheme used in Web Open Font Format!

Figure 2 - Encase shows the WofCompressedData stream. The file's original data was all text.
If you mount a volume containing such compressed files in SIFT Workstation or any linux system (they all use the same NTFS-3g FUSE driver), you will see the message 'Unsupported reparse point' when trying to list these files. Trying to access file contents will result in errors as seen in screenshot below.

Figure 3 - Files DW20.exe and upgrader_default.log are compressed here
If you attach a windows 10 formatted volume/disk to a Windows 7 system, you won't be able to access files as it does not know how to read them. See screenshot below:

Figure 3 - Notepad trying to view upgrader_default.log file (which is compressed)

Workarounds (till supported is added in by tool developers)

For Linux

If you use SIFT or another Linux system to do your forensics, the fix is simple. A few months back, Eric Biggers wrote a plugin to handle this. Its a plugin to the ntfs-3g FUSE driver. Its available here:

For this, you will first need to download, compile and install the latest version of the ntfs-3g driver (but not from Tuxera, that one is missing a file!); then proceed to download, compile and install the above mentioned plugin. You can get this working on SIFT with roughly the following steps:

1. Go to and download the source code for the latest stable release, right now its ntfs-3g_2016.2.22AR.1.orig.tar.gz.
2. Unzip and extract the file downloaded.
3. Open Terminal and browse to the extracted folder.
4. Compile and install using commands:
sudo make install
4. Go to and download the entire code as a zip file.
5. Unzip and extract the archive.
6. Open Terminal and browse to the extracted folder.
7. A few more tools need to be installed to compile this, so run the following commands:
sudo apt-get update
sudo apt-get install autoconf automake libtool
8. Run following commands to generate a configure script:
mkdir m4
autoreconf -i
9. Compile and install
sudo make install
10. If all went well (without errors), you are done!

Now you should be able to view and read those files normally, all decompression is handled on the fly automatically!

Figure 4 - No errors seen listing or reading files after installing the system compression plugin

For Windows

If you use Windows as your host machine for forensics processing, then you should only use a Windows 10 machine for processing evidence files that contain windows 10 images. This applies to tasks such as antivirus scanning, where you would typically share the entire disk out using Disk emulation (if you use Encase) which allow windows to parse and interpret the disk. This would only work (to read system compressed files) if the host system is Windows 10.

If you are looking to identify the system compressed files, you could filter on all files with ADS streams that have the name 'WofCompressedData'.

Fortunately, by default windows only compresses system files (EXE/DLL in windows and system32) and not user files, so you should mostly be fine. However, users can compress any file manually using the compact command.