Saturday, December 29, 2018

Making NSKeyedArchives human readable

If you've been doing macOS analysis, you are definitely familiar with the (now not so new) serialized plist format also known as an NSKeyedArchive. There are parsers available to extract data from this format, such as the ccl_bplist from Alex Caithness (some more sophisticated ones listed at the bottom of this post). I've been using this library in mac_apt and other projects too. That is all old news.

Need for a human readable version

I always like to manually verify the results of my code by looking at the raw plist values to make sure my analysis programs parsed the right values as well as if they missed something. That isn't possible with NSKeyedArchives.

The ccl_bplist parser is great if you want to explore the structure of a plist interactively with python or programatically. What it does not do is auto-generate and save the deserialized version of that plist as a new human readable plist file. So that is what I set out to do yesterday. 

Long story short, the code is available on github here

Below are screenshots showing an NSKeyedArchive sfl2 file (Figure 1) and its deserialized human readable form (Figure 2).
Figure 1 - NSKeyedArchive

Figure 2 - Deserialized form of NSKeyedArchive

Using ccl_bplist and biplist, this should have been a 2-5 line program, ccl_bplist to deserialize and biplist to write out the new plist. However it turned out to be a few lines more than that as I had to write a short recursion function to process the plist data because the ccl_bplist also includes $class information which needed to be stripped out. Also as I found out, this will only work with Python 3 because Python 2 does not distinguish between the types str and bytes. In Python 2, all these (below) are the same:

This creates a problem as biplist cannot distinguish between binary blobs and strings as they all appear to be strings and it fails when it encounters a byte that cannot be encoded as a string. No problems with python 3 as these are different distinct types there.

The code is here.


Since writing this, I did find another library (bpylist2) that has similar functions, reading/writing binary plists as well as creating/reading keyed archives. I haven't tested it yet.

(Update) Our good friends at Aon also maintain a library for plist parsing which seems pretty solid and worth checking out here.

Saturday, November 24, 2018

$Recycle bin and Undo operations

This week Phil Moore made an excellent finding (link here), one that most of us have seen for years but not investigated. Those $I files that seem orphaned/abandoned without explanation now have one. Phil notes that every time a file/folder is deleted and then restored, the $I file stays behind in the recycle bin. This post is about some follow up testing and results.

I did some quick tests - deleting and restoring files in a few different ways testing $I & $R file creation every time. I'm not even looking at indexes or timestamps, just file creation/deletion. My testing was on Windows 10 (32bit) version 10.0.10586.106

I'm not reproducing all the output here, but only the summary. Here are a few ways to send files to the recycle bin. Remember you cannot do any operations using the command line (as that operates directly at the file system level and does not use the recycle bin abstraction).

First we delete a file (right-click and select Delete). Let's try to restore now using any one of the following ways:

1. Right-click the file, click Restore on file in the recycle bin
2. Cut file from recycle bin and Paste elsewhere
3. Drag the file from recycle bin and into another folder
4. Undo the last operation using Ctrl-Z  OR  right-click & select Undo Delete

Only the 4th method (Undo Delete) results in deletion of the $I file. The other methods leave it behind.

I believe this to be a windows bug. Interestingly, when you Restore a file from the bin (right-click & Restore), then right-click on the desktop (or in any folder), the context menu has an new item called 'Undo Move'. It does NOT say 'Redo Delete'. For every other action, you will see a 'Redo ACTION' in the menu (see figure 2). So I believe when you restore a file, windows just performs a file move (on the $R file) and thus marked it as a move in the last performed action. 

Figure 1 - 'Undo Move' seen after restoring file
Now, clicking on the Undo Move will result in sending the file back to the recycle bin but as a new $I & $R pair, as if it was an entirely new delete operation. 

Figure 2 - 'Redo Rename' seen after using 'Undo Rename' from an unrelated file rename operation
In figure 3 below, you can see multiple abandoned $I files after doing this multiple times (delete, then restore).

Figure 3 - Recycle bin folder after multiple deletes and restores showing abandoned $I files
Let us know (comments or twitter) if you know of more ways to delete/restore.

Tuesday, November 6, 2018

The ._ (dot-underscore) file format

If you've ever looked at removable media and found several hidden files which start with ._ and there exists one for almost every file (or folder) on the disk, this is the result of having that media being used on macOS.

macOS keeps a copy of file metadata in a separate area, known as Extended Attributes (xattr) on HFS+ or APFS. However, when writing to external media which is not formatted as HFS+ or APFS (thus not having the capability to store extended attributes), it will write this information out as a separate file which will have the same name, just prefixed with dot-underscore  ._  as seen in the screenshot below.

Figure 1 - Screenshot showing exFAT volume on External USB disk
While this is well known for many years, this information is often overlooked in a forensic investigation. On media that has interacted with both macOS and Windows (or even Linux), macOS will create these files and delete them too when the original file is deleted. However, if the file is deleted or renamed on Windows or Linux, then the dot-underscore files will be left behind untouched. A while back, Lee Whitfield touched upon this here specifically pointing out its use for knowing the date & time that a file was copied onto the media. However, there is useful information inside the file too. 

This file can contain useful metadata such as kMDItemWhereFroms (URL of file if downloaded from internet) and kMDItemDownloadedDate (Date & Time when it was downloaded) among other extended attributes.

After a bit of reverse engineering, I wrote an 010 hex editor template to parse this information, it is  available at

In the screenshot below, you can see it being run on one such file. 

Figure 2 - template output

Here is analysis from the data extracted:

Attribute Name Value Meaning
kMDItemWhereFroms URL from where it was downloaded
kMDItemDownloadedDate 0x41BFB51D1CFFA4F8 (11/09/2017 23:32:44) Timestamp when file was downloaded 0083;5a04e59c;Safari;A451620D-2B49-49BD-ADC1-88DEBEA66582 File was downloaded using the Safari browser*

kMDItemDownloadedDate is stored in a plist as a date value, which is stored as a 64-bit double that is the number of seconds since 01/01/2001
The template does not parse the plist for you, you can export it out and open in any plist viewer to view the human-readable date value.

*The 3rd item in's value (separated by ;) is the Application (agent) name which downloads the file. For more details on this, read Howard Oakley's blog post

Tuesday, October 16, 2018

The user spotlight database

On macOS, the spotlight database is a central database holding metadata of all files/folders that macOS indexes and is always located at the root of any disk under /.Spotlight-V100

However, while browsing the folders on my macOS 10.14 (Mojave) image, I find a folder that contains yet another spotlight database. It appears that there are now more than one spotlight databases on a single disk. There is one for each user located at:
As with the other spotlight database, the files that hold the information are store.db and .store.db.

Mojave (10.14) isn't the first version of macOS to include this database. This appeared first in High Sierra (10.13).

What is in it?

The per-user database store is used to store metadata from items that aren't files or folders. Items seen so far are:
  • Safari browser history (web pages visited)
  • Safari browser bookmarks
  • News App history (web pages visited)
  • Notes App notes
  • Maps App data (locations?)
I would speculate that emails would also be seen here, as a number of email related fields are present too:
  • kMDItemRecipientEmailAddresses
  • kMDItemPrimaryRecipientEmailAddresses
  • kMDItemAdditionalRecipientEmailAddresses
  • kMDItemHiddenAdditionalRecipientEmailAddresses

However no email metadata was seen with a single configured IMAP account in the Mail app. 

Since this is a test environment with very little activity and almost no apps other than those that come with macOS, there is likely to be a lot more metadata from different apps in this database on real world systems.

Why a separate database?

Reading Apple's documentation here and here seems to suggest that this is the implementation of functionality intended to allow app developers to provide in-app content searches and includes the ability to define metadata to do so. Items indexed are not required to be files.

Prior to this (10.12 and below) there existed a folder at

which housed all the *.webhistory files. See pic below.

Figure 1 - webhistory files in ~/Library/Caches/Metadata/Safari/

The individual files were plists which were then indexed by spotlight.

Figure 2 - webhistory file content
That folder now does not exist and in its place we have the new spotlight database.

Parsing the data

Using mac_apt's single_plugin script to only run the spotlight plugin over individual store.db files, we can easily parse the database.

Figure 3 - mac_apt_singleplugin to parse the store.db file

There are quite a few fields, some more important than others. Below are some screenshots showing selected data (notes, safari history, and news). mac_apt gives you the data as a spreadsheet, sqlite db and a flat text file too (similar to mdls output).

Figure 4 - notes metadata from store.db (not all fields shown here)

Figure 5 - safari history from store.db (not all fields shown here)

Figure 6 - A single entry from News app showing all metadata (parsed from store.db)

mac_apt's spotlight plugin has been updated to automatically handle/process these user spotlight databases now. 

Tuesday, August 21, 2018

An open source spotlight parser

Spotlight is the name of the indexing system which comes built into macOS. It is responsible for continuous indexing of files and folders on all attached volumes. It keeps a copy of all metadata for almost every single file and folder on disk.

Thus, it can provide some excellent data for your investigation. While much of the same information can be obtained if you have access to the full disk image, it is known that there is information in this database that is not available elsewhere. Details like Date(s) Last Opened or Number of Times (an application or file) is Opened/Used are not available anywhere else on the file system. Unfortunately though it uses a proprietary undocumented format, and no publicly available code existed to read it. So over the last few months, I’ve been studying the file format of these databases and have created a tool/library to read and extract the data contained within.

The library and tool are open sourced now and located here:

The format of the database will be discussed in a later post.

For those familiar with macOS, you know this data (contained in the database) can be obtained on a locally mounted volume using the macOS built-in mdls utility. However to do this, you need to mount your disk image on a mac to do so and the utility can only be run on individual files/folders, not the entire disk. It can be run recursively (with a bit of command line fu) on the entire volume but the output is not easy to read then.

If you don't prefer to do that, run spotlight_parser instead. Just point it to the database files which are named store and .store (located in the /.Spotlight-V100/Store-V2/<UUID> folder) and let it parse out the complete database for you.

Here is a screenshot of spotlight_parser running. Depending on how much data is contained in the database, this can take anywhere between a few seconds to a few minutes (5-10 on very large disks with lots of files).

Figure 1 - Running spotlight_parser

Once done, you will have 2 files as output. One is a text file (prefix_data.txt) containing the database dump of all entries. The other is a CSV (actually tab separated) which tries to build a path for every file/folder using inode number (CNID) from data available in the database. Since not every single folder may be included, some paths may not resolve and you might get ..NOT FOUND.. in the path sometimes along with an error on the console as seen above.

In the prefix_data.txt file, you will see some XML content (configuration information) at the beginning followed by database entries for files and folders.
Below is a snippet of the prefix_data.txt file, showing only output for a single jpg image file.

Figure 2 - Output showing a single jpg file's metadata information from database

Here the text in Red is metadata pertaining to a single entry in the database, including the date and time it was last updated. This is followed by the metadata itself. The items in Blue are information only available in the spotlight database. The last two may be of particular interest to an investigator.

Note - The screenshot above is not from any special version of the code, actual output is plain text, it has no coloring! Colors were added just for explanation.

The spotlight_parser has been incorporated into mac_apt as the SPOTLIGHT plugin. In mac_apt, the output is also available in an sqlite database making it easier to query. Mark McKinnon has forked a version of this library and also added sqlite capability, it is available here.

While this exposes the data and makes it available, it is still not easy to query. Perhaps one of these days, I will write a GUI application with drop-down boxes for easily accessing and querying the output data.

Friday, July 20, 2018

APFS template for 010 Editor

For quite some time, I've been analyzing APFS mostly with custom python code, which is not very efficient and rather time consuming and is not visual. Since most people doing any kind of serious hex editing use the 010 Editor (as do I), this was long overdue.

I've created an 010 template, which is basically a port from the apfs.ksy project. This has taken quite a bit of time and I hope you find it useful. Not all structures are known, there are some parts that may be incorrect. This is a work in progress as more details about APFS emerge..


The template will not parse out the file system tree yet. With APFS this is challenging to do within 010's template capabilities as you cannot create local objects or classes and/or store temporary objects. The template does however define most of the structures and will follow most pointers (to other disk blocks and parse them) automatically when you start expanding the structures in the template viewer.

To use the template, simply load your APFS image (unencrypted only) into 010. Then edit the template to set the Apfs_Offset variable to the byte offset of wherever your APFS partition starts. Now run the template. The APFS start offset can be located easily by running the GPT template (which you can find on 010's website or in the program's template repository). The GPT template will give you the sector offset, multiply it by sector size (usually 512 or sometimes 4096) to get the byte offset (location) of the APFS partition.

Thursday, May 3, 2018

Bash sessions in macOS (and why you need to understand its working)

While all versions of macOS have provided bash_history for users, since macOS 10.11 (El Capitan), we get even more information on terminal history through the bash sessions files. This is not a replacement for the old .bash_history file which is still there.

There are several problems with bash_history - you cannot tell when any command in that file was run, the sequence of commands may not be right, and so on. For more on that, refer Hal Pomeranz's excellent talk - You don't know jack about Bash history

Even if there were no anomalies and only a single terminal was always in use, there is still the issue of how do I know which command was run when? With Bash sessions, macOS gives us more data to work with. Since El Capitan, every new terminal window will be tracked independently with a TERM_SESSION_ID which appears to be a randomly generated UUID.

Figure 1 - Fetching terminal's session id

Each session can also be restored when you shutdown and restart your machine with the "Reopen windows when logging back in" option set. Perhaps for this purpose, session history (a subset of bash history) is tracked and saved separately on a per session basis.

Figure 2 - Restored session

Show me the artifacts!

The location you want to go to is  /Users/<USER>/.bash_sessions

You will find 3 files for each session as seen in screenshot below.

Figure 3 - .bash_sessions folder contents

TERM_SESSION_ID.history    --> Contains session history
TERM_SESSION_ID.historynew --> Mostly blank/empty
TERM_SESSION_ID.session    --> Contains the last session resume date and time

Figure 4 - Sample .session file

Figure 5 - Sample .history file showing commands typed at terminal 

How this helps?

Some (but not all) of the problems associated with reading .bash_history are now gone.
Theoretically, as bash history is now also stored on a per session basis, this should make it trivial to track commands run in different windows (sessions). If you were expecting history for a single session in its .history file, then you thought wrong. The .history file contains all previous history (from earlier sessions) and then appended at the very end, the history for this session.

So can we reliably break apart commands per session? Is the sequence of commands intact? Let's run a small experiment to find out.

We create two sessions (2 terminal windows) and run a few commands in each session. Commands are interspersed, so we run a command in Session-1, then another in Session-2 and then again something in Session-1. We will try to see if order is maintained.

Session-1 started 9:44
Session-2 started 9:51
Figure 6 - Commands run with their sequence

Session-1 closed 9:57
Session-2 closed 9:59

Session-1 is closed first, followed by Session-2.  Here is a snippet of relevant metadata from the resulting files:

Figure 7 - Relevant metadata from stat command

Fun Facts

The start and stop time for a session is available if you look at the crtime (File Created time) for the .history and .historynew files. These are in bold in the screenshot above.

Created Time of TERM_SESSION_ID.historynew = Session created time
Created Time of TERM_SESSION_ID.history        = Session end time

Isolating session data

By comparing the data in various .history files (from different sessions), you can find out exactly which commands belong to a particular session. See pic below, where lines 1-181 (not shown) are from older history (other past sessions). Lines 182-184 are from Session-1 and are seen in its history file at the end. Session-2 (closed after Session-1) has the same format, ie, old session history with this session's history appended (lines 185-189).

Figure 8- .history files from Session-1 (Left) and Session-2 (Right)

This is easily done in code and the mac_apt BASHSESSIONS plugin parses this information to break out the individual commands per session, along with session start and stop time.

While you still cannot get the exact time when an individual command was run, the sessions functionality does give you a very good narrowed time frame to work with. While we do not have the absolute order of commands ("cp -h" was run before "printenv"), we do have a narrowed time-frame for the set of commands ("cp-h" run between 9:51-9:59 and "printenv" run between 9:44-9:57). This is a big thing for analysts and investigators!

Friday, February 2, 2018

Reading Notes database on macOS

The notes app comes built-in with every OSX/macOS release since OSX 10.8 (Mountain Lion). It is great for quick notes and keeps notes synced in the cloud. It can have potentially useful tidbits of user information in an investigation.

Artifact breakdown  & Forensics


Depending on version of macOS, this can vary. Sometimes there is more than one database, probably as a result of an upgrade! But only one is actively used at any given time (not both). So far, we haven't seen any duplicate data when two are present.

Location 1

In here, databases will be named as one of the following:
NotesV1.storedata ← Mountain Lion  (Thanks Geoff Black)
NotesV2.storedata ← Mavericks
NotesV4.storedata ← Yosemite
NotesV6.storedata ← Elcapitan & Sierra
NotesV7.storedata ← HighSierra
If a note has an attachment, then the attachment is usually stored at the following location:

There does not appear to be much difference between the database types. The following tables have been seen.
Tables for NotesV2
Tables for NotesV6

Each note can be associated with either a local account or an online one. The following account information can be obtained.  

Account email address, id and username from ZACCOUNT table

Individual note data is stored in ZNOTEBODY and the rest of the tables provide information about note parent folder, sync information, and attachment locations. 

Notes converts all data to HTML as seen below.

ZNOTES Table with note Html content
The graphic below shows how you can find and resolve note attachments to their locations on disk. If an attachment is present, ZNOTEBODY.ZHTMLSTRING will contain the UUID of that which can be matched up to the ZATTACHMENT.ZCONTENTID to get a binary plist blob. When parsed, you can find the full path to the attachment in the plist.

Resolving attachment location

The dates and times fetched are Mac Absolute time, which is the number of seconds since 1/1/2001.

Reading the data

The following SQL query will pull out most pertinent information from this database:

SELECT n.Z_PK as note_id, datetime(n.ZDATECREATED + 978307200, 'unixepoch') as created, datetime(n.ZDATEEDITED + 978307200, 'unixepoch') as edited, n.ZTITLE, (SELECT ZNAME from ZFOLDER where n.ZFOLDER=ZFOLDER.Z_PK) as Folder,(SELECT zf2.ZACCOUNT from ZFOLDER as zf1  LEFT JOIN ZFOLDER as zf2 on (zf1.ZPARENT=zf2.Z_PK) where n.ZFOLDER=zf1.Z_PK) as folder_parent,ac.ZEMAILADDRESS as email, ac.ZACCOUNTDESCRIPTION, b.ZHTMLSTRING, att.ZCONTENTID, att.ZFILEURLFROM ZNOTE as nLEFT JOIN ZNOTEBODY as b ON b.ZNOTE = n.Z_PKLEFT JOIN ZATTACHMENT as att ON att.ZNOTE = n.Z_PKLEFT JOIN ZACCOUNT as ac ON ac.Z_PK = folder_parent

Location 2

/Users/<USER>/Library/Group Containers/
This one has been seen on El Capitan, Sierra and HighSierra. Attachments are stored in the Media folder located here:
/Users/<USER>/Library/Group Containers/<UUID>/
Here UUID is the unique identifier for each attachment. The database scheme is different here.

NoteStore.sqlite in HighSierra

NotesStore.sqlite in ElCapitan

Only account name and identifier (UUID) is available. To get full account information, this will need to be correlated with the account info database stored elsewhere.
Note data is available in the ZICNOTEDATA table. 

That ZICNOTEDATA.ZDATA blob is gzip compressed. Upon decompression, it reveals the note data stored in a proprietary unknown binary format. As seen below, you can spot the text, its formatting information and attachment info.

ZDATA gzipped blob showing gzip signature (1F8B08)
Uncompressed ZDATA blob showing text, formatting and attachment info

Most notable in this database is the presence of several timestamps-
Attachment Modified → ZMODIFICATIONDATE
Attachment Preview Updated → ZPREVIEWUPDATEDATE

Reading NoteStore.sqlite

The following SQL query will pull out most pertinent information from this database:
SELECT n.Z_12FOLDERS as folder_id , n.Z_9NOTES as note_id, d.ZDATA as data,
c2.ZTITLE2 as folder,
datetime(c2.ZDATEFORLASTTITLEMODIFICATION + 978307200, 'unixepoch') as folder_title_modified,
datetime(c1.ZCREATIONDATE + 978307200, 'unixepoch') as created,
datetime(c1.ZMODIFICATIONDATE1 + 978307200, 'unixepoch')  as modified,
c1.ZSNIPPET as snippet, c1.ZTITLE1 as title, c1.ZACCOUNT2 as acc_id,
c5.ZACCOUNTTYPE as acc_type, c5.ZIDENTIFIER as acc_identifier, c5.ZNAME as acc_name,
c3.ZMEDIA as media_id, c3.ZFILESIZE as att_filesize,
datetime(c3.ZMODIFICATIONDATE + 978307200, 'unixepoch') as att_modified,
datetime(c3.ZPREVIEWUPDATEDATE + 978307200, 'unixepoch') as att_previewed,
c3.ZTITLE as att_title, c3.ZTYPEUTI, c3.ZIDENTIFIER as att_uuid,
c4.ZFILENAME, c4.ZIDENTIFIER as media_uuid
ORDER BY note_id
On HighSierra, use this query:
If you are looking for an automated way to read this, use mac_apt, the NOTES plugin will parse it.