Thursday, 3 August 2017

Finding the Serial number of a Mac from disk image

On a mac (osx/macOS), the serial number is usually not stored on the disk, it is stored in the firmware and available either printed on the backside/underside of your mac/macbook computer or accessible via software on a booted system using 'About My Mac' or System Profiler.

On recent versions of OSX, there are however a few system databases that store this information and make it available for forensic investigators to use (or for verification). These are:
  • consolidated.db
  • cache_encryptedA.db
  • lockCache_encryptedA.db
All the above files are sqlite databases located in the 'root' user's Darwin user cache folder located under /private/var/folders/zz/zyxvpxvq6csfxvn_n00000sm00006d/C/. This location should be the same for all OSX/macOS installations (10.9 & above) because UID and UUID of root is same on all systems and does not change. 
For more information on Darwin folders, see this blog post. 

Screenshot 1 - Table 'TableInfo' inside consolidated.db showing Serial Number

In the above screenshot, the serial number is seen starting with 'VM'. It starts with VM since this was a virtual machine; for real machines, you will see the actual hardware serial number here. I was able to verify this on several macs running osx 10.9 to 10.12. 

In addition, other software might retrieve and store this information too. One such software, is KeyAccess, installed by Sassafras asset management system. KeyAccess leaves behind a binary file /Library/Preferences/KeyAccess/KeyAccess Prefs which also contains the serial number.

Another place where you might find the serial is sysinfo.cache. This is created by Apple Remote Desktop and is found at /var/db/RemoteManagement/caches/sysinfo.cache.

Friday, 28 April 2017

The mystery of /var/folders on OSX

The /var/folders (which is actually /private/var/folders) has been a known but undocumented entity for a very long time. Apple does not document why its there, or what it is. But there are plenty of guides online that suggest it may be good to periodically clear those folders to recover disk space and perhaps speed up you system. Whether that is wise is beyond the scope of this blog post.

If you've ever done a MacOS (OSX) forensic examination, you've probably noticed some odd folders here, all two character folder names with long random looking subfolders. Something like /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl/. A level below are 3 folders C, T, 0 and you can then see familiar data files under those. The 3 folders represent Cache (C), Temporary files (T) and User files (0)

On a live system these can be queried using the command 'getconf VARIABLE' where VARIABLE can be DARWIN_USER_CACHE_DIR, DARWIN_USER_TEMP_DIR or DARWIN_USER_DIR.

These are locations where mac stores cache and temporary files from various programs and system utilities. They are further segregated on a per-user basis. So each of those folders (Example: /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl) represents a single user's temporary space. 

Whats in it for forensicators?

From a forensics examination perspective, there is not a lot of artifacts here. However some useful artifacts like Notifications databases and Quicklook thumbnailcache databases are located here. The Launchpad dock database is also here. Sometimes you can find useful tidbits of cache from other apps too.

Figure: /private/var/folders on MacOS 10.12 (Sierra) 

It would be nice to be able to determine which user owned a particular folder when analyzing the artifacts within it. This is actually really easy, as you can just look up the owner uid of the folder. But if you are more interested in how the name gets generated, read on.

Reverse engineering the Algorithm

There is an old forum post here that does not provide the algorithm but hints that its likely generated from uuid and uid. Both of which are available from the user's plist (under /var/db/dslocal/nodes/Default/users/<USER>.plist). From the narration, the character set used in the folder names would be '+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'. However that's not what is seen on newer macs (10.10 onwards)

After analyzing the folder names on my mac test machines, the data set was narrowed down to 0-9, underscore and all alphabets except vowels (aeiou). A little bit of research confirmed this, when the string '0123456789_bcdfghjklmnpqrstvwxyz' was seen in the libsystem_coreservices.dylib binary. The string is 32 char in size, so you would need 5 bits to represent an index which could point to a specific character in the string. After a bit of experimentation with creating a few user accounts and setting custom UUIDs, it was clear how this worked. The  algorithm takes the 128 bit UUID string as a binary bitstream, appends to it the binary bitstream of the UID (4 byte), then runs a single pass over that data. For each 5 bits it reads, it uses that as an index to get a single char from the charset array and copies that to output. A python implementation that generate these folder names (for both new osx versions and older ones) is provided here.


Thursday, 5 January 2017

Flexing SQL muscle for parsing an MS db on OSX

This post is about using recursive SQL queries (or rather a single recursive query) to parse the MicrosoftRegistrationDB.reg file created by Microsoft office on Mac OSX systems.

A little background..

On OSX (mac), there is no registry. Most apps just use plist files instead to save local information. Microsoft Office leaves cache and configuration information in plist files like every other OSX application. However it also keeps a copy in this file – microsoftRegistrationDB.reg. The file can be found here –

/Users/research/Library/Group Containers/xxxxxxxxxx.Office/MicrosoftRegistrationDB.reg

This is an sqlite database which is a flattened version of the registry tree that office would create in windows under HKCU\Software\Microsoft\Office, the format of which is quite straight-forward and documented. Some useful MRU artifacts and configuration settings reside here.

The sqlite database has the same fields as in the registry, namely - key, key_last_modified_time, value_name, value_type and value_data. This is nicely arranged in the following table structure.

Figure 1 - Database table schema

Pulling the data out is fairly simple in SQL. However, if you wish to recreate _all_ the registry paths from the flattened tree, then it’s a bit more involved. In the HKEY_CURRENT_USER table, each key has a single entry and along with the key name, you have the parent key reference. As an analyst, you would like to get full key path (i.e. HKCU\Software\Microsoft\...) for every value. There lies the problem. To recreate the paths for every single registry value, you would have to run several individual SQL queries, each query would fetch a key’s parent, and you keep doing that till you reach the ROOT of the tree. You could do this in a recursive function in python. Or you can let SQL do the recursion by running a recursive query. Sqlite supports recursive queries. You can read up about recursive queries in sqlite here and here.

The Final Query

SELECT t2.node_id, t2.write_time, path as Key, as valueName,
  HKEY_CURRENT_USER_values.value as value,
  HKEY_CURRENT_USER_values.type as valueType from
    under_software(path,name,node_id,write_time) AS 
      SELECT under_software.path || '\' ||,
        FROM HKEY_CURRENT_USER JOIN under_software ON
        ORDER BY 1
  SELECT name, path, write_time, node_id FROM under_software


Here the ‘WITH RECURSIVE’ part will perform the recursive querying for every value item. It will create the full key path for that value. The line:
SELECT under_software.path || '\' ||’ will concatenate the parent key path with the sub-key name using backslash as separator. The double-pipe ‘||’ is the concatenate operator. The 'ORDER BY 1' is not really necessary, but this makes it sort the output by the first parameter to the recursive function, i.e, path.

A python script to do this automatically is available here. This script will run the recursive query on the database and then provide both a csv and a plist as output. I chose to output as a plist too because this data is best viewed as a tree as shown below.

Figure 2 - Sample plist produced by script (viewed in plist Editor Pro)