The /var/folders (which is actually /private/var/folders) has been a known but undocumented entity for a very long time. Apple does not document why its there, or what it is. But there are plenty of guides online that suggest it may be good to periodically clear those folders to recover disk space and perhaps speed up you system. Whether that is wise is beyond the scope of this blog post.
If you've ever done a MacOS (OSX) forensic examination, you've probably noticed some odd folders here, all two character folder names with long random looking subfolders. Something like /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl/. A level below are 3 folders C, T, 0 and you can then see familiar data files under those. The 3 folders represent Cache (C), Temporary files (T) and User files (0)
On a live system these can be queried using the command 'getconf VARIABLE' where VARIABLE can be DARWIN_USER_CACHE_DIR, DARWIN_USER_TEMP_DIR or DARWIN_USER_DIR.
These are locations where mac stores cache and temporary files from various programs and system utilities. They are further segregated on a per-user basis. So each of those folders (Example: /var/folders/_p/vbvyjz297778zzwzhfpsqb2w0000gl) represents a single user's temporary space.
Whats in it for forensicators?
From a forensics examination perspective, there is not a lot of artifacts here. However some useful artifacts like Notifications databases and Quicklook thumbnailcache databases are located here. The Launchpad dock database is also here. Sometimes you can find useful tidbits of cache from other apps too.
|Figure: /private/var/folders on MacOS 10.12 (Sierra)|
It would be nice to be able to determine which user owned a particular folder when analyzing the artifacts within it. This is actually really easy, as you can just look up the owner uid of the folder. But if you are more interested in how the name gets generated, read on.
Reverse engineering the Algorithm
There is an old forum post here that does not provide the algorithm but hints that its likely generated from uuid and uid. Both of which are available from the user's plist (under /var/db/dslocal/nodes/Default/users/<USER>.plist). From the narration, the character set used in the folder names would be '+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'. However that's not what is seen on newer macs (10.10 onwards)
After analyzing the folder names on my mac test machines, the data set was narrowed down to 0-9, underscore and all alphabets except vowels (aeiou). A little bit of research confirmed this, when the string '0123456789_bcdfghjklmnpqrstvwxyz' was seen in the libsystem_coreservices.dylib binary. The string is 32 char in size, so you would need 5 bits to represent an index which could point to a specific character in the string. After a bit of experimentation with creating a few user accounts and setting custom UUIDs, it was clear how this worked. The algorithm takes the 128 bit UUID string as a binary bitstream, appends to it the binary bitstream of the UID (4 byte), then runs a single pass over that data. For each 5 bits it reads, it uses that as an index to get a single char from the charset array and copies that to output. A python implementation that generate these folder names (for both new osx versions and older ones) is provided here.