Friday, March 2, 2012

EnScript Tutorial 1 - Parse XP System Restore Logs

This post introduces EnScript programming through a real-world example: parsing the Windows XP restore point logs to extract the original filename and file path of files collected by System Restore. A link to the complete code file is at the bottom of this post.

System Restore
In Windows XP, System Restore creates backup copies of files under the "<DRIVE>\System Volume Information" folder. Each copy is identical to the original in data and metadata except for two details: the filename and the path. A filename has the form "name.extension"; the backup keeps the correct extension, but the name becomes "Axxxxxxx", where each x is a digit. Files such as A0000001.ini, A0000002.txt and A0000005.lnk are therefore common here.
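
As a quick illustration of the naming pattern, here is a minimal Python sketch (not EnScript) that tests whether a filename looks like a restore point copy. It assumes the name is always "A" followed by exactly seven digits, as in the examples above; the function name is_restore_copy is made up for this example.

```python
import re

# Assumption: "A" + exactly seven digits + "." + the original extension,
# matching the examples in this post (A0000001.ini and so on).
RESTORE_NAME = re.compile(r"A\d{7}\.[^.\\/]+", re.IGNORECASE)


def is_restore_copy(name: str) -> bool:
    """Return True if name looks like a System Restore backup copy."""
    return RESTORE_NAME.fullmatch(name) is not None
```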

For analysis, we need to know the original filename and full path of each Axxxxxxx file. This information can be parsed from the change.log or change.log.x files found in the same folder. We shall not discuss the format of the change.log file, as it is documented in detail elsewhere on the web. Instead we shall focus on how to write a program to accomplish this task using EnScript.

How to parse file name and path from change.log?
We shall use a shortcut instead of reading the entire structure of the file. From studying the format, I found that the path sits inside a structure that begins with the signature (hex string) "12EFCDAB", and that the path starts 64 bytes from the start of the signature. We shall therefore search for the hex string "12EFCDAB" and then skip 64 bytes from the hit to get to the path string. The filename follows 8 bytes beyond the end of the path. This will be our strategy.
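
To make the strategy concrete before turning to EnScript, here is a rough Python sketch of the same idea applied to a buffer of bytes. It assumes both strings are NUL-terminated UTF-16LE and that the 8-byte gap is counted from just after the path's terminator; the helper names read_utf16z and parse_records are invented for this illustration.

```python
SIGNATURE = bytes.fromhex("12EFCDAB")  # signature that starts the structure
PATH_OFFSET = 64                       # path begins 64 bytes from the signature
GAP = 8                                # bytes between end of path and the name


def read_utf16z(buf: bytes, pos: int):
    """Read a NUL-terminated UTF-16LE string at pos.

    Returns (text, position just after the terminator).
    """
    end = pos
    while end + 1 < len(buf) and buf[end:end + 2] != b"\x00\x00":
        end += 2
    text = buf[pos:end].decode("utf-16-le", errors="replace")
    return text, min(end + 2, len(buf))


def parse_records(buf: bytes):
    """Return a list of (new_name, original_path) for every signature hit."""
    results = []
    start = 0
    while True:
        hit = buf.find(SIGNATURE, start)
        if hit == -1:
            break
        path, pos = read_utf16z(buf, hit + PATH_OFFSET)
        name, _ = read_utf16z(buf, pos + GAP)
        results.append((name, path))
        start = hit + len(SIGNATURE)  # continue searching after this hit
    return results
```

Searching for the bare 4-byte signature is the simplest form of the idea; a longer pattern reduces the chance of false hits.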

Searching in enscript
Searching is not a one-liner; it takes some setup code, in roughly four steps:
  1. Create SearchClass object
  2. Add keyword(s)
  3. Open a file and use the search object to find hits
  4. Iterate through the hits array to get details of every hit
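
For readers who do not have EnCase at hand, the same four steps can be mirrored in Python with the standard re module. This is only an analogy, not the EnScript API: the compiled pattern plays the role of the search object and keyword, and the 16-byte pattern is the one used for the signature search later in this post. The function names are made up for this sketch.

```python
import re

# Steps 1 and 2: build the "search object" with one 16-byte pattern.
# DOTALL lets "." match any byte, including newline bytes.
PATTERN = re.compile(rb"\x12\xEF\xCD\xAB.{4}\x15\x00\x00\x00.{4}", re.DOTALL)


def find_hits_in_bytes(data: bytes):
    """Step 4: iterate over the hits and collect the offset of each one."""
    return [m.start() for m in PATTERN.finditer(data)]


def find_hits_in_file(path: str):
    """Step 3: open a file and run the search over its contents."""
    with open(path, "rb") as f:
        return find_hits_in_bytes(f.read())
```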

Program logic and concepts
We have broken down our program into a set of small manageable tasks for ease of understanding.
  1. Recurse folders to find change.log.x files in System Volume Information

    The change.log files will always have a full path resembling
    "\System Volume Information\_restore{GUID}\RPxx\change.log.x"
    We can use this information to craft logic as follows:

    forall (EntryClass e in c.EntryRoot()) {
      if (e.Name().Find("change.log") == 0 &&                  // name starts with "change.log"
          e.Parent().Name().Find("RP") == 0 &&                 // parent folder is RPxx
          e.Parent().Parent().Name().Find("_restore{") == 0) { // grandparent is _restore{GUID}
        // Found change.log, now process it here
      }
    }

  2. Search file for our artifact signature (12EFCDAB)

    SearchClass search();  // create SearchClass object
    search.AddKeyword("\\x12\\xEF\\xCD\\xAB....\\x15\\x00\\x00\\x00....", KeywordClass::GREP);   // add 16 byte GREP keyword
    search.Create();       // initialize search object
    EntryFileClass file(); // create file object
    file.Open(entry);      // open entry (the change.log entry found in step 1) as file
    search.Find(file);     // execute the search on the file
    foreach (SearchClass::HitClass hit in search.GetHits()) {
      // Process search hits here
    }

  3. Parse artifacts and print to console

    Once we have the hit object, we can seek into the file at the appropriate offsets and read out the data.

    file.Seek(hit.Offset() + 64); // seek to 64 bytes past the hit offset
    String fullpath;
    String newfilename;
    file.SetCodePage(CodePageClass::UNICODE); // codepage set to unicode
    file.ReadString(fullpath);    // Read path
    file.Skip(8);                 // Skip 8 bytes
    file.ReadString(newfilename); // Read name, ie, Axxxxxxx
    Console.WriteLine(newfilename + "      " + fullpath); // Print information to console
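
The folder test from step 1 can also be tried outside EnCase. Below is a small Python sketch that applies the same three name checks to a directory tree on disk, for example an exported copy of the volume. It is an assumption-laden stand-in for the EnScript forall loop, and find_change_logs is a name invented for this example.

```python
import os


def find_change_logs(root: str):
    """Return paths of change.log* files under _restore{...}/RP* folders."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        parent = os.path.basename(dirpath)
        grandparent = os.path.basename(os.path.dirname(dirpath))
        # Same checks as the EnScript snippet: parent starts with "RP",
        # grandparent starts with "_restore{".
        if not (parent.startswith("RP") and grandparent.startswith("_restore{")):
            continue
        for name in files:
            if name.startswith("change.log"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Each path returned this way could then be read and searched for the signature as described in steps 2 and 3.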
