How To: Use Skip Data Mode to Handle Data

Ahmed Garhy edited this page Jan 12, 2019 · 1 revision

Overview

Often, you might be interested in detecting, and possibly handling, an invalid or broken instruction when one is encountered. Most of the time, an invalid instruction is encountered because data is mixed in with the binary code being disassembled. Capstone's option for handling this case is called Skip Data Mode. More information can be found on the Capstone website.

By default, Skip Data Mode is disabled: a disassemble operation stops prematurely when data is encountered, and only the instructions disassembled successfully before that point are returned. This makes it difficult, if not impossible, to determine whether the returned instructions represent the entire binary code or just a subset of it.

You can, however, override this default behavior by enabling Skip Data Mode and either 1) letting Capstone automatically determine how many bytes to skip once it encounters data, so that it continues disassembling at the next valid instruction it finds, or 2) manually telling Capstone exactly how many bytes to skip once it encounters data, effectively pointing it to the next valid instruction to continue disassembling from. In either case, the disassemble operation does not stop prematurely.

Disassembling With Skip Data Mode Disabled

Let's look at an example where Skip Data Mode is disabled and a disassemble operation prematurely stops when data is encountered:

using Gee.External.Capstone;
using Gee.External.Capstone.Arm;

using (CapstoneArmDisassembler disassembler = CapstoneDisassembler.CreateArmDisassembler(ArmDisassembleMode.Arm)) {
    disassembler.EnableInstructionDetails = true;
    disassembler.DisassembleSyntax = DisassembleSyntax.Intel;

    var binaryCode = new byte[] {
        0xed, 0x00, 0x00, 0x00, 0x00, 0x1a, 0x5a, 0x0f, 0x1f, 0xff, 0xc2, 0x09, 0x80, 0x00, 0x00, 0x00,
        0x07, 0xf7, 0xeb, 0x2a, 0xff, 0xff, 0x7f, 0x57, 0xe3, 0x01, 0xff, 0xff, 0x7f, 0x57, 0xeb, 0x00,
        0xf0, 0x00, 0x00, 0x24, 0xb2, 0x4f, 0x00, 0x78
    };

    // ...
    //
    // This operation will only return 5 instructions because it encounters data
    // after the 5th instruction and Skip Data Mode is disabled.
    ArmInstruction[] instructions = disassembler.Disassemble(binaryCode);
}

The above example is relatively simple. It illustrates that when Skip Data Mode is disabled and a disassemble operation stops prematurely because data is encountered, it is difficult to determine whether the returned instructions represent the entire binary code or just a subset of it.

Disassembling With Skip Data Mode Enabled

Let's look at an example where Skip Data Mode is enabled and we let Capstone automatically determine how many bytes to skip once it encounters data, so that it continues disassembling at the next valid instruction it finds:

using Gee.External.Capstone;
using Gee.External.Capstone.Arm;

using (CapstoneArmDisassembler disassembler = CapstoneDisassembler.CreateArmDisassembler(ArmDisassembleMode.Arm)) {
    // ...
    //
    // Enable details for disassembled instructions. Even though instruction details are enabled, skipped data
    // instructions will never have them.
    disassembler.EnableInstructionDetails = true;
    disassembler.DisassembleSyntax = DisassembleSyntax.Intel;

    // ...
    //
    // By enabling Skip Data Mode, we let Capstone automatically skip over data and continue
    // disassembling at the next valid instruction it finds.
    disassembler.EnableSkipDataMode = true;

    var binaryCode = new byte[] {
        0xed, 0x00, 0x00, 0x00, 0x00, 0x1a, 0x5a, 0x0f, 0x1f, 0xff, 0xc2, 0x09, 0x80, 0x00, 0x00, 0x00,
        0x07, 0xf7, 0xeb, 0x2a, 0xff, 0xff, 0x7f, 0x57, 0xe3, 0x01, 0xff, 0xff, 0x7f, 0x57, 0xeb, 0x00,
        0xf0, 0x00, 0x00, 0x24, 0xb2, 0x4f, 0x00, 0x78
    };

    // ...
    //
    // This operation will return 10 instructions because, with Skip Data Mode enabled, it
    // skips over the data it encounters instead of stopping.
    ArmInstruction[] instructions = disassembler.Disassemble(binaryCode);
    foreach (ArmInstruction instruction in instructions) {
        if (!instruction.IsSkippedData) {
            // ...
            //
            // Only access details for instructions that are not skipped data. Skipped data
            // instructions never have details, even though instruction details are enabled, and
            // an exception is thrown if you try to access them.
            ArmInstructionDetail instructionDetails = instruction.Details;
        }
    }
}

The above example is relatively simple and illustrates 2 important things. First, we enabled instruction details and enabled Skip Data Mode. Second, while enumerating the instructions, we check whether each instruction is a skipped data instruction using ArmInstruction.IsSkippedData. This check is essential when Skip Data Mode is enabled because, even if instruction details are enabled, they are not available for skipped data instructions. An exception is thrown if you attempt to access a skipped data instruction's details.

It's worth noting that when Skip Data Mode is disabled, you'll never have a skipped data instruction, as mentioned earlier, because the disassemble operation stops prematurely when data is encountered. So checking whether an instruction is a skipped data instruction when Skip Data Mode is disabled is moot.

Disassembling With Skip Data Mode Enabled and a Callback

Let's look at an example where Skip Data Mode is enabled and we manually tell Capstone exactly how many bytes to skip once it encounters data, effectively pointing it to the next valid instruction to continue disassembling from:

using Gee.External.Capstone;
using Gee.External.Capstone.Arm;

using (CapstoneArmDisassembler disassembler = CapstoneDisassembler.CreateArmDisassembler(ArmDisassembleMode.Arm)) {
    // ...
    //
    // Enable details for disassembled instructions. Even though instruction details are enabled, skipped data
    // instructions will never have them.
    disassembler.EnableInstructionDetails = true;
    disassembler.DisassembleSyntax = DisassembleSyntax.Intel;
    
    // ...
    //
    // By enabling Skip Data Mode with a callback, we manually tell Capstone exactly how many bytes to skip in the binary 
    // code once it encounters data and effectively point it to the next valid instruction to continue 
    // disassembling from.
    disassembler.EnableSkipDataMode = true;
    disassembler.SkipDataCallback = (c, i) => {
        // ...
        //
        // c == the binary code that is being disassembled. i == the index of the byte that could not be disassembled.
        // Let's assume we analyzed the binary code here and decided to tell Capstone to skip 2 bytes to the next valid
        // instruction.
        return 2;
    };

    var binaryCode = new byte[] {
        0xed, 0x00, 0x00, 0x00, 0x00, 0x1a, 0x5a, 0x0f, 0x1f, 0xff, 0xc2, 0x09, 0x80, 0x00, 0x00, 0x00,
        0x07, 0xf7, 0xeb, 0x2a, 0xff, 0xff, 0x7f, 0x57, 0xe3, 0x01, 0xff, 0xff, 0x7f, 0x57, 0xeb, 0x00,
        0xf0, 0x00, 0x00, 0x24, 0xb2, 0x4f, 0x00, 0x78
    };

    ArmInstruction[] instructions = disassembler.Disassemble(binaryCode);
    foreach (ArmInstruction instruction in instructions) {
        if (!instruction.IsSkippedData) {
            // ...
            //
            // Only access details for instructions that are not skipped data. Skipped data
            // instructions never have details, even though instruction details are enabled, and
            // an exception is thrown if you try to access them.
            ArmInstructionDetail instructionDetails = instruction.Details;
        }
    }
}

The above example is relatively simple and illustrates 3 important things. First, we enabled instruction details and enabled Skip Data Mode with a callback. Second, when data is encountered, Capstone will call our callback with the binary code being disassembled and the index of the byte that could not be disassembled as arguments. Our callback returns a value, in this case 2, to tell Capstone exactly how many bytes to skip in the binary code, starting from the index of the byte that could not be disassembled, and effectively point it to the next valid instruction to continue disassembling from.

Third, while enumerating the instructions, we check whether each instruction is a skipped data instruction using ArmInstruction.IsSkippedData. This check is essential when Skip Data Mode is enabled because, even if instruction details are enabled, they are not available for skipped data instructions. An exception is thrown if you attempt to access a skipped data instruction's details.

It's worth noting that when Skip Data Mode is disabled, you'll never have a skipped data instruction, as mentioned earlier, because the disassemble operation stops prematurely when data is encountered. So checking whether an instruction is a skipped data instruction when Skip Data Mode is disabled is moot.
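
The callback in the example above always returns 2. As a minimal sketch of what a less arbitrary decision might look like, the helper below skips to the next 4-byte boundary, on the assumption that the code is fixed-width ARM (not Thumb) and that instructions are 4-byte aligned relative to the start of the buffer. The class name SkipDataSketch and the method name BytesToSkip are hypothetical and are not part of Capstone.NET; the parameter types are also an assumption based on the (c, i) pair described above.

```csharp
using System;

// Hypothetical helper, not part of Capstone.NET. It decides how many bytes to skip
// when disassembly hits data, assuming fixed-width, 4-byte aligned ARM instructions.
public static class SkipDataSketch {
    // Assumed instruction width for ARM mode (this does not hold for Thumb).
    private const int InstructionWidth = 4;

    // code  == the binary code that is being disassembled.
    // index == the index of the byte that could not be disassembled.
    //
    // Returns the number of bytes to skip so that disassembly resumes at the next
    // 4-byte boundary, clamped to the number of bytes that remain in the buffer.
    public static int BytesToSkip(byte[] code, long index) {
        if (code == null) {
            throw new ArgumentNullException(nameof(code));
        }
        if (index < 0 || index >= code.Length) {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        long remaining = code.Length - index;
        long toBoundary = InstructionWidth - (index % InstructionWidth);
        return (int) Math.Min(remaining, toBoundary);
    }
}
```

If the callback's delegate type accepts these parameter and return types (an assumption, so check the SkipDataCallback declaration in your version of the library), it could be assigned as disassembler.SkipDataCallback = (c, i) => SkipDataSketch.BytesToSkip(c, i). A real callback would more likely inspect the bytes themselves, for example to recognize a literal pool, rather than rely on alignment alone.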

Customizing Skipped Data Instructions' Mnemonic

When Skip Data Mode is enabled, a disassemble operation returns skipped data instructions along with valid instructions. A skipped data instruction's mnemonic is assigned a default value by Capstone. You can however override this value. Let's look at an example:

using Gee.External.Capstone;
using Gee.External.Capstone.Arm;

using (CapstoneArmDisassembler disassembler = CapstoneDisassembler.CreateArmDisassembler(ArmDisassembleMode.Arm)) {
    disassembler.EnableInstructionDetails = true;
    disassembler.DisassembleSyntax = DisassembleSyntax.Intel;

    // ...
    //
    // By enabling Skip Data Mode, Capstone will automatically skip over data and continue
    // disassembling at the next valid instruction it finds. In addition, we tell Capstone to set skipped data
    // instructions' mnemonic to "db".
    disassembler.EnableSkipDataMode = true;
    disassembler.SkipDataInstructionMnemonic = "db";

    var binaryCode = new byte[] {
        0xed, 0x00, 0x00, 0x00, 0x00, 0x1a, 0x5a, 0x0f, 0x1f, 0xff, 0xc2, 0x09, 0x80, 0x00, 0x00, 0x00,
        0x07, 0xf7, 0xeb, 0x2a, 0xff, 0xff, 0x7f, 0x57, 0xe3, 0x01, 0xff, 0xff, 0x7f, 0x57, 0xeb, 0x00,
        0xf0, 0x00, 0x00, 0x24, 0xb2, 0x4f, 0x00, 0x78
    };

    // ...
    //
    // This operation will return 10 instructions because, with Skip Data Mode enabled, it
    // skips over the data it encounters after the 5th instruction instead of stopping.
    ArmInstruction[] instructions = disassembler.Disassemble(binaryCode);
    foreach (ArmInstruction instruction in instructions) {
        if (instruction.IsSkippedData) {
            // ...
            //
            // The instruction is a skipped data instruction, so its mnemonic == "db".
            var instructionMnemonic = instruction.Mnemonic;
        }
    }
}