SXXXXXXX_CppPythonDebug/doc/English-manual.md
2025-06-13 16:10:56 +02:00


Cpp-Python GDB Debug Helper - User Manual

1. Introduction

1.1 What is Cpp-Python GDB Debug Helper?

The Cpp-Python GDB Debug Helper is a Graphical User Interface (GUI) designed to enhance and simplify the process of debugging C/C++ applications using the GNU Debugger (GDB). It aims to provide a more user-friendly experience compared to the GDB command-line interface, especially for tasks like inspecting complex data structures and automating repetitive debugging scenarios.

1.2 Who is it for?

This tool is primarily aimed at C/C++ developers who use GDB for debugging and would benefit from:

  • A visual interface for common GDB operations.
  • Easier inspection of complex C++ data types (structs, classes, STL containers).
  • Automation of debugging sequences through configurable profiles.
  • Structured output of variable dumps in JSON or CSV formats.

1.3 Key Features

  • Interactive Manual Debugging: Start GDB, set breakpoints, run your target program, and inspect variables.
  • Advanced Variable Dumping: Utilizes a custom GDB Python script to dump the state of C/C++ variables, including complex data structures like classes, structs, pointers, arrays, and std::string, into a structured JSON format.
  • Automated Debug Profiles: Create, manage, and execute debug profiles. Each profile can define:
    • Target executable and program parameters.
    • Multiple debug "actions", each specifying a breakpoint, variables to dump, final output format (JSON/CSV), output directory, and filename patterns.
  • Symbol Analysis: Analyze your compiled executable to extract information about functions, global variables, user-defined types, and source files. This data aids in configuring debug actions.
  • Live Scope Inspection: When configuring an action, the tool can query GDB live to list variables (locals and arguments) available at a specified breakpoint, allowing for precise selection.
  • Configurable Environment: Set paths for GDB, the custom Python dumper script, and various timeouts for GDB operations.
  • Flexible Output: Save dumped data in JSON or CSV formats with customizable filenames using placeholders for better organization.
  • GUI Logging: View application logs and raw GDB output directly within the interface.

2. System Requirements & Setup

2.1 Supported Operating Systems

  • Windows (Primary): The application is primarily developed and tested on Windows. It uses the pexpect library's Windows-compatible backend (PopenSpawn) for robust process control.
  • Linux/macOS (Experimental): The application should be compatible with Unix-like systems as pexpect is cross-platform.

2.2 Python

  • Python 3.7 or newer is recommended.

2.3 Required Python Libraries

You will need to install the following Python libraries. You can install them using pip: pip install pexpect appdirs

  • pexpect: For controlling GDB as a child process.
  • appdirs: Used for determining platform-independent user configuration and data directories (though the primary configuration is now stored relative to the application).
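  • Tkinter: Used for the GUI. It does not need to be installed with pip: it ships with the standard Python installers for Windows and macOS, while some Linux distributions provide it as a separate system package (e.g., python3-tk).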

2.4 GDB Installation

  • A working installation of the GNU Debugger (GDB) is required.
  • Ensure that GDB is either added to your system's PATH environment variable or you provide the full path to the GDB executable in the application's configuration.
  • GDB versions 8.x and newer are recommended for the best Python scripting support.

2.5 Compiling Your C/C++ Target Application

  • Your C/C++ application must be compiled with debugging symbols.
  • For GCC/G++ or Clang, use the -g flag: g++ -g -o myprogram myprogram.cpp.
  • Avoid high optimization levels (e.g., -O2, -O3): the compiler may inline functions or eliminate variables, which GDB then reports as <optimized out>. Consider using -Og, which optimizes while preserving the debugging experience.

3. Installation and Execution

3.1 Running from Source Code

  1. Ensure all prerequisites from Section 2 are met.
  2. Download or clone the source code repository.
  3. Navigate to the root directory of the project (the folder that contains the cpp_python_debug package).
  4. Run the main script as a module: python -m cpp_python_debug

3.2 Running the Compiled (--onedir) Version

The application can be packaged into a distribution folder using PyInstaller.

  1. Unzip or copy the distribution folder (e.g., CppPythonDebugHelper) to your desired location. This folder is self-contained.
  2. Inside the folder, find and run the main executable (e.g., CppPythonDebugHelper.exe).
  3. All files generated by the application (configurations, logs, dumps) will be created inside this folder, making it fully portable.

4. File and Directory Structure

The application creates and manages several files and directories. Understanding this structure is key to finding your configurations and output.

  • When running from source: All paths are relative to the project's root directory.
  • When running the compiled version: All paths are relative to the folder containing the main executable.
  • config/
    • gdb_debug_gui_settings.v2.json: The main configuration file. It stores all your settings, including paths, timeouts, and all your debug profiles. This file is in JSON format.
  • logs/
    • cpppythondebughelper_gui.log: The main log file for the GUI application itself. Useful for troubleshooting GUI issues.
    • gdb_dumper_script_internal.log: A dedicated log file for the gdb_dumper.py script. This is extremely useful for debugging issues that occur inside GDB during a variable dump.
    • manual_gdb_dumps/: The directory where temporary dump files (.gdbdump.json) from the "Manual Debug" tab are stored before you save them to a final location.
    • gdb_dumper_diagnostics/: (Optional) If you enable "Enable Diagnostic JSON Dump to File" in the settings, this folder will contain a raw JSON copy of every single variable dump, which is useful for debugging the dumper script itself.
  • <Profile Output Directory>: The directory you specify in a profile's action is where the final dump files (JSON or CSV) for that profile run will be saved. The application will create a run-specific subfolder here (e.g., MyDumps/MyProfile_20231027_143000/).

5. Quick Start Guide

  1. Launch the Application as described in Section 3.
  2. Initial Configuration: On first launch, go to Options > Configure Application....
    • In the Paths & Directories tab, browse to your GDB executable.
    • (Strongly Recommended) Also browse to the gdb_dumper.py script located in the core subdirectory of the source code (or cpp_python_debug/core in the compiled version).
    • Click Save.
  3. Your First Manual Debug Session:
    • Go to the Manual Debug tab.
    • Select your compiled C/C++ executable.
    • Enter a breakpoint (e.g., main).
    • Click 1. Start GDB.
    • Click 2. Set Breakpoint.
    • Click 3. Run Program.
    • When the breakpoint is hit, enter a variable name and click 4. Dump Variable.
    • Observe the "Parsed JSON/Status Output" tab. It will show a status message confirming the dump and the path to a temporary .gdbdump.json file.
    • The Save as JSON and Save as CSV buttons will become active. Use them to save the captured data to a permanent location.

6. User Interface Overview

(A screenshot of the main window with areas annotated would be ideal here)

6.1 Menu Bar

  • Options: "Configure Application...", "Exit".
  • Profiles: "Manage Profiles...".

6.2 Critical Configuration Status Area

Displays the status of the GDB executable and the dumper script, and includes a "Configure..." button for fixing either path.

6.3 Mode Panel (Tabs)

  • Manual Debug Tab: For interactive, step-by-step debugging.
  • Automated Profile Execution Tab: For running pre-configured debug sequences.

6.4 Output and Log Area (Tabs)

  • GDB Raw Output Tab: Raw text communication with the GDB process.
  • Parsed JSON/Status Output Tab: Displays the status payload received from the GDB dumper script or pretty-prints simple JSON.
  • Application Log Tab: Log messages from the GUI application itself.

6.5 Status Bar

Shows brief messages about the application's current state or last operation.


7. Configuration Window (Options > Configure Application...)

(A screenshot of the Configuration Window with tabs would be beneficial here)

The window is organized into tabs:

7.1 Paths & Directories Tab

  • GDB Executable Path: Full path to GDB. Crucial.
  • GDB Python Dumper Script Path: Full path to gdb_dumper.py. Strongly recommended for full functionality.

7.2 Timeouts Tab (seconds)

Configure timeouts for GDB operations: GDB Start, GDB Command, Program Run/Continue, Dump Variable, Kill Program, GDB Quit.

7.3 Dumper Options Tab

Control the behavior of gdb_dumper.py: Max Array Elements, Max Recursion Depth, Max String Length, and options for diagnostic logging.
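
To illustrate what limits of this kind typically do, here is a minimal sketch that applies the three limits to plain Python data. It is not the code of gdb_dumper.py: the function name limited, the constant values, and the placeholder strings are all assumptions made for this example.

```python
from typing import Any

# Illustrative values only; the real defaults are set in the Dumper Options tab.
MAX_ARRAY_ELEMENTS = 3
MAX_RECURSION_DEPTH = 2
MAX_STRING_LENGTH = 8


def limited(value: Any, depth: int = 0) -> Any:
    """Return a copy of value with long strings, long lists and deep nesting cut off."""
    if isinstance(value, str):
        if len(value) <= MAX_STRING_LENGTH:
            return value
        return value[:MAX_STRING_LENGTH] + "..."
    if isinstance(value, (list, dict)) and depth >= MAX_RECURSION_DEPTH:
        return "<max depth reached>"          # hypothetical placeholder text
    if isinstance(value, list):
        head = [limited(v, depth + 1) for v in value[:MAX_ARRAY_ELEMENTS]]
        if len(value) > MAX_ARRAY_ELEMENTS:
            head.append(f"<{len(value) - MAX_ARRAY_ELEMENTS} more elements>")
        return head
    if isinstance(value, dict):
        return {k: limited(v, depth + 1) for k, v in value.items()}
    return value


sample = {
    "name": "a_long_string",
    "values": [1, 2, 3, 4, 5],
    "nested": {"inner": {"deep": 1}},
}
print(limited(sample))
```

Larger limits give more complete dumps at the cost of longer dump times and bigger files, which is why the "Dump Variable" timeout and these options are usually tuned together.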


8. Manual Debug Mode in Detail

(A screenshot of the Manual Debug tab would be useful)

This mode provides a step-by-step interface for a single debug session.

8.1 Workflow

  1. Set Target & Parameters: Specify the executable and any command-line arguments.
  2. Set Breakpoint & Variable: Define where to stop and what to inspect. You can use the advanced @ syntax here as well (e.g., my_ptr@100 or my_matrix@rows,cols).
  3. Control Session: Use the numbered buttons (1. Start GDB, 2. Set Breakpoint, 3. Run Program, 4. Dump Variable, Stop GDB) to control the flow.
  4. Dump Data: The "Dump Variable" action invokes the gdb_dumper.py script, which saves the variable's state directly to a temporary .gdbdump.json file.
  5. Save Data: After a successful dump, the "Save as..." buttons become active, allowing you to save the captured data permanently as JSON or CSV. For matrices, the CSV conversion uses the coordinate format described in Section 11.4.

8.2 Interpreting Output

  • GDB Raw Output: Shows all communication with GDB, including the status message from the dumper script.
  • Parsed JSON/Status Output: Displays the status payload from the dumper, confirming the action and providing the path to the temporary file.
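
The dumper prints its status payload to GDB's standard output between special delimiters (see Section 12.1). The sketch below shows one way such a payload could be pulled out of the raw output. The delimiter strings and the "status" and "path" field names are hypothetical; the real values used by gdb_dumper.py may differ.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical delimiter strings; the real markers used by gdb_dumper.py may differ.
START = "START_DUMP_STATUS"
END = "END_DUMP_STATUS"


def extract_status(raw_output: str) -> Optional[Dict[str, Any]]:
    """Return the JSON status found between the delimiters, or None if absent or invalid."""
    start = raw_output.find(START)
    end = raw_output.find(END, start + len(START))
    if start == -1 or end == -1:
        return None
    try:
        return json.loads(raw_output[start + len(START):end])
    except json.JSONDecodeError:
        return None


# Simulated raw GDB output; the payload fields are assumptions for this example.
raw = (
    "Breakpoint 1, main () at main.cpp:10\n"
    f"{START}\n"
    '{"status": "success", "path": "logs/manual_gdb_dumps/myVar.gdbdump.json"}\n'
    f"{END}\n(gdb) "
)
status = extract_status(raw)
print(status["status"], status["path"])
```

Returning None for a missing or malformed payload lets the caller distinguish "no dump happened" from a dump that reported an error.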

9. Profile Manager & Automated Execution

(Screenshot of Profile Manager recommended)

This is the core feature for automating debugging.

9.1 Profile Manager (Profiles > Manage Profiles...)

This window is the hub for creating and managing your automated debug scenarios. A profile consists of:

  1. Profile Details: Name, target executable, and program parameters.
  2. Symbol Analysis Data: You can run an analysis on the target executable. The tool uses GDB to find all functions, global variables, etc., and stores this information in the profile. This helps you accurately set up actions.
  3. Actions: A list of debug actions.

9.2 Action Editor

Each action defines a specific task to be performed at a breakpoint.

  • Breakpoint Location: Where GDB should stop.
  • Variables to Dump: A list of variables or expressions to dump (one per line).
    • Advanced Syntax for Arrays and Matrices: You can provide dimensions for pointer types to dump them as arrays or matrices. This is crucial for dynamically allocated memory, because GDB cannot infer from a bare pointer how many elements it points to.
      • 1D Array (Vector): my_vector_ptr@size
      • 2D Array (Matrix): my_matrix_ptr@rows,cols
    • The size, rows, and cols can be either integer literals (e.g., @100) or other variables visible in the GDB context at that breakpoint (e.g., @my_struct.size, @num_rows,num_cols). This is extremely powerful for dynamically sized data.
  • Output Format: Final format (JSON or CSV).
  • Output Directory: The base directory for the output files.
  • Filename Pattern: A template for naming the output files.
  • Execution Flow: Controls whether to continue after the dump and whether to dump on every hit or just the first.
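
The @ syntax for "Variables to Dump" can be pictured as a base expression followed by one or two dimension terms. The following is a minimal sketch of how such an entry might be split; parse_dump_expression is a hypothetical helper written for this manual, not the tool's actual parser.

```python
from typing import List, Tuple


def parse_dump_expression(expr: str) -> Tuple[str, List[str]]:
    """Split 'base@dim1[,dim2]' into the base expression and its dimension terms.

    Hypothetical helper: the real parser inside the tool may differ.
    """
    base, sep, dims = expr.strip().partition("@")
    base = base.strip()
    if not sep:
        return base, []                      # plain variable, no dimensions
    dim_terms = [d.strip() for d in dims.split(",")]
    if not base or not 1 <= len(dim_terms) <= 2 or not all(dim_terms):
        raise ValueError(f"malformed dump expression: {expr!r}")
    return base, dim_terms


print(parse_dump_expression("my_vector_ptr@100"))
print(parse_dump_expression("signal.data@signal.n_row,signal.n_col"))
print(parse_dump_expression("globalCounter"))
```

Each dimension term is kept as a string because it may be either an integer literal or another expression that has to be evaluated in the GDB context at the breakpoint.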

9.3 Automated Execution Flow

  1. Select a profile from the dropdown on the "Automated Profile Execution" tab.
  2. Click Run Profile.
  3. The ProfileExecutor starts GDB and runs the program.
  4. When a breakpoint is hit, the corresponding action is triggered.
  5. The gdb_dumper.py script is invoked. It dumps the specified variable to an intermediate .gdbdump.json file. This JSON file is structured with a metadata section (containing information such as matrix dimensions) and a data section.
  6. The main application then processes this structured JSON file:
    • If the desired format is JSON, the entire structured object (metadata and data) is saved to the final file.
    • If the desired format is CSV, it reads the structured JSON, writes the metadata as commented header lines in the CSV file, and then flattens the data payload into the appropriate format (e.g., coordinate format for matrices).
  7. The "Produced Files Log" is updated in real-time.
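
The CSV branch of step 6 can be sketched as follows. The metadata and data keys and the original_rows / original_cols names come from this manual; the layout of the data section (a list of rows, each a list of dicts with the same keys) and the function name are assumptions made for the example.

```python
import csv
import io
from typing import Any, Dict


def structured_dump_to_csv(dump: Dict[str, Any]) -> str:
    """Render a {'metadata': ..., 'data': ...} dump as CSV text with a commented header.

    Assumes 'data' is a non-empty list of rows, each row a list of dicts sharing keys.
    """
    out = io.StringIO()
    for key, value in dump["metadata"].items():
        out.write(f"# {key}: {value}\n")          # metadata as comment lines
    rows = dump["data"]
    fields = list(rows[0][0].keys())
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["row_idx", "col_idx", *fields])
    for r, row in enumerate(rows):                # flatten to coordinate format
        for c, cell in enumerate(row):
            writer.writerow([r, c, *(cell[f] for f in fields)])
    return out.getvalue()


example = {
    "metadata": {"original_rows": 2, "original_cols": 2},
    "data": [
        [{"re": 1.23, "im": -0.45}, {"re": 1.25, "im": -0.48}],
        [{"re": 0.5, "im": 0.0}, {"re": 3.14, "im": 1.59}],
    ],
}
print(structured_dump_to_csv(example))
```

Writing one line per matrix cell keeps the file easy to filter and join in analysis tools, at the cost of a larger file than a dense row-per-line layout.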

10. Troubleshooting / FAQ

Q: GDB not found / Dumper script issues / No debugging symbols. A: Ensure your configured paths in Options > Configure Application... are correct. If GDB reports that no debugging symbols were found, recompile your target with the -g flag (see Section 2.5). Check the Application Log and GDB Raw Output tabs for specific error messages from GDB or the dumper script.

Q: The application hangs or times out. A: Your target program might be taking a long time. Try increasing the timeouts in the Configuration Window. For very large data dumps (e.g., large matrices), the "Dump Variable" timeout may need to be significantly increased.

Q: How can I get more debug information from gdb_dumper.py? A:

  1. Check the logs/gdb_dumper_script_internal.log file. This is the first place to look for errors happening inside the dumper.
  2. For even more detail, enable "Enable Diagnostic JSON Dump to File" in the Dumper Options. This saves a raw JSON copy of every dump to the logs/gdb_dumper_diagnostics/ directory, allowing you to see exactly what the dumper is producing.

11. Use Cases / Examples

11.1 Dumping a std::vector

  • Scenario: You want to compare the contents of a std::vector<MyObject> myVector before and after it is modified by a processVector function.
  • Profile Setup:
    • Action 1: Breakpoint at the start of processVector.
    • Action 2: Breakpoint at the end of processVector.
    • Both actions dump the myVector variable.
  • Result: When the profile runs, files like vector_dumps/MyProfile_timestamp/processVector_myVector_timestamp.json (or .csv) will be created, allowing you to see the state of the vector before and after processing.

11.2 Tracing a Global Variable

  • Scenario: You need to track how a global variable globalCounter changes at different key points in your application.
  • Profile Setup: Create multiple actions, each with a different breakpoint (e.g., func_A, func_B, main.cpp:150), but all dumping the same variable globalCounter.
  • Result: You will get a series of timestamped files, one for each time the counter was dumped, allowing you to trace its value through the program's execution flow.

11.3 Snapshots of Complex Data

  • Scenario: Your application has a large configuration or state object (ApplicationState appState) and you want to take a complete snapshot of it at a critical point, like just before a long-running task.
  • Profile Setup: An action at longRunningTask.cpp:75 that dumps the appState object.
  • Result: A detailed JSON file like app_state_snapshots/MyProfile_timestamp/longRunningTask.cpp_75_appState_timestamp.json will be created, containing a full, nested representation of your application's state.

11.4 Dumping a Dynamic 2D Matrix of Structs

  • Scenario: You have a variable signal of the C struct type ComplexSignal_t, which contains the fields int n_row, int n_col, and a pointer rgk_complex_float* data. The data pointer points to a flat, row-major memory block. You want to dump this entire matrix to a CSV file for analysis in Python/MATLAB.

  • Profile Setup:

    • Create an action at a breakpoint where signal is in scope.
    • In "Variables to Dump", enter: signal.data@signal.n_row,signal.n_col
    • Set "Output Format" to csv.
  • Result: The application will generate a single, clean CSV file.

    • Metadata Header: The top of the file will contain commented lines with the matrix dimensions, e.g., # original_rows: 1024, # original_cols: 512. This allows your analysis scripts to pre-allocate memory efficiently.
    • Data Body: The rest of the file will be in a "coordinate" (long) format, perfect for analysis:
      row_idx,col_idx,re,im
      0,0,1.23,-0.45
      0,1,1.25,-0.48
      ...
      1023,511,3.14,1.59
      

    This file can be loaded directly into a Pandas DataFrame or other analysis tools.
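
As a rough sketch of the consuming side, the snippet below reads such a file using only the Python standard library and rebuilds the matrix. It assumes one "# key: value" metadata entry per line and the column names shown above; load_matrix_csv is a name chosen for this example.

```python
import csv
from typing import Dict, List, Tuple


def load_matrix_csv(text: str) -> Tuple[Dict[str, int], List[List[complex]]]:
    """Parse '# key: value' header lines, then rebuild the matrix from coordinate rows."""
    meta: Dict[str, int] = {}
    body: List[str] = []
    for line in text.splitlines():
        if line.startswith("#"):
            key, _, value = line[1:].partition(":")
            meta[key.strip()] = int(value)
        elif line.strip():
            body.append(line)
    rows, cols = meta["original_rows"], meta["original_cols"]
    matrix = [[0j] * cols for _ in range(rows)]   # pre-allocate from the header
    for rec in csv.DictReader(body):
        matrix[int(rec["row_idx"])][int(rec["col_idx"])] = complex(
            float(rec["re"]), float(rec["im"])
        )
    return meta, matrix


sample = """# original_rows: 2
# original_cols: 2
row_idx,col_idx,re,im
0,0,1.23,-0.45
0,1,1.25,-0.48
1,0,0.5,0.0
1,1,3.14,1.59
"""
meta, matrix = load_matrix_csv(sample)
print(meta)
print(matrix[1][1])
```

With pandas, passing comment="#" to read_csv skips the metadata lines, so the body loads as a DataFrame with the columns row_idx, col_idx, re and im.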


12. Advanced: The gdb_dumper.py Script

12.1 Role and Interaction with GDB

The gdb_dumper.py script is the core of the data extraction engine. It runs within the GDB process and has access to GDB's Python API.

  • Serialization Logic:

    1. Uses gdb.parse_and_eval() to get a gdb.Value object representing a C++ variable.
    2. Optimized Memory Reading: For large arrays and matrices (specified with @ syntax), it uses an optimized memory-reading approach. Instead of evaluating each element individually, it reads the entire memory block of the matrix in a single GDB operation (inferior.read_memory). It then uses Python's struct module to unpack the raw bytes into data. This avoids one GDB evaluation per element and is substantially faster for large datasets.
    3. It constructs a Python dictionary containing a metadata key (for dimensions, type info, etc.) and a data key (for the actual variable content).
    4. This structured dictionary is serialized to a JSON string.
    5. The dumper saves this full JSON string directly to a specified intermediate file (.gdbdump.json).
    6. It prints a small status JSON payload (indicating success/failure and the path written) to GDB's standard output, bracketed by special delimiters.
  • GUI Processing: The main GUI captures this status payload to understand the outcome of the dump. In Profile Mode, it then processes the intermediate file to create the final user-specified output (saving the full structured JSON, or converting to a metadata-rich CSV).
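
The unpacking half of the optimized memory reading in step 2 can be illustrated outside GDB. This sketch assumes that each element is a pair of little-endian 32-bit floats (real part, imaginary part), which is a guess at the layout of a type like rgk_complex_float. The bytes are packed locally as a stand-in for what gdb.Inferior.read_memory(address, length) would return inside GDB.

```python
import struct
from typing import List, Tuple


def unpack_complex_matrix(raw: bytes, rows: int, cols: int) -> List[List[Tuple[float, float]]]:
    """Unpack a row-major block of (re, im) float pairs with a single struct call."""
    count = rows * cols
    # Assumed layout: 2 * count little-endian 32-bit floats, no padding.
    flat = struct.unpack(f"<{2 * count}f", raw)
    return [
        [(flat[2 * (r * cols + c)], flat[2 * (r * cols + c) + 1]) for c in range(cols)]
        for r in range(rows)
    ]


# Stand-in for the buffer read from the debugged process in one operation.
raw_block = struct.pack("<8f", 1.0, -0.5, 2.0, 0.25, 3.0, 0.0, 4.0, 1.5)
print(unpack_complex_matrix(raw_block, rows=2, cols=2))
```

struct.unpack raises struct.error when the buffer length does not match the format, which gives an early signal if the dimensions passed with @ do not match the number of bytes read.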

12.2 Dumper Log File (gdb_dumper_script_internal.log)

This log file, located in the main logs directory, is invaluable for debugging the dumper script itself. It records internal steps, configurations, and errors that occur within the GDB environment, which are not visible in the main application log.


13. Appendix: Filename Placeholders

The following placeholders can be used in the "Filename Pattern" field (in the Action Editor) to construct the base name of your output files. The application automatically sanitizes the content of each placeholder to make it safe for file systems.

  • {profile_name}: The name of the profile.
  • {app_name}: Base name of the target executable.
  • {breakpoint}: The breakpoint location string. For file:line formats, this will be sanitized to something like file_line.
  • {variable}: The variable/expression name being dumped. For expressions with special characters like var@size,dim, this will be sanitized to just var.
  • {timestamp}: A detailed timestamp (YYYYMMDD_HHMMSS_ms).

Example Pattern: dump_{app_name}_{breakpoint}_{variable}_{timestamp}

Example Final Output (if JSON): dump_myprogram_main_myVar_20231027_143005_123.json

Example Intermediate GDB Dump File: dump_myprogram_main_myVar_20231027_143005_123.gdbdump.json
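
The placeholder expansion can be approximated with str.format. This is a sketch only: the sanitize rule (replace anything other than letters, digits, underscore, dot and hyphen with an underscore) and the function names are assumptions, and the application's actual sanitization may differ in detail.

```python
import re
from datetime import datetime


def sanitize(text: str) -> str:
    """Replace characters that are unsafe in file names with underscores (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", text).strip("_")


def build_filename(pattern: str, *, profile_name: str, app_name: str,
                   location: str, variable: str, when: datetime, ext: str) -> str:
    """Fill the documented placeholders and append the file extension."""
    ms = when.microsecond // 1000
    values = {
        "profile_name": sanitize(profile_name),
        "app_name": sanitize(app_name),
        "breakpoint": sanitize(location),
        "variable": sanitize(variable.split("@", 1)[0]),   # drop '@size,dim'
        "timestamp": f"{when:%Y%m%d_%H%M%S}_{ms:03d}",
    }
    return pattern.format(**values) + ext


name = build_filename(
    "dump_{app_name}_{breakpoint}_{variable}_{timestamp}",
    profile_name="MyProfile", app_name="myprogram", location="main",
    variable="myVar", when=datetime(2023, 10, 27, 14, 30, 5, 123000), ext=".json",
)
print(name)  # dump_myprogram_main_myVar_20231027_143005_123.json
```

Including {timestamp} in the pattern keeps file names unique when the same breakpoint is hit more than once in a run.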