Obfuscation and Anonymization Transformation
Obfuscation is a source code transformation technique that deliberately makes code harder to read and understand while preserving its original functionality. It typically involves renaming identifiers, restructuring code, or inserting misleading elements to obscure the program’s logic. Obfuscation is widely used to protect intellectual property, hinder reverse engineering, and conceal sensitive implementation details without altering the program's behavior or breaking its execution.
While obfuscation focuses on code concealment, an enhanced transformation called Anonymization can be applied for stronger confidentiality. Anonymization removes even more semantic information and may intentionally break link compatibility, making it impossible to link the transformed code with the original source or other modules. This is particularly valuable when sharing code for debugging, third-party analysis, or external reviews while ensuring that proprietary or sensitive information remains hidden.
Obfuscation and Anonymization Transformation in emmtrix Studio
emmtrix Studio provides a combined Obfuscation and Anonymization transformation that can be applied either via #pragma directives within the source code or through the graphical user interface. The transformation systematically renames identifiers to obscure or neutral names, making the code significantly harder to interpret for humans while maintaining its original functionality.
- Obfuscation mode: focuses on concealment while maintaining link compatibility — global non-static variables and functions remain unchanged, ensuring that external linkage and build scripts continue to work out of the box. Local symbols, static function names, parameter names, user‑defined types and other internal identifiers are renamed to hashed values.
- Anonymization mode: performs a superset of obfuscation. All possible identifiers are renamed, so the result carries even less semantic information than the compiled object code, which still exposes externally linked symbol names. This mode offers more confidentiality, especially for code with many global variables and sensitive function names.
In both modes, the original functionality and logical structure of the code are preserved. Deterministic renaming is achieved by using secure hash algorithms (SHA‑1 or SHA‑256) with a user-defined seed. While these algorithms are not used for cryptographic protection in this context, they guarantee consistent and repeatable identifier transformations across builds.
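To make the idea of seeded, deterministic renaming concrete, the following is a minimal sketch and not the tool's implementation. It assumes the seed and the identifier are hashed together and the first n hexadecimal digits are appended to a kind prefix such as var or func. To stay self-contained it uses 64-bit FNV-1a as a stand-in for SHA-1/SHA-256, and the helper name obfuscated_name is hypothetical, so the names it produces will not match those generated by emmtrix Studio.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in hash: 64-bit FNV-1a. The tool itself uses SHA-1 or SHA-256;
 * FNV-1a is used here only to keep the sketch free of dependencies. */
static uint64_t fnv1a64(uint64_t h, const char *s) {
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL; /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: writes "<prefix>_<first n hex digits>" into out.
 * The same (seed, identifier, n) always yields the same name. */
static void obfuscated_name(const char *prefix, const char *seed,
                            const char *identifier, size_t n,
                            char *out, size_t out_size) {
    char hex[17];                          /* 16 hex digits + terminator */
    uint64_t h = 14695981039346656037ULL;  /* FNV offset basis */

    assert(n >= 1 && n <= 16);
    h = fnv1a64(h, seed);
    h = fnv1a64(h, ":");                   /* keeps seed and name apart */
    h = fnv1a64(h, identifier);

    snprintf(hex, sizeof hex, "%016llx", (unsigned long long)h);
    hex[n] = '\0';                         /* truncate to the hash length */
    snprintf(out, out_size, "%s_%s", prefix, hex);
}
```

Calling obfuscated_name("var", "enc1", "g_obfu", 6, buf, sizeof buf) twice fills buf with the same ten-character name both times. Note that a short hash length raises the chance that two different identifiers collide, which a real tool has to detect and resolve.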
Typical Usage and Benefits
The transformation is used to make the source code hard to understand for humans while still preserving the original structure of the source code. It is meant for cases where the source code needs to be shared (e.g. for debugging) but no intellectual property—or sensitive naming information—should be shown.
In its full scope, obfuscation affects all identifiers, including:
- function names and their parameters
- global and local variables
- user-defined types
- structs and unions
- enumerators
- file names
Beyond identifiers, further information is removed, or can be removed in combination with other transformations:
- comments
- formatting
- typedef names (by resolving typedefs)
- include structure (by inlining includes)
- file structure (by merging all C files into a unity build)
Example
/* The following code tests the obfuscation transformation applied to the main function with default parameters.
 * According to the default settings, all identifiers except for the external definitions and the main function
 * shall be renamed and become obscure.
 * In the current code example the identifiers with the suffix _obfu are renamed.
 * Other identifiers include the printf and main functions, and they remain unmodified.
 */
const int g_obfu = 10;
void print_obfu(int a_obfu) {
    printf("%d\n", a_obfu);
}
#pragma EMX_TRANSFORMATION ObfuscateIdentifiers
int main() {
    int l_obfu = 1;
    print_obfu(l_obfu);
    print_obfu(g_obfu);
    return 0;
}
/* The following code is the generated code after the transformation has been applied.
 */
const int var_ee832f = 10;
void func_3db980(int var_e86e08) {
    printf("%d\n", var_e86e08);
}
void func_baf390(int var_e86e08) {
    printf("%d\n", var_e86e08);
}
int main() {
    int var_32bc72 = 1;
    func_baf390(var_32bc72);
    func_3db980(var_ee832f);
    return 0;
}
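Along with the transformed code, a mapping between the old and the new identifiers is created. It is stored in the CSV file obfuscation_mapping.csv, with one semicolon-separated pair per line: the new identifier first, followed by the original one.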
file_b93f1b;test_obfuscation00
func_3db980;print_obfu
func_baf390;print_obfu_duplicate2
var_e86e08;a_obfu
var_ee832f;g_obfu
var_32bc72;l_obfu
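When obfuscated code comes back from a third party, for example with a bug report, the mapping can be used to translate names back. The following is a minimal sketch of such a lookup, assuming the layout seen in the sample above (new identifier, semicolon, original identifier, one pair per line). The function name lookup_original is hypothetical and not part of emmtrix Studio.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: searches mapping text that holds one "new;old"
 * pair per line and copies the original name for `obfuscated` into out.
 * Returns false if the name is not found or out is too small. */
static bool lookup_original(const char *mapping, const char *obfuscated,
                            char *out, size_t out_size) {
    size_t key_len = strlen(obfuscated);
    const char *line = mapping;

    while (*line != '\0') {
        size_t line_len = strcspn(line, "\n");
        const char *sep = memchr(line, ';', line_len);

        /* The key must match the whole first field, not just a prefix. */
        if (sep != NULL && (size_t)(sep - line) == key_len &&
            strncmp(line, obfuscated, key_len) == 0) {
            size_t value_len = line_len - key_len - 1;
            if (value_len > 0 && sep[value_len] == '\r') {
                value_len--;               /* tolerate CRLF line endings */
            }
            if (value_len >= out_size) {
                return false;              /* no room for the terminator */
            }
            memcpy(out, sep + 1, value_len);
            out[value_len] = '\0';
            return true;
        }

        line += line_len;
        if (*line == '\n') {
            line++;                        /* advance to the next line */
        }
    }
    return false;
}
```

With the sample mapping loaded into a string, lookup_original(text, "func_3db980", buf, sizeof buf) stores print_obfu in buf. Since the file reveals every original name, it should stay with the code owner and not be shipped together with the transformed sources.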
Parameters
The following parameters can be set. The table lists each parameter's keyword in pragma syntax (Id), its default value, and a description:
Id | Default Value | Description |
---|---|---|
all | true | Whole Project - apply obfuscation on all translation units across project |
external | false | External definitions - apply obfuscation on identifiers in header files; use with caution, since it affects used system identifiers and produces uncompilable code |
output | true | Output identifiers - generate CSV file containing mapping of old and new names |
seed | enc1 | Seed for hash-function - arbitrary string used as input for hashing algorithms |
n | 6 | Hash-length - length of hash code in obfuscated identifiers |
Note
- For the same seed and hash length, obfuscation is deterministic: a given identifier is always renamed to the same obfuscated name.