Same Source, Different Results
-Sanchit Karve
born2c0de@hotmail.com
Each compiler has its own distinct features. To a beginner all compilers seem the same, but to experienced programmers and code diggers the differences matter. The same code snippet may be translated one way by one compiler and another way by a different one, and one of them may end up generating big, sluggish code. You can try building small programs with two or more compilers and then comparing the file sizes. Obviously the compiler that produced the smaller file has generated less code, but smaller is not always better: compilers such as Visual C++ can generate code that is bigger but faster. So don't draw conclusions just by looking at an executable's size; you have to look at the generated code more closely to understand it fully, and that is exactly what we will be doing now. The same source code has been compiled with Microsoft Visual C++ and Borland C++ under their default settings, and we shall study the disassembled code of each. I have chosen code that uses macros and functions since the results are very interesting to see. Have a look below.
Here is the program, which uses both a macro and a function:
    #include <stdio.h>
    #include <conio.h>

    #define mac_SQR(x) x*x      // MACRO

    int func_SQR(int x)         // FUNCTION
    {
        return x*x;
    }

    void main()
    {
        int a=3;
        int b=a;

        printf("*********MACRO**********\n");
        printf("%d Square = %d\n",a,mac_SQR(a));
        printf("%d is the Square of ++%d\n",mac_SQR(++a),a);
        printf("%d is the Square of = %d++\n",mac_SQR(a++),a);
        printf("a is now=%d\n\n\n\n",a);

        printf("*******FUNCTION********\n");
        printf("%d Square = %d\n",b,func_SQR(b));
        printf("%d is the Square of ++%d\n",func_SQR(++b),b);
        printf("%d is the Square of = %d++\n",func_SQR(b++),b);
        printf("b is now=%d",b);
        getch();
    }
The results are:

| BORLAND C++ | VISUAL C++ |
| --- | --- |
| *********MACRO********** | *********MACRO********** |
| 3 Square = 9 | 3 Square = 9 |
| 20 is the Square of ++3 | 25 is the Square of ++3 |
| 30 is the Square of = 5++ | 25 is the Square of = 5++ |
| a is now=7 | a is now=7 |
| *******FUNCTION******** | *******FUNCTION******** |
| 3 Square = 9 | 3 Square = 9 |
| 16 is the Square of ++3 | 16 is the Square of ++3 |
| 16 is the Square of = 4++ | 16 is the Square of = 4++ |
| b is now=5 | b is now=5 |
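As you can see, both compilers produce the same results when the function is used, but the results differ when the macro is used. The reason is that a macro is plain text substitution performed by the preprocessor: mac_SQR(++a) becomes ++a*++a and mac_SQR(a++) becomes a++*a++. Each expansion modifies a twice within a single expression, and the ANSI C standard leaves the behaviour of such an expression undefined, so every compiler is free to order the increments and the multiplication as it likes. A function call has no such problem: the argument is evaluated exactly once, before the function is entered. (Strictly speaking, the printf calls that pass both b and ++b also depend on the order in which arguments are evaluated, which the standard does not fix either; both compilers happen to evaluate them from right to left.) Hence Borland and Microsoft end up generating different code for the macro lines.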
Now let us understand how Borland C++ implements macros by analyzing the disassembled listing of the executable it generated. The listing was obtained with IDA Pro 4.15, which is shareware. You can also use W32DASM, which is freeware and gives a similar listing. You will just have to scroll to the main code, or click the String References button and double-click the MACRO string to jump to the code that accesses it. You can download the disassembler from the link given below:
W32DASM v.8.93 Developed by URSoft
Here is the disassembled listing of the executable file compiled with Borland C++.
    ; BORLAND C++ PROGRAM

    func_SQR proc near

    arg_0 = dword ptr 8

        push ebp
        mov ebp, esp
        mov eax, [ebp+arg_0]    ; Working of this
        mov edx, eax            ; Should be simple
        imul edx, eax           ; Squares a Number
        mov eax, edx            ; Sets the result in EAX for returning
        pop ebp
        retn
    func_SQR endp

    _main proc near             ; DATA XREF: .data:0040B0C4 o

    argc = dword ptr 8
    argv = dword ptr 0Ch
    envp = dword ptr 10h

        push ebp
        mov ebp, esp
        push ebx
        push esi
        push edi                ; char
        mov edi, offset aMacro  ; "*********MACRO**********\n"
        mov ebx, 3              ; a = ebx
        mov esi, ebx            ; b = a;
        push edi                ; __va_args
        call _printf
        pop ecx
        mov eax, ebx            ; EAX = a;
        imul ebx                ; EAX = EAX * a
        push eax                ; Push Squared Result ie. 9
        push ebx                ; Push a
        lea edx, [edi+1Ah]
        push edx                ; __va_args
        call _printf            ; %d square is %d
        add esp, 0Ch
        push ebx                ; a is pushed { a = 3}
        inc ebx                 ; a++ { now a = 4}
        mov ecx, ebx            ; ecx = 4
        inc ebx                 ; a++ { now a = 5}
        imul ecx, ebx           ; ECX = ECX * EBX
        push ecx                ; Result = 4 * 5
                                ; Result 20 is pushed
        lea eax, [edi+2Ah]      ; %d is the square of ++%d
        push eax                ; __va_args
        call _printf
        add esp, 0Ch
        push ebx                ; a is pushed {a = 5}
        mov edx, ebx            ; EDX = 5
        inc ebx                 ; a++ {a = 6}
        mov ecx, ebx            ; ECX = 6
        inc ebx                 ; a++ {a = 7}
        imul edx, ecx           ; EDX = 5 * 6
        push edx                ; Result 30 is pushed
        lea eax, [edi+44h]      ; %d is the square of %d++
        push eax                ; __va_args
        call _printf
        add esp, 0Ch
        push ebx                ; a is pushed {a = 7}
        lea edx, [edi+60h]      ; a is now %d
        push edx                ; __va_args
        call _printf
        add esp, 8
        lea ecx, [edi+70h]
        push ecx                ; __va_args
        call _printf            ; Prints FUNCTIONS
        pop ecx
        push esi                ; ESI set to 3 after opening stack frame
                                ; b is pushed {b = 3}
        call func_SQR
        pop ecx
        push eax                ; Result Square pushed {9}
        push esi                ; b pushed {b = 3}
        lea eax, [edi+89h]      ; %d square = %d
        push eax                ; __va_args
        call _printf
        add esp, 0Ch
        push esi                ; b is pushed {b = 3}
        inc esi                 ; b++ {b = 4}
        push esi                ; 4 is pushed
        call func_SQR
        pop ecx
        push eax                ; Pushes Result 16
        lea edx, [edi+99h]      ; %d is the Square of ++%d
        push edx                ; __va_args
        call _printf
        add esp, 0Ch
        push esi                ; b pushed {b = 4}
        mov ecx, esi            ; ECX = 4
        inc esi                 ; b++ {b = 5}
        push ecx                ; push ECX ie. 4
        call func_SQR
        pop ecx
        push eax                ; Pushes 16 as result
        lea eax, [edi+0B3h]     ; %d is the square of %d++
        push eax                ; __va_args
        call _printf
        add esp, 0Ch
        push esi                ; b is pushed {b = 5}
        lea edx, [edi+0CFh]     ; b is now=%d
        push edx                ; __va_args
        call _printf
        add esp, 8
        call _getch
        pop edi
        pop esi
        pop ebx
        pop ebp
        retn
    _main endp
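Let us see how Borland C++ executes the program. The first interesting line is the pre-increment call, where mac_SQR(++a) expands to ++a*++a. Borland keeps a in register EBX. It pushes the old value of a (3) for the trailing %d, increments EBX to 4 and copies it to ECX, increments EBX again to 5, and then multiplies ECX by EBX, so the result is 4 x 5 = 20. This is the code that does it: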
    push ebx                ; a is pushed { a = 3}
    inc ebx                 ; a++ { now a = 4}
    mov ecx, ebx            ; ecx = 4
    inc ebx                 ; a++ { now a = 5}
    imul ecx, ebx           ; ECX = ECX * EBX
    push ecx                ; Result = 4 * 5
                            ; Result 20 is pushed
    lea eax, [edi+2Ah]      ; %d is the square of ++%d
    push eax                ; __va_args
    call _printf
The next line brings the post-increment operator into play. Have a look at the macro definition again to understand this better.
mac_SQR(x) is x*x, so mac_SQR(a++) expands to a++ * a++
Here is the code Borland generates for this call, taken from the listing above:
    push ebx                ; a is pushed {a = 5}
    mov edx, ebx            ; EDX = 5
    inc ebx                 ; a++ {a = 6}
    mov ecx, ebx            ; ECX = 6
    inc ebx                 ; a++ {a = 7}
    imul edx, ecx           ; EDX = 5 * 6
    push edx                ; Result 30 is pushed
    lea eax, [edi+44h]      ; %d is the square of %d++
    push eax                ; __va_args
    call _printf
With post-increment you would expect the value to be incremented after the operation is done. See how Borland does it, though. The current value of a, i.e. 5, is held in register EBX. It is copied to another register, EDX, and EBX is then incremented to 6. The new value 6 is copied to a third register, ECX, and EBX is incremented again to 7. The multiplication therefore turns out to be EDX x ECX, i.e. 5 x 6 = 30. So instead of incrementing a after the multiplication, Borland C++ increments it immediately after each operand of the macro expansion has been copied into a register. Call it a bug, call it a feature... it's all up to you. The standard permits either outcome, because an expression that modifies a twice has undefined behaviour.
Finally the value of a is output on the screen.
Now it's time to see what makes Microsoft Visual C++ different from Borland C++ when it comes to generating code for macros. Here is the disassembled listing first.
    j_func_SQR proc near
        jmp func_SQR
    j_func_SQR endp

    func_SQR proc near

    arg_0 = dword ptr 8

        push ebp
        mov ebp, esp
        mov eax, [ebp+arg_0]
        imul eax, [ebp+arg_0]
        pop ebp
        retn
    func_SQR endp

    main proc near

    var_10 = dword ptr -10h
    var_C = dword ptr -0Ch
    b = dword ptr -8
    a = dword ptr -4

        push ebp
        mov ebp, esp
        sub esp, 10h
        mov [ebp+a], 3                  ; a = 3
        mov eax, [ebp+a]                ; EAX = 3
        mov [ebp+b], eax                ; b = 3
        push offset aMacro              ; "*********MACRO**********\n"
        call _printf
        add esp, 4
        mov ecx, [ebp+a]
        imul ecx, [ebp+a]
        push ecx
        mov edx, [ebp+a]
        push edx
        push offset aDSquareD           ; "%d Square = %d\n"
        call _printf
        add esp, 0Ch
        mov eax, [ebp+a]
        push eax
        mov ecx, [ebp+a]
        add ecx, 1
        mov [ebp+a], ecx
        mov edx, [ebp+a]
        add edx, 1
        mov [ebp+a], edx
        mov eax, [ebp+a]
        imul eax, [ebp+a]
        push eax
        push offset aDIsTheSquareOf     ; "%d is the Square of ++%d\n"
        call _printf
        add esp, 0Ch
        mov ecx, [ebp+a]
        push ecx
        mov edx, [ebp+a]
        imul edx, [ebp+a]
        mov [ebp+var_C], edx
        mov eax, [ebp+var_C]
        push eax
        push offset aDIsTheSquare_0     ; "%d is the Square of = %d++\n"
        mov ecx, [ebp+a]
        add ecx, 1
        mov [ebp+a], ecx
        mov edx, [ebp+a]
        add edx, 1
        mov [ebp+a], edx
        call _printf
        add esp, 0Ch
        mov eax, [ebp+a]
        push eax
        push offset aAIsNowD            ; "a is now=%d\n\n\n\n"
        call _printf
        add esp, 8
        push offset aFunction           ; "*******FUNCTION********\n"
        call _printf
        add esp, 4
        mov ecx, [ebp+b]
        push ecx
        call j_func_SQR
        add esp, 4
        push eax
        mov edx, [ebp+b]
        push edx
        push offset aDSquareD_0         ; "%d Square = %d\n"
        call _printf
        add esp, 0Ch
        mov eax, [ebp+b]
        push eax
        mov ecx, [ebp+b]
        add ecx, 1
        mov [ebp+b], ecx
        mov edx, [ebp+b]
        push edx
        call j_func_SQR
        add esp, 4
        push eax
        push offset aDIsTheSquare_1     ; "%d is the Square of ++%d\n"
        call _printf
        add esp, 0Ch
        mov eax, [ebp+b]
        push eax
        mov ecx, [ebp+b]
        mov [ebp+var_10], ecx
        mov edx, [ebp+var_10]
        push edx
        mov eax, [ebp+b]
        add eax, 1
        mov [ebp+b], eax
        call j_func_SQR
        add esp, 4
        push eax
        push offset aDIsTheSquare_2     ; "%d is the Square of = %d++\n"
        call _printf
        add esp, 0Ch
        mov ecx, [ebp+b]
        push ecx
        push offset aBIsNowD            ; "b is now=%d\n"
        call _printf
        add esp, 8
        call getch
        mov esp, ebp
        pop ebp
        retn
    main endp
Let us see how Visual C++ deals with this program. Since you already know how Borland did its work, I won't repeat the parts of the code that are the same. I will explain only what Visual C++ does differently.
Unlike Borland, which kept the variables in registers, Visual C++ keeps them in space reserved on the stack. You can see this wherever a variable is written to memory, for example: mov [ebp+a], ecx. Here the value of ECX is stored on the stack at address ebp-4, since a = -4.
The incrementing style is different too. A register is used to compute each incremented value, but the result is written straight back to the variable itself.
In the pre-increment case the variable is incremented twice and then multiplied by itself. That is why the square of ++3 comes out as 5 x 5 = 25. This is done by the code below:
    mov eax, [ebp+a]
    push eax
    mov ecx, [ebp+a]                ; takes value of variable and stores in ecx
    add ecx, 1                      ; ecx is incremented
    mov [ebp+a], ecx                ; Incremented value stored back in variable
    mov edx, [ebp+a]                ; edx this time
    add edx, 1                      ; edx incremented
    mov [ebp+a], edx                ; Now variable is effectively incremented twice.
    mov eax, [ebp+a]                ; variable data stored in eax
    imul eax, [ebp+a]               ; eax multiplied with itself
    push eax
    push offset aDIsTheSquareOf     ; "%d is the Square of ++%d\n"
    call _printf
    add esp, 0Ch
With post-increment, the variable is incremented twice after the multiplication has been carried out. Since the previous value of a is 5, the square of 5++ becomes 5 x 5 = 25, and a is then incremented to 7. The code that does this is as follows:
    mov edx, [ebp+a]
    imul edx, [ebp+a]               ; Multiplication done before incrementing
    mov [ebp+var_C], edx
    mov eax, [ebp+var_C]
    push eax                        ; Result pushed for output
    push offset aDIsTheSquare_0     ; "%d is the Square of = %d++\n"
    mov ecx, [ebp+a]
    add ecx, 1
    mov [ebp+a], ecx                ; Increment comes here
    mov edx, [ebp+a]
    add edx, 1
    mov [ebp+a], edx                ; Second Increment after Multiplication
    call _printf
    add esp, 0Ch
I don't think there is any need to explain the function code, as the results produced are the same and the code is relatively simple to understand. However, if you still have difficulty understanding it, you can contact me at born2c0de@hotmail.com.
You can modify the main C program, changing the increment operators into decrement operators, and study the disassembled code yourself.