Figuring out confusing assembly instructions — Koszek trick #2

by Wojciech Adam Koszek   ⋅   Jun 26, 2012   ⋅   East Palo Alto, CA

A useful trick for going from C to assembly and identifying which parts of the C source map to which assembly fragments.


Today’s post is very simple, maybe even trivial. It describes one of the hacks I came up with when I ran into confusing, arcane corners of ANSI C, or when I played with assembly for fun and profit.

Problem: isolate an ANSI C construct or an in-line assembly block so that, once the code is translated to intermediate assembly, the block is easy to spot visually.

So imagine you want to isolate a memory reference in ANSI C and figure out which assembly lines correspond to it. Take the following code:

``` [ptr.c]

#include <stdio.h>

int
main(int argc, char **argv)
{
	const char *str = "example";
	char	 c;

	(void)argc;
	(void)argv;

	__asm__("/* -------- BEGIN ----------- */");
	c = str[3];
	__asm__("/* --------  END  ----------- */");
	printf("%c\n", c);

	return 0;
}
```

Basically, for the fixed string literal "example" you fetch its fourth character, m, and store it in the variable c. The program itself doesn’t make much sense and isn’t very useful, but the technique has real-world applications.

Now run:

$ gcc -S ptr.c

The generated ptr.s now contains:

    	.file	"ptr.c"
    	.section	.rodata
    .LC0:
    	.string	"example"
    .LC1:
    	.string	"%c\n"
    	.text
    .globl main
    	.type	main, @function
    main:
    .LFB0:
    	.cfi_startproc
    	pushq	%rbp
    	.cfi_def_cfa_offset 16
    	movq	%rsp, %rbp
    	.cfi_offset 6, -16
    	.cfi_def_cfa_register 6
    	subq	$32, %rsp
    	movl	%edi, -20(%rbp)
    	movq	%rsi, -32(%rbp)
    	movq	$.LC0, -16(%rbp)
    #APP
    # 12 "ptr.c" 1
    	/* -------- BEGIN ----------- */
    # 0 "" 2
    #NO_APP
    	movq	-16(%rbp), %rax
    	addq	$3, %rax
    	movzbl	(%rax), %eax
    	movb	%al, -1(%rbp)
    #APP
    # 14 "ptr.c" 1
    	/* --------  END  ----------- */
    # 0 "" 2
    #NO_APP
    	movsbl	-1(%rbp), %edx
    	movl	$.LC1, %eax
    	movl	%edx, %esi
    	movq	%rax, %rdi
    	movl	$0, %eax
    	call	printf
    	movl	$0, %eax
    	leave
    	.cfi_def_cfa 7, 8
    	ret
    	.cfi_endproc
    .LFE0:
    	.size	main, .-main
    	.ident	"GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    	.section	.note.GNU-stack,"",@progbits

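As you can see, the comments we placed through the __asm__ statements are preserved in the intermediate assembly file, and the instructions between the BEGIN and END markers are the ones generated for c = str[3]. Done!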
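If you use the markers often, they can be wrapped in a small macro. The following is only a sketch: ASM_MARK and fetch_char are made-up names for illustration, not something from the original ptr.c, and it assumes GCC or a compatible compiler that accepts __asm__ with a concatenated string literal.

```c
#include <stddef.h>

/*
 * ASM_MARK is a hypothetical helper name. It pastes its label into an
 * assembler comment, so it should add no instructions of its own.
 * Assumes GCC or a compatible compiler.
 */
#define ASM_MARK(label)	__asm__("/* -------- " label " -------- */")

/* Return the character at position idx; the caller keeps idx in range. */
char
fetch_char(const char *str, size_t idx)
{
	char	c;

	ASM_MARK("BEGIN fetch");
	c = str[idx];
	ASM_MARK("END fetch");

	return c;
}
```

Since gcc -S stops before linking, a file like this does not need a main function to be translated; searching the resulting .s file for "BEGIN fetch" should lead straight to the interesting instructions.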

If you have ever written code that was never executed, or that disappeared somewhere deep in the sea of compilation stages, this technique should serve you well!
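A minimal sketch of the "disappeared code" case, under some assumptions: overwritten_store is a hypothetical function, and the markers surround a store whose value is never read. Translating it once with gcc -S -O0 and once with gcc -S -O2 and comparing what sits between the markers should show whether the multiplication survived. The exact result depends on the compiler and its version, so treat this as an experiment rather than a guaranteed outcome.

```c
/*
 * Hypothetical example: the first assignment to result is overwritten
 * before anyone reads it, so an optimizing compiler may drop it.
 */
int
overwritten_store(int x)
{
	int	result;

	__asm__("/* -------- BEGIN dead store -------- */");
	result = x * 1000;	/* value is never read */
	__asm__("/* --------  END dead store  -------- */");
	result = x + 1;		/* the value that is actually returned */

	return result;
}
```

One caveat: the markers have no operands tying them to the surrounding code, so with optimization enabled the compiler may also move instructions across them. An empty region between the markers is a strong hint, not a proof, that the code was removed.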





About the author: I'm Wojciech Adam Koszek. I like software, business and design. Poland native. In Bay Area since 2010.