Stack buffer overflow

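A stack buffer overflow (or stack buffer overrun) occurs when a program writes data to memory on the call stack beyond the bounds of the intended data structure.[1][2] It is one kind of buffer overflow.[1] The write corrupts adjacent data, which can crash the program or overwrite a function's return address so that malicious code is executed. This kind of attack is called stack smashing. It can be used to inject executable code and take over the execution of the process, and it is one of the oldest attack techniques.[3][4][5]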

Example

The following example can be used to overwrite a function's return address[3][6] by means of strcpy():

#include <string.h>

void foo (char *bar)
{
   char  c[12];

   strcpy(c, bar);  // no bounds checking
}

int main (int argc, char **argv)
{
   foo(argv[1]);

   return 0;
}

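The program is safe as long as the command-line argument is shorter than 12 characters (case B below), because the argument and its terminating NUL byte then fit in c.

The call stack of foo() for different inputs is described below. The cases show the memory state on a 32-bit little-endian system when the stack buffer overflow occurs: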

 
A. - Before data is copied.
 
B. - "hello" is the first command line argument.
 
C. - "AAAAAAAAAAAAAAAAAAAA\x08\x35\xC0\x80" is the first command line argument.

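In case C, the command-line argument is longer than 11 characters, so foo() overwrites local stack data, the saved frame pointer (EBP) and, most importantly, the return address.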
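One way to avoid the overflow in case C is to check the input length against the size of the destination before copying. The following is a minimal sketch, not part of the original example: copy_bounded and foo_checked are hypothetical names, and the choice to reject an oversized input rather than truncate it is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including
   the terminating NUL. Returns 0 on success, -1 if src is too long. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;              /* would overflow: refuse instead of truncating */

    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL terminator */
    return 0;
}

/* Bounds-checked variant of foo() from the example above. */
int foo_checked(const char *bar)
{
    char c[12];

    return copy_bounded(c, sizeof c, bar);
}
```

With this variant an 11-character argument is accepted, while a 12-character argument is rejected because its NUL terminator would not fit in c.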

An attack can also modify the value of a local variable:

#include <string.h>
#include <stdio.h>

void foo (char *bar)
{
   float My_Float = 10.5; // Addr = 0x0023FF4C
   char  c[28];           // Addr = 0x0023FF30

   // Will print 10.500000
   printf("My Float value = %f\n", My_Float);

    /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       Memory map:
       @ : c allocated memory
       # : My_Float allocated memory

           *c                      *My_Float
       0x0023FF30                  0x0023FF4C
           |                           |
           @@@@@@@@@@@@@@@@@@@@@@@@@@@@#####
      foo("my string is too long !!!!! XXXXX");

   memcpy will put 0x1010C042 (little endian) in My_Float value.
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

   memcpy(c, bar, strlen(bar));  // no bounds checking...

   // Will print 96.031372
   printf("My Float value = %f\n", My_Float);
}

int main (int argc, char **argv)
{
   foo("my string is too long !!!!! \x10\x10\xc0\x42");
   return 0;
}

Platform-related differences


A number of platforms have subtle differences in their implementation of the call stack that can affect the way a stack buffer overflow exploit works. Some machine architectures store the top-level return address of the call stack in a register, so an overwritten return address is not used until a later unwinding of the call stack. Another machine-specific detail that can affect the choice of exploitation technique is that most RISC-style architectures do not allow unaligned access to memory.[7] Combined with a fixed length for machine opcodes, this limitation can make the jump-to-ESP technique almost impossible to implement (the one exception being when the program actually contains the unlikely code to explicitly jump to the stack register).[8][9]

Stacks that grow up

Within the topic of stack buffer overflows, an often discussed but rarely seen architecture is one in which the stack grows in the opposite direction. This change is frequently suggested as a solution to the stack buffer overflow problem, because an overflow of a stack buffer that occurs within the same stack frame cannot overwrite the return pointer. On closer inspection the claimed protection is a naive solution at best. Any overflow of a buffer that belongs to a previous stack frame will still overwrite a return pointer and allow malicious exploitation of the bug.[10] For instance, in the example above, the return pointer for foo will not be overwritten, because the overflow actually occurs within the stack frame for strcpy. However, because the buffer that overflows during the call to strcpy resides in a previous stack frame, the return pointer for strcpy has a numerically higher memory address than the buffer. Instead of the return pointer for foo, the return pointer for strcpy is overwritten. At most, growing the stack in the opposite direction changes some details of how stack buffer overflows are exploited; it does not significantly reduce the number of exploitable bugs.

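Protection schemes

Three approaches are commonly used to counter stack buffer overflow attacks.

Detecting that a stack buffer overflow has occurred

Stack canaries, named after the canary in a coal mine, are used to detect that an overflow has happened. An integer value, chosen at random when the program is loaded, is placed in memory just before the saved return address. A stack buffer overflow overwrites stack memory from lower to higher addresses, so it overwrites the canary before it reaches the return address. Before the function returns, the canary is checked to see whether it has been tampered with.[2]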
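
The canary check can be illustrated with a simulated stack frame. This is only a sketch under stated assumptions: real canaries are inserted and verified by the compiler and the value is random, whereas here struct fake_frame, CANARY_VALUE, copy_and_check and the return address are all made up for illustration, and the copy is deliberately kept inside the struct so that the program stays well-defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE 0xDEADC0DEu  /* fixed for illustration; real canaries are random */

/* Simulated stack frame: the canary sits between the buffer and the
   saved return address, at a higher address than the buffer. */
struct fake_frame {
    char     buf[12];
    uint32_t canary;
    uint32_t return_address;
};

/* Copies n bytes of src to the start of the frame, the way an unchecked
   copy into buf would, then reports whether the canary survived.
   Returns 1 if the canary is intact, 0 if it was overwritten. */
int copy_and_check(struct fake_frame *frame, const void *src, size_t n)
{
    frame->canary = CANARY_VALUE;
    frame->return_address = 0x08048000u;  /* made-up address */

    if (n > sizeof *frame)
        n = sizeof *frame;   /* keep the simulation inside the struct */

    memcpy(frame, src, n);   /* may run past buf into the canary and beyond */

    return frame->canary == CANARY_VALUE;
}
```

Copying 12 bytes or fewer leaves the canary intact. Copying 20 bytes of 'A' overwrites both the canary and the simulated return address, and the check reports the corruption before that address would be used.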

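Non-executable stack

This approach enforces a "write XOR execute" (W^X) policy: a region of memory may be writable or executable, but not both. It is the most commonly used method, and most desktop processors support a no-execute flag in hardware.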


While this method definitely makes the canonical approach to stack buffer overflow exploitation fail, it is not without its problems. First, it is common to find ways to store shellcode in unprotected memory regions like the heap, and so very little need change in the way of exploitation.[11]

Even if this were not so, there are other ways. The most damning is the so-called return to libc method for shellcode creation. In this attack the malicious payload will load the stack not with shellcode, but with a proper call stack so that execution is vectored to a chain of standard library calls, usually with the effect of disabling memory execute protections and allowing shellcode to run as normal.[12] This works because the execution never actually vectors to the stack itself.

A variant of return-to-libc is return-oriented programming, which sets up a series of return addresses, each of which executes a small sequence of cherry-picked machine instructions within the existing program code or system libraries, with each sequence ending in a return. These so-called gadgets each accomplish some simple register manipulation or similar operation before returning, and stringing them together achieves the attacker's ends. It is even possible to use "returnless" return-oriented programming by exploiting instructions or groups of instructions that behave much like a return instruction.[13]

Memory layout randomization


Instead of separating code from data, another mitigation technique is to randomize the memory layout of the executing program. The attacker needs to know where usable executable code resides, whether that is an injected payload (on an executable stack) or existing code reused through ret2libc or return-oriented programming (ROP). In principle, randomizing the memory layout prevents the attacker from knowing where any code is. In practice, implementations typically do not randomize everything: the executable itself is usually loaded at a fixed address, so even when address space layout randomization (ASLR) is combined with a non-executable stack, the attacker can use this fixed region of memory. Programs should therefore be compiled as position-independent executables (PIE) so that this region is randomized as well. The entropy of the randomization differs between implementations, and low entropy can itself be a problem because it makes brute-forcing the randomized address space feasible.

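Address space layout randomization (ASLR) randomizes the addresses of a process's memory segments. This hinders return-oriented programming and jump-to-shellcode attacks, because the attacker needs to know the memory addresses of the executable segments and of the shellcode. If the executable is not built as a position-independent executable (PIE), some of its segments (such as .text) are loaded at fixed addresses, and on Linux the attacker can still return to the address 0x400000. The strength of ASLR depends on its entropy, which differs between implementations. Low entropy makes addresses easier to guess, so 32-bit systems are more exposed than 64-bit systems. In the Stagefright vulnerability of 2015, for example, ASLR on 32-bit systems provided only 8 bits of entropy, so each exploitation attempt only had to guess among 256 possible memory addresses.[14]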
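
The relation between entropy and guessing effort is simple arithmetic and can be sketched as follows. The function names are hypothetical, and the average-case figure assumes that the attacker tries each candidate once and that the layout does not change between attempts, which is not true of every system.

```c
#include <stdint.h>

/* Number of equally likely placements for `bits` bits of ASLR entropy.
   Only meaningful for bits < 64. */
uint64_t aslr_candidates(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Average number of guesses needed when the attacker tries each
   candidate once and the layout stays the same between attempts
   (an assumption of this sketch): (N + 1) / 2. */
double aslr_expected_guesses(unsigned bits)
{
    return ((double)aslr_candidates(bits) + 1.0) / 2.0;
}
```

For 8 bits of entropy, aslr_candidates(8) gives the 256 possibilities mentioned above, and under the stated assumption about half of them need to be tried on average. Every additional bit of entropy doubles both numbers.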

Notable examples

See also

References

  1. Fithen, William L.; Seacord, Robert. VT-MB. Violation of Memory Bounds. US CERT. 2007-03-27 [2017-02-19]. (Archived from the original on 2012-02-05).
  2. Dowd, Mark; McDonald, John; Schuh, Justin. The Art Of Software Security Assessment. Addison Wesley. November 2006: 169–196. ISBN 0-321-44442-6.
  3. Levy, Elias. Smashing The Stack for Fun and Profit. Phrack. 1996-11-08, 7 (49): 14 [2017-02-19]. (Archived from the original on 2021-03-04).
  4. Pincus, J.; Baker, B. Beyond Stack Smashing: Recent Advances in Exploiting Buffer Overruns (PDF). IEEE Security and Privacy Magazine. July–August 2004, 2 (4): 20–27 [2017-02-19]. doi:10.1109/MSP.2004.36. (Archived (PDF) from the original on 2016-03-04).
  5. Burebista. Stack Overflows (PDF). [2017-02-19]. (Archived (PDF) from the original on 2007-09-28). (dead link)
  6. Bertrand, Louis. OpenBSD: Fix the Bugs, Secure the System. MUSESS '02: McMaster University Software Engineering Symposium. 2002 [2017-02-19]. (Archived from the original on 2007-09-30).
  7. pr1. Exploiting SPARC Buffer Overflow vulnerabilities. [2017-02-19]. (Archived from the original on 2012-02-05).
  8. Curious. Reverse engineering - PowerPC Cracking on Mac OS X with GDB. Phrack. 2005-01-08, 11 (63): 16 [2017-02-19]. (Archived from the original on 2013-10-05).
  9. Sovarel, Ana Nora; Evans, David; Paul, Nathanael. Where’s the FEEB? The Effectiveness of Instruction Set Randomization. [2017-02-19]. (Archived from the original on 2024-07-17).
  10. Zhodiac. HP-UX (PA-RISC 1.1) Overflows. Phrack. 2001-12-28, 11 (58): 11 [2017-02-19]. (Archived from the original on 2013-12-02).
  11. Foster, James C.; Osipov, Vitaly; Bhalla, Nish; Heinen, Niels. Buffer Overflow Attacks: Detect, Exploit, Prevent (PDF). United States of America: Syngress Publishing, Inc. 2005 [2017-02-19]. ISBN 1-932266-67-4. (Archived (PDF) from the original on 2023-12-26).
  12. Nergal. The advanced return-into-lib(c) exploits: PaX case study. Phrack. 2001-12-28, 11 (58): 4 [2017-02-19]. (Archived from the original on 2014-01-27).
  13. Checkoway, S.; Davi, L.; Dmitrienko, A.; Sadeghi, A. R.; Shacham, H.; Winandy, M. Return-Oriented Programming without Returns. Proceedings of the 17th ACM conference on Computer and communications security - CCS '10. October 2010: 559–572. ISBN 978-1-4503-0245-6. doi:10.1145/1866307.1866370.
  14. Mark Brand. Stagefrightened?. Project Zero. Google.