Translated from: Concept behind these four lines of tricky C code

Why does this code give the output C++Sucks? What is the concept behind it?

#include <stdio.h>

double m[] = {7709179928849219.0, 771};

int main() {
    m[1]--?m[0]*=2,main():printf((char*)m);
}





It just builds up a double array (16 bytes) which, if interpreted as a char array, holds the ASCII codes for the string "C++Sucks".

However, the code does not work on every system; it relies on the following properties, which the C standard does not guarantee:

  • double is exactly 8 bytes wide
  • endianness (the byte order in memory)


Disclaimer: This answer was posted to the original form of the question, which mentioned only C++ and included a C++ header. The question's conversion to pure C was done by the community, without input from the original asker.

Formally speaking, it's impossible to reason about this program because it's ill-formed (i.e. it's not legal C++). It violates C++11 [basic.start.main]p3:

The function main shall not be used within a program.

This aside, it relies on the fact that on a typical consumer computer a double is 8 bytes long and uses a certain well-known internal representation. The initial values of the array are computed so that, once the "algorithm" has run, the internal representation (8 bytes) of the first double is the ASCII codes of the 8 characters C++Sucks. The second element of the array has by then been counted down to 0.0 and decremented once more to -1.0 by the final test; its first byte in the internal representation is 0 either way, which makes this a valid C-style string. This is then sent to output using printf().

Running this on hardware where any of the above doesn't hold would produce garbage text (or perhaps even an out-of-bounds access) instead.


The code could be rewritten like this:

void f()
{
    if (m[1]-- != 0)
    {
        m[0] *= 2;
        f();
    } else {
        printf((char*)m);
    }
}

What it's doing is producing a set of bytes in the double array m that happen to correspond to the characters 'C++Sucks' followed by a null terminator. They've obfuscated the code by choosing a double value which, when doubled 771 times, produces that set of bytes in the standard representation, with the null terminator provided by the second member of the array.

Note that this code wouldn't work under a different endian representation. Also, calling main() is not allowed in C++, although C permits it.


More readable version:

double m[2] = {7709179928849219.0, 771};
// m[0] = 7709179928849219.0;
// m[1] = 771;

int main()
{
    if (m[1]-- != 0)
    {
        m[0] *= 2;
        main();
    }
    else
    {
        printf((char*) m);
    }
}

It recursively calls main() 771 times.

In the beginning, m[0] = 7709179928849219.0, whose bytes read as C++Suc;C. In every call m[0] gets doubled, which "repairs" the last two letters. In the last call, m[0] contains the ASCII representation of C++Sucks and the first byte of m[1] is zero, so the C++Sucks string has a null terminator. All of this assumes that m[0] is stored in 8 bytes, so that each char takes 1 byte.

Without recursion and the illegal call to main(), it looks like this:

double m[] = {7709179928849219.0, 0};
for (int i = 0; i < 771; i++)
{
    m[0] *= 2;
}
printf((char*) m);


The number 7709179928849219.0 has the following binary representation as a 64-bit double:

01000011 00111011 01100011 01110101 01010011 00101011 00101011 01000011
+^^^^^^^ ^^^^---- -------- -------- -------- -------- -------- --------

+ shows the position of the sign; ^ of the exponent, and - of the mantissa (i.e. the value without the exponent).

Since the representation uses a binary exponent and mantissa, doubling the number increments the exponent by one. Your program does it precisely 771 times, so the exponent, which started at 1075 (the decimal value of 10000110011), becomes 1075 + 771 = 1846 at the end; the binary representation of 1846 is 11100110110. The resultant pattern looks like this:

01110011 01101011 01100011 01110101 01010011 00101011 00101011 01000011
-------- -------- -------- -------- -------- -------- -------- --------
0x73 's' 0x6B 'k' 0x63 'c' 0x75 'u' 0x53 'S' 0x2B '+' 0x2B '+' 0x43 'C'

This pattern corresponds to the string that you see printed, only backwards. At the same time, the first byte of the second array element is zero, which provides the null terminator and makes the string suitable for passing to printf().


