Ratchet and Clank: Up Your Arsenal was an online title that shipped without the ability to patch either code or data. Which was unfortunate. The game downloads and displays an End User License Agreement each time it's launched. This is an ASCII string stored in a static buffer. This buffer is filled from the server without checking that the size is within the buffer's capacity.
We exploited this fact to cause the EULA download to overflow the static buffer far enough to also overwrite a known global variable. This variable happened to be the function callback handler for a specific network packet. Once this handler was installed, we could send the network packet to cause a jump to the address in the overwritten global. The address was a pointer to some payload code that was stored earlier in the EULA data.
Valuable data existed between the real end of the EULA buffer and the overwritten global, so the first job of the payload code was to restore this trashed data. Once that was done things were back to normal and the actual patching work could be done.
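The mechanics in that quote can be modelled in a few lines. This is a toy Python simulation, not the game's code: the buffer capacity, the contents of the gap, the handler address and every name below are made up for illustration, and a flat bytearray stands in for the console's memory.

```python
# Toy model of the overflow. All sizes, offsets and addresses are hypothetical.
EULA_CAPACITY = 16                          # size of the static EULA buffer
GAP_DATA = b"VALUABLE"                      # unrelated data that follows the buffer
HANDLER_OFFSET = EULA_CAPACITY + len(GAP_DATA)
ORIGINAL_HANDLER = (0x1000).to_bytes(4, "little")


def fresh_memory() -> bytearray:
    """Lay out: [EULA buffer][valuable data][packet handler pointer]."""
    return bytearray(EULA_CAPACITY) + bytearray(GAP_DATA) + bytearray(ORIGINAL_HANDLER)


def unchecked_download(memory: bytearray, data: bytes) -> None:
    """Mimics the bug: copies the server's data without a capacity check."""
    memory[0:len(data)] = data


def checked_download(memory: bytearray, data: bytes) -> None:
    """What a bounds-checked copy would have done instead."""
    if len(data) > EULA_CAPACITY:
        raise ValueError("EULA larger than buffer")
    memory[0:len(data)] = data


def build_oversized_eula(payload: bytes, payload_address: int) -> bytes:
    """Payload first, then filler across the gap, then the new handler pointer."""
    if len(payload) > EULA_CAPACITY:
        raise ValueError("payload does not fit in the buffer")
    padding = bytes(EULA_CAPACITY - len(payload))
    filler = bytes(len(GAP_DATA))           # this is what trashes the valuable data
    return payload + padding + filler + payload_address.to_bytes(4, "little")


def restore_gap(memory: bytearray) -> None:
    """First job of the payload: put the trashed data back."""
    memory[EULA_CAPACITY:HANDLER_OFFSET] = GAP_DATA


if __name__ == "__main__":
    memory = fresh_memory()
    unchecked_download(memory, build_oversized_eula(b"PAYLOAD", payload_address=0))
    print("handler now points at:", int.from_bytes(memory[HANDLER_OFFSET:], "little"))
    restore_gap(memory)
    print("gap restored:", bytes(memory[EULA_CAPACITY:HANDLER_OFFSET]))
```

The model only covers the memory layout. Actually jumping to the payload when the network packet arrives has no analogue here.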
in the majority of Latin languages, ø sorts as an accented variant of o, meaning that most users would expect ø alongside o. However, a few languages, such as Norwegian and Danish, sort ø as a unique element after z. Sorting “Søren” after “Sylt” in a long list, as would be expected in Norwegian or Danish, will cause problems if the user expects ø as a variant of o.
Alphabetical order explained in a mere 27,817 words. previously
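To make the Søren/Sylt example concrete, here is a minimal Python sketch with two hand-rolled sort keys. Neither is a real collation: `key_o_variant` and `key_danish_like` are hypothetical helpers that only model the single rule the quote describes, and real code would want a proper collation library.

```python
NAMES = ["Sylt", "Søren", "Sofie", "Svend"]

# Letters that Danish and Norwegian place after z, in order.
TAIL_LETTERS = "æøå"


def key_o_variant(name: str) -> str:
    """Treat ø as a variant of o, as most Latin-script users would expect."""
    return name.casefold().replace("ø", "o")


def key_danish_like(name: str) -> list:
    """Sort æ, ø, å after z; everything else by code point (a simplification)."""
    key = []
    for ch in name.casefold():
        if ch in TAIL_LETTERS:
            key.append((1, TAIL_LETTERS.index(ch)))
        else:
            key.append((0, ord(ch)))
    return key


if __name__ == "__main__":
    print(sorted(NAMES, key=key_o_variant))    # Søren lands next to Sofie
    print(sorted(NAMES, key=key_danish_like))  # Søren lands after Sylt
```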
At the end of 1386, Jogaila returned to Vilnius to [..] convert the Grand Duchy to Catholicism. [..] New converts were baptized en masse, with little teaching, and were awarded wool shirts.
"People will do anything for a free t-shirt", historical perspective edition. From "Union of Krewo" on Wikipedia.
Different locales also return different versions of the same symbol. The US locale (en_US) returns a full-width ¥ symbol, whereas the Japan locale (ja_JP) returns a regular ¥ symbol. Similarly, the French locale (fr_FR) will return a non-breaking space between the digits and the symbol, whereas the French Canadian locale (fr_CA), which formats numbers the same way (“15,00 $NZ”, like above), uses a regular space.
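The practical consequence is that two formatted amounts can look identical and still compare unequal. A small Python sketch of that, using only the standard `unicodedata` module: the sample strings are hand-written stand-ins rather than output captured from any locale library, and `describe` and `loosely_equal` are hypothetical helper names.

```python
import unicodedata

FULLWIDTH_YEN = "\uffe5"   # the full-width form
YEN = "\u00a5"             # the regular yen sign
NBSP = "\u00a0"            # non-breaking space


def describe(text: str) -> list:
    """List the Unicode name of each character, to see what is really there."""
    return [unicodedata.name(ch) for ch in text]


def loosely_equal(a: str, b: str) -> bool:
    """Compare after NFKC, which folds full-width forms and no-break spaces."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


if __name__ == "__main__":
    with_nbsp = "15,00" + NBSP + "$"
    with_space = "15,00 $"
    print(with_nbsp == with_space)               # exact comparison
    print(loosely_equal(with_nbsp, with_space))  # comparison after NFKC
    print(describe(FULLWIDTH_YEN), describe(YEN))
```

NFKC is lossy, so it suits comparison or search better than storage.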
This is an utterly brilliant list of broken assumptions under Unicode from rjh. Perl-biased, but syntax aside, the majority of these are just generally true. A trimmed list of my personal favorites (but you should read the whole list):
- Code that assumes it can open a text file without specifying the encoding is broken.
- Code that assumes [any language] uses UTF‑8 internally is wrong.
- Code that assumes [..] code points are limited to 0x10_FFFF is wrong.
- Code that assumes roundtrip equality on casefolding [..] is completely broken and wrong. Consider that the uc(“σ”) and uc(“ς”) are both “Σ”, but lc(“Σ”) cannot possibly return both of those.
- Code that assumes every lowercase code point has a distinct uppercase one, or vice versa, is broken. For example, “ª” is a lowercase letter with no uppercase; whereas both “ᵃ” and “ᴬ” are letters, but they are not lowercase letters; however, they are both lowercase code points without corresponding uppercase versions. Got that? They are not \p{Lowercase_Letter}, despite being both \p{Letter} and \p{Lowercase}.
- Code that assumes changing the case doesn’t change the length of the string is broken.
- Code that assumes only letters have case is broken. Beyond just letters, it turns out that numbers, symbols, and even marks have case. In fact, changing the case can even make something change its main general category, like a \p{Mark} turning into a \p{Letter}. It can also make it switch from one script to another.
- Code that assumes you can remove diacritics to get at base ASCII letters is evil, still, broken, brain-damaged, wrong, and justification for capital punishment.
- Code that assumes characters like > always point to the right and < always point to the left is wrong, because they in fact do not.
- Code that assumes if you first output character X and then character Y, that those will show up as XY is wrong. Sometimes they don’t.
- Code that assumes that ü has an umlaut is wrong.
- Code that believes things like ₨ contain any letters in them is wrong.
- Code that believes that given $FIRST_LETTER as the first letter in some alphabet and $LAST_LETTER as the last letter in that same alphabet, that [${FIRST_LETTER}-${LAST_LETTER}] has any meaning whatsoever is almost always completely broken and wrong and meaningless.
- Code that believes someone’s name can only contain certain characters is stupid, offensive, and wrong.
- Code that converts unknown characters to ? is broken, stupid, braindead, and runs contrary to the standard recommendation, which says NOT TO DO THAT! RTFM for why not.
- Code that believes once you successfully create a file by a given name, that when you run ls or readdir on its enclosing directory, you’ll actually find that file with the name you created it under is buggy, broken, and wrong. Stop being surprised by this!
- Code that believes UTF-16 is a fixed-width encoding is stupid, broken, and wrong. Revoke their programming licence.
- Code that believes that stuff like /s/i can only match “S” or “s” is broken and wrong. You’d be surprised.
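A few of these are easy to check for yourself. The list is Perl-flavoured, so the following is only a rough Python translation of some items, using the standard library; `strip_marks` is a hypothetical helper showing the naive "remove diacritics" approach the list warns about.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Naive diacritic removal: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def utf16_width(ch: str) -> int:
    """Number of bytes one character takes in UTF-16 (little endian, no BOM)."""
    return len(ch.encode("utf-16-le"))


if __name__ == "__main__":
    # Casefolding does not round-trip: two lowercase sigmas, one uppercase.
    print("σ".upper(), "ς".upper(), "Σ".lower())

    # Changing case can change the length of the string.
    print(len("ß"), len("ß".upper()))

    # Not every lowercase code point has an uppercase counterpart.
    print("ª".upper() == "ª")

    # "S" and "s" are not the only things that case-insensitively match s.
    print("ſ".upper(), "ſ".casefold())

    # Stripping marks works for é but leaves ø alone: it has no decomposition.
    print(strip_marks("é"), strip_marks("ø"))

    # UTF-16 is variable width: characters outside the BMP take two code units.
    print(utf16_width("a"), utf16_width("😀"))
```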
It turns out that in January 1970, the UK government was in the middle of an experimental change to a year-round GMT+1 timezone. Some operating systems seem to be aware of that fact; some aren’t.
Argh; time.
[Python-ideas] Please reconsider the Boolean evaluation of midnight
Core Data is also a supported engine for iCloud syncing. It’s supposed to be no more complex than just using Core Data on its own, so imagine everyone’s surprise when it was just as complex as using Core Data on its own.
Somebody is going to say something like, “Dan, lighten up. It’s just coffee. You don’t need to have so many feelings about it. Go out and change the world instead of fretting about coffee on the internet.” This will not be somebody who drinks much coffee.