Don’t Believe Everything You Read: Security Bugs in Reference Materials

In this post, we consider three ways in which the explosion of new software frameworks, and the frequent need to shift between them, affects security: (1) the tendency to use technologies without fully understanding them, (2) the use of non-peer-reviewed sources for assistance and documentation when solving a problem, and (3) the presence of security bugs in code examples in printed materials from respected publishers. We conclude with recommendations for developers and authors to help avoid the subtle security bugs that result from relying on or producing faulty documentation.

The shift toward Web-based applications over the past two decades has led to a proliferation of new technologies. Free of the overhead of installing extensive runtime environments on the client in order to use a new language or library, Web developers have a huge number of frameworks upon which they might develop an application: ASP.NET, J2EE, Spring, ColdFusion, PHP, Google Web Toolkit, Ruby on Rails, Django, mod_perl, and Node, to name a few, not to mention the JavaScript libraries available at the client, such as jQuery and AngularJS. Web development can even involve learning new domain-specific languages and formats, including ERB, HAML, Stylus, JSON, and of course, HTML, CSS, and JavaScript.

The rapid pace of change in Web technologies, and the tendency for the technologies used to shift from project to project, inevitably means that developers do not always have the opportunity to learn a technology thoroughly before using it. Gone are the days when a developer might purchase a copy of Kernighan and Ritchie’s The C Programming Language as an undergraduate, review it from cover to cover, and continue to refer to such an in-depth language reference on a day-to-day basis throughout their career. Shortcuts (such as code snippets, Internet searches, and message boards) and knowledge markets (such as Stack Overflow and Superuser) have become essential tools in keeping pace with today’s rapid application development cycles. By using these tools, developers can forgo reviewing hundreds of pages of reference manuals and detailed documentation.

The consequence is that much of the reference material that today’s developers use has undergone little to no peer review. Compounding this, developers without inside-and-out knowledge of a language and framework are unlikely to spot any subtle errors before using an example. This has the potential to affect the security of the applications that developers using these resources produce.

Even printed material can be affected. Recently, we at ISE spotted a security bug in the O’Reilly title Learning Node by Shelley Powers (first edition, ISBN 978-1-449-32307-3). In a chapter on web sockets, the book provides the following example code illustrating how a chat client could handle a “chat” event, to receive an incoming message from the server (p. 282):

  1. socket.on('chat',function(username, data) {
  2.    var p = document.createElement('p');
  3.    p.innerHTML = username + ': ' + data;
  4.    document.getElementById('output').appendChild(p);
  5. });

The use of the innerHTML property to insert data received from the server into the client’s DOM renders this example code vulnerable to cross-site scripting, and developers who copy this design will reproduce the same vulnerability in their own code. An adversary who controls the data received can inject JavaScript into the page, where it executes in the context of the website displaying it. This could allow the attacker to steal information from the displayed page using DOM traversal, make unauthorized calls to the server under the victim’s account and read the resulting responses, or otherwise leverage JavaScript to attack the site or the victim’s browser. There are two ways to fix this example. The data can be escaped before it is assigned to the innerHTML property, by replacing the dangerous characters &, <, >, ", ', and / with the HTML entity encodings &amp;, &lt;, &gt;, &quot;, &#x27;, and &#x2f;, respectively. Alternatively, the data can be passed out-of-band as a DOM text node, where it will not be parsed as HTML, by replacing line 3 with the statement p.appendChild(document.createTextNode(username + ': ' + data));.
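
Both remedies described above can be sketched as follows. This is a minimal illustration, not code from the book: the names escapeHtml, renderChatEscaped, and renderChatAsText are hypothetical, and the document object is passed in as a parameter (doc) only so that the functions can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the book): maps each of the six
// dangerous characters to its HTML entity encoding.
var HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '/': '&#x2f;'
};

// Encode untrusted input in a single pass, so that the '&' produced
// by one replacement is never encoded a second time.
function escapeHtml(value) {
  return String(value).replace(/[&<>"'\/]/g, function (ch) {
    return HTML_ESCAPES[ch];
  });
}

// Remedy 1: keep innerHTML, but escape both untrusted values first.
function renderChatEscaped(doc, username, data) {
  var p = doc.createElement('p');
  p.innerHTML = escapeHtml(username) + ': ' + escapeHtml(data);
  doc.getElementById('output').appendChild(p);
}

// Remedy 2: insert the message as a text node, which the browser
// never parses as HTML.
function renderChatAsText(doc, username, data) {
  var p = doc.createElement('p');
  p.appendChild(doc.createTextNode(username + ': ' + data));
  doc.getElementById('output').appendChild(p);
}

// In a browser, either function could be wired to the chat event:
// socket.on('chat', function (username, data) {
//   renderChatAsText(document, username, data);
// });
```

The text-node approach is arguably the safer default, since it does not depend on the escaping table being complete. Note that the username must be treated as untrusted input as well, since it also arrives from the server.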

Software developers must treat example code with suspicion before using the techniques given in the example in their own code. If sample code uses a property, method, or other construct that you do not fully understand, look it up first. If you suspect a security vulnerability or even a less-consequential bug in the sample code, investigate it; do not assume that the example has already been carefully checked for such problems or that the author knows more about the language than you do. Authors and publishers, on the other hand, should increase the number of technical reviewers of their publications and make these reviews more stringent. Example code should not allow a security vulnerability to exist without explicitly cautioning readers against it, even if this decreases the readability or increases complexity slightly. Just as no well-reviewed C book should offer an example containing an unchecked buffer, no Web programming book should offer one susceptible to cross-site scripting.

The increasing divergence of technology stacks and frameworks, with Web development being the archetypical example, means that programmers must learn new software technologies faster than ever before. Completing a functional product on time and within budget is difficult enough. We should not let this difficulty be compounded by the existence of subtle security bugs in the examples we provide as authors and mentors, and we should be cautious of such issues as users of reference materials.

Additional Information

Readers interested in further technical details about this topic can reach us at: contact AT securityevaluators DOT com