Tuesday, 24 November 2015

Inferior Process and Incompetent Developers

In Falhar's recent post, "Everybody is doing TDD," they claim that every developer uses test-driven development (TDD), because they will either automate their tests, or they will manually test their application. They go on to state that those who are manually testing their applications are "fully incompetent." Whilst I agree that, with a sufficiently broad definition, almost anyone who tests their programs is undertaking TDD, whether that broadly-defined TDD matches the commonly accepted definition is a different matter. However, I want to argue that those who do not produce automated tests are not necessarily incompetent; rather, it is a matter of context.

Let's take three developers working on three separate projects.

Developer A is working on a security critical software library. The library implements a well-known cryptographic construction, which is defined in one or more RFC documents. Prior to development, this developer produces an automated test suite which consists of the test vectors from the RFC and property-based randomised tests. They work alone, so there is no code or design review, but they do use commonly available static analysis and code style tools to ensure that their work is consistent and free of "obvious" errors.
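Developer A's approach can be illustrated in miniature. A known-answer test checks an implementation against a vector published in the standard; the sketch below uses the common sha256sum tool as a stand-in for the library under test, checking the classic "abc" vector from FIPS 180-2 / RFC 6234:

```shell
#!/bin/sh
# Known-answer test: hash "abc" and compare against the published
# SHA-256 test vector. A real suite would loop over many vectors,
# plus randomised property-based tests.
expected="ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
actual=$(printf 'abc' | sha256sum | awk '{print $1}')

if [ "$actual" = "$expected" ]; then
  echo "PASS"
else
  echo "FAIL: got $actual"
fi
```

Property-based tests would complement these by checking invariants (for example, that decryption inverts encryption) on randomised inputs, catching cases the published vectors miss.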

Developer B is a keen gardener, but is very forgetful. In order to ensure that they do not forget to tend their various plants according to a complex schedule, they write a program to help them remember. When run by cron, the program sends them an email with the names of the plants to water. There is no over-arching specification, the requirements are all encoded within the developer's head. If the program fails, the primary impact is that some plants are not watered for a day or two, or the schedule does not work out quite as planned. To develop this program, the developer uses some simple shell scripts, and a single crontab entry.
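Developer B's setup might look something like the following sketch; the file locations, schedule format, and crontab entry are all invented for illustration. cron mails anything the script prints to stdout:

```shell
#!/bin/sh
# Hypothetical watering reminder. The schedule file holds lines of
# "weekday plant", e.g. "Mon ferns". A crontab entry such as:
#   MAILTO=developer@example.org
#   0 7 * * * $HOME/bin/watering-reminder.sh
# mails the script's output each morning.

reminders() {
  # $1: schedule file; $2: three-letter day name (e.g. "Thu").
  # Inside awk, $1 and $2 are the weekday and plant fields.
  awk -v day="$2" '$1 == day { print "Water the", $2, "today." }' "$1"
}

[ -r "$HOME/.plant-schedule" ] && reminders "$HOME/.plant-schedule" "$(date +%a)" || true
```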

Finally, we have Developer C. Developer C is working on the control software for a turbofan engine (commonly called a jet engine). They are part of a large team, which includes safety managers, requirements engineers, and so on. The development time scale is on the order of a decade, and starts with requirements gathering, hazard analyses, risk assessments, and so on. Due to the fact that a failed engine could send searing hot fragments of turbine blade into the passenger cabin, the decision is made to formally verify the software. Developers are not expected to test their code; they're expected to write code which can be shown to be equivalent to the specification. Testing is handled by a large and dedicated assurance team, who test both the components, and the system as a whole. The closest to testing that developer C undertakes is checking that their code and associated proof holds according to the verifier.

It does not make sense to refer to any of the above developers as incompetent, despite the fact that only one of them is practising TDD. Each project calls for differing levels of assurance, and therefore different processes. Each process is completely adequate for the context, and further, it is possible that a single developer undertakes each of the projects outlined, some as part of their hobby, and some as part of their usual employment. There is no incompetence here, just different assurance levels.

TDD is a tool which is available to many developers. Not using TDD does not mark a developer as incompetent. Using a process which is inappropriate for the assurance level required by a project may well result in poor outcomes, but often developers do not decide on the process. In the cases where developers do decide on the process, their choices may be guided by forces other than software correctness, such as market forces, management pressure, team familiarity, and so on. There will be cases where the wrong process is chosen for the situation; this would usually be called negligence, and may well amount to incompetence.

Saturday, 7 November 2015

Ransomware on Linux

Dr.WEB is reporting that ransomware has come to the Linux ecosystem. Fortunately, this has only affected "tens" of users thus far. In particular, this malware is targeting those with a lot to lose: web site administrators. This gives the malware a good chance of ensnaring some business-critical data or functionality, thereby giving the victim a bit more incentive to pay the ransom.

Ransomware has been around for some time in the Windows ecosystem. Previously, these programs would show a dialogue claiming that the machine was locked and could be unlocked when a suitable payment was made. In reality, these were often just programs configured to run automatically on start-up, and did not directly endanger user data. In recent years, they have made attempts at encrypting the user's data and putting the key out of reach. A prompt payment promises to return the key, and thus the data, to the victim. These have had varying levels of success, with the "best" managing to pull in millions of dollars for their creators. They have not been without flaws that allowed victims to recover their data without paying: some variants stored the key locally on the machine, and some eventually had their keys disclosed by security researchers. Others have yet to be broken, and in those cases organisations often have no option but to pay the ransom.

Fortunately, this particular strain of malware requires extensive user interaction to run, including the granting of root privileges. This does not prevent future generations of this malware from piggy-backing on other access vectors, such as vulnerable web browsers, email clients, web servers, and so on. I would predict that we will see this kind of malware attached to remote exploits in the moderately near future. Even using old exploits, or only encrypting a user's home directory, could turn up quite the bounty for the attacker, as those who don't update their systems may well not have suitable backup processes in place to recover from the attack, and many people store their valuable files in their home directory.

There are a few options to mitigate the risk posed by this threat. However, none will be wholly effective, so a combination may be required. For some organisations, this will simply be a strengthening or verification of existing defences. For others, this threat may call for entirely new defences to be deployed.

The first and most common would be to ensure that all systems under your control have all relevant security patches applied. This should limit the likelihood of an exploit being used to launch an attack without user interaction. A backup system which stores backups offline should be used. If an on-line backup system is in use, either deploy an offline system as well, or ensure that a previously saved backup cannot be overwritten by a corrupted copy or easily reached by an attacker. This will reduce the impact of a breach, as it should be possible to recover from relatively recent backups in the event of a compromise. Where possible, software which consumes untrusted input, such as web browsers, email clients, web servers, and so on, should be placed into a suitable sandbox environment. This should reduce the likelihood that the malware will be able to reach critical business data. Finally, better user education may reduce the likelihood of a breach, as users may be better able to detect the social engineering attacks which might otherwise have led them to run the malware.
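The "a saved backup cannot be overwritten" point can be sketched as follows (the paths are examples, and this complements rather than replaces genuinely offline copies). Each run writes a new date-stamped archive and refuses to clobber an existing one, so a later, possibly corrupted, run cannot silently replace an older snapshot:

```shell
#!/bin/sh
# Append-only style backups: one archive per run, never overwritten.
snapshot() {
  # $1: directory to back up; $2: destination directory.
  out="$2/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  if [ -e "$out" ]; then
    echo "refusing to overwrite existing snapshot: $out" >&2
    return 1
  fi
  tar -czf "$out" -C "$1" . && echo "$out"
}

# Example (hypothetical paths):
#   snapshot /var/www /backup/www
```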

It is fortunate that Linux has several sandbox mechanisms available, and an appropriate one can be selected. Such mechanisms include chroots, SELinux, AppArmor, and seccomp-bpf. Other systems, such as FreeBSD, should not be considered invulnerable, and similar mitigations should be applied, such as the use of jails or Capsicum. Unfortunately, restricting a complex web browser's access to the file system may have unexpected consequences, or simply be very time consuming. Ubuntu provides an AppArmor profile to do this for Chromium. However, it is not without its issues, such as not being able to determine whether Chromium is the default browser on the system.

Saturday, 8 August 2015

SQLite and Testing

Categorical claims are often the source of faulty statements. "Don't test with SQLLite [sic] when you use Postgres in Production" by Robellard is a fantastic example. I actually agree with a variant of this statement: "If you need high levels of assurance, don't test with SQLite alone when you use Postgres in production."

Robellard bases his claim on several points, noting that "SQLite has different SQL semantics than Postgres," "SQLite has different bugs than Postgres," and "Postgres has way more features that SQLite." He has a couple more points, but all of these largely amount to a discrepancy between SQLite and Postgres, or between one Postgres version and another, leading to a defect. These points are a genuine concern, but his claim relies on using exactly one database back-end for testing, and exactly one risk profile for various applications.

As a quick diversion, I am not using the common definition of risk which is synonymous with chance. I am using a more stringent definition: "the effect of uncertainty on objectives" as specified in ISO Guide 73:2009. This definition often requires an assessment of both the impact and likelihood of some form of scenario to obtain a fuller picture of an "effect."

If the risk posed by defects caused by an SQLite-Postgres discrepancy is too high, then you'll likely want to use Postgres as part of your testing strategy. If the risk posed is sufficiently low, then SQLite alone may be appropriate. Both conclusions are predicated on the risk posed by defects, and on the organisational appetite for risk.

A testing strategy comprising several different testing methodologies can often be thought of as a filter of several layers. Different layers are variously better or worse at surfacing different types of defects. Some are more likely to surface defects within components, and others are better at locating defects in the interactions between components. Other "layers" might be useful for catching other classes of defects. Each layer reduces the likelihood of a defect reaching production, which reduces the risk that defects pose. Each layer also has a cost associated with writing and maintaining that layer.
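The filter model can be made concrete. If each layer independently catches some fraction of defects, the chance of a defect escaping every layer is the product of the layers' miss rates; the catch rates below are invented purely for illustration:

```shell
#!/bin/sh
# Three layers with (invented) catch rates of 60%, 50%, and 30%.
# The escape probability is the product of the miss rates:
# 0.4 * 0.5 * 0.7 = 0.14.
awk 'BEGIN { printf "%.3f\n", (1 - 0.6) * (1 - 0.5) * (1 - 0.3) }'
# prints 0.140
```

Even three individually mediocre layers let only 14% of defects through, which is why cheap, imperfect layers can still be worth their maintenance cost.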

It's quite common for different layers to be run at different times. For instance, mock-based unit tests might be run very frequently by developers. This provides the developers with very quick feedback on their work. Integration tests backed by an in-memory database might be run prior to committing. These take a little longer to run and so might get run less often, but still catch most problems caused by erroneous component interactions. A continuous integration (CI) server might run integration tests backed by Postgres, and slower UI tests periodically. Finally, penetration tests might be conducted on a yearly or six-monthly basis.

This sort of process aims to allow developers the flexibility to work with confidence by providing quick feedback. However, it also provides heavier-weight checking for the increased levels of assurance required for the risk-averse organisation. An organisation with a greater appetite for risk may remove one or more of those layers, such as in-memory integration tests, to speed development. This saves them money and time but increases their exposure to risk posed by defects.

SQLite is just a tool which may be used as part of one's testing strategy. Declaring "Don't test with SQLLite [sic] when you use Postgres in Production" ignores how it may be usefully applied to reduce risk in a project. In many cases SQLite is entirely appropriate, as the situation simply does not require high levels of assurance. In other cases, it may form part of a more holistic approach alongside testing against other database backends, or be removed entirely.

Not every organisation is NASA, and not every project handles secrets of national import. Most failures do not kill people. An honest assessment of the risks would ideally drive the selection of the testing strategy. Often, this selection will be balanced against other concerns, such as time-to-market and budget. There is no silver bullet. A practical, well-rounded solution is often most appropriate.

Saturday, 25 July 2015

Infosec's ability to quantify risk

In Robert Graham's latest post, "Infosec's inability to quantify risk," Graham makes the following claim:
"Infosec isn't a real profession. Among the things missing is proper "risk analysis". Instead of quantifying risk, we treat it as an absolute. Risk is binary, either there is risk or there isn't. We respond to risk emotionally rather than rationally, claiming all risk needs to be removed. This is why nobody listens to us. Business leaders quantify and prioritize risk, but we don't, so our useless advice is ignored."

I'm not going to get into a debate as to the legitimacy of infosec as a profession. My job entails an awful lot of infosec duties, and there are plenty of folks turning a pretty penny in the industry. It's simply not my place to tell people what they can and cannot define as a "profession."

However, I do take issue with the claim that the infosec community lacks proper risk analysis tools. We have risk management tools coming out of our ears. We have risk management tools at every level: those used during design and implementation, those for assessing the risk a vulnerability poses to an organisation, and even tools for analysing risk at an organisational level.

At the design and implementation level, we have software maturity models. Many common ones explicitly include threat modelling and other risk assessment and analysis activities.

One of the explicit aims of the Building Security in Maturity Model (BSIMM) is "Informed risk management decisions." Some activities in the model include "Identify PII obligations" (CP1.2) and "Identify potential attackers" (AM1.3). These are the basic building blocks of risk analysis activities.

The Open Software Assurance Maturity Model (OpenSAMM) follows a similar pattern, including a requirement to "Classify data and applications based on business risk" (SM2) and "Explicitly evaluate risk from third-party components" (TA3).

Finally, the Microsoft Security Development Lifecycle requires that users "Use Threat Modelling" to "[...] determine risks from those threats, and establish appropriate mitigations." (SDL Practice #7).

So, we can clearly see that risk analysis is required during the design and implementation of a system. Although no particular risk management methodology is prescribed by the maturity models, we are evidently in an ecosystem that is not only acutely aware of risk, but also of the way those risks will impact organisational objectives.

When these maturity models fail to produce adequately secure software, we need to understand how bad the resulting vulnerabilities are. Put simply, statements like "on a scale of 1 to 10, this is an 11" are not useful. I understand why such statements are sometimes necessary, but I worry about the media becoming fatigued by them.

Vulnerabilities are classified using one of several methods. Off the top of my head, I can think of three:
  1. Common Vulnerability Scoring System (CVSS)
  2. DREAD Risk Assessment Model (Wikipedia)
  3. STRIDE (Wikipedia)
These allow for those with infosec duties to roughly determine the risk that a vulnerability may pose to their organisation. Put simply, they allow for the assessment of the risk posed to one's systems. They are a (blunt) tool for risk assessment.

Finally, there are whole-organisation mechanisms for managing risks, which are often built into an Information Security Management System (ISMS). One of the broadest ISMS standards is BS ISO/IEC 27001:2013, which states:
"The organization shall define and apply an information security risk assessment process [...]"
If this seems a bit general, you should be aware that an example of a risk management process (which includes mechanisms for risk assessment & analysis) is available in BS ISO/IEC 27005:2011.

Let's look at the CERT Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) Allegro technical report:
"OCTAVE Allegro is a methodology to streamline and optimize the process of assessing information security risks [...]"
Similarly, Appendix A provides guidance on risk management, which includes sections on risk assessment and analysis.

Yet another standard is NIST SP800-30 Revision 1, "Guide for Conducting Risk Assessments". It states its purpose quite clearly in section 1.1, "Purpose and Applicability":
"The purpose of Special Publication 800-30 is to provide guidance for conducting risk assessments [...]"
NIST SP800-30 Revision 1 also provides an example of how to conduct a risk assessment.

As you can see, members of the infosec community have quite a few tools for risk assessment and analysis at our fingertips. From the design and implementation of software, through to the assessment of individual vulnerabilities, and even for assessing, analysing, and mitigating organisational risk, we are well equipped.

The infosec community is often very bad at communicating, and the media likes a salacious story. How often have you heard that a cure for cancer has been found, sight returned to the blind, and teleportation achieved? Recently, members of the infosec community have played into this, but that does not eliminate the fact that we do have tools for proper risk management. Our field is not so naive that we blindly believe all risk to be unacceptable.

Sunday, 7 June 2015

The Four Words of Distress

The title of this post is taken directly from The Codeless Code, "The Four Words of Distress".

I read it again yesterday, and today I have been bitten by it.

I accidentally removed the content of a comment on this blog. I don't get many comments, so I feel that all of them are important contributions (except the incessant 50 Shades of Grey spam; urgh, do I regret writing about that). As such, removing a comment is "serious" to me.

Upon realising my error, I immediately searched the page for an undo or "restore" link of some sort, and eventually went searching for an answer to my problem. I found only unanswered questions on Google's Groups, and a Product Help page claiming that I should have been sent to a confirmation page (no such thing happened).

Any number of mechanisms could have prevented or fixed this problem:

  1. A confirmation dialogue, alongside a "don't show this again" check box.
  2. A trash can, which comments go to for a fixed period of time.
  3. An undo button.
  4. A help guide.
As it stands, I accidentally clicked the "Remove Content" link, and now, without warning, the comment has gone. I worry that this is a black mark against the commenter's account, when it is a simple mistake.

Saturday, 6 June 2015

Chrooting a Mumble Server on OpenBSD

One of my colleagues is starting remote working shortly. As such, we needed a VoIP solution that worked for everyone: Mac, Linux, and FreeBSD. It was discovered that Mumble provided ample quality and worked everywhere. Top it off with the fact that we could host it ourselves, and we looked to be set.

However, being security conscious, I like to sandbox any service I run. Further, since this solution is slightly ad-hoc at the moment, it's being run off a personal BigV server, running OpenBSD. So I set out to chroot the package-managed Mumble server, umurmur, which does not include sandboxing by default.

Fortunately, if no logfile is specified, umurmurd will log to syslog, and it does not support config reloading, so I don't need to worry about that.

Chrooting is not entirely simple, and simple versions can be improved by "refreshing" the chroot every time the service starts. This means that if an attacker infects the binary in some way, it'll get cleared out and replaced after a restart. As another bonus, if a shared library is updated, it simply won't get found, which tells you what to update! If the binary gets updated, it'll be copied in fresh when the service is restarted.

To do this, we modify /etc/rc.d/umurmurd:

#!/bin/sh
# $OpenBSD: umurmurd.rc,v 1.2 2011/07/08 09:09:43 dcoppa Exp $
# 2015-06-06, Turner:
# Jails the umurmurd daemon on boot, copying the daemon binary and
# libraries in fresh each time.
# An adversary can still tamper with the logfiles, but everything
# else is transient.

# Where the jail lives, and what we run inside it.
chroot="/var/umurmurd"
original_daemon="/usr/local/bin/umurmurd"
group="_umurmur"   # group the daemon runs as; adjust to your config
daemon="chroot $chroot $original_daemon"

. /etc/rc.d/rc.subr

build_chroot() {
  # Locations of binaries and libraries.
  mkdir -p "$chroot/usr/local/bin"
  mkdir -p "$chroot/usr/lib"
  mkdir -p "$chroot/usr/local/lib"
  mkdir -p "$chroot/usr/libexec"

  # Copy in the binary.
  cp "$original_daemon" "$chroot/usr/local/bin/"

  # Copy in the shared libraries.
  cp "/usr/lib/libssl.so.27.0" "$chroot/usr/lib/"
  cp "/usr/lib/libcrypto.so.30.0" "$chroot/usr/lib/"
  cp "/usr/local/lib/libconfig.so.9.2" "$chroot/usr/local/lib/"
  cp "/usr/local/lib/libprotobuf-c.so.0.0" "$chroot/usr/local/lib/"
  cp "/usr/lib/libc.so.77.0" "$chroot/usr/lib/"

  # Set up /etc and copy in the config, certificate, and private key.
  mkdir -p "$chroot/etc/umurmur"
  cp "/etc/umurmur/umurmur.conf" "$chroot/etc/umurmur/umurmur.conf"
  cp "/etc/umurmur/certificate.crt" "$chroot/etc/umurmur/certificate.crt"
  cp "/etc/umurmur/private_key.key" "$chroot/etc/umurmur/private_key.key"

  # Set up the run-time linker and its hints.
  mkdir -p "$chroot/var/run"
  cp "/var/run/ld.so.hints" "$chroot/var/run/ld.so.hints"
  cp "/usr/libexec/ld.so" "$chroot/usr/libexec/ld.so"

  # Copy the pwd.db password database in. This is less-than-ideal.
  cp "/etc/pwd.db" "$chroot/etc/"
  grep "$group" "/etc/group" > "$chroot/etc/group"

  # Set up /dev. Device numbers are platform-specific; check your
  # platform's /dev/MAKEDEV if these don't match.
  mkdir -p "$chroot/dev"
  mknod -m 644 "$chroot/dev/urandom" c 1 9
  mknod -m 644 "$chroot/dev/null" c 1 3
}

destroy_chroot() {
  # Guard against "rm -rf /" should $chroot ever be empty.
  if [ -n "$chroot" ]; then
    rm -rf "$chroot"
  fi
}

case "$1" in
start)
  build_chroot
  ;;
stop)
  # Empty the jail on stop; see the limitations below.
  destroy_chroot
  ;;
esac

rc_cmd $1

So there we go! When /etc/rc.d/umurmurd start is called, the chroot is set up, and umurmurd is started in there. When you kill it, the chroot jail is emptied.

There are some limitations. For one, the private key (in the default configuration, private_key.key) can be compromised by an attacker who compromises umurmurd, and it can then be used to impersonate the server long after the compromise. Secondly, if you do specify a log file in umurmur.conf, and set up the relevant directory to log to, it will be trashed when you stop the daemon. This is a real problem if you're trying to work out what happened during a compromise.

Finally, if umurmur is updated and the required libraries change, "ldd /usr/local/bin/umurmurd" will list the new shared objects to copy in.

Known Issues

Invoking /etc/rc.d/umurmurd stop does not currently stop the umurmur daemon. I'm not entirely sure why, but the workaround is to run the stop command anyway, then find the daemon using "ps -A | grep umurmur" and kill -15 it.

Sunday, 31 May 2015

Refactoring & Reliability

We rely on so many systems that their reliability is becoming more and more important.

Pay bonuses are determined based on the output of performance review systems; research grants are handed out based on researcher tracking systems; and entire institutions may put their faith for visa enforcement in yet more systems.

The failure of these systems can lead to distress, financial loss, the closure of organisations, or even prosecution. Clearly, we want these systems to have a low failure rate, whether failures arise from design flaws or implementation defects.

Unfortunately for the developers of the aforementioned systems, they all have a common (serious) problem: the business rules around them are often in flux. The systems must therefore be both flexible and reliable. Very often, these properties are in contradiction to one another.

Reliability requires requirements, specification, design, test suites, design and code review, change control, monitoring, and many other processes to prevent, detect, and recover from failures in the system. Each step in the process is designed as a filter to deal with certain kinds of failures. Without them, these failures can start creeping into a production system. These filters also reduce the agility of a team; reducing their capability to respond to new opportunities and changing business rules.

On the other hand, the flexibility demanded by the team's environment is often attained through the use of traditional object-orientated design. This is typically achieved by writing to specific design patterns. If a system is not already in a state that is considered to be "good design," a team will apply refactorings.

Refactorings are small, semantics-preserving changes to the source of a system, with the goal of migrating towards a better design. This sounds perfect: any analysis and testing which took place prior to the refactoring should still be valid! [1] However, even though the semantics of the source are preserved (assuming humans make no mistakes in applying them!), other observable properties of the program are not. Any formal argument made regarding correctness, or time, space, or power requirements, may no longer be valid after the refactoring.

Not only does a refactoring undermine any previous formal argument, it can often make it more difficult to construct a new argument for the new program. This is because many refactoring techniques introduce additional indirection, duplicate loops, or rely on dynamically allocated objects. These are surprisingly difficult to deal with in a formal argument; so much so that many safety-critical environments, such as SPARK (a formally analysable Ada subset), simply do not support them, and many common standards aimed at safety-critical systems likewise ban them.

I am not arguing against refactoring. I think it's a great tool to have in one's toolbox. I also think that like any other tool, it needs to be used carefully and with prior thought. I'd also shy away from the idea that just because something's important, it is critical. With a suitable development process, a development team can remain agile whilst still reducing the risk of a serious failure to an acceptable level.

In the end, it's exactly that: the balance of risks. If a team is not responsive, they may miss out on significant opportunities. To mitigate this risk, teams introduce flexibility into their code through refactoring. To mitigate the risk of these refactorings causing a serious failure [2], the team should employ other mitigations, for example, unit and integration testing, design and code review, static analysis, and so on. Ideally, to maintain the team's agility, these should be as automated and as integrated into the standard development practice as possible. Each team and project is different, so each would need to assess which processes best mitigate the risks whilst maintaining that flexibility and agility.

[1] Fowler states that for refactorings to be "safe", you should have (as a minimum) comprehensive unit tests.
[2] Assuming that the "original" system wasn't going to cause a serious failure regardless.

Saturday, 11 April 2015

All White Labour?

With the up and coming general election, we've been receiving election material.

When my other half mentioned that the leaflet we received from the Labour candidate for York Central looked a little bit "overly white" (paraphrasing), I decided to run the numbers.

The York Unitary Authority's demographics, from the 2011 census, show that the population is 94.3% "white" [1].

We sat down and counted the faces on the leaflet, excluding the candidate themselves. We came to a count of 14 faces, all of which were white.

The chance of picking 14 people at random from a population which is 94.3% white and getting 14 white people is 43.97%. That means that the chance of getting at least one non-white person on the leaflet would've been 56.03%.
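For anyone wanting to check the arithmetic, the first figure is simply 0.943 raised to the 14th power, assuming independent draws, and the second is its complement:

```shell
#!/bin/sh
# P(14 independent draws are all white) = 0.943^14,
# and the chance of at least one non-white face = 1 - that.
awk 'BEGIN { p = 0.943 ^ 14; printf "%.4f %.4f\n", p, 1 - p }'
# prints: 0.4397 0.5603
```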

Obviously, this is quite close to a toss-up, but bear in mind that these people aren't selected at random. All sorts of biases go into selecting people for photo shoots: who turns out, who interacts with the candidate, who the photographer chooses to photograph, and who selects which photos from the shoots end up on the page, with their biases towards what will "look good" and what is expected to promote an image of diversity.

Anyway, I don't want to say one way or the other what this does or does not mean; I just want the data to be available for people.


[1] : 2011 Census: KS201EW Ethnic group, local authorities in England and Wales (Excel sheet 335Kb) (http://www.ons.gov.uk/ons/publications/re-reference-tables.html?newquery=*&newoffset=0&pageSize=25&edition=tcm%3A77-286262) Accessed: 2015-04-11

Monday, 6 April 2015

OpenBSD time_t Upgrade

Last night I foolishly undertook the upgrade from 5.4 to 5.5, without properly reading the documentation. My login shell is zsh, which meant that, when the upgrade was complete, I couldn't login to my system.

I'd get to the login prompt, enter my credentials, see the motd, and be kicked out as zsh crashed, due to the change from a 32-bit to a 64-bit time_t. I'd also taken the security precaution of locking the root account.

I fixed this as follows:

  1. Rebooted into single-user mode (boot -s at the boot> prompt).
  2. Mounted my filesystems (they all needed fsck running on them before mount -a).
  3. Changed my login shell to /bin/sh (chsh -s /bin/sh <<username>>).
  4. Rebooted.
After that, it was a simple matter of logging in and doing the usual: updating my PKG_PATH to point at a 5.5 mirror, and running "pkg_add -u" to upgrade all of the affected packages.

I then continued on with the remainder of the upgrade to OpenBSD 5.5.

A quick warning: This is one particular failure mode arising from not reading the docs. It may get you back into your system, but it's unlikely to fix all your woes if you don't read the docs.

So, the moral of the story is: ALWAYS READ THE DOCS.

Monday, 16 February 2015

Keeping a Web Service Secure

This post is aimed at those tasked with developing and maintaining a secure system, but who have not had to do so previously. It is primarily a sign-posting exercise, pointing initiates in the right direction. It is not gospel; many things require common sense, and there is rarely a single right answer. Everything is a trade-off: performance for security, usability for development time, backup size for recovery ability. To do this effectively, you need to consider your systems as a whole, including the people running and building them.

Keeping a web service adequately secure is quite a task. Many of the things covered will simply not apply, depending on your business model. For instance, web services which use services such as Google App Engine, Heroku, or OpenShift will not need to keep up with OS patches. Those that are for intranet use only may be able to get away with weaker password policies if they're not usually exposed to the internet.

Use your common sense, but be prepared to backup your decisions with evidence. When the executioner comes, you need to be able to honestly argue that you did the right thing, given the circumstances. If you can't do this honestly, you will get found out, your failure will be compounded, and the outcomes all the more severe.

The whole aim of these recommendations is to give you a head start removing potential avenues of attack for your adversaries, providing some mitigation to recover from an attack, and giving you plenty of places to go do further research.

You'll need to do the basics for securing your infrastructure. As with most other things, this is an ongoing process.

You will likely need, as a bare minimum, a properly configured firewall. I'm a big fan of pf, but IPTables is probably more widely used. Use what's appropriate for your platform, do some reading, and make sure you've got it dropping as much unnecessary traffic as possible, in and out of your organisation.
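To give a flavour of a default-deny policy, here is a minimal pf.conf sketch; the interface group and port list are examples only, so build your own rule set from your actual requirements:

```
# /etc/pf.conf (illustrative only)
set skip on lo                                      # don't filter loopback
block                                               # default deny, in and out
pass out on egress proto { tcp, udp } keep state    # allow outbound traffic
pass in on egress proto tcp to port { 22, 80, 443 } # ssh + web only
```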

For remote access, most services go with SSH. OpenSSH is easily hardened. I'd recommend having a good read of Lucas' SSH Mastery to familiarise yourself with the finer points of the service. If your platform doesn't have SSH, it's likely that you'll have something similar, but you'll have to do your research.

If your company has more than a small handful of employees, sudo is an absolute life saver. Once again, Lucas provides a good book on utilising sudo for access control & auditing in Sudo Mastery.

You must keep any services you run up to date. This means everything: database, web server, remote access, kernel, and so on. This will entail some downtime. A little and often goes a long way towards preventing major outages.

Often, people have perfectly good intentions, but fail to keep up with patches. There are three main causes.

The first is wanting to avoid any and all scheduled downtime. If you plan correctly, and notify your users ahead of time, this is no issue. People largely don't mind the service disappearing at a well-known time for an hour a month.

The second is harder to combat: ignorance, whether of an update's existence, of the security implications of not applying it, or of how to apply it. You need to collate a list of your dependencies (web servers, databases, OS patches, etc.) and their security announcement pages. This list then needs to be checked regularly, and any relevant issues assessed. You should also know how you actually update each piece of software. Many operating systems in the Unix family make this easy with a package manager, but I don't know how the rest of the computing world fares in that arena.

Typically, MITRE will provide an assessment of the risk posed to organisations by a vulnerability, which is normally a reasonable estimate of the risk posed to your organisation. High-risk vulnerabilities may need remediation outside your normal downtime cycle; lower-risk ones may be able to wait.

The third is that many people fear updates, as they can break things. This risk only gets worse with time. If you're not keeping up with the latest security patches, then when things break, you need to work out which of the 17 patches you just applied caused it. With just one or two patches, you can file a bug report, or determine whether you're misusing the dependency, much more easily.

A hidden set of dependencies are those of your own bespoke software, which are pulled in by some form of build process. Many organisations simply ignore the dependencies of their bespoke software. For instance, with a large Java application, it's all too easy to get left behind when ORMs and IoC containers are updated regularly. You must regularly check the dependencies of your bespoke software for updates, and the same arguments regarding OS-level patching apply. Get too far behind, and you'll get bitten -- hard.

I'd recommend turning on OS-level exploit mitigation techniques provided by your OS. This is typically things like address-space layout randomisation (ASLR) and W^X, but plenty of others exist. Should you be on holiday when a problem arises, these will buy you some time to either put a phone call in to someone to get the ball rolling, or get to an internet cafe and open a connection with a one-time password (don't trust the internet cafe!). They also tend to frustrate attackers, and may prevent some specific vulnerabilities from being exploitable at all.

This is a huge field, and some systems don't have all of these protections turned on by default, or simply lack the feature at all. Look very closely at your OS, and the dials that you can turn up.

Other systems of policy enforcement, such as SELinux, may also help, but I've not had much chance to use them. Read up on them, both the pros and the cons, and work out whether they're of much use to you, and whether they're worth your time.

The next class of problems is running obviously bad software. Even if you keep your dependencies up to date, lots of people will fall prey to this. One of the worst offenders is WordPress (and its plugin ecosystem), but other systems have not had a clean run either.

Check out the history of a project before utilising it. If you've decided that it's bad, but nothing else works for you, segregate, segregate, segregate. Then monitor it for when it does get compromised. This can be done with a simple chroot, or for beefier solutions, a separate system with an aggressive firewall between it and any user data can help. For monitoring, you'll probably want something like Nagios, but there's plenty of alternatives.

If possible, you should try to segregate and monitor your bespoke software as if it were in the class of services with a poor history, but this may not be possible.

On the subject of monitoring, you should be monitoring lots of performance metrics, such as database load, queries per second, errors per second in the logs, and so on. These will help tell you if something funny (security related or otherwise) is up, before it becomes a major problem for you.
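As a sketch of the kind of signal I mean (the class and its API here are hypothetical, not any particular monitoring tool), an errors-per-second metric can be tracked with a simple sliding window:

```java
import java.util.ArrayDeque;

// Hypothetical sketch: track errors/second over a sliding window, so a
// sudden spike can trip an alert before it becomes a major problem.
public class ErrorRateMonitor {
    private final ArrayDeque<Long> errorTimes = new ArrayDeque<>();
    private final long windowMillis;

    public ErrorRateMonitor(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record one error observed in the logs at the given timestamp.
    public void recordError(long nowMillis) {
        errorTimes.addLast(nowMillis);
        evictOld(nowMillis);
    }

    // Errors per second over the last windowMillis milliseconds.
    public double errorsPerSecond(long nowMillis) {
        evictOld(nowMillis);
        return errorTimes.size() / (windowMillis / 1000.0);
    }

    private void evictOld(long nowMillis) {
        while (!errorTimes.isEmpty() && errorTimes.peekFirst() <= nowMillis - windowMillis) {
            errorTimes.removeFirst();
        }
    }
}
```

In practice you'd feed something like this from your log pipeline and alert when the rate crosses a baseline; for real deployments, a proper tool such as Nagios does this for you.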

You may also choose to deploy an intrusion detection system (IDS). Snort is widely recommended, and comes with a whole bunch of basic rules. You'll need to tune this to filter out the noise and leave the signal. This is no small task; prepare yourself for some serious documentation divin' with most IDSs.

Once you're at this point, you should have a relatively decent security posture. But there are two crucial things I've not covered, both relating to recovery from an incident.

The first is backups. Make them, store them offline (at least one business has gone under for want of this), and test them, and your restore processes, regularly. If something bad happens, you need these to be out of reach of the threat, whether it's deliberate or otherwise.

The second is general incident response, and the part I'd most like to bring to the fore is a breach communications package. This is a set of templates covering several scenarios, which can be used to notify customers, fend off the press, and put on your website to explain the situation. If you're a big company and you expose customer data, the press will call. If you expose a lot of user data, the press will call. If you are compromised by a highly newsworthy adversary (e.g. ISIL), the press will call.

Do not waste your time trying to talk to journalists, and do not waste time writing a press release under pressure. You'll do a bad job of it, and things will go from bad to worse very quickly; especially if reporters think you're available for comment.

And finally, what's the point of all this if you're not going to be developing some bad-ass web app to scratch a particular itch?

I strongly recommend that you use some variety of secure development lifecycle, such as OpenSAMM, BSIMM, or Microsoft's SDL. Obviously, these won't solve your issues outright, but they should help alleviate them.

One of the most important things in each lifecycle is developer training. Without it, you're leaving a very large, obvious, and attractive target for attackers.

Many of the things in a lifecycle will overlap with, and go beyond these recommendations. That's good and fine, but most of them discuss things in very abstract terms, and hopefully this post puts some of the requirements down more concretely.

Hopefully, that should give you a good place to start. The important thing is discipline: the discipline to check for and apply patches, to follow a secure development lifecycle, and to impose some reasonable restrictions on yourself, and therefore on your attackers.

Developing software is very hard. Developing secure software is maddeningly hard. For this reason, there is no security silver bullet. Anyone trying to sell you a one-stop solution is full of it. It requires careful research and implementation, and it will take time. It is easiest to do this from the start, but by injecting quality control activities into an existing process you can begin to improve an organisation's security posture; it will be slow, though. In many cases, bespoke software will have to be reviewed or simply thrown away, services migrated to better-configured infrastructure, and so on. It takes time, a lot of time.

Tuesday, 10 February 2015

Madness from the C

So, I've decided to take the plunge.

C is so widely used that not being intimately acquainted with it is a definite hindrance. I can read C comfortably for the most part; ably wielding the source of a few kernels or utilities to track down bugs, and determining exactly how features are used, is not beyond my remit.

I might actually try to write some C. Originally I was going to be all hipster and show how to build your web 2.0 service in C using FastCGI, because the 90s are still alive and well. Also, you can do some pretty awesome things relating to jailing strategies (e.g. chroot+systrace, chroot+seccomp or jail+capsicum), and the performance can be good.

Unfortunately, when it comes to writing C, I freeze up. I know there is so much left to the imagination; so much to fear; so much lurking in the shadows, waiting to consume your first born. And I hate giving out advice that could lead to someone waking up to find ETIMEDOUT written in blood on the walls; even if the "advice" is given in jest. (Thanks to Mickens for the wording.)

I speak of the dreaded undefined behaviour.

Undefined behaviour is a double-edged sword. In tricky situations (such as evaluating INT_MAX + 1), it allows the compiler to do as it pleases, for the purposes of simplicity and performance. However, this often leads to "bad" things happening to those who transgress in the areas of undefined behaviour.
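As a point of contrast from another language (a small illustration of my own, nothing more): Java pins this particular behaviour down, defining signed int overflow as two's-complement wraparound, where C leaves it undefined:

```java
public class OverflowContrast {
    public static void main(String[] args) {
        // In C, signed overflow such as INT_MAX + 1 is undefined behaviour:
        // the compiler may assume it never happens and optimise accordingly.
        // Java defines the same operation as two's-complement wraparound,
        // so the result here is guaranteed on every conforming JVM.
        int x = Integer.MAX_VALUE;
        int y = x + 1;
        System.out.println(y == Integer.MIN_VALUE); // prints "true"
    }
}
```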

I suggest that if you are a practising C developer, and you don't think this is much of a problem, or you don't know much about it, you read both Regehr's "A Guide to Undefined Behavior in C and C++" and Lattner's "What Every C Programmer Should Know About Undefined Behavior" in full.

I was in the "undefined behaviour is bad, but not too much of a problem, since it can be easily avoided" camp, since I am more security-oriented than performance-oriented. I much prefer Ada to C.

That was, until I started in on Hatton's "Safer C".

The book is well written, clear, and direct.

Broadly speaking, all was going well, until I got to page 49. Page 49 contains a listing of undefined behaviour, as do pages 50, 51, 52, 53, 54, 55, and about one third of 56. That's over 7 pages of undefined behaviour.

It could be made better if these were all weird corner cases that the compiler clearly could not detect and throw out as "obviously bad." But some of them are not even semantically bad (like dereferencing a null pointer); they are syntactically bad.

Things like "The same identifier is used more than once as a label in the same function" (Table 2.2, entry 5) and "An unmatched ' or " character is encountered on a logical source line during tokenization" (Table 2.2, entry 4) and my current favourite, "A non-empty source file does not end in a new-line character, ends in a new-line character immediately preceded by a backslash character, or ends in partial preprocessing token or comment" (Table 2.2, entry 1).

Just look at those three. Syntactic errors, which, according to the standard, could bring forth the nasal demons.

Let me put it more bluntly. A conforming compiler may accept syntactically invalid C programs, and then emit (in effect) any code it wants.

Now, clearly, most compilers do not do this; they define these undefined behaviours as syntax errors. The thing which really scares me is that the errors presented above are from the first page of the table, and there are another six pages like that to go.

Further, I think I must come from a really luxurious background. I expect compilers to do everything reasonable to help program writers avoid obvious bugs, rather than simply making the compiler writer's life easier.

A compiler is written for a finite set of hardware platforms. It is then used by countless other (probably less experienced) programmers to produce software which may go on to be used by an unfathomable number of people. It's no small thing to claim that a large fraction of the world's population has probably interacted, in one way or another, with the OpenSSL code base, or the Linux kernel.

The reliability of these programs is directly correlated to the "helpfulness" of the language they are written in, and as such, C needs to be revisited with a view to "pinning down" many of the undefined behaviours; especially those which commonly lead to serious safety or security vulnerabilities.

Wednesday, 4 February 2015

Last Christmas, I gave you my HeartBleed

With HeartBleed well and truly behind us, and entered into the history books, I want to tackle the idea that HeartBleed would definitely have been prevented by the application of formal methods -- specifically, those that culminate in a proof of correctness of the system.

Despite sounding really good, this claim is actually false, and the reasoning isn't all that subtle.

To demonstrate, I'm going to use a common example: the Bell-LaPadula model for mandatory access control.

The model is, roughly, that everything is either a resource or a subject. Both resources and subjects have classifications (we'll limit ourselves to two), for example "low" and "high". Subjects with low classification can only read low resources. Subjects with high classification may read both high and low resources.

The specification of the write operations is not currently relevant.

So, let's try to specify this in (something like) the Z notation; we need two operations, ReadOk and ReadError. The full read operation, ReadOp, is the combination of the two.

I apologise in advance; this is the best bastardisation of Z that I could work into a blog post without including images.

ReadOk:
  (SubjectClassification? = high) || (classificationOfResource(resourceFromId(resourceId?)) = low)
  Result! = resourceFromId(resourceId?)

This can be read as: given inputs SubjectClassification? and resourceId?, a Result! output, and the predicate (SubjectClassification? = high) || (classificationOfResource(resourceFromId(resourceId?)) = low), the Result! output is the resource.

This embodies the "read down" property of Bell-LaPadula. If a subject has high classification, then they can read anything; otherwise, the resource must have a low classification.

The ReadError operation is similarly defined:

ReadError:
  (SubjectClassification? = low) && (classificationOfResource(resourceFromId(resourceId?)) = high)
  Result! = error

This basically, roughly, very poorly, states that if you're a low classification subject, and you try to read a high classification resource, you'll get an error.

And with that, our specification is done.

This looks really good! A decent specification that we can really work to! Let's implement it!

import java.util.Map;

public class BLPFileSystem {

  public enum Classification { LOW, HIGH }

  private final Map<ResourceId, Resource> resourceMap;

  public BLPFileSystem(final Map<ResourceId, Resource> resourceMap) {
    this.resourceMap = resourceMap;
  }

  public Resource readById(final ResourceId id, final Classification subjectClassification)
      throws IllegalAccessException {
    assert id != null;
    assert resourceMap.containsKey(id);

    final Resource r = resourceMap.get(id);
    final Classification resClass = r.getClassification();

    r.setClassification(Classification.LOW);

    if (subjectClassification == Classification.HIGH) {
      return r;
    } else if (resClass == Classification.LOW) {
      return r;
    } else {
      throw new IllegalAccessException();
    }
  }
}

Also, imagine the rest of the infrastructure is available and correct...

Well, this is obviously broken. If you skipped the code, the offending line is "r.setClassification(Classification.LOW);".

That's right, this method declassifies everything as it goes along!

Interestingly, this completely meets our specification. So if this were automatically verified (engage hand-waving/voodoo), I could push it to production with no hassle.

This isn't just a contrived example; it's a demonstration of a general issue with these sorts of things.

A specification is usually a minimum of what your software must do -- it usually does not declare a maximum. In OpenSSL's case, the software did the minimum that it was supposed to, but it also went above and beyond its specification to work in new and interesting ways; which turned out to be really bad.

Even with our file system, we can add predicates to ensure that the state of the read file is not changed by reading it; but then the state of other files could be modified. A bad service could declassify every other file when reading any file.
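To make that concrete, here is a hedged sketch of encoding such a predicate as a runtime post-condition (the Resource class and its methods here are hypothetical stand-ins, not the earlier example's API): snapshot the classification before the read, and fail loudly if reading changed it.

```java
public class CheckedRead {
    public enum Classification { LOW, HIGH }

    public static class Resource {
        private Classification classification;
        private final String contents;

        public Resource(Classification classification, String contents) {
            this.classification = classification;
            this.contents = contents;
        }

        public Classification getClassification() { return classification; }
        public void setClassification(Classification c) { classification = c; }
        public String read() { return contents; }
    }

    // Post-condition: reading a resource must not alter its classification.
    public static String checkedRead(Resource r) {
        Classification before = r.getClassification();
        String result = r.read();
        if (r.getClassification() != before) {
            throw new IllegalStateException("read changed the resource's classification");
        }
        return result;
    }
}
```

Note that this guards only the file being read; exactly as described above, a bad implementation could still declassify every other file.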

It's not easy to just put "must not" into the spec. Many cryptographic systems must run in adversarial situations, for example in virtual machines sharing hardware with adversaries. These systems must protect their key material despite a threat model in which the adversary can measure the timing, power draw, and cache effects of the system.

In our example, the system can still violate the "spirit" of the specification by leaking information through timing-dependent operations.

In part, this exists because the specification is a "higher level" of abstraction, and the abstraction is not perfect.

Now, that is not to say that we should abandon formal methods. Far from it, these, and related formalisms are the kinds of things which save lives, day in, day out. The overall quality of most projects would be vastly improved if some degree of formal methods had been applied from the start. Doubly so if it's something as unequivocally bad as the OpenSSL source. It's just that, in the face of security, our tools need some refinement.

Monday, 26 January 2015

Software Testing with Rigour

Previously, I've heard a lot about test-driven development (TDD) and why TDD is dead. Gauntlets have been thrown down, holy wars waged, and internet blood spilled.

This has resulted in widespread disaffection with TDD, and in part, some substantial disillusion with testing as a whole. I know that I suffer this; some days, I just want to cast the spear and shield of TestNG and Cobertura aside, and take up the mighty bastard-sword of SPARKAda.

I can sympathise with the feeling that testing is ineffective, that testing is a waste of time, that testing just doesn't work. When I get another bug report that the bit of code I worked so hard to test is "just broken," I want to find the user and shake them until they conform to the code.

Clearly, assaulting end-users is not the way forwards. Unfortunately, tool support for the formal verification of Java code is also lacking in the extreme, so that route is right out too.

My own testing had been... undisciplined. Every class had unit tests, and I even made quite a lot of integration tests. But it seemed lots of bugs were getting through.
It seems to me that there are two things that really need to be said:
  1. Testing is extremely important.
  2. Creating tests up front was, at one point, basically common sense. Myers covers this in his 1979 book, "The Art of Software Testing".
Following in Myers' footsteps, I'd also like to make a big claim: most people who do TDD aren't doing testing right.

TDD is often used as a substitute for program requirements, or program specification. Unfortunately, since the tests are almost always code, when a test fails, how does one really decide in a consistent way if it's the tests or the program that's broken? What if a new feature is to be worked into the code base that, on the surface looks fine, but the test suite shows it to be mutually exclusive with another feature?

Agile purists, take note: a user story can work as a piece of a system's specification or requirements, depending on the level of detail in the story. If you're practising "agile" but you don't have some way of specifying features, you're actually practising the "ad-hoc" software development methodology, and quality will likely fall.

Testing is "done right" when it is done with the intent of showing that the system is defective.

Designing Test Cases

A good test case is one which stands a high chance of exposing a defect in the system.

Unlike Myers, I side with Regehr on this one: randomised testing of a system is a net good, if you have:
  1. A medium or high strength oracle.
  2. The time to tweak your test case generator to your system.
  3. A relatively well-specified system.
If you want to add even more strength to this method, combinatorial test case generation, followed by randomised test case generation looks to be a very powerful methodology.
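As a small illustration of the recipe (the clamp function and its properties are hypothetical examples of my own, not drawn from Regehr or Myers), a randomised test pairs a generator with an oracle made of property checks:

```java
import java.util.Random;

public class RandomisedClampTest {
    // Hypothetical function under test: clamp v into the range [lo, hi].
    static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed, so failures are reproducible
        for (int i = 0; i < 100_000; i++) {
            int lo = rng.nextInt(2_000) - 1_000;
            int hi = lo + rng.nextInt(2_000); // guarantees lo <= hi
            int v = rng.nextInt(20_000) - 10_000;
            int out = clamp(v, lo, hi);

            // The oracle: properties every correct output must satisfy.
            if (out < lo || out > hi)
                throw new AssertionError("output out of range");
            if (lo <= v && v <= hi && out != v)
                throw new AssertionError("in-range input was altered");
        }
    }
}
```

The generator is tweaked to the system (ranges that actually exercise both sides of the boundaries), and the property checks form a medium-to-high strength oracle.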

However, I also feel strongly that time and effort needs to be put into manually designing test cases. Specifically, designing test cases with a high degree of discipline, and the honest intent to break your code.

Myers recommends using boundary value analysis to partition both the input domain and the output range of the system, and designing test cases which exercise those boundaries, as well as representative values from each partition.
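For instance (a toy example of my own, not one of Myers'), boundary value analysis on a validator whose input domain is the range [0, 100] yields a small, disciplined set of cases:

```java
public class BoundaryValueTests {
    // Hypothetical function under test: is p a valid percentage?
    static boolean isValidPercentage(int p) {
        return p >= 0 && p <= 100;
    }

    public static void main(String[] args) {
        // Boundary value analysis on the domain [0, 100]: exercise each
        // boundary, the values just beyond it, and a representative value
        // from inside each partition.
        check(!isValidPercentage(-1));   // just below the lower boundary
        check(isValidPercentage(0));     // lower boundary
        check(isValidPercentage(1));     // just above the lower boundary
        check(isValidPercentage(50));    // representative interior value
        check(isValidPercentage(99));    // just below the upper boundary
        check(isValidPercentage(100));   // upper boundary
        check(!isValidPercentage(101));  // just above the upper boundary
    }

    static void check(boolean cond) {
        if (!cond) throw new AssertionError("boundary test failed");
    }
}
```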

He also discusses designing test cases to push most coverage metrics close to 100%, which struck me as odd; although he tempers this by using boundary value analysis to choose the high-coverage tests. I'm not sure I can recommend that technique, as it destroys coverage metrics as a valid proxy for testing effectiveness, and Myers acknowledges this earlier in the book.

For a really good run-down of how to apply boundary value analysis (and lots of other really interesting techniques!), I can't do any better than referring you to Myers' book.

Testing Oracles

Testing oracles are things which, given some system input and the system's output, decide whether that output is "wrong".

These can be hand-written expectations, or simply waiting for exceptions to appear.

The former is a form of strong oracle, the latter is a very weak oracle. If your system has a high quantity of non-trivial assertions, you've got yourself a medium-strength oracle, and can rely on that to some degree.
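To make the distinction concrete (a toy sketch; the function is a hypothetical stand-in):

```java
public class OracleStrength {
    // Hypothetical function under test.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        // Weak oracle: just run the code and treat "no exception" as a pass.
        // This catches crashes, but never a quietly wrong answer.
        add(2, 2);

        // Strong oracle: a hand-written expectation on the output.
        if (add(2, 2) != 4) {
            throw new AssertionError("expected 4");
        }
    }
}
```

A medium-strength oracle sits between these: the code's own non-trivial assertions fire on many wrong answers, without anyone writing per-case expectations.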

What To Test

Myers and Regehr are in agreement: testing needs to be done at many levels, though this is not as clear in Myers' book.

Unit testing of individual classes is a must, but it is not enough. Components need to be tested with their collaborators to ensure that there are no miscommunications or misunderstandings of a unit's contract.

I guess the best way to put this across is to simply state this: unit testing and integration testing are looking for different classes of errors. It is not valid to have one without the other, and claim that you may have found a reasonable amount of errors, as there are errors which you are simply not looking for.

I, personally, am a big fan of unit testing, then bottom-up integration testing. That is, test all the modules individually, with all collaborators mocked out, then start plugging units together starting at the lowest level, culminating in the completed system.

Other methods may be more effective for you; see Myers' book for more.

This method allows you to look for logic errors in individual units, and when an integration test fails, you have a good idea of what the error is, and where it lies.
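A minimal sketch of the bottom-up approach (all the names here are hypothetical, and I'm hand-rolling the mock rather than using a mocking library):

```java
import java.util.List;

public class UnitVsIntegration {
    // Collaborator contract for the unit under test.
    interface PlantStore {
        List<String> plantsDueToday();
    }

    // Unit under test: formats the reminder message from the store's data.
    static String reminderBody(PlantStore store) {
        List<String> due = store.plantsDueToday();
        if (due.isEmpty()) {
            return "Nothing to water today.";
        }
        return "Water: " + String.join(", ", due);
    }

    public static void main(String[] args) {
        // Unit test: the collaborator is mocked out, so a failure here
        // points at reminderBody's own logic and nothing else.
        PlantStore mock = () -> List.of("fern", "basil");
        if (!"Water: fern, basil".equals(reminderBody(mock))) {
            throw new AssertionError("unit test failed");
        }

        // Bottom-up integration testing would then replace the mock with
        // the real PlantStore, looking for contract misunderstandings that
        // the unit tests, by design, cannot see.
    }
}
```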

How to Measure Testing Effectiveness

A test is effective if it has a high probability of finding errors. Measuring this is obviously very hard. One thing that you may need to do is work out an estimate of how buggy your code is to begin with. Myers has a run-down of some useful techniques for this.

Coverage is a reasonable proxy -- if and only if you have not produced test cases designed to maximise a coverage metric.

It would also seem that most coverage tools only measure a couple of very weak coverage metrics: statement coverage and branch coverage. I would like to see common coverage tools start offering condition coverage, multi-condition coverage, and so on.

When to Stop Testing

When the rate at which you're finding defects becomes sufficiently low.

This is actually very hard, especially in an agile or TDD house, where tests are being run constantly, defects are patched relatively quickly with little monitoring, and all parts of development are expected to serve as a testing phase.

If your methodology has a testing phase, and at the end of your (for example) four-week testing window you're finding more and more defects every week, don't stop. Only stop when the defect-detection rate has dropped to an acceptable level.

If your methodology doesn't have a testing phase, this is a much harder question. You have to rely on other proxy methods of whether your testing is effective, and if you've discovered most of the defects your end users are likely to see. Good luck.

I, unfortunately, am in the latter category. I just test as effectively as I can as I go along, and hope that my personal biases aren't sinking me too badly.


Do testing with the intent of breaking your code, otherwise you're not testing -- you're just stroking your ego.

If possible, get a copy of Myers' book and have a read. The edition I have access to is quite slim, coming in at around 160 pages. I think that if you're serious about having a highly effective test suite, you need to read this book.

Regehr's Udacity course, "Software Testing", is also worth a plug, as he turns out to be capable of both effective systems testing and effective teaching; a rare combination. Take advantage of it. The course also provides a nice, more modern view of many of Myers' techniques. His blog is also pretty darn good.