CryptoCoding for Fun

Introduction

Inevitably, when somebody with more than a passing interest in programming develops an interest in crypto, there is an overwhelming urge to write cryptocode. Sometimes it is just the desire to implement something documented in a textbook or website. Sometimes it is the desire to implement a personal crypto design. And sometimes it is because there is a design need for crypto with no apparent ready solutions. All of these are valid desires / needs; however, there are a few (well recognized) principles of cryptocoding that need to be articulated.

Design your own Crypto

The first of these is sometimes referred to as Schneier’s Law: “any person can invent a security system so clever that she or he can’t think of how to break it.” This is not to say that no one person can design a good crypto algorithm, but that really good crypto algorithms are written by groups of people, and validated by much larger groups of well educated and intelligent people. In simple terms, good crypto is conceived by people who are well versed in the crypto state of the art, and iteratively built on that. The concept is then successively attacked, and refined by an increasingly larger audience of crypto experts. At some point the product may be deemed sufficient, not because it has been proven to be ‘completely secure’, but because a sufficiently capable group of people have determined that it is ‘secure enough’. Bottom line – There is nothing fundamentally wrong with designing your own crypto algorithm, but without years of education and experience as a cryptanalyst, it is unlikely to be anywhere near as good as current crypto algorithms. In addition, good crypto is the product of teams and groups who try to attack and compromise it in order to improve it.

Writing your own Crypto

The second of these is sometimes referred to as “never roll your own crypto”. Even if we are implementing well documented, well defined, best practice algorithms, writing crypto code is fraught with risks that do not exist for general coding. These risks are based on the types of attacks that have been successful in the past. For example, timing attacks have been used to map out paths through the crypto code, buffer attacks have been used to extract keys from the crypto code, and weak entropy / key generation has been exploited to break otherwise sound algorithms. Of course this is primarily applicable in a production context, or anytime the cryptocode is used to protect something of value. If this is being done for demonstration or educational purposes, these risks can often be recognized and ignored (unless dealing with those risks is part of the lesson goals). Bottom line – writing crypto code is hard, and writing high quality, secure / interoperable crypto code is much harder.
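
As a concrete illustration of the timing attack risk mentioned above, consider comparing a received authentication tag against the expected one. The following is a minimal Python sketch (the function names are mine, purely illustrative): the naive version leaks how many leading bytes matched through its running time, while the standard library call compares in constant time.

    import hmac

    def naive_equal(a: bytes, b: bytes) -> bool:
        if len(a) != len(b):
            return False
        for x, y in zip(a, b):
            if x != y:
                return False  # early exit: runtime reveals the matching prefix
        return True

    def constant_time_equal(a: bytes, b: bytes) -> bool:
        # hmac.compare_digest() takes time independent of where the bytes differ
        return hmac.compare_digest(a, b)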

Kerckhoffs’s Principle

Kerckhoffs’s Principle states “A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.” Essentially this says that in a good cryptosystem, all parts of the system can be public knowledge (except the key) without impairing the effectiveness of the cryptosystem. Another perspective on this is that any cryptosystem that relies on maintaining the secrecy of the algorithm is less secure than one that only relies on a secret key.

Bruce Schneier stated “Kerckhoffs’s principle applies beyond codes and ciphers to security systems in general: every secret creates a potential failure point. Secrecy, in other words, is a prime cause of brittleness—and therefore something likely to make a system prone to catastrophic collapse. Conversely, openness provides ductility.”

When Kerckhoffs’s principle is paired with Schneier’s Law, which states (paraphrased) that more eyes make better crypto, the result is that better crypto comes from public and open development. Fortunately, that is the approach used for most modern crypto – a very fortunate circumstance for the aspiring cryptocoder / cryptographer, since it allows us to learn from the best cryptosystems available.

If you are interested in the best crypto documentation that the United States Government publishes, review the FIPS and NIST Special Publications listed in the references below. For example, if you are interested in the best way to generate / test public-private keys, look at FIPS 186-4. The quality of the pseudorandom numbers used in crypto is critical to security, and if you are interested in how this is done, look at NIST Special Publications 800-90A (revision 1), 800-90B, and 800-90C.
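
To connect that guidance back to practice: rather than implementing FIPS 186-4 key generation by hand, a cryptocoder would normally lean on a vetted library. A minimal sketch using the pyca/cryptography package (one reasonable choice among several – my pick for illustration, not an endorsement from the standards):

    from cryptography.hazmat.primitives.asymmetric import rsa

    # Prime generation, primality testing, and parameter checks are all
    # delegated to the library rather than hand-rolled.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
    public_key = private_key.public_key()
    print(public_key.public_numbers().n.bit_length())  # 3072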

Bottom Line

For years I have been espousing the principle of “never write your own crypto” because it was clear that crypto is hard, and high quality cryptocode is only written by teams of well qualified cryptocoders. With the Snowden revelations of the last few years, it has also become obvious to me that “never write your own crypto” is exactly what a state player like the NSA would encourage in order to more easily access communications through shared vulnerabilities in cryptosuites like OpenSSL. As a result I have modified my recommendation: in order to be a better cryptocoder / cryptographer, you should write your own cryptocode and develop your own cryptosystems (for learning purposes), but you should only use well qualified cryptocode in production or critical systems.

It is critically important that every systems engineer and cryptocoder develop an in depth understanding of crypto algorithms and cryptosystems, and the most effective method to accomplish this is by writing, testing and evaluating cryptocode / cryptosystems.

So my stronger recommendation is that everybody who can program and who has an interest in crypto should write cryptocode, read cryptocode, and design cryptosystems – and from that we will have a much stronger foundation of understanding of Security Systems Engineering.

References

https://en.wikipedia.org/wiki/Kerckhoffs%27s_principle

https://www.schneier.com/blog/archives/2011/04/schneiers_law.html

http://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own

http://csrc.nist.gov/publications/PubsSPs.html

http://www.nist.gov/itl/fipscurrent.cfm

http://www.libressl.org/

https://www.openssl.org/


System Security Testing and Python

Overview

A significant part of systems security is testing, and this presents a real challenge for most systems security engineers. Whether it is pen testing, forensic analysis, fuzz testing, or network testing, there can be infinite variations of the System Under Test (SUT) when combined with the necessary testing variations.

The challenge is to develop an approach to this testing that provides the necessary flexibility without imposing an undue burden on the systems security engineer. Traditionally the options were packaged security tools, which provide an easy to use interface for pre-configured tests, or tests written in some high level language, which provide a high degree of flexibility at a relatively high level of effort / learning curve. The downsides of the packaged tools are lack of flexibility and cost.

An approach which has become increasingly popular is to take either one (or both) of these approaches and improve it through the use of Python.

For Example

A few examples of books that take this approach include:

  • Gray Hat Python – ISBN 978-1593271923
  • Black Hat Python – ISBN 978-1593275907
  • Violent Python – ISBN 978-1597499576
  • Python Penetration Testing Essentials – ISBN 978-1784398583
  • Python Web Penetration Testing Cookbook – ISBN 978-1784392932
  • Hacking Secret Ciphers with Python – ISBN 978-1482614374
  • Python Forensics – ISBN 978-0124186767

In general, these take the approach of custom code based on generic application templates, or scripted interfaces to security applications.

Conclusions

After skimming a few of these books and some of the code samples, it has become obvious that Python has an interesting set of characteristics that make it a better language for systems work (including systems security software) than any other language I am aware of.

Over the last few decades, I have learned and programmed in a number of languages including Basic, Fortran 77, Forth, C, Assembly, Pascal (and Delphi), and Java. Through all of these languages I have come to accept that each one had a set of strengths and issues. For example, Basic was basic. It provided a very rudimentary set of language features and limited libraries, which meant there were often very few ways to do anything (and sometimes none). It was interpreted, so it was slow (way back when). It was not scalable, which encouraged small programs, and it was fairly easy to read. The net result is that Basic was used as a teaching language, suitable for small demonstration programs – and it fit that role reasonably well.

On the other hand, Java (and other strongly typed languages) are by nature painful to write in due to that strongly typed nature, but it also makes syntax errors less likely (after tracking down all of the missing semi-colons, matching braces, and type mismatches). Unfortunately, syntax errors are usually the much simpler class of problems in a program.

Another attribute of Java (and other OO languages) is the object oriented capabilities – which really do provide advantages for upward scalability and parallel development, but make imperative (procedural) development very difficult. Yes – everything can be an object, but that does not mean it is the most effective way to do everything.

Given that background, I spent a week (about 20 hours of it) reading books and writing code in Python. In that time I went from “hello world.py” to a program with multiple classes that collected metrics for each file in a file system, placed that data in a dictionary of objects, and wrote it out to / retrieved it from a file – in about 100 lines of code (a sketch of that exercise follows the list below). My overall assessment:

  • The class / OO implementation is powerful, and sufficiently ‘weakly typed’ that it is easily usable.
  • The dictionary functionality is very easy to use, performs well, is massively flexible, and becomes the go-to place to put data.
  • The included standard libraries are large and comprehensive, and even these are dwarfed by the massive, high quality community developed libraries.
  • Overall – In one week, Python has become my default language of choice for Systems Security Engineering.
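
For reference, here is a minimal sketch of that learning exercise – the names and structure are mine, reconstructed for illustration rather than recovered from the original program:

    import json
    import os

    class FileMetrics:
        """Metrics for a single file."""
        def __init__(self, path):
            st = os.stat(path)
            self.size = st.st_size
            self.modified = st.st_mtime

    def scan(root):
        metrics = {}  # the dictionary as the go-to place to put data
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    metrics[path] = FileMetrics(path)
                except OSError:
                    pass  # skip files that disappear or are unreadable
        return metrics

    def save(metrics, outfile):
        with open(outfile, "w") as f:
            json.dump({p: vars(m) for p, m in metrics.items()}, f)

    def load(outfile):
        with open(outfile) as f:
            return json.load(f)

    if __name__ == "__main__":
        save(scan("."), "metrics.json")
        print(len(load("metrics.json")), "files indexed")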

Postscript

Also of note, I looked at numerous books on Python and discovered that:

  • There are a massive number of books purportedly for learning Python.
  • They are also nearly universally low value, with a few exceptions.

My criterion for this low value assessment is the number of “me-too” chapters. For example, every book I looked at for learning Python has at least one chapter on:

  • finding python and installing it
  • interactive mode of the Python interpreter
  • basic string functions
  • advanced string functions
  • type conversions
  • control flow
  • exceptions
  • modules, functions
  • lists, tuples, sets and dicts

In addition, each of these sections provides a basic level of coverage, and is virtually indistinguishable from the corresponding chapter in dozens of other books. Secondary to that, there was usually only minimal or basic coverage of dicts, OO capabilities, and module capabilities. I wasted a lot of time looking for something that provided a more terse coverage of the basic concepts and a more complete coverage of the more advanced features of Python. My recommendation to authors of computer programming books: if your unique content is much less than half of your total content, don’t publish.

From this effort I can recommend the following books:

  • The Quick Python Book (ISBN 978-1935182207): If you skip the very basic parts, there is a decent level of useful Python content for the experienced programmer.
  • Introducing Python (ISBN 978-1449359362): Very similar to the Quick Python book, with some unique content.
  • Python Pocket Reference (ISBN 978-1449357016): Simply a must have for any language. If O’Reilly has one, you should have it.
  • Learning Python (ISBN 978-1449355739): A 1500 page book that surprised me. It does have the basic “me-too” chapters, but has a number of massive sections not found in any other Python book. Specifically, Functions and Generators (200 pages), Modules in depth (120 pages), Classes and OOP in depth (300 pages), Exceptions in depth (80 pages), and 250 pages of other Advanced topics. Overall it provides the content of at least three other books on Python, in a coherent package.

Note – Although I could have provided Amazon links for each of these books (every one of them is available at Amazon), my purpose is to provide some information on these books as resources (not to promote Amazon). I buy many books directly from O’Reilly (they often have half off sales), Amazon, and Packt.


IoT and Stuff – Cautionary Tales

Overview

IoT (Internet of Things) is an interesting phenomenon where “things” become connected and provide some control and / or sensor capability through this connection. Examples include connected thermostats, weather stations, garage door openers, smart door locks, etc.

It is an area of explosive growth, and like any other system it will have its security failures.

Tale 1 – Hacking Internet Connected Light Bulbs

LIFX lightbulbs are smart LED lights with two wireless interfaces: a Wi-Fi interface to connect to the local network and provide a control path for computers / smartphones, and an IEEE 802.15.4 mesh network to communicate between multiple LIFX smartlights. This dual wireless interface meant that any number of LIFX smartlights could be controlled and managed through a single Wi-Fi connection. Since any of the LIFX smartlights could operate as the “master” device that connected to both networks, it was necessary for each smartlight to have the Wi-Fi network access credentials.

Vulnerability

The vulnerability involved several aspects of the design. These include:

  1. When an additional LIFX smartlight was added to the network, it exchanged data over the IEEE 802.15.4 network in the clear (unencrypted), except for the Wi-Fi credentials and some configuration details, which were sent as an encrypted blob.
  2. The encryption key for this blob was a pre-shared key hard coded into the firmware for every LIFX smartlight (of that firmware revision). This key was accessible via JTAG (which was pinned out on the PCB) or through the firmware image (which was not available at the time of the compromise).
  3. The system allowed a client on the IEEE 802.15.4 network to request (and receive) this encrypted configuration / credentials blob at any time in the background.

Compromise

The compromise allows an attacker physically close to the system to:

  1. Acquire the LIFX pre-shared encryption key from the firmware or JTAG interface.
  2. On the IEEE 802.15.4 network, request the encrypted configuration / credentials blob (masquerading as a LIFX smartlight).
  3. Crack open the blob using the encryption key from step 1.
  4. Connect to the Wi-Fi network using the credentials from the blob.
  5. Access the network and / or control the LIFX light bulbs.

Assessment

From this, there are at least a few poor design choices that enabled this compromise. The first of these was using a static pre-shared key to encrypt sensitive wireless data. The ability to establish a secure channel based on public key cryptography has been standard practice for decades, allowing the use of dynamically generated keys at a session level. The use of a static pre-shared key is just lazy design.
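
To make that concrete, here is a minimal sketch of session-level key agreement (X25519 plus HKDF) using the pyca/cryptography package. This is one illustrative way to get dynamic session keys – the library choice and names are mine, and this is not the actual LIFX protocol:

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    # Each device generates an ephemeral key pair per pairing session.
    bulb_priv = X25519PrivateKey.generate()
    controller_priv = X25519PrivateKey.generate()

    # Only public keys cross the (untrusted) mesh network.
    bulb_shared = bulb_priv.exchange(controller_priv.public_key())
    controller_shared = controller_priv.exchange(bulb_priv.public_key())
    assert bulb_shared == controller_shared  # both sides derive the same secret

    # Derive a fresh session key; nothing long-lived is baked into firmware.
    session_key = HKDF(
        algorithm=hashes.SHA256(), length=32, salt=None, info=b"pairing-v1",
    ).derive(bulb_shared)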

The second of these was the ability to request the encrypted credential blob silently. For the initial configuration of an additional smartlight on the network, it is reasonable to require user confirmation before sharing the data with the additional smartlight. A background request for this data should either require user confirmation or simply be rejected when it is not part of a new bulb configuration.

Although having the JTAG port pinned out may seem to be a poor design choice, it does not really add significant risk. JTAG availability on the device pins would have been more than sufficient for a physical hacker, and that is assuming the same data would not have been available in a firmware download. A JTAG port does not present a significant risk if keys are managed securely, and the security architecture takes this exposure into account.

Tale 2 – Smart Home Denial of Service

Vulnerability

The vulnerability in this story is that the smart home in question had connected all of the smart devices in the house through a common Ethernet infrastructure – effectively rendering every device a node on a flat network. This flat network meant that any one device could saturate the network with packets, effectively breaking the network. It also meant that any one device could monitor every packet on the network, or selectively disrupt packets. Essentially, the security of this flat network could be compromised in multiple ways by any device on the network, and the overall security of the network was only as good as the weakest device.

Compromise

This particular compromise was a smartlight beaconing on the network, acting as a denial of service attack. This event was not malicious, but if we consider the triad of confidentiality, integrity and availability, it is still a security failure. A self induced denial of service is still a denial of service.

Assessment

The smart home described in this article makes me uncomfortable as a systems engineer. The designer indicated that he had not installed his smart door locks, since he did not want to be locked out / in by them. The designer also indicated that the light bulb denial of service rendered all the smart devices in his house broken / unavailable.

This bothers me for a couple of reasons. The first of these is that it is possible to segment the network so that a failure does not propagate through the entire network – effectively setting up security domains on functional boundaries. Even a trivial level of peering management would provide some level of isolation without giving up the necessary control protocols.

The second part that I find bothersome is that it appears the entire system was designed with a single centralized control mechanism / scheme. Given the relatively poor reliability of networked systems as compared with traditional home lighting / appliance controls, it makes sense to install a parallel control scheme based on a local (more reliable) control path that operates much closer to the device being controlled.

In summary – the architecture of this particular smart home implementation is brittle, in that a single device failure can precipitate an entire system failure. In addition, it is fragile in that the control scheme depends on a number of disparate sequential operations, providing a multitude of single points of failure for every device. Lastly, the system is not robust in that there is no alternate control scheme. In my opinion this smart home may be an interesting experiment, but it is a weak systems design with lots of architectural / system flaws.

Tale 3 – ThingBots

This is not a cautionary tale about a specific device or attack, but a cautionary tale about embedded devices in general, and by inclusion – IoT devices. Back in the last week of 2013 / first week of 2014, Proofpoint gathered some data from a number of botnets sending out spam. Specifically, they identified the unique IP addresses in the botnets, characterized them forensically, and found that roughly a quarter of the zombie machines were not traditional PCs, but things like DVRs, security cameras, home routers, and at least one refrigerator. From this they coined the term ‘ThingBot’ – a botnet zombie based on some ‘thing’.

The message is that when it comes to compromise and attack, there are no devices that will not be attacked, and there is no point at which your device is not a target for a botnet. Harden all embedded devices and design defensively.

Bottom Line

The messages in these three tales are diverse, but can be summarized as follows:

  1. Every connected device is a target. Simply being a connected device is sufficient.
  2. Key management may be mundane, but it is even more critical on devices, where the only interface is often the network.
  3. Most importantly – system design matters. Most security issues occur at the integration interfaces between components of one type or another – and good system design reduces that exposure.


IoT and Stuff – The Evolution

Overview

This is the first of several posts I expect to do on IoT, including systems design, authentication, standards, and security domains. This particular post is an IoT backgrounder from my subjective viewpoint.

Introduction

The Internet of Things (IoT) is a phenomenon that is difficult to define and difficult to scope. The reason it is difficult to define is that it is rapidly evolving, and any definition is currently based on the foundational capabilities IoT implementations provide.

Leaving the marketing hyperbole behind, IoT is the integration of ‘things’ into what we commonly refer to as the Internet. Things are anything that can support sensors and / or controls, an RF network interface, and most importantly – a CPU. This enables ubiquitous control of / visibility into something physical on the network (that wasn’t on the network before).

IoT is currently undergoing a massive level of expansion. It is a chaotic expansion without any real top down or structured planning. This expansion is (for the most part) not driven by need, but by opportunity and the convergence of many different technologies.

Software Development Background

In this section, I am going to attempt to draw a parallel to IoT from the recent history of software development. Back at the start of the PC era (the 80s), software development carried with it high costs for compilers, linkers, test tools, packagers, etc. This pricing approach was inherited from the mainframe / centralized computer system era, where these tools were purchased and licensed by “the company”. The cost of an IBM Fortran compiler and linker for the PC in the mid 80s was over $700, and libraries were $200 each (if memory serves me). In addition, the coding options were very static and very limited. Fortran, Cobol, C, Pascal, Basic and Assembly represented the vast majority of programming options. In addition (and this really surprised me at the time), if you sold a commercial software package that was compiled with the IBM compiler, you were required to purchase a distribution license from IBM that was priced based on the number of units sold. Collectively, these were significant barriers to any individual who wanted to even learn how to code.

This can be contrasted with the current software development environment where there is a massive proliferation of languages and most of them available as open source. The only real limitations or barriers to coding are personal ability, and time. There have been many events that have led to this current state, but (IMO) there were two key events that played a significant part in this. The first of these was the development of Borland Turbo Pascal in 1983, which retailed for $49.99, with unlimited distribution rights for an additional $99.99 for any software produced by the compiler. Yes I bought a copy (v2), and later I bought Turbo Assembler, Delphi 1.0, and 3.0. This was the first real opportunity for an individual to learn a new computer language (or to program at all) at an approachable cost without pirating it.

To reiterate, incumbent software development products were all based on a mainframe market – mainframe enterprise prices and licensing, clumsy workflows and interfaces, copy protection or security dongles. Borland’s Turbo Pascal integrated the editor, compiler and linker into an IDE – an innovative concept at the time. It also had no copy protection and a very liberal license agreement referred to as the Book License. It was the first software development product targeted at end users in a PC type market, rather than the enterprise that employed the end user.

The second major event that brought about the end of expensive software development tools was the GNU Compiler Collection (GCC) in 1987, with stable releases by 1991. Since then, GCC has become the default compiler engine for nearly all code development, enabling an explosion of languages, developers and open source software. It is the build engine that drives open source development.

In summary, by eliminating the barriers to software development (over the last 3 decades), software development has exploded and proliferated to a degree not even imagined when the PC was introduced.

IoT Convergence

In a manner very analogous to software development over the last 3 decades, IoT is being driven by a similar revolution in hardware development, hardware production, and software tools. One of the most significant elements of this explosion is the proliferation of System on a Chip (SoC) microprocessors. As recently as a decade ago (maybe a bit longer), the simplest practical microprocessor required a significant number of external support functions, which have now been integrated onto a single piece of silicon. Today, there are microprocessors with various combinations of integrated UARTs, USB OTG ports, SDIO, I2C, persistent flash, RAM, power management, GPIO, ADC and DAC converters, LCD drivers, self-clocking oscillators, and a real time clock – all for a dollar or two.

A secondary aspect of the hardware development costs is the result of the open source hardware (OSH) movement, which has produced very low cost development kits. In the not so distant past, the going cost for a microprocessor development kit was about $500; that market has been decimated by Arduino, Raspberry Pi, and dozens of other similar products.

Another element of the IoT convergence comes from the open source software / hardware movement. All of the new low cost hardware development kits are based on some form of open source software packages. PCB CAD tools like KiCad enable low cost PCB design. Services like OSH Park enable low cost PCB prototypes and builds without lot charges or minimum panel charges.

A third facet of the hardware costs is based on the availability and lower cost of data link radios for use with microprocessors. Cellular, Wi-Fi, 802.15.4, Zigbee, Bluetooth and Bluetooth LE all provide various tradeoffs of cost, performance, and ease of use – but all of them have devices and development kits that are an order of magnitude lower in cost than a decade ago.

The bottom line is that IoT is not being driven by end use cases, or by any one group, special interest or industry consortium. It is being driven by the convergent capabilities of lower cost hardware, lower cost development tools, more capable hardware / software, and the opportunity to apply them to whatever “thing” anybody is so inclined. This makes it really impossible to determine what IoT will look like as it evolves, and it also makes efforts by various companies to get in front of or “own” IoT seem unlikely to succeed. The best these efforts are likely to achieve is to dominate or drive some segment of IoT by virtue of the value they contribute to it. Overall, these broad driving forces and the organic nature of IoT growth mean it is also very unlikely that it can be dominated or controlled, so my advice is to try to keep up and not get overwhelmed.

Personally, I am pretty excited about it.

PS – Interesting Note: Richard Stallman may be better known for his free software advocacy and the never-quite-finished Hurd kernel (based on Mach), but he was the driving developer behind GCC and Emacs, and GCC is probably as important as the Linux kernel in the foundation and success of the Linux OS and the open source software movement.


Hiatus / Status Report

Apologies to all for being remiss in updates to my blog. Over the last 3 months we sold our house of 17 years, and I took a job in downtown Minneapolis (formerly Spectrum Design Services). We should be closing on a house soon, but for the time being we are homeless (but not indigent) and imposing on relatives.

This presents a number of challenges to the time available to do projects / investigations and to document them. First, the demands of starting a new job, cleaning / fixing a house to sell, and tortured commutes (with each of us spending the weeks living in different states) leave me little spare time. Second, the limitations imposed by being homeless provide limited maintainable workspace for projects. Third, the cumulative stress of changing so much of your life is exhausting. Lastly, when our house closes and we move in, there will be a whole new set of things that will necessarily consume said spare time.

But it is all good. The job provides complexity and challenges and opportunities to really build the mental muscles. Boredom can kill you just as effectively as a heart attack – it just takes a lot longer (and it is a lot more tortured). The house is new, with a backyard over a wooded ravine with a creek running through it, and it is in River Falls – a really nice country/college town just beyond the sprawling metropolis of the Minneapolis-St Paul metro area.

To reiterate, it has been a while since I posted something interesting to the blog, and I anticipate that it will be a while (going forward) before I have additional posts. However, I have been considering some different topics for when I get back. Perhaps something on hardware hacking – Arduino, Raspberry Pi or BeagleBone Black – the challenge there is to find an interesting topic that hasn’t been done a multitude of times. If I can find a topic that I find interesting and that has not been beaten to death, that will be on the list. Another topic is to look at some other programming development options on the Chromebook. Part of me is really curious if / how well PyCharm CE runs under Crouton. I also never got around to doing a Dart test project. Then I think I need to get back and refresh my Android Systems Engineering posts – which are getting very stale.

In any case – this blog has not been abandoned, just neglected. I will be back.


A Brief Introduction to Security Engineering

Background

One of the great myths is that security is complicated, hard to understand, and must be opaque to be effective. This is mostly fiction, perpetrated by people who would rather you did not question the security theater they are creating in lieu of real security, by security practitioners who don’t really understand what they are doing, or lastly by those who are trying to accomplish something in their own interests under the false flag of security. This last one is why so many government “security” activities are not really about security, but about control – which is not the same. Designing and doing security can be complex, but understanding security is much easier than it is generally portrayed.

Disclaimer – This is not a comprehensive or exhaustive list / analysis. It is a brief introduction that touches on a few of the most practical elements of security engineering.

Security Axioms

Anytime I look at systems security, there are a few axioms I use to set the context, limit the scope and measure the effectiveness. These are:

  1. Perfect security is unachievable, and any practical security is the result of some cost driven tradeoff.
  2. Defining and understanding your threat model is step zero of any security solution. If you don’t know who you are defending against, the solution will not fit.
  3. Defining and understanding success. This means understanding what you are trying to protect and what exactly protecting those elements means.
  4. Defending a system is more costly / difficult than attacking that same system. Attackers only need to be successful once, but defenders need to be successful every time.
  5. Security based on secrecy is weaker than security based on strength. Closed security solutions are more likely to contain flaws that weaken the security than open security solutions. Yes – this has been validated.

The first of these is a recognition that security is about a conflict between a system / information defender and an attacker of that system. Somebody is trying to take something of yours and you want to stop them. Each of these two parties can use different approaches and tools to do this, with increasing costs – where costs are monetary, time, resources, or risks of being caught / punished. This first axiom simply states that if an attacker has infinite time, money, resources, and zero risk, your system will be compromised because you are outgunned. For less enabled attackers, the most cost effective security is that which is just enough to discourage them so they move on to an easier target. This of course leads to understanding your attacker, and the next axiom – know your threat.

The second axiom states that any security solution is designed to protect from a certain type of threat. Defining and understanding the threats you are defending against is foundational to security design, since it will drive every aspect of the system. A security system to keep your siblings, parents, or children out of your personal data is completely different than one designed to keep cyber extortionists out of your Internet accounts.

The third axiom is based on the premise that most of what your systems are doing requires minimal protection (depending on the threat model), but some parts require significant protection. For example – my Internet browsing history is not that important compared with my password and account access file. I have strong controls on my passwords and account access (e.g. KeePass), and my browsing history is behind a system password. Another way to look at this is to imagine what the impact could be if a given element were compromised – that should guide the level of protection for that item.

The fourth axiom is based on the premise that the defender must successfully defend every vulnerability in order to be successful, but the attacker only has to be successful with one vulnerability, one time. This is also why complex systems are more prone to compromise – greater complexity leads to more vulnerabilities (since there are more places for gremlins to hide).

The fifth is perhaps the least obvious axiom of this list. Simply put, the strength of a security control should not be based on its design being secret. Encryption protocols are probably the best example of how this works. Most encryption protocols of the last few decades were developed and publicized within the peer community. Invariably, weaknesses are found and corrected, improving the quality of the protocol and reducing the risk of an inherent vulnerability. These algorithms and protocols are published and well known, enabling interoperability and third party validation, which reduces the risk of vulnerabilities due to implementation flaws. In application, the security of the encryption is based solely on the keys used by the users. The favorite counter example is from the world of traditional pin tumbler locks, in which locksmith guilds attempted to keep their designs / architectures secret for centuries, and laws were passed making it a crime to possess lock picks or to know how to pick a lock unless you were a locksmith. Unfortunately, these laws did little to impede criminals, and it became an arms race between lock makers, locksmiths and criminals, with the users of locks being kept fairly clueless. Clearly, of the lock choices available to a user, some locks were better, some were worse, and some were nearly useless – and this secrecy model of security meant that users did not have the information to make that judgment call (and in general they still don’t). The takeaway – if security requires that the design / architecture of the system be kept secret, it is probably not very good security.

Threat Models

In the world of Internet security and information privacy, there are only a few types of threat models that matter. This is not because there are only a few threats, but because the methods of attack and the methods of defense are common across them. Generally it is safe to ignore threat distinctions that don’t affect how the system is secured. This list includes:

  1. Immediate family / Friends / Acquaintances – Essentially people who know you well and have some degree of physical access to you or the system you are protecting.
  2. Proximal Threats – Threats you do not know, but who are physically / geographically close to you and the system you are protecting.
  3. Cyber Extortionists – A broad category of cyber attackers whose intent is to profit by attacking and compromising your information. This group generally targets individuals, but not a specific individual – they look for easy targets.
  4. Service Compromise – Threats who attack large holders of user information – ideally credit card information. This group is looking for bulk information and is not targeting individuals directly.
  5. Advanced Persistent Threats (APTs) – Well equipped, well resourced, highly capable and persistent. These attackers are generally supported by governments or large businesses, and their targets are usually equally large. This group plans and coordinates their attacks with a specific purpose.
  6. Government (NSA / CIA / FBI / DOJ / DHS / etc) – Currently the biggest, baddest threat. They have the most advanced technical resources, the most money, and they use National Security Letters when those are not enough. They collect data in bulk, and they target individuals.

From a personal security perspective, we are looking at the threats most likely to concern any random user of Internet services – you. In that context, we can dismiss a couple of these quickly. Let’s do this in reverse order:

Government (NSA et al) – If they are targeting you specifically, and you use Internet services, you are in need of more help than I can provide in this article. If your data is part of some massive bulk data collection, there is very little you can do about that either. So in either case, in the context of personal data security for Joe Internet User – don’t worry about it.

Advanced Persistent Threats (APTs) – Once again, much like the NSA, it is unlikely you would be targeted specifically, and if you are, your needs are beyond the scope of this article. So – although you may be concerned about this threat, there is very little you can do to stop it.

Service Compromise – I personally pay all of my bills online, and every one of these services wants to store my credit card in their database. Now the question you have to ask is: if (for example) the Verizon customer database is compromised and somebody steals all of that credit card information (with tens of millions of card numbers) and uses it to make hundreds of millions of dollars in charges – is Verizon (or any company in that position) going to take full responsibility? Highly unlikely – and that is why I do not store my credit information on their systems. If they are not likely to accept responsibility for the outcome, should you trust them with your credit?

Cyber Extortionists – The most interesting and creative of all these threat classes. I continue to be amazed at every new exploit I hear about. Examples include mobile apps that covertly call money transfer numbers (e.g. 1-900 numbers in the US), or apps that covertly buy other apps. Much like the salami slicing attacks (made famous in the movie Office Space), each individual attack represents a very small financial gain, but the hope is that collectively they can represent significant money.

Proximal Threats – If somebody can physically take your laptop, tablet, or phone, they have a really good shot at all of the information on that device. Many years ago, I had an iPhone stolen from me on the Washington DC metro. I had not enabled the screen lock, and I had the social security numbers / birthdays of my entire family in my contacts. And yes, there were false attempts to get credit based on this information within hours – unsuccessfully. I now use (and recommend everybody use) some device access lock, and encrypt very sensitive information in some form of locker. Passwords / accounts and social security numbers go in KeePass, and sensitive file storage in TrueCrypt. These apps are free and provide significant protection, just in case. Remember – physical control of / access to a device is its own special type of attack.

Friends / Family / Acquaintances – In most cases, the level of security needed to protect from this class of threat is small. More importantly, it is crucial to understand what it is you are trying to protect, why you are protecting it, and what your recovery options are. To repeat – what are your recovery options? It is very easy to secure your information and then forget the password / passphrase or corrupt your keyfile. Compromise of private data in this context is orders of magnitude less likely than you locking yourself out of your data – permanently. Yes, I have done this, and family photos on a locked TrueCrypt partition cannot be recovered in your lifetime. So when you look at security controls to protect from this threat model, look for built in recovery capabilities and only protect what is necessary to protect.

Conclusions

Fundamentally security engineering is about understanding what you are trying to protect, who / what your threat is, and determining what controls to use to impede the threat while not impeding proper function. Understanding your threat is the first and most important part of that process.

Lastly – I would encourage everybody who finds this the least bit interesting to read Bruce Schneier’s blog and his books. He provides a very approachable and coherent perspective on IT security / Security Engineering.


Software: Thoughts on Reliability and Randomness

Overview

Software reliability and randomness are slippery concepts that may be conceptually easy to understand, but hard to pin down. As programmers, we can write the equivalent of ‘hello world’ in dozens of languages on hundreds of platforms, and once the program is functioning – it is reliable. It will produce the same results every time it is executed. Yet systems built from thousands of modules and millions of lines of code function less consistently than our hello world programs – and are functionally less reliable.

As programmers, we often look for a source of randomness in our programs, and it is hard to find. Fundamentally, we see computers as deterministic systems without any inherent entropy (for our purposes – randomness). For lack of true random numbers, we generate Pseudo Random Numbers (PRNs), which are not really random. They are used in generating simulations and in generating session keys for secure connections, and this lack of true randomness in computer generated PRNs has been the source of numerous security vulnerabilities.
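
A quick sketch of that point: a seeded PRNG is completely reproducible, which is exactly why it is unsuitable for keys. (The choice of Python’s random and secrets modules here is mine, for illustration.)

    import random
    import secrets

    # The same seed always yields the same "random" sequence.
    random.seed(42)
    first = [random.randint(0, 99) for _ in range(5)]
    random.seed(42)
    second = [random.randint(0, 99) for _ in range(5)]
    assert first == second  # identical: pseudo-random, not random

    # Session keys should come from the OS entropy pool instead.
    print(secrets.token_hex(16))  # not reproducible from any seed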

In this post I am going to discuss how software can be “unreliable”, what deterministic behavior means, parallel systems / programming, how modern computer programs / systems can be non-deterministic (random), and how that is connected to software reliability.

Disclaimer

The topics of software reliability, deterministic behavior, and randomness in computers comprise a field that is massively deep and complex. The discussions in this blog are high level and lightweight, and I make some broad generalizations and assertions that are mostly correct (if you don’t look too closely) – but hopefully they still serve to illustrate the discussion.

I also apologize in advance for this incredibly dry and abstract post.

Software Reliability

Hardware reliability – more precisely, hardware failure – most often occurs when some device in a system breaks (the smoke comes out), and the system no longer functions as expected. Software failures do not involve broken hardware or devices. Software failures are based on the concept that there are a semi-infinite number of paths (or states) through a complex software package, and the vast majority will result in the software acting and functioning as expected. However, there are some paths through the code that will result in the software not functioning as expected. When this happens, the software and system are doing exactly what the code is telling them to do – so from that perspective, there is no failure. However, the software is not doing what is expected – which we interpret as a software failure, and which provides a path to understanding the concept of software reliability.

Deterministic Operation

Deterministic operation in software means that a given program with a given set of inputs will function in exactly the same manner every time it is executed – without any unexpected behaviors. For the most part, this characteristic is what allows us to effectively write software. If we carry this further and look at software on simple (8 / 16 bit) microprocessors / microcontrollers, where the software we write runs exclusively on the device, operation is very deterministic.

In contrast – on a modern system, our software exists at a relatively high level, on top of APIs (application programming interfaces), libraries, services, and a core operating system – and in most cases this is a multitasking / multi-threaded / multi-core environment. In the world of old school 8 / 16 bit microprocessors / microcontrollers, none of these layers exist. When we program for that environment, our program is compiled down to machine code that runs exclusively on that device.

In this context, our program not only operates deterministically in how the software functions, but the timing and interactions external to the microprocessor are deterministic as well. In the context of modern complex computing systems, this is generally not the case. In any case, the very deterministic operation of software on a dedicated microprocessor makes it ideal for real world interactions and embedded controllers. This is why this model is used for toasters, coffee pots, microwave ovens and other appliances. The system is closed – meaning its inputs are limited to known and well defined sources, and its functions are fixed and static – and generally these systems are incredibly reliable. After all, how often is it necessary to update the firmware on an appliance?

If this were our model of the world of software and software reliability, we would be ignoring much of what has happened in the world of computing over the last decade or two. More importantly – we need to understand that this model is an endpoint, not the whole story, and to understand where we are today we need to look further.

Parallel Execution

One of the most pervasive trends in computing over the last decade (or so) is the transition from increasingly faster single threaded systems to increasingly parallel systems. This parallelism is accomplished through multiple computing cores on a single device and through multiple processing threads on a single core, both mechanisms that increase the ability of the processor to produce more work by supporting concurrently running programs. A typical laptop today can have two to four cores and support two hardware threads per core, resulting in up to 8 relatively independent processes running at the same time. Servers with 16 to 64 cores, which would have qualified as (small) supercomputers a decade ago, are now available off the shelf.

Parallel Programming: the Masochistic Way

Now – back in the early 80s, as an intern at Cray, my supervisor spent one afternoon trying to teach me how Cray computers (at that time) were parallel coded. As one of the first parallel processing systems, and as systems where every cycle was expensive, much of the software was parallel programmed in assembly code. The process is exactly how you would imagine. There was a hardware scheduler that would transfer data to / from each processor and main memory every so many cycles. In between these transfers the processors would execute code. So if the system had four processors, you would write assembly code for each processor to execute some set of functions, time synchronized every so many machine cycles, with NOPs (no operation) occasionally used to pad the time. NOPs were considered bad practice, since cycles were precious and not to be wasted on a NOP. At the time, it was more than I wanted to take on, and I was shuffled back to hardware troubleshooting.

Over time I internalized this event, and learned something about scalability. It was easy to imagine somebody getting very good at writing two (maybe even 3 or 4) dissimilar time synchronous parallel programs. Additionally, since many programs also rely on very similar parallel functions, it was also easy to imagine somebody getting good at writing programs that did the same thing across a large number of parallel processors. However, it is much harder to imagine somebody getting very good at writing dissimilar time synchronous parallel programs effectively over a large number of parallel processors. This is in addition to the lack of scalability inherent in assembly language.

Parallel Programming – High Level Languages

Of course, in the 80s and even the 90s, most computer programmers did not need to be concerned with parallel programming; every operating system was single threaded, and the argument of the day was cooperative multitasking versus preemptive multitasking. Much like the RISC vs CISC argument of the prior decade, these issues were rendered irrelevant by the pace of processor hardware improvements. Now many of us walk around with the equivalent of that Cray supercomputer in our pockets.

In any case, the issue of parallel programming was resolved in two parts. The first was the idea of a multitasking operating system with a scheduler – the core function that controls which programs are running (and how long they run) in parallel at any one time. The second was the development of multi-threaded programming in higher level languages (without the time synchronization of the early Crays).

Breaking Random

Finally getting back to my original point… The result today is that all modern operating systems have some privileged block of code – the kernel – running continuously, along with a number of other services that run the OS, including the memory manager and the task scheduler.

The key to this whole story is that these privileged processes manage access to shared resources on the computer. Of these, the task scheduler is the most interesting – mostly due to the arcane system attributes it uses to determine which processes have access to which core / thread on the processor. This is one of the most complex aspects of a multitasking / multi-core / multi-threaded (hardware) system. The attributes the scheduler looks at include affinity flags that processes use to indicate core preference, priority flags, resource conflicts and hardware interrupts.

The net result is that if we take any set of processes on a highly parallel system, there are some characteristics of this set that are sufficiently complex and impacted by unknown external elements that they are random – truly random. For example, suppose we create three separate processes that each generate a pseudo random number set based on some seed (using unique values in each), and point all of them at some shared memory resource – where the value is read as input and the output is written back. Since the operation of the task scheduler means that the order of execution of these three threads is completely arbitrary, it is not possible to determine the sequence deterministically – the result would be something more random than a PRNG. A not so subtle (and critical) assumption is that the system has other tasks and processes it is managing, which directly impact the scheduler, introducing entropy to the system.
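
Here is a minimal sketch of that experiment in Python (the structure and names are mine; threads stand in for processes, and a hash digests the interleaving):

    import hashlib
    import sys
    import threading

    sys.setswitchinterval(1e-6)  # force very frequent thread switches
    sequence = []  # the shared resource all three threads write to

    def worker(tag):
        for _ in range(10_000):
            sequence.append(tag)  # interleaving is chosen by the scheduler

    threads = [threading.Thread(target=worker, args=(t,)) for t in "ABC"]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Digest the interleaving; on a busy multicore system this value changes
    # from run to run, because the scheduler's choices are shaped by
    # everything else happening on the machine.
    print(hashlib.sha256("".join(sequence).encode()).hexdigest())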

Before we go on, let’s take a closer look at this. Note that if some piece of software functions the same (internally and externally) every time it executes, it is deterministic. If this same piece of software functions differently based on external factors that are unrelated to the software itself, it is non-deterministic. Since kernel level resource managers (memory, scheduler, etc.) function in response to system factors and to factors from each and every running process, that means that from the perspective of any one software package, certain environmental factors are non-deterministic (i.e. random). In addition to the scheduling and sequencing aspects identified above, memory allocations will also be granted or moved in a similar way.

Of course, this system level random behavior is only half the story. As software packages are built to take advantage of gigabytes of RAM and lots of parallel execution power, they are becoming functional aggregations of dozens (to hundreds) of independently functioning threads or processes, which introduce a new level of sequencing and interdependencies that depend on the task scheduler.

Bottom line – any sufficiently complex asynchronous and parallel system will have certain non-deterministic characteristics, based on the number of independent sources that influence access to / use of shared system resources. Layer on the complexity of parallel high level programming, and certain aspects of program operation become very non-deterministic.

Back to Software Reliability

Yes, we have shown that both multitasked parallel hardware and parallel programmed software contribute to some non-deterministic behavior in operation, but we also know that for the most part software is relatively reliable. Some software is better and some is worse, but there is clearly some other set of factors in play.

The simple and not very useful answer is “better coding” or “code quality”. A slightly more insightful answer would tell you that code that depends on or uses some non-deterministic feature of the system is probably going to be less reliable. An obvious example is timing loops. Back in the days of single threaded programs and single threaded platforms, programmers would introduce relatively stable timing delays with empty timing loops. This practice was easy, popular, and produced fairly consistent timing – showing deterministic behavior. As systems hardware and software have evolved, the assumptions these coding practices rely on have become less and less valid. Try writing a timing loop program on a modern platform and the results can be workable much of the time, but they can also vary by orders of magnitude – in a very non-deterministic manner. There are dozens of programming practices like this that used to work just fine, but no longer do – yet they don’t completely break, they just operate a little bit randomly. In many cases, the behavior is close enough to “correct” that the program appears to function, but not very reliably.
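
A rough illustration of the timing loop problem: the same busy-wait loop is timed several times, and on a loaded multicore machine the wall-clock results can drift noticeably with preemption, CPU frequency scaling, and system load. (A hedged sketch; the loop size is arbitrary.)

    import time

    def busy_wait(iterations=2_000_000):
        x = 0
        for i in range(iterations):
            x += i  # burn cycles, as an old-style delay loop would
        return x

    for trial in range(5):
        start = time.perf_counter()
        busy_wait()
        elapsed = time.perf_counter() - start
        print(f"trial {trial}: {elapsed:.4f} s")
    # The 'delay' is no longer a deterministic function of the loop count.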

Another coding practice that used to work on single threaded systems was to call some function and expect the result to be available on the next line of code. It worked on single threaded systems because execution was handed off to that function and did not return until it was complete. Fast forward to today, and if this is written as a parallel program, the expected data may not be there when your code thinks it should be. There is a lesson here – high level parallel programming languages make writing parallel code fairly easy, but that does not mean that writing robust parallel programs is easy. Parallel interdependency issues can be just as ugly as parallel assembly code on a Cray system.
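
A minimal sketch of that hazard (the names are mine, for illustration): the main thread reads a result on the “next line” without synchronizing, so sometimes the data is there and sometimes it is not.

    import threading

    result = {}

    def fetch():
        result["data"] = 42  # may not have run yet when the caller looks

    t = threading.Thread(target=fetch)
    t.start()
    print(result.get("data"))  # sometimes None, sometimes 42 -- a race
    t.join()                   # the fix: synchronize before using the result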

Summary

A single piece of code running exclusively on a dedicated processor is very deterministic, but parallel programmed software on a multitasking parallel hardware system can be very non-deterministic, and difficult to test. Much of software reliability comes down to how little a given software package depends on these non-deterministic features. Managing software reliability and failure mechanisms requires that programmers understand the system beyond the confines of the program.
