November 14, 2010 by Greg in General
Over the past six years, the development team at CarePaths has been working to implement a fully redundant, fault-tolerant network and application stack for the eRecord. In that time we have experimented quite a bit and have more or less settled on the technologies that are going to drive the eRecord for the foreseeable future. Given that, we thought it might be of interest to talk about what goes into the stack and what it means to users. “The stack” is our generic term for everything on our end of your Internet connection to the eRecord. We are not talking about the net itself, just what happens when we get your browser’s request for a webpage. If you are already halfway to clicking on the “get me out of here” link, hold on just a minute. What we are talking about here is how the eRecord keeps your records secure and easily and quickly organized, and what we do at CarePaths to keep costs down.
The short answer is that like many successful Internet sites, we combine commodity hardware with open source software to produce a seamless, hand-crafted, yet relatively inexpensive system tailored to deliver the eRecord. Our goal is to avoid paying expensive licensing, management, and deployment fees that many other EMR vendors who have essentially purchased their technology “off the shelf” have to pay. Amazingly, not only do these companies pass these costs directly and shamelessly on to you in a variety of packages including required licensing, consulting, and support contracts, but they do so by selling these “services” as positive features of their product. In short, these companies do not control their own costs and so are obliged to increase your costs when theirs go up. At CarePaths, we strive to cut out the middlemen and be a direct provider of a software service that delivers a solid product at a price that says we know everyone isn’t a Fortune 500 company. Ok, so how do we do it?
The Network
At the lowest level, we house our servers in a secure data center (aka a server farm) that provides 24×7 network monitoring, 24×7 staffing by network engineers, physical security, redundant power, and redundant, very fat fiber pipes directly into the major Internet backbones. More secure data centers, like the one that we are in, are typically certified using an auditing procedure known as SAS 70. When used to audit data centers, it typically means that the facility is one of the most secure places in which you can locate your data. As an example, for me to get physical access to our servers, I have to be on an approved list, show a picture ID to a live person, and pass a biometric scan just to get into the facility. Once inside, the network engineer escorts me to a large locked cage that holds more locked cabinets. Our servers are located inside one of those locked cabinets and I have to know the combination for the lock. Each cabinet has its own dedicated power circuit. Like software security, physical security is multilayered and relies on different forms of security (e.g., your bank card uses two-factor security: something you have, your card, plus something you know, your PIN). Server colocation is not inexpensive, but relative to the benefits and security offered, we feel that it is worth every penny and provides a solid foundation on which to build our stack. The cost is also mitigated by the fact that we build our own servers and do our own maintenance (more on that to follow). From here on we will refer to the data center and our connections to it as our Internet Service Provider (ISP).
Firewalls, Switches, and Wires…Oh my!
If you have a home router for your home network, then you know that at some point the Internet stops being the Internet and becomes your local network (aka a Local Area Network, or LAN). This is in contrast to the Internet, which we sometimes call the Wide Area Network (WAN). Basically, our ISP gives us an unfettered and unfiltered (but still monitored) connection to the Internet. It is up to us to filter out and block undesirable traffic so that our servers are exposed only to requests from our users and not to weirdos from some far-off land looking for fun. A firewall is the box that does that. We have two of them in a redundant pair such that if one fails, the other will automatically and almost instantaneously take over. We only open the few “ports” that are required for the encrypted connections from our users to the eRecord. Unlike some other sites, all connections to the eRecord are secure and encrypted, and we do NOT use outside advertising and/or analytics services to monitor our traffic. We feel that using them is inappropriate on several levels, but most critically we feel it is a loss of privacy for patients using the eRecord, as these analytics and advertising sites track a user’s Internet Protocol (IP) address and, therefore, can potentially identify what sites a given person is using.
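The “only open a few ports” idea is a default-deny rule: everything is dropped unless it is explicitly allowed. Here is a minimal Python sketch of that logic. It is not pfSense configuration, and the specific port (443, the usual HTTPS port) is an assumption, as are the names `Packet`, `ALLOWED_TCP_PORTS`, and `is_allowed`.

```python
from dataclasses import dataclass

# Assumption: the post does not list the open ports. 443 (HTTPS) is used here
# only because all eRecord connections are described as encrypted.
ALLOWED_TCP_PORTS = frozenset({443})


@dataclass(frozen=True)
class Packet:
    """A hypothetical, heavily simplified view of an incoming packet."""
    source_ip: str
    dest_port: int
    protocol: str  # "tcp" or "udp"


def is_allowed(packet: Packet) -> bool:
    """Default deny: pass a packet only if it targets an explicitly open TCP port."""
    return packet.protocol == "tcp" and packet.dest_port in ALLOWED_TCP_PORTS


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation-only address range.
    for pkt in (Packet("203.0.113.7", 443, "tcp"), Packet("203.0.113.7", 22, "tcp")):
        print(pkt.dest_port, "pass" if is_allowed(pkt) else "drop")
```

The design point is that the list names what is permitted rather than what is forbidden, so a service nobody thought about stays closed by default.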
Behind the firewalls are redundant pairs of switches (boxes with many Ethernet plugs, like a power strip but for networking) and redundant pairs of Ethernet cables to each of our servers. In addition, each server is redundant in the sense that whatever function it provides to the stack, there is another physical server that is either doing the same thing (think load balancing) or waiting to be asked to take over (warm/hot standby). Inside our servers, the things that tend to fail most often are the things with moving parts: the fans and the hard drives. All of our servers have multiple fans and run software or hardware RAID with at least two hard drives such that if a hard drive fails, the machine continues to run. In short, the eRecord, at the wire level, is fully redundant with no single point of failure. We can lose a firewall, a switch, several servers, and even a hard drive out of one or more of the remaining servers, and the eRecord will continue to fly. All of this redundancy has been baked in so that when disaster strikes, the eRecord is capable of continuing on without interruption or service degradation. (More on disaster recovery later in the Roadmap section, where we talk about opening a second cluster out West.)
Just What is a Server and Can it Make My Coffee?
When I first started playing with servers, I was confused by the ubiquitous use of the word “server.” Everything, it seemed, was a server: physical boxes, pieces of software, just about anything was a server of some sort. To make it worse, there were other vague but seemingly related terms like “middleware” and “Service Oriented Architecture” and “Enterprise Server Solution.” How did I get clarity? I stopped reading the advertising- and marketing-speak-filled manuals from proprietary, closed source corporations and started reading books, blogs, and webpages about open source technologies, the most important of which for the eRecord is Linux, the free, open-source operating system originally developed by Linus Torvalds somewhere back around 1991. Since then Linux and other open source operating systems have become some of the most significant achievements of collaborative, distributed computing and, combined, power a significant majority of all Internet traffic (60-70%? something like that).
Why use Linux instead of a proprietary, closed source operating system like the one on my desktop? Cost, ease of use, performance, security, and, most importantly, the fact that Linux and other open source software products are essentially peer-reviewed and independent. By peer-reviewed, I mean that the code is freely available for anyone to view and even “fork” into your own version of it (though that is probably not what you want to do). This review-by-consensus is usually borne out by testing done by many, many users and developers and results in very secure, robust tools. Granted, not all projects are the same in quality, but if you select mature projects with active development communities, the result is usually better and certainly cheaper than most comparable proprietary closed source products. In addition, as a small company, we have little control over what large software companies decide for their products, and many of these companies design their tools to lock you into using all or none of their products. In essence, the hallmark of many large software companies prior to the advent of open source was to take hostages. “Vendor lock-in” means that once you have built enough of your core software on a vendor’s software, you are dependent and must pay whatever increasing costs or cope with whatever decisions they make about how their product evolves, which is particularly difficult if they discontinue and/or stop supporting their code. In contrast, at CarePaths, we take the open source approach to building a stack. Namely, we use a bunch of cooperating but separate tools which together comprise our stack. Any one part of the stack is replaceable with some other similar tool without changing the rest of the stack. We avoid vendor lock-in and all of the software licensing fees that go along with it. Building the stack in this way gives us much more control and lower costs, which we can pass on to our users.
Enough Already! Just Tell Me About the Tools
Ok. Enough evangelizing for open source. We use it and you should too: desktop, laptop, phone…everywhere. Ubuntu is a version of Linux (more accurately, a “distribution,” as different versions of Linux are called) that is free and a very solid replacement for any closed source operating system with which you are (or should be) dissatisfied. At CarePaths, we use the following open source tools to build our stack, each in its own section.
pfSense Firewall
Based on another open source operating system, FreeBSD (BSD stands for Berkeley Software Distribution), and on the pf packet filter that originated in OpenBSD, pfSense is what’s known as an “appliance” distribution in the sense that you can install it on a generic server (aka a pizza box, because typical rack-mount servers are about the size and shape of a pizza box) and it will manage the entire machine and do one thing: protect everything behind it. Firewalls are best when implemented as a separate machine that has very few options and/or services enabled and relies on a rock-solid, secure operating system. We moved to pfSense after spending several years dealing with the just crazy service and maintenance fees required by the closed source, proprietary hardware firewall box we had been using, just to get updates. More hostage taking. pfSense has a very active developer community focused on reliability and security.
Debian Linux
I call Debian the “mothership” of Linux distributions. That is just my pet name and probably not one that everyone would agree with, but it is the underlying source for many other Linux distributions because of its packaging system, which allows administrators to reliably install and upgrade the system with very granular control. Ever wondered what happens to your computer when you go to that online site to update your machine? So do I, and the short answer is that you don’t know. You can’t see the code, you don’t get to know what files are going to be updated, etc. We don’t install anything at CarePaths that we can’t control or validate. Debian is rock solid and, for me, the choice for reliable and predictable server operations. If there is a downside to Debian, it is that because it is designed to be very, very stable, the packages tend not to be updated all that frequently. This is a plus for most production servers because you aren’t changing the code that much and you just want to apply security fixes and other bugfixes but not change much else. On the other hand, if you are talking about a desktop or a laptop where you want the latest and greatest (or need the latest drivers), then Debian can be a bit painful. This is exactly why Ubuntu Linux (based on Debian) has become so popular. It takes the best of Debian but keeps the packages more current with more frequent updates. So the adage has been, “Debian on the server, ‘buntu on the desktop.” As always, your mileage and opinion may vary.
Nginx
Nginx is a relatively new player in the webserver space. A webserver is what your requests to the eRecord connect to and the thing that decrypts your encrypted request so that it can be processed by the rest of the stack. It is also responsible for serving static content like pictures and other application files, such as style sheets and JavaScript files, that stay the same regardless of the request. Apache is probably the best known open source webserver, but we prefer Nginx due to its more efficient memory and processor use. It is blisteringly fast with many connections on cheap hardware. For the geeks in the crowd, Nginx and HAProxy both use asynchronous I/O event loops rather than a process or thread per connection to handle thousands of concurrent requests (i.e., the C10K problem).
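For the geeks who want to see the event loop idea in action, here is a small sketch using Python’s asyncio as a stand-in. Nginx and HAProxy are written in C and work at the socket level, so this only illustrates the principle: one thread can juggle thousands of requests as long as each one spends most of its time waiting. The names `handle_request`, `serve`, and `run_demo` are made up for the example, and the 10 ms sleep is a placeholder for a real network or disk wait.

```python
import asyncio
import threading


async def handle_request(request_id: int, thread_ids: set) -> int:
    """Simulate one request that spends its time waiting on I/O."""
    thread_ids.add(threading.get_ident())  # record which thread ran this request
    await asyncio.sleep(0.01)  # stand-in for an I/O wait; control returns to the loop
    return request_id


async def serve(n_requests: int):
    """Run all requests concurrently on a single event loop."""
    thread_ids: set = set()
    results = await asyncio.gather(
        *(handle_request(i, thread_ids) for i in range(n_requests))
    )
    return results, thread_ids


def run_demo(n_requests: int = 10_000):
    """Return the completed request ids and the set of threads that were used."""
    return asyncio.run(serve(n_requests))


if __name__ == "__main__":
    results, thread_ids = run_demo()
    print(len(results), "requests handled on", len(thread_ids), "thread(s)")
```

Because each request gives control back to the loop while it waits, the requests overlap instead of queuing one behind another, and no extra threads or processes are created to do it.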
HAProxy
HAProxy is a load balancer similarly designed to handle thousands of concurrent requests on inexpensive hardware. A load balancer is simply a piece of software that takes incoming connections and routes them to other servers that actually do the processing. In plain English, it assigns each incoming request to the backend server that will actually answer it. HAProxy also does health checking on all of the servers for which it handles requests so that when a server is busy or dies, it stops sending requests to that server. The net effect is that when we have a technical problem on our end, users typically know nothing about it. The eRecord adheres to the principle of “shared nothing,” meaning that any request can be served by any machine and that you probably don’t get the same machine twice in a row. The practical upside to this is that when a machine goes down, it doesn’t matter. There was nothing on that particular webserver that you need for your eRecord session.
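The routing-plus-health-check behavior can be sketched in a few lines of Python. This is an illustration of the idea, not HAProxy’s implementation or configuration: HAProxy probes its backends in the background on a schedule, whereas this sketch simply asks a caller-supplied `is_healthy` function at pick time. The class name and the backend names `app1` to `app3` are hypothetical.

```python
from typing import Callable, Iterable, List


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any that fail the health check."""

    def __init__(self, backends: Iterable[str], is_healthy: Callable[[str], bool]):
        self._backends: List[str] = list(backends)
        self._is_healthy = is_healthy
        self._next = 0  # index of the backend to try first on the next pick

    def pick(self) -> str:
        # Try each backend at most once per pick, starting where we left off.
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if self._is_healthy(backend):
                return backend
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    down = {"app2"}  # pretend this server has died
    balancer = RoundRobinBalancer(["app1", "app2", "app3"], lambda b: b not in down)
    print([balancer.pick() for _ in range(4)])
```

With `app2` marked down, requests alternate between `app1` and `app3`. Since no session state lives on any one backend (“shared nothing”), it does not matter which of them answers.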
Ruby on Rails
Ruby is a programming language developed initially in Japan by Yukihiro Matsumoto in the 1990s, but is now a world-wide open source project with many companies and individual developers contributing. Rails is a web application framework written in Ruby that processes eRecord requests and builds and sends the response back to HAProxy, which in turn sends it back to Nginx and then back to you in an encrypted envelope. So Rails is what we spend most of our time working in. Next to Linux, it is by far the single most important piece of software that we have put in the stack and is responsible for improved developer productivity and many of the recent improvements to the eRecord. What else can I say? Rails does everything but make my coffee. One particular aspect of Rails that is becoming more relevant to the eRecord is that it enables us to compartmentalize functionality: we are able to standardize our coding and functionality for the general class of requests, but are also able to add on custom functionality without interfering or interacting with regular requests. More specifically, the Roadmap for 2011 has improved clinical and administrative decision support and reporting as a major goal. Decision support tends to be one of those “expert system” areas where there are many different sets of rules for different providers, payers, and regulators. Having a way to compartmentalize that code will be very important.
SQL Data Storage: PostgreSQL
So where do we put your data once we are done with the request and have sent the response back to you? Good question, and it used to have a fairly simple answer: a relational database like PostgreSQL. A relational database is simply a set of spreadsheet-like tables that are connected to each other, typically through the use of integer “keys.” For example, a child table would have a key for each row and then another column with the key of its parent row in the parent table. In this way, using Structured Query Language (SQL), you can build sets of data from the interrelated tables. A central feature of relational databases is that you don’t store the same data in two different places (i.e., your tables are fully normalized). While it has been canon for a long time, fully normalized data has some unfortunate downsides with respect to performance and archival integrity. Performance-wise, it is expensive to have to track down data on all the other tables when you are dealing with many concurrent users. SQL was invented in something like the Jurassic Period, when networking meant that you walked to the coffee shop with your coworker. It, and many of the database servers that are built on it, certainly didn’t have in mind that you would have hundreds if not thousands of concurrent users doing exactly the same things at the same time. Many SQL database servers are not well designed to handle many concurrent users, especially when those users are all updating data. In addition, SQL, as it is typically implemented in applications, overwrites the data in the row that you are updating, so that when one table is “joined” to another, the resulting “view” shows the updated data. While this might seem like what you want, it doesn’t work if you want to track changes over time.
For example, if I update a patient address, then I need to store the old address on some other table; otherwise, the next time I look at documents from three years ago, they will have the updated address and not the address as it was at the time that the document was written. Archival document fidelity is not a strong point of the relational, fully normalized SQL implementation. You have to do more work, duplicate data, have more tables, and consequently incur more overhead on every write. The eRecord is approximately 50/50 in terms of reads vs. writes, so we pay a penalty to track “versions” of data. In its defense, PostgreSQL is a wonderful open source tool and we would be hard-pressed to find a comparable replacement.
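The address example can be made concrete with a small sketch. To keep it self-contained it uses Python’s built-in SQLite module rather than PostgreSQL, and every table, column, and function name is invented for illustration; this is not the eRecord schema. It shows a child table pointing at its parent by integer key, how a plain join picks up an overwritten value, and how an extra history table (the “more tables, more writes” penalty) recovers the address as it was when the document was written.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a tiny in-memory schema; all names here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            address TEXT NOT NULL          -- current address only
        );
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            patient_id INTEGER NOT NULL REFERENCES patients(id),  -- parent key
            title TEXT NOT NULL,
            written_on TEXT NOT NULL       -- ISO date
        );
        CREATE TABLE address_history (
            patient_id INTEGER NOT NULL REFERENCES patients(id),
            address TEXT NOT NULL,
            valid_from TEXT NOT NULL       -- ISO date the address took effect
        );
        INSERT INTO patients VALUES (1, 'Pat Example', '12 Old Street');
        INSERT INTO address_history VALUES (1, '12 Old Street', '2005-01-01');
        INSERT INTO documents VALUES (1, 1, 'Intake note', '2007-11-14');
        """
    )
    return conn


def move_patient(conn, patient_id: int, new_address: str, moved_on: str) -> None:
    """Overwrite the current address AND record the change (two writes, not one)."""
    conn.execute("UPDATE patients SET address = ? WHERE id = ?", (new_address, patient_id))
    conn.execute(
        "INSERT INTO address_history VALUES (?, ?, ?)", (patient_id, new_address, moved_on)
    )


def joined_address(conn, document_id: int) -> str:
    """Plain join: always returns whatever address is in the patient row today."""
    row = conn.execute(
        "SELECT p.address FROM documents d JOIN patients p ON p.id = d.patient_id "
        "WHERE d.id = ?",
        (document_id,),
    ).fetchone()
    return row[0]


def address_when_written(conn, document_id: int) -> str:
    """History lookup: latest address that took effect on or before the document date."""
    row = conn.execute(
        """
        SELECT h.address
        FROM documents d
        JOIN address_history h
          ON h.patient_id = d.patient_id AND h.valid_from <= d.written_on
        WHERE d.id = ?
        ORDER BY h.valid_from DESC
        LIMIT 1
        """,
        (document_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_db()
    move_patient(conn, 1, "99 New Avenue", "2010-06-01")
    print("join shows:      ", joined_address(conn, 1))
    print("history recovers:", address_when_written(conn, 1))
```

After the move, the plain join reports the new address for a 2007 document, while the history lookup still returns the old one. The cost is visible in `move_patient`: every change now takes two writes instead of one.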
NoSQL: CouchDB and Redis
So, while we certainly use PostgreSQL as our workhorse, big-iron storage solution, we have started to integrate newer, non-relational data storage systems that address these common issues. Together, this class of data servers is part of the so-called “NoSQL” movement in web application design. Two of the best of these are CouchDB and Redis.