Ahh, the dreaded "downtime."
On Friday, October 26th, between 12pm and 3pm, JumpRope experienced severe, though intermittent, outages. Try as we might to avoid it, downtime is a reality of web-based applications. It's sort of a tradeoff - you get the convenience of connected, real-time, shared data from any computer with an internet connection... but you have to deal with the inconvenience of occasionally having to live without it. Factoring in the three hours for today's outages, we're running at a 99.88% "uptime rate" over the last thirty days.
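For the curious, here is a rough sketch of the arithmetic behind an "uptime rate." It is only an illustration, not a description of how our monitoring actually counts things: the function names are made up for this example, and it assumes that uptime is simply the share of minutes in the window during which the system was reachable. Because outages can be intermittent, the minutes counted as "down" may be fewer than the length of the window in which problems occurred.

```python
MINUTES_PER_DAY = 24 * 60


def uptime_percent(downtime_minutes, window_days=30):
    """Share of the window (in percent) during which the system was up."""
    total_minutes = window_days * MINUTES_PER_DAY
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def downtime_budget_minutes(target_percent, window_days=30):
    """Minutes of downtime that fit inside a given uptime percentage."""
    total_minutes = window_days * MINUTES_PER_DAY
    return total_minutes * (100.0 - target_percent) / 100.0


# Hypothetical inputs, chosen only to show the scale of the numbers.
print(f"{uptime_percent(180):.2f}")             # 180 minutes down in 30 days -> 99.58
print(f"{downtime_budget_minutes(99.88):.2f}")  # minutes allowed at 99.88% -> 51.84
```

The takeaway is that small changes in the percentage correspond to real stretches of time: over thirty days, each hundredth of a percent is a little over four minutes.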
The downtime can be attributed to a major outage throughout Google's datacenters, which we use to host and cache large parts of our application. You can read Google's post-mortem here: http://googleappengine.blogspot.com/2012/10/about-todays-app-engine-outage.html
Our reliance on Google to host data has generally been a very smart move, but there are times when we simply have to deal with outages, and we can do little more than stay in communication with our users and provide as many updates as possible. We are very sorry for the inconvenience and trouble that this caused you.
What is it that can cause JumpRope to be inaccessible?
- Your internet may be down. Though this is not a form of downtime, it is the most common reason that you may not be able to access JumpRope and is thus worth mentioning. It is usually wise to check other web sites first. When doing so, try going to a reliable website or two that you don't normally visit, so that the page has to be downloaded from the web rather than loaded from a cached copy already stored on your computer. I usually use CNN.com or Yahoo.com. If they don't work, or are very slow, chances are that the issue is on your end and not ours. Check your wireless internet or internet cables, or find someone in your school to help get you connected!
- JumpRope may have scheduled maintenance. Once or twice a year, we make upgrades to our system that will make JumpRope inaccessible to you for a period of time. So far, we have only had two scheduled maintenance periods and they have lasted less than an hour each. We do everything we can to minimize them or eliminate them entirely, and we "stage" our updates carefully so that they are completed as quickly as possible. JumpRope warns you upon login for 7 days leading up to a scheduled maintenance period. During the period, you may see an "under construction" page or simply an error when you try to access JumpRope.
- JumpRope developers may have made a mistake. Those of you with sharp eyes may notice that JumpRope periodically gets updated with new features, bug fixes, and other changes. This happens behind the scenes, and you often won't even notice updates. We carefully test new versions of our software before releasing them to you, but occasionally bugs and other problems sneak into a release. If you notice a particular problem in JumpRope that doesn't seem to affect the entire system, please click on the Help & Feedback link at the top of the screen and describe the problem with as much detail as possible. We have support staff and developers who will evaluate the issue and get back to you with a solution or a fix as soon as possible.
- JumpRope may be down for the count. Web applications are becoming so common that sometimes we (well, not so much me since I have to worry about it every day) forget how complex they actually are. Every time you use JumpRope, literally dozens of computers are involved in downloading, fetching data, securing your connection, logging in, translating URLs into machine codes, saving your data, and generating reports. JumpRope hosts data with carefully-chosen partners who are way better than we are at making sure that things are secure and reliable. However, even the best of us sometimes have our issues - Facebook, Twitter, Gmail, and many others - pretty much everything short of Google's main search page - have all experienced downtime over the last few months, at least for some of their users. When this is happening, the best thing to do is to relax, walk away, and trust that teams of engineers at our web hosts (primarily Google and Amazon Web Services) are working furiously to get things working again - for us and their thousands/millions of other customers. You can be sure that everything will be working again as quickly as possible.
I hope that this helps clarify the various types of issues that may affect your ability to use our system. Naturally, we work as hard as we can to minimize issues, and our goal is 100% uptime. Please let us know if you have any questions or comments below!
An archive of our updates to users:
- Update @ 2:28pm - Phew. All systems are a go. We apologize, and are working on a way to make it up to all of our hard-working teachers!
- Update @ 1:54pm - Our hands are tied and our tail is between our legs. Servers are still intermittent and we cannot recommend using the system because it is still unreliable. It could start working at any time, and we'll post updates here - but logging in every few seconds will probably cause more stress than success. Sorry to our users who are trying to use JumpRope today - we are working furiously and counting on our partners to do the same. Hope that you're able to have good conversations with parents, NYC folks!
- Update @ 1:22pm - Issues are ongoing. Those of you who would like to hear straight from the source can follow this notification group: https://groups.google.com/forum/?fromgroups=#!topic/google-appengine-downtime-notify/SMd2pDJsCPo . Anybody who has the luxury, relax and enjoy your weekend and we'll worry about getting things up and running again!
- Update @ 12:50pm - Feeling a bit better. Recommend giving it a try, but are still receiving reports of slow connections and high error rates.
- Update @ 12:35pm - We are in communication with Google about the health of their servers, which we use to host your data. Their engineers are actively working on restoring access, and JumpRope's hands are tied until the issues are resolved on their end. Very inconvenient for those of you with parent conferences!
- Update @ 12:24pm - It looks like all users are able to log in again, though they may experience delays while servers become healthy.
- Update @ 12:19pm - After working well for half an hour, many users are experiencing issues again. We apologize for the trouble, and will have you up and running as soon as possible.