Rendered at 10:13:40 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
ashfn 1 days ago [-]
Something must be wrong, it's showing github as up!
marcosdumay 17 hours ago [-]
So is reddit.
But then, the home page can be cached, and bots can be batched and nobody would ever notice the difference.
throrork 21 hours ago [-]
GitHub does not report their outages. Being able to load GitHub.com does not mean GH Actions is working.
colinbartlett 22 hours ago [-]
Pretty cool visualization.
I've been building something like this for 12 years now.
One major difference is that mine does not rely only on the "official" status page but also receives millions of reports from users about outages.
So your single pane of glass can show not just known outages but emerging ones that haven't been acknowledged yet by providers.
Also supports more than 8,000 services.
pc86 18 hours ago [-]
Where do you source these "millions of reports" from?
colinbartlett 2 hours ago [-]
users of product and visitors to our site
iFred 17 hours ago [-]
I mean, their viz is free and straightforward, not hidden behind a paywall or a demo page. I also appreciate that it doesn't include any comment-based signal indicators, as those are often noise.
0123456789ABCDE 1 days ago [-]
beautiful visualization of "complex systems run in degraded mode"
There is still a tendency within some parts of aviation (safety auditing) to look for root causes and use tools like "fish bone diagrams" despite the more holistic approach used after an actual crash or incident.
kortilla 1 days ago [-]
A bunch of different services on a single status page doesn’t make it a complex system. Most of these have no relation to each other other than the high level services on the cloud providers.
rcxdude 1 days ago [-]
They're all part of the internet, which is one of the most complex systems ever built.
kortilla 18 minutes ago [-]
No, they exist on the internet but calling them part of the same system is a bit torturous.
My toaster and the dam 1000 miles away are on the same electrical grid. Calling my toaster part of the electrical generation system because it consumes from it doesn’t make sense.
Coming back to the dashboard example, almost none of those work together to provide some kind of combined outcome you would expect from complex systems analysis (e.g. electrical generation, healthcare, etc).
If all of the boxes were ISPs instead, it would be a great example. Because they all work together to provide IP connectivity to the world and many can be down while the overall internet continues to function.
0123456789ABCDE 1 days ago [-]
> A bunch of different services on a single status page doesn’t make it a complex system.
you're right, it does not.
> Most of these have no relation to each other other than the high level services on the cloud providers.
so, some of them are related to each other? some of them even share underlying infrastructure? perhaps multiple of these are considered infrastructure for some teams?
what is the point you're trying to make?
sammy2255 23 hours ago [-]
Probably unfair to class Cloudflare as "degraded": they have over 300 PoPs, so there are always going to be some in maintenance mode and re-routed.
politelemon 1 days ago [-]
Auth0 and Slack appear degraded here, but not on their status pages
colinbartlett 21 hours ago [-]
This app looks to be incorrectly parsing the official Slack and Auth0 status pages and showing incidents as ongoing that are not.
And those are just the 2 that I checked.
To be fair, accurately scraping and normalizing data from status pages is really hard to do consistently (my company has a team of 5 engineers to do it and it's a lot of work).
somewhatgoated 1 days ago [-]
Yea I was wondering where that data/info was coming from?
And what does it mean exactly?
xiphias2 1 days ago [-]
Cloudflare as well
talonx 21 hours ago [-]
Services like Cloudflare and Twilio have so many POPs globally that one or more always have an outage going on. Then there's the question of whether it's a major outage or a minor outage. Even though major status page providers like Atlassian and Incident.io have public status APIs (Cloudflare uses Atlassian), it takes more than just parsing them to determine what is "down" and at what granularity.
I run an outage detection service - and some of these issues, like parsing hundreds of - sometimes undocumented - status APIs, make for an interesting engineering problem.
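To illustrate why "parsing the API" is not the whole problem, here is a minimal sketch of one possible normalization step. It assumes a Statuspage-style summary payload with a `components` list, where each component carries a `status` string and group containers are flagged with `group`. The function name `classify` and the `min_fraction` threshold are hypothetical, not anything a real tracker is known to use.

```python
from typing import Any

# Component statuses as used by Statuspage-style summaries,
# ordered from least to most severe.
SEVERITY = {
    "operational": 0,
    "under_maintenance": 1,
    "degraded_performance": 2,
    "partial_outage": 3,
    "major_outage": 4,
}


def classify(summary: dict[str, Any], min_fraction: float = 0.1) -> str:
    """Reduce a summary payload to "up", "degraded", "down", or "unknown".

    min_fraction is a hypothetical threshold: the share of components that
    must be impaired before the whole service is reported as not "up".
    """
    # Skip group containers so only leaf components (e.g. single PoPs) count.
    components = [c for c in summary.get("components", []) if not c.get("group")]
    if not components:
        return "unknown"
    # Unrecognized statuses are treated as operational rather than guessed at.
    levels = [SEVERITY.get(c.get("status"), 0) for c in components]
    impaired = sum(1 for lvl in levels if lvl >= 2)
    outages = sum(1 for lvl in levels if lvl >= 4)
    if outages / len(levels) >= min_fraction:
        return "down"
    if impaired / len(levels) >= min_fraction:
        return "degraded"
    return "up"


# 300 hypothetical PoPs, one with a partial outage and one in maintenance.
pops = [{"name": f"pop-{i}", "status": "operational"} for i in range(300)]
pops[0]["status"] = "partial_outage"
pops[1]["status"] = "under_maintenance"
print(classify({"components": pops}))  # up
```

A flat fraction is only one choice. Weighting components by traffic or region would be closer to what users experience, but it needs data that a status page does not publish.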
iFred 17 hours ago [-]
With these guys you get into a weird world of "is it them, us, or upstream of both of us" all the time. I had been using Twilio's telco partner maintenance notifications as a way of figuring out whether someone like Orange was responsible when a bunch of French endpoints had network degradation independent of Twilio.
ninju 18 hours ago [-]
I notice that the site 'boxes' are different sizes.
Does the size indicate anything?
dvh 1 days ago [-]
Maybe try using <wbr> for example Cloud<wbr>flare or mongo<wbr>db for more natural break on small screens.
Would be interesting if sites could be grouped based on what services they rely on, or just grouped based on which have correlated downtime.
zamadatix 22 hours ago [-]
Correlated downtime, and this is a place where I wouldn't actually mind a guess from AI on whether there is a common underlying cause between some of the things. I say AI because I don't really think anyone is going to keep all of the possible common dependencies of different privately hosted systems up to date, but AI could at least take an initial guess and try to find whether anyone else is posting root cause theories elsewhere at the time and link to those (and a guess is fine enough).
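Grouping by correlated downtime does not strictly need a dependency map. A rough sketch, assuming each service's outages are available as a set of sample indices (for example, minute numbers) at which it was down: score each pair by Jaccard overlap and keep the pairs above a cutoff. The service names, the `threshold` value, and the function names are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set[int], b: set[int]) -> float:
    """Overlap of two sets of 'down' sample indices (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlated_pairs(
    down: dict[str, set[int]], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Return (service_a, service_b, score) for pairs whose downtime overlaps."""
    pairs = []
    for name_a, name_b in combinations(sorted(down), 2):
        score = jaccard(down[name_a], down[name_b])
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Strongest overlap first.
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical samples: svc-a and svc-b were down together for two minutes.
samples = {"svc-a": {3, 4, 5}, "svc-b": {4, 5, 6}, "svc-c": {20}}
print(correlated_pairs(samples))
```

Overlap only suggests a shared cause. Two services can fail together by coincidence, so any grouping built this way is a starting guess, not a root cause.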
chedoku 1 days ago [-]
Suggestion: The area of each rectangle should be proportional to the UPTIME capitalization
chedoku 1 days ago [-]
Maybe this is the idea, but how come github uptime is 100%!?
xyst 1 days ago [-]
No love for mindgeek assets?
TeMPOraL 1 days ago [-]
Are those ever down?
cednore 23 hours ago [-]
Facebook, Twitter (X), Instagram are no longer a thing?
talonx 21 hours ago [-]
They don't have straightforward status pages or APIs to detect outages - I think that's the reason they are not listed.
fosron 1 days ago [-]
Playstation is in the list but not Xbox? Weird
Crunchified 1 days ago [-]
No Apple services listed?
Where's iCloud?
gegtik 21 hours ago [-]
Ouch, Azure isn't even present
aetch 20 hours ago [-]
They said major sites
b3lvedere 1 days ago [-]
Interesting.. Ms Teams blocks the entire url..
cleansy 1 days ago [-]
Yeah, highly inaccurate data. Shows Auth0 with an uptime of 0.6% over 24h. Smells like a slop project.
tcumulus 1 days ago [-]
Well if you count every minor service outage which maybe 0.1% of the users are non-critically affected by, you quickly get to 0.6%. So, this doesn't really tell you anything.
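The arithmetic behind that point can be sketched as follows, assuming (hypothetically, since the site's method is not stated) that any open incident counts as downtime regardless of severity. Incidents are `(start, end)` pairs in minutes within a 24-hour window, and overlapping incidents are counted once.

```python
def merged_minutes(incidents: list[tuple[int, int]]) -> int:
    """Total minutes covered by (start, end) intervals, counting overlaps once."""
    total = 0
    current_end = None
    for start, end in sorted(incidents):
        if current_end is None or start > current_end:
            # Disjoint from everything seen so far: count the whole interval.
            total += end - start
            current_end = end
        elif end > current_end:
            # Overlaps the previous interval: count only the new tail.
            total += end - current_end
            current_end = end
    return total


def uptime_percent(
    incidents: list[tuple[int, int]], window_minutes: int = 24 * 60
) -> float:
    """Uptime over the window if every incident minute counts as fully down."""
    return 100.0 * (1 - merged_minutes(incidents) / window_minutes)


# One minor incident left open for 12 hours halves the reported uptime.
print(uptime_percent([(0, 720)]))  # 50.0
```

Under this rule a single low-severity incident that stays open for nearly the whole day pushes the figure close to zero, no matter how few users were affected.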
haktan 1 days ago [-]
But 55 of them are unknown (edit: fixed now)
progbits 1 days ago [-]
And github has 100% uptime while cloudflare has 20%. Yeah, right.
hulitu 13 hours ago [-]
Kids these days. What happened to Netcraft?
UrbanNorminal 1 days ago [-]
What a godsend this is! Thanks a lot! I hope the data is accurate! Keep improving it.
wakeless 1 days ago [-]
I'm assuming there's an optimisation in the source of this:
```
if(github) return false
```
chaidhat 1 days ago [-]
over half are unknown
tristor 18 hours ago [-]
Where does this draw data from? It's a similar visual concept to what we're doing at ThousandEyes within Internet Insights (see https://www.thousandeyes.com/outages/); however, we make it fairly clear how we are making these determinations. Our data comes from billions of daily pseudonymous metrics from within synthetic tests running across thousands of agents around the world.
If you're drawing the data from a public resource like downdetector or using the sites' status pages, then you may not be reflecting reality, but it should be clear what the provenance of the data is.
https://how.complexsystems.fail/#5