Earlier this year, the single greatest site reliability engineering (SRE) lesson unfolded out in space. Last week we saw the very first, better-than-even-expected images from the James Webb Space Telescope, or JWST.
After ten years of design and build on a $9 billion budget, this was an exercise in testing 344 single points of failure, all before deploying to production, with the distributed system a million miles and one month away.
Needless to say, there are a lot of reliability lessons to be learned from this endeavor. At his WTF is SRE talk last month, Robert Barron brought his perspective as an IBM SRE architect, amateur space historian and hobby space photographer to uncover the patterns of reliability that enabled this feat, and to explain how NASA was able to trust its automation so much that it'd release something with no hope of fixing it. It's a real journey into observability at scale.
"It's a great platform for demonstrating site reliability engineering concepts because this is reliability to the extreme," Barron said of the James Webb Space Telescope. "If something goes wrong, if it's not reliable, then it doesn't work. We can't just deploy it again. It's not something logical, it's something physical that has to work properly and I think there are a lot of lessons and a lot of inspiration that we can take from this work into our day-to-day lives."
After 30 years of amazing photos from the Hubble Telescope, there was a demand for new business and technical capabilities, including the ability to see through and past clouds as they are created.
When designing the Webb telescope, the design engineers kicked off with the functional requirements, which in turn drove a lot of non-functional requirements. For instance, it needed to be much larger and more powerful than Hubble, and to achieve that it needed a significantly larger mirror. However, an operational constraint arose: a mirror that large doesn't fit into any rocket, so it had to be broken up into pieces. The non-functional requirement became to create a foldable mirror. The solution was to break the mirror up into smaller hexagons, which can be aligned together to form a honeycomb-shaped mirror.
The second non-functional requirement of the JWST was to go beyond Hubble by seeing not only visible light but also infrared light, which is essentially heat. So, to be accurate, the mirror needs to stay cold. "Not just colder, but we need to be able to control the temperatures exactly, because any variation and we're going to look at something and think, 'Oh, this is a star. This is a galaxy.' No, that's just something there on Webb itself, which is slightly colder or warmer than it should be," Barron explained.
Unlike Hubble, which orbits the Earth, Webb cannot orbit Earth because its temperature would then vary greatly between sun and shade. Plus, it needs to be much farther away from Earth than Hubble has ever gone. With this in mind, the controls and antennas face Earth and the telescope faces away. The honeycomb set of mirrors reflects into a second set of mirrors, which then sends the images back to the cameras located in the middle of the honeycomb. Behind it all is a massive set of sunshades that work to control the temperature of the telescope.
When NASA decided back in 1995 to make this next-generation space telescope, the agency assumed it'd cost about a billion dollars. "In 2003, they started to design it, and they realized that it's not just scaling up Hubble, we need technological breakthroughs: the foldable mirrors, precise control of the temperature, the unfurling of the heat shields, and so on," said Barron. Over the next four years of high-level design, NASA moved the budget to $3.5 billion and planned on another billion for a decade of operations.
Then between 2007 and 2021, NASA dove into the design, build and test phase of what was named the James Webb Space Telescope.
"Like good SREs we test and, because we have ten technological breakthroughs that we need to achieve, we have a lot of failures," Barron said. "So we retest and fail, and retest and fail. And this takes a lot of time, and the project is nearly canceled many times. And eventually it costs $9.5 billion just to build it. And that $1 billion that we thought would be enough to operate for 10 years is only going to be enough to operate for five years."
All things considered, the JWST was launched in December of last year, kicking off its operation and what Barron referred to as "pirouetting and ballet moves" through space.
"You can see that over a period of 13 days the telescope, like a butterfly, opens up, spreads its wings, and starts reporting home. And then starts going further away from Earth until it reaches the location where it will remain for the next decade," he explained. This journey took a total of 30 days.
As of the WTF is SRE event, where Barron spoke at the end of April, the JWST was considered mid-deployment, before reaching production: "We're doing the final tests before we can say that the system is working and can start giving actual scientific data."
During this deployment phase, so many components were moving and changing that it exposed many points of failure: 344, to be exact.
"Webb is famous for having over 300 single points of failure during this process of 30 days, each of which has to go perfectly, each of which, if it fails, the entire telescope will not be able to function," Barron explained.
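To get a feel for why a chain of single points of failure is so demanding, here is a minimal Python sketch of the arithmetic. It assumes every step is independent and equally reliable, which is a simplification and not NASA's actual reliability model; the per-step success rates are hypothetical, and only the count of 344 comes from the talk.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step in a chain of independent steps succeeds."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    return per_step_success ** steps


def required_per_step_success(target: float, steps: int) -> float:
    """Per-step success rate needed to reach an overall target across all steps."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    return target ** (1.0 / steps)


SINGLE_POINTS_OF_FAILURE = 344  # figure quoted in the article

# Hypothetical per-step reliabilities, chosen only for illustration.
for p in (0.99, 0.999, 0.9999):
    overall = chain_success_probability(p, SINGLE_POINTS_OF_FAILURE)
    print(f"per-step {p:.4f} -> overall {overall:.3f}")

needed = required_per_step_success(0.99, SINGLE_POINTS_OF_FAILURE)
print(f"per-step rate needed for 99% overall: {needed:.6f}")
```

Under these assumptions, even a 99.9% success rate per step leaves roughly a 71% chance that the whole sequence works, which is one way to see why each step had to be tested so heavily on the ground.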
When those first exceptional photos came back, discovering new, fainter galaxies, was it luck or a feat of extreme site reliability engineering?
"How did NASA reach the point where they could send $10 billion worth of satellite out into space without being able to fix anything, without being able to reach out with an astronaut to say, 'Oh, I need to move something, I need to restart something, I need to do something manual'? How can the system be completely fully automated? And can I trust that no dragons will come from outer space and do something to the telescope which will cause it to fail?"
Robert Barron @FlyingBarron
You could say this is more than a leap of faith. That trust that NASA had in all this working properly, Barron believes, comes from its decades-long history of sending spacecraft into space, which is grounded in the values of:
Both the Voyager spacecraft that went to Jupiter, Saturn, Uranus and Neptune and the Mars rovers were actually sets of identical twin craft, in case one failed. Similarly, constellations of satellites work in tandem as fail-safes. This redundancy has long been embraced by NASA, but it wasn't an option at the JWST's price tag.
When redundancy is out, NASA next reaches for repairability. The Hubble Telescope has been repaired and upgraded multiple times for both fixes and preventive maintenance. And, according to Barron, 50% of the astronaut time on the International Space Station is actually spent on toil.
"If the astronauts left the International Space Station, then, in a very short period of time, it would just break down and they'd be forced to send it back down into the atmosphere to burn up," he explained.
But, again, the non-functional requirement of repairability was also not an option for the Webb Telescope because it is floating far beyond the current capability of astronauts.
So the next step toward reliability came from building the JWST with a component architecture.
Barron went through a brief history of the Space Race between the Soviet Union and the U.S. from 1960 to 1988. He uncovered the pattern that redundancy didn't actually matter much because both craft shared the same failure modes each time, such as an alloy that wasn't durable enough or a launch that took place during a sandstorm. He did note that the Soviet space program chose not to publish its mistakes, so it was less likely than NASA to learn from them.
"Redundancy is very good, but sometimes at a system level, it doesn't solve a problem because the problem is much wider," which Barron said happens to SREs as well. Kubernetes, for example, has componentization, redundancy and load balancing built in, but that doesn't matter if the problem is with the DNS or an application bug. Often reliability demands more than simple redundancy.
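The point about shared failure modes can be sketched with a small calculation. The model below is an illustrative assumption, not something from Barron's talk: each replica fails independently with probability `p_independent`, and a single common-cause fault (think DNS, or a bug shipped to every replica) takes down all replicas with probability `p_common`. All numbers are hypothetical.

```python
def system_failure_probability(p_independent: float, p_common: float, replicas: int) -> float:
    """Chance the whole system fails under a simple common-cause model.

    The system fails if the shared fault occurs, or if the shared fault
    does not occur but every replica fails independently.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    all_replicas_fail = p_independent ** replicas
    return p_common + (1.0 - p_common) * all_replicas_fail


# Hypothetical values: 5% independent failure per replica, 1% shared fault.
P_INDEPENDENT = 0.05
P_COMMON = 0.01

for n in (1, 2, 3, 5):
    risk = system_failure_probability(P_INDEPENDENT, P_COMMON, n)
    print(f"{n} replica(s): failure probability {risk:.6f}")
```

In this model, adding replicas quickly drives down the independent part of the risk, but the total never drops below `p_common`. That floor is the "much wider" problem that more copies of the same thing cannot fix.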
The monolithic Hubble was designed from the start with repairability and upgradeability in mind. With repairability out of the picture, Webb needed a lot more testing than Hubble for each single point of failure. It also leaned on components: each mirror segment, for example, was a smaller component that could be realigned remotely. Barron likened this to Kubernetes, where you want to allocate the right amount of CPU, memory and other resources to each and every microservice.
In fact, Webb made some observability trade-offs: it could carry only so many "selfie cameras" to observe its own condition, because adding more could affect the temperature and alter its observations.
There's no doubt that the James Webb Space Telescope SRE strategy carries higher stakes than any enacted on Earth. It still makes for a fantastic example of how site reliability engineering and observability needs vary with circumstances, and of how sometimes chaos engineering can only be performed before a system goes into production.
Barron observed some of the JWST's SRE strategy:
The JWST experiment is also a good reminder that, with lower stakes than NASA, a much more frequent cadence of smaller deployments, and less than 100% uptime required, you can experiment more with redundancy, repairability and reliability to continuously improve your systems. Ideally, under significantly less pressure.
"As SREs, we don't want to aim for 100% availability. We want the right amount of availability, and we don't want to overspend either resources or budget in order to get there. We don't want to embrace too many new technologies for new products," Barron said. "A lot of the lessons from Webb are what not to do."
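One common way SRE teams put a number on "the right amount of availability" is an error budget: the downtime a service-level objective (SLO) still permits over a given window. The sketch below is a generic illustration, not something from Barron's talk; the SLO targets and the 30-day window are hypothetical.

```python
def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO permits over the window."""
    if not 0.0 <= slo <= 1.0:
        raise ValueError("slo must be between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


# Hypothetical availability targets over a 30-day window.
for slo in (0.99, 0.999, 0.9999, 1.0):
    budget = allowed_downtime_minutes(slo)
    print(f"SLO {slo:.2%}: {budget:.1f} minutes of downtime allowed")
```

A 100% target leaves no budget at all, so every deployment becomes a Webb-style one-shot. Anything below that buys room to ship, fail and fix.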
Disclosure: The author of this article was a host of the WTF is SRE conference.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Saturn.
Follow this link:
James Webb Space Telescope and 344 Single Points of Failure - thenewstack.io