Monday, October 19, 2020

Linden Lab: "We Apologize For Recent Service Disruptions"


Glitches are more or less a part of the Second Life experience. But if it seems they've been getting worse lately, it's not your imagination, and even Linden Lab is admitting things are buggier than usual. In the Tools and Technology blog on Friday, October 16, Oz Linden stated in the "Uplift Update" that while there's been overall improvement in the Second Life infrastructure, "it’s the bumps in the road that are most noticeable to our residents. We apologize for recent service disruptions ..." He went on to point out several "rough spots."

Region Crossings
One of the first troubles we found was that region crossings were significantly worse between a cloud region and a datacenter region. We did a deep dive into the code for objects (boats, cars, planes, etc) and produced an improvement that made them significantly faster and more reliable even within the datacenter. This has been applied to all regions already and was a good step forward.

Group Chat stalls
Many users have reported that they are not able to get messages in some of their groups; we're very much aware of the problem. The start of those problems does coincide with when the chat service was uplifted; unfortunately the problems did not become clear until moving that service back to the datacenter was not an option. We haven’t been able to get that fixed as quickly as we would like, but the good news is that we have some changes nearly ready that we think may improve the service and will certainly provide us with better information to diagnose it if it isn't fixed. Those changes are live on the Beta grid now and should move to the main grid very soon.

Bake Failures
Wednesday and especially Thursday of this past week were bad days for avatar appearance, and we're very much aware of how important that is. The avatar bake service has actually been uplifted for some time - it wasn't moving it that caused the problem, but another change to a related service. The good news is that thanks to a great cross-team effort during those two days we were able to determine why an apparently unrelated simulator update triggered the problem and got a fix deployed Thursday night.

Increased Teleport Failures
We have seen a slight increase in the frequency of teleport failures. I know that if it's happened to you it probably doesn't feel like a "slight" problem, especially since it appears to be true that if it's happened to someone once, it tends to keep happening for a while. Measured over the entire grid, it's just under two percentage points, but even that is unacceptable. We're less sure of the specific causes for this (including whether or not it's Uplift related), but are improving our ability to collect data on it and are very much focused on finding and fixing the problem whatever it is.

Marketplace & Stipend Glitches
We've had some challenges related to uplift for both the Marketplace and the service that pays Premium Stipends. Marketplace had to be returned to the datacenter yesterday, but we'll correct the problems that required the rollback and get it done soon. The Stipends issues were both good and bad for users; there were some delays, but on the other hand we sent some users extra stipends (our fault, you win - we aren't taking them back); those problems are, we believe, solved now.


So why all of the glitches and bugs? Oz called the Uplift, the project to move Second Life's data to cloud servers, "a massive, complicated project that I've previously compared to converting a steam-driven railroad to a maglev monorail -- without ever stopping the train." He later tried to reassure residents: "While this week in particular has seen some bumps in the road, it's actually going well overall. Lots of the infrastructure you don't interact with directly, and some you do, has been uplifted and has worked smoothly." He added that almost all of the Beta Grid's sims were in the Cloud, and "we've uplifted around a hundred regions on the main grid. Performance of those regions has been very very good, and stability has been excellent." More sims would be uplifted over the weekend, he said, and "Uplift of the Release Candidate regions, which will bring the count into the thousands, will begin soon."

Oz said the Uplift to the Cloud should be complete, or at least close to it, by the end of the year.

To read the blog entry in full, Click Here.

Bixyl Shuftan
