Eliminate the dreaded "no suitable nodes" error
I am running a number of Drupal sites that have a number of modules and correspondingly many database tables. The problem is that whenever I clear the cache or visit a heavier and/or uncached page (a typical example is the list of modules available to admin; very long and uncached), your load balancer does not get a reply from the database and cuts off the connection after 30-60 seconds. This produces the dreaded "Unfortunately there were no suitable nodes available to serve this request." After two years of discussing this with your techs, I have actually started moving my sites to Amazon's EC2, because on some Drupal sites I was getting the error message more often than the expected pages. Please note: moving to Cloud Servers is not a solution. MediaTemple and other shared hosts manage these Drupal sites fine. The problem is a) too short a time before your balancer gives up and/or b) your system processing the request too slowly. This message on the suggestions site is just about my last effort to call for improvement because, as I said, I have started to give up on Rackspace because of this very problem.
Many steps have been taken to improve this issue internally. While it will not be possible to avoid it completely because of the current shared architecture, the issue should occur less often.
Improvements Rackspace has made to address this include updates to the Linux kernel for improved speed and stability, upgrades to caching infrastructure, increased PHP memory allocation, and improved PHP script and session state storage performance. Page loads that exceed 30 seconds will likely continue to trigger a timeout error.
We will continue to monitor this item and make improvements as best we can going forward.
65 comments
-
Hubert Nguyen commented
To Rackspace's credit, I have run some tests today, and although things are not perfect, they did get a lot better in the past couple of months. I had previously run into a lot of disk IO issues (CloudSites seems to be using a SAN or some kind of shared network storage), and that would make things like WordPress updates (core or big plug-ins) fail on large/busy sites. Now, maintenance stuff runs much more smoothly, even if I came close to the 30-sec limit a couple of times.
-
Helmut commented
I don't think the word 'completed' fits what has happened here. Perhaps they need an 'Unable to do it' option or perhaps a 'Swept under the rug' option. This should be kept open.
-
Hubert Nguyen commented
Vacilando, the cookie thing may work for Drupal (I'm not an expert), but I do know that WordPress is a cookie monster. Also, the load-balancer itself has a tendency to attach cookies to just about anything, including static files like images etc... so I'm not sure if that's a viable option for most. I understand that one can be very frustrated by the "satisfy every request" promise.
Coming back to the time-out issue, I think that -in theory-, a load balancer could have no time-out, but I guess that Rackspace would need to shed some light as to why extending the time-out (possibly indefinitely) is not possible.
AWS is quite amazing, I spent quite some time playing with it. However, it is fair to say that the big difference is that it is not a "managed" service (well, RDS and Elasticache are!), and that you're pretty much on your own if your setup has issues. If you're a developer, it's not so bad, but I'm a publisher, so I'd rather outsource this to someone else.
I'm with you on the backups. I'm not sure what the hold-up is on that one. It seems that incremental backups is an issue that has been solved... but again, Rackspace should shed some light on this one. In the meantime, I'm using Amazon Route 53 to be able to quickly switch my sites between Rackspace and Amazon... yet another great feature from AWS.
-
Vacilando commented
Indeed, Hubert, I utterly don't understand why the load balancer timeout cannot be extended. At least Rackspace could check whether the request has a cookie attached and allow more processing time for such (as opposed to anonymous) requests.
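A minimal sketch of the cookie check suggested above, not a description of how the Rackspace balancer actually works. The function name `pick_timeout`, the plain-dict header format and both timeout values are hypothetical. It assumes Drupal's usual session cookie naming (names starting with `SESS`, or `SSESS` over HTTPS), so that cookies attached by the balancer itself do not count as a logged-in session.

```python
from http.cookies import SimpleCookie

# Hypothetical limits: the thread reports a 30-60 second cutoff; the longer
# value for logged-in requests is an assumption for illustration only.
ANONYMOUS_TIMEOUT_S = 30
SESSION_TIMEOUT_S = 120

# Assumed Drupal session cookie prefixes (plain HTTP and HTTPS).
SESSION_PREFIXES = ("SESS", "SSESS")


def pick_timeout(headers: dict[str, str]) -> int:
    """Return the backend timeout in seconds for one request.

    `headers` is assumed to be a plain dict of request headers with a
    "Cookie" key when the client sent cookies.
    """
    cookies = SimpleCookie()
    cookies.load(headers.get("Cookie", ""))
    # Only a session cookie earns the longer limit; other cookies are ignored.
    has_session = any(name.startswith(SESSION_PREFIXES) for name in cookies)
    return SESSION_TIMEOUT_S if has_session else ANONYMOUS_TIMEOUT_S
```

With these assumed values, an anonymous visitor keeps the short limit while a logged-in admin loading the module list gets more time before the balancer gives up.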
Hubert, I don't want to name a time that would be "good enough". Mosso, I mean Rackspace Cloud Sites, promised to "satisfy every request", whatever the popularity of the site. That was why I went with them. I got badly burned by the fact that, while they may serve many requests, they do it at the cost of the heavier calls -- and that is not fair.
After 4 years waiting for a solution, I really don't think I should hope for a change anymore. There is no life in the original project, no new features, too little space, and huge performance issues such as this one.
All my serious websites are on Amazon AWS and Rackspace Cloud Sites hosts mostly just lightweight, less important sites. Once they finally approve backup and migration (another issue opened for years on this Q&A site) I will quickly and happily migrate away and never look back.
-
Mark Schlaudraff commented
Mr. Spencer: Thank you for supplying news about this, however I have a question... This is happening on asp.net code as well and you only speak about PHP.
Please supply details on how this has been fixed for asp.net issues.
-
Hubert Nguyen commented
To Vacilando's point: it's true that the issue is still there, so "closing" this thread is probably not the response that those who face this issue want to see.
Chris, I'm not sure why the load-balancer timeout can't be extended (other than that it may require more available Apache nodes, which means more costs), but speed could be further improved by having managed Memcached nodes (I'll gladly pay for this). For unmanaged Memcached nodes, we would need access to CloudServers from the same internal network as CloudSites.
Vacilando, I think that short of removing the load-balancers, there is no definitive solution to this. What kind of time-out value would be "good enough" to you?
For admin tasks, I did propose to Rackspace that one admin IP should be re-routed to another load-balancer with a much longer time-out. I'm not sure that it's completely possible, but at least, for updates and long DB processes on the admin side, it would alleviate this problem. Good luck to all.
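The admin-IP idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not Rackspace's setup: the pool names, the timeout values and the address `203.0.113.10` (taken from a documentation-only range) are all hypothetical.

```python
import ipaddress

# Hypothetical configuration: one registered admin address per account.
ADMIN_NETWORKS = [ipaddress.ip_network("203.0.113.10/32")]

# Hypothetical balancer pools; only the 30-second figure comes from the thread.
POOLS = {
    "default": {"timeout_s": 30},
    "admin": {"timeout_s": 300},
}


def choose_pool(client_ip: str) -> str:
    """Return the name of the pool that should handle this client."""
    addr = ipaddress.ip_address(client_ip)
    # Requests from a registered admin address go to the long-timeout pool.
    if any(addr in net for net in ADMIN_NETWORKS):
        return "admin"
    return "default"
```

Regular visitors would stay on the default pool, so long updates and database jobs from the admin address would not tie up nodes serving public traffic.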
-
Vacilando commented
Full of hope, I immediately tried to load the module list of my Drupal site. What was the response? Guess:
Unavailable
Site temporarily unavailable.
Connection timed out - please try again.
Perhaps some make-up has been applied to the site, but there is no proper solution.
This is just way too sad.
I am sorry, but my gut feeling is that closing this issue is just an attempt to stop the uncomfortable criticism piling up.
-
Hubert Nguyen commented
Thanks for the update Chris. We'll take any improvements we can get on this one.
-
Danny Copeland commented
May I be the first to say..."Thank you, gang."
Noticed a while back, while updating a client's site, that things had changed and I was receiving the error much less frequently. Took a look now and am not seeing any performance issues on huge page loads any longer. Big thanks go out to all who put in hours (hard ones, I'm sure) on this.
-
Mark Koh commented
Jennifer F, where are you moving to?
-
Jennifer F commented
I am having the same problem and will be moving ALL of my sites as well.
-
Matthew commented
What we have here is a good old fashioned DIRTY BUSINESS PLAN!!!
-
Matthew commented
Gotta love Rackspace's response... "This error is just too broad to really pinpoint... however, you could sign up for our Cloud Servers to fix the problem"
Seems as though this is their way of pushing out the old Mosso Customers and forcing them into higher cost plans. When we signed up for Mosso, this was the perfect solution. No Thanks for shoving this issue off to the corner Rackspace!
-
Erick Baum commented
What makes it worse is that Wordpress and Joomla sites run like **** on CloudSites which causes more frequent No Suitable Nodes errors. Rackspace blamed our sites and their configuration. However, we've been moving to a dedicated cloud server (not Rackspace) and the exact same Wordpress and Joomla sites run like a dream. I don't think CloudSites is as powerful as they make it seem.
-
Geoff Sharp commented
**** you Rackspace you terds. You don't even bother checking this website or offer any ******* updates! I'm ******* moving my sites as we speak *******.
You're the only host with this ******* error yet you blame Wordpress? WTF? I'm moving my websites out of here as we speak. 6 months of dealing with this ****, my clients are ****** at me!
Get your head out of your ******* arses!
-
Mark Schlaudraff commented
Rackspace... This has had a status of 'Under Review' for over 4 months now. What's the deal?
-
Renic commented
I ran into this problem with my company's site at work. It was very disappointing and may force me to move the site.
-
Steve Holland commented
I was just about to sign up for Rackspace Cloud Sites until I came across this website. I won't be now until this issue and the email limitations are fixed. This is really disappointing.
-
hc commented
My client's website just went down because of this, and I lost the client to another hosting company.
-
Matt commented
Acgann,
Yeah they do, I'm going to take a look at what would be the best bang for my buck… maybe I'll just give CloudSites a shot or do the $2/mailbox for my email hosting and look at spinning up a Linux and Windows cloud server here or at EC2.
Just offloading the email at this point would be a big improvement.
Thanks for the help… really glad I found this thread.
