grid.org/United Devices Newsbeat

Tuesday, February 28, 2006

Status update Feb 27

Dear Members,

Here is the weekly status of Grid.org.

Outage: There was a temporary outage over the weekend that caused connectivity problems with Grid.org servers. This was caused by a router failure in our datacenter and was unanticipated. The problem has since been rectified and everything should be functioning normally.

Cancer data: I have uploaded a new chunk of data and have sent the last result set to Oxford for analysis. Some members have noticed that this batch produces an awful lot of hits. I am asking whether this is to be expected and whether it could in any way be related to the occasional workunit aborts that some members are seeing.

Thank you for your contribution.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Tuesday, February 21, 2006

Cancer/general status update

Dear Members,

I have uploaded another batch of cancer data. The last job will remain active for a few days to give members time to receive credit for any outstanding workunits. Since these new workunits are processing so quickly, I do not think we need to wait a whole week.

Thank you for your contribution.
_________________
Robby Brewer
Senior Support Engineer
United Devices


Dear Members,

Here is the weekly status:

Cancer Data - We continue to have an issue with occasional aborts that is still under investigation. As I mentioned in the Member to Member Support thread, this issue has been identified as a problem with the Ligandfit application itself crashing. This has nothing to do with the UD servers dropping results or not giving due credit. The Ligandfit application has not changed in a very long time, since before the last batch of cancer data, which received no complaints. This would point to something in the new data that is causing Ligandfit to be unstable.

Note that we are receiving successful results from each and every workunit so there are not "bad workunits" that will always fail. Some members have stated that retrying a workunit that just aborted will result in a successful completion.

Although the lost points are a bit frustrating, know that we are getting successful results that can be returned to Oxford to assist in the search for the cure for cancer, which is the ultimate objective of this project.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Friday, February 03, 2006

New Cancer data available

Dear Members,

I have just uploaded a portion of the latest data I received from Oxford. My internal tests looked good: the workunits could be processed. We will let this job run for a while before shipping the results back to Oxford for verification. Note that the previous cancer job is still running as well, so it is up to the dispatcher whether you get one of the new workunits. As soon as I see a few successful results, I will stop the previous job from dispatching so that everyone will be crunching the new data.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Tuesday, January 03, 2006

Update

Dear Members,

I hope everyone had a happy new year. Here is the latest status:

Grid.org Outage - As I posted previously, we had a temporary outage due to a blown City of Austin transformer (a rogue squirrel is suspected). This outage should not have required any reregistrations. There was a temporary period where devices may have been "Backing off..." due to a high load after the servers became available, but this should no longer be occurring.

New Cancer Data - A small batch of the new cancer data has been uploaded and members are currently crunching away at it. This is a very small subset of the data we received, ~20 WUs, just so we can verify that we are getting the results we expect. I plan on sending some of the results to our Oxford contact today to verify these are the results they expect. Once that is confirmed, I will upload more data. A few members have complained of occasional lost results. I am not yet sure how widespread this is, nor what is causing the loss. Other members have reported complete success, so we will need to investigate this more closely. Note that the old cancer job has been disabled, so if you are working on the cancer project, you are crunching the new data.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Wednesday, December 21, 2005

Cancer data progress

Dear Members,

We have made some great progress on getting the new cancer job ready. There was an issue with the format of the data we received from Oxford and we believe that we have resolved this. I have been able to get successful cancer runs using the new data on an internal system. Hopefully we will be able to get some of the data loaded onto grid.org today or tomorrow.

In the past we have utilized the beta server and the beta tester team to help verify that things are working correctly before moving to grid.org. Due to hardware issues this is not going to be possible this time. If you are one of the beta tester members please do not be offended at this. We plan to reinstate the beta tester group when we migrate to the new system. I know everyone is anxious about the cancer data so we will not wait.

What this means is that it is possible that there may be a problem with the job when we submit it. I will be keeping an eye on the system and will be able to flip back to the current job quickly if necessary. Since the current job is basically stale data anyway this should not upset anyone. I will post again once the new job is submitted.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Friday, November 18, 2005

Scheduled outage

Dear Members,

This is a reminder that Grid.org will be physically moving to a new facility this weekend. This will provide better reliability and is part of the Grid.org migration. Grid.org must be completely powered down in order to move. This will include the forum, member web, and the grid itself.

Since this is not a trivial outage, we cannot give an exact time that Grid.org will be down. The outage will take as long as it takes and it may be a day or two before everything is back online.
_________________
Robby Brewer
Senior Support Engineer
United Devices

Tuesday, November 15, 2005

Status update Nov 15

Dear members,

Here is the weekly status...

Grid.org upgrade: Grid.org will be physically moving to a new facility in the next week. This will provide better reliability and is part of the Grid.org migration. There was a small outage this weekend due to some hardware maintenance that was performed in preparation for the facility move. Please note that next weekend there will be some additional outages. Grid.org must be completely powered down in order to move. This will include the forum, member web, and the grid itself. Since this is not a trivial outage, we cannot give an exact time that Grid.org will be down. The outage will take as long as it takes.

Rosetta: We have received a new batch of Rosetta data that we are currently uploading. Since the last batch is taking so long to process, we will continue to allow results to be uploaded for a while. Hopefully this new batch will be more in line with the previous batches.

New cancer data: Some progress has been made. The workunit processes without error and does produce limited output. We have an engineer working with this to make sure we are getting the results we expect.

Thanks to everyone for your contribution.
_________________
Robby Brewer
Senior Support Engineer
United Devices