ACADEMIC COMPUTING and COMMUNICATIONS CENTER
 

News - Sep 2006

   
 
     
Change in ACCCeSS Helpdesk West Campus Location
 

Sep 29 2006 - The ACCCeSS Helpdesk West Campus location has changed, effective immediately. We've moved next door into PSR 210 (instead of PSR 212) on Wednesdays and Thursdays.

To make sure you have everything we need to help you, please visit our website at: http://acccess.accc.uic.edu

 
     
router reload problem
 

Sep 27 2006 - We're having a router reload problem that is affecting several servers and services. Working on it.

 
     
Mailserv - problem on one of the back-end servers
 

Sep 22 2006 - We have a disk array problem on one of the mailserv back-end servers. Working on it. Some mailserv users may not be able to access their mail.

 
     
ACCCeSS Helpdesk Removes South Campus Location
 

Sep 21 2006 - The ACCCeSS Helpdesk has removed the South Campus site from its available locations. All South Campus clients should instead come to the East Campus location during the days and times found on our website at http://acccess.accc.uic.edu

We apologize for any inconvenience this change has caused.

 
     
Tigger is Back
 

Sep 20 2006 - Tigger has been up since roughly 1AM, and so far appears normal. Even if it is now stable, and the signs are good, there will still be a bit of fallout to clean up. (Not to mention, deferred sleep compensation for the programmers working on this for 2 days.)

We now believe the original problem was due to a broken index on some key system files, which caused the system to slow to a crawl. A subsequent crash managed to corrupt some other files, and that corruption caused hard-to-diagnose problems on Tue.

We've restored an overall system configuration from Sept 11, and will adjust the differences manually. (User files are unaffected by this restore. They should be up to date.) So there may be a little fallout from this adjustment, but things are looking good right now.

Please report any further problems to consult@uic.edu

 
     
Tigger mostly back
 

Sep 19 2006 - Tigger is mostly back and available. Mail seems to be flowing. However, we still have some problems with our database, and this will affect the ability of the CSO to fix account issues, as well as various web pages including password changes, account creation, and so on.

 
     
Tigger -- downtime Tue night
 

Sep 19 2006 - 1:30pm We now anticipate tigger being down Tue night, starting about 5pm for more troubleshooting.

 
     
Tigger -- continuing issues
 

Sep 19 2006 - 12:30 I may have spoken too soon. The fixes we applied yesterday helped a lot, and when tigger is up most things do seem to work. But we still have trouble with the database, and with overall stability. There may be more unscheduled reboots during the day, as we continue troubleshooting.

 
     
tigger up but still unstable - logins and web pages slow
 

Sep 18 2006 - NOTE: Please be patient while your ssh or telnet session starts - you should get in eventually.

Tigger was down much of the night and we are still experiencing various performance problems, primarily slow logins and web pages. We're still trying to track down the root cause and will post here when we know more. Here is a posting from the sysadmin who stayed up all night with the problem:

---

Service Express (our hardware maintenance provider) was here, and we fiddled around in diag for quite a long time. What he saw was a duplicate entry for one particular memory SIMM. He deleted it, and marked the memory "repaired" in diag. Then we rebooted, and it came up normally!

There had been a message upon some of the earlier reboots that made me suspicious of microcode, so I went to IBM's microcode site, and I think I found the smoking gun, in the description for a new microcode version 3K060626 for the model 7038-6M2 computer which was just released a couple weeks ago on August 21, 2006:

"A problem was fixed that was causing enhanced error handling (EEH) error codes to be erroneously generated when certain adapter card configurations were heavily stressed by the application code."

(I wish I could find out what those "certain" adapter card configurations are!)

So, our conclusion is that we hit a rare microcode bug, and that what he did to clear it did indeed solve the problem, but that we need to schedule a microcode update for tigger real soon or it could happen again.

This was one of the most difficult-to-solve problems I have ever encountered, and I'm still not confident we identified what was broken or how what we did fixed it. I'm not even very sure whether it was hardware, software, or microcode.

 
     
tigger up again - we're very sorry for the lost day
 

Sep 18 2006 - Tigger is up again at 9:45pm

We're very tired now, so just a brief summary for now.

Sometime Sunday afternoon, something (we still do not know what) caused tigger logins to be very slow and/or hang. We spent the night running diagnostics on the hardware because of an error message that eventually was found to be a red herring. The bottom line is that the problem was caused by a very obscure bug in an IBM program that makes indexes that are supposed to improve the speed of password authentication on very large systems (tigger has 10,000 user accounts).

The bug has apparently been present since we upgraded to AIX 5.3 last summer (this may also explain why tigger performance has not been good since the semester start). After spending an entire day on the phone with IBM level 1 support, we discovered and applied a patch to bring us from AIX 5.3.0.4 to 5.3.0.5. It turns out IBM discovered the bug in August and just came up with a patch about 2 weeks ago.

We sincerely apologize for the long outage and any related problems caused by this event.

 
     
Blackboard
 

Sep 18 2006 - Blackboard is up and running. The system upgrades done 09/18/2006 were completed at 11:30am, and the system has been performing very well since the upgrade.

 
     
tigger down
 

Sep 17 2006 - Tigger crashed Sunday evening. Unfortunately, it failed to fully reboot. We have no diagnosis or prognosis yet.

 
     
blackboard down for upgrade
 

Sep 16 2006 - Blackboard will be down from 8 PM Sat. Sep. 16 to Noon Sun. Sep. 17 for a hardware upgrade. We hope to be done before Noon but we are allowing extra time in case problems occur. This upgrade will move the Blackboard application to a server which is much faster than the current server.

 
     
Authentication to dialups, resnet, wireless and VPN is down
 

Sep 10 2006 - We are again experiencing the problems with authentication to dialups, wireless, resnet and VPN that we had two weeks ago. We do not have an estimate as to when this problem will be fixed.

Please note that all ACCC public labs are up and running at this time.

 
     
Authentication for dialups, resnet, wireless and VPN is now up
 

Sep 10 2006 - Authentication for dialups, resnet, wireless and VPN came back up at 4:30pm. This service was down between 8am and 4:30pm, Sunday, 9/10/2006.

 
     
BLACKBOARD DOWN FOR MAINTENANCE
 

Sep 04 2006 - Blackboard is down for emergency maintenance to implement recently suggested configuration changes from Blackboard Inc. support and to add additional server hardware.

We apologize for the lack of warning and hope to be back up around 2am.

 
     
Blackboard's performance
 

Sep 01 2006 - We have significantly improved Blackboard's performance, but it is not back to normal yet. It is an intermittent problem that manifests itself only sometimes, when lots of users are on the system. Several people from ACCC and from Blackboard Inc. continue to work on its resolution.

 


   JGS