Event id 1006.. No BSOD, but hard reset is the only way out

ipguru99

Member
Joined
Jan 23, 2009
Messages
1
I guess my next choice is to call MS, but that is such a crapshoot. ;-) So here goes..

Hello All!

I have a customer with an odd issue that took us a while to catch. It is easily worked around by rebooting the TS farm twice a week (99% of the time this works). It is a brand new Windows 2008 farm. In the month or so of testing, they didn't like the Session Broker stuff, so they went with 2X (great software, and WAY cheaper than Citrix). Out of nowhere, we started noticing the farm would lock up. They would reboot the servers and everything would be fine. Everyone initially assumed 2X, but we were actually at their company holiday party when it happened, and I had to tether my phone to my laptop to get in remotely (yes, alcohol was a factor!) and try to restart the servers. It was then that I realized the guys in their shop always had to HARD reset one of the servers each time. There are 3 servers, and each time this happens only one of them needs a hard reset, but which one is totally random; no one server does it more than the others.

I have now seen it happen several times myself, and basically RDP stops working. If you go to the console of the server that you can't RDP to, you can't log in and you can't do anything, but it isn't a BSOD. If you go into the 2X console (running on a different box) and take the offending server out of the mix, the farm works just fine. After you hard reset the server, everything is fine. I started going through the logs and found about 15 instances of event ID 1006, which says the server is under attack. These servers are behind an ASA and don't see the light of day as far as the Internet goes, so I am pretty sure that isn't it.
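To see whether the 1006 events cluster on the server that ends up needing the hard reset, the counts could be tallied per server. What follows is only a minimal sketch, assuming the System log from each box has been exported to CSV. The column names (`server`, `event_id`, `time`), the server names TS1 to TS3, and the sample rows are all made up for illustration and are not taken from the real logs.

```python
import csv
import io
from collections import Counter

def count_event(csv_text, event_id=1006):
    """Count rows whose event_id matches, grouped by the 'server' column.

    The column names are hypothetical; adjust them to the real export.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["event_id"].strip() == str(event_id):
            counts[row["server"].strip()] += 1
    return counts

# Made-up sample rows, only to show the expected input shape.
sample = """server,event_id,time
TS1,1006,2009-01-20 02:14
TS1,1006,2009-01-20 02:15
TS2,7036,2009-01-20 02:16
TS3,1006,2009-01-21 11:02
"""

for server, n in sorted(count_event(sample).items()):
    print(server, n)
```

If one server shows a burst of 1006 events shortly before each lockup, that narrows down which box to watch before it hangs.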

I have looked for a little bit before I stumbled on this:
Citrix Forums : The terminal server received large ...

It isn't 2X (the guys in the post above are running Citrix). It isn't HP (some of the servers are IBM). It isn't Broadcom (some of the guys are using Intel NICs).

I have now found several posts like this (the last response in that post was on 1-17-09, so it is recent), and I didn't want to just add a "Yeah, I am having the same problem" reply. With every other product I use, I use the forums. This is my first post here because the Windows 2008 product (although I am not using much of it) has seemed pretty solid to me.

Thanks for listening. We are content to reboot 2 times a week, but it is a 24 hour shop, so either they all reboot on Wednesday and Saturday, or we stagger the reboots. Either way, it doesn't look right to the powers that be that all this brand new equipment (HP DL360s connecting to a Cisco 2960G) needs regular reboots. I have switched NICs and switch ports. I even have a 3Com gig switch in place now, just to rule stuff out. There are about 100 BosaNova winterms running. They do time out at 2 minutes, re-initialize, and then sit with the login screen ready to go.
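For the staggered option, the idea is that only one of the 3 servers is down at any given time. This is a rough sketch of how the slots could be laid out, assuming a 03:00 start and a 30 minute gap between servers. Both values, and the server names, are placeholders rather than anything decided for this farm.

```python
def stagger(servers, days, start_hour=3, gap_minutes=30):
    """Return (day, 'HH:MM', server) tuples, one reboot slot per server per day.

    Each server is offset from the previous one by gap_minutes so that
    reboots never overlap. start_hour and gap_minutes are assumed values.
    """
    schedule = []
    for day in days:
        for i, server in enumerate(servers):
            minutes = start_hour * 60 + i * gap_minutes
            slot = f"{minutes // 60:02d}:{minutes % 60:02d}"
            schedule.append((day, slot, server))
    return schedule

# Hypothetical server names; the reboot days are the ones mentioned above.
for entry in stagger(["TS1", "TS2", "TS3"], ["Wed", "Sat"]):
    print(entry)
```

The gap would need to be longer than a full reboot takes, so the farm never has two servers out at once.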

Thanks!
 
There are a few things that could be causing this: the connection to the DC dying, NIC drivers causing the problem, or user credentials becoming corrupt.

Are there any error codes associated with the event id?

That could help to pinpoint the exact problem.
 