11-Sep-2007

Yesterday’s trouble
turned out to be followed by a system crash that, for some reason, I didn’t notice until this morning, when I went looking for yesterday’s operator logs:

OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text
OPERATOR_done1397.txt 10-Sep-2007 21:59 76,238 plain text

A double operator log on one date?
This is typically a sign of a reboot. It may be related to my restart of all web-related services, where I mistakenly ran DIANA_STARTUP.COM (normally run in batch at startup), and to stopping and starting the web server components.
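To spot such double logs without eyeballing the directory listing, something like the following could be used. It is only a minimal sketch: it assumes the listing lines look exactly like the two shown above, and the function names (`logs_by_date`, `possible_reboot_dates`) are made up for this illustration. Two logs on one date is a hint of a reboot, not proof.

```python
import re
from collections import defaultdict

# Matches listing lines in the format shown above (an assumption), e.g.
#   OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text
LISTING_RE = re.compile(
    r"^(?P<name>OPERATOR_done\d+\.txt)\s+"
    r"(?P<date>\d{1,2}-[A-Za-z]{3}-\d{4})\s+"
    r"(?P<time>\d{2}:\d{2})"
)


def logs_by_date(listing):
    """Group operator log file names by the date shown in the listing."""
    grouped = defaultdict(list)
    for line in listing.splitlines():
        match = LISTING_RE.match(line.strip())
        if match:
            grouped[match.group("date")].append(match.group("name"))
    return dict(grouped)


def possible_reboot_dates(listing):
    """Return only the dates that have more than one operator log."""
    return {
        date: names
        for date, names in logs_by_date(listing).items()
        if len(names) > 1
    }


listing = """\
OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text
OPERATOR_done1397.txt 10-Sep-2007 21:59 76,238 plain text
"""
print(possible_reboot_dates(listing))
```

Lines that do not match the pattern are simply skipped, so the rest of a directory listing can be fed in unchanged.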

The system went down more or less smoothly:

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.59 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.11 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.59 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.11 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.65 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.12 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.65 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.12 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.71 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.13 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.71 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.13 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.77 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.14 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.77 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.14 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.83 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.15 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.83 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.15 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:43:52.09 %%%%%%%%%%%
Message from user INTERnet on DIANA
INTERnet ACP SMTP Accept Request from Host: 192.168.0.2 Port: 61854

These alias messages were most likely logged when TCPIP restarted, which is part of the startup sequence in DIANA_STARTUP.COM.
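The alias messages in the excerpt above come in removal/activation pairs, and a quick check that every removed alias also came back could look like this. It is a rough sketch that assumes the message layout is exactly as pasted here. The helper names are my own and not part of TCPIP or OPCOM.

```python
import re

# Matches the TCPIP alias messages as they appear in the excerpt, e.g.
#   %TCPIP-I-FSIPADDRDEL, WE0 192.168.0.11 alias address removed from node ...
ALIAS_RE = re.compile(
    r"^%TCPIP-I-(FSIPADDRDEL|FSIPADDRUP), (\S+) (\d+\.\d+\.\d+\.\d+) "
)


def alias_events(log_text):
    """Return (message_id, interface, address) tuples in log order."""
    events = []
    for line in log_text.splitlines():
        match = ALIAS_RE.match(line.strip())
        if match:
            events.append(match.groups())
    return events


def aliases_left_down(log_text):
    """Addresses whose most recent message is a removal, with no later activation."""
    last_state = {}
    for message_id, _interface, address in alias_events(log_text):
        last_state[address] = message_id
    return sorted(
        address
        for address, state in last_state.items()
        if state == "FSIPADDRDEL"
    )


sample = """\
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.11 alias address removed from node DIANA interface WE0
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.11 alias active on node DIANA, interface WE0
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.12 alias address removed from node DIANA interface WE0
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.12 alias active on node DIANA, interface WE0
"""
print(aliases_left_down(sample))
```

An empty list means every alias that was removed was also reported active again afterwards, which is what the excerpt shows for all five addresses.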
The system ran on for some time after that. The next file starts:

%%%%%%%%%%% OPCOM 10-SEP-2007 21:59:16.64 %%%%%%%%%%%
Logfile has been initialized by operator _DIANA$OPA0:
Logfile is DIANA::SYS$SYSROOT:[SYSMGR]OPERATOR.LOG;1397

%%%%%%%%%%% OPCOM 10-SEP-2007 21:59:16.64 %%%%%%%%%%%
Operator status for operator DIANA::SYS$SYSROOT:[SYSMGR]OPERATOR.LOG;1397
CENTRAL, PRINTER, TAPES, DISKS, DEVICES, CARDS, NETWORK, CLUSTER, SECURITY,
LICENSE, OPER1, OPER2, OPER3, OPER4, OPER5, OPER6, OPER7, OPER8, OPER9, OPER10,
OPER11, OPER12

So the crash happened after I restarted all web-related programs. That makes sense, since I finished the previous post after this restart.

The ana/crash output (for those that would like to help me analyze it - if you need the full dump, just drop me a line) shows a machine check:

Time of system crash: 10-SEP-2007 21:54:19.79
Version of system: OpenVMS Alpha Operating System, Version V8.3

System Version Major ID/Minor ID: 3/0
VMScluster node: DIANA
System type: Digital Personal WorkStation

Primary CPU ID: 000 (0.)
Crash CPU ID: 000 (0.)

Bitmask of active CPUs: 00000000.00000001
Bitmask of available CPUs: 00000000.00000001

CPU bugcheck codes:
CPU 000 -- database address 81C38000 -- MACHINECHK, Machine check while
in kernel mode
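Putting the three timestamps from this post side by side (the last OPCOM message in the old log, the crash time from the dump, and the initialization of the new log) gives a rough timeline. The sketch below assumes the fraction after the seconds is always two digits, i.e. hundredths of a second, as in the output above. `parse_vms_time` is a helper I made up for this, not a library function.

```python
from datetime import datetime

MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_vms_time(stamp):
    """Parse a timestamp such as '10-SEP-2007 21:54:19.79'.

    Assumes a two-digit fraction (hundredths of a second).
    """
    date_part, time_part = stamp.split()
    day, month, year = date_part.split("-")
    hms, hundredths = time_part.split(".")
    hour, minute, second = (int(part) for part in hms.split(":"))
    return datetime(
        int(year), MONTHS[month.upper()], int(day),
        hour, minute, second, int(hundredths) * 10_000,
    )


# Timestamps copied from the log excerpts and the crash dump in this post.
last_message = parse_vms_time("10-SEP-2007 21:43:52.09")
crash = parse_vms_time("10-SEP-2007 21:54:19.79")
new_log = parse_vms_time("10-SEP-2007 21:59:16.64")

print("last OPCOM message to crash:", crash - last_message)
print("crash to new operator log: ", new_log - crash)
```

So the machine check came roughly ten and a half minutes after the last logged message, and the new operator log was opened just under five minutes after the crash.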

The good news, however, is that the startup-sequence proves to be correct. At least, I haven’t found anything really missing.

Weird things happening
while I was typing this post: when I tried, in a terminal window, to find out whether the error log showed anything, this stream of data scrolled across the window:

%DECthreads bugcheck (version V3.22-077), terminating execution.
% Reason: lckMcsLock: deadlock detected, cell = 0x0000000000919300
% Running on OpenVMS V8.3() on Digital Personal WorkStation , 256Mb; 1 CPUs, pid
538968551
% The bugcheck occurred at 11-SEP-2007 17:10:00.22, running image
% $116$DKA100:[SYS0.SYSCOMMON.][SYSEXE]DIA.EXE;3 in process 202001E7 (named
% "_FTA3:"), under username "SYSTEM". AST delivery is enabled for all modes;
% no ASTs active. Upcalls are disabled. Multiple kernel threads are disabled.
% The current thread sequence number is 8, at 0x00919300
% Current thread traceback:
% 0: PC 0x80A5FAA4, FP 0x0090F9F0, DESC 0x7BBB2B80
% 1: PC 0x80A661AC, FP 0x0090FAB0, DESC 0x7BBB3F20
% 2: PC 0x80A61E18, FP 0x0090FB60, DESC 0x7BBB3558
% 3: PC 0x80A63CE4, FP 0x0090FB80, DESC 0x7BBB37C8
% 4: PC 0x80A687E4, FP 0x0090FBC0, DESC 0x7BBB5048
% 5: PC 0x809C1CD4, FP 0x0090FC20, DESC 0x7B984658
% 6: PC 0x7C251918, FP 0x0090FC50, DESC 0x7C269EB0
% 7: PC 0x7C24F904, FP 0x0090FC80, DESC 0x7C269A10
% 8: PC 0x80081EF8, FP 0x0090FD00, DESC 0x8194CDF8
% 9: PC 0x8015040C, FP 0x0090FD50, DESC 0x818E9E50
% 10: PC 0x8017D5F0, FP 0x0090FDC0, DESC 0x818EECD8
% 11: PC 0x80A5B1FC, FP 0x0090FFB0, DESC 0x7BBB2300
% 12: PC 0x80A5AFCC, FP 0x00910000, DESC 0x7BBB2590
% 13: PC 0x80A5E4EC, FP 0x00910090, DESC 0x7BBB2690
% 14: PC 0x80A8EECC, FP 0x009100C0, DESC 0x7BBBA018
% 15: PC 0x8017D5F0, FP 0x00910100, DESC 0x818EECD8
% 16: PC 0x8016A394, FP 0x009102D0, DESC 0x818EC850
% 17: PC 0x802D7D64, FP 0x00910310, DESC 0x8192FAC0
% 18: PC 0x802D8C08, FP 0x00910350, DESC 0x8192FCB0
% 19: PC 0x80AA7474, FP 0x00910380, DESC 0x7BECD528
% 20: PC 0x80B5F864, FP 0x009104F0, DESC 0x7BEE1A70
% 21: PC 0x80B62A3C, FP 0x00910560, DESC 0x7BEE1AF8
% 22: PC 0x0020CAF0, FP 0x009105D0, DESC 0x000F52B0
% 23: PC 0x002C3358, FP 0x00910600, DESC 0x00101C38
% 24: PC 0x00219340, FP 0x00913CB0, DESC 0x000F5C50
% 25: PC 0x00228A60, FP 0x00915790, DESC 0x000F5CB0
% 26: PC 0x80A7573C, FP 0x00917DE0, DESC 0x7BBB6670
% 27: PC 0x80A61940, FP 0x00917FE0, DESC 0x7BBB33B0
% 28: PC 0x00000000, FP 0x7ADF9A50, DESC 0x7BBB0380
% 29: PC 0x80375CE4, FP 0x7ADFDB30, DESC 0x8194D110
% 30: PC 0x7AF66058, FP 0x7ADFDBB0, DESC 0x7AEE9050
% Bugcheck output saved to pthread_dump.log.
%SYSTEM-F-IMGDMP, dynamic image dump signal at PC=FFFFFFFF80A5FC48, PS=0000001B

after typing:

$ diag/sin="10-sep-2007 23:30"

and hitting <return>.

The SWB window was gone, but luckily WordPress had just saved my typing, so I didn’t have to redo it all. It looked like MySQL or PHP needed some time to recover, because I got an error opening this page again, but after that it all worked fine again.
Anyway: Diagnose didn’t give me anything when I retried:

$ diag/sin="10-sep-2007 23:30"

DECevent V3.4
Event file parsing error: event 17 invalid event header type
$

but that’s nothing strange because it seems to do so all the time…

PMAS issues
Getting used to little mail :) Just a few messages today that are quarantined but shouldn’t be, but the solution is simple: add the sender to the “allowed” list. What I sort of miss is that I can no longer see which requests have been blocked because the sender address is on an RBL; that’s a matter of configuration. Would I like to know? Well, sometimes: yes.
One thing to keep an eye on is how long quarantined and discarded messages are kept. I used the default (14 days), but my impression is that I can only access the ones added today, at least in the web interface. The notification does seem to show them all.
However, one thing does NOT work: I cannot create reports, neither as administrator nor as the report user. The latter wasn’t added during installation; I had to add it manually, so it might be that something went wrong during installation. Or the type of license disallows reporting. For this, I’ll have to contact Process.

But for the rest: no problems at all. Except for a higher number of PHP troubles… (or so it looks)

11 September 2007 | System's Logbook | Comments
