11-Sep-2007

Yesterday’s trouble
turned out to be followed by a system crash that, for some reason, went unnoticed until this morning, when I looked for yesterday’s operator logs:

OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text
OPERATOR_done1397.txt 10-Sep-2007 21:59 76,238 plain text

A double operator log on one date?
This is typically a sign of a reboot. It might have to do with restarting all web-related services, where I made a mistake and ran DIANA_STARTUP.COM (usually run in batch on startup), and with stopping and starting the web-server-related stuff.
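
The "two operator logs on one date" check is easy to automate. What follows is only a minimal sketch in Python, assuming the listing lines look exactly like the two shown above (file name, date, time, size, type); the function names are made up for this illustration, and a double log is a hint of a reboot, not proof.

```python
import re
from collections import defaultdict

# Matches listing lines such as:
#   OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text
# (the layout is assumed from the two lines quoted in this post)
LISTING_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<date>\d{1,2}-[A-Za-z]{3}-\d{4})\s+(?P<time>\d{2}:\d{2})\b"
)

def logs_per_date(lines):
    """Group log file names by the date shown in the listing."""
    by_date = defaultdict(list)
    for line in lines:
        match = LISTING_LINE.match(line.strip())
        if match:
            by_date[match.group("date")].append(match.group("name"))
    return dict(by_date)

def suspected_reboot_dates(lines):
    """Return the dates that have more than one operator log.

    The result is sorted as plain strings, not chronologically.
    """
    return sorted(
        date for date, names in logs_per_date(lines).items() if len(names) > 1
    )

listing = [
    "OPERATOR_done1396.txt 10-Sep-2007 00:00 18,628 plain text",
    "OPERATOR_done1397.txt 10-Sep-2007 21:59 76,238 plain text",
]
print(suspected_reboot_dates(listing))
```

With the two lines from this post, the only date reported is 10-Sep-2007.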

The system went down more or less smoothly:

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.59 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.11 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.59 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.11 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.65 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.12 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.65 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.12 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.71 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.13 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.71 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.13 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.77 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.14 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.77 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.14 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.83 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.15 alias address removed from node DIANA interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:38:03.83 %%%%%%%%%%%
Message from user INTERnet on DIANA
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.15 alias active on node DIANA, interface WE0

%%%%%%%%%%% OPCOM 10-SEP-2007 21:43:52.09 %%%%%%%%%%%
Message from user INTERnet on DIANA
INTERnet ACP SMTP Accept Request from Host: 192.168.0.2 Port: 61854

This is quite likely what gets logged when restarting TCPIP, which is part of the startup sequence in DIANA_STARTUP.COM.
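
To check that every alias that was removed also came back, the messages can be paired up per address. This is a rough Python sketch, assuming the message layout is exactly as in the excerpt above; the function name is invented for the example, and only the two message codes seen here (FSIPADDRDEL and FSIPADDRUP) are handled.

```python
import re

# Matches the two TCPIP alias messages seen in the OPCOM excerpt.
ALIAS_EVENT = re.compile(
    r"%TCPIP-I-(?P<code>FSIPADDRDEL|FSIPADDRUP), \S+ (?P<addr>\d+\.\d+\.\d+\.\d+) "
)

def alias_states(log_text):
    """Return the last seen state ('removed' or 'active') per alias address."""
    states = {}
    for match in ALIAS_EVENT.finditer(log_text):
        states[match.group("addr")] = (
            "active" if match.group("code") == "FSIPADDRUP" else "removed"
        )
    return states

sample = """\
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.11 alias address removed from node DIANA interface WE0
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.11 alias active on node DIANA, interface WE0
%TCPIP-I-FSIPADDRDEL, WE0 192.168.0.12 alias address removed from node DIANA interface WE0
%TCPIP-I-FSIPADDRUP, WE0 192.168.0.12 alias active on node DIANA, interface WE0
"""
print(alias_states(sample))
```

Any address still marked 'removed' at the end would be worth a closer look; in the excerpt every alias ends up active again.
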
The system kept running for some time after that. The next file starts:

%%%%%%%%%%% OPCOM 10-SEP-2007 21:59:16.64 %%%%%%%%%%%
Logfile has been initialized by operator _DIANA$OPA0:
Logfile is DIANA::SYS$SYSROOT:[SYSMGR]OPERATOR.LOG;1397

%%%%%%%%%%% OPCOM 10-SEP-2007 21:59:16.64 %%%%%%%%%%%
Operator status for operator DIANA::SYS$SYSROOT:[SYSMGR]OPERATOR.LOG;1397
CENTRAL, PRINTER, TAPES, DISKS, DEVICES, CARDS, NETWORK, CLUSTER, SECURITY,
LICENSE, OPER1, OPER2, OPER3, OPER4, OPER5, OPER6, OPER7, OPER8, OPER9, OPER10,
OPER11, OPER12

The crash happened after I restarted all web-related programs. That makes sense, since I finished the previous post after this restart.

The ana/crash output (for those that would like to help me analyze it – if you need the full dump, just drop me a line) shows a machine check:

Time of system crash: 10-SEP-2007 21:54:19.79
Version of system: OpenVMS Alpha Operating System, Version V8.3

System Version Major ID/Minor ID: 3/0
VMScluster node: DIANA
System type: Digital Personal WorkStation

Primary CPU ID: 000 (0.)
Crash CPU ID: 000 (0.)

Bitmask of active CPUs: 00000000.00000001
Bitmask of available CPUs: 00000000.00000001

CPU bugcheck codes:
CPU 000 -- database address 81C38000 -- MACHINECHK, Machine check while
in kernel mode

The good news, however, is that the startup-sequence proves to be correct. At least, I haven’t found anything really missing.
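
Putting the timestamps quoted above side by side gives a rough timeline. The sketch below is Python, not something that runs on DIANA; it assumes the timestamps always have the form shown in the logs (day, three-letter month, year, time with hundredths of a second), and the helper name is made up. "Last message" means the last one shown in the excerpt, which may not be the last one in the file.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

VMS_TIME = re.compile(
    r"(\d{1,2})-([A-Za-z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2})\.(\d{2})"
)

def parse_vms_time(text):
    """Parse an absolute time such as '10-SEP-2007 21:54:19.79'."""
    match = VMS_TIME.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised timestamp: {text!r}")
    day, mon, year, hour, minute, second, hundredths = match.groups()
    return datetime(int(year), MONTHS[mon.upper()], int(day),
                    int(hour), int(minute), int(second),
                    int(hundredths) * 10_000)  # hundredths -> microseconds

last_message = parse_vms_time("10-SEP-2007 21:43:52.09")  # last OPCOM entry shown
crash = parse_vms_time("10-SEP-2007 21:54:19.79")         # from ana/crash
new_logfile = parse_vms_time("10-SEP-2007 21:59:16.64")   # new OPERATOR.LOG

print("last shown message to crash:", crash - last_message)
print("crash to new operator log:  ", new_logfile - crash)
```

So the crash came about ten and a half minutes after the last message shown, and the new operator log was opened just under five minutes after the crash.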

Weird things happening
while I was typing this post: when I tried, in a terminal window, to find out whether the error log showed something, I got this stream of data in the window:

%DECthreads bugcheck (version V3.22-077), terminating execution.
% Reason: lckMcsLock: deadlock detected, cell = 0x0000000000919300
% Running on OpenVMS V8.3() on Digital Personal WorkStation , 256Mb; 1 CPUs, pid
538968551
% The bugcheck occurred at 11-SEP-2007 17:10:00.22, running image
% $116$DKA100:[SYS0.SYSCOMMON.][SYSEXE]DIA.EXE;3 in process 202001E7 (named
% "_FTA3:"), under username "SYSTEM". AST delivery is enabled for all modes;
% no ASTs active. Upcalls are disabled. Multiple kernel threads are disabled.
% The current thread sequence number is 8, at 0x00919300
% Current thread traceback:
% 0: PC 0x80A5FAA4, FP 0x0090F9F0, DESC 0x7BBB2B80
% 1: PC 0x80A661AC, FP 0x0090FAB0, DESC 0x7BBB3F20
% 2: PC 0x80A61E18, FP 0x0090FB60, DESC 0x7BBB3558
% 3: PC 0x80A63CE4, FP 0x0090FB80, DESC 0x7BBB37C8
% 4: PC 0x80A687E4, FP 0x0090FBC0, DESC 0x7BBB5048
% 5: PC 0x809C1CD4, FP 0x0090FC20, DESC 0x7B984658
% 6: PC 0x7C251918, FP 0x0090FC50, DESC 0x7C269EB0
% 7: PC 0x7C24F904, FP 0x0090FC80, DESC 0x7C269A10
% 8: PC 0x80081EF8, FP 0x0090FD00, DESC 0x8194CDF8
% 9: PC 0x8015040C, FP 0x0090FD50, DESC 0x818E9E50
% 10: PC 0x8017D5F0, FP 0x0090FDC0, DESC 0x818EECD8
% 11: PC 0x80A5B1FC, FP 0x0090FFB0, DESC 0x7BBB2300
% 12: PC 0x80A5AFCC, FP 0x00910000, DESC 0x7BBB2590
% 13: PC 0x80A5E4EC, FP 0x00910090, DESC 0x7BBB2690
% 14: PC 0x80A8EECC, FP 0x009100C0, DESC 0x7BBBA018
% 15: PC 0x8017D5F0, FP 0x00910100, DESC 0x818EECD8
% 16: PC 0x8016A394, FP 0x009102D0, DESC 0x818EC850
% 17: PC 0x802D7D64, FP 0x00910310, DESC 0x8192FAC0
% 18: PC 0x802D8C08, FP 0x00910350, DESC 0x8192FCB0
% 19: PC 0x80AA7474, FP 0x00910380, DESC 0x7BECD528
% 20: PC 0x80B5F864, FP 0x009104F0, DESC 0x7BEE1A70
% 21: PC 0x80B62A3C, FP 0x00910560, DESC 0x7BEE1AF8
% 22: PC 0x0020CAF0, FP 0x009105D0, DESC 0x000F52B0
% 23: PC 0x002C3358, FP 0x00910600, DESC 0x00101C38
% 24: PC 0x00219340, FP 0x00913CB0, DESC 0x000F5C50
% 25: PC 0x00228A60, FP 0x00915790, DESC 0x000F5CB0
% 26: PC 0x80A7573C, FP 0x00917DE0, DESC 0x7BBB6670
% 27: PC 0x80A61940, FP 0x00917FE0, DESC 0x7BBB33B0
% 28: PC 0x00000000, FP 0x7ADF9A50, DESC 0x7BBB0380
% 29: PC 0x80375CE4, FP 0x7ADFDB30, DESC 0x8194D110
% 30: PC 0x7AF66058, FP 0x7ADFDBB0, DESC 0x7AEE9050
% Bugcheck output saved to pthread_dump.log.
%SYSTEM-F-IMGDMP, dynamic image dump signal at PC=FFFFFFFF80A5FC48, PS=0000001B

after typing:

$ diag/sin="10-sep-2007 23:30"

and hitting <return>.

The SWB window was gone, but luckily WordPress had just saved my typing, so I didn’t have to redo it all. It looked like MySQL or PHP needed some time to recover, because I got an error opening this page again, but after that it all worked fine again.
Anyway: Diagnose didn’t give me anything when I retried:

$ diag/sin="10-sep-2007 23:30"

DECevent V3.4
Event file parsing error: event 17 invalid event header type
$

but that’s nothing strange because it seems to do so all the time…

PMAS issues
Getting used to little mail 🙂 Just a few messages today that were quarantined but shouldn’t have been, but the solution is simple: add the sender to the “allowed” list. What I sort-of miss is that I can no longer see which requests have been blocked because the sender address is on an RBL; that’s a matter of configuration. Would I like to know? Well, sometimes: yes.
One thing to keep an eye on is how long quarantined and discarded messages are kept. I used the default (14 days), but my impression is that I can only access the ones added today, at least in the web interface. The notification seems to show them all.
However, one thing does NOT work: I cannot create reports, neither as administrator nor as the report user. The latter wasn’t added during installation; I had to add it manually, so it might be that something went wrong during installation. Or the type of license disallows reporting. For this, I’ll have to contact Process.

But for the rest: no problems at all. Except for a higher number of PHP troubles… (it looks that way)

More E[B/d]ay to come

At least according to Hoff on his blog (read here). One good reason to have all incoming traffic run over the OpenVMS box (small chance that it will be infected!) and to be able to screen messages before actually downloading them onto Windows boxes. (I would like to have Apple systems around but, having game-playing kids around, I’m stuck with Windows. And the company I work at, and their customers, heavily rely on Windows boxes for their office work…)

There is a fair chance that this type of scam is now filtered – even better!