30-Dec-2009

8.4 Fieldtest CD burning – Twice
Some days ago, I created a logical disk on the emulator, to make a CD for installing the VMS 8.4 fieldtest on the new PWS500. I thought I had copied it to Aphrodite – I found an ISO file there, ready to be burned – and wrote that to CD. But when I tried to boot the PWS from that CD, it failed: it turned out to be the bad, non-bootable image. So I created a second LD device, did the proper restore on that one this time, copied the container file to the Windows environment and burned it as an image.
This morning, I tried to see if it really was bootable on the emulator – and it was, but there is something wrong on the CD:

>>> b dka300
(boot -file '' -flags '' 'dka300')
BOOT_RESET is ON: cold boot
Booting from the device 'dka300' file ''
Loaded the primary booter; size 0x9a000

OpenVMS (TM) Alpha Operating System, Version E8.4
© Copyright 1976-2009 Hewlett-Packard Development Company, L.P.

%DCL-W-ACTIMAGE, error activating image TYPE
-CLI-E-IMGNAME, image file DKA300:[SYS0.SYSCOMMON.][SYSEXE]TYPE.EXE;1
-SYSTEM-F-ILLBLKNUM, illegal logical block number
*** AXPVMS$PCSI_INSTALL_MESSAGES.COM called with invalid message identifier!
identifier: welcome

Installing required known files…

Configuring devices…

%DCL-W-ACTIMAGE, error activating image TYPE
-CLI-E-IMGNAME, image file DKA300:[SYS0.SYSCOMMON.][SYSEXE]TYPE.EXE;1
-SYSTEM-F-ILLBLKNUM, illegal logical block number
*** AXPVMS$PCSI_INSTALL_MESSAGES.COM called with invalid message identifier!
identifier: menu

Enter CHOICE or ? for help: (1/2/3/4/5/6/7/8/9/?)

I know that choice 8 will start DCL – and it did, without a problem, except that, again, TYPE failed to display the message that tells what to do to finish the session:

%DCL-W-ACTIMAGE, error activating image TYPE
-CLI-E-IMGNAME, image file DKA300:[SYS0.SYSCOMMON.][SYSEXE]TYPE.EXE;1
-SYSTEM-F-ILLBLKNUM, illegal logical block number
*** AXPVMS$PCSI_INSTALL_MESSAGES.COM called with invalid message identifier!
identifier: notnormal
$$$

So TYPE is a problem: it will show the same error if invoked.

$$$ type SMGTERMS.TXT
%DCL-W-ACTIMAGE, error activating image TYPE
-CLI-E-IMGNAME, image file DKA300:[SYS0.SYSCOMMON.][SYSEXE]TYPE.EXE;1
-SYSTEM-F-ILLBLKNUM, illegal logical block number
$$$

But DUMP shows no error and runs to the end. So the file IS readable.

$ analyze/image, however, signals a different problem:

Analyze Image 30-DEC-2009 13:14:33.90 Page 1
DKA300:[SYS0.SYSCOMMON.][SYSEXE]TYPE.EXE;1
ANALYZ A01-07

*** This file is not a VMS native image.
The analysis uncovered 1 error.

$$$

I successfully updated the standard bootdisk of the emulator from a backup; I booted that system, and tried to access the specific image.
$ analyze/image
had the same problem as when booted from CD, and running the image directly issues yet another error:

$ run type.exe
%DCL-W-ACTIMAGE, error activating image TYPE.EXE
-CLI-E-IMGNAME, image file DKA300:[VMS$COMMON.SYSEXE]TYPE.EXE;1
-SYSTEM-F-IVADDR, invalid media address
$

but now, DUMP finds a problem:

$ dump/rec/page DKA300:[VMS$COMMON.SYSEXE]TYPE.EXE
Dump of file DKA300:[VMS$COMMON.SYSEXE]TYPE.EXE;1 on 30-DEC-2009 13:42:03.79
File ID (1301,1,0) End of file block 61 / Allocated 63
-SYSTEM-F-IVADDR, invalid media address

$ Analyze/disk revealed quite a lot of errors – including on directories

$ analyze/disk dka300:
Analyze/Disk_Structure for _$3$DKA300: started on 30-DEC-2009 13:47:09.72

%ANALDISK-W-CHKSCB, invalid storage control block, RVN 1
%ANALDISK-I-OPENQUOTA, error opening QUOTA.SYS
-SYSTEM-W-NOSUCHFILE, no such file
%ANALDISK-I-BADHIGHWATER, file (2297,1,0) SYS$ERRLOG.DMP;1
inconsistent highwater mark and EFBLK
%ANALDISK-W-READDIR, error reading directory [SYS0.SYSCOMMON.SYSFONT]
-SYSTEM-F-ILLBLKNUM, illegal logical block number
%ANALDISK-W-READDIR, error reading directory [SYS0.SYSCOMMON.SYSHLP]
-SYSTEM-F-ILLBLKNUM, illegal logical block number
%ANALDISK-W-READDIR, error reading directory [SYS0.SYSCOMMON.SYSHLP]
-SYSTEM-F-ILLBLKNUM, illegal logical block number
%ANALDISK-W-READDIR, error reading directory [SYS0.SYSCOMMON.SYSHLP]
-SYSTEM-F-ILLBLKNUM, illegal logical block number
%ANALDISK-W-READDIR, error reading directory [SYS0.SYSCOMMON.SYSHLP]
-SYSTEM-F-ILLBLKNUM, illegal logical block number

and because of that, subdirectories and files could not be located:

%ANALDISK-W-LOSTHEADER, file (1322,1,0) DECW.DIR;1
not found in a directory
%ANALDISK-W-LOSTHEADER, file (1323,1,0) 100DPI.DIR;1
not found in a directory

There may indeed be something wrong with the CD. Well, check it on the Alphas. If it runs there, the CD drive in my laptop has a problem reading the files; if not, the LD image was damaged when it was written to the CD…
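One way to separate a bad burn from a bad read is to copy the CD back to a file and compare it with the container file it was burned from. The following is only a minimal Python sketch of that comparison; the two file names in the usage note are hypothetical, and it assumes the read-back copy is a plain sector-by-sector dump of the disc.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(path_a, path_b, block_size=512):
    """Return the number of the first differing block, or None if identical.

    Blocks are counted from 0 in units of block_size bytes (512 matches
    the VMS disk block size). A shorter file counts as different at the
    block where it ends.
    """
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if block_a != block_b:
                return index
            if not block_a:  # both files ended at the same point
                return None
            index += 1
```

For example, `first_difference("vms84.dsk", "readback.iso")` (both names made up) returning `None` would mean the burn itself is fine and the reading side is the suspect; a block number would show where the copy on the CD starts to deviate.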

Update
Inserted the CD in one of the drives on Diana and analyze/image didn’t reveal an error. Executing the program fails, but because of something completely different:

$ run type.exe
%LIB-F-INVARG, invalid argument(s)

At least, the program starts without a problem. So it’s quite likely to be the CD in the laptop, or the driver with Personal Alpha. Next: installing 8.4 on the other PWS….
This worked like a charm: upgraded from 7.3-2 to 8.4 directly, though I think it’s not supported – officially. It took quite some time, but that may have been a matter of a slow CD drive. Once the upgrade was done, the system restarted and ran a SYSGEN script to update the system parameters for the changes in the OS. This would have been executed in an upgrade to 8.3 as well, but this one might have to do with the clustering-over-IP that comes with 8.4. This actually is the big thing to test in the coming weeks – with a new setup of WASD: version 10 has some big changes and it’s a good moment to re-think the structure of the site. It’s probably also the moment to re-define the mapping to fit WordPress 2.9 – if possible.
Anyway: Daphne runs 8.4 when booted. Next is the setup of SCS-IP, using one of the NICs for cluster traffic, the other for normal traffic.

23-Dec-2009

Router problem
Usually, I check the logs on the system and mail over the web as soon as I get to my office. This morning I could access the operation desk (the protected access to do system manager work) but there it ended. ANY site on grootersnet.nl was inaccessible.
Tonight it turned out the router hung: ALL access failed. Not just incoming traffic. All traffic failed: DNS (it serves as a resolver), outgoing traffic, its admin page – it all failed. So what remained was resetting it. That solved the issue – but also removed any clue of what may have caused it.
Well, I’m in for a new one anyway. But nevertheless, a VMS port of syslogd will be installed on Diana to serve as loghost for the router.
Access problems of another kind
Since the new PWS has a differential SCSI card installed – KZPSA-CA – I decided to hook it on the Shared SCSI and boot from the shared system disk.
Big problem. Diana lost connection to the quorum disk and postponed all activity. It regained access and resumed after the HSZ50 was disconnected.
It may have been a termination issue – though there is a terminator on the cable. But it might have been the -CA type. I have been able to boot the old AlphaServer 400 from this disk using a KZPSA-CY controller. So I switched the cards and retried. This time, though Diana keeps complaining about losing the quorum disk, it always regained access and continued. The new box started booting, but always stopped when starting MSCP serving. It did not enter the cluster environment either! So that may actually be the problem.
But booting from its own (VMS 7.3-2) system disk succeeds, even when hooked up to the shared SCSI. The system joins the cluster, as shown in SHOW CLUSTER, and accessing the disks seems possible.
But here as well, Diana loses access to the quorum disk at times….
Perhaps I’ll have to re-create the node environment from scratch…

18-Dec-2009

Booted E8.4
Now that I have a bootable disk, nothing stood in the way of installing 8.4. I used the Personal Alpha, since that is easier to handle if something goes wrong.
I left the system as I had left it yesterday:

  • DKA0 = the original V8.3 system disk
  • DKA100 = the upgrade disk (the one created yesterday from the 8.4 backup saveset)
  • DKA200 = the disk destined to hold 8.4 for testing. The container is a copy of the disk loaded as DKA0 – only renamed and relabeled.
  • DKA300 = CD drive – in case I need it

    So tonight I did the upgrade, by booting PA from DKA100 – and it worked without a problem, as could be expected. Next, I rebooted the system and that invoked SYSGEN, as shown in the logfile. But the next reboot posed a problem: it hung because DKA200 could not be mounted. The reason is obvious:
    it’s the system disk. This mount is in SYSTARTUP_VMS.COM – without /NOASSIST, which normally isn’t required since the system is usually booted from DKA0.
    Anyway, I had to stop the emulator the hard way.
    Next step – which I should have done before – was moving the container holding 8.4 to DKA0, and defining all normally used containers in their usual locations, so the only difference I have in this system is just the system disk.
    Now I booted the system. No real trouble, just that the webserver now has a problem:

    %WASD-I-STARTUP, begin
    %WASD-I-STARTUP, using SSL image
    %DCL-W-ACTIMAGE, error activating image SSL$LIBCRYPTO_SHR32
    -CLI-E-IMGNAME, image file $3$DKA0:[SYS0.SYSCOMMON.][SYSLIB]SSL$LIBCRYPTO_SHR32.
    EXE
    -SYSTEM-F-SHRIDMISMAT, ident mismatch with shareable image

    Since WASD on this system is linked against the HP-supplied SSL files, this might have been expected, though I would prefer it didn’t happen. But the easy way out is to relink WASD and start it; it now runs.

    No problem for MySQL – because that is linked against another SSL implementation, apparently. SWS did have a problem, but that has nothing to do with the upgrade; at least, I cannot think of a reason:

    Syntax error on line 5 of /apache$root/conf/mod_php.conf:
    Can't locate API module structure `php5_module' in file /apache$root/000000/modules/mod_php_apache-2_0.exe: function not implemented

    Next step is installing it on the new PWS. That means I first need to set it up with 8.3 – which may be a problem if the CD is bad indeed. Or install it fresh – but then I’ll have to burn a CD first.

    17-Dec-2009

    8.4 fieldtest
    The Alpha release of the OpenVMS 8.4 environment comes as a zipped backup saveset.

    First, I created a new disk on the Personal Alpha and restored the files. But that left the disk as an ordinary data disk: it’s not bootable. Another approach used a copy of the boot disk, onto which the backup saveset was restored with /OVERWRITE; but that caused another problem: a few files could not be copied for lack of contiguous space. So I removed all files and retried. Now the disk started to boot, but it failed on the secondary bootstrap.
    Next, I tried an install from the original 8.3 CD, but there is an error on it. That is: BACKUP ran into a CRC error that it couldn’t get past. However, DUMP of the file where it happened ran straight through, without an error…
    After another INIT of the target disk, and restoring the saveset once more, I tried SETBOOT. On Alpha, this is a foreign command:

    $ SETBOOT := $SYS$SETBOOT
    $ SETBOOT DKA100:

    which would set up the boot sector as on SYS$SYSDEVICE, according to the (rather limited) documentation. It took some time, but in the end the disk was bootable – but, again, it failed on the secondary bootstrap:

    >>> b dka100
    (boot -file '' -flags '' 'dka100')
    BOOT_RESET is ON: cold boot
    Booting from the device 'dka100' file ''
    Loaded the primary booter; size 0x9a000
    %APB-I-FILENOTLOC, Unable to locate SYSBOOT.EXE
    %APB-I-LOADFAIL, Failed to load secondary bootstrap, status = 00000910
    CPU0 halted: reason: halt instruction executed
    CPU0 halted: default

    I’ve seen this error before, I think: when the directory holding the system was in lowercase: [vms$common], instead of uppercase: [VMS$COMMON]. But here, there was nothing wrong with the whole environment.
    I asked my colleagues about the issue, and it turned out the saveset is actually an image backup.
    So I reloaded the backup file onto the PA disks, and restored it using:

    $ backup/imag alphae84.bck/save dka100:

    and that did the trick:

    >>> b dka100

    Alpha Emulator Firmware
    Version 2.0.16 build Oct 8 2008 09:38:01
    Initialized physical memory: 128 MB
    Loaded HWRPB: physical address 2000
    Initialized CPU0
    (boot -file '' -flags '' 'dka100')
    BOOT_RESET is ON: cold boot
    Booting from the device 'dka100' file ''
    Loaded the primary booter; size 0x9a000

    OpenVMS (TM) Alpha Operating System, Version E8.4
    © Copyright 1976-2009 Hewlett-Packard Development Company, L.P.

    Now this is a 4GB disk…
    Prepared 8.4 CD
    Once I knew how to restore the saveset into a bootable disk, I prepared writing it to CD: created a 1.400.000 block logical device (fits on a 700 MB CD), and restored the saveset onto it. Now this disk image is to be burned to CD – I only need to get some, I’ve none left 😉
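The size claim can be checked with a little arithmetic. This is a small Python sketch, not anything from the LD utility itself: it uses the 512-byte VMS disk block, and it assumes a common 80-minute disc with 360,000 data sectors of 2048 bytes (the 74-minute figure of 333,000 sectors is likewise an assumption about the blank media).

```python
VMS_BLOCK_BYTES = 512        # size of one OpenVMS disk block
CD_SECTOR_BYTES = 2048       # user data per CD-ROM data sector
CD_80MIN_SECTORS = 360_000   # assumption: 80-minute ("700 MB") disc
CD_74MIN_SECTORS = 333_000   # assumption: 74-minute ("650 MB") disc


def image_bytes(blocks):
    """Size in bytes of a container file holding the given number of VMS blocks."""
    return blocks * VMS_BLOCK_BYTES


def sectors_needed(blocks):
    """Number of CD sectors needed for the image, rounded up."""
    return -(-image_bytes(blocks) // CD_SECTOR_BYTES)


def fits_on_cd(blocks, cd_sectors=CD_80MIN_SECTORS):
    """True if the image fits in the data area of a disc with cd_sectors sectors."""
    return sectors_needed(blocks) <= cd_sectors
```

With these assumptions, 1,400,000 blocks come to 716,800,000 bytes, or exactly 350,000 CD sectors: enough room on an 80-minute disc, but too large for a 74-minute one.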

    01-Dec-2009

    Statistics for November
    PMAS statistics for Nov
    Total messages    : 4270 = 100.0 o/o
    DNS Blacklisted   : 2581 =  60.4 o/o (Files: 30)
    Relay attempts    :  163 =   3.8 o/o (Files: 30)
    Processed by PMAS : 1526 =  35.7 o/o (Files: 30)
            Discarded :  290 =  19.0 o/o (processed),   6.7 o/o (all)
         Quarantained :  421 =  27.5 o/o (processed),   9.8 o/o (all)
            Delivered :  815 =  53.4 o/o (processed),  19.0 o/o (all)
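The percentages in this report can be recomputed from the raw counts. The sketch below is Python, not PMAS output, and the truncation to one decimal is an assumption on my part: it happens to reproduce the figures above, where ordinary rounding would give 27.6 instead of 27.5 for the quarantined share.

```python
def pct(part, whole):
    """Percentage of part in whole, truncated (not rounded) to one decimal."""
    return (part * 1000 // whole) / 10


# Counts taken from the November report above.
total = 4270
blacklisted, relay, processed = 2581, 163, 1526
discarded, quarantined, delivered = 290, 421, 815

# The three top-level categories add up to the total,
# and the three outcomes add up to the processed count.
top_level_ok = blacklisted + relay + processed == total
outcomes_ok = discarded + quarantined + delivered == processed

report = {
    "blacklisted": pct(blacklisted, total),
    "relay": pct(relay, total),
    "processed": pct(processed, total),
    "discarded": (pct(discarded, processed), pct(discarded, total)),
    "quarantined": (pct(quarantined, processed), pct(quarantined, total)),
    "delivered": (pct(delivered, processed), pct(delivered, total)),
}
```

Because the values are truncated, the top-level percentages sum to 99.9 rather than 100.0, which matches the report.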

    It looks like the number of spam messages is increasing again. Top month this year so far was May; the number of messages dropped in June, and even more in July. Until October it stayed fairly stable; October had the fewest messages. However, November shows spam is “back on track”, with numbers just under June.
    Logs
    Logs show no real surprises.
    There have been the usual attempts on the FTP port – all based on the assumption this is a Linux or Windows machine with some default packages. The same on the webserver, and I did check the Wiki but found no evidence of trouble. Same on the blogs, but I’ll keep a keen eye on all.
    Updates to come
    Later this month, around Christmas, there is time to do some updates.
    PHP will be updated anyway. It works fine on the test environment, no real problems with WP and MySQL. The newer version of PHPMyAdmin fails on that environment, but it’s reported to run; it must have something to do with the mapping being wrong, or incomplete. I’ll ask around to find out how others have done it. Probably the current version will have no problems – in which case, I may keep it alive.
    WP is still a matter of testing – perhaps it’s exactly the same issue as PHPMyAdmin: a mapping problem.
    More important – and requiring far more preparation – is the upgrade of the WASD webserver. Major changes have been made in the running environment: different naming of logicals and directories; logicals are stored in a separate table… So preparation is required, and I’m not certain (didn’t read the notes yet) if it can be installed alongside the current one, or if there is some tooling around.
    And, of course, there are some patches to the OS that need to be installed. Since the www.OpenVMS.Org listserver is down (several months now) I’ll have to manually check what hasn’t been installed yet.
    But that all has to wait. First I’ve got to pass (well, sort of) a certification exam for ITIL V3 in a week and a half….