12-Dec-2007

Updates
Last patches have been installed tonight – and I ran AUTOGEN to recalculate the system settings based on feedback. It caused NPAGEDYN be increased signficantly, and after last reboot (there were two: one after Update 5, and one after the other three) freepages was lower than before. While typing this, I have a program running that monitors MySQL to see where it breaks – if it does. It certainly looks faster.
The MySQL problems might be caused by PTHREAD or CRTL – both are heavily used by MySQL according the VMS-author (Jean Francois Pieronne) and Update 5 contains updates for both. I thought I had them installed, the PThread patch one anyway.
So far, nothing’s wrong – it hasn’t run for a long time so it might be fine today and cause trouble later on. The watcher is running so if things go wrong, I’ll just have to wait for a short time (15 minutes at most).

Current state: about 60% of WSQUO is used (77000 out of 131072) and it has been more (70%, or 92100). Well, time will tell what comes out of this 🙂

Clamav antivirus built
which wasn’t as straight forward as described. Expanding .TAR-GZ and untarring the resulting .TAR was no problem (the originally loaded .GZ was definitely broken) but building the products was another matter.
First, I started building BZIP2. There is a CONFIGURE procedure that ran fine and craetes a BUILD procedure, but I got compilation warnings that caused MMS to give up – and building failed, completely. But after I removed all files and untarred the package again, building went flawless. So now I got that libary 😀
Building Clamav has the same strcuture: First run the configure script and the Build one thereafter. However, Configure siganlled a fatl condition: I was still missing a library that had to be downloaded as well. After that had been installed, the configure script went on and I could start a BUILD. That gave a number of warnings and these may cause problems when running the programs. It might have to do with the C-compiler version:

$ cc/version
HP C V7.3-009 on OpenVMS Alpha V8.3

where the original may have been built with 6.5 (or earlier).
Now I only need to install it to test it – but that will have to wait until I’ve time enough: I still have to figure out how to integrate it with PMAS.

Web content
Some older tracks have been published and in the coming weeks, I hope to have done the backlog.
The program to replace the PHP code is getting along. All modules compile on both Alpha and Itanium (I got myself an account on HP’s testdrive) and I’m curently do some rework on the most important wones for this work. The testprogram compiles does link still fails. But we’re getting somewhere! (Walking is not that fast either)

09-Dec-2007

Fine so far
The system has been working fine since the last MySQL crashes and I mailed to the mailing list for assistance. Not any process in outswapped mode (LEFO or HIBO) found druing the last days, even when PHP was running. and for the rest there seems to be no trouble either. This is what the system should run like 🙂
I downloaded latest patches, I may do the update tonight as well, but since it requres one or two reboots, I prefer to have some more time.
I also wanted to install ClaMAV – an OpenSource virus scanner that is ported to VMS but with the currently installed executables I cannot extract the files from the file – it’s a .GZ file and Gunzip seems to locate more than on e entry so won’t process the file. There maight be something wrong, whatever I try it fails. See if I can get a decent (more recent) version of Gunzip somewhere – freeware CD, most likely.
Another thing to do is install CFIS – Samba – but again, too late in the day to start with it. I;ll have to spin up a disk and keep it online. No problem since I got one in this shelf available, but seeting it all up needs to be thought over and prepared well. Maybe some tim ethis week – if programming doesnt take all time (which is quite likely).
I found that the new package I installed on the forums (User_Shield) will actually allow a bunch of inactive users be activated or deleted in one push – but the buttons show up in the SubSilver template only, to in order to flush the fake members, it’s needed to change the forum’s look. I think I’l have to update the PHP code of the theme in order to have the buttons available in the Galaxian theme used.
Some more data might be added to Trips, Track and Travels – there is still a number of tracks to publish, even where MapSource refuses to startup at the moment. Garmin has been notified, waiting for the answer. Likely it will be that a files needs to be re-installed.
But No….
When storing this post, MySQL crashed again when I tried to access the page:

071209 21:54:32 InnoDB: Error: Write to file /database/mysql41/ibdata1 failed at offset 0 1048576.
InnoDB: 212992 bytes should have been written, only -1 were written.
InnoDB: Operating system error number 12.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 12 means 'not enough core'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/mysql/en/Operating_System_error_codes.html
071209 21:54:32InnoDB: Assertion failure in thread 72258240 in file MYSQL_ROOT:[innobase.fil]fil0fil.c;1 line 3922
InnoDB: Failing assertion: ret
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 10;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=2
max_connections=100
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 72191 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

InnoDB: Thread 72159936 stopped in file MYSQL_ROOT:[innobase.os]os0sync.c;1 line 501
InnoDB: Thread 72209088 stopped in file MYSQL_ROOT:[innobase.include]sync0sync.ic;1 line 111
%CMA-F-EXIT_THREAD, current thread has been requested to exit
InnoDB: Thread 82752192 stopped in file MYSQL_ROOT:[innobase.include]sync0sync.ic;1 line 111
%TRACE-F-TRACEBACK, symbolic stack dump follows
image module routine line rel PC abs PC
PTHREAD$RTL 0 000000000004367C FFFFFFFF80E6367C
PTHREAD$RTL 0 000000000006E9E8 FFFFFFFF80E8E9E8
0 FFFFFFFF8016F054 FFFFFFFF8016F054
0 FFFFFFFF80376E0C FFFFFFFF80376E0C
DECC$SHR_EV56 0 00000000001D2DC4 FFFFFFFF8106ADC4
DECC$SHR_EV56 0 00000000001D2A8C FFFFFFFF8106AA8C
0 FFFFFFFF801877CC FFFFFFFF801877CC

(Today’s crash data)

It might be this file is too small:

$ dir database:[mysql41]/siz

Directory DATABASE:[mysql41]

ibdata1.;1 53248
ib_logfile0.;1 4096
ib_logfile1.;1 4096

Total of 3 files, 61440 blocks.

that is:

$ dir database:[mysql41]/siz=unit=bytes

Directory DATABASE:[mysql41]

ibdata1.;1 26MB
ib_logfile0.;1 2MB
ib_logfile1.;1 2MB

Total of 3 files, 30MB

Should not be a problem. Or is it because the file cannot be extended:

$ dir database:[000000]/sec

Directory DATABASE:[000000]

000000.DIR;1 [SYSTEM] (RWED,RWED,RE,E)
BACKUP.SYS;1 [SYSTEM] (RWED,RWED,RE,)
BADBLK.SYS;1 [SYSTEM] (RWED,RWED,RE,)
BADLOG.SYS;1 [SYSTEM] (RWED,RWED,RE,)
BITMAP.SYS;1 [SYSTEM] (RWED,RWED,RE,)
CONTIN.SYS;1 [SYSTEM] (RWED,RWED,RE,)
CORIMG.SYS;1 [SYSTEM] (RWED,RWED,RE,)
INDEXF.SYS;1 [SYSTEM] (RWED,RWED,RE,)
mysql41.DIR;1 [SYSTEM] (RWE,RWE,RE,E)
SECURITY.SYS;1 [SYSTEM] (RWED,RWED,RE,)
VOLSET.SYS;1 [SYSTEM] (RWED,RWED,RE,)

Set the directory to W:RE (alas, PHP dos not (yet) take ACL in account) and make MySQL_SERVER the owner. If it happens again, failure to extend the file should not be the cause.

04-Dec-2007

L’Histoire se repète….
This morning, trying to access the Bootcamp blog, MySQL crashed again. Same (or fairly similar) cause:

071204 7:52:03 InnoDB: Error: the OS said file flush did not succeed
071204 7:52:03 InnoDB: Operating system error number 12 in a file operation.
InnoDB: Error number 12 means 'not enough core'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/mysql/en/Operating_System_error_codes.html
InnoDB: File operation call: 'flush'.
InnoDB: Cannot continue operation.

It could be the case that multiple PHP_WASD processes were activated. I screwed the throttle back – and lowered PHP’s memory requirements from 16M to 10M. As far as I can see, it still works – with an occasional “Not enough core” wtithin PHP – MySQL is running without a problem….

03-Dec-2007

MySQL down – again
MySQL server crashed again this morning:

Version: '4.1.14-log' socket: '' port: 3306 Source distribution
071203 7:35:45 InnoDB: Error: the OS said file flush did not succeed
071203 7:35:45 InnoDB: Operating system error number 12 in a file operation.
InnoDB: Error number 12 means 'not enough core'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/mysql/en/Operating_System_error_codes.html
InnoDB: File operation call: 'flush'.
InnoDB: Cannot continue operation.
%DELETE-W-SEARCHFAIL, error searching for MYSQL_ROOT:[MYSQL_SERVER.TMP]*.*;*
-RMS-E-FNF, file not found
MYSQL_SERVER job terminated at 3-DEC-2007 07:35:47.95

The watcher wasn’t running, but since logging ws disabled, There is no way to locate the reason. The last log is of 26-Nov-2007, 23:32 so it might be that changing data may have caused an error. But even in acconting, it cannot be found.
So I just started it again – with logging enabled, to find out – eventually.

In general this means that MySQL 4.1 does not have the required stability….
Adjusting management procedures
The >anti-spam engine creates a large number of log files when storing statistics information and these files must be deleted “manually”. That is: in the local clean-up procedure that is called in the nightly batch, it is cheecked if such files exist and if so, all are deleted:

if f$search("sqlite*.*") .nes. "" then delete sqlite*.*;*

I’ve added this line some time ago but still there is quite a number of files present. No wonder: I’d better specified where to look:

if f$search("sys$manager:sqlite*.*") .nes. "" then delete sys$manager:sqlite*.*;*

👿 Duh 👿
The backup procedures have been enhanced a bit as well. Logging of the MySQL backup went to the wrong place, and only the public web is backed up – until today. I chnaged both procedures: have the logging come to the right spot, and keep only the latest web backup available, otherwise the disk wqill be crowded with unused files. They are big enough, but all tyhese backups are not needed so I can clean them up anyway – if backup succeeds.
Why did mysql-watcher not run? Well, the answer is found in the logfile:

Resubmitted 30-Nov-2007 23:32:00.00
Resubmitted 30-Nov-2007 23:47:00.00
Resubmitted 31-Nov-2007 00:02:00.00

😳 Invalid date…

01-Dec-2007

Mail statistics of November
PMAS statictics and logfiles show for November:

Total messages : 3803
Blacklisted : 2882 = 76% (Avg 96 /day)
Refused relay : 208 = 3% (Avg 3.6 /day)
-----
Checked : 812 = 21% (Avg 27 /day)
Filtered : 453 = 12% (Avg 15/day )
-----
VALID 360 = 9% (Avg 12 /day)

In other words: 90% is JUNK