29-Aug-2015

New memory
New memory has arrived: the full 2 GB, now stated to be “Compaq branded”. I installed it all tonight and so far it is all in working order. The first set will be destroyed, because it is too unreliable with these bad buffer chips.
Throttle and Max-execution-time
The problems with WordPress seem twofold.
First, as I expected, WASD’s throttle causes problems. Immediately after Diana was started, I started the blog. It started showing up, but stopped halfway through the calendar or the links, or gave me a 503 error (too many processes in the FIFO queue), and never showed the text. And there were no other processes accessing the blog except for this site itself:
[Screenshot: 2015-08-29_21-50-34]
First of all, I tried to give the processes a bit more room: all values three times as large. That did help a bit, but now it became clear that the maximum execution time (set to 90 seconds) was too low. So I gave the process two minutes to execute, and disabled throttling – for now, until I find a way to throttle only incoming requests from sites other than my own.
Something like:
if (!remote-addr:(my address)) throttle=5,0,0,8,00:02:00,00:05:00
So the same rule as before, but limited to ‘foreign’ addresses, to prevent an overload.
The overall effect seems dramatic, especially in the number of processes: it dropped!
[Screenshot: 2015-08-29_22-49-50]
but it shows that WordPress – most likely, at least – may start more than just a few processes…

27-Aug-2015

Throttle
Most definitely.
It shows when looking at the logs:
%HTTPD-W-NOTICED, 27-AUG-2015 06:24:04, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 68.180.230.158
-NOTICED-I-URI, GET (18 bytes) /sysblog/?m=201201
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:194/0 net:194/0
%HTTPD-W-NOTICED, 27-AUG-2015 06:27:36, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 82.161.236.244
-NOTICED-I-URI, GET (9 bytes) /sysblog/
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:357/0 net:357/0
%HTTPD-W-NOTICED, 27-AUG-2015 06:27:52, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 82.161.236.244
-NOTICED-I-URI, GET (9 bytes) /sysblog/
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:357/0 net:357/0
%HTTPD-W-NOTICED, 27-AUG-2015 06:27:54, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 82.161.236.244
-NOTICED-I-URI, GET (18 bytes) /sysblog/index.php
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:366/0 net:366/0
%HTTPD-W-NOTICED, 27-AUG-2015 06:28:17, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 66.249.67.27
-NOTICED-I-URI, GET (16 bytes) /sysblog/?p=1095
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:357/0 net:357/0

The address 82.161.236.244 is my own. And looking at the active processes, there are a number of WASD processes running SYSBLOG, and some of them have the same address. Some of them show up in the list that is throttled.
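As a side note, the hex string in the -NOTICED-I-CGI field is just ASCII; decoding it confirms the %SYSTEM-F-ABORT text that follows it in the log line, and tallying the -NOTICED-I-CLIENT lines quickly shows which addresses trigger these notices. A small sketch (plain Python, run off-system on a copy of the log; the sample text is trimmed from the entries above):

```python
import re
from collections import Counter

# Trimmed sample of the WASD server log entries shown above.
LOG = """\
%HTTPD-W-NOTICED, 27-AUG-2015 06:24:04, CGI:2107, not a strict CGI response
-NOTICED-I-CLIENT, 68.180.230.158
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
%HTTPD-W-NOTICED, 27-AUG-2015 06:27:36, CGI:2107, not a strict CGI response
-NOTICED-I-CLIENT, 82.161.236.244
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
"""

def decode_cgi_payload(hexstring):
    """The -NOTICED-I-CGI payload is hex-encoded ASCII; turn it back into text."""
    return bytes.fromhex(hexstring).decode("ascii")

# Which clients trigger the 'not a strict CGI response' notices, and how often.
clients = Counter(re.findall(r"-NOTICED-I-CLIENT, (\S+)", LOG))

print(decode_cgi_payload("2553595354454D2D462D41424F52542C2061626F7274"))
# → %SYSTEM-F-ABORT, abort
print(clients.most_common())
```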

The throttle settings for the blogs show it can take some time for such a process to continue:

Throttle Report

and it may be that this causes problems if one or more of the worker processes (there may be several, concurrent or in sequence) are stuck in the FIFO queue and time out. A likely scenario, given the number of entries that got into the FIFO queue; it exceeds any number I’ve seen with the older WordPress version – even with PHP 5.2.13…

This is something that definitely needs more investigation: the combination of throttle and PHP to begin with, though it is very likely caused by WordPress. What to look for is the number of processes actually involved, what they do and how they interact. IIRC, in earlier investigations I found that WordPress will cause several WASD worker processes to be started, apparently either processing PHP code or executing the results. If these processes depend on each other, where one waits for another to finish, and the one it waits for is held in the FIFO queue, or just rejected because the queue is exhausted, there is trouble.
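That scenario can be illustrated with a toy model (plain Python, not actual WASD behaviour; the numbers 5 and 8 are taken from the throttle rule above): a burst of requests fills the concurrent slots first, then the FIFO queue, and anything beyond that gets a 503.

```python
from collections import deque

def simulate(arrivals, concurrent=5, fifo_depth=8):
    """Toy throttle model: requests beyond the concurrent limit go into a
    FIFO queue; once that is full too, further requests get a 503.
    Assumes the burst is instantaneous, so no request finishes in between."""
    busy = 0
    fifo = deque()
    rejected = 0
    for req in range(arrivals):
        if busy < concurrent:
            busy += 1            # request gets a processing slot immediately
        elif len(fifo) < fifo_depth:
            fifo.append(req)     # request waits in the FIFO queue
        else:
            rejected += 1        # queue exhausted: 503, "too many processes"
    return busy, len(fifo), rejected

# One page view fanning out into 20 near-simultaneous PHP requests:
print(simulate(20))  # → (5, 8, 7): 5 running, 8 queued, 7 rejected
```

If the 5 running workers are each waiting on one of the 8 queued (or 7 rejected) ones, nothing ever completes: exactly the "broke halfway" behaviour seen on the blog.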
For this, I set up my old Personal Workstation 600au again as Daphne, after adding memory (to a whopping 512 MB), and booted it into the cluster. Now it’s a matter of setting up the test environment (MySQL is already installed; I need the same WASD, PHP and WordPress versions, setup and data as on the main system) and seeing how it behaves. There will also be a need for some software to keep track of all processes in the system.

Since it is within the cluster, I could easily refer to the real stuff as well, but execute it here. Just a matter of the right logicals and WASD mapping.
A difference in load
Some days, I check the load of the previous day. Yesterday’s load was quite different from the one a few days back:

Load over 26-Aug-2015

I did notice a site from Russia that constantly poked one of the WordPress files, which forced me to restart WASD (silently) that day at about 11:00 UTC; after that, the load has been fairly even, like this one.
This ‘user’ may have caused problems by continuously sending these requests, causing the FIFO queue to be constantly exhausted (some requests returned a 503 error – “Service unavailable” – as a result).
So that is yet another thing to look at.
On the memory
I contacted the supplier about the memory issues, and he will replace the bad one (well, two of them, to have the same batch in that bank). But shortly afterwards, he asked me to send him a picture of one of the chips on the DIMM:
Buffer chip

Within a few hours, I got the answer:

And there is the problem.
These DIMMS (we purchased these from HP!) are bad. The buffer chips are some that were intermittently defective
causing no apparent reason for failure.
I am sending 2GB of memory. Please do not resell the memory. please destroy it and send me a photo of it crushed or damaged as it should not find its way back into the marketplace

First of all: shame on HP. David is right: these should be removed from circulation.
As soon as the new memory has arrived, is installed and has proven to be functioning well, these DIMMs will be destroyed. There is, however, one problem I have to tackle: VAT. This is a repair, but customs won’t buy that. It might be circumvented by using mail, but:

I will send Fedex. I have had people hit up for large fees to with the mail service. I did notate that it was a repair return so you can tell Fedex to check their records.
You will need to reference the original fedex tracking no.

That’s still to be done… (luckily I didn’t remove the original mail).

24-Aug-2015

Changes in load
Something has changed: where CPU usage used to vary continuously over the day, it now peaks every 90 minutes or so: it drops significantly, gradually increases to about 80% – and drops again:
[Screenshot: 2015-08-24_09-29-33]
No idea yet where this comes from…
Changing the execution time of the PHP processes didn’t help much. But before I could add something, the blog didn’t show up; that is: it started displaying but broke halfway. So I started the admin pages and found that quite a lot of requests succeeded, while others stalled or simply broke: network connections dropped, processes exited… So there is another possible cause: the throttle. Intended to prevent overload, it now bites back: PHPWASD does start a number of worker processes, and I have already noticed that this version of WordPress puts quite a load on the resources. So a maximum of 5 may be too low.
So I’ll increase that.

23-Aug-2015

PHP errors?
The operator log of yesterday shows quite a number of messages of this type:

%%%%%%%%%%% OPCOM 22-AUG-2015 07:43:14.63 %%%%%%%%%%%
Message from user HTTP$SERVER on DIANA
Process WASD:80 reports
%HTTPD-W-NOTICED, CGI:2107, not a strict CGI response
-HTTPD-I-SCRIPT, /sysblog/index.php (sysblog:[000000]index.php) phpwasd:
-HTTPD-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort

The server log tells the same story:

%HTTPD-W-NOTICED, 22-AUG-2015 07:43:14, CGI:2107, not a strict CGI response
-NOTICED-I-SERVICE, http://www.grootersnet.nl:80
-NOTICED-I-CLIENT, 82.161.236.244
-NOTICED-I-URI, GET (9 bytes) /sysblog/
-NOTICED-I-SCRIPT, /sysblog/index.php sysblog:[000000]index.php (phpwasd:) SYSBLOG:[000000]index.php
-NOTICED-I-CGI, 2553595354454D2D462D41424F52542C2061626F7274 (22 bytes) %SYSTEM-F-ABORT, abort
-NOTICED-I-RXTX, err:0/0 raw:357/0 net:357/0

and it does so on earlier days as well. This happens on a number of files, but mainly on index.php and xmlrpc.php; and not just from my address…

It is not caused by PHP code. I have set in PHP.INI:
log_errors = On
log_errors_max_len = 1024
ignore_repeated_errors = Off
ignore_repeated_source = Off
error_log = user:[phplog]php_errors.log

and the last entry in the logfile is from 31-May-2015 – when I was experimenting with a higher PHP version.

If it were reproducible, WATCH would show the reason. But it seems to happen at random; I cannot recognize a pattern here. And without crash data, it will not be easy to find the cause.
There may be one thing though: the maximum execution time. Reset to 60 seconds as before, it may be a little too low, so it may cause a PHP worker process to be stopped by the PHP engine before it has reached the end of processing – hence “ABORT”. So I gave it a bit more time (30 seconds extra); we’ll see what happens next. Another candidate may be max_input_time, kept at the default value of 60 seconds for now.
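In PHP.INI terms, the change described above would look something like this (a sketch with assumed values: the previous 60 seconds plus the extra 30):

```ini
; sketch of the assumed PHP.INI timing values
max_execution_time = 90    ; was 60; a worker exceeding this is aborted
max_input_time = 60        ; left at the default for now
```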

This is the only issue, as far as I can see. The system is at about 75–80% of internal memory, without a lot of hard paging (to disk, that is) – there is apparently enough free space to be used. That is reflected in the T4 data: a low hard page rate (unless a process is started, especially a PHPWASD process, I guess – but that is to be expected).
One thing to improve the performance of PHP would be to install PHPSHR. Well, some time next week 🙂

22-Aug-2015

New licenses installed
My current licenses expire September 7th, so to be sure I requested a new set. Within a day, John Egolf, who handles hobbyist licenses at HP, sent me the new set. It comes as a command procedure, so it is sufficient to copy it to the VMS system and execute it.
So VMS, layered products and compilers are covered until 18-Sep-2016.
A new license for PMAS has also been requested at Process Software; Hunter sent me a new one that will expire two weeks earlier. I installed that one as well.