November 2018 – SYSMGR in the attic

29-Nov-2018

System moved
As mentioned, the Alpha system (and storage) has been taken off-line and off power in order to install it all in a rack. It took less time than anticipated: just a few hours. I didn’t have to prepare the location: the ‘table’ on which the rack and machines (about 150 kg in all) stand is sturdy enough to carry the weight, which saved me a lot of work and time. The action started at about 13:40; the system was shut down at 13:48 and restarted around 16:15, with one minor issue in startup: WCME was started before WASD, causing a failure starting WCME-overseer because CGI-BIN wasn’t defined yet (this is done in the startup of WASD). And the routine to define WGPAGETEXT (for the home page) didn’t work properly, which is strange since I have been busy correcting the previous errors…
The next action in this matter is installing the Itanium servers in the rack; all is prepared. I only need a 16-port switch to fit all connections.

26-Nov-2018

Preparations for moving machinery
Now that the rack has arrived and been assembled, and the power lines are (almost) done, it is time to move the server to a temporary location (within the attic, as there is no more room elsewhere) and have it run there, so I have space and time to make room (remove the current cupboard), install the rack and place the servers in it. Once that is finished, the Alpha server will (once more) be shut down, then installed in place and restarted.
My aim (!) is to do this exercise next Thursday – so I have a full day, and there is no problem if the activity needs an extension.

It means everything will be black for some time. Or I set up a small machine (emulation, perhaps) to serve an “Under Construction” page… We’ll see.

11-Nov-2018

Rearrangement in progress
I decided, quite some time ago, to rearrange the attic. The original workplace under the roof was a table 2.5 meters wide and 1.5 meters deep, which put the monitors more than an arm’s length from the edge of the desk: too far away, especially for 4K monitors, to be easily readable. Second, it stood under the window, so I was looking into the sunlight during my normal working hours (after 18:00). And the window needed to be shut if there was even a light rain in windy weather, to prevent the hardware from getting wet.
Also, there is quite a lot of old paperwork that is outdated and of no more interest, directly or potentially, so I could get rid of a lot of it, keeping just the books and binders that I will need, or want to keep regardless of usability: mainly OpenVMS documentation (the official paper set and books on system management and programming), language references and some projects I am still working on, or use.

It also means buying some furniture, and a relay rack to store all hardware (finally…). Not a complete enclosure, since it would be impossible to move it two stories up, and the ones that come as a DIY package cannot accommodate the Itanium servers: they are too short to handle the depth of these machines. Also, a relay rack offers better airflow, which is important as the attic can be hot in summer. Probably, the newly installed air conditioner will be able to cope with that next summer…

05-Nov-2018

Connection problems
When trying to do some redesign for this blog, all of a sudden connections failed. At least, from within the LAN. Even trying to access the webserver from the VMS machine itself failed: no connection could be made. Connections from the Internet seemed to get through, but availability looked to be interrupted now and then.
The sites that require https had no problem at all: I could access them and they responded nicely. Just the two sites that can be accessed using an unsecured connection failed.

I found one accessor with a large number of connections to the system, in particular to this blog’s main page. But the number of WASD processes was limited, and there was no large amount of error messages in the logs. Though the browsers mentioned a DNS error, it wasn’t one, since translation of www.grootersnet.nl just worked; it was just WASD not responding.
WASD was stopped and restarted but that didn’t solve the issue. I also made some adjustments to the router firewall to block this connection and restarted the router (it also restarted all by itself at some point), but the problem persisted.

I did some testing and analysis but found no reason why NO data AT ALL seems to be passed to the webserver: not even the mapping of abusers (who will get a message that they have been banned) was shown in WATCH, and not even the request was logged.

I contacted the WASD mailing list; even Mark is puzzled, but he also noted interruptions in accessing the homepage.
In the meantime, I noticed there are still some old addresses here and there, but since the system ran fine for a few weeks after the latest changes and then suddenly stopped responding, without any change in the environment, these were not causing the problems…

In all, I spent all night (until 5:00) trying to solve the issue, to no avail. This morning it seemed a bit better; I’m now outside the LAN and it looks fine from here. It may be a good idea to reboot the server, just to reset everything completely; hopefully that solves the problem.

Update
I tried what Mark suggested in his last reply: what happens if I use telnet to port 80 from another machine in the LAN? I had done that several times last night and the results were dramatic. But today the connection was set up in no time, and closed immediately. My mistake: it won’t run on SSH :). With telnet there is no problem: entering GET / gives me the page I requested. The same goes for normal access using the website URL: it just works, and fast.
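The telnet test can also be scripted, which helps when it has to be repeated from several machines in the LAN. Below is a minimal sketch in Python, not something that runs on the server itself. The function names `build_request` and `probe_http` are made up for this example, and it sends an HTTP/1.0 request with a Host header instead of the bare `GET /` typed into telnet.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.0 GET request, like the one typed into telnet."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def probe_http(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> str:
    """Connect to host:port, send a GET and return the first line of the reply.

    Raises OSError (for example a timeout or a refused connection) when the
    server cannot be reached, which is the symptom described above.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        data = b""
        # Read until the status line is complete or the server closes.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")


# Example use (not executed here):
#   print(probe_http("www.grootersnet.nl"))
```

A reply that starts with `HTTP/` means the webserver answered. A timeout or an immediately closed connection points at something in front of it, or at a server that is too busy to accept the connection.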

So it does seem to have been caused by an overload on port 80: a site at address 223.130.9.71 (according to Robtex a Chinese address, no further information) was accessing the site multiple times a second, until 9:21 this morning according to the router log. But both this address and any address from China are blocked in the router. At least, I would expect that. So it’s up to Draytek to explain how to block the front gate for any address (or country).
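To spot an accessor like this without scrolling through logs, the hits can be counted per address and per second. This is a minimal sketch in Python, assuming the log lines have already been parsed into (timestamp, address) pairs. The function name, the threshold and the sample data (documentation addresses, invented timestamps) are made up for illustration; the actual WASD and router log formats are not handled here.

```python
from collections import Counter
from datetime import datetime


def busiest_clients(hits, threshold=2):
    """Return {address: peak hits in one second} for addresses at or above threshold.

    `hits` is an iterable of (datetime, address) pairs.
    """
    per_second = Counter()
    for when, address in hits:
        # Bucket each hit by address and by whole second.
        per_second[(address, when.replace(microsecond=0))] += 1

    peaks = {}
    for (address, _second), count in per_second.items():
        if count >= threshold:
            peaks[address] = max(peaks.get(address, 0), count)
    return peaks


# Made-up sample: one address hits three times within the same second.
sample = [
    (datetime(2018, 11, 5, 10, 0, 0, 100000), "198.51.100.7"),
    (datetime(2018, 11, 5, 10, 0, 0, 400000), "198.51.100.7"),
    (datetime(2018, 11, 5, 10, 0, 0, 900000), "198.51.100.7"),
    (datetime(2018, 11, 5, 10, 0, 1, 200000), "198.51.100.7"),
    (datetime(2018, 11, 5, 10, 0, 0, 500000), "192.0.2.10"),
]
print(busiest_clients(sample))  # {'198.51.100.7': 3}
```

Any address that shows up here with several hits per second, for minutes on end, is a candidate for the block list.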

So NO reboot needed.

Local DNS issue
However, there is just one minor detail I need to solve.
$ tcpip show host
only shows the statically defined hosts. Not the ones that get their data from the DHCP server. Previously, DHCP would update the DNS database, but now it does not. Must have something to do with DNS address and some data in DHCP, I guess.

04-Nov-2018

As usual
No real surprises in cleaning up the mess of one month, but there is some concern about mail… Though at first glance, nothing seems wrong:
PMAS statistics for October
Total messages   : 3963 = 100.0 o/o
DNS Blacklisted  :  108 =   2.7 o/o (Files: 1)
Relay attempts   :  274 =   6.9 o/o (Files: 31)
Accepted by PMAS : 3581 =  90.3 o/o (Files: 31)
Handled by explicit rule
  Rejected       : 2762 = 77.1 o/o (processed), 69.6 o/o (all)
  Accepted       :  129 =  3.6 o/o (processed),  3.2 o/o (all)
Handled by content
  Discarded      :  288 =  8.0 o/o (processed),  7.2 o/o (all)
  Quarantained   :  383 = 10.6 o/o (processed),  9.6 o/o (all)
  Delivered      :   19 =   .5 o/o (processed),   .4 o/o (all)
and there was one day on which the number of relay attempts was scaled up:

Between 20-OCT-2018 01:27:20.28 and 20-OCT-2018 01:31:04.03: 228 attempts from address 142.11.210.66. All the usual: mimicking a Grootersnet.nl user and sending to the very same gmail.com address as on the other days, but there is a small difference. Robtex.com states about this address:

The PTR is bientions.net. The IP number is in Tulsa, United States. It is hosted by HOSTWINDS-4-ROUTE.
We estimate that it is used as PTR for 544 IP numbers. We have a premium report available for bientions.net.
Results found
Bonistein.cn, benisonit.com, benisonti.com, bentsioni.com, biointens.com, biotennis.com, bisontine.com, bonistein.com, enbitions.com, inbetsion.com, inbisonte.com and neobisint.com.

So it is still hosted by Hostwinds, who are not to blame: it’s one of their customers that is either abusing their connection, has been hacked, or is relaying email (and so doesn’t have sufficient awareness of security…).
The subnet has been added to the list of connections to be refused.
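Whether a given sender falls inside a refused subnet is easy to check with Python’s standard `ipaddress` module. The sketch below is only an illustration: the prefix length /24 is an assumption (the actual size of the blocked subnet is not stated here), and `BLOCKED` and `is_refused` are names made up for this example, not part of PMAS or the SMTP server.

```python
import ipaddress

# Assumed entry: the /24 around the offending address. The real refuse
# list may use a different prefix length.
BLOCKED = [ipaddress.ip_network("142.11.210.0/24")]


def is_refused(address: str, blocked=BLOCKED) -> bool:
    """Return True when the address lies inside one of the blocked networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in blocked)


print(is_refused("142.11.210.66"))  # True
print(is_refused("142.11.211.1"))   # False
```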

But the number of messages that are rejected (meaning that they cannot pass) is significantly higher. This is also reflected in the size of the OPERATOR.LOG files: usually about 50 blocks in size, but today exceeding this by 2, 3 or even 4 times. It’s mainly lines for email that is accepted by PMAS and passed to the SMTP server. If PMAS quarantines, discards or even rejects a message, the message passed is incomplete and will be dropped by the SMTP server, but I still get that signal…
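As a cross-check on the PMAS statistics for October above, the percentages can be recomputed from the counts. This is a small sketch in Python; the helper name `percent_truncated` is made up. It assumes the report truncates to one decimal instead of rounding, which is an inference from the figures and not something the report states.

```python
def percent_truncated(count: int, total: int) -> str:
    """Percentage of count in total, truncated (not rounded) to one decimal."""
    tenths = count * 1000 // total  # integer arithmetic, no float rounding
    return f"{tenths // 10}.{tenths % 10}"


TOTAL = 3963     # Total messages
ACCEPTED = 3581  # Accepted by PMAS ("processed")

of_total = {"DNS Blacklisted": 108, "Relay attempts": 274, "Accepted by PMAS": 3581}
of_accepted = {
    "Rejected": 2762,
    "Accepted": 129,
    "Discarded": 288,
    "Quarantained": 383,
    "Delivered": 19,
}

# Both breakdowns add up to their totals.
assert sum(of_total.values()) == TOTAL
assert sum(of_accepted.values()) == ACCEPTED

for name, count in of_accepted.items():
    print(
        f"{name:<13}: {count:>4} = {percent_truncated(count, ACCEPTED):>4} o/o (processed), "
        f"{percent_truncated(count, TOTAL):>4} o/o (all)"
    )
```

The counts are consistent: both breakdowns sum to their totals. The truncation shows in "Accepted by PMAS": 3581 of 3963 is 90.36 o/o, which would round to 90.4 but is reported as 90.3.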
