27-Sep-2013

It’s a phisherman!
One of the sites I encountered a few days ago now appears to be a bad guy. Though the header appears to be valid:

Return-Path: internationalcardservices.notificationiare@mailing.internationalcardservices.nl
Received: from DIANA.INTRA.GROOTERSNET.NL (192.168.0.2)
by diana.intra.grootersnet.nl (V5.6-ECO5, OpenVMS V8.3 Alpha);
Fri, 27 Sep 2013 10:57:27 +0000 (UTC)
X-PMAS-MAIL-FROM:
internationalcardservices.notificationiare@mailing.internationalcardservices.nl
Received: from unknown ([87.106.96.232] EXTERNAL) (EHLO s16978676) by
diana.INTRA.GROOTERSNET.NL ([192.168.0.200]) (PreciseMail V3.2); Fri, 27 Sep
2013 10:05:41 +0000
Received: from mailing.internationalcardservices.nl ([127.0.0.1]) by s16978676
with Microsoft SMTPSVC(7.5.7601.17514); Fri, 27 Sep 2013 12:05:28 +0200
From: International Card Services
<internationalcardservices.notificationiare@mailing.internationalcardservices.nl>
To: (my address)
Subject: Uw rekeningoverzicht bekijken en betalen
Date: 27 Sep 2013 12:05:26 +0200
Message-ID:
<20130927112751.4EA0D4FB379FEEC7@mailing.internationalcardservices.nl>
MIME-Version: 1.0
Content-Type: multipart/related;
boundary="----=_NextPart_000_0012_219D19A8.7D241EFA"
Return-Path:
internationalcardservices.notificationiare@mailing.internationalcardservices.nl
X-OriginalArrivalTime: 27 Sep 2013 10:05:28.0797 (UTC)
FILETIME=[17F140D0:01CEBB69]
<internationalcardservices>

and the content as welll, it is a phising attempt.
First, ICS normally sends just one reminder, and not two within a few hours. Nor will ISC send from an unknown address:

Received: from unknown ([87.106.96.232] EXTERNAL) (EHLO s16978676)

So I was triggered to check the included URL, and that is definitly NOT an ISCCards address:

href="http://www.lemrith.net/images/ICS.php"

Of course, the address has no longer access to the my network.
Lemmrith.net is actually a valid site: a small town in Germany (it is safe to check www.lemrith.net) but they have not secuired their site – given the fact that someone dropped a .PHP file on thein images directory. They have been notified.

24-Sep-2013

Just In Time
Purely coïncidental, but being there may prevent a problem.
At times it happens that my werkstation doesn’t get all settings form the DHCP server in one go and I need to renew the DHCP settings. Today was such a day: nothing seemed wrong when I started doing my thing, but as soon as I set up a connection to Diana using a terminal emulator, it failed because it’s logical address couldn’t be translated – meaning the DNS server (on Diana) didn’t respond. Renewal of the settings (ipconfig /renew) helped, but login hung – there was simply no response.
Luckily, it’s just a matter of rotating my chair to access the (graphical) console. $ SHO SYS revealed that there were 10 PMAS worker processes working – so it was obvious that, once more, someone was trying to break the mail server. BEcause there have been a number of issues lately with mail, I have the worker set up to log, so it soon became obvious that there was one particular server trying to fire large amount of connect requests that overwhelmed the spam agent:
24-SEP-2013 17:28:22.47: ======================================================
24-SEP-2013 17:28:22.47: RelayPlug: Connection accepted from 81.5.148.161
24-SEP-2013 17:28:22.48: RelayPlug: External IP address, so set up to check MAIL FROM:....
Failed to connect to 192.168.0.2-25 - connection refused
24-SEP-2013 17:28:22.49: RelayPlug: End connection event triggered
24-SEP-2013 17:28:22.67: ======================================================
24-SEP-2013 17:28:22.67: RelayPlug: Connection accepted from 81.5.148.161
24-SEP-2013 17:28:22.68: RelayPlug: External IP address, so set up to check MAIL FROM:...
Failed to connect to 192.168.0.2-25 - connection refused
24-SEP-2013 17:28:22.69: RelayPlug: End connection event triggered
24-SEP-2013 17:28:23.47: ======================================================

This went on, and on, and on…
To prevent further damage – because it was not to be expected the bombardment would stop by these actions – I blocked the address in the firewall.
Next, I started investigating.
These messages show the front-end couldn’t connect to port 25 – the SMPT port used for outgoing traffic. The fast – and easy – way to set things straight againwas sstopping the mail frontend, and restarting it. It removed the worker processes but left the main process in RWAST state. STOP/ID did indeed – after some time – stop the main frontend process.
Next, it was a matter of examining the firewall logs, and sure enough, that showed the full sequence. It started just before 17:00 (UTC) with a connect request to the mail server – by NAT, transferred to port 2525: the mail frontend. That responds with a test of the validity of that address. Before that is answered, the next requests arrives, causing a second request from the mail frontend, and so on. Within a minute, there was hardly any other traffic over the firewall than these mail connect requests and the forntend’s response:
Sep 24 16:56:05 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:02 ][@S:R=13:1, 81.5.148.161:52059->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=3513532061, Ack=0, Win=65535]
Sep 24 16:56:05 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:02 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:52059][TCP][HLen=20, TLen=44, Flag=AS, Seq=2090788352, Ack=3513532062, Win=16824]
Sep 24 16:56:05 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:02 ][@S:R=13:1, 192.168.0.2:42266->194.109.9.99:53][UDP][HLen=20, TLen=86]
Sep 24 16:56:05 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:02 ][@S:R=13:1, 192.168.0.2:60665->194.109.9.99:53][UDP][HLen=20, TLen=86]
Sep 24 16:56:05 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:02 ][@S:R=13:1, 192.168.0.2:7587->194.109.9.99:53][UDP][HLen=20, TLen=85]
Sep 24 16:56:05 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:02 ][@S:R=13:1, 192.168.0.2:44268->194.109.9.99:53][UDP][HLen=20, TLen=84]
Sep 24 16:56:06 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:03 ][@S:R=13:1, 192.168.0.2:25782->194.109.9.99:53][UDP][HLen=20, TLen=92]
Sep 24 16:56:07 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:04 ][@S:R=13:1, 192.168.0.2:46706->192.55.83.30:53][UDP][HLen=20, TLen=72]
Sep 24 16:56:09 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:06 ][@S:R=13:1, 192.168.0.2:21156->38.103.2.1:53][UDP][HLen=20, TLen=72]
Sep 24 16:56:09 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:06 ][@S:R=13:1, 192.168.0.2:36146->199.249.121.1:53][UDP][HLen=20, TLen=73]
Sep 24 16:56:11 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:08 ][@S:R=13:1, 192.168.0.2:62610->72.52.71.1:53][UDP][HLen=20, TLen=73]
Sep 24 16:56:11 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:08 ][@S:R=13:1, 81.5.148.161:50729->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2905819008, Ack=0, Win=65535]
Sep 24 16:56:11 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:08 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50729][TCP][HLen=20, TLen=44, Flag=AS, Seq=1903256155, Ack=2905819009, Win=16824]
Sep 24 16:56:11 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:08 ][@S:R=13:1, 81.5.148.161:50751->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=1608792645, Ack=0, Win=65535]
Sep 24 16:56:11 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:08 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50751][TCP][HLen=20, TLen=44, Flag=AS, Seq=491343422, Ack=1608792646, Win=16824]
Sep 24 16:56:11 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:08 ][@S:R=13:1, 81.5.148.161:50900->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2106594754, Ack=0, Win=65535]
Sep 24 16:56:11 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:08 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50900][TCP][HLen=20, TLen=44, Flag=AS, Seq=1610899123, Ack=2106594755, Win=16824]
Sep 24 16:56:12 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:09 ][@S:R=13:1, 69.197.128.26:47499->192.168.0.200:80][TCP][HLen=20, TLen=60, Flag=S, Seq=851656388, Ack=0, Win=14600]
Sep 24 16:56:12 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:09 ][@S:R=13:1, 192.168.0.200:80->69.197.128.26:47499][TCP][HLen=20, TLen=48, Flag=AS, Seq=1834372936, Ack=851656389, Win=61688]
Sep 24 16:56:12 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:09 ][@S:R=13:1, 81.5.148.161:52269->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=4034781716, Ack=0, Win=65535]
Sep 24 16:56:12 FW: [FILTER][Pass][LAN/RT/VPN->WAN, 476:36:09 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:52269][TCP][HLen=20, TLen=44, Flag=AS, Seq=3336740259, Ack=4034781717, Win=16824]
Sep 24 16:56:12 FW: [FILTER][Pass][WAN->LAN/RT/VPN, 476:36:09 ][@S:R=13:1, 81.5.148.161:52404->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=673951290, Ack=0, Win=65535]

and at some point, it was only the frontend trying to get to the sender, hundreds of attempts a minute.
After I blocked the adress, the frontend continued asking for the sender:
Sep 24 17:34:43 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:39 ][@S:R=13:1, 192.168.0.2:25747->194.109.9.99:53][UDP][HLen=20, TLen=84]
Sep 24 17:34:43 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:39 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50163][TCP][HLen=20, TLen=40, Flag=AF, Seq=345529474, Ack=3100051374, Win=16824]
Sep 24 17:34:44 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:14:39 ][@S:R=3:2, 81.5.148.161:50540->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2606349597, Ack=0, Win=65535]
Sep 24 17:34:44 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:39 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50794][TCP][HLen=20, TLen=40, Flag=AF, Seq=2937753010, Ack=3183395198, Win=16824]
Sep 24 17:34:44 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:40 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:0][TCP][HLen=20, TLen=44, Flag=AS, Seq=1040412343, Ack=310902461, Win=16824]
Sep 24 17:34:44 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:40 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:50280][TCP][HLen=20, TLen=40, Flag=AF, Seq=1422084251, Ack=3320381021, Win=16824]
Sep 24 17:34:44 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:40 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:52618][TCP][HLen=20, TLen=40, Flag=AF, Seq=4161431313, Ack=2911615049, Win=16824]
Sep 24 17:34:44 FW2: [FILTER][Pass][LAN/RT/VPN->WAN, 477:14:40 ][@S:R=13:1, 192.168.0.200:2525->81.5.148.161:53105][TCP][HLen=20, TLen=40, Flag=AF, Seq=3854292009, Ack=4072697059, Win=16824]

while the sender tried, and tried…Eventually, I stopped the frontend so these requests were now gone, and other traffic got a chance to get through:
Sep 24 17:57:44 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:37 ][@S:R=3:2, 81.5.148.161:52466->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=1711623955, Ack=0, Win=65535]
Sep 24 17:57:45 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:38 ][@S:R=3:2, 81.5.148.161:49595->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=3531258844, Ack=0, Win=65535]
Sep 24 17:57:46 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:39 ][@S:R=3:2, 81.5.148.161:50602->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2416492183, Ack=0, Win=65535]
Sep 24 17:57:46 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:39 ][@S:R=3:2, 81.5.148.161:51035->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=4075797788, Ack=0, Win=65535]
Sep 24 17:57:47 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:40 ][@S:R=3:2, 81.5.148.161:51649->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=1615220663, Ack=0, Win=65535]
Sep 24 17:57:48 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:41 ][@S:R=3:2, 81.5.148.161:53184->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=964671704, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:42 ][@S:R=3:2, 81.5.148.161:53108->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2694901457, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:42 ][@S:R=3:2, 81.5.148.161:50126->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2149152341, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50192->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=467935095, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50254->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=3213471951, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50316->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=614172963, Ack=0, Win=65535]
Sep 24 17:57:49 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50322->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=1804337752, Ack=0, Win=65535]
Sep 24 17:57:50 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50343->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=1710313407, Ack=0, Win=65535]
Sep 24 17:57:50 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:43 ][@S:R=3:2, 81.5.148.161:50554->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2321196107, Ack=0, Win=65535]
Sep 24 17:57:51 FW2: [FILTER][Block][WAN->LAN/RT/VPN, 477:37:44 ][@S:R=3:2, 81.5.148.161:51346->192.168.0.200:2525][TCP][HLen=20, TLen=48, Flag=S, Seq=2653175112, Ack=0, Win=65535]

In that time, I tried to figure out whose server this is:
$ dig -x 81.5.148.161
; < <>> DiG 9.3.1 < <>> -x 81.5.148.161
;; global options:  printcmd
;; Got answer:
;; ->>HEADER< <- opcode: QUERY, status: NOERROR, id: 44357 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 0 ;; QUESTION SECTION: ;161.148.5.81.in-addr.arpa.     IN      PTR ;; ANSWER SECTION: 161.148.5.81.in-addr.arpa. 7200 IN      PTR     mailhost2.cairnsbrown.co.uk. ;; AUTHORITY SECTION: 148.5.81.in-addr.arpa.  20819   IN      NS      dns2.as12513.net. 148.5.81.in-addr.arpa.  20819   IN      NS      dns1.as12513.net. 148.5.81.in-addr.arpa.  20819   IN      NS      dns1.eclipse.net.uk. ;; Query time: 416 msec ;; SERVER: 127.0.0.1#53(127.0.0.1) ;; WHEN: Wed Sep 25 07:30:05 2013 ;; MSG SIZE  rcvd: 164 $ It turned out to be a solliciter's firm, their mailserver is behaving badly, is badly configured (an open relay?), hacked or otherwise compromised. So I warned them - by mail. But I couldn't connect to the SMTP server. So I stopped and restarted it, but found that a number of TCPIP$SMTP processes were stalled in RWAST. I could stop two of them by STOP/ID - it tool a minute or so to get them off the hook - but two of them were waiting for a mailbox to be read. It was sufficient to open a channel to that mailbox, read it twice and close it, and these two were gone as well. Now I could send the message. But during the time I was composing the message, the offending server all of a sudden stopped bothering me; and no more mesages were sent after 18:05. There now is a request to Process.com: If one address sends messages in this rate: several within a second: don't bother to process them but simply disconnect. Tuned the firewall rules
I fine-tuned teh firewall rules so I can still block eDonkey traffic but still have the forntend communicate to the two sites found yesterday. No others – yet.

23-Sep-2013

More on port 2525 traffic
There is another source for information to investigate: I have the PMAS set up so the worker processes log as well (default, it’s off) and I found a hint on what’s going on.
Both addresses have the same type of signature: Mail is received for procesing, a test is made to check whether the sender is located in a blacklist, and that connection is aborted:

23-SEP-2013 21:08:24.14: RelayPlug: Connection accepted from 208.75.123.163
23-SEP-2013 21:08:24.17: RelayPlug: External IP address, so set up to check MAIL FROM:....
23-SEP-2013 21:08:26.38: RelayPlug: MAIL FROM:<(some large 'name')@in.constantcontact.com>
23-SEP-2013 21:08:26.38: PMASplug: MAIL FROM: <(some large 'name')@in.constantcontact.com>
23-SEP-2013 21:08:26.38: DNSBLplug: MAIL FROM: <(some large 'name')@in.constantcontact.com>
23-SEP-2013 21:08:26.39: DNSBLplug: Checking address 208.75.123.163....
23-SEP-2013 21:08:26.39: DNSBLplug: Address action = 0 (OK (no match))
23-SEP-2013 21:08:26.39: DNSBLplug: Checking domain name in.constantcontact.com....
23-SEP-2013 21:08:26.39: DNSBLplug: Domain action = 0 (OK (no match))
Client connection error (connection reset by peer ); aborting session
23-SEP-2013 21:08:26.43: RelayPlug: End connection event triggered

So the domains are in.constantcontact.com (an email host) and mailing.internationalcardservices.nl – both respectable companies.

There has been a connection from in.constantcontact.com but that came from another address:

23-SEP-2013 20:16:59.24: RelayPlug: Connection accepted from 208.75.123.225
23-SEP-2013 20:16:59.24: RelayPlug: External IP address, so set up to check MAIL FROM:....
23-SEP-2013 20:17:01.88: RelayPlug: MAIL FROM:<(generated_username)@in.constantcontact.com>
23-SEP-2013 20:17:01.88: PMASplug: MAIL FROM: <(generated_username)@in.constantcontact.com> 23-SEP-2013 20:17:01.88: DNSBLplug: MAIL FROM: <(generated_username)@in.constantcontact.com>
23-SEP-2013 20:17:01.88: DNSBLplug: Checking address 208.75.123.225....
23-SEP-2013 20:17:01.88: DNSBLplug: Address action = 0 (OK (no match))
23-SEP-2013 20:17:01.90: DNSBLplug: Checking domain name in.constantcontact.com....
23-SEP-2013 20:17:01.90: DNSBLplug: Domain action = 0 (OK (no match))
23-SEP-2013 20:17:02.06: RelayPlug: RCPT TO:<(my address)>
23-SEP-2013 20:17:02.44: PMASplug: Created source header X-PMAS-External: name [address] (HELO (name))
23-SEP-2013 20:17:03.12: PMASplug: Message complete, deferred size is 37588
23-SEP-2013 20:17:03.13: PMASplug: Adding header: X-PMAS-External: name [address] (HELO (name))
23-SEP-2013 20:17:03.13: PMASplug: RCPT TO: (my address) / (my address)
23-SEP-2013 20:17:03.13: PMASplug: Calling PMAS to process the message....
23-SEP-2013 20:17:25.23: PMASplug: PMAS returned status: 1
23-SEP-2013 20:17:25.23: PMASplug: Number of recipients: 1, number of dispositions: 1
23-SEP-2013 20:17:25.23: PMASplug: Message quarantined or discarded for all recipients
23-SEP-2013 20:17:25.24: DNSBLplug: Reset event triggered
23-SEP-2013 20:17:25.39: RelayPlug: End connection event triggered

and that message is indeed found in the ‘discarded’ box.

Why this started happening in the first place? The protocol (eDonkey) is blocked from the beginning and these messages never appeared before. And just these two addresses? The only remarkable is the size of the FROM address: almost 90 characters. If this is similar to the eDonkey signature, it explains the behaviour…
A second question in this matter: in the firewall, I have defined these two addresses as objects to be blocked, but they still access the network: the messages keep appearing

So Draytek has to answer a question or two …

22-Sep-2013

Machine check
In publishing the images of our Moroccan holiday this year, something had gone wrong in generating the album and I didn’t notice until I published what I thought to be the right data. On checking, I noticed the error, and yesterday evening I found what went wrong and would repair it. But here something else stroke me: I was missing something in the logs. First, I thought it had been a power outage, but since there was no other sign of such an event: every electric clock would notify outage by blinking but none did, so Diana must have had a reboot for some reason.
So I checked, and indeed:
$ ana/crash

OpenVMS system dump analyzer
...analyzing an Alpha compressed selective memory dump...

Dump taken on 21-SEP-2013 08:13:04.29 using version V8.3
MACHINECHK, Machine check while in kernel mode

SDA>

Checking the error log, this is what is says around the time of the crash:

**** V3.4  ********************* ENTRY 3791 ********************************

Logging OS                        1. OpenVMS
System Architecture               2. Alpha;
OS version                           V8.3
Event sequence number         47555.
Timestamp of occurrence              21-SEP-2013 08:13:04
Time since reboot                    276 Day(s) 20:42:02
Host name                            DIANA

System Model                         AlphaServer DS10 617 MHz

Entry Type                        2. Machine Check Error

========================
Raw Event Data Dump
========================

Entry# (record in file)        3791.

Entry Body Size:          x0000016C
Entry body:

          15--<  -12  11--<-08  07--<-04  03--<-00   :Byte Order   0000:    00000000  000B0000  0022FFF9  000007B2   *......".........*   0010:    00000000  00000020  2020414E  41494408   *.DIANA   .......*   0020:    B9C300AD  9A2ADC19  68200002  60030000   *...`.. h..*.....*   0030:    016D011A  00000000  20202020  332E3856   *V8.3    ......m.*   0040:    31534420  72657672  65536168  706C4118   *.AlphaServer DS1*   0050:    00000000  0000007A  484D2037  31362030   *0 617 MHz.......*   0060:    000000A0  00000018  00000000  000000C8   *................*   0070:    00000000  00000000  00000001  00000098   *................*   0080:    00000000  122CCC00  00000000  00000000   *..........,.....*   0090:    00000000  00000000  00000000  00000000   *................*   00A0:    00000000  00000009  00000000  00000001   *................*   00B0:    FFFFFFFF  81F3AC00  00000000  00000290   *................*   00C0:    00000000  00000000  0000006E  FFC00000   *....n...........*   00D0:    00000000  00008000  00000000  00000000   *................*   00E0:    00000000  00000000  FFFFFEFC  1A300386   *..0.............*   00F0:    00000000  00000000  00000000  00000000   *................*   0100:    00000000  00000000  00000000  00000004   *................*   0110:    00000000  00000000  00000000  00000000   *................*   0120:    00010007  000C0038  00000000  00000000   *........8.......*   0130:    00000000  00000000  00000000  00006005   *.`..............*   0140:    00000000  00000000  00000000  00000000   *................*   0150:    0000010A  00000000  00001244  014C0091   *..L.D...........*   0160:              00000000  00010000  00000008   *    ............*  **** V3.4  ********************* ENTRY 3792 ********************************  Logging OS                        1. OpenVMS  System Architecture               2. Alpha OS version                           V8.3      Event sequence number         47556. Timestamp of occurrence              21-SEP-2013 08:13:04  Time since reboot                    276 Day(s) 20:42:02  Host name                            DIANA     System Model                         AlphaServer DS10 617 MHz  Entry Type                       37. Crash Re-Start  Bugcheck Minor class              1. Crash Re-start  Bugcheck Msg                         MACHINECHK, Machine check while in kernel mode  Process ID                x00010000  Process Name                         NULL  KSP                       xFFFFFFFF854CFD80  ESP                       xFFFFFFFF854D1000  SSP                       xFFFFFFFF854CB000  USP                       xFFFFFFFF854CB000  R0                        x0000000000000000  R1                        x0000000000000210  R2                        xFFFFFFFF81C3FE50  R3                        xFFFFFFFF81C402F0  R4                        x0000000000000080  R5                        x0000000000000000  R6                        xFFFFFFFF82036038  R7                        x0000000000000038  R8                        x0000000000000000  R9                        x0000000000000000  R10                       x0000000000000002  R11                       xFFFFFFFF81C0B1C8  R12                       x000000007BBBBBC8  R13                       xFFFFFFFF81C57110  R14                       x0000000000000001  R15                       x0000000000000000  R16                       x0000000000000215  R17                       x0000000000000002  R18                       xFFFFFFFF82032004  R19                       x8203208000000000  R20                       x8203228000000000  R21                       x8203208000000000  R22                       x0000000000000000  R23                       xFFFFFFFF81C0A0F8  R24                       x0000000000000000  R25                       x0000000000000001 
This means deep digging, but there is little chance to find out what exactly has gone wrong - after being available for over 270 days - without access to the source code or (better!) listings. However: the system restarted flawlessly, and all works like it should. No wonder I didn't notice; if it were not that I missed something in the logfiles, I wouldn't have noticed for a long time it had happened...
Port 2525 abused?
The router has been set up to block certain traffic, and this time there is definitively something weird going on. I get these message:

Event Time : 2013/09/22 13:53:44, [CSM_AE] [eDonkey] [Block]
Packet info: 192.168.0.200:2525 -> 208.75.123.163:46328, PR tcp
-AP-----------------------------------------------

and
Event Time : 2013/09/22 12:16:31, [CSM_AE] [eDonkey] [Block]
Packet info: 192.168.0.200:2525 -> 87.106.96.232:57343, PR tcp
-AP-----------------------------------------------

Just these two addresses…
Port 2525 is the PMAS port – incoming mail – that has been set to be accepting mail for my domain only, disallowing relay, for obvious reasons. So what causes a torrent-like behaviour on this port?
It look working – though stated “Disabled” it actulally isn’t:
$ tcpip sho serv

Service             Port  Proto    Process          Address            State

BE-SMTP             2525  TCP      PTSMTP           0.0.0.0            Disabled
because PMAS connects directly to this port.
There have been more strange issues with mail, lately.
The logfile does sometimes show messages like:
%%%%%%%%%%%  OPCOM   5-SEP-2013 08:42:42.53  %%%%%%%%%%%
Message from user WILLEM on DIANA
%PTSMTP-E-WORKERDIED, worker PTSMTP    0056  (21265CA4) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket -
-SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from <address>,<port>

for no reason (this may happen if there has been a bulk of messages before this message but here, it was rather quiet), and some days, I find:
%%%%%%%%%%%  OPCOM   5-SEP-2013 11:32:35.30  %%%%%%%%%%%
Message from user TCPIP$SMTP on DIANA
%TCPIP-W-SMTP_NOSPAMRLY, Rejected relay to  <user@domain> from client IP address 10.0.0.1 as suspected SPAM

but though this address is set up on my second network interface, it has been disabled:
$ tcpip sho int
                      
                      
               Packets
Interface   IP_Addr         Network mask          Receive          Send     MTU

 LO0        127.0.0.1       255.0.0.0              330183        330183    4096
 WE0        192.168.0.2     255.255.255.0         3620928       2038155    1500
 WE1        ?                                           1             1    1500
$ tcpip sho conf
_Facility: inter
_PARAM1:

 Interface: LO0
   IP_Addr: 127.0.0.1         NETWRK: 255.0.0.0         BRDCST:

 Interface: WE0
   IP_Addr: 192.168.0.2       NETWRK: 255.255.255.0     BRDCST: 192.168.0.255

 Interface: WE1
   IP_Addr: 10.0.0.1          NETWRK: 255.0.0.0         BRDCST: 10.255.255.255

$ tcpip ifconf -a
LO0: flags=100c89<up ,LOOPBACK,NOARP,MULTICAST,SIMPLEX,NOCHECKSUM>
     inet 127.0.0.1 netmask ff000000 ipmtu 4096

TN0: flags=80<noarp>

TN1: flags=80</noarp><noarp>

WE0: flags=c43<up ,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *inet 192.168.0.2 netmask ffffff00 broadcast 192.168.0.255 ipmtu 1500
     inet 192.168.0.200 netmask ffffff00 broadcast 192.168.0.255 ipmtu 1500

WE1: flags=c02<broadcast ,MULTICAST,SIMPLEX>
    *inet 10.0.0.1 netmask ff000000 broadcast 10.255.255.255 ipmtu 1500</broadcast></up></noarp></up>

I didn’t see this before, it is just the last week or so that this happens. Nothing particular is found in the PMAS log files for these addresses, so I turned to the most reliable source of traffic of incoming and outgoing traffic:the router/firewall log. And there it was:
Sep 22 02:56:59 charon.intra.grootersnet.nl Charon2: [FILTER][Pass][WAN->LAN/RT/VPN, 414:40:37 ][@S:R=13:1, 208.75.123.163:60401->192.168.0.200:2525][TCP][HLen=20, TLen=60, Flag=S, Seq=4070629722, Ack=0, Win=5840]
Sep 22 02:56:59 charon.intra.grootersnet.nl Charon2: [FILTER][Pass][LAN/RT/VPN->WAN, 414:40:37 ][@S:R=13:1, 192.168.0.200:2525->208.75.123.163:60401][TCP][HLen=20, TLen=48, Flag=AS, Seq=3896561846, Ack=4070629723, Win=16824]
Sep 22 02:57:03 charon.intra.grootersnet.nl Charon2: [CSM_AE][Block][eDonkey][@S:R=13:1, 192.168.0.200:2525->208.75.123.163:60401][TCP][HLen=20, TLen=148, Flag=AP, Seq=3896562051, Ack=4070629856, Win=16824]

Sure, from the address 208.75.123.163 there has been a connection to port 2525 – which is PMAS, and there is a first reply almost immediately. But a second reply is blocked by the router, due to it’s signature (I don’t allow torrents and bulk downloads from my server). I just have to guess what they might have sent to the server – by mail – causing PMAS to respond in that way, there is no way to tell: the firewall log doesn’t tell, and PMAS doesn’t point to anything. After latest reboot, the worker logs were not enabled automatically so there is no data on that – yet.
The same applies to the second address: 87.106.96.232. Neither of these can really be tracked down to a particular provider. So I had to ban them from the network.

08-Sep-2013

A family member sent a message – by SMS and Skype – that mail bounced. No mention of the reason,
A brief look into the system revealed that just the main process of PMAS was running – no worker processes. So I stopped PMAS – and got a number of error messages – mainly that logicals were not defined.
Next, I looked into the PMAS log files. The first strange thing was that there was a PMAS.LOG-2013-09-04, and a PMAS.LOG – dated 05-Sep-2013. I would expect files PMAS.LOG-2013-09-05, PMAS.LOG-2013-09-06 and PMAS.LOG-2013-09-07, but these were all missing.PMAS.LOG however ended with the last message at 22:26:34.06, a message being quarantined. But after that, it seems no new messages were received!
Looking into PMAS_STARTUP.LOG, that contains all messages also sent to OPCOM (and therefore, be logged in OPERATOR.LOG) revealed why:
5-SEP-2013 23:13:23.51 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 005D (2126D8E0) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,17314

This specific address (62.175.202.16) continues sending messages at a high rate, until
** 7-SEP-2013 08:57:25.13 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 022F (2126B0B5) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,50436

and after that, a number of other addresses took over – it might have been the same source using a anonymiser. These addresse included: 182.18.222.91, 201.240.162.58, 200.37.163.163, 123.16.92.3,
181.67.6.108, 113.172.23.195, and more.
At some point, the main program notices:
** 6-SEP-2013 00:12:25.18 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 9310 connections accepted; 0 workers currently active

This happens when someone tries to block the system for the same reason – as, for example in Jily, but the main process was able to restart a new workerprocess:
** 19-JUL-2013 02:12:24.88 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 4439 connections accepted; 0 workers currently active

** 19-JUL-2013 04:12:24.88 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 4441 connections accepted; 1 worker currently active
Pid Process Name State Connections Serviced
21257C5F PTSMTP 001D idle 2

and at some moments, the maximum of worker-processes (10) is reached, but it never caused a problem:
** 23-JUL-2013 23:05:44.23 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 4891 connections accepted; 10 workers currently active
Pid Process Name State Connections Serviced
21257F2D PTSMTP 0024 idle 226
212530A6 PTSMTP 0025 idle 27
21261FE7 PTSMTP 0026 exiting 1
21257068 PTSMTP 0027 exiting 1
2125FE69 PTSMTP 0028 exiting 1
2126236A PTSMTP 0029 exiting 1
2125D3ED PTSMTP 002A exiting 1
212602EE PTSMTP 002B exiting 1
2125D002 PTSMTP 002C exiting 1
2125F883 PTSMTP 002D exiting 1

since the surplus simply dies out.
But this aabuser had sent so many messages that the last ‘normal’ signal was written early 04-Sep-2013:
** 4-SEP-2013 04:12:25.17 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 8050 connections accepted; 2 workers currently active
Pid Process Name State Connections Serviced
2126DD69 PTSMTP 0055 idle 174
21265CA4 PTSMTP 0056 idle 16

After that, one of the server processes crashed, and a second was never brought into existence:

** 4-SEP-2013 06:12:25.17 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 8057 connections accepted; 1 worker currently active
Pid Process Name State Connections Serviced
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 117.217.5.97,10830

** 4-SEP-2013 06:12:25.17 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 8057 connections accepted; 1 worker currently active
Pid Process Name State Connections Serviced
21265CA4 PTSMTP 0056 idle 18

From that moment on, only one worker process remains – but when that one died, another was started. and when the bomb started, one after the other that was created went down:
** 5-SEP-2013 23:13:23.51 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 005D (2126D8E0) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,17314

** 5-SEP-2013 23:13:34.03 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 005A (2126A8DC) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,22436

** 5-SEP-2013 23:13:43.06 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 005E (212733E4) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,24835

** 5-SEP-2013 23:14:35.74 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 0060 (21272867) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,6114

** 5-SEP-2013 23:14:47.17 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 0061 (2126C1EC) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,37761

until at some moment, none was active:
** 6-SEP-2013 00:11:58.20 **
%PTSMTP-E-WORKERDIED, worker PTSMTP 007A (2126ED0A) terminated unexpectedly
-PTSMTP-I-WORKEREXIT, worker exiting
-PTSMTP-E-SOCKETFAILED, failed to create socket - -SYSTEM-W-DEVALLOC, device already allocated to another user
-PTSMTP-I-WORKERCONN, while processing connection from 62.175.202.16,60929

** 6-SEP-2013 00:12:25.18 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 9310 connections accepted; 0 workers currently active

and it continues. The last statusmessage – that is normally sen every 2 hours – was found to be:
** 8-SEP-2013 14:12:25.19 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 10417 connections accepted; 0 workers currently active

until I stopped the program:
** 8-SEP-2013 16:09:23.24 **
%PTSMTP-I-CTRLEXIT, PTSMTP exiting
-PTSMTP-I-EXITREQUESTED, exit requested by operator

** 8-SEP-2013 16:09:23.24 **
%PTSMTP-I-DBSTATUS, status for configuration PMAS_ROOT:[DATA]PTSMTP.CONF:
Total of 10464 connections accepted; 0 workers currently active

In the end – when I stopped the service, it has ‘processed’ about 2500 message in 3 days ; these are the ones that actually came through. The number of SYSTEM-W-DEVALLOC messages in those days is about 850, and the number of messages that the service limit has been exceed, is about 90.
The real amount is found in the router logging: from the beginning until I stopped the service, the mail filter had to handle over 2400 message – of which over 1100 sent by 62.175.202.16, several every second.
Too much for PMAS to handle.
From a certain point, several other senders kick in but My guess is these are trying this for a long time; and because PMAS had been overloaded and is unable to start new worker processes, these encounter the same problem as other legitimate senders.

In hypewrSpy – on eof the tools in which I check the system performance – there is no real clue on what was going on. I still have to look into the T4 data but I doubt there will some clue.

In retrospect, last Friday I also noted that there was very little mail coming in – either deliverable of blocked by the system (well – the spam filter).

I have informed Process.com of the issue: That PMAS was unable – for some reason – to create a new process, although the system has slots available (the maximum seen is 67 where I have room for 110 processes), and that PMQAS should block any repeated requests over a certain amount from one address, like I have set for WASD; and signal – by email, for instance, that a site tries to block access to ail.
And, of course, the offending address – owned by ono.com in Spain – has been denied access in the router – and I would like to inform ONO.COM if only I found an address. But the site is already mentions a a potentially dangerous one. Probably trying to relay – but even that cannot be determined. The last log is of 4-Sep-2013 – so there hasn’t been an attempt since than – or could not be written?