Well ... getting closer ...
So, Apache communicates with
srwxr-xr-x 1 www-data www-data 0 Aug 21 08:56 /run/mailman3-web/uwsgi.sock
and apparently writes that socket fine,
but when reading it ...
HTTP/1.1 500 Internal Server Error
So, looks like there's some kind of error somewhere between there and postorius.
Looks like postorius is successfully writing lots of:
HTTP/1.0 200 OK
responses ... but (presumably) those somehow aren't making it to Apache?
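For checking the socket end of that independently of Apache, here is a minimal sketch in Python (not what was actually run here). It only tests whether a path exists, is a socket, and accepts a connection; it does not speak the uwsgi protocol. The function name probe_unix_socket and the timeout value are made up for illustration. Connecting needs write permission on the socket file, so given the mode shown above it would have to run as www-data (or root).

```python
import os
import socket
import stat


def probe_unix_socket(path, timeout=2.0):
    """Report whether `path` exists, is a socket, and accepts a connection."""
    result = {"exists": False, "is_socket": False,
              "connectable": False, "error": None}
    try:
        st = os.stat(path)
    except OSError as exc:
        # Path is missing or not reachable by this user.
        result["error"] = str(exc)
        return result
    result["exists"] = True
    result["is_socket"] = stat.S_ISSOCK(st.st_mode)
    if not result["is_socket"]:
        return result
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        result["connectable"] = True
    except OSError as exc:
        # e.g. connection refused when nothing is listening on the socket file,
        # or permission denied when run as the wrong user.
        result["error"] = str(exc)
    finally:
        client.close()
    return result


if __name__ == "__main__":
    # Socket path taken from the ls output above.
    print(probe_unix_socket("/run/mailman3-web/uwsgi.sock"))
```

If that reports connectable as true, the listener side is at least alive, which would point more towards what gets exchanged over the socket than towards the socket itself.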
On Wed, Aug 21, 2024 at 3:50 AM Michael Paoli via BALUG-Test
<balug-test(a)lists.balug.org> wrote:
> ---------- Forwarded message ----------
> From: Michael Paoli <michael.paoli(a)berkeley.edu>
> To: BALUG-Test <balug-test(a)lists.balug.org>
> Cc:
> Bcc:
> Date: Wed, 21 Aug 2024 03:50:08 -0700
> Subject: [BALUG-Test] Re: "Oops" ... lists down for a bit (back by the time you see this?)
Well ... at least that's good - looks like the mail part is probably
working fine.
I thought for a bit it wasn't, based on one quick test I ran ... but I may
not have been patient enough with that, or may not have done that test quite right.
So, looks like it's just the web/postorius interface I need to get working again.
And by the way, the Django administration interface appears to be fine,
so looks like the issue is limited to postorius, and between postorius and Apache.
So at least in theory, subscribe/unsubscribe via email should also be
working fine.
On Wed, Aug 21, 2024 at 3:43 AM Michael Paoli via BALUG-Test
<balug-test(a)lists.balug.org> wrote:
> ---------- Forwarded message ----------
> From: Michael Paoli <michael.paoli(a)berkeley.edu>
> To: BALUG-Test <balug-test(a)lists.balug.org>
> Cc:
> Bcc:
> Date: Wed, 21 Aug 2024 03:42:38 -0700
> Subject: [BALUG-Test] "Oops" ... lists down for a bit (back by the time you see this?)
"Oops" ... lists down for a bit.
Drats ... and hopefully all better by the time this mail makes it to list.
So ... all was fine and dandy until ...
managed to have a host booboo,
basically locked up solid, and I power cycled it
(this was physical host upon which the VM was running).
Shouldn't be any biggie ... after that all up and fine again ...
except the Mailman 3 lists.
So ... isolating and working towards fixing that.
I'm guesstimating maybe an issue with some kind of lock file or
the like that didn't get cleaned up upon (re)boot,
or possibly, since I perhaps didn't do as many reboots as I ought
to have to make sure it would "always" come up clean, perhaps there was
some misconfiguration or the like that somehow snuck in, that would be
effectively a latent defect/issue, and that wouldn't show until a reboot or
attempted restart of the relevant service(s).
At this point still troubleshooting to narrow down the issue.
As far as I can tell so far, looks like the web server hits some kind of issue
and generally gives or passes along a 500 response, that then ends up
with the relevant web page(s) failing. When I dig deeper into the
backend service(s) ... notably postorius, looks like it gets requests okay
and responds to them okay ... but something goes wrong somewhere
between there and the web server properly getting and passing that
along to the client browser.
E.g.:
https://lists.balug.org/mailman3/postorius/lists/
I see postorius returning (from strace(1)) data including:
{\"display_name\": \"BALUG-Admin\", \"fqdn_listname\":
\"balug-admin(a)lists.balug.org\",
{\"display_name\": \"BALUG-Announce\", \"fqdn_listname\":
\"balug-announce(a)lists.balug.org\",
{\"display_name\": \"BALUG-Talk\", \"fqdn_listname\":
\"balug-talk(a)lists.balug.org\",
{\"display_name\": \"BALUG-Test\", \"fqdn_listname\":
\"balug-test(a)lists.balug.org\",
But somehow that doesn't make it to the web page the server serves ... so I'm
guesstimating there's some issue somewhere between postorius and Apache.
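To make that kind of strace(1) output easier to eyeball, a small sketch in Python (an illustration only, not something run as part of this troubleshooting) that pulls the fqdn_listname values out of strace-escaped text. The helper name list_addresses_from_strace is hypothetical, and the sample is just the fragments quoted above, line wraps included.

```python
import re


def list_addresses_from_strace(text):
    """Extract fqdn_listname values from strace-escaped JSON fragments."""
    # strace shows embedded double quotes as \" ... undo that first.
    unescaped = text.replace('\\"', '"')
    # \s* tolerates a line wrap between the key and its value.
    return re.findall(r'"fqdn_listname":\s*"([^"]+)"', unescaped)


# Sample copied from the fragments above (addresses as shown in the archive).
sample = r'''{\"display_name\": \"BALUG-Admin\", \"fqdn_listname\":
\"balug-admin(a)lists.balug.org\",
{\"display_name\": \"BALUG-Test\", \"fqdn_listname\":
\"balug-test(a)lists.balug.org\",'''

if __name__ == "__main__":
    for address in list_addresses_from_strace(sample):
        print(address)
```

It deliberately uses a regular expression rather than a JSON parser, since the captured fragments are truncated and would not parse as complete JSON.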
The archives of
balug-test3(a)lists.balug.org
have been merged into those of
balug-test(a)lists.balug.org
and the
balug-test3(a)lists.balug.org
list has been removed.
Yeah, I notice that, at least by default,
the Mailman 3 hyperkitty archive
"leaks" information to gravatar.com
by using per-member URLs for images from gravatar.com ... ugh.
Yeah, looks like I'm not the first to notice this:
https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/…
Anyway, one more item to add to migration checklist of things to do,
and from the above, looks like it's probably not too difficult to manage to do.
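To illustrate why per-member image URLs amount to a leak, a minimal sketch in Python, assuming the classic Gravatar scheme (MD5 of the trimmed, lowercased email address); exactly how hyperkitty builds its URLs hasn't been checked here. The function name gravatar_url and the example.org address are made up for illustration.

```python
import hashlib


def gravatar_url(email, size=80):
    """Build a Gravatar-style avatar URL from an email address (MD5 scheme)."""
    # Gravatar normalizes the address by trimming whitespace and lowercasing.
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}?s={size}"


if __name__ == "__main__":
    # Hypothetical member address, not a real subscriber.
    print(gravatar_url("member@example.org"))
```

Since the hash is derived only from the address, every browser viewing an archive page ends up requesting member-specific URLs from gravatar.com, and anyone holding a candidate address can compute the same hash and compare.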
Michael Paoli wrote:
Mailman 2 --> Mailman 3 ... largely (but not entirely) done; holler if
one notes any issues.
Summary on status/progress can be found here:
https://www.wiki.balug.org/wiki/doku.php?id=system:mailman3#mailman_2_--_3_…
There were some glitches, notably with import21 ...
and did find a work-around(/fix?) for that,
and also had some issues with archive import+indexing,
but seem to have made it past all that now (though still have
some more validation checks to do, etc.).
And note that thus far, the merging of archive from the
(temporary) balug-test3(a)lists.balug.org to
balug-test(a)lists.balug.org is still pending.
In the meantime, those archives are available here:
https://lists.balug.org/mailman3/hyperkitty/list/balug-test3@lists.balug.or…
And the balug-test3(a)lists.balug.org list is also mostly locked down
to prevent changes (no new postings or subscription/membership changes).