<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>HacktionLabers,<br>
<br>
</p>
<div class="moz-forward-container">
<p>If you know a website is going to change significantly or
disappear from the web then it's worth installing the Wayback
Machine browser plugin <br>
<a class="moz-txt-link-freetext"
href="https://addons.mozilla.org/en-GB/firefox/addon/wayback-machine_new/">https://addons.mozilla.org/en-GB/firefox/addon/wayback-machine_new/<br>
</a><a class="moz-txt-link-freetext" href="https://microsoftedge.microsoft.com/addons/detail/wayback-machine/kjmickeoogghaimmomagaghnogelpcpn">https://microsoftedge.microsoft.com/addons/detail/wayback-machine/kjmickeoogghaimmomagaghnogelpcpn</a><br>
<a class="moz-txt-link-freetext" href="https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak">https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak</a>
<br>
</p>
<p> that you can set to trigger saves to web.archive.org as you
browse the site. But be warned: don't leave this feature on, as
it is a security as well as a privacy risk when you browse to
sites that store credentials in URLs; yes, some still do this. <br>
</p>
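<p>If you just need a one-off snapshot rather than save-as-you-browse,
you can also hit the Wayback Machine's Save Page Now endpoint
directly, e.g. (the target URL below is just a placeholder):</p>
<pre># ask web.archive.org to capture a single page; swap in the real URL
wget -qO /dev/null "https://web.archive.org/save/https://example.org/page-to-keep"</pre>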
<p>There are also other browser plugins that save to archives such
as <a
href="https://web.archive.org/web/20220814004816/https://archive.is/">archive.is</a>
in addition to the Wayback Machine.</p>
<p><br>
</p>
<p>If anyone wants to convert a WordPress site into static pages
using HUGO <a class="moz-txt-link-freetext"
href="https://gohugo.io/">https://gohugo.io/</a> I can share
my experience of doing this.</p>
<p>You can use a script or a plugin to export from WP as Markdown &amp;
config ready for HUGO, and there is also a way to convert WP db
backups.</p>
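<p>As a rough sketch of that export route (assuming the
wordpress-to-hugo-exporter plugin; other exporters and scripts differ
in the details, and the zip name below is illustrative):</p>
<pre># 1. in WP admin, run the exporter plugin to download a zip of
#    Markdown content plus HUGO-style config
# 2. then build a static site from it locally:
hugo new site mysite              # skeleton HUGO site
unzip hugo-export.zip -d mysite   # drop the exported content in
cd mysite && hugo                 # render static pages into public/</pre>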
<p>cheers</p>
<p>Micah aka sb</p>
<p><a class="moz-txt-link-freetext" href="https://J12.org/sb/">https://J12.org/sb/</a></p>
</div>
<div class="moz-cite-prefix">On 18/07/2023 11:59, Charlie Harvey
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:e1250a86-5cb8-4924-eb0f-83fe401698f7@newint.org">
<pre class="moz-quote-pre" wrap="">Hi,
For the sake of completeness, here are some other wget params that can
be useful:
--wait 1 to put a delay between page fetches (if killing your server
may be an issue)
-e robots=off to ignore robots.txt
-c to continue if you get halfway through and need to restart
--user-agent="Mozilla" if the site has cloudflare in front of it (they
block wget, curl et al by their UA name)
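Putting those together (example.org being a placeholder), a full run
might look something like:
wget --mirror --page-requisites --convert-links --adjust-extension \
  --wait 1 -e robots=off -c --user-agent="Mozilla" https://example.org/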
Cheers,
On 18/07/2023 11:14, Mike Harris wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Thanks for the suggestions all. I will try the wget command first as
my need is to set up a new WP site for them, whilst providing a static
archive of their original site, and then they can link to their old docs.
The original site was built using some bespoke hosting company’s thing,
called “Webs” or similar, they then got bought by VistaPrint, and then
some of the little (maverick) sites like this one (a district
association of allotment associations) have been told their sites are
going dark with (apparently) no offer of an archive of their site or
anything … grrrr >:-(
Mike Harris
XtreamLab
W: <a class="moz-txt-link-freetext" href="https://XtreamLab.net">https://XtreamLab.net</a>
T: +44 7811 671 893
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">On 18 Jul 2023, at 10:22, Nick Sellen <a class="moz-txt-link-rfc2396E" href="mailto:hacktionlab@nicksellen.co.uk"><hacktionlab@nicksellen.co.uk></a>
wrote:
Also worth a mention of the webarch service to do this -->
<a class="moz-txt-link-freetext" href="https://archived.website/">https://archived.website/</a> (which uses httrack
<a class="moz-txt-link-freetext" href="https://www.webarchitects.coop/archiving">https://www.webarchitects.coop/archiving</a>)
------- Original Message -------
On Tuesday, July 18th, 2023 at 08:57, m3shrom <a class="moz-txt-link-rfc2396E" href="mailto:m3shrom@riseup.net"><m3shrom@riseup.net></a> wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">This has some good content
<a class="moz-txt-link-freetext" href="https://www.stevenmaude.co.uk/posts/archiving-a-wordpress-site-with-wget-and-hosting-for-free">https://www.stevenmaude.co.uk/posts/archiving-a-wordpress-site-with-wget-and-hosting-for-free</a>
It's focused on wordpress but potentially relevant for other content.
Sample command I used for a wp network.
wget --page-requisites --convert-links --adjust-extension --mirror
--span-hosts
--domains=mcrblogs.co.uk,www.mcrblogs.co.uk,edlab.org.uk mcrblogs.co.uk/afrocats
nice one
mick
On 17/07/2023 23:23, Mike Harris wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Hi all, but especially Mick,
Last year Mick gave a talk on recovering the old Schnews website and producing a static version of it by a certain clever use of curl or wget.
What’s the best command to get a complete functional static version of the entirety of a website for all linked to content?
I ask because I need to grab a site for someone that’s about to ‘go dark’ and no one can get the details to login and get to the file system side of things.
Cheers,
Mike.
Mike Harris
XtreamLab
W: <a class="moz-txt-link-freetext" href="https://XtreamLab.net">https://XtreamLab.net</a>
T: +44 7811 671 893
_______________________________________________
HacktionLab mailing list
<a class="moz-txt-link-abbreviated" href="mailto:HacktionLab@lists.aktivix.org">HacktionLab@lists.aktivix.org</a>
<a class="moz-txt-link-freetext" href="https://lists.aktivix.org/mailman/listinfo/hacktionlab">https://lists.aktivix.org/mailman/listinfo/hacktionlab</a>
</pre>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="https://j12.org/micah/">https://j12.org/micah/</a></pre>
</body>
</html>