<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>HacktionLabers,<br>
<br>
</p>
<div class="moz-forward-container">
<p>If you know a website is going to change significantly or
disappear from the web then it's worth installing the Wayback
Machine browser plugin <br>
<a class="moz-txt-link-freetext"
href="https://addons.mozilla.org/en-GB/firefox/addon/wayback-machine_new/">https://addons.mozilla.org/en-GB/firefox/addon/wayback-machine_new/<br>
</a><a class="moz-txt-link-freetext" href="https://microsoftedge.microsoft.com/addons/detail/wayback-machine/kjmickeoogghaimmomagaghnogelpcpn">https://microsoftedge.microsoft.com/addons/detail/wayback-machine/kjmickeoogghaimmomagaghnogelpcpn</a><br>
<a class="moz-txt-link-freetext" href="https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak">https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak</a>
<br>
</p>
<p> that you can set to trigger saves to web.archive.org as you
browse the site. But be warned: don't leave this feature on, as
it is a security as well as a privacy risk when you browse to
sites that store credentials in URLs; yes, some still do this. <br>
</p>
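<p>If you just need a one-off snapshot rather than save-as-you-browse,
you can also hit the Wayback Machine's Save Page Now endpoint
directly, e.g. (the target URL below is just a placeholder):</p>
<pre># ask web.archive.org to capture a single page; swap in the real URL
wget -qO /dev/null "https://web.archive.org/save/https://example.org/page-to-keep"</pre>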
<p>There are also other browser plugins that save to archives such
as <a
href="https://web.archive.org/web/20220814004816/https://archive.is/">archive.is</a>
in addition to the Wayback Machine.</p>
<p><br>
</p>
<p>If anyone wants to convert a WordPress site into static pages
using HUGO <a class="moz-txt-link-freetext"
href="https://gohugo.io/">https://gohugo.io/</a> I can share
my experience of doing this.</p>
<p>You can use a script or a plugin to export from WP as Markdown &amp;
config ready for HUGO, and there is also a way to convert WP db
backups.</p>
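<p>As a rough sketch of that export route (assuming the
wordpress-to-hugo-exporter plugin; other exporters and scripts differ
in the details, and the zip name below is illustrative):</p>
<pre># 1. in WP admin, run the exporter plugin to download a zip of
#    Markdown content plus HUGO-style config
# 2. then build a static site from it locally:
hugo new site mysite              # skeleton HUGO site
unzip hugo-export.zip -d mysite   # drop the exported content in
cd mysite && hugo                 # render static pages into public/</pre>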
<p>cheers</p>
<p>Micah aka sb</p>
<p><a class="moz-txt-link-freetext" href="https://J12.org/sb/">https://J12.org/sb/</a></p>
</div>
<div class="moz-cite-prefix">On 18/07/2023 11:59, Charlie Harvey
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:e1250a86-5cb8-4924-eb0f-83fe401698f7@newint.org">
<pre class="moz-quote-pre" wrap="">Hi,
For the sake of completeness, here are some other wget params that can
be useful:
--wait 1 to put a delay between page fetches (if killing your server
may be an issue)
-e robots=off to ignore robots.txt
-c to continue if you get halfway through and need to restart
--user-agent="Mozilla" if the site has cloudflare in front of it (they
block wget, curl et al by their UA name)
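Putting those together (example.org being a placeholder), a full run
might look something like:
wget --mirror --page-requisites --convert-links --adjust-extension \
  --wait 1 -e robots=off -c --user-agent="Mozilla" https://example.org/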
Cheers,
On 18/07/2023 11:14, Mike Harris wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Thanks for the suggestions all. I will try the wget command first as
my need is to set up a new WP site for them, whilst providing a static
archive of their original site, and then they can link to their old docs.
The original site was built using some bespoke hosting company’s thing,
called “Webs” or similar, they then got bought by VistaPrint, and then
some of the little (maverick) sites like this one (a district
association of allotment associations) have been told their sites are
going dark with (apparently) no offer of an archive of their site or
anything … grrrr >:-(
Mike Harris
XtreamLab
W: <a class="moz-txt-link-freetext" href="https://XtreamLab.net">https://XtreamLab.net</a>
T: +44 7811 671 893
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">On 18 Jul 2023, at 10:22, Nick Sellen <a class="moz-txt-link-rfc2396E" href="mailto:hacktionlab@nicksellen.co.uk"><hacktionlab@nicksellen.co.uk></a>
wrote:
Also worth a mention of the webarch service to do this -->
<a class="moz-txt-link-freetext" href="https://archived.website/">https://archived.website/</a> (which uses httrack
<a class="moz-txt-link-freetext" href="https://www.webarchitects.coop/archiving">https://www.webarchitects.coop/archiving</a>)
------- Original Message -------
On Tuesday, July 18th, 2023 at 08:57, m3shrom <a class="moz-txt-link-rfc2396E" href="mailto:m3shrom@riseup.net"><m3shrom@riseup.net></a> wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">This has some good content
<a class="moz-txt-link-freetext" href="https://www.stevenmaude.co.uk/posts/archiving-a-wordpress-site-with-wget-and-hosting-for-free">https://www.stevenmaude.co.uk/posts/archiving-a-wordpress-site-with-wget-and-hosting-for-free</a>
It's focused on wordpress but potentially relevant for other content.
Sample command I used for a wp network.
wget --page-requisites --convert-links --adjust-extension --mirror
--span-hosts
--domains=mcrblogs.co.uk,www.mcrblogs.co.uk,edlab.org.uk mcrblogs.co.uk/afrocats
nice one
mick
On 17/07/2023 23:23, Mike Harris wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Hi all, but especially Mick,
Last year Mick gave a talk on recovering the old Schnews website and producing a static version of it by a certain clever use of curl or wget.
What’s the best command to get a complete functional static version of the entirety of a website for all linked to content?
I ask because I need to grab a site for someone that’s about to ‘go dark’ and no one can get the details to login and get to the file system side of things.
Cheers,
Mike.
Mike Harris
XtreamLab
W: <a class="moz-txt-link-freetext" href="https://XtreamLab.net">https://XtreamLab.net</a>
T: +44 7811 671 893
_______________________________________________
HacktionLab mailing list
<a class="moz-txt-link-abbreviated" href="mailto:HacktionLab@lists.aktivix.org">HacktionLab@lists.aktivix.org</a>
<a class="moz-txt-link-freetext" href="https://lists.aktivix.org/mailman/listinfo/hacktionlab">https://lists.aktivix.org/mailman/listinfo/hacktionlab</a>
</pre>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="https://j12.org/micah/">https://j12.org/micah/</a></pre>
</body>
</html>