[Expo-tech] files to munge whilst converting VCS
Wookey
wookey at wookware.org
Tue Apr 14 04:11:26 BST 2020
On 2020-04-13 15:57 +0100, Philip Sargent (Gmail) wrote:
> A fine amount of work – and yes I see a couple of those were my fault, put in
> expoweb when I should have put them in expofiles.
>
>
>
> re Brendan’s guides PDFs
> documents/tunnel-guide.pdf
> documents/survex-guide.pdf
Turns out that these actually get bigger, not smaller,
using my simple reprocessor, so I left them alone.
But yes they should probably be in expofiles rather than the web repo
as non-editable blobs.
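For anyone repeating this, the "only keep the repacked file if it is actually smaller" check can be sketched as below. This is a minimal illustration, not the reprocessor I used; the function name keep_if_smaller and the idea of writing the repacked output to a separate path first are assumptions made for the example.

```python
import os
import shutil


def keep_if_smaller(original_path, repacked_path):
    """Replace the original with the repacked file only when that saves space.

    Returns True if the original was replaced, False if it was left alone.
    In both cases the separate repacked file no longer exists afterwards.
    """
    if os.path.getsize(repacked_path) < os.path.getsize(original_path):
        # Repacking helped: move the smaller file over the original.
        shutil.move(repacked_path, original_path)
        return True
    # Repacking made it bigger (or no smaller): discard the repacked copy.
    os.remove(repacked_path)
    return False
```

With the two guide PDFs above this would return False and leave the originals untouched.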
We don't have a clear policy on when to put things in the website
checkout and when to just reference them in expofiles. There are
arguments for both, and I often find it hard to decide myself. On the
one hand it's nice if a web checkout gets you 'the whole website',
which is currently both the cave info and all the handbook info. On
the other hand putting documents like this that are likely to get
updated over time by blob-replacement, into expofiles avoids bloating
the repo. For historical documents (e.g. reports for one year) I'm
more inclined to put them in the website as part of the historical
record.
> Can I suggest putting them in original size in expofiles/documents rather than
> re-packing them?
Done. (As I say, the repacking turned out to gain nothing.)
> There may be other things like that, some of the “poster sized” images ? I
> wouldn’t know so I’m just guessing.
There are a few of these, although I found fewer than I expected.
> I have (somewhere) Brendan’s original PPT versions of these and it’s “on my
> list” to re-render these into the handbook as HTML so that they can be updated
> more easily when the software gets updated, or when our expo-specific workflow
> changes or gets better. Such as using *REF now or *QM in future.
That is useful, cheers.
> OK, we should store them as .odt or whatever,
(.odp for presentations)
> Otherwise I strongly agree with Mark. For the sake of a few megabytes we really
> do not want to be permanently deleting information.
I'm not permanently deleting anything. Just removing pointless copies
of things or having them at an appropriate web resolution when they
are website docs, or printer resolution when they are printable docs.
All the images are already in the photo bucket at full original
resolution. There is no need for them to be copied into (sometimes
multiple) PDFs at those resolutions when you can't even see the
difference. And _especially_ not in files that have in fact since been
deleted from the repository, so they only exist to illustrate the
accident of their original checkin (as Brendan's PDFs above are about
to be).
And it's not just a few MB, it's hundreds, which everyone gets every time
they check out the repo. That website checkout really shouldn't be more than
1GB. The actual website is 388MB, a reasonable size for what it is.
> The cost of a MB is all in
> its management not the actual size, and shrinking a large PDF smaller does not
> affect the management cost.
Space on servers, and local machines, costs money and/or faff; there
is no point wasting it gratuitously. We just had to ask for more space
on the server which costs goodwill, rather than actual money at the
moment, but the principle is the same. Even locally where space is
cheap it's not irrelevant: I have 8 copies of expoweb on this laptop
for example, taking up 8.1G, and the disc is 99% full with 7.9G
free. Each of those copies being a few hundred MB smaller with no
visible loss of fidelity, and less cruft, is worth having.
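If you want to see what your own checkouts cost, something like the following adds up the bytes under a directory, including the VCS metadata directory, which is where the deleted-but-still-stored blobs live. It is a rough sketch: the function name tree_size_bytes is made up for the example, and it counts apparent file sizes rather than blocks actually used on disc.

```python
import os


def tree_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Running it on each checkout and summing the results gives the sort of figure quoted above; shutil.disk_usage(path).free from the standard library reports the free space to compare against.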
We only get to tidy up some of our mess once every decade or two, so I
think it's worth taking the opportunity, and have done the work, so
there really isn't much point arguing about it.
> PS The whole troggle database is only 10 MB after importing everything.
> Shouldn’t we be using an in-memory database these days instead of MySQL ? Or is
> that not an option with django?
You can use postgres (best), mysql/maria (adequate) or sqlite (less
robust under load, since it allows only one writer at a time). How
those databases manage their memory I don't know, but they may well
keep lots in memory by default if they can. There are no doubt knobs
to tweak for that sort of thing.
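For reference, the choice of backend in django is a settings entry. The fragment below is only a sketch of what the alternatives look like: the ENGINE strings are django's standard backend names, but the database name, user and host are placeholders I have made up, not troggle's real configuration. Note that an sqlite ":memory:" database lives only as long as the connection that created it, so everything would have to be re-imported whenever the server restarts.

```python
# Hypothetical django settings fragment; NAME, USER and HOST are placeholders.
POSTGRES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "troggle",      # assumed database name
        "USER": "expo",         # assumed database user
        "HOST": "localhost",
    }
}

SQLITE_IN_MEMORY = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        # Special sqlite name: the database is held in RAM and is
        # discarded when the connection closes.
        "NAME": ":memory:",
    }
}

# django reads the setting called DATABASES; pick one of the above.
DATABASES = SQLITE_IN_MEMORY
```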
Wookey
--
Principal hats: Linaro, Debian, Wookware, ARM
http://wookware.org/