[Expo-tech] loser git conversion
Wookey
wookey at wookware.org
Sun Apr 19 17:09:32 BST 2020
On 2020-04-19 16:54 +0100, Mark Shinwell wrote:
> I've spent rather too long on this today and yesterday, so will continue next
> week.
These things always take way longer than you expect, don't they?
> I've constructed a git repo which is mostly done, with tags placed after
> each year; it starts in 2001 from the CVS commits. However I think I've made a
> mistake during the rebasing with some of the top-level files which I need to
> fix before the whole dataset will process. I'm also going to try processing it
> at each of the tags (post-2001, post-2002 etc).
Good idea. I only started writing down the annual checkpoints recently
(after 2013) (in docs/lengthcheckpoints) but in theory there should be
one every year. May not be entirely obvious where...
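
For checking that there really is one checkpoint per year before processing at each tag, something like the following might do. It is only a sketch: the `post-YYYY` tag naming is an assumption borrowed from the "post-2001, post-2002" wording above (the real tag names aren't given here), and `year_tags`, `missing_years` and `list_repo_tags` are made-up helper names.

```python
import re
import subprocess

# Assumed tag naming ("post-2001", "post-2002", ...); adjust to the real scheme.
YEAR_TAG = re.compile(r"^post-(\d{4})$")


def year_tags(tag_names):
    """Return (year, tag) pairs for tags matching the assumed pattern, oldest first."""
    found = []
    for name in tag_names:
        name = name.strip()
        match = YEAR_TAG.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)


def missing_years(tagged, first, last):
    """Return the years in [first, last] that have no checkpoint tag."""
    have = {year for year, _ in tagged}
    return [year for year in range(first, last + 1) if year not in have]


def list_repo_tags(repo_dir):
    """Ask git for all tag names in the repository at repo_dir."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "tag", "--list"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

Feeding `list_repo_tags(path)` into `year_tags` and then `missing_years(tags, 2001, 2019)` would list the years that still need a checkpoint; each tag could then be checked out in turn for the dataset run.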
> The import of the hg changesets was more problematic than expected and a lot of
> conflicts had to be fixed.
You mean the hg stuff conflicted with the CVS import? Odd, it
shouldn't have. The other three hg imports have been trivial, but then
we weren't adding stuff on the start in those cases.
> I have managed to move the ARGE 2012 updates, which previously weren't merged
> until after Expo 2013 (confusingly), so they are actually before Expo 2013.
> There may still be some places where changesets aren't in the correct
> by-year-chronological place but I don't think this should be on a large scale,
> and it isn't likely to be worth fixing I suspect (the other cases I've seen
> would involve splitting changesets, which is likely to have other
> ramifications). Going forward I suggest we try to keep data in the right place
> between year boundaries, but not worry about the exact ordering within a year.
Yeah. I only worry about there being a point we can tag _somewhere_
between expos as 'definitive' for that year (normally corresponding
with the reported lengths for the year).
> Keeping this as a linear history branch is probably the right thing to do, but
> I'm not completely certain yet. I also think we should carefully consider the
> proposed workflow and how to keep the repository in as good a state as possible
> automatically. This should also include discussing with other people (maybe
> just ARGE?) to ensure that they're on the same page.
ARGE (Thomas Holder in practice for keeping CUCC/ARGE data in sync)
are already keen on a move to git, and are just waiting for us to get round to it.
> Large imports in blocks
> from other sources aren't good for keeping things in order, as seen in the 2012
> ARGE example.
He and I do a reasonable job of merging our finds each year. I think
2012 was probably when Thomas took over (before that it was just me
getting a repo copy from ARGE and sucking their changes in). I've not
done 2019 yet as I was planning to do it after the repo conversion,
which of course has ended up later in the year than is ideal.
But rearranging history is _much_ easier with git so physical timing
is less of an issue. (And we may have a whole extra year to finish this...)
> The combined git repo and working copy is about 80% of the size of the
> Mercurial one. I decided against filtering every changeset as I'll have to
> recreate all the tags again most likely, and it turns out the various useless
> files were mostly removed long ago (and I also prevented some of them getting
> in right at the start of the history).
As I say, nuking them from the repo with bfg to get the space back (if
it's significant) is remarkably quick and easy:
https://rtyley.github.io/bfg-repo-cleaner/
But I don't think the cruft in loser is much of a problem as it's quite small.
I may have a look just to check this evening as I'm rapidly becoming an expert in
this game :-)
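
A rough way to do that check, as a sketch rather than anything definitive: list every object in history with `git rev-list --objects --all`, pipe that through `git cat-file --batch-check`, and sort the blobs by size. The helper names (`largest_blobs`, `scan_repo`) are invented for illustration, and the sizes reported are uncompressed object sizes, not on-disk pack sizes.

```python
import subprocess


def largest_blobs(batch_check_output, top=10):
    """Pick the biggest blobs out of '<type> <sha> <size> <path>' lines."""
    blobs = []
    for line in batch_check_output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees and malformed lines
        path = parts[3] if len(parts) > 3 else ""
        blobs.append((int(parts[2]), path))
    blobs.sort(reverse=True)
    return blobs[:top]


def scan_repo(repo_dir):
    """Run the two git commands and return the raw batch-check output."""
    objects = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    fmt = "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"
    return subprocess.run(
        ["git", "-C", repo_dir, "cat-file", fmt],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
```

`largest_blobs(scan_repo(path))` then gives the ten biggest files ever committed, which should show quickly whether anything is worth removing with bfg.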
Wookey
--
Principal hats: Linaro, Debian, Wookware, ARM
http://wookware.org/