[Expo-tech] splays - survex legs, import bugs

Philip Sargent (Gmail) philip.sargent at gmail.com
Fri Jun 12 01:55:25 BST 2020


btw
The total lengths are just the totals of all the tape measurements, and the
troggle import has no understanding of the "*flags splay" directive which
is why the total lengths are too big.

- In fact I can't see that it understands that some legs are pitches either.
- Neither does it distinguish between surface and underground legs.
- It does ignore sections of data that are LRUD.
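
For illustration only, here is a minimal sketch of what a flag-aware total
could look like. It is not troggle's parser: the function name is made up, it
assumes the default "normal" data style (from, to, tape, compass, clino), and
it ignores the fact that survex restores flags at each *end.

```python
def total_underground_length(svx_lines):
    """Sum tape lengths, skipping legs flagged splay, surface or duplicate.

    Hypothetical helper: assumes columns are from, to, tape, compass, clino.
    """
    excluded = {"splay", "surface", "duplicate"}
    active = set()   # flags currently switched on
    total = 0.0
    for raw in svx_lines:
        line = raw.split(";", 1)[0].strip()   # drop survex comments
        if not line:
            continue
        parts = line.split()
        if parts[0].startswith("*"):
            if parts[0].lower() == "*flags":
                negate = False
                for word in parts[1:]:
                    word = word.lower()
                    if word == "not":
                        negate = True      # applies to the next flag only
                        continue
                    if negate:
                        active.discard(word)
                    else:
                        active.add(word)
                    negate = False
            continue   # every other command is ignored in this sketch
        if len(parts) >= 3 and not (active & excluded):
            try:
                total += float(parts[2])   # third column is the tape reading
            except ValueError:
                pass   # not a plain numeric tape reading, skip it
    return total
```

A real fix would also need to track *begin/*end scope and the *data style in
force, which this sketch deliberately leaves out.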

But quite apart from all that, it should be producing the same lengths when
the numbers are added up differently - and it doesn't. 

So there are a number of bugs to find in the parser yet. 
Of course this is a low priority as such, except that these bugs could
indicate something more serious (such as entirely ignored files).

We should really be getting the lengths of cave by parsing the output of
'cavern' for each svx file rather than calculating them ourselves in troggle
but that's a task for another day.
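
As a rough sketch of that idea: the snippet below pulls the total out of
cavern's text output. The function name is hypothetical, and the exact wording
of the "Total length of survey legs" line is an assumption based on typical
English-language cavern output, so it would need checking against the version
on the server.

```python
import re

# Assumed line format, e.g.:
#   Total length of survey legs =  123.45m ( 120.00m adjusted)
LENGTH_RE = re.compile(
    r"^Total length of survey legs\s*=\s*([0-9.]+)m", re.MULTILINE
)


def cavern_total_length(cavern_output):
    """Return the total leg length in metres, or None if the line is absent."""
    match = LENGTH_RE.search(cavern_output)
    return float(match.group(1)) if match else None
```

The text passed in would be whatever cavern prints when run on one svx file;
how that output is captured is left out here.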

Philip
PS These are real import problems:
	caves - ! Entrance text (slug) 1623-2014-SD-01 missing in file 1623-274.html
	caves - ! 2 letter found, no more than 1 expected in file 1623-292.html
	caves - ! 2 entranceslug found, no more than 1 expected in file 1623-292.html
	caves - ! 0 entrance found, at least 1 expected in file 1626-359.html


-----Original Message-----
From: Philip Sargent (Gmail) [mailto:philip.sargent at gmail.com] 
Sent: 12 June 2020 01:03
To: expo-tech at lists.wookware.org
Subject: Changes to parsing survex blocks, legs and stations.

Tested and currently running on the server is a change to how survex data
is stored.

Survex legs are no longer stored in the database; instead the db only
stores the number of legs per block (every begin-end block of survey
stations) and their total length.
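
To make the idea concrete, a sketch of a per-block running total is below.
The class and field names are invented for illustration and are not the
actual troggle model.

```python
from dataclasses import dataclass


@dataclass
class BlockTotals:
    """Running totals for one begin-end block (hypothetical structure)."""
    name: str
    legs: int = 0
    length: float = 0.0

    def add_leg(self, tape):
        # Only the count and the sum are kept; the leg itself is not stored.
        self.legs += 1
        self.length += tape


block = BlockTotals("example-block")
for tape in (10.0, 2.5, 7.25):
    block.add_leg(tape)
```

One row per block then replaces one row per leg.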

The output page (there is only one) has been modified to match:
http://expo.survex.com/experimental 
As a consequence the database is 30% smaller and importing survex files is
33% faster.

But from the data stored we will still be able to calculate the total
surveyed-length and number of stations by every caver in this (incomplete)
page: http://expo.survex.com/people
Storing all the legs seems to have been an idea which was implemented but
never actually used for anything.
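
A sketch of how per-caver totals could be derived from block totals alone
follows. The input shape and the function name are assumptions, as is the
choice to credit every team member with the full length of the block.

```python
from collections import defaultdict


def length_per_caver(blocks):
    """blocks: iterable of (team_members, total_length) pairs (assumed shape)."""
    totals = defaultdict(float)
    for team, length in blocks:
        for person in set(team):   # count each person once per block
            totals[person] += length
    return dict(totals)
```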

As you will see from http://expo.survex.com/experimental there are some
data errors and parsing errors to track down. These are not new but we
haven't noticed them before.

The next step is to stop troggle storing every single survey station. This
is never used except for the 600 or so entrances which go on the
prospecting guide (which is currently not working). This should have a
much more substantial size reduction and speed improvement.

Philip


-----Original Message-----
From: Philip Sargent (Gmail) [mailto:philip.sargent at gmail.com] 
Sent: 11 June 2020 17:47
To: 'Wookey'
Cc: 'Sam Wenham'; 'Michael Sargent'
Subject: good feeling about this..

Ahem. Results of a bit of grepping:

The 'survexleg' data is actually *used* in only *one* place in the whole
troggle system, when all the tape lengths are added up and displayed here:
http://expo.survex.com/experimental
and presumably was intended to be added up and used for the length
explored by each caver here:
http://expo.survex.com/people

The *only* place that the SurvexStation objects are *used* is to plot the
x/y positions on the prospecting map, which is about 600 points in
core/views_other.py. These are all *entrances* - not underground
survey stations - plus a few fixed points.

So we can save storing 36,000 x 2 table rows (stations and legs) in the
database and retain all current functionality just by keeping a running
total of lengths on the import. Which will also reduce the size of the
database very significantly.

I will test this first of course.
This should cut the import time on my machine from about 30 mins to about
6 mins. Most of the time saving will be in the SQL transaction of putting
the data into the database.

The earlier refactoring I did (creating the class MapLocations) when I
reviewed the 3d positions turns out to be helpful.

Philip
PS I do recommend the book "Refactoring" by Martin Fowler.



