InterLap: Discontinuous Ranges
Table of Contents generated with DocToc
Data Structures
-
Interlap
, a derivative of JSArray
- IOW a list
- whose elements are in turn
Segment
s, another derivative of JSArray
- segments are pairs
[ lo, hi, ]
where both are finite or infinite numbers - invariant
s.lo <= s.hi
always holds for alls instanceof Segment
-
s.lo == s.hi
denotes a single point (a segment withs.size == 1
)
[ [ 4, 6 ], [ 7, 7, ], [ 11, 19, ], ]
-
segments may be merged into an
Interlap
instance in any order usingunion()
-
this will return a new
Interlap
instance with the minimum of segments needed to cover all and only the points covered by the union of all the segments -
likewise, an
Interlap
instance may be merged with (the segments of) anotherInterlap
instance -
the segments of an
Interlap
instance are always ordered:- when comparing two segments
a
,b
, the segments with the lowerlo
comes first - in case
a.lo == b.lo
, then the one with the lowers.hi
comes first - in segments of
Interlap
instances it always suffices to compare segments for inequality of their lower bounds
- when comparing two segments
-
Discontinuous ranges are expressed by
Interlap
instances ('interlaps') -
that contain
Segment
instances -
they are basically just lists (
Array
instances) but- validated (segments must be pairs of numbers and so on) and
- frozen (so validity itself becomes invariant as long as identity holds)
-
therefore
- we can always turn interlaps into lists, and/but
- we can turn some suitably shaped lists into interlaps
- therefore, although interlaps look like lists and quack like lists, they are not just lists
- hence,
equals my_list, my_interlap
must always befalse
- at best,
equals ( as_list my_interlap ), my_list
or some kind ofis_equivalent my_list, my_interlap
(with a suitably definedis_equivalent()
method) can hold
Coding Principles
Note—The below are some points I have been wanting to write down for some time; they are rather about the
implementation pattern used in interlap/main.coffee
in general rather than Interlap
objects in
particular, though lasp
and segments
are used as examples.
- Classes, instances are largely 'passive'
- interesting methods all in stateless library with pure functions
- shallow extensions of standard types (
Array
in this case) Note this will probably change in a future version; see To Do, below -
therefore can always be replaced w/ standard objectsconversion from and to standard types possible (segments
can be turned into lists of two elements[ s.lo, s.hi, ]
;laps
: can be turned into (possibly empty) lists ofsegments
) - observe that multiple meaningful and information preserving casts per target data type are always
possible, for example,
new Segment [ 0x4e01, 0x9fff, ]
could be turned into any of[ 19969, 40943, ]
,{ lo: 19969, hi: 40943, }
,{ first: 19969, last: 40943, }
,'U+4e01-U+9fff'
,/[丁-鿯]/
. There's no one true and unique representation, although there may be some preferred form(s) and some forms that are only supported by some methods - in case users should wish to use methods like
[].join()
,[].map()
,[].reduce()
and so forth onsegments
orlaps
, they should be aware that for all their arrayish listfulness,segments
andlaps
are not lists. From the outset, none of the mutating methods ([].sort()
,[].push()
) can be used with their usual semantics becausesegments
andlaps
are immutable. In the case ofsegments
,segment.push()
does not make sense because each segment must always have exactly to elements; that we have decided to implemented as elements with indexes0
and1
is an implementation detail. In the case oflaps
,lap.push segment
is conceivable and could return a new lap—but that should be no different from the output ofLAP.union lap, segemt
and would thus add duplication instead of usefulness to the API. Moreover, whileLAP.union()
is a somewhat unfortunate name for a function (since it is a noun, not a verb),lap.push()
is considerably worse (it looks like it could mutatelap
, which it can't, and it sounds like it would takc something to the end oflap
, which it won't). - validation by (implicit) instantiation
- instances are immutable (frozen)
- duties of instances:
- carriers of a few standard attributes (
d.size
in this case) - serve as caching mechanism (instances may hold references to objects that implement functionalities)
- 'being an instance of a class' serves as 'product certification label'; given we allow only valid
inputs to build expected structures (and assuming absence of bugs), then—since instances are frozen—we
can be sure at any later point in time that a
d
for whichd instanceof D
holds is also valid; there is no change management
- carriers of a few standard attributes (
Related
- drange (used to perform InterLap range arithmetics)
@scotttrinh/number-ranges
- drange-immutable
To Do
- [ ] implement
intersection()
- [ ] consider to not base
Segment
,Interlap
onArray
(instead, use no particular class or else maybeMultimix
); this would rid the API of all the spurious methods tacked onto what is intended to be pure data objects; observe thatd = new Interlap(); d.push 42
will throw an error in strict mode becaused
is frozen; other methods might return surprising/meaningless results; manipulation oflaps
,segments
at any rate intended to happen through library functions, not object methods