Summary: The new version of Uniplate has several wrappers for working with abstract data types, such as Map from the containers package. These wrappers let you transform abstract data types without breaking their invariants.Abstract data types, such as
Map from the containers package, hide their internal structure so they can maintain invariants necessary for their correct operation. Generic programming, such as the Data class used by SYB, allows programs to generically traverse data types, without detailed knowledge of their structure. The two are somewhat incompatible - if
Map allows manipulating it's internal structure, the invariants could be broken causing
Map to perform incorrectly.
Using the
Data class, a
Map String Int claims it has no constructors, but if you operate on it with
gfoldl is behaves as though it were
[(String,Int)]. When Uniplate traverses the
Map, such as when searching for an
Int, it first analyses the available constructors to see if a
Map String Int can possibly contain an
Int. Since
Map has no constructors, it concludes that
Map cannot contain an
Int, and Uniplate fails to operate on the contained
Int's.
For people who use the Data-style Uniplate module, using the new version of Uniplate you can now correctly operate over
Map String Int, provided you use the newtype
Map wrapper from
Data.Generics.Uniplate.Data.Instances. When you transform over
Bool (which does not touch the
Map) it will ignore the
Map and take
O(1). When you transform over
Int it will reconstruct the
Map in
O(n), and if you transform
String or
Char it will reconstruct the
Map in
O(n log n). Regardless of what operations you do, it will work efficiently and correctly. As an example:
$ import qualified Data.Map as Map
$ import Data.Char
$ import Data.Generics.Uniplate.Data
$ import Data.Generics.Uniplate.Data.Instances
$ fromMap $ transformBi toUpper $ toMap $ Map.fromList [("haskell",12),("test",18)]
fromList [("HASKELL",12),("TEST",18)]
There are two approaches for dealing with the
Map problem in Uniplate. For users of the Direct-style Uniplate module, there is a function
plateProject, which has been available for some time. For users of the Data-style Uniplate module, or for users of the SYB package, there is a new module
Data.Generics.Uniplate.Data.Instances in uniplate-1.6.4 (released today) which provides three types with special
Data instances (
Hide,
Trigger,
Invariant). Using these three types we can construct wrappers providing Data instances for abstract types, and Uniplate includes wrappers for several of the types in the containers package (
Map,
Set,
IntMap,
IntSet).
plateProjectThe plateProject function helps you define Direct Uniplate instances for abstract types. Instead of defining how to examine the data type, you instead define how to transform the data type into one you can examine, and how to transform it back. As an example:
instance Biplate (Map.Map [Char] Int) Char where
biplate = plateProject Map.toList Map.fromList
If the types ensure that any operations will not change the keys we can optimise and use the
fromDistictAscList function to reconstruct the
Map:
instance Biplate (Map.Map [Char] Int) Int where
biplate = plateProject Map.toAscList Map.fromDistinctAscList
HideThe
Hide data type is useful for wrapping values that you want to ignore with Uniplate. The type is defined as:
data Hide a = Hide {fromHide :: a}
But has a
Data instance that pretends it is defined using the extension
EmptyDataDecls as:
data Hide a
As an example of avoiding particular values, you can write:
transformBi (+1) (1, 2, Hide 3, Just 4) == (2, 3, Hide 3, Just 5)
As a result of having no constructors, any calls to the methods
toConstr or
gunfold will raise an error.
TriggerThe
Trigger data type is useful for determining when a value was constructed with the
Data methods. It is defined as:
data Trigger a = Trigger {trigger :: Bool, fromTrigger :: a}
But the
Data instance pretends that it is defined as:
data Trigger a = Trigger a
However, whenever
gfoldl or
gunfold constructs a new value, it will have the
trigger field set to
True. The trigger information is useful to indicate whether any invariants have been broken, and thus need fixing. As an example:
data SortedList a = SortedList (Trigger [a]) deriving (Data,Typeable)
toSortedList xs = SortedList $ Trigger False $ sort xs
fromSortedList (SortedList (Trigger t xs)) = if t then sort xs else xs
This data type represents a sorted list. When constructed the items are initially sorted, but operations such as
gmapT could break that invariant. The
Trigger type is used to detect when the Data operations have been performed, and resort the list.
The
Trigger type is often used in conjunction with
Invariant, which fixes the invariants.
InvariantThe
Invariant data type is useful for ensuring that an invariant is always applied to a data type. It is defined as:
data Invariant a = Invariant {invariant :: a -> a, fromInvariant :: a}
But the
Data instance pretends that it is defined as:
data Invariant a = Invariant a
Whenever a
gfoldl constructs a new value, it will have the function in the
invariant field applied to it. As an example:
data SortedList a = SortedList (Invariant [a]) deriving (Data,Typeable)
toSortedList xs = SortedList $ Invariant sort (sort xs)
fromSortedList (SortedList (Invariant _ xs)) = xs
Any time an operation such as
gmapT is applied to the data type, the
invariant function is applied to the result. The
fromSortedList function can then rely on this invariant.
The
gunfold method is partially implemented - all constructed values will have an undefined value for all fields, regardless of which function is passed to
fromConstrB. If you only use
fromConstr (as Uniplate does) then the
gunfold method is sufficient.
MapUsing the
Hide,
Trigger and
Invariant types, we can define a wrapper for the containers
Map type as:
newtype Map k v = Map (Invariant (Trigger [k], Trigger [v], Hide (Map.Map k v)))
deriving (Data, Typeable)
The
Map type is defined as an
Invariant of three components - the keys, the values, and the underlying
Map. We use
Invariant to ensure that the keys/values/map always remain in sync. We use
Trigger on the keys and values to ensure that whenever the keys or values change we rebuild the
Map, but if they don't, we reuse the previous value. The function to extract a containers
Map from the wrapper requires only simple pattern matching:
fromMap (Map (Invariant _ (_,_,Hide x))) = x
The function to wrap a containers
Map is slightly harder, as we need to come up with an invariant restoring function:
toMap :: Ord k => Map.Map k v -> Map k v
toMap x = Map $ Invariant inv $ create x
where
create x = (Trigger False ks, Trigger False vs, Hide x)
where (ks,vs) = unzip $ Map.toAscList x
inv (ks,vs,x)
| trigger ks = create $ Map.fromList $ zip (fromTrigger ks) (fromTrigger vs)
| trigger vs = create $ Map.fromDistinctAscList $ zip (fromTrigger ks) (fromTrigger vs)
| otherwise = (ks,vs,x)
The
create function creates a value from a
Map, getting the correct keys and values. The
inv function looks at the triggers on the keys/values. If the keys trigger has been tripped, then we reconstruct the
Map using
fromList. If the values trigger has been tripped, but they keys trigger has not, we can use
fromDistinctAscList, reducing the complexity of constructing the
Map. If nothing has changed we can reuse the previous value.
The end result is that all Uniplate (or SYB) traversals over
Map result in a valid value, which has had all appropriate transformations applied.