EDUCE: Enhanced Digital Unwrapping for Conservation and Exploration of Inaccessible Texts
This work addresses the digital unwrapping of damaged and fragile objects that are impenetrable to conventional imaging equipment and cannot easily be opened physically for clear digitization. Examples include ancient scrolls and layered manuscripts. The precise contents of many of these objects remain a mystery. They have been sitting on shelves in conservators' archives, behind more accessible items, and have largely been ignored. In most cases physical restoration is not considered a solution because it is too risky and unpredictable. Advanced computer vision techniques are well suited to resolving this dilemma safely and efficiently. This research aims to develop a general approach that enables access to these impenetrable objects without the need to open or touch them.
We developed an approach to this restoration problem that starts from a 3D voxel set obtained from non-invasive penetrating scans and applies a processing framework to derive images of the document surfaces, revealing features such as ink and fiber. The post-scan processing addresses two key problems: three-dimensional segmentation and surface unwrapping/flattening. It consists of a 2 1/2-dimensional active contour model, which assists with the three-dimensional segmentation of the object model from the CT volumetric data, and a physically based mass-spring model, which is constrained to simulate the dynamics of unwrapping the document model. We have implemented the key algorithms in the main framework and tested the approach on a series of experimental objects.
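As a rough illustration of the mass-spring idea, the sketch below unrolls a single cross-section of a curled sheet. It is a minimal toy under stated assumptions, not the project's implementation: the function name `unroll_chain`, the parameters (`k`, `damping`, `dt`, `steps`), the restriction to a one-dimensional chain of unit-mass particles, and the choice to confine the particles to the flattening axis are all hypothetical. Each spring takes its rest length from the curved geometry, so relaxing the chain on a line recovers a flat layout that preserves distances along the surface.

```python
import numpy as np

def unroll_chain(points, k=100.0, damping=0.9, dt=0.05, steps=3000):
    """Flatten a curved 2D polyline onto a line with a damped mass-spring chain.

    Hypothetical toy model: unit-mass particles are constrained to the
    flattening axis, and each spring's rest length is the distance between
    neighbouring points on the original curved cross-section.
    """
    points = np.asarray(points, dtype=float)
    # Rest lengths come from the curved (scanned) geometry.
    rest = np.linalg.norm(np.diff(points, axis=0), axis=1)
    # Initial guess: project the curve onto the x axis, first particle at 0.
    u = points[:, 0] - points[0, 0]
    vel = np.zeros_like(u)
    for _ in range(steps):
        stretch = np.diff(u) - rest          # signed spring elongation
        force = np.zeros_like(u)
        force[:-1] += k * stretch            # spring pulls particle i toward i+1
        force[1:] -= k * stretch             # and particle i+1 toward i
        force[0] = 0.0                       # first particle is pinned
        vel = damping * (vel + dt * force)   # damped semi-implicit Euler step
        u = u + dt * vel
    return u, rest

# Usage: a half-circle cross-section of radius 1, sampled at 20 points.
theta = np.linspace(0.0, np.pi, 20)
arc = np.column_stack([-np.cos(theta), np.sin(theta)])
flat, rest = unroll_chain(arc)
print("projected width:", arc[-1, 0] - arc[0, 0])
print("unrolled length:", flat[-1])
```

In this toy setting the projected chain starts out compressed and the springs push it outward until neighbouring particles sit at their rest distances. A full document model would use a 2D mesh of particles in 3D with additional constraints, which this sketch does not attempt.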
Most recently, we performed an experimental scan of a real manuscript from the University of Michigan Library. The manuscript dates to the 15th century and is known to be a handwritten copy of a portion of Ecclesiastes, a Biblical text, written in Hebrew. The manuscript was cut into strips and glued together to form a binding used as the spine of a book. This process created a stack of multiple layers that are all stuck together. Without careful physical intervention by a conservator, it is impossible to read the text on the inner layers. Our results successfully reveal the text on an entire inner layer of the manuscript without physically splitting the layers.