openn.library.upenn.edu
:0284
), underscore, and a serial number (e.g., 0003). Each of the files that share a base name is a different version of the same image. Serial numbers are in a natural order, such as book page order. For example, if an entire book has been imaged including cover, then the first serial number (0000) is assigned to the outside front cover, the second serial number (0001) to the inside front cover, and so on.
<facsimile>. See below for more information on document descriptions.
.tif or JPEG .jpg. There are three derivative types. They are: web for the WEB JPEG, and thumb for the thumbnail JPEG. The master image has no tag.
0284:
.xmp extension:
<facsimile> section. Note this fragment from ljs168_TEI.xml:
ljs319
), which contains package metadata, and the data itself, found here in the directory ljs319/data. The data directory contains the manuscript description and the image files and their metadata. Each of these is described below.
data/master, data/thumb, and data/web directories. All of these images are listed in the <facsimile> section of the TEI manuscript description. Any other files provided with the document, like color and ruler reference shots, are included in the data/extra directory in master, thumb, and web sub-directories.
data directory and the package metadata.
manifest-sha1.txt and version.txt. The first lists each file in the data directory with its SHA-1 checksum. The second provides information for the package version.
manifest-sha1.txt
file that lists each file in the package's data directory with its SHA-1 checksum.
manifest-sha1.txt follows the format of the output of the GNU sha1sum program:
sha1sum or a similar command-line utility.
sha1sum on a file will print its checksum and name:
data/ljs319_TEI.xml by sha1sum is identical to the one listed in the above excerpt from the manifest-sha1.txt file.
Sha1sum can also be used with the -c flag to check an entire manifest:
version.txt
file in its top-level directory:
version.txt file for LJS 319.
version: three-part semantic version number; e.g., 1.0.0, 1.0.1, or 1.1.0.
date: timestamp of this version's creation
id: database identifier of this version
document: database identifier of the package document
description: the reason for this version
1.0.0 to 1.0.1) indicates a patch or correction that does not add or remove data or metadata. The package remains compatible with applications built on the previous version of the package. An example of a patch change would be a spelling correction in metadata.
1.0.0 to 1.1.0) indicates the addition of new data or metadata. The package will work with applications built on the previous version. An example of a minor change would be the addition of new metadata to the document's manuscript description or the addition of new images to the data set. While the new version will work as before, it may be desirable to update software to take advantage of new data.
1.1.0 to 2.0.0) indicates the removal of data or metadata or other substantive change that will likely cause this version to not work with software built on a previous version of the package.
ljs319_TEI.xml
provides descriptive and structural metadata for each document. The file is stored and named as follows:
.xmp sidecar file for each image:
creator -- person or organization responsible for creating the image
date -- date of the creation of this version of the image, including metadata
description -- brief description of the image content
format -- MIME type of the image, either image/tiff or image/jpeg
identifier -- unique identifier for the master image and its derivatives
publisher -- person or organization responsible for publication of the image
relation -- a related resource
rights -- access rights
subject -- a list of subjects
title -- the title of the image
type -- the resource type, always 'image'
Source -- the source of the image content
Marked -- whether this is a rights-managed resource; 'False' if Public Domain, 'True' otherwise
UsageTerms
-- a description of the terms of usage for this resource
titleStmt contains the description title.
publicationStmt contains the publisher and licensing information.
notesStmt contains general notes about the document.
msIdentifier contains identification information. Each document is primarily identified by its repository and call number.
The summary element contains a long-form description of the document.
The textLang element contains information about the document's languages.
The first msContents/msItem element contains a detailed description of the contents of the document as a whole. This information includes the document title, authors, other contributors (scribe, artist, etc.), and colophon.
msItem elements after the first msItem contain section and chapter titles. These elements can be distinguished from the general document-level msItem by the presence of the @n attribute and a child locus element.
The msItem/@n attribute corresponds to the facsimile/surface element with the same @n attribute.
The supportDesc element contains information about the document's support, including support material, collation information, extent, foliation (or pagination), and watermark.
layoutDesc contains a description of the document's layout.
The scriptNote element contains a description of the document's script.
The decoDesc element contains descriptions of decorative and figurative features of the document. A decoNote without an @n attribute provides a general description of decorative features. A decoNote with an @n attribute corresponds to the facsimile/surface element with the same @n attribute.
The bindingDesc element contains a description of the document's binding.
The history element contains information about the document's history, including its date and place of origin and provenance history.
The keywords elements contain genre and subject information about the document.
The facsimile element lists the imaged parts of the document, in order, with their names, linked to the document's images. The surface/@n attribute contains the part's name or page/folio number.
setup*.exe
and choose 'Install from Internet'. Follow the prompts until you are asked to choose a download site for Cygwin. Choose any site and continue. Follow the prompts again, until you get to the 'Select Packages' page. Click the + next to Web (you may need to scroll down), then click directly on 'Skip' and select the first box next to 'wget: Utility to retrieve files from the WWW via HTTP and FTP'. Click next, accept any dependencies. Download and installation may take a few minutes.
$:
wget command will download a single file into the directory you are in. So
wget = use the wget program
-np = 'no parent', this means do not download any files that are in the folders containing the 0001 folder
-r = 'recursive', this means download files directly in the 0001 folder, and also download any files that are in folders inside that folder (without this option, you would only get those files directly inside the 0001 folder)
http://openn.library.upenn.edu/Data/0001/ = start download from this location
wget = use the wget program
-nd = 'no directory', this means do not use the directory structure from OPenn, put all the files into a folder specified by me
-np = 'no parent', see above
-r = 'recursive', see above
-A.jpg = 'accept list', accept only .jpg files
-A.xml = 'accept list', accept only .xml files
-P openn/ljs225 = 'directory prefix', the folder to which all the files will be downloaded
http://openn.library.upenn.edu/Data/0001/ljs225/ = start download from this location
data/web.
wget = use the wget program
-nd = 'no directory', see above
-np = 'no parent', see above
-r = 'recursive', see above
-A.xml = 'accept list', accept only .xml files
-P openn/msDesc = 'directory prefix', see above
http://openn.library.upenn.edu/Data/ = start download from this location
man rsync
).
setup*.exe and choose 'Install from Internet'. Follow the prompts until you are asked to choose a download site for Cygwin. Choose any site and continue. Follow the prompts again, until you get to the 'Select Packages' page. Click the + next to Net (you may need to scroll down), then click directly on 'Skip' and select the first box next to 'rsync'. Click next, accept any dependencies. Download and installation may take a few minutes.
$:
/ character after Data and 0001.
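The effect of a trailing / on an rsync source can be tried safely on local folders before running anything against OPenn. The sketch below is only an illustration: every path in it is a throwaway temporary directory made up for the demonstration, not an OPenn location, and it assumes a Bash-like shell with rsync installed.

```shell
#!/usr/bin/env bash
# Sketch: how a trailing slash on an rsync SOURCE changes what is copied.
# All paths are hypothetical temp directories, not OPenn locations.
set -eu

work=$(mktemp -d)
mkdir -p "$work/Data/0001" "$work/with_slash" "$work/without_slash"
echo "sample" > "$work/Data/0001/sample.txt"

if command -v rsync >/dev/null 2>&1; then
  # Trailing slash: copy the CONTENTS of 0001 into the destination.
  rsync -a "$work/Data/0001/" "$work/with_slash/"
  # No trailing slash: copy the 0001 directory itself into the destination.
  rsync -a "$work/Data/0001" "$work/without_slash/"
  # List what landed where.
  find "$work/with_slash" "$work/without_slash" -type f
else
  echo "rsync is not installed; skipping the demonstration" >&2
fi
```

With the slash, sample.txt lands directly in with_slash; without it, the file ends up one level down, in without_slash/0001.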
The \ character on the first and second lines is used to break up the long line. If entered on the command line, the \ must be the last character on the line and cannot be followed by spaces.
The --delete option will delete any files in /var/www/html not found on OPenn. This command can be run regularly to keep an up-to-date local copy of OPenn on your system. In production, you would want to fine-tune this command to your situation.
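
Because --delete removes local files, it can be worth previewing a run first. The following is a minimal sketch under stated assumptions: two temporary folders stand in for the OPenn source and for a local mirror such as /var/www/html, and the file names are invented for the demonstration. It uses rsync's --dry-run and --itemize-changes options to show what would change before anything is deleted.

```shell
#!/usr/bin/env bash
# Sketch: preview, then apply, an rsync mirror that uses --delete.
# "remote" and "mirror" are hypothetical stand-ins for the OPenn source
# and a local copy; neither is a real OPenn path.
set -eu

work=$(mktemp -d)
mkdir -p "$work/remote" "$work/mirror"
echo "current" > "$work/remote/keep.txt"
echo "stale"   > "$work/mirror/stale.txt"   # exists only in the local mirror

if command -v rsync >/dev/null 2>&1; then
  # Dry run: report what would be copied or deleted without changing anything.
  preview=$(rsync -a --delete --dry-run --itemize-changes \
    "$work/remote/" "$work/mirror/")
  echo "$preview"

  # Real run: the mirror now matches the source, so stale.txt is removed.
  rsync -a --delete "$work/remote/" "$work/mirror/"
  ls "$work/mirror"
else
  echo "rsync is not installed; skipping the demonstration" >&2
fi
```

The dry run should mention stale.txt as a pending deletion, which makes it a convenient check before scheduling the real command to run regularly.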