FL-Islandora Guide: Content

A guide for FL-Islandora users.

Introduction to FL-Islandora Content

Content Objects

A content object is any object that is not a collection.  Content objects include documents, images, music, video, and other digital content. In FL-Islandora, there are three ways to create a new content object:

  1. online ingest
  2. offline ingest
  3. batch import (also called the “zip file importer”)

Content objects generally include four (or more) datastreams, such as MODS metadata, Dublin Core metadata, a text file called “RELS-EXT”, and a file stream containing the content file like a PDF or JPG. Some content objects include two, three or even more file streams. The RELS-EXT is a file required by Islandora that records important information about the object, like its name and to which collection(s) it belongs.

Objects ingested into a site must have the namespace prefix assigned to the site. The namespace is used as a prefix to the Fedora PID (system identifier) which has the format [namespace-prefix]:[name].  The [name] portion of the PID for content objects is assigned by the system and is always a sequentially assigned number. By policy, names for collection objects should be assigned by the operator creating the collection, and should always begin with an alphabetic character.
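
The PID rules above can be expressed as a few small checks. This is only an illustrative sketch, not part of FL-Islandora: the function names are hypothetical, and it assumes the system-assigned names of content objects consist of ASCII digits only.

```python
def split_pid(pid: str) -> tuple[str, str]:
    """Split a PID of the form [namespace-prefix]:[name] into its two parts."""
    prefix, separator, name = pid.partition(":")
    if not separator or not prefix or not name:
        raise ValueError(f"not in [namespace-prefix]:[name] form: {pid!r}")
    return prefix, name


def is_content_object_name(name: str) -> bool:
    """Content object names are sequential numbers assigned by the system."""
    return name.isascii() and name.isdigit()


def follows_collection_policy(name: str) -> bool:
    """By policy, collection names begin with an alphabetic character."""
    return name[:1].isalpha()
```

For example, `split_pid("fsu:football50")` yields the prefix `fsu` and the name `football50`, which satisfies the collection naming policy.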

NOTE:  Content objects within FL-Islandora cannot be batch modified, so please use extreme care in loading objects.

Objects created using batch import (the zip file importer), book batch, or offline ingest must have metadata submitted as part of the batch. All of these methods will accept valid MODS or MARCXML.

Content Access and Ownership

Note that an object "owner" is by default set to the user who created (ingested or submitted for ingest) the object. You may see a migration-assisted object with an Owner of 'Digitool Migration Assistant' or 'fedoraAdmin'. The owner of any object can be changed by those with sufficient permissions.

Each FL-Islandora site has its own lists of allowed IP ranges and its own Embargo policies (what to show when an unauthorized user tries to display an embargoed item, whether to receive notification of pending embargo releases).

Online Ingest

Basic Workflow:

  1. Ensure that the collection is set up to ingest the type of content you are uploading (corresponding content models need to be enabled for the collection). To check that the appropriate content models are enabled, browse to the collection that you will upload to, and click "Manage". On the Manage screen, along the left hand side, select the link "Manage collection policy."   On this screen, you will be able to confirm that the checkboxes are selected for the content models you intend to ingest.
  2. If you are not already within the desired collection, browse to the collection into which you want to upload the object and click "Manage". You will be brought to the Overview screen by default. Click “+ Add an object to this collection”.
  3. You will be prompted to select the content model for the new object from a pull-down list of all content models allowed in that collection. Select one content model, then click “Next”.
  4. You will be presented with radio buttons for the available metadata forms. Generally the choices are "MODS Simple Entry", which is a shorter form, and "Full MODS Form", which is a longer form. Select one of these forms, then click “Next”.
  5. You will be presented with the option to upload an existing MARCXML record. This can be helpful if you have existing metadata, for example, a MARCXML file exported from a catalog record or another digital library platform. If you have an existing MARCXML record (filetype “.xml”) for the object in a file on your hard drive or local network, you can find it and upload it here. It will be used to pre-populate the metadata form you selected in the previous step. NOTE: You should validate the MARCXML file prior to loading, as Islandora does not perform MARCXML validation. If you don't have a MARCXML record, just click “Next”, and you will be taken to a blank metadata form.
  6. Your selected metadata form will appear, pre-populated with any data from the MARCXML record or from a template used by your institution. Enter metadata for your new object and click “Ingest”.
  7. You will then be provided with a form to upload your content:

What happens next depends upon the type of object you are creating:

  • Unitary Content Objects (single content item, such as a PDF file or single image)

  • Compound Content Objects (multiple content items connected together)

  • Paged Content Objects (complex view item, such as a book, serial, or newspaper)

Unitary Content Objects/ Single Content Item

Unitary content objects are complete in one primary datastream or file. There may be associated derivative files stored as datastreams, but there is only one primary file.  Content models that support unitary content objects are:  Basic Image, Large Image, PDF, Audio, Video, and Binary Object.  Although the ingest process may vary slightly, all of these content types have essentially the same ingest workflow:

  1. First, select the content model and create metadata as described above. You will then be provided with an upload form.
  2. Click "Upload File", browse to and select the file on your desktop, and click the “Upload” button. When the file is uploaded, the “Upload” button changes to a “Remove” button you can click to delete the file if you made a mistake. Once the correct file is uploaded, click the “Ingest” button. Upon completion, the successfully ingested object will be displayed to you.
    1. The Audio and Binary Object upload processes allow an extra step -- there is a prompt to optionally provide a thumbnail to be used in the search results, Summary and Full Description displays of the object. If a thumbnail is not uploaded, a default icon will display.
    2. The PDF upload process allows an extra step -- there is an option to provide a file of full text for indexing. By default, full text is extracted from PDF files and stored as a datastream by the system during the upload process, but in some cases an institution may have already created better full text, for example, by off-shore keying. If the operator uploads a file of full text, it must have the “.txt” file extension. The .txt file will then be stored as the full text datastream in place of the automatically extracted text.

Compound Content Objects/ Multiple Content Items Connected Together

Compound objects are sets of two or more related objects, of any content type, that always display together. They are implemented in FL-Islandora following the Compound Object content model as a parent object consisting of metadata only connected to any associated child objects (regardless of content model type).  Ironically, the child objects are created first.

Creating a compound object is a three-step process:

  1. Ingest all the child objects.
  2. Create a parent object by using the Islandora Compound Object Content Model:
    1. If you are not already within the desired collection, browse to the collection that you want the object to be in, and click "Manage".  The Overview screen will appear by default. Click “+ Add an object to this collection”.  Select the Compound Object content model, and click “Next”. Provide complete metadata describing the compound object and click “Ingest”. When ingest is complete the metadata will display.
  3. Associate the child objects with the parent object record. NOTE: Although it’s possible to associate a child record with a parent from within the child object, it is not recommended and can cause problems. On the parent object metadata page, click the "Manage" tab, then click “Compound”. The form for associating child objects appears:

Child Object Pid/Label: Type the title or PID of the child object(s) to be part of the compound object. The data entry field will autocomplete, so you should be able to select the desired object after entering just a few characters.

Enter the title or PID for the first child object and click “Submit”. Repeat for every child object.

Paged Content Objects

Paged objects are hierarchically organized content that consists of individual pages at the lowest level, like books and newspapers. These objects make use of multiple content models, similar to the Islandora Compound Object Content Model with parent objects consisting of metadata only. 

  • Books are created by adding pages to the parent object.
  • Newspapers are created by adding issues (metadata only) to the parent object and then adding pages to the issue object(s).
  • Serials are created by 

With release 7.x-1.6 of Islandora there is the new ability to upload PDF objects to book parents or newspaper issues from which individual page image files can be extracted. In both cases the end result is a book or newspaper issue with individual page images.

NOTE: As of April 2019, there is a new, optional feature that sends uploaded .zip files of Book pages and Newspaper Issue pages to Offline Batch Ingest. This allows users to continue working in the GUI after the .zip file is uploaded, and queues the pages for loading. Loads can then be tracked via your institution's Offline Batch Ingest admin GUI.  It is recommended that anyone using this zip page feature set up the offline batch ingest feature by contacting help@flvc.org.

Books

To create a book, navigate to the collection that the book will be part of. Click the "Manage" tab. The Overview screen will appear by default. Click “+ Add an object to this collection”. Select the Islandora Internet Archive Book Content Model. Complete the metadata for the book and click “Ingest”. When ingest is complete, the Internet Archive BookReader view will appear, showing a book with a title but no content:


Next you will need to add pages to the book parent object. Click the "Manage" tab, then click “Book”. The resulting screen will offer two options, “+ Add page” and “+ Add zipped pages”.

To add pages one at a time, click “+Add Page”. Find the page file (TIFF or JP2) and upload it. Click the “Ingest” button. When ingest is complete, you will see the single page display.

NOTE: this page view has a "Manage" tab, but adding pages is a function of the Book, not the Page.

To add another page you must return to the book level by clicking “Return to Book View”. On the Book View, click the "Manage" tab, then click “Book”. Now you can repeat the steps from clicking “+ Add Page” to add another page.

Uploading pages one by one can be tedious, so there is also a function to add a number of pages bundled together in a single zip file. To use this, at the Book level, click the "Manage" tab, then click “Book”, then click “+ Add Zipped Pages”.

Language: Select the language the text is written in.

Last Sequence Number: If there are no pages already ingested into the book, this number will default to “0”, otherwise it defaults to the count of pages already ingested into the book. Page numbering will start after the page number entered here. In this example, the first page ingested from the zipped page file will be numbered “2”.

Compressed Images File: Users can browse to locate the zip file of pages and upload it.

Select the language the text is written in from the pull-down (this will default to English so you may not need to adjust).  You may want to leave the last sequence number alone (this will default to zero or the current last page of the book), unless there were pages missing from the book or there is some other reason to change the page numbering.   Locate the zip file of pages and upload it. Click “Add files to book”.

When the ingest is complete, you will get an updated version of the same screen. Any errors encountered while loading the pages will appear at the top. The “Last sequence number” field will now be updated to include the count of the number of pages ingested into the book.

There is a "Book" training exercise zip file at the bottom of this area.

Newspapers

Newspapers make use of three Content Models: Newspaper (title level), Issue, and Page. Title-level newspaper objects must be created manually via the User Interface. After title-level objects are created, issues can be added to the title (parent) object and then pages can be loaded to the issue.

Creating a Newspaper Title (Parent) Record

To create a new newspaper, navigate to the collection the newspaper will be part of and click the "Manage" tab. The Overview screen will appear by default. Click “+ Add an object to this collection”. Select the Newspaper Content Model. Select a metadata edit form and create metadata at the title level. Note that “Type of resource” should be “text”, and “Issuance” should be “serial”.

Creating an Issue Record

To add an issue, go to the newspaper title record. (This will be your default location after creating a newspaper object.) Click the "Manage" tab and click “Add issue”. Select a metadata form and fill it out.

Title: Include the date or enumeration of the issue. E.g. if the newspaper title is “The Globe” the issue title should be “The Globe, January 1, 1882” or “The Globe, v.1 no.2”.

Type of Resource: text

Issuance: single unit

Date Issued: yyyy-mm-dd. It is critical to enter Date Issued in this format to get the correct newspaper tree display.

Adding Pages to a Newspaper Issue

Pages can be added to an issue either one page at a time, or by loading a .zip file containing page images for an entire issue.

  • To add a single page, go to the issue level object.
    • Click the "Manage" tab and “Add page”.
    • Upload the JPG, TIFF or JP2 page image and click “Submit”.
  • To add a .zip file of all pages for a single issue, go to the issue level object.
    • Click the "Manage" tab and then click the "Issue" button.
    • Click "+ Add Zipped Pages" and "Choose a file" to select a .zip file of page images for that issue.
    • Click "Add files".

Offline Ingest

Offline batch ingest provides an offline alternative to online ingest and online batch import via the FL-Islandora user interface. At this time, six content models are supported by offline ingest: Basic Image, Large Image, PDF, Book, Newspaper Issues, and Video. Offline ingest frees up your online connection to FL-Islandora via the user interface for other work, and it extends FL-Islandora loading and processing capabilities by performing many load operations on the separate FL-Islandora load server.

  • With offline ingest, the content to be ingested into the system is FTP-ed to the FL-Islandora load server.
  • Each object submitted to offline batch ingest must be contained within a single package (directory) which must adhere to certain requirements. (See details below.)
  • A program on the FL-Islandora load server watches for new content and automatically moves it to an ingest queue when found. (Note that the ingest queue is shared by all FL-Islandora users.)
  • The offline ingest program then loads content in the order in which it was submitted to the shared ingest queue.
  • Results from the load are posted to an offline batch ingest reporting interface that is unique to each FL-Islandora site, where load results for all submitted packages are recorded.

Basic Workflow

Prerequisites: a new user must request an FTP account to the FL-Islandora load servers (test and production). Accounts are issued to individual users from an institution and are IP restricted. Please provide help@flvc.org with a list of individuals who will be using offline batch ingest at your institution, along with their IP addresses. Florida Library Services will create user accounts and also set up an offline batch ingest reporting user interface for your institution. The individual accounts/logins will provide you with access to only your institution's FTP directory on the FL-Islandora load server.

The offline batch ingest workflow is as follows:

  1. Create an Islandora package per the package requirements detailed below.
  2. Log into the offline ingest FTP/load server using your individual login. (The test server's hostname is “ftpes://tlhlxftp01-tst.flvc.org”. The production server's hostname is “ftpes://tlhlxftp01-prd.flvc.org”.) NOTE: We recommend that you use the FileZilla client, using port 21.
  3. Upload your packages into your institution's /incoming/ directory. All individual logins from your institution will share this /incoming directory.
  4. The offline batch ingest process checks all /incoming/ directories every 5 minutes and moves new packages into the /processing directory and queues them for loading.
  5. Packages are moved from your institution's /processing directory after they are processed for loading.
  6. Packages that load but encounter load warnings are moved into the /warnings/ directory for your review. (This step may be eliminated in future if users don’t find it to be useful, but during beta testing this could be very useful for FLVC troubleshooting.)
  7. Packages that fail to load will be moved to the /errors/ directory. You can retrieve your package from that directory and make corrections and then resubmit the package.
  8. Load results are recorded in a load database and made available for viewing in your site’s Ingest Reports interface. The URL to that interface is http://[your site root/code].admin.digital.flvc.org (for example: http://islandora-test.admin.digital.flvc.org is the URL to the Ingest Reports interface for the https://islandora-test.digital.flvc.org site, and http://fsu.admin.digital.flvc.org is the URL to the Ingest Reports interface for the FSU FL-Islandora production site.).
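
For sites that would rather script steps 2 and 3 than use a graphical client, the upload could be sketched with Python's standard `ftplib`. This is an illustration under assumptions, not an FLVC-supplied tool: it assumes that the `ftpes://` scheme means explicit FTP over TLS on port 21, that your login lands in your institution's FTP directory with an `incoming` directory beneath it, and that your account may create the package directory there. The function names and the `incoming_dir` parameter are hypothetical.

```python
from ftplib import FTP_TLS
from pathlib import Path


def files_to_send(package_dir: str) -> list[tuple[Path, str]]:
    """Pair each local file with its remote path, e.g. 'UF123/manifest.xml'."""
    package = Path(package_dir)
    return [
        (path, f"{package.name}/{path.name}")
        for path in sorted(package.iterdir())
        if path.is_file()
    ]


def upload_package(host: str, user: str, password: str, package_dir: str,
                   incoming_dir: str = "incoming") -> None:
    """Upload one package directory using explicit FTP over TLS on port 21."""
    with FTP_TLS() as ftps:
        ftps.connect(host, 21)
        ftps.login(user, password)  # login() secures the control channel first
        ftps.prot_p()               # encrypt the data channel as well
        ftps.cwd(incoming_dir)      # assumed location relative to the login directory
        ftps.mkd(Path(package_dir).name)
        for local_path, remote_path in files_to_send(package_dir):
            with local_path.open("rb") as handle:
                ftps.storbinary(f"STOR {remote_path}", handle)
```

A call such as `upload_package("tlhlxftp01-tst.flvc.org", "your-login", "your-password", "UF12345678_00001")` would send one package to the test server; the login and password shown are placeholders.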

Package Requirements

Rules for creating packages for offline batch ingest:

  1. Each package (directory) must have the same name as the IID (item identifier) of the object. Allowable characters in the IID are: alphanumeric characters, hyphens, underscores. Note that spaces are not allowed.
  2. The package must contain a) metadata (a valid MODS file), b) a manifest file, and c) content file(s). The name of the MODS file must match the directory/folder name of the package. See the examples below.
  3. If the package is for a Book, a METS file containing a structMap and fileSec must be supplied. A Table of Contents will be created from the METS structMap and fileSec information.
  4. If the package is for a Newspaper Issue, a METS file is optional. Without a METS file, the load program will assemble pages in ASCII sort order, case sensitive.
  5. The MODS record file must have the same filename as the package and the file type .xml. E.g. if the package name is “UF12345678” the MODS file must be named “UF12345678.xml”.
  6. The manifest must be named “manifest.xml”. (See below for manifest requirements.)
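
Before uploading, the naming rules above can be checked locally. The following is a minimal sketch rather than an official validator. It assumes that "alphanumeric" means ASCII letters and digits, and that a METS file, if present, is named `mets.xml` as in the examples in this guide (the rules do not mandate that name). The function name `check_package` is hypothetical, and the sketch does not check whether the MODS, METS, or manifest files are themselves valid.

```python
import re
from pathlib import Path

# Assumption: "alphanumeric" is read as ASCII letters and digits.
IID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def check_package(package_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the naming rules pass."""
    package = Path(package_dir)
    problems = []
    if not IID_PATTERN.fullmatch(package.name):
        problems.append("package name may contain only letters, digits, "
                        "hyphens, and underscores")
    names = {path.name for path in package.iterdir() if path.is_file()}
    mods_name = f"{package.name}.xml"
    if mods_name not in names:
        problems.append(f"missing MODS file {mods_name}")
    if "manifest.xml" not in names:
        problems.append("missing manifest.xml")
    # Whatever remains is treated as content; "mets.xml" is an assumed name.
    if not names - {mods_name, "manifest.xml", "mets.xml"}:
        problems.append("no content files found")
    return problems
```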

Creating a METS File

The METS Editor (SobekCM) can be downloaded and installed and used to create a METS file for your Book and Newspaper Issue packages.

Examples of Packages

The following examples of package structures would all be valid packages (assuming of course that all files within them are valid). Note that there are no naming requirements for content filenames. However, pages of newspaper issues without METS files load in ASCII sort order, so name them so that they sort into the correct page order, e.g., p001.jpg through p234.jpg:

A PDF

  • /UF12345678_00001/ (UF12345678_00001 is the directory or folder name)
    • manifest.xml
    • UF12345678_00001.xml (the MODS metadata file)
    • happy_trails.pdf

A Large Image

  • /GC_accession_1322/ (GC_accession_1322 is the directory name)
    • manifest.xml
    • GC_accession_1322.xml (the MODS metadata file)
    • fits_1322_1.jp2

A Book

  • /FA00000032/ (FA00000032 is the directory name)
    • manifest.xml
    • FA00000032.xml (the MODS metadata file)
    • mets.xml
    • cover1.jp2
    • cover2.jp2
    • p001.jp2
    • etc. (more pages)

A Newspaper Issue

  • /CF00004312_0001/ (CF00004312_0001 is the directory name)
    • manifest.xml
    • CF00004312_0001.xml (the MODS metadata file)
    • mets.xml <- Note that the METS file for a newspaper issue is optional
    • page01.jpg
    • page02.jpg
    • page40.jpg
    • etc. (more pages)
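
The effect of ASCII sort order on page filenames can be previewed before a newspaper issue without a METS file is submitted. This sketch assumes the loader's ordering matches Python's default string sort, which for ASCII filenames compares character codes and is case sensitive. The function name `load_order` is hypothetical.

```python
def load_order(filenames: list[str]) -> list[str]:
    """Order names by character code, which is ASCII order for ASCII names."""
    return sorted(filenames)


# Without zero padding, page10 sorts ahead of page2 ("1" precedes "2").
unpadded = load_order(["page2.jpg", "page10.jpg", "page1.jpg"])

# With zero padding, the sort order matches the page order.
padded = load_order(["page02.jpg", "page10.jpg", "page01.jpg"])

# Uppercase letters sort ahead of lowercase letters.
mixed_case = load_order(["a.jpg", "B.jpg"])
```

Zero-padded, consistently cased names such as p001.jpg avoid both surprises.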

MODS Requirements

  • The MODS record must validate against the MODS schema. Packages containing invalid MODS files will fail to load and will record validation errors.
  • The MODS record must NOT use namespace prefixes on the MODS elements.
    • Do NOT use the namespace definition xmlns:mods="...." and do NOT create a record like the one below:
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3">
    <mods:titleInfo>
   <mods:title>This is an example of a bad MODS record.</mods:title>
    </mods:titleInfo>
etc.
  • A correct MODS record will look like this:
<mods xmlns="http://www.loc.gov/mods/v3">
   <titleInfo>
     <title>This is an example of a good MODS record.</title>
   </titleInfo>
etc.
  • The MODS record should not include an <flvc> extension. Any sub-elements of the <flvc:flvc> tag will be ignored, removed from the MODS record, and replaced with an <flvc> extension block created by Offline Ingest.
  • The MODS record must contain exactly one identifier of type “IID”, and the identifier value must match the name of the package.
  • For newspaper issue packages the MODS must contain an <originInfo><dateIssued> element, with an encoding attribute of “w3cdtf” and the date expressed in the format YYYY-MM-DD. For example:
<originInfo>
  <dateIssued encoding="w3cdtf">1881-05-07</dateIssued>
 </originInfo>
  • Newspaper issue packages that do not contain this element and attribute will not be loaded.
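
Two of the requirements above, the single IID identifier and the newspaper issue date, lend themselves to a quick local check. This is a hedged sketch using Python's standard library: it does not validate against the MODS schema, it checks only the shape of the date (not whether it is a real calendar date), and the function name `check_mods` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_mods(mods_xml: str, package_name: str,
               newspaper_issue: bool = False) -> list[str]:
    """Return a list of problems; an empty list means these checks pass."""
    problems = []
    root = ET.fromstring(mods_xml)
    # Only top-level identifiers of type "IID" are counted.
    iids = [(el.text or "").strip()
            for el in root.findall("mods:identifier[@type='IID']", MODS_NS)]
    if len(iids) != 1:
        problems.append(f"expected exactly one IID identifier, found {len(iids)}")
    elif iids[0] != package_name:
        problems.append(f"IID {iids[0]!r} does not match package {package_name!r}")
    if newspaper_issue:
        dates = root.findall(
            "mods:originInfo/mods:dateIssued[@encoding='w3cdtf']", MODS_NS)
        if not any(DATE_PATTERN.fullmatch((d.text or "").strip()) for d in dates):
            problems.append("no w3cdtf dateIssued in YYYY-MM-DD form")
    return problems
```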

Manifest Requirements

The manifest is an XML file containing instructions to the batch ingest process.

An example of a valid manifest file for a Large Image:

<?xml version="1.0" encoding="UTF-8"?>
<manifest xmlns="info:flvc/manifest">
    <contentModel>islandora:sp_large_image_cmodel</contentModel>
    <owningUser>Sally Staff</owningUser>
    <collection>fau:photos</collection>
    <owningInstitution>FAU</owningInstitution>
</manifest>

Example of a Book object's manifest:

 <?xml version="1.0" encoding="UTF-8"?>
 <manifest xmlns="info:flvc/manifest">
    <contentModel>islandora:bookCModel</contentModel>
    <owningUser>Sally Staff</owningUser>
    <collection>ucf:floridabooks</collection>
    <owningInstitution>UCF</owningInstitution>
 </manifest>
 

Example of a Newspaper Issue object's manifest. Note that the collection element must contain the PID of the newspaper title object so that the loader can identify the newspaper title to which the issue is to be attached:

 <?xml version="1.0" encoding="UTF-8"?>
 <manifest xmlns="info:flvc/manifest">
    <contentModel>islandora:newspaperIssueCModel</contentModel>
    <owningUser>Jane Jones</owningUser>
    <collection>fscj:1234</collection>
    <owningInstitution>FSCJ</owningInstitution>
 </manifest>

The following elements are required in the manifest file:

  • collection (required, repeatable)
    • A collection which the object will be a member of. Multiple collections can be specified, but only one per <collection> element.
      • Collection names must be in PID format [namespace]:[name], e.g. fsu:football50
    • For newspaper issue packages, the collection element must contain the PID of the parent newspaper title object, e.g., usf:445, instead of the PID of an Islandora collection object. For example:
      • <collection>usf:445</collection>
  • contentModel (required, not repeatable)
    • The content model to be used for this object. Allowable content model names are:
      • islandora:bookCModel (for Books)
      • islandora:sp_basic_image (for Basic Images)
      • islandora:sp_large_image_cmodel (for Large Images)
      • islandora:sp_pdf (for PDFs)
      • islandora:newspaperIssueCModel (for newspaper issues)
      • islandora:sp_videoCModel (for videos)
  • owningInstitution (required, not repeatable)
    • The institution code of the institution owning the object. This will be validated against a list of institution codes known to the system, and inserted into the local FLVC extension of the MODS record.
  • owningUser (required, not repeatable)
    • The Islandora userid of the operator submitting the package for offline ingest. This will be validated against userids known to the system, and inserted into the local FLVC extension of the MODS record.


The following is a complete list of elements permitted in the manifest.xml file:

  • collection (required, repeatable)
    • A collection which the object will be a member of. Multiple collections can be specified, but only one per <collection> element.
      • Collection names must be in PID format [namespace]:[name], e.g. fsu:football50
    • For newspaper issue packages, the collection element must contain the PID of the parent newspaper title object, e.g., usf:445, instead of the PID of an Islandora collection object. For example:
      • <collection>usf:445</collection>
  • contentModel (required, not repeatable)
    • The content model to be used for this object. Allowable content model names are:
      • islandora:bookCModel (for Books)
      • islandora:sp_basic_image (for Basic Images)
      • islandora:sp_large_image_cmodel (for Large Images)
      • islandora:sp_pdf (for PDFs)
      • islandora:newspaperIssueCModel (for newspaper issues)
      • islandora:sp_videoCModel (for videos)
  • embargo (not required, not repeatable)
    • Information about an embargo to place on the ingested object, supplied in the attributes rangeName and endDate. For example:
      • <embargo rangeName="FSU campus" endDate="2014-07-01">
    • If the embargo is indefinite (no end date) omit the endDate attribute.
  • identifier (not required, repeatable)
    • An identifier to be supplied to the metadata for the object. This function is not fully implemented at this time and the <identifier> element should not be used.
  • label (not required, not repeatable)
    • A label to be supplied to the object. This function is not fully implemented at this time and the <label> element should not be used.
  • otherLogo (not required, repeatable)
    • A logo to be displayed in addition to the owning institution’s logo, which displays by default. This will be inserted into the local FLVC extension of the MODS record.
  • owningInstitution (required, not repeatable)
    • The institution code of the institution owning the object. This will be validated against a list of institution codes known to the system, and inserted into the local FLVC extension of the MODS record.
  • owningUser (required, not repeatable)
    • The Islandora userid of the operator submitting the package for offline ingest. This will be validated against userids known to the system, and inserted into the local FLVC extension of the MODS record.
  • pageProgression (not required, not repeatable)
    • Use <pageProgression>rl</pageProgression> if the pages in a book are read right to left, as with Hebrew. Omit otherwise.
  • submittingInstitution (not required, not repeatable)
    • The institution code of the institution submitting the object for offline ingest. If not provided in the manifest, this will default to the same as the owning institution.
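
When many packages are prepared at once, the manifest can be generated rather than typed by hand. The sketch below uses Python's standard `xml.etree.ElementTree` and covers the four required elements plus the optional embargo. It is an illustration only: the function name `build_manifest` and its parameters are hypothetical, and writing the embargo as an empty element is an assumption, since this guide shows only its opening tag.

```python
import xml.etree.ElementTree as ET

MANIFEST_NS = "info:flvc/manifest"


def build_manifest(content_model, owning_user, owning_institution, collections,
                   embargo_range=None, embargo_end=None) -> str:
    """Return manifest.xml text with the required elements and optional embargo."""
    ET.register_namespace("", MANIFEST_NS)  # serialize as the default namespace
    root = ET.Element(f"{{{MANIFEST_NS}}}manifest")

    def add(tag, text=None, **attributes):
        child = ET.SubElement(root, f"{{{MANIFEST_NS}}}{tag}", attributes)
        child.text = text

    add("contentModel", content_model)
    add("owningUser", owning_user)
    for pid in collections:  # one <collection> element per collection PID
        add("collection", pid)
    add("owningInstitution", owning_institution)
    if embargo_range:
        attributes = {"rangeName": embargo_range}
        if embargo_end:  # omit endDate for an indefinite embargo
            attributes["endDate"] = embargo_end
        add("embargo", **attributes)  # assumption: an empty element is accepted
    ET.indent(root)  # requires Python 3.9 or later
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            + ET.tostring(root, encoding="unicode"))
```

For example, `build_manifest("islandora:sp_large_image_cmodel", "Sally Staff", "FAU", ["fau:photos"])` reproduces the elements of the Large Image example above. Save the result as manifest.xml using UTF-8 encoding.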

Package Submission

To submit packages for loading via the offline batch ingest process:

  • Log into the test FTP server using your individual FTP server login. (The test FTP server’s hostname is islandload-tst.flvc.org, and all packages submitted there will load into your institution’s test site.)
  • Upload your packages into the /incoming directory for your site. (Your login will take you directly to your institution’s FTP directory.)

Viewing Results

To view the results of your load, point a browser at your site’s Ingest Reports interface. The URL will be: [institution’s islandora site code].admin.digital.flvc.org.

e.g., https://islandora-test.admin.digital.flvc.org.

You will see a page that lists all materials loaded into your site by offline batch ingest:

You can filter the results list by date, load status, package name/ID, title, collection, or content type (content model).

By clicking on the Status link you’ll see the full load report that includes a direct link to the Islandora object, if the object was loaded. Otherwise you’ll see an error report and can expect that the referenced package has been moved to the /errors directory in your site’s FTP space.

In the upper right-hand corner of the Ingest Report page is a CSV download button that will download the search results.

Interpreting Results

  • The green "success" status means that your package was successfully loaded. Click on the "success" status link for details of the load and links to the loaded object.
  • The red "error" status means that your package was not loaded. Click on the "error" status link for details about the problem. To resolve errors you must correct the problem and re-upload the package to the /incoming directory for re-processing. The two most common errors are:
    • That the package name, MODS file name and <IID> element in the MODS file are not identical.
    • That an IID is not unique. (In FL-Islandora the MODS <IID> element must be unique.) This means that there is an object on your site that has the same IID as the object you have submitted for loading. You must determine which IID is correct and should be retained.
  • The yellow "warning" status means that your package was loaded, but there were anomalies in the package that you might want to be aware of. It is up to you to determine if you want to delete the loaded package and reload after correcting the anomalies.

Dealing with Errors

All packages that fail to load will be moved to your site’s /errors directory, in alphabetically coded sub-directories. You can download the package from there, correct it locally, and resubmit it, or you can delete the problem package.

  • If you get the error "RestClient::RequestTimeout Request Timeout", the server was restarted while the package was processing. The error is caused by the server restart, and the package may be perfectly fine. Move the package back to /incoming/ using your FTP server login and client.

Batch Import (also called the “zip file importer”)

The zip file importer can be used to ingest a batch of objects at once. It can be used to load:

  • MODS metadata and content files

  • Content files without metadata

  • MODS metadata without content files

Objects ingested via the zip file importer have the operator name of the submitting operator as the owning user, and will default to “Active” or “Inactive” state accordingly.

NOTE: For loading content into the Binary Object Content Model you must use the Binary Object Zip Importer instead of the ZIP File Importer.
NOTE: Adding an embargo to an object via Zip importer will not work. Instead use Offline ingest.

Preparing the zip file

The objects to be loaded must be zipped together into a single file of filetype .zip. All objects in the zip file must use the same content model and be intended for the same collection.  It is recommended that you use open source software such as 7-Zip to create your zip files for Islandora instead of the proprietary Microsoft NTFS compression available from the Windows right-click context menu.

Files to be associated with each other (e.g. MODS metadata with content files) are matched by filename, so they must have the same filename and different filetype extensions. The filetype extension for MODS files must be .xml and the filetype extension for full text to be indexed must be .txt.  Because the Zip Importer will ingest metadata without content and content without metadata, you should take care that metadata and corresponding content files have exact matching names -- a typo in the filename will cause the two files to be ingested separately.  

For example, Large Image objects with metadata might be named:

   file1.jp2
   file1.xml
   file2.jp2
   file2.xml 

PDF files with metadata and full text for indexing might be named:

   1443587.pdf
   1443587.xml
   1443587.txt
   1439_2.pdf
   1439_2.xml
   1439_2.txt

A single zip file can also contain a mixture of pairs of metadata and content files, standalone metadata, and/or standalone content files. A zip file containing:

   program_1.pdf
   program_1.xml
   4433256.pdf
   solo.xml

will create 3 objects: one PDF with metadata, one standalone PDF, and one standalone MODS file.
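The matching rule above can be checked before upload. The following is a minimal Python sketch, not part of FL-Islandora. The function names predict_objects and names_in_zip are hypothetical, and the sketch assumes the importer compares base filenames exactly (including case), treats .xml as metadata, and treats .txt as full text, as described above.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

METADATA_EXT = ".xml"   # MODS files, per the guide
FULL_TEXT_EXT = ".txt"  # full text for indexing, per the guide


def names_in_zip(zip_path):
    """Return the file entries of a zip archive, skipping directory entries."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if not name.endswith("/")]


def predict_objects(names):
    """Group filenames by base name and describe the object each group would make.

    Assumption: matching is exact on the base filename, so a typo in one
    file of a pair produces two separate groups.
    """
    groups = defaultdict(set)
    for name in names:
        path = PurePosixPath(name)
        groups[path.stem].add(path.suffix)

    report = {}
    for stem, extensions in sorted(groups.items()):
        has_metadata = METADATA_EXT in extensions
        has_content = bool(extensions - {METADATA_EXT, FULL_TEXT_EXT})
        if has_metadata and has_content:
            report[stem] = "content with metadata"
        elif has_metadata:
            report[stem] = "standalone metadata"
        elif has_content:
            report[stem] = "standalone content"
        else:
            report[stem] = "full text only"
    return report


if __name__ == "__main__":
    example = ["program_1.pdf", "program_1.xml", "4433256.pdf", "solo.xml"]
    for stem, kind in predict_objects(example).items():
        print(f"{stem}: {kind}")
```

For the four-file example above, the sketch reports program_1 as content with metadata, 4433256 as standalone content, and solo as standalone metadata. A misspelled pair (for example program_1.pdf with progam_1.xml) would show up as two separate entries, which is the situation the guide warns about.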

NOTE: The ZIP File Importer allows upload/creation of objects in only one Content Model per .zip file.

Metadata Requirements

  • The MODS file provided should be valid MODS. Florida Library Services strongly recommends validation of MODS files prior to loading with the Zip Importer, as the Zip Importer does not perform validation against the MODS schema during loading. NOTE: The ExceltoMODS Transformer, http://exceltomods.flvc.org, validates against the MODS schema during transformation of Excel spreadsheets into MODS, and so MODS prepared in this way has been validated.

  • Minimum requirements are that it must include:

    • a unique IID in the element <identifier type="IID">

    • a valid owning institution code in the element <extension> <flvc:flvc> <flvc:owningInstitution>

    • a title in the element <titleInfo><title>

  • Submitting Institution (<submittingInstitution>) and Other Logo (<otherLogo>) can also be put in the <flvc> extension.

  • The namespace for the <flvc> extension must be included in the MODS header information:

    • xmlns:flvc="info:flvc/manifest/v1"

  • The example below shows a valid minimal MODS record for the Zip File Importer:

<?xml version="1.0" encoding="UTF-8"?>
<mods xmlns="http://www.loc.gov/mods/v3"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xmlns:flvc="info:flvc/manifest/v1"   
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xsi:schemaLocation="http://www.loc.gov/mods/v3
     http://www.loc.gov/standards/mods/v3/mods-3-4.xsd"
     version="3.4">
    <extension>
         <flvc:flvc>
              <flvc:owningInstitution>FSU</flvc:owningInstitution>
         </flvc:flvc>
    </extension>
    <titleInfo>     
         <title>Strawberries</title>
    </titleInfo>  
    <identifier type="IID">FS3518756</identifier>
</mods>

  • NOTE: The supplied MODS record must NOT use namespace prefixes on the MODS elements. That is, do NOT declare a prefixed namespace such as xmlns:mods="..." and do NOT create a record like the one below:

<mods xmlns="http://www.loc.gov/mods/v3" xmlns:mods="http://www.loc.gov/mods/v3">
 <mods:titleInfo>
   <mods:title>This is an example of a bad MODS record.</mods:title>
 </mods:titleInfo>

etc.

If any of these requirements are not met, the supplied metadata will be ignored, and the Zip File Importer will create a skeleton MODS record that contains only the filename as title:

<mods xmlns="http://www.loc.gov/mods/v3">
 <titleInfo>
   <title>[filename]</title>
 </titleInfo>
</mods>

The skeleton record will not be sent to Mango but it will display in Islandora. The first time an operator tries to update the skeleton record online, the MODS Forms will force the creation of required fields.
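Because supplied metadata that misses a minimum requirement is replaced by a skeleton record, it can help to check each MODS file before zipping. The following is a minimal Python sketch under stated assumptions: the function name missing_minimum_fields is hypothetical, and it checks only that the three minimum elements listed above are present and non-empty. It does not validate against the MODS schema, confirm that the IID is unique, confirm that the owning institution code is valid, or detect prefixed elements such as mods:title.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the minimal MODS example in this guide.
NAMESPACES = {
    "mods": "http://www.loc.gov/mods/v3",
    "flvc": "info:flvc/manifest/v1",
}

# Label -> path relative to the root <mods> element.
MINIMUM_FIELDS = {
    "IID identifier": "mods:identifier[@type='IID']",
    "owning institution": "mods:extension/flvc:flvc/flvc:owningInstitution",
    "title": "mods:titleInfo/mods:title",
}


def missing_minimum_fields(mods_xml):
    """Return labels of minimum fields that are absent or empty in a MODS record."""
    root = ET.fromstring(mods_xml)
    missing = []
    for label, path in MINIMUM_FIELDS.items():
        element = root.find(path, NAMESPACES)
        if element is None or not (element.text or "").strip():
            missing.append(label)
    return missing


if __name__ == "__main__":
    skeleton = (
        '<mods xmlns="http://www.loc.gov/mods/v3">'
        "<titleInfo><title>Flavorcrest</title></titleInfo>"
        "</mods>"
    )
    print(missing_minimum_fields(skeleton))
```

An empty list means the three minimum elements were found. Any other result names what to add before loading.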

Doing the import

Navigate to the collection to which the objects should be added. Click "Manage". This should default to the “Overview” screen.

Click "Collection" to get the collection management screen. Click "+ Batch import objects".

Take the default importer, “ZIP File Importer”. Click "Next".

NOTE: Although the Binary Object Content Model appears when you select the ZIP File Importer, you must use the Binary Object Zip Importer to batch load Binary Objects (the ZIP File Importer will not load the content files).

Fill out the form. Find the zip file to import. Click the content model the import should use.

Select the default namespace, which should be correct. Click "Import".

Importing content without metadata

This is a useful feature for institutions that want to load quantities of objects and then add metadata interactively online. 

Content files can be imported without metadata. Any content file in a zip file that has no matching metadata file will be ingested and a minimal MODS record will be created with only the <titleInfo><title> element supplied from the filename of the file.

E.g. if a standalone file named “Flavorcrest.jpg” is imported, this MODS record will be created:

<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <title>Flavorcrest</title>
  </titleInfo>
</mods>

Because the metadata has no IID identifier or owning institution code, these required fields will have to be added the first time the record is edited online.
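To preview the title a standalone content file would receive, the skeleton record can be approximated locally. This is an illustrative Python sketch only: the function name skeleton_mods is hypothetical, and it assumes the title is the filename with its final extension removed, as in the Flavorcrest example. How the importer handles filenames with several dots is not stated in this guide.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

MODS_NS = "http://www.loc.gov/mods/v3"


def skeleton_mods(filename):
    """Build a title-only MODS record from a content filename (assumed behavior)."""
    # Serialize MODS elements without a prefix, as the guide requires.
    ET.register_namespace("", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    title = ET.SubElement(title_info, f"{{{MODS_NS}}}title")
    # Assumption: the title is the base filename without its extension.
    title.text = PurePosixPath(filename).stem
    return ET.tostring(mods, encoding="unicode")


if __name__ == "__main__":
    print(skeleton_mods("Flavorcrest.jpg"))
```

Running this over the standalone files in a zip gives a list of the titles that will need an IID and owning institution code added on first edit.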

Importing metadata without content

Metadata files can be imported without corresponding content files. The importer will generate warning messages that derivative files could not be created, according to the content model associated with the import. These messages can be ignored.  This option may be used when an institution's workflow is such that one staff person creates and loads metadata only, and after the metadata has been finalized another staff person uploads and adds content datastreams.

Viewing results

After the content of a zip file has been successfully imported, the operator will get a response screen with the message:

   Batch complete!  View/download simple results or see the watchdog log for details.

The “simple results” link provides a list of all objects ingested with their title and PID, and a link to the content in Islandora.

   info: Ingested "Viking Motel Acquisition, June 1988" as islandora:1858. Link: islandora:1858.
   info: Ingested "Naésa zemlja" as islandora:1859. Link: islandora:1859.
   info: Ingested "Flavorcrest" as islandora:1860. Link: islandora:1860.
   info: Ingested "Peaches" as islandora:1861. Link: islandora:1861.
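If you save the “simple results” text, the titles and PIDs can be extracted for your own records. The following Python sketch is an assumption-laden convenience, not an FL-Islandora feature: the function name parse_simple_results is hypothetical, and the pattern is based only on the sample messages shown above, so titles that themselves contain the text `" as ` would not parse correctly.

```python
import re

# Pattern modeled on the sample messages in this guide:
#   info: Ingested "<title>" as <namespace>:<name>. Link: ...
RESULT_PATTERN = re.compile(
    r'info: Ingested "(?P<title>.*?)" as (?P<pid>[^\s.:]+:[^\s.]+)\.'
)


def parse_simple_results(text):
    """Return (title, PID) pairs found in saved simple-results text."""
    return [
        (match.group("title"), match.group("pid"))
        for match in RESULT_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = (
        'info: Ingested "Flavorcrest" as islandora:1860. Link: islandora:1860. '
        'info: Ingested "Peaches" as islandora:1861. Link: islandora:1861.'
    )
    for title, pid in parse_simple_results(sample):
        print(f"{pid}\t{title}")
```

The pattern works whether the messages are on one line or on separate lines, since it searches for each message rather than splitting on line breaks.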

The “watchdog log” link provides a view of log messages associated with the import. The entire log can also be viewed from the Administrative menu / Reports / Recent log messages.

Batch Import to Offline Ingest (Books, Newspaper Issues, and Videos)

How does Batch Import to Offline Batch Ingest work?

To a large extent these new features are transparent to users, because the initial steps remain the same: you add a .zip file of Book pages, a .zip file of Newspaper Issue pages, or a .zip file containing one or more video files and their associated MODS files. The differences appear after the .zip file is uploaded:

  • After you upload the .zip file using one of the current methods, the new code (if enabled for your site) creates an Offline Batch Ingest package from the .zip file and moves the package to your site's Offline Batch Ingest load queue for batch loading.

  • The GUI provides a link to your site's Offline Batch Ingest administrative interface (http://[site code].admin.flvc.org) where you can track the progress of the load.

  • Packages that have been successfully transferred are noted as Status "queued". After processing you will see the regular "success", "warning" or "error" status.

An example of loading a .zip file of Book pages via the GUI to Offline Batch Ingest process

1. As usual, first create a book title object and edit its metadata.

2. To add a .zip file of pages, click Manage -> Book -> +Add zipped pages from within the parent Book object. You can load a .zip file of pages to a newly created Book parent object, or you can add pages to an existing Book parent object with existing book pages. Pages will be added to the end of the book.

3. Upload the .zip file of pages, and click "Add files".

4. With the new feature, your .zip file will upload and be passed to Offline Batch Ingest, and you'll receive a message indicating that your pages will be loaded via offline batch ingest.

5. In the Offline Batch Ingest admin GUI you'll see that your package of pages is queued for loading. Note that the "title" of the load will be the IID of the book object to which pages are being added, along with the date and timestamp of the load (not the title of the book).

6. Once the pages load you'll see a "success" status and details about the load, including a link to the book parent object and a list of page PIDs and links.

 

Content Management

Content objects can be moved from one collection to another or shared with another collection. This can be done at either the object level or the collection level. Moving or sharing from the object itself is useful if you have only one or a few objects to move, or if you are already on the object for another reason, for example, reviewing its metadata.

Move or share a single object

Navigate to the object you want to move or share.  Click the "Manage" tab. The default Overview display gives you prompts:

"+ Migrate this object to another collection"
"+ Share this object with another collection"

Moving Objects in a Collection

Instructions on how to move one or more objects from a collection’s Manage tab are given here.  There are two ways to move (migrate) a content object from one collection to another collection:

   1) from the object’s "Manage" tab,
   2) from the collection’s "Manage" tab.

Moving objects from a collection’s Manage tab is useful if you want to move all or several members of one collection to another collection. For example, say you have a collection called “Maps” containing maps of North and South Carolina, and you want to break it up into two collections: “North Carolina Maps” and “South Carolina Maps”. You could create a new collection called “South Carolina Maps” and move all the South Carolina maps from the Maps collection into it. Then you could rename the original collection “Maps” (which now contains only North Carolina maps) to “North Carolina Maps”.

Navigate to the collection whose members you want to move. Click the "Manage" tab.

Click “Collection” on the top menu bar and then “Migrate Members” on the left-hand sidebar. You will get the migration form. 

Migrate members to collection: this is a pull-down menu with the names of all collections known to your site. Select the collection you want to migrate objects into.

To migrate all objects in the collection, click the box next to "LABEL"  to select all and then click the button “Migrate All Objects” at the bottom of the screen.

To migrate selected objects, use the checklist of the titles of all objects in the current collection. You can cherry-pick objects or select a screen at a time by clicking the box before “LABEL”. Note that this won’t select all objects in the collection, only the ones listed on that page. Then click “Migrate Selected Objects”.

Migrated objects will be detached from the current collection and moved to the selected collection.

Sharing Objects in a Collection

Sometimes you want an object to appear in multiple collections at the same time. For example, you might want a historical state map to appear in both your “Maps” collection and your “State History” collection. You can make an object a member of two or more collections by ingesting it into one collection and then sharing it with the other collections.  Instructions on how to share one or more objects from a collection’s "Manage" tab are given here.  There are two ways to share a content object with another collection:

   1) from the object’s "Manage" tab,
   2) from the collection’s "Manage" tab.

Navigate to the collection containing the objects you want to share. Click "Manage".

Select “Collection” from the top menu bar and “Share Members” from the left-hand sidebar.

The Share members function is implemented like the migrate function. Select the collection you want to share members with from the “Share members with collection” pull-down. To share all objects in the collection, click the button “Share All Objects” at the bottom of the screen.  To share selected objects, use the checklist of the titles of all objects in the current collection. Select the member(s) you want to share from the checkbox list. To select all listed on a page, click the checkbox in front of “LABEL”. Remember this will select only those titles listed on the page, not all members of the collection. Click “Share Selected Objects”. They will now appear in both collections.

Content objects can be shared with collections on other sites. In particular, objects from institutional collections can be shared with PALMM using the instructions above.

Collection(s) within a Collection

1) Make sure the collection in which you are creating a new collection is set up with the "Islandora Collection Content Model (islandora:collectionCModel)".  To check this, browse to the collection in which you will create a collection, then click "Manage", then click "Collection." On the left-hand side you will see "Manage collection policy."  If the box for "Islandora Collection Content Model (islandora:collectionCModel)" is selected, then the collection is set up to accept this content type and you can proceed to the next step.

2) If you are not already within the appropriate collection, browse to the collection in which you will create your new collection. Click the "Manage" tab, then click "Add an object to this Collection".  As a reminder, Islandora labels both content items and collections as "objects."

3) Under "Collection PID", enter a PID for your new collection. The collection's PID has to start with your institution's namespace. The namespace is the part of your site's URL that comes before ".digital.flvc.org". (For example, https://fau.digital.flvc.org has the namespace fau and https://ucf.digital.flvc.org has the namespace ucf.) The namespace has to be lowercase. If you enter a wrong namespace, you will get an error message on the very last screen. If you don't enter a "Collection PID", Islandora will automatically assign a random number, which is not recommended practice and affects search engine optimization.

4) Uncheck the box for "Inherit collection policy". This opens up a bunch of checkbox options. Pick the ones for the content that you plan to upload into this new collection. If you select the wrong ones now you can go back and change this setting later.

5) Click "Next".

6) You can ignore the MARCXML file. Click "Next" without doing anything on the MARCXML screen.

7) Enter a "Collection Title". This will show up to people browsing the site. You can also change the title at a later time.

8) Click "Ingest".
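Because a wrong namespace in step 3 is reported only on the very last screen, it can save time to check the Collection PID before starting. The following Python sketch is illustrative only: the function names namespace_from_site_url and collection_pid_problems are hypothetical. It applies the rules stated in this guide (the namespace is the part of the site URL before ".digital.flvc.org", it must be lowercase, and by policy a collection name should begin with an alphabetic character) and does not reproduce any other checks Islandora may perform.

```python
from urllib.parse import urlparse

SITE_SUFFIX = ".digital.flvc.org"


def namespace_from_site_url(site_url):
    """Return the namespace, i.e. the host part before .digital.flvc.org."""
    host = urlparse(site_url).hostname or ""
    if not host.endswith(SITE_SUFFIX):
        raise ValueError(f"not an FL-Islandora site URL: {site_url!r}")
    return host[: -len(SITE_SUFFIX)]


def collection_pid_problems(pid, site_url):
    """Return a list of problems with a proposed Collection PID (empty if none)."""
    namespace = namespace_from_site_url(site_url)
    prefix, separator, name = pid.partition(":")
    if not separator:
        return ["PID must have the form [namespace-prefix]:[name]"]

    problems = []
    if prefix != namespace:
        problems.append(f"namespace must be exactly {namespace!r} (lowercase)")
    if not name[:1].isalpha():
        problems.append("name should begin with an alphabetic character")
    return problems


if __name__ == "__main__":
    site = "https://fau.digital.flvc.org"
    for candidate in ("fau:maps", "FAU:maps", "fau:123"):
        print(candidate, collection_pid_problems(candidate, site))
```

An empty list means the proposed PID satisfies the rules checked here.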

Reorder Child Objects for Compound Content

After creating a compound object, you may want to reorder the children so that they display in the order you want. To do this, navigate to the compound object parent, click the "Manage" tab, then click “Compound”. The compound object form will now include a block allowing you to remove child objects, and a link to reorder the child objects.

Click the “REORDER” link and a list of child objects will display.

Drag the crossed arrows in front of the object titles to put them in the order you want, then click “Save Changes”.

Inactive Queue

At this time, it is not recommended to use the "inactive" state to suppress a collection from public view, as this method also suppresses objects from staff view.  Instead, best practice is to use Object Policies to suppress objects from public view and retrieval.

An Inactive object is removed from the search indexes and will not appear to any user in browses or in search results. Objects in this state must be retrieved from the Inactive Queue. However, any user can still access the object directly by PID.

NOTE: The Inactive Queue function (Simple Workflow Module) has not been enabled in production. Currently, an Inactive Queue is shared by all sites, so institutional sites do not have their own Inactive Queue. 

Operators can have roles (sets of permissions) that cause any objects ingested by that operator to be created with “Inactive” state by default. This feature exists so lower-level staff like student assistants can have their work reviewed by supervisors before it displays to the public.

Content objects can be rendered “inactive” directly, by clicking "Manage", then clicking "Properties" and changing the object status to “inactive”, or indirectly using the workflow module. Collection objects can be made inactive only by changing the object status to “inactive”.

The “inactive queue” allows reviewers to find these suppressed (inactive) items more easily. Staff with the authority to review the inactive queue will see a link on the left-hand sidebar to “Manage inactive objects”. Clicking the link provides a view of the inactive queue.

At this time, only the title of inactive objects displays in the inactive queue. (A suggested enhancement will expand the data available). NOTE: both content objects and collection objects can be inactive, and both will display by title (label) in the inactive queue.

Click the title to display the object. Click “Manage” to display the Properties page for the object. The state can be changed to “Active” from the Properties page.

Adding, Replacing and Deleting Datastreams

Normally datastreams are added to an object at the time of ingest, and you manage them when necessary through other functions. However, occasionally you might want to add, replace or delete a datastream. For example, say a PDF object has been created and the full text of the PDF was extracted by the system and stored as the FULL_TEXT datastream. The extracted text is not very good and you want to improve it. To do this you would have to:

a) download the FULL_TEXT datastream and edit the text
b) delete the existing FULL_TEXT datastream
c) import the edited text as a new FULL_TEXT datastream.

These can all be done from the Datastreams screen. Navigate to the object in question and click the "Manage" tab. Click “Datastreams”.  The screen will list all datastreams for the object. To download a datastream, click the “download” link next to the name of that datastream. To delete a datastream, click the “delete” link. To upload a datastream, click “+ Add a datastream” (and be careful to know your "Datastream ID" since this is a controlled field that determines how the datastream will be used).

Add a Datastream: Provide the name of the Datastream as defined by the content model. If you are replacing a datastream named “FULL_TEXT” the name you provide here must match that exactly.

Datastream Label: This is the label that displays after the Datastream name whenever datastreams are listed. This can be anything you want but it is best to use the same label as the datastream you are replacing.

Upload Document: This allows you to find and upload the new datastream. Click “Add Datastream”.

Running OCR on Book and Newspaper pages

  1. Click on the Book or Newspaper that you want to run OCR on.
  2. Click on “Pages” just below “View All Items” on the left side of the screen.
  3. Click on one of the pages in the window that appears.
  4. Click on the “Manage” button just below the words “ADVANCED SEARCH”. Click on “Datastreams” and check whether the letters OCR appear under the “ID” column. If not, click on the “Page” button in the same area where you clicked the “Datastreams” button.
  5. Click on “Perform OCR” on the left-hand side of the window.
  6. Click on the “Perform OCR” button a little below “Language”.
  7. When it is done, a “Successfully performed OCR.” message appears next to a check mark in the window.
  8. Click the black circle with a white X on the right side of the pop-up window to close it.

How to create a PDF from a Book

  1. Go to the book for which you want to create a PDF of all the pages, and click on the book.
  2. Click on the “Pages” button in mid screen.
  3. Click on the “Manage” button.
  4. Click on the “Book” button on the upper right of the screen.
  5. Click the “DPI – Dots Per Inch” dropdown box and select 300.
  6. Click the “Create PDF” button.
  7. You will see a status bar.
  8. When it finishes, you will be back at the window with the “Create PDF” button and a “Created PDF with .. pages.” message. Click on the “Datastreams” button.
  9. At the bottom of the window that pops up, you should see “PDF”. On the PDF line, click on “download” to save the PDF with the pages of the book to your computer.
  10. Select “Save File” and click the “OK” button.
  11. Close the window. You should now have the PDF of the Book pages saved to your computer. The PDF will just be called PDF.pdf. Rename PDF.pdf to the name of the book.