GEDCOM Import System Manual
System Lead: Core Team
Related Documents: System Architecture, Database System
1. System Overview
The GEDCOM Import System is the core data-entry mechanism for LineagePress. It is designed to parse standard GEDCOM files and populate the plugin’s database tables with genealogical data.
The system can handle very large files (1GB+) on standard web hosts without timing out, by processing the file in manageable “chunks” via AJAX.
The import is orchestrated by two main controllers: one handles setup, and the other drives the chunked processing and delegates parsing to a handler and its processors.
Key Responsibilities
- File Upload & Configuration: Provides a user interface for uploading a GEDCOM file and selecting import settings.
- Chunked Processing: Reads the GEDCOM file in small segments to avoid server timeouts and memory limits.
- Data Parsing & Processing: Interprets GEDCOM tags and maps the data to the corresponding database tables.
- Data Migration: Handles conversions from older GEDCOM versions (e.g., 5.5.1) to a modern standard where necessary.
- Validation: Performs checks on the data to ensure integrity.
- Variable Import Modes: Allows users to replace, update, or add to existing data.
2. Architectural Flow
The import process is a hand-off between two controllers, separating the initial setup from the actual processing. This keeps the long-running work on its own dedicated page, apart from the upload form.
Stage 1: The Setup (HP_Import_Controller)
- Entry Point: The user submits the main import form from the `lineagepress-import-export` admin page.
- Controller: `\LineagePress\Admin\Controllers\HP_Import_Controller` handles this submission.
- Actions:
  - Validates the request and handles the file upload, moving the GEDCOM file to a temporary location: `wp-content/uploads/heritagepress/imports/`.
  - Gathers all selected import options (Tree ID, import mode, event handling, etc.) from `$_POST`.
  - Packages these parameters into a URL.
  - Performs a redirect to the chunked import page: `admin.php?page=hp-chunked-import`, passing parameters in the URL.
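The packaging and redirect steps can be illustrated with a short sketch. It is written in Python purely as a language-neutral illustration (the plugin itself is PHP), and the option names `tree_id`, `import_mode`, and `file` are hypothetical. Only the `page=hp-chunked-import` target comes from the description above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_chunked_import_url(options: dict) -> str:
    """Package import options into the redirect URL for the chunked import page."""
    # 'page' goes first; the remaining keys are the gathered form options.
    params = {"page": "hp-chunked-import", **options}
    return "admin.php?" + urlencode(params)

# Hypothetical option names; the real form fields may differ.
url = build_chunked_import_url(
    {"tree_id": "smith", "import_mode": "match", "file": "smith.ged"}
)

# Stage 2 would read the same values back from the query string.
# keep_blank_values matters because the "Add Only" mode is an empty string.
received = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

One consequence of passing settings in the URL is that every value must survive a round trip through the query string, including the empty import mode.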
Stage 2: The Workhorse (HP_Chunked_Import_Controller)
- Entry Point: The browser is redirected to the `hp-chunked-import` page.
- Controller: `\LineagePress\Admin\Controllers\HP_Chunked_Import_Controller`.
- Actions:
  - Reads all import parameters from the URL (`$_GET`).
  - Renders a user interface with a progress bar, statistics, and control buttons (Start, Pause, Stop).
  - Localizes import parameters for the frontend JavaScript (`chunked-import.js`).
- AJAX Processing:
  - Frontend JavaScript makes AJAX calls to `hp_process_chunk`.
  - Each call is handled by `ajax_process_chunk()` in `HP_Chunked_Import_Controller`.
  - That method delegates the work to `\LineagePress\Import\Handlers\HP_Chunked_Import_Handler`.
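The request loop can be pictured with a small simulation. This is a minimal sketch in Python rather than the plugin's actual JavaScript and PHP, and the response fields `offset`, `processed`, and `done` are assumptions. The real `chunked-import.js` and `ajax_process_chunk()` may exchange different data.

```python
def process_chunk(lines, offset, chunk_size):
    """Stand-in for the server side: handle one chunk and report progress."""
    chunk = lines[offset:offset + chunk_size]
    new_offset = offset + len(chunk)
    return {
        "offset": new_offset,            # where the next request should resume
        "processed": len(chunk),         # lines handled in this request
        "done": new_offset >= len(lines),
    }

def run_import(lines, chunk_size=3):
    """Stand-in for the frontend loop: one short request per chunk until done."""
    offset, requests, total = 0, 0, 0
    while True:
        response = process_chunk(lines, offset, chunk_size)  # one AJAX call
        requests += 1
        total += response["processed"]
        offset = response["offset"]
        if response["done"]:
            return {"requests": requests, "lines": total}

summary = run_import([f"line {i}" for i in range(10)], chunk_size=3)
```

Because each request handles only a small slice, no single request needs to run long enough to hit a host's execution time limit. In a design like this, a Pause button could simply stop issuing the next call and keep the last returned offset.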
Stage 3: The Core Logic (HP_Chunked_Import_Handler and Processors)
- Handler: `\LineagePress\Import\Handlers\HP_Chunked_Import_Handler` reads a chunk of the GEDCOM file.
- Processors: The handler iterates through the lines in the chunk, using Processor classes in `includes/Import/Processors/` (e.g., `Simple_Individual_Processor`, `Simple_Family_Processor`) to handle GEDCOM records (`INDI`, `FAM`).
- Database: Each processor transforms data for the database and writes it using Database System classes, respecting the selected import mode.
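One way to read a chunk without splitting a record is to resume from a byte offset and stop just before the next level-0 line. The sketch below shows that idea in Python. The function name `read_chunk`, the record-count limit, and the UTF-8 decoding are all assumptions, not the handler's confirmed behavior.

```python
import os
import tempfile

def read_chunk(path, offset, max_records):
    """Read up to max_records level-0 records starting at a byte offset.

    Returns (records, next_offset); each record is a list of GEDCOM lines.
    """
    records = []
    with open(path, "rb") as fh:          # binary mode keeps tell() exact
        fh.seek(offset)
        while True:
            line_start = fh.tell()
            raw = fh.readline()
            if not raw:                   # end of file
                return records, line_start
            line = raw.decode("utf-8").strip()  # assumes a UTF-8 file
            if not line:
                continue
            if line.startswith("0 "):
                if len(records) == max_records:
                    # Stop before this record so the next call starts on it.
                    return records, line_start
                records.append([line])
            elif records:
                records[-1].append(line)

sample = (
    "0 @I1@ INDI\n1 NAME John /Smith/\n"
    "0 @I2@ INDI\n1 NAME Mary /Jones/\n"
    "0 @F1@ FAM\n1 HUSB @I1@\n1 WIFE @I2@\n"
    "0 TRLR\n"
)
with tempfile.NamedTemporaryFile("wb", suffix=".ged", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))

first, offset = read_chunk(tmp.name, 0, 2)        # the two INDI records
second, end_offset = read_chunk(tmp.name, offset, 2)  # FAM and TRLR
os.remove(tmp.name)
```

Returning the next offset with each chunk lets the caller hand it back on the following request, so no state needs to stay in memory between requests.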
Architecture Diagram:
[User Submits Form]
|
v
[HP_Import_Controller] -> (Handles upload, gathers settings)
|
v
(Redirect)
|
v
[HP_Chunked_Import_Controller] -> (Renders UI, handles AJAX)
|
| (AJAX call)
v
[HP_Chunked_Import_Handler] -> (Reads file chunk)
|
v
[Processors (Individual, Family, etc.)] -> (Parses records)
|
v
[Database Classes] -> (Writes to DB)
3. Import Modes
The system supports three distinct import modes:
| Mode | Form Value | SQL Strategy | Behavior |
|---|---|---|---|
| Replace Tree | yes | REPLACE INTO | Deletes all data for the selected tree first, then imports all records as new. Overwrites manual edits. |
| Update Tree | match | INSERT ... ON DUPLICATE KEY UPDATE with COALESCE | Adds new records and fills empty fields on existing records. Preserves manual edits. |
| Add Only | '' (empty) | INSERT IGNORE | Only adds records that do not already exist. Never modifies existing records. |
The selected import mode is passed along the whole chain (form, URL, AJAX call, handler), and each processor checks it to choose its SQL strategy.
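The mapping from form value to SQL strategy can be sketched as follows. This is an illustration in Python, not the plugin's PHP code. The table name `people`, the column names, and the `%s` placeholders are hypothetical. The sketch also assumes that "empty" means `NULL`, since `COALESCE` only falls through on `NULL` and would leave an empty string in place.

```python
def build_statement(mode, table, columns, key_columns):
    """Return the SQL template a processor might use for a given form value."""
    cols = ", ".join(columns)
    marks = ", ".join(["%s"] * len(columns))
    insert = f"INTO {table} ({cols}) VALUES ({marks})"
    if mode == "yes":       # Replace Tree
        return f"REPLACE {insert}"
    if mode == "match":     # Update Tree: keep existing non-NULL values
        updates = ", ".join(
            f"{c} = COALESCE({c}, VALUES({c}))"
            for c in columns
            if c not in key_columns
        )
        return f"INSERT {insert} ON DUPLICATE KEY UPDATE {updates}"
    if mode == "":          # Add Only
        return f"INSERT IGNORE {insert}"
    raise ValueError(f"unknown import mode: {mode!r}")

# Hypothetical table and columns, for illustration only.
columns = ["person_id", "name", "sex"]
update_sql = build_statement("match", "people", columns, {"person_id"})
```

In the `match` branch, `COALESCE(name, VALUES(name))` evaluates to the existing value when it is not `NULL`, which is what preserves manual edits. All three strategies rely on a unique or primary key to detect that a record already exists.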
4. Key Classes and Directories
- `admin/controllers/class-hp-import-controller.php` – Handles the initial form and redirects.
- `admin/controllers/class-hp-chunked-import-controller.php` – Manages AJAX-based chunked processing.
- `includes/Import/Handlers/class-hp-chunked-import-handler.php` – Core logic for reading chunks and delegating.
- `includes/Import/Processors/` – Classes for each GEDCOM record (e.g., `INDI`, `FAM`, `SOUR`).
- `includes/Import/Migration/` – Converts data from older GEDCOM versions.
- `includes/Import/Validation/` – Validates GEDCOM file structure and data.
- `public/js/chunked-import.js` – Frontend JS for AJAX and progress bar.
5. How to Add a New GEDCOM Tag Processor
- Identify Record Type: Determine which record the tag belongs to (e.g., `INDI`).
- Locate Processor: Open the relevant processor (e.g., `class-simple-individual-processor.php`).
- Add Logic: Add a case/condition in the main processing loop for the new tag.
- Process & Save: Parse the tag data and save via Database System classes. Respect `$this->import_mode`.
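The steps above can be sketched with a toy processor. The sketch is in Python for illustration only; the real processors are PHP classes, and their loop structure may differ. The class name, the `row` field names, and the choice of `OCCU` as the newly added tag are all hypothetical.

```python
def parse_line(line):
    """Split a GEDCOM line into (level, tag, value)."""
    parts = line.split(" ", 2)
    level, tag = int(parts[0]), parts[1]
    value = parts[2] if len(parts) > 2 else ""
    return level, tag, value

class IndividualProcessor:
    """Toy stand-in for a processor such as Simple_Individual_Processor."""

    def __init__(self, import_mode):
        self.import_mode = import_mode    # checked when the row is saved

    def process(self, record_lines):
        row = {}
        for line in record_lines[1:]:     # skip the "0 @I1@ INDI" header
            level, tag, value = parse_line(line)
            if level != 1:
                continue                  # sub-tags are omitted in this sketch
            if tag == "NAME":
                row["name"] = value.replace("/", "").strip()
            elif tag == "SEX":
                row["sex"] = value
            elif tag == "OCCU":           # the newly added tag
                row["occupation"] = value
        # A real processor would now save `row` through the Database System
        # classes, choosing its SQL strategy from self.import_mode.
        return row

processor = IndividualProcessor(import_mode="match")
row = processor.process(
    ["0 @I1@ INDI", "1 NAME John /Smith/", "1 SEX M", "1 OCCU Blacksmith"]
)
```

Adding a tag in this style touches only one branch of the loop, which keeps the change local to the processor that owns the record type.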
