Data Formats
This document describes the data models and formats used by almanac, which are built using Pydantic for type validation and structured data handling.
Data Model Architecture
almanac uses three main Pydantic data models to represent SDSS survey data:
Exposure: Metadata for individual exposures/observations
FPSTarget: Target information for SDSS-V Fiber Positioning System observations
PlateTarget: Target information for SDSS-IV plate-based observations
These models provide type validation, automatic conversion, and structured access to survey data.
Exposure Model
The Exposure model (almanac.data_models.Exposure) represents the metadata for a single astronomical exposure.
Basic Information

| Field | Type | Description |
|---|---|---|
| | | Observatory name (“apo” or “lco”) |
| | | Modified Julian Date |
| | | Exposure number (≥ 1) |
| | | Raw exposure basename prefix (“apR”, “asR”) |

Exposure Metadata

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Plugged plate name or configuration identifier |
| | | | Number of detector reads (≥ 0) |
| | | | Image type (see ImageType literals) |
| | | | Observer comments |

Survey Identifiers

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Map identifier |
| | | | Cartridge ID |
| | | | Plate ID (SDSS-IV) |
| | | | Field identifier |
| | | | Design identifier |
| | | | Configuration ID (SDSS-V) |

Observing Conditions

| Field | Type | Description |
|---|---|---|
| | | Seeing in arcseconds |

Instrument State

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Focus position |
| | | | Collimator piston |
| | | | Collimator pitch |
| | | | Dither offset in pixels |
| | | | Quartz lamp state (-1, 0, 1) |
| | | | ThAr lamp state (-1, 0, 1) |
| | | | UNe lamp state (-1, 0, 1) |

Computed Properties

| Property | Type | Description |
|---|---|---|
| | | Formatted exposure string for file paths |
| | | Whether this is from the FPS era |
| | | Whether exposure is flagged as bad |
| | | Bitmask indicating available chips |

Type Definitions
Observatory: Literal["apo", "lco"]
Prefix: Literal["apR", "asR"]
ImageType: Literal["blackbody", "dark", "object", "domeflat", "arclamp", "twilightflat", "internalflat", "quartzflat", "missing"]
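As a rough illustration of how Literal aliases like these constrain values, the plain-Python sketch below checks membership with typing.get_args. The helper is_member is a hypothetical name introduced here and is not part of almanac; the models themselves rely on Pydantic to enforce these types.

```python
from typing import Literal, get_args

# Aliases copied from the type definitions above.
Observatory = Literal["apo", "lco"]
Prefix = Literal["apR", "asR"]


def is_member(value: str, literal_type) -> bool:
    """Return True if value is one of the strings allowed by a Literal alias.

    Hypothetical helper for illustration only.
    """
    return value in get_args(literal_type)


print(is_member("apo", Observatory))   # True
print(is_member("kpno", Observatory))  # False
```

The comparison is case-sensitive, so "APO" would be rejected unless it is lowercased first.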
FPSTarget Model
The FPSTarget model (almanac.data_models.FPSTarget) represents targets observed with the SDSS-V Fiber Positioning System.
Target Identification

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | SDSS unique identifier |
| | | | Catalog identifier |
| | | | 2MASS designation |
| | | | Target category |
| | | | Observing cadence |
| | | | Primary carton name |
| | | | Survey program |

Positioner Information

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Positioner ID |
| | | | Hole identifier |
| | | | Hole type |
| | | | Planned hole type |
| | | | Fiber type |
| | | | Target assigned to fiber |

Status Flags

| Field | Type | Description |
|---|---|---|
| | | Fiber positioned on target |
| | | Fiber is disabled |
| | | Valid coordinate conversion |
| | | Positioner was decollided |

Coordinates

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Right Ascension (degrees) |
| | | | Declination (degrees) |
| | | | Observed altitude (degrees) |
| | | | Observed azimuth (degrees) |

Position Coordinates

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | X coordinate in wok frame |
| | | | Y coordinate in wok frame |
| | | | Z coordinate in wok frame |
| | | | X coordinate in focal plane |
| | | | Y coordinate in focal plane |

Positioner Angles

| Field | Type | Description |
|---|---|---|
| | | Alpha angle of positioner arm |
| | | Beta angle of positioner arm |

Wavelength Information

| Field | Type | Description |
|---|---|---|
| | | Design wavelength |
| | | Effective wavelength |
| | | Coordinate epoch |

Instrument Identifiers

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Spectrograph ID |
| | | | Fiber ID |

Position Offsets

| Field | Type | Description |
|---|---|---|
| | | RA offset applied to fiber |
| | | Dec offset applied to fiber |

Target of Opportunity

| Field | Type | Description |
|---|---|---|
| | | Target of opportunity flag |
| | | TOO identifier |
| | | TOO program name |

PlateTarget Model
The PlateTarget model (almanac.data_models.PlateTarget) represents targets observed with SDSS-IV plates.
Target Identification

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | SDSS unique identifier |
| | | | Catalog identifier |
| | | | Computed 2MASS designation |
| | | | 2MASS ID |
| | | | Target ID string |
| | | | Target category |

Survey Information

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Observatory for mapping fixes |
| | | | Hole type |
| | | | Planned hole type |
| | | | Object type |
| | | | Assignment flag |

Status Flags

| Field | Type | Description |
|---|---|---|
| | | Fiber conflict flag |
| | | Ran out of targets flag |
| | | Outside survey area flag |

Coordinates

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Right Ascension (degrees) |
| | | | Declination (degrees) |
| | | | X focal plane coordinate |
| | | | Y focal plane coordinate |
| | | | Default X focal coordinate |
| | | | Default Y focal coordinate |

Wavelength Information

| Field | Type | Description |
|---|---|---|
| | | Effective wavelength |
| | | Z offset |

Instrument Identifiers

| Field | Type | Alias | Description |
|---|---|---|---|
| | | | Spectrograph ID |
| | | | Fiber ID |
| | | | Planned fiber ID |
| | | | Throughput value |

Plate-Specific Fields

| Field | Type | Description |
|---|---|---|
| | | Plate input ID |
| | | Pointing number |
| | | Offset value |
| | | Block number |
| | | Guide flag |
| | | Blue fiber flag |
| | | Chunk number |
| | | Final flag |
| | | MJD when plate was plugged |
| | | Fiber mapping fix indicator |

Physical Properties

| Field | Type | Description |
|---|---|---|
| | | Target diameter |
| | | Buffer size |
| | | Target priority |

Type Definitions
Category Types
Category: Literal["", "science", "sky_apogee", "sky_boss", "standard_apogee", "standard_boss", "open_fiber"]
Hole Types
HoleType: Literal["object", "coherent_sky", "guide", "light_trap", "alignment", "quality", "manga", "manga_single", "manga_alignment", "acquisition_center", "acquisition_offaxis", "apogee", "center", "trap", "boss", "apogee_shared", "apogee_south", "bosshalf", "boss_shared", "fps"]
Object Types
ObjType: Literal["galaxy", "qso", "star_bhb", "star_carbon", "star_brown_dwarf", "star_sub_dwarf", "star_caty_var", "star_red_dwarf", "star_white_dwarf", "redden_std", "spectrophoto_std", "hot_std", "rosat_a", "rosat_b", "rosat_c", "rosat_d", "serendipity_blue", "serendipity_first", "serendipity_red", "serendipity_distant", "serendipity_manual", "qa", "sky", "na"]
Data Validation and Processing
Automatic Data Cleaning
The Pydantic models include automatic validation and cleaning:
Empty strings to integers: Convert empty strings to default integer values
Lamp states: Convert ‘F’/‘T’ strings to 0/1 integers
Float validation: Handle invalid values by setting to NaN
Case normalization: Automatic lowercase conversion for categorical fields
Comment sanitization: Clean observer comments and handle None values
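The cleaning rules above could be sketched in plain Python as follows. These helpers are hypothetical and are not almanac's validators. In particular, the default of -1 for empty integers and the whitespace-collapsing comment cleanup are assumptions made for illustration.

```python
import math
from typing import Optional


def empty_to_int(value, default: int = -1) -> int:
    """Map empty or whitespace-only strings to a default integer (assumed -1)."""
    if isinstance(value, str) and value.strip() == "":
        return default
    return int(value)


def lamp_state(value) -> int:
    """Map 'F'/'T' strings to 0/1; other values are converted with int()."""
    if isinstance(value, str):
        flag = value.strip().upper()
        if flag in ("T", "F"):
            return 1 if flag == "T" else 0
    return int(value)


def safe_float(value) -> float:
    """Return float(value), or NaN when the value cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def normalize_category(value: str) -> str:
    """Lowercase a categorical string and trim surrounding whitespace."""
    return value.strip().lower()


def clean_comment(value: Optional[str]) -> str:
    """Turn None into an empty string and collapse runs of whitespace."""
    if value is None:
        return ""
    return " ".join(value.split())
```

In a Pydantic model, functions of this kind would typically run as field validators before type checking, so that raw header values are normalized before they are stored.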
Model Validators
Exposure Model
Validates prefixes based on observatory
Sanitizes observer comments
Converts lamp states from boolean strings
Automatically detects twilight flats from comments
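A minimal sketch of two of the checks listed above, assuming that the "apR" prefix belongs to "apo" and "asR" to "lco", and that a twilight flat is recognized by the word "twilight" in the observer comment. Both rules and both function names are assumptions for illustration; the actual validator logic in almanac may differ.

```python
# Assumed pairing of observatory and raw-file prefix.
EXPECTED_PREFIX = {"apo": "apR", "lco": "asR"}


def check_prefix(observatory: str, prefix: str) -> str:
    """Raise ValueError if the prefix does not match the observatory."""
    expected = EXPECTED_PREFIX.get(observatory)
    if expected is None:
        raise ValueError(f"unknown observatory: {observatory!r}")
    if prefix != expected:
        raise ValueError(
            f"prefix {prefix!r} does not match observatory {observatory!r}"
        )
    return prefix


def infer_image_type(image_type: str, comment: str) -> str:
    """Relabel an exposure as a twilight flat when the comment says so.

    The keyword match is an assumption, not almanac's documented rule.
    """
    if "twilight" in comment.lower():
        return "twilightflat"
    return image_type
```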
FPSTarget Model
Fixes early fiber mapping duplicates for spectrograph 2
Standardizes 2MASS designations
PlateTarget Model
Applies historical fiber mapping fixes for specific MJD ranges
Translates plate-era categories to FPS-era equivalents
Handles observatory-specific fiber corrections
Computed Properties
Models include computed properties that provide additional derived information:
Exposure strings: Formatted identifiers for file paths
Era detection: Automatic FPS/plate era classification
Quality flags: Bad exposure detection
2MASS designations: Standardized target identifiers
HDF5 Output Structure
When saving data, almanac organizes the Pydantic model data into HDF5 format:

    filename.h5
    ├── {observatory}/              # Observatory (apo/lco)
    │   └── {mjd}/                  # Modified Julian Date
    │       ├── exposures           # Exposure model data
    │       └── targets/            # Target model data
    │           ├── fps/            # FPSTarget data by config_id
    │           └── plates/         # PlateTarget data by plate_id

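To make the layout concrete, the sketch below builds the group paths implied by the tree. The function names are hypothetical, and it is an assumption that each config_id or plate_id appears as its own entry directly under fps/ or plates/.

```python
def exposures_path(observatory: str, mjd: int) -> str:
    """Path to the exposure table for one observatory and night."""
    return f"{observatory}/{mjd}/exposures"


def targets_path(observatory: str, mjd: int, fps: bool, identifier: int) -> str:
    """Path to the target table for one configuration or plate.

    identifier is a config_id when fps is True, otherwise a plate_id.
    """
    kind = "fps" if fps else "plates"
    return f"{observatory}/{mjd}/targets/{kind}/{identifier}"


print(exposures_path("apo", 59000))
# apo/59000/exposures
print(targets_path("lco", 59000, fps=True, identifier=1234))
# lco/59000/targets/fps/1234
```

Paths of this form can be used to index an open file in an HDF5 reader such as h5py, for example `h5py.File("filename.h5", "r")[path]`, provided the file follows the layout shown.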
Data Access
Python Example:

    from almanac.data_models import Exposure, FPSTarget, PlateTarget

    # Create exposure from file headers
    exposure = Exposure.from_path("/path/to/exposure/file")

    # Access computed properties
    print(f"FPS era: {exposure.fps}")
    print(f"Exposure string: {exposure.exposure_string}")

    # Access targets (automatically loaded)
    for target in exposure.targets:
        if isinstance(target, FPSTarget):
            print(f"SDSS ID: {target.sdss_id}, Fiber: {target.fiber_id}")
        elif isinstance(target, PlateTarget):
            print(f"SDSS ID: {target.sdss_id}, Fiber: {target.fiber_id}")

Data Quality and Validation
Missing Value Handling
Numeric fields: Default to appropriate values (NaN for floats, -1 for IDs)
String fields: Default to empty strings
Boolean fields: Default to False
Optional fields: Can be None
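The defaults listed above can be pictured with a small standard-library dataclass. The class and its field names are placeholders invented for this sketch and do not correspond to actual almanac fields; the real models are Pydantic classes.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleRecord:
    """Hypothetical record showing one default per kind of field."""

    example_float: float = math.nan        # numeric float: NaN when missing
    example_id: int = -1                   # identifier: -1 when missing
    example_label: str = ""                # string: empty when missing
    example_flag: bool = False             # boolean: False when missing
    example_optional: Optional[str] = None # optional: may stay None


record = ExampleRecord()
print(math.isnan(record.example_float))  # True
print(record.example_id)                 # -1
```

Because NaN never compares equal to itself, missing floats should be tested with math.isnan rather than with ==.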
Type Safety
Pydantic ensures type safety throughout the data pipeline:
Automatic type conversion where possible
Validation errors for incompatible data
Consistent data structures across the application
Quality Assurance
Bad exposure flagging: Automatic detection of known problematic exposures
Fiber mapping corrections: Historical fixes for known fiber mapping errors
Coordinate validation: Ensures valid sky coordinates
Identifier consistency: Validates survey identifiers and relationships