Database scheme
Note : all dates and duration are stored in an integer manner (number of seconds since the EPOCH).
accounting
Fields |
Types |
Descriptions |
---|---|---|
window_start |
INT UNSIGNED |
start date of the accounting interval |
window_stop |
INT UNSIGNED |
stop date of the accounting interval |
accounting_user |
VARCHAR(20) |
user name |
accounting_project |
VARCHAR(255) |
name of the related project |
queue_name |
VARCHAR(100) |
queue name |
consumption_type |
ENUM(“ASKED”, “USED”) |
“ASKED” corresponds to the walltimes specified by the user. “USED” corresponds to the effective time used by the user. |
consumption |
INT UNSIGNED |
number of seconds used |
- Primary key
window_start, window_stop, accounting_user, queue_name, accounting_project, consumption_type
- Index fields
window_start, window_stop, accounting_user, queue_name, accounting_project, consumption_type
This table is a summary of the consumption for each user on each queue. This increases the speed of queries about user consumptions and statistic generation.
Data are inserted through the command oaraccounting (when a job is treated the field accounted in table jobs is passed into “YES”). So it is possible to regenerate this table completely in this way :
Delete all data of the table:
DELETE FROM accounting;Set the field accounted in the table jobs to “NO” for each row:
UPDATE jobs SET accounted = "NO";Run the oaraccounting command.
You can change the amount of time for each window : edit the oar configuration file and change the value of the tag ACCOUNTING_WINDOW.
schema
Fields |
Types |
Descriptions |
---|---|---|
version |
VARCHAR(255) |
database schema version number |
name |
VARCHAR(255) |
optional name |
This table is used to store the version of the database schema.
So the oar-database command be used to automatically upgrade the schema from any version with:
oar-database --setup
admission_rules
Fields |
Types |
Descriptions |
---|---|---|
id |
INT UNSIGNED |
id number |
rule |
TEXT |
rule written in Perl applied when a job is going to be registered |
- Primary key
id
- Index fields
None
You can use these rules to change some values of some properties when a job is submitted. So each admission rule is executed in the order of the id field and it can set several variables. If one of them exits then the others will not be evaluated and oarsub returns an error.
The rules can be added with the following command:
oaradmissionrules -n
Some examples are better than a long description:
Specify the default value for queue parameter
if (not defined($queue_name)) { $queue_name="default"; }Avoid users except oar to go in the admin queue
if (($queue_name eq "admin") && ($user ne "oar")) { die("[ADMISSION RULE] Only oar user can submit jobs in the admin queue\n"); }Restrict the maximum of the walltime for interactive jobs
my $max_walltime = OAR::IO::sql_to_duration("12:00:00"); if ($jobType eq "INTERACTIVE"){ foreach my $mold (@{$ref_resource_list}){ if ( (defined($mold->[1])) and ($max_walltime < $mold->[1]) ){ print("[ADMISSION RULE] Walltime to big for an INTERACTIVE job so it is set to $max_walltime.\n"); $mold->[1] = $max_walltime; } } }Specify the default walltime
my $default_wall = OAR::IO::sql_to_duration("2:00:00"); foreach my $mold (@{$ref_resource_list}){ if (!defined($mold->[1])){ print("[ADMISSION RULE] Set default walltime to $default_wall.\n"); $mold->[1] = $default_wall; } }How to perform actions if the user name is in a file
open(FILE, "/tmp/users.txt"); while (($queue_name ne "admin") and ($_ = <FILE>)){ if ($_ =~ m/^\\s*$user\\s*$/m){ print("[ADMISSION RULE] Change assigned queue into admin\n"); $queue_name = "admin"; } } close(FILE);How to automatically add a job type depending of the walltime and an estimation of the number of resources of the job
foreach my $e (estimate_job_nb_resources($dbh_ro, $ref_resource_list, $jobproperties)){ #print("AREA: $e->{nbresources} x $e->{walltime} = ".$e->{nbresources} * $e->{walltime}."\n"); if ($e->{nbresources} * $e->{walltime} > 24*3600*1){ print("[ADMISSION RULE] Your job is of the 'big' type\n"); push(@{$type_list},"big"); last; } }
You can print all the admission rules with:
oaradmissionrules -S -f
event_logs
Fields |
Types |
Descriptions |
---|---|---|
event_id |
INT UNSIGNED |
event identifier |
type |
VARCHAR(50) |
event type |
job_id |
INT UNSIGNED |
job related of the event |
date |
INT UNSIGNED |
event date |
description |
VARCHAR(255) |
textual description of the event |
to_check |
ENUM(‘YES’, ‘NO’) |
specify if the module NodeChangeState must check this event to Suspect or not some nodes |
- Primary key
event_id
- Index fields
type, to_check
The different event types are:
“PING_CHECKER_NODE_SUSPECTED” : the system detected via the module “finaud” that a node is not responding.
“PROLOGUE_ERROR” : an error occurred during the execution of the job prologue (exit code != 0).
“EPILOGUE_ERROR” : an error occurred during the execution of the job epilogue (exit code != 0).
“CANNOT_CREATE_TMP_DIRECTORY” : OAR cannot create the directory where all information files will be stored.
“CAN_NOT_WRITE_NODE_FILE” : the system was not able to write file which had to contain the node list on the first node (/tmp/OAR_job_id).
“CAN_NOT_WRITE_PID_FILE” : the system was not able to write the file which had to contain the pid of oarexec process on the first node (/tmp/pid_of_oarexec_for_job_id).
“USER_SHELL” : the system was not able to get informations about the user shell on the first node.
“EXIT_VALUE_OAREXEC” : the oarexec process terminated with an unknown exit code.
“SEND_KILL_JOB” : signal that OAR has transmitted a kill signal to the oarexec of the specified job.
“LEON_KILL_BIPBIP_TIMEOUT” : Leon module has detected that something wrong occurred during the kill of a job and so kill the local bipbip process.
“EXTERMINATE_JOB” : Leon module has detected that something wrong occurred during the kill of a job and so clean the database and terminate the job artificially.
“WORKING_DIRECTORY” : the directory from which the job was submitted does not exist on the node assigned by the system.
“OUTPUT_FILES” : OAR cannot write the output files (stdout and stderr) in the working directory.
“CANNOT_NOTIFY_OARSUB” : OAR cannot notify the oarsub process for an interactive job (maybe the user has killed this process).
“WALLTIME” : the job has reached its walltime.
“SCHEDULER_REDUCE_NB_NODES_FOR_RESERVATION” : this means that there is not enough nodes for the reservation and so the scheduler do the best and gives less nodes than the user wanted (this occurres when nodes become Suspected or Absent).
“BESTEFFORT_KILL” : the job is of the type besteffort and was killed because a normal job wanted the nodes.
“FRAG_JOB_REQUEST” : someone wants to delete a job.
“CHECKPOINT” : the checkpoint signal was sent to the job.
“CHECKPOINT_ERROR” : OAR cannot send the signal to the job.
“CHECKPOINT_SUCCESS” : system has sent the signal correctly.
“SERVER_EPILOGUE_TIMEOUT” : epilogue server script has time outed.
“SERVER_EPILOGUE_EXIT_CODE_ERROR” : epilogue server script did not return 0.
“SERVER_EPILOGUE_ERROR” : cannot find epilogue server script file.
“SERVER_PROLOGUE_TIMEOUT” : prologue server script has time outed.
“SERVER_PROLOGUE_EXIT_CODE_ERROR” : prologue server script did not return 0.
“SERVER_PROLOGUE_ERROR” : cannot find prologue server script file.
“CPUSET_CLEAN_ERROR” : OAR cannot clean correctly cpuset files for a job on the remote node.
“MAIL_NOTIFICATION_ERROR” : a mail cannot be sent.
“USER_MAIL_NOTIFICATION” : user mail notification cannot be performed.
“USER_EXEC_NOTIFICATION_ERROR” : user script execution notification cannot be performed.
“BIPBIP_BAD_JOBID” : error when retrieving informations about a running job.
“BIPBIP_CHALLENGE” : OAR is configured to detach jobs when they are launched on compute nodes and the job return a bad challenge number.
“RESUBMIT_JOB_AUTOMATICALLY” : the job was automatically resubmitted.
“WALLTIME” : the job reached its walltime.
“REDUCE_RESERVATION_WALLTIME” : the reservation job was shrunk.
“SSH_TRANSFER_TIMEOUT” : node OAR part script was too long to transfer.
“BAD_HASHTABLE_DUMP” : OAR transfered a bad hashtable.
“LAUNCHING_OAREXEC_TIMEOUT” : oarexec was too long to initialize itself.
“RESERVATION_NO_NODE” : All nodes were detected as bad for the reservation job.
event_log_hostnames
Fields |
Types |
Descriptions |
---|---|---|
event_id |
INT UNSIGNED |
event identifier |
hostname |
VARCHAR(255) |
name of the node where the event has occured |
- Primary key
event_id
- Index fields
hostname
This table stores hostnames related to events like “PING_CHECKER_NODE_SUSPECTED”.
files
Fields |
Types |
Descriptions |
---|---|---|
idFile |
INT UNSIGNED |
|
md5sum |
VARCHAR(255) |
|
location |
VARCHAR(255) |
|
method |
VARCHAR(255) |
|
compression |
VARCHAR(255) |
|
size |
INT UNSIGNED |
- Primary key
idFile
- Index fields
md5sum
frag_jobs
Fields |
Types |
Descriptions |
---|---|---|
frag_id_job |
INT UNSIGNED |
job id |
frag_date |
INT UNSIGNED |
kill job decision date |
frag_state |
ENUM(‘LEON’, ‘TIMER_ARMED’ , ‘LEON_EXTERMINATE’, ‘FRAGGED’) DEFAULT ‘LEON’ |
state to tell Leon what to do |
- Primary key
frag_id_job
- Index fields
frag_state
What do these states mean:
“LEON” : the Leon module must try to kill the job and change the state into “TIMER_ARMED”.
“TIMER_ARMED” : the Sarko module must wait a response from the job during a timeout (default is 60s)
“LEON_EXTERMINATE” : the Sarko module has decided that the job time outed and asked Leon to clean up the database.
“FRAGGED” : job is fragged.
gantt_jobs_resources
Fields |
Types |
Descriptions |
---|---|---|
moldable_job_id |
INT UNSIGNED |
moldable job id |
resource_id |
INT UNSIGNED |
resource assigned to the job |
- Primary key
moldable_job_id, resource_id
- Index fields
None
This table specifies which resources are attributed to which jobs.
gantt_jobs_resources_visu
Fields |
Types |
Descriptions |
---|---|---|
moldable_job_id |
INT UNSIGNED |
moldable job id |
resource_id |
INT UNSIGNED |
resource assigned to the job |
- Primary key
moldable_job_id, resource_id
- Index fields
None
This table is the same as gantt_jobs_resources and is used by visualisation tools. It is updated atomically (a lock is used).
gantt_jobs_predictions
Fields |
Types |
Descriptions |
---|---|---|
moldable_job_id |
INT UNSIGNED |
job id |
start_time |
INT UNSIGNED |
date when the job is scheduled to start |
- Primary key
moldable_job_id
- Index fields
None
With this table and gantt_jobs_resources you can know exactly what are the decisions taken by the schedulers for each waiting jobs.
- note
The special job id “0” is used to store the scheduling reference date.
gantt_jobs_predictions_visu
Fields |
Types |
Descriptions |
---|---|---|
moldable_job_id |
INT UNSIGNED |
job id |
start_time |
INT UNSIGNED |
date when the job is scheduled to start |
- Primary key
job_id
- Index fields
None
This table is the same as gantt_jobs_predictions and is used by visualisation tools. It is made up to date in an atomic action (with a lock).
jobs
Fields |
Types |
Descriptions |
---|---|---|
job_id |
INT UNSIGNED |
job identifier |
array_id |
INT |
array identifier |
array_index |
INT |
index of the job in the array |
initial_request |
TEXT |
oarsub initial arguments |
job_name |
VARCHAR(100) |
name given by the user |
cpuset_name |
VARCHAR(255) |
name of the cpuset directory used for this job on each nodes |
job_type |
ENUM(‘INTERACTIVE’, ‘PASSIVE’) DEFAULT ‘PASSIVE’ |
specify if the user wants to launch a program or get an interactive shell |
info_type |
VARCHAR(255) |
some informations about oarsub command |
state |
ENUM(‘Waiting’,’Hold’, ‘toLaunch’, ‘toError’, ‘toAckReservation’, ‘Launching’, ‘Running’ ‘Suspended’, ‘Resuming’, , ‘Finishing’, ‘Terminated’, ‘Error’) |
job state |
reservation |
ENUM(‘None’, ‘toSchedule’, ‘Scheduled’) DEFAULT ‘None’ |
specify if the job is a reservation and the state of this one |
message |
VARCHAR(255) |
readable information message for the user |
job_user |
VARCHAR(255) |
user name |
command |
TEXT |
program to run |
queue_name |
VARCHAR(100) |
queue name |
properties |
TEXT |
properties that assigned nodes must match |
launching_directory |
TEXT |
path of the directory where to launch the user process |
submission_time |
INT UNSIGNED |
date when the job was submitted |
start_time |
INT UNSIGNED |
date when the job was launched |
stop_time |
INT UNSIGNED |
date when the job was stopped |
file_id |
INT UNSIGNED |
|
accounted |
ENUM(“YES”, “NO”) DEFAULT “NO” |
specify if the job was considered by the accounting mechanism or not |
notify |
VARCHAR(255) |
gives the way to notify the user about the job (mail or script ) |
assigned_moldable_job |
INT UNSIGNED |
moldable job chosen by the scheduler |
checkpoint |
INT UNSIGNED |
number of seconds before the walltime to send the checkpoint signal to the job |
checkpoint_signal |
INT UNSIGNED |
signal to use when checkpointing the job |
stdout_file |
TEXT |
file name where to redirect program STDOUT |
stderr_file |
TEXT |
file name where to redirect program STDERR |
resubmit_job_id |
INT UNSIGNED |
if a job is resubmitted then the new one store the previous |
project |
VARCHAR(255) |
arbitrary name given by the user or an admission rule |
suspended |
ENUM(“YES”,”NO”) |
specify if the job was suspended (oarhold) |
job_env |
TEXT |
environment variables to set for the job |
exit_code |
INT DEFAULT 0 |
exit code for passive jobs |
job_group |
VARCHAR(255) |
not used |
- Primary key
job_id
- Index fields
state, reservation, queue_name, accounted, suspended
Explications about the “state” field:
“Waiting” : the job is waiting OAR scheduler decision.
“Hold” : user or administrator wants to hold the job (oarhold command). So it will not be scheduled by the system.
“toLaunch” : the OAR scheduler has attributed some nodes to the job. So it will be launched.
“toError” : something wrong occurred and the job is going into the error state.
“toAckReservation” : the OAR scheduler must say “YES” or “NO” to the waiting oarsub command because it requested a reservation.
“Launching” : OAR has launched the job and will execute the user command on the first node.
“Running” : the user command is executing on the first node.
“Suspended” : the job was in Running state and there was a request (oarhold with “-r” option) to suspend this job. In this state other jobs can be scheduled on the same resources (these resources has the “suspended_jobs” field to “YES”).
“Finishing” : the user command has terminated and OAR is doing work internally
“Terminated” : the job has terminated normally.
“Error” : a problem has occurred.
Explications about the “reservation” field:
“None” : the job is not a reservation.
“toSchedule” : the job is a reservation and must be approved by the scheduler.
“Scheduled” : the job is a reservation and is scheduled by OAR.
job_dependencies
Fields |
Types |
Descriptions |
---|---|---|
job_id |
INT UNSIGNED |
job identifier |
job_id_required |
INT UNSIGNED |
job needed to be completed before launching job_id |
- Primary key
job_id, job_id_required
- Index fields
job_id, job_id_required
This table is feeded by oarsub command with the “-a” option.
moldable_job_descriptions
Fields |
Types |
Descriptions |
---|---|---|
moldable_id |
INT UNSIGNED |
moldable job identifier |
moldable_job_id |
INT UNSIGNED |
corresponding job identifier |
moldable_walltime |
INT UNSIGNED |
instance duration |
- Primary key
moldable_id
- Index fields
moldable_job_id
A job can be described with several instances. Thus OAR scheduler can choose one of them. For example it can calculate which instance will finish first. So this table stores all instances for all jobs.
job_resource_groups
Fields |
Types |
Descriptions |
---|---|---|
res_group_id |
INT UNSIGNED |
group identifier |
res_group_moldable_id |
INT UNSIGNED |
corresponding moldable job identifier |
res_group_property |
TEXT |
SQL constraint properties |
- Primary key
res_group_id
- Index fields
res_group_moldable_id
As you can specify job global properties with oarsub and the “-p” option, you can do the same thing for each resource groups that you define with the “-l” option.
job_resource_descriptions
Fields |
Types |
Descriptions |
---|---|---|
res_job_group_id |
INT UNSIGNED |
corresponding group identifier |
res_job_resource_type |
VARCHAR(255) |
resource type (name of a field in resources) |
res_job_value |
INT |
wanted resource number |
res_job_order |
INT UNSIGNED |
order of the request |
- Primary key
res_job_group_id, res_job_resource_type, res_job_order
- Index fields
res_job_group_id
This table store the hierarchical resource description given with oarsub and the “-l” option.
job_state_logs
Fields |
Types |
Descriptions |
---|---|---|
job_state_log_id |
INT UNSIGNED |
identifier |
job_id |
INT UNSIGNED |
corresponding job identifier |
job_state |
ENUM(‘Waiting’, ‘Hold’, ‘toLaunch’, ‘toError’, ‘toAckReservation’, ‘Launching’, ‘Finishing’, ‘Running’, ‘Suspended’, ‘Resuming’, ‘Terminated’, ‘Error’) |
job state during the interval |
date_start |
INT UNSIGNED |
start date of the interval |
date_stop |
INT UNSIGNED |
end date of the interval |
- Primary key
job_state_log_id
- Index fields
job_id, job_state
This table keeps informations about state changes of jobs.
job_types
Fields |
Types |
Descriptions |
---|---|---|
job_type_id |
INT UNSIGNED |
identifier |
job_id |
INT UNSIGNED |
corresponding job identifier |
type |
VARCHAR(255) |
job type like “deploy”, “timesharing”, … |
type_index |
ENUM(‘CURRENT’, ‘LOG’) |
index field |
- Primary key
job_type_id
- Index fields
job_id, type
This table stores job types given with the oarsub command and “-t” options.
resources
Fields |
Types |
Descriptions |
---|---|---|
resource_id |
INT UNSIGNED |
resource identifier |
type |
VARCHAR(100) DEFAULT “default” |
resource type (used for licence resources for example) |
network_address |
VARCHAR(100) |
node name (used to connect via SSH) |
state |
ENUM(‘Alive’, ‘Dead’ , ‘Suspected’, ‘Absent’) |
resource state |
next_state |
ENUM(‘UnChanged’, ‘Alive’, ‘Dead’, ‘Absent’, ‘Suspected’) DEFAULT ‘UnChanged’ |
state for the resource to switch |
finaud_decision |
ENUM(‘YES’, ‘NO’) DEFAULT ‘NO’ |
tell if the actual state results in a “finaud” module decision |
next_finaud_decision |
ENUM(‘YES’, ‘NO’) DEFAULT ‘NO’ |
tell if the next node state results in a “finaud” module decision |
state_num |
INT |
corresponding state number (useful with the SQL “ORDER” query) |
suspended_jobs |
ENUM(‘YES’,’NO’) |
specify if there is at least one suspended job on the resource |
scheduler_priority |
INT UNSIGNED |
arbitrary number given by the system to select resources with more intelligence |
switch |
VARCHAR(50) |
name of the switch |
cpu |
INT UNSIGNED |
global cluster cpu number |
cpuset |
INT UNSIGNED |
field used with the JOB_RESOURCE_MANAGER_PROPERTY_DB_FIELD |
besteffort |
ENUM(‘YES’,’NO’) |
accept or not besteffort jobs |
deploy |
ENUM(‘YES’,’NO’) |
specify if the resource is deployable |
expiry_date |
INT UNSIGNED |
field used for the desktop computing feature |
desktop_computing |
ENUM(‘YES’,’NO’) |
tell if it is a desktop computing resource (with an agent) |
last_job_date |
INT UNSIGNED |
store the date when the resource was used for the last time |
available_upto |
INT UNSIGNED |
used with compute mode features to know if an Absent resource can be switch on |
- Primary key
resource_id
- Index fields
state, next_state, type, suspended_jobs
State explications:
“Alive” : the resource is ready to accept a job.
“Absent” : the oar administrator has decided to pull out the resource. This computer can come back.
“Suspected” : OAR system has detected a problem on this resource and so has suspected it (you can look in the event_logs table to know what has happened). This computer can come back (automatically if this is a “finaud” module decision).
“Dead” : The oar administrator considers that the resource will not come back and will be removed from the pool.
This table permits to specify different properties for each resources. These can be used with the oarsub command (“-p” and “-l” options).
You can add your own properties with oarproperty command.
These properties can be updated with the oarnodesetting command (“-p” option).
Several properties are added by default:
switch : you have to register the name of the switch where the node is plugged.
cpu : this is a unique name given to each cpus. This enables OAR scheduler to distinguish all cpus.
cpuset : this is the name of the cpu on the node. The Linux kernel sets this to an integer beginning at 0. This field is linked to the configuration tag JOB_RESOURCE_MANAGER_PROPERTY_DB_FIELD.
resource_logs
Fields |
Types |
Descriptions |
---|---|---|
resource_log_id |
INT UNSIGNED |
unique id |
resource_id |
INT UNSIGNED |
resource identifier |
attribute |
VARCHAR(255) |
name of corresponding field in resources |
value |
VARCHAR(255) |
value of the field |
date_start |
INT UNSIGNED |
interval start date |
date_stop |
INT UNSIGNED |
interval stop date |
finaud_decision |
ENUM(‘YES’,’NO’) |
store if this is a system change or a human one |
- Primary key
None
- Index fields
resource_id, attribute
This table permits to keep a trace of every property changes (consequence of the oarnodesetting command with the “-p” option).
assigned_resources
Fields |
Types |
Descriptions |
---|---|---|
moldable_job_id |
INT UNSIGNED |
job id |
resource_id |
INT UNSIGNED |
resource assigned to the job |
- Primary key
moldable_job_id, resource_id
- Index fields
moldable_job_id
This table keeps informations for jobs on which resources they were scheduled.
queues
Fields |
Types |
Descriptions |
---|---|---|
queue_name |
VARCHAR(100) |
queue name |
priority |
INT UNSIGNED |
the scheduling priority |
scheduler_policy |
VARCHAR(100) |
path of the associated scheduler |
state |
ENUM(‘Active’, ‘notActive’) DEFAULT ‘Active’ |
permits to stop the scheduling for a queue |
- Primary key
queue_name
- Index fields
None
This table contains the schedulers executed by the oar_meta_scheduler module. Executables are launched one after one in the specified priority.
challenges
Fields |
Types |
Descriptions |
---|---|---|
job_id |
INT UNSIGNED |
job identifier |
challenge |
VARCHAR(255) |
challenge string |
ssh_private_key |
TEXT DEFAULT NULL |
ssh private key given by the user (in grid usage it enables to connect onto all nodes of the job of all clusers with oarsh) |
ssh_public_key |
TEXT DEFAULT NULL |
ssh public key |
- Primary key
job_id
- Index fields
None
This table is used to share a secret between OAR server and oarexec process on computing nodes (avoid a job id being stolen/forged by malicious user).
For security reasons, this table must not be readable for a database account given to users who want to access OAR internal informations(like statistics).